[PR] Bump org.springframework:spring-context from 5.3.32 to 5.3.33 [tika]

2024-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1662: URL: https://github.com/apache/tika/pull/1662

[PR] Bump aws.version from 1.12.679 to 1.12.680 [tika]

2024-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1661: URL: https://github.com/apache/tika/pull/1661 Bumps `aws.version` from 1.12.679 to 1.12.680. Updates `com.amazonaws:aws-java-sdk-s3` from 1.12.679 to 1.12.680 Changelog Sourced from

[PR] Bump pdfbox.version from 3.0.1 to 3.0.2 [tika]

2024-03-14 Thread via GitHub
dependabot[bot] opened a new pull request, #1660: URL: https://github.com/apache/tika/pull/1660 Bumps `pdfbox.version` from 3.0.1 to 3.0.2. Updates `org.apache.pdfbox:xmpbox` from 3.0.1 to 3.0.2 Updates `org.apache.pdfbox:fontbox` from 3.0.1 to 3.0.2 Updates

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-03-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827248#comment-17827248 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1555 (See

[jira] [Created] (TIKA-4213) Improvements to jdbc pipes reporter

2024-03-14 Thread Tim Allison (Jira)
Tim Allison created TIKA-4213: - Summary: Improvements to jdbc pipes reporter Key: TIKA-4213 URL: https://issues.apache.org/jira/browse/TIKA-4213 Project: Tika Issue Type: New Feature

[jira] [Comment Edited] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827241#comment-17827241 ] Tim Allison edited comment on TIKA-4211 at 3/14/24 8:20 PM: Step 3: Is there

[jira] [Comment Edited] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827230#comment-17827230 ] Tim Allison edited comment on TIKA-4211 at 3/14/24 8:17 PM: Step 2: In this

[jira] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211 ] Tim Allison deleted comment on TIKA-4211: --- was (Author: talli...@mitre.org): Or, if you grep for "embeddings" in the uncompressed zip, can you find a link to the xlsx file? > Tika extractor

[jira] [Commented] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827241#comment-17827241 ] Tim Allison commented on TIKA-4211: --- Step 3: Is there something like this in /ppt/slides/slide2.xml:

[jira] [Commented] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827233#comment-17827233 ] Tim Allison commented on TIKA-4211: --- Or, if you grep for "embeddings" in the uncompressed zip, can

[jira] [Commented] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827230#comment-17827230 ] Tim Allison commented on TIKA-4211: --- In this file within the zip: /ppt/slides/_rels/slide2.xml.rels: Do

[jira] [Commented] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Xiaohong Yang (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827221#comment-17827221 ] Xiaohong Yang commented on TIKA-4211: - Hi Tim, Yes, I found the right file

[jira] [Updated] (TIKA-4212) Tika fails to get file extension of file type image/x-rtf-raw-bitmap

2024-03-14 Thread Xiaohong Yang (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaohong Yang updated TIKA-4212: Attachment: tika-config-and-sample-file.zip > Tika fails to get file extension of file type

[jira] [Created] (TIKA-4212) Tika fails to get file extension of file type image/x-rtf-raw-bitmap

2024-03-14 Thread Xiaohong Yang (Jira)
Xiaohong Yang created TIKA-4212: --- Summary: Tika fails to get file extension of file type image/x-rtf-raw-bitmap Key: TIKA-4212 URL: https://issues.apache.org/jira/browse/TIKA-4212 Project: Tika

[jira] [Commented] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827193#comment-17827193 ] Tim Allison commented on TIKA-4210: --- Those files look like this in the rtf file: {code:java}

[jira] [Commented] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827191#comment-17827191 ] Tim Allison commented on TIKA-4210: --- Nick is right. The file is an RTF file. Tika does find two embedded

[jira] [Commented] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827190#comment-17827190 ] Tim Allison commented on TIKA-4211: --- Y, as you point out, Tika works with the example file that you

[jira] [Created] (TIKA-4211) Tika extractor fails to extract embedded excel from pptx

2024-03-14 Thread Xiaohong Yang (Jira)
Xiaohong Yang created TIKA-4211: --- Summary: Tika extractor fails to extract embedded excel from pptx Key: TIKA-4211 URL: https://issues.apache.org/jira/browse/TIKA-4211 Project: Tika Issue

[jira] [Commented] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Tika User (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827036#comment-17827036 ] Tika User commented on TIKA-4210: - The attached file has a .doc extension, and from that file it should detect

Re: [PR] Bump com.google.protobuf:protobuf-java from 3.25.3 to 4.26.0 [tika]

2024-03-14 Thread via GitHub
dependabot[bot] commented on PR #1659: URL: https://github.com/apache/tika/pull/1659#issuecomment-1997093738 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let

Re: [PR] Bump com.google.protobuf:protobuf-java from 3.25.3 to 4.26.0 [tika]

2024-03-14 Thread via GitHub
THausherr closed pull request #1659: Bump com.google.protobuf:protobuf-java from 3.25.3 to 4.26.0 URL: https://github.com/apache/tika/pull/1659 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[jira] [Updated] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Tika User (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tika User updated TIKA-4210: Description: Hi Team, The attached embedded file contains .MPGA attachments which Tika is not able to

[jira] [Commented] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827017#comment-17827017 ] Nick Burch commented on TIKA-4210: -- The attached file seems to be an RTF file. I'm not sure what a ".mega

[jira] [Created] (TIKA-4210) Not able to identify tika extension

2024-03-14 Thread Tika User (Jira)
Tika User created TIKA-4210: --- Summary: Not able to identify tika extension Key: TIKA-4210 URL: https://issues.apache.org/jira/browse/TIKA-4210 Project: Tika Issue Type: Bug Reporter:

[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1

2024-03-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826996#comment-17826996 ] Tilman Hausherr commented on TIKA-4199: --- The original error you reported wasn't really a bug in

[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1

2024-03-14 Thread Alexander Veit (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826992#comment-17826992 ] Alexander Veit commented on TIKA-4199: -- The same error also occurs with Tika 2.9.1 and

Re: [PR] Bump com.google.guava:guava from 33.0.0-jre to 33.1.0-jre [tika]

2024-03-14 Thread via GitHub
THausherr merged PR #1657: URL: https://github.com/apache/tika/pull/1657

Re: [PR] Bump aws.version from 1.12.678 to 1.12.679 [tika]

2024-03-14 Thread via GitHub
THausherr merged PR #1658: URL: https://github.com/apache/tika/pull/1658