[jira] [Updated] (TIKA-4246) tika-pipes FileSystemFetcher configuration option for file name/path pattern selection

2024-04-26 Thread Emil Zegers (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emil Zegers updated TIKA-4246: -- Description: Would be useful to have the possibility to configure FileSystemFetcher for tika-pipes

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841252#comment-17841252 ] Tim Allison commented on TIKA-4243: --- https://json-schema.org/learn/getting-started-step-by-step Yes

[jira] [Comment Edited] (TIKA-4243) tika configuration overhaul

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841242#comment-17841242 ] Tim Allison edited comment on TIKA-4243 at 4/26/24 1:32 PM: I really, really

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841243#comment-17841243 ] Tim Allison commented on TIKA-4243: --- Oh, sorry. Does this break anything? Can we add this as a new

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841242#comment-17841242 ] Tim Allison commented on TIKA-4243: --- I really, really want to clean up our configuration, and moving

[jira] [Comment Edited] (TIKA-4245) Tika does not get html content properly

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841221#comment-17841221 ] Tim Allison edited comment on TIKA-4245 at 4/26/24 1:23 PM: Oops, sorry. I

[jira] [Created] (TIKA-4246) tika-pipes FileSystemFetcher configuration option for file name/path pattern selection

2024-04-26 Thread Emil Zegers (Jira)
Emil Zegers created TIKA-4246: - Summary: tika-pipes FileSystemFetcher configuration option for file name/path pattern selection Key: TIKA-4246 URL: https://issues.apache.org/jira/browse/TIKA-4246 Project

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841221#comment-17841221 ] Tim Allison commented on TIKA-4245: --- Oops, sorry. I didn't realize you sent your tika-config.xml. Y, one

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-26 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841220#comment-17841220 ] Tim Allison commented on TIKA-4245: --- This is an ongoing area for improvement in Tika. The algorithm

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-26 Thread Xiaohong Yang (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841209#comment-17841209 ] Xiaohong Yang commented on TIKA-4245: - [~tilman]  Can you detect the right charset (utf-8) and fix

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840922#comment-17840922 ] Tilman Hausherr commented on TIKA-4245: --- The file claims to be utf-16 but it isn't. If I change

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840908#comment-17840908 ] Tilman Hausherr commented on TIKA-4245: --- Happens also with the tika app GUI. > Tika does not

[jira] [Updated] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4245: -- Description: We use org.apache.tika.parser.AutoDetectParser to get the content of html files

[jira] [Created] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Xiaohong Yang (Jira)
Xiaohong Yang created TIKA-4245: --- Summary: Tika does not get html content properly Key: TIKA-4245 URL: https://issues.apache.org/jira/browse/TIKA-4245 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840893#comment-17840893 ] Hudson commented on TIKA-4244: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1612 (See

[jira] [Resolved] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4244. --- Fix Version/s: 3.0.0 2.9.3 Resolution: Fixed Thank you [~boomxlucifer

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840860#comment-17840860 ] ASF GitHub Bot commented on TIKA-4244: -- tballison merged PR #1731: URL: https://github.com/apache

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840850#comment-17840850 ] ASF GitHub Bot commented on TIKA-4244: -- tballison opened a new pull request, #1731: URL: https

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840852#comment-17840852 ] Tim Allison commented on TIKA-4244: --- Thank you [~boomxlucifer] for finding this and reporting

[jira] [Updated] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-24 Thread Kartik Jain (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kartik Jain updated TIKA-4244: -- Description: When tika-core detect(InputStream input, Metadata metadata) API is used to determine

[jira] [Updated] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-24 Thread Kartik Jain (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kartik Jain updated TIKA-4244: -- Description: When tika-core detect(InputStream input, Metadata metadata) API is used to determine

[jira] [Updated] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-24 Thread Kartik Jain (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kartik Jain updated TIKA-4244: -- Priority: Major (was: Minor) > Tika idenifies MIME type of ics files with html content as text/h

[jira] [Updated] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-24 Thread Kartik Jain (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kartik Jain updated TIKA-4244: -- Description: When tika-core detect(InputStream input, Metadata metadata) API is used to determine

[jira] [Created] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-24 Thread Kartik Jain (Jira)
Kartik Jain created TIKA-4244: - Summary: Tika idenifies MIME type of ics files with html content as text/html Key: TIKA-4244 URL: https://issues.apache.org/jira/browse/TIKA-4244 Project: Tika

[jira] [Created] (TIKA-4243) tika configuration overhaul

2024-04-24 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4243: --- Summary: tika configuration overhaul Key: TIKA-4243 URL: https://issues.apache.org/jira/browse/TIKA-4243 Project: Tika Issue Type: New Feature

[jira] [Updated] (TIKA-4243) tika configuration overhaul

2024-04-24 Thread Nicholas DiPiazza (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas DiPiazza updated TIKA-4243: Description: In 3.0.0 when dealing with Tika, it would greatly help to have a Typed

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-22 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839780#comment-17839780 ] Tim Allison commented on TIKA-4166: ---  Thank you! > dependency updates for Tika

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-22 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839779#comment-17839779 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1611 (See

[jira] [Comment Edited] (TIKA-4166) dependency updates for Tika 3.0

2024-04-22 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839745#comment-17839745 ] Tilman Hausherr edited comment on TIKA-4166 at 4/22/24 3:27 PM: It turned

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-22 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839745#comment-17839745 ] Tilman Hausherr commented on TIKA-4166: --- It turned out to be something different than the missing

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-22 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839652#comment-17839652 ] Tilman Hausherr commented on TIKA-4166: --- The latest Apache parent update means a javadoc update

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839242#comment-17839242 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1607 (See

[jira] [Resolved] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4242. --- Resolution: Fixed > Tika depends on non-existing plexus-utils vers

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838329#comment-17838329 ] Hudson commented on TIKA-4242: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1606 (See

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838279#comment-17838279 ] ASF GitHub Bot commented on TIKA-4242: -- tballison merged PR #1727: URL: https://github.com/apache

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838271#comment-17838271 ] ASF GitHub Bot commented on TIKA-4242: -- tballison opened a new pull request, #1727: URL: https

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838270#comment-17838270 ] ASF GitHub Bot commented on TIKA-4242: -- tballison merged PR #1726: URL: https://github.com/apache

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838260#comment-17838260 ] Tim Allison commented on TIKA-4242: --- Looks like the reason we haven't found this problem is that we

[jira] [Commented] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838200#comment-17838200 ] ASF GitHub Bot commented on TIKA-4242: -- Vampire opened a new pull request, #1726: URL: https

[jira] [Created] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread Björn Kautler (Jira)
Björn Kautler created TIKA-4242: --- Summary: Tika depends on non-existing plexus-utils version Key: TIKA-4242 URL: https://issues.apache.org/jira/browse/TIKA-4242 Project: Tika Issue Type: Bug

[jira] [Updated] (TIKA-4242) Tika depends on non-existing plexus-utils version

2024-04-17 Thread Björn Kautler (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Björn Kautler updated TIKA-4242: Description: In [https://github.com/apache/tika/pull/1461] [~tallison] moved the versions to Maven

[jira] [Commented] (TIKA-4241) Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs

2024-04-16 Thread Peter Wyatt (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837918#comment-17837918 ] Peter Wyatt commented on TIKA-4241: --- See also [Issue 111]([https://github.com/pdf-association/arlington

[jira] [Commented] (TIKA-4241) Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs

2024-04-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837806#comment-17837806 ] Tim Allison commented on TIKA-4241: --- They add a custom key in the trailer {{/AdditionalStreams}} whose

[jira] [Updated] (TIKA-4241) Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs

2024-04-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4241: -- Attachment: testPDF_additionalStreams.pdf > Consider handling LibreOffice's /AdditionalStreams "

[jira] [Created] (TIKA-4241) Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs

2024-04-16 Thread Tim Allison (Jira)
Tim Allison created TIKA-4241: - Summary: Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs Key: TIKA-4241 URL: https://issues.apache.org/jira/browse

[jira] [Updated] (TIKA-4241) Consider handling LibreOffice's /AdditionalStreams "hybrid PDF" attachment embedding in PDFs

2024-04-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4241: -- Description: Some info here: https://stackoverflow.com/questions/67358370/what-the-standard-used

[jira] [Commented] (TIKA-4181) Grpc + Tika Pipes - pipe iterator and emitter

2024-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837378#comment-17837378 ] ASF GitHub Bot commented on TIKA-4181: -- nddipiazza commented on code in PR #1702: URL: https

[jira] [Commented] (TIKA-4181) Grpc + Tika Pipes - pipe iterator and emitter

2024-04-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837377#comment-17837377 ] ASF GitHub Bot commented on TIKA-4181: -- nddipiazza commented on code in PR #1702: URL: https

[jira] [Commented] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836257#comment-17836257 ] Hudson commented on TIKA-4240: -- FAILURE: Integrated in Jenkins build Tika » tika-main-jdk11 #1601 (See

[jira] [Commented] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836236#comment-17836236 ] Tilman Hausherr commented on TIKA-4240: --- I prefer daily but if more people feel pressured or annoyed

[jira] [Commented] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836228#comment-17836228 ] Tim Allison commented on TIKA-4240: --- Thank you, [~tilman]! Should I revert to daily? > Cha

[jira] [Updated] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4240: -- Component/s: build > Change dependabot to wee

[jira] [Commented] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836224#comment-17836224 ] Tilman Hausherr commented on TIKA-4240: --- Not a burden (that was Eric, sort-of), I just don't have

[jira] [Resolved] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4240. --- Resolution: Fixed Let's see how this goes. Thank you! > Change dependabot to wee

[jira] [Created] (TIKA-4240) Change dependabot to weekly

2024-04-11 Thread Tim Allison (Jira)
Tim Allison created TIKA-4240: - Summary: Change dependabot to weekly Key: TIKA-4240 URL: https://issues.apache.org/jira/browse/TIKA-4240 Project: Tika Issue Type: Task Reporter: Tim

[jira] [Commented] (TIKA-4232) Create and execute unit tests for tika-helm

2024-04-08 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835077#comment-17835077 ] Lewis John McGibbney commented on TIKA-4232: It turns out that the original GitHub action I

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-07 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834608#comment-17834608 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1593 (See

[jira] [Updated] (TIKA-4233) Check tika-helm for deprecated k8s APIs

2024-04-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4233: -- Fix Version/s: (was: 3.0.0) > Check tika-helm for deprecated k8s A

[jira] [Updated] (TIKA-4232) Create and execute unit tests for tika-helm

2024-04-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4232: -- Fix Version/s: (was: 3.0.0) > Create and execute unit tests for tika-h

[jira] [Commented] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834529#comment-17834529 ] Tilman Hausherr commented on TIKA-4238: --- This was a low-hanging fruit. I could also have done

[jira] [Comment Edited] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834529#comment-17834529 ] Tilman Hausherr edited comment on TIKA-4238 at 4/6/24 2:12 PM

[jira] [Resolved] (TIKA-4219) Figure out what to do with epubs with encrypted non-core content

2024-04-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4219. --- Fix Version/s: 2.9.2 Resolution: Fixed > Figure out what to do with epubs with encrypted

[jira] [Updated] (TIKA-4233) Check tika-helm for deprecated k8s APIs

2024-04-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4233: -- Fix Version/s: 3.0.0 (was: 2.9.2) > Check tika-helm for deprecated k8s A

[jira] [Updated] (TIKA-4232) Create and execute unit tests for tika-helm

2024-04-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4232: -- Fix Version/s: 3.0.0 (was: 2.9.2) > Create and execute unit tests for tika-h

[jira] [Updated] (TIKA-4218) Run regression tests to support 2.9.2 release

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4218: -- Affects Version/s: 2.9.1 > Run regression tests to support 2.9.2 rele

[jira] [Resolved] (TIKA-4218) Run regression tests to support 2.9.2 release

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4218. --- Assignee: Tim Allison Resolution: Fixed > Run regression tests to support 2.9.2 rele

[jira] [Assigned] (TIKA-4171) Tika server only returns last value for PDFs that have multiple of the same key

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reassigned TIKA-4171: - Assignee: Tim Allison > Tika server only returns last value for PDFs that have multi

[jira] [Updated] (TIKA-4218) Run regression tests to support 2.9.2 release

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4218: -- Fix Version/s: 2.9.2 > Run regression tests to support 2.9.2 rele

[jira] [Resolved] (TIKA-4171) Tika server only returns last value for PDFs that have multiple of the same key

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4171. --- Resolution: Fixed > Tika server only returns last value for PDFs that have multi

[jira] [Commented] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834526#comment-17834526 ] Hudson commented on TIKA-4238: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1592 (See

[jira] [Resolved] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4238. --- Resolution: Fixed > replace some deprecated c

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834522#comment-17834522 ] Hudson commented on TIKA-4236: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1591 (See

[jira] [Commented] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834523#comment-17834523 ] Hudson commented on TIKA-4238: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1591 (See

[jira] [Created] (TIKA-4239) Update to 2.9.3

2024-04-06 Thread Tilman Hausherr (Jira)
Tilman Hausherr created TIKA-4239: - Summary: Update to 2.9.3 Key: TIKA-4239 URL: https://issues.apache.org/jira/browse/TIKA-4239 Project: Tika Issue Type: Task Components: build

[jira] [Updated] (TIKA-4239) Update to 2.9.3

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4239: -- Affects Version/s: 2.9.2 > Update to 2.9.3 > --- > > Ke

[jira] [Resolved] (TIKA-4162) Update to 2.9.2

2024-04-06 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4162. --- Assignee: Tilman Hausherr Resolution: Fixed > Update to 2.

[jira] [Commented] (TIKA-4166) dependency updates for Tika 3.0

2024-04-06 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834517#comment-17834517 ] Hudson commented on TIKA-4166: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1590 (See

[jira] [Created] (TIKA-4238) replace some deprecated code

2024-04-06 Thread Tilman Hausherr (Jira)
Tilman Hausherr created TIKA-4238: - Summary: replace some deprecated code Key: TIKA-4238 URL: https://issues.apache.org/jira/browse/TIKA-4238 Project: Tika Issue Type: Task Affects

[jira] [Created] (TIKA-4237) Add JWT authentication ability to the http fetcher

2024-04-05 Thread Nicholas DiPiazza (Jira)
Nicholas DiPiazza created TIKA-4237: --- Summary: Add JWT authentication ability to the http fetcher Key: TIKA-4237 URL: https://issues.apache.org/jira/browse/TIKA-4237 Project: Tika Issue

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4236: -- Fix Version/s: 2.9.3 > tika-parser-nlp-module has an unnecessary Guava depende

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4236: -- Fix Version/s: (was: 2.9.2) > tika-parser-nlp-module has an unnecessary Guava depende

[jira] [Resolved] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4236. --- Assignee: Tilman Hausherr Resolution: Fixed > tika-parser-nlp-module has an unnecess

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4236: -- Fix Version/s: 2.9.2 3.0.0 > tika-parser-nlp-module has an unnecessary Gu

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834385#comment-17834385 ] Tilman Hausherr commented on TIKA-4236: --- I found only a test dependency mentioned directly. It's

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834317#comment-17834317 ] Hudson commented on TIKA-4236: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1589 (See

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Julian Reschke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834284#comment-17834284 ] Julian Reschke commented on TIKA-4236: -- Yep, that's what I meant :-). > tika-parser-nlp-module

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834282#comment-17834282 ] Tilman Hausherr commented on TIKA-4236: --- https://tika.apache.org/ "The Apache Tika PMC ha

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Julian Reschke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834280#comment-17834280 ] Julian Reschke commented on TIKA-4236: -- AFAIU, 1.x might get updates when security relevant

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834279#comment-17834279 ] Manfred Baedke commented on TIKA-4236: -- Man, you are slow! Yes, that's it, thx :)   ??Btw guava

[jira] [Comment Edited] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834279#comment-17834279 ] Manfred Baedke edited comment on TIKA-4236 at 4/5/24 12:24 PM: --- Man, you

[jira] [Comment Edited] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834277#comment-17834277 ] Tilman Hausherr edited comment on TIKA-4236 at 4/5/24 12:21 PM

[jira] [Commented] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834277#comment-17834277 ] Tilman Hausherr commented on TIKA-4236: --- Is this what you had in mind? https://github.com/apache

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Description: This should be avoided, because it's prone to maintenance and security problems

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Affects Version/s: 2.9.2 (was: 2.9.1) > tika-parser-nlp-module

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Description: This should be avoided, because it's prone to maintenance and security problems

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Affects Version/s: 3.0.0-BETA 2.9.1 1.28.5

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Affects Version/s: 2.0.0 > tika-parser-nlp-module has an unnecessary Guava depende

[jira] [Updated] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke updated TIKA-4236: - Description: This should be avoided, because it's prone to maintenance and security problems

[jira] [Created] (TIKA-4236) tika-parser-nlp-module has an unnecessary Guava dependency

2024-04-05 Thread Manfred Baedke (Jira)
Manfred Baedke created TIKA-4236: Summary: tika-parser-nlp-module has an unnecessary Guava dependency Key: TIKA-4236 URL: https://issues.apache.org/jira/browse/TIKA-4236 Project: Tika Issue

[jira] [Commented] (TIKA-4235) Add pipeline parameter to OpenSearch emitter

2024-04-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834041#comment-17834041 ] ASF GitHub Bot commented on TIKA-4235: -- tballison opened a new pull request, #1709: URL: https
