[jira] [Updated] (TIKA-3324) Add checkstyle checker

2021-03-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3324: -- Description: I _think_ we can introduce this gently at first and slowly fix files as time allows.

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302070#comment-17302070 ] Hudson commented on TIKA-3320: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #99 (See

[jira] [Assigned] (TIKA-3323) FileCommandDetectorTest inconsistent results depending on platform

2021-03-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-3323: - Assignee: Tim Allison > FileCommandDetectorTest inconsistent results depending on platform >

[jira] [Created] (TIKA-3324) Add checkstyle checker

2021-03-15 Thread Tim Allison (Jira)
Tim Allison created TIKA-3324: - Summary: Add checkstyle checker Key: TIKA-3324 URL: https://issues.apache.org/jira/browse/TIKA-3324 Project: Tika Issue Type: Task Reporter: Tim

[jira] [Commented] (TIKA-3313) Improve performance and usability of RereadableInputStream

2021-03-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302060#comment-17302060 ] Hudson commented on TIKA-3313: -- FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #170 (See

[jira] [Created] (TIKA-3323) FileCommandDetectorTest inconsistent results depending on platform

2021-03-15 Thread Andrew Pavlin (Jira)
Andrew Pavlin created TIKA-3323: --- Summary: FileCommandDetectorTest inconsistent results depending on platform Key: TIKA-3323 URL: https://issues.apache.org/jira/browse/TIKA-3323 Project: Tika

[jira] [Resolved] (TIKA-3313) Improve performance and usability of RereadableInputStream

2021-03-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3313. --- Fix Version/s: 2.0.0 Resolution: Fixed Thank you [~peterkronenberg] ! > Improve performance

[jira] [Commented] (TIKA-3313) Improve performance and usability of RereadableInputStream

2021-03-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302005#comment-17302005 ] ASF GitHub Bot commented on TIKA-3313: -- tballison merged pull request #413: URL:

[GitHub] [tika] tballison merged pull request #413: TIKA-3313 Improve performance and usability of RereadableInputStream

2021-03-15 Thread GitBox
tballison merged pull request #413: URL: https://github.com/apache/tika/pull/413 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[jira] [Resolved] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3320. --- Fix Version/s: 1.26 Resolution: Fixed This will come out shortly in 1.26. Thank you for

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301968#comment-17301968 ] ASF GitHub Bot commented on TIKA-3320: -- tballison merged pull request #414: URL:

[GitHub] [tika] tballison merged pull request #414: TIKA-3320 Added case-insensitivity to tika server ocr header names

2021-03-15 Thread GitBox
tballison merged pull request #414: URL: https://github.com/apache/tika/pull/414 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[jira] [Created] (TIKA-3322) Upgrade to PDFBox 2.0.23 when available

2021-03-15 Thread Tim Allison (Jira)
Tim Allison created TIKA-3322: - Summary: Upgrade to PDFBox 2.0.23 when available Key: TIKA-3322 URL: https://issues.apache.org/jira/browse/TIKA-3322 Project: Tika Issue Type: Bug

Re: Python-tika: issues related to memory consumption

2021-03-15 Thread Tim Allison
Hi Manish, Lots of things can go wrong in parsing PDFs. Can you share links to files showing specific problems? On Mon, Mar 15, 2021 at 11:50 AM Chris Mattmann wrote: > Hi Manish, I think you should ask this one upstream on the Tika Dev lists. I’ve cc’ed them for you.

[jira] [Updated] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Subhajit Das (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhajit Das updated TIKA-3320: --- Description: It seems that TikaServer 1.25 header like “X-Tika-PDFOcrStrategy” is case sensitive.
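The behaviour reported above can be illustrated with a minimal sketch of case-insensitive header lookup. This is not the code from pull request #414, which is not shown in this listing; the class name `HeaderLookupSketch`, the helper `caseInsensitive`, and the sample value `ocr_only` are assumptions made purely for illustration. The sketch copies incoming headers into a `TreeMap` ordered by `String.CASE_INSENSITIVE_ORDER`, so a lookup succeeds regardless of how the client capitalised the header name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not TikaServer's actual implementation.
public class HeaderLookupSketch {

    // Copies the headers into a map whose keys compare case-insensitively.
    static Map<String, String> caseInsensitive(Map<String, String> headers) {
        Map<String, String> result = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        result.putAll(headers);
        return result;
    }

    public static void main(String[] args) {
        // A client that sent the header name in lower case (assumed example value).
        Map<String, String> raw = new HashMap<>();
        raw.put("x-tika-pdfocrstrategy", "ocr_only");

        Map<String, String> headers = caseInsensitive(raw);

        // The lookup uses the documented capitalisation and still matches.
        System.out.println(headers.get("X-Tika-PDFOcrStrategy")); // prints "ocr_only"
    }
}
```

If two incoming names differ only in case, the later `put` overwrites the earlier value under this approach; whether that is acceptable depends on how the server is expected to treat repeated headers.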

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Subhajit Das (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301755#comment-17301755 ] Subhajit Das commented on TIKA-3320: Pull request: https://github.com/apache/tika/pull/414 >

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301754#comment-17301754 ] ASF GitHub Bot commented on TIKA-3320: -- Subhajitdas298 opened a new pull request #414: URL:

[GitHub] [tika] Subhajitdas298 opened a new pull request #414: TIKA-3320 Added case-insensitivity to tika server ocr header names

2021-03-15 Thread GitBox
Subhajitdas298 opened a new pull request #414: URL: https://github.com/apache/tika/pull/414 TIKA-3320 Made TikaServer header names case insensitive This is an automated message from the Apache Git Service. To respond

Re: Python-tika: issues related to memory consumption

2021-03-15 Thread Chris Mattmann
Hi Manish, I think you should ask this one upstream on the Tika Dev lists. I’ve cc’ed them for you. From: manish mathur Date: Monday, March 15, 2021 at 4:41 AM To: Subject: Re: Python-tika: issues related to memory consumption Hi Chris, I am using python-tika library to

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Julian Reschke (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301714#comment-17301714 ] Julian Reschke commented on TIKA-3320: -- (side note: RFC 2616 is really really obsolete) Yes, field

[jira] [Closed] (TIKA-3321) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Subhajit Das (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subhajit Das closed TIKA-3321. -- Resolution: Duplicate > TikaServer Header Name is Case-sensitive >

[jira] [Created] (TIKA-3320) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Subhajit Das (Jira)
Subhajit Das created TIKA-3320: -- Summary: TikaServer Header Name is Case-sensitive Key: TIKA-3320 URL: https://issues.apache.org/jira/browse/TIKA-3320 Project: Tika Issue Type: Bug

[jira] [Created] (TIKA-3321) TikaServer Header Name is Case-sensitive

2021-03-15 Thread Subhajit Das (Jira)
Subhajit Das created TIKA-3321: -- Summary: TikaServer Header Name is Case-sensitive Key: TIKA-3321 URL: https://issues.apache.org/jira/browse/TIKA-3321 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-94) Speech-to-text transcription

2021-03-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301688#comment-17301688 ] Tim Allison commented on TIKA-94: - Just came across SpeechBrain:

[jira] [Commented] (TIKA-3313) Improve performance and usability of RereadableInputStream

2021-03-15 Thread Peter Kronenberg (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301635#comment-17301635 ] Peter Kronenberg commented on TIKA-3313: Just created a pull request for this issue. Any

[jira] [Commented] (TIKA-3313) Improve performance and usability of RereadableInputStream

2021-03-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301634#comment-17301634 ] ASF GitHub Bot commented on TIKA-3313: -- peterkronenberg opened a new pull request #413: URL:

[GitHub] [tika] peterkronenberg opened a new pull request #413: TIKA-3313 Improve performance and usability of RereadableInputStream

2021-03-15 Thread GitBox
peterkronenberg opened a new pull request #413: URL: https://github.com/apache/tika/pull/413 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[jira] [Commented] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301451#comment-17301451 ] Richard Kraus commented on TIKA-3319: - [~tilman] - fantastic thank you so much. It's 12:30 AM where

[jira] [Comment Edited] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301426#comment-17301426 ] Tilman Hausherr edited comment on TIKA-3319 at 3/15/21, 6:48 AM: - Re the

[jira] [Updated] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Kraus updated TIKA-3319: Description: So...in sum 1) it somehow doesn't "point" to a parser? (but it kinda does...) 2) it

[jira] [Commented] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301426#comment-17301426 ] Tilman Hausherr commented on TIKA-3319: --- Re the warnings, here's what I do: {noformat} java -cp

[jira] [Updated] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Kraus updated TIKA-3319: Description: 01 Tika-1.24.1.jar and 1.24 python module have been running well for months on my

[jira] [Commented] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301423#comment-17301423 ] Tilman Hausherr commented on TIKA-3319: --- I think the NPE was fixed in TIKA-3112. Please retry with

[jira] [Updated] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Kraus updated TIKA-3319: Description: 01 Tika-1.24.1.jar and 1.24 python module have been running well for months on my

[jira] [Updated] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Kraus updated TIKA-3319: Description: 01 Tika-1.24.1.jar and 1.24 python module have been running well for months on my

[jira] [Created] (TIKA-3319) Caused by: java.lang.NullPointerException (and more!)

2021-03-15 Thread Richard Kraus (Jira)
Richard Kraus created TIKA-3319: --- Summary: Caused by: java.lang.NullPointerException (and more!) Key: TIKA-3319 URL: https://issues.apache.org/jira/browse/TIKA-3319 Project: Tika Issue Type: