[ 
https://issues.apache.org/jira/browse/TIKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217838#comment-14217838
 ] 

Tim Allison commented on TIKA-1471:
-----------------------------------

Got it.  Thank _you_, [~alanbur].  [~jahewson] dug into the 
PDFont.clearResources() issue on PDFBOX-2200 and declared the static call safe 
even in a multithreaded environment.  The overall issue disappears with PDFBox 
2.0.

In Tika's PDFParser, we're now calling clearResources() after every 
document. I'm wondering whether we should instead call it only after every 
1000 docs or so.
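
For illustration, here is a minimal sketch of the "every 1000 docs or so" idea. It is not what PDFParser actually does: the class name PeriodicCleanup and its methods are hypothetical, and the cleanup action is passed in as a Runnable so the sketch stays independent of PDFBox. In Tika the action would presumably be the static PDFont.clearResources() call discussed above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: runs a cleanup action once every N parsed documents. */
final class PeriodicCleanup {
    private final long interval;
    private final Runnable cleanup;
    // Atomic counter so that concurrent parser threads can share one instance.
    private final AtomicLong parsed = new AtomicLong();

    PeriodicCleanup(long interval, Runnable cleanup) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.cleanup = cleanup;
    }

    /** Call once after each document; returns true when the cleanup ran. */
    boolean documentParsed() {
        if (parsed.incrementAndGet() % interval == 0) {
            cleanup.run();
            return true;
        }
        return false;
    }
}
```

Assuming PDFBox 1.8.x is on the classpath, a parser could hold one shared instance created as new PeriodicCleanup(1000, PDFont::clearResources) and call documentParsed() in a finally block after each parse. The trade-off is that cached font resources stay in memory for up to 1000 documents between cleanups.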

Should we close this issue?  Any more work to do?  Thank you, again.

> OOM with corrupt PDF file
> -------------------------
>
>                 Key: TIKA-1471
>                 URL: https://issues.apache.org/jira/browse/TIKA-1471
>             Project: Tika
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.6
>         Environment: Linux, JVM 1.8.0_25-b17, 64-bit
>            Reporter: Alan Burlison
>            Priority: Blocker
>             Fix For: 1.7
>
>
> Use of PDFBox 1.8.6 by Tika 1.6 is causing OOM errors with corrupt PDF files, 
> due to a bug in PDFBox, see PDFBOX-2493. This makes Tika 1.6 unusable from 
> inside a long-running webapp and I've had to revert to Tika 1.5. Although 1.5 
> also throws errors with the corrupt file it does not cause OOM errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)