[ 
https://issues.apache.org/jira/browse/PDFBOX-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376727#comment-14376727
 ] 

Tilman Hausherr commented on PDFBOX-2723:
-----------------------------------------

I'm reverting the 1.8 change; a test with my files shows about 30 of them 
failing with this exception:
{code}
SEVERE: Error converting file annots.pdf
java.io.IOException: Missing root object specification in trailer.
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:391)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:887)
        at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1273)
        at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1256)
        at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:198)
        at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:349)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at junit.framework.TestCase.runTest(TestCase.java:168)
        at junit.framework.TestCase.runBare(TestCase.java:134)
        at junit.framework.TestResult$1.protect(TestResult.java:110)
        at junit.framework.TestResult.runProtected(TestResult.java:128)
        at junit.framework.TestResult.run(TestResult.java:113)
        at junit.framework.TestCase.run(TestCase.java:124)
        at junit.framework.TestSuite.runTest(TestSuite.java:232)
        at junit.framework.TestSuite.run(TestSuite.java:227)
        at junit.textui.TestRunner.doRun(TestRunner.java:116)
        at junit.textui.TestRunner.start(TestRunner.java:180)
        at junit.textui.TestRunner.main(TestRunner.java:138)
        at org.apache.pdfbox.util.TestPDFToImage.main(TestPDFToImage.java:399)
{code}

> PDFBox*.tmp files not deleted by COSParser 
> -------------------------------------------
>
>                 Key: PDFBOX-2723
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2723
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>         Environment: Windows and Linux, with issue being critical on Linux
>            Reporter: Pascal Essiembre
>            Assignee: Tilman Hausherr
>              Labels: patch
>             Fix For: 2.0.0
>
>         Attachments: pdfbox.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> When parsing PDFs, temporary files get created under the system temp 
> directory (e.g. PDFBox6525369863339991063.tmp).  All but one of the files 
> created for each document are deleted, so each document parsed leaves behind 
> a tmp file that never gets deleted.  That's likely due to a stream that is 
> never closed.  When processing many PDFs on Linux in the same JVM instance, 
> we get the fatal error "Too many files open".  Raising the OS limit on open 
> file handles is not always an option.
> I was able to fix this by modifying the {{COSParser}} class to close a 
> {{COSStream}} instance:
> {code:title=COSParser.java, starting on line 312|borderStyle=solid}
>     private long parseXrefObjStream(long objByteOffset, boolean isStandalone) throws IOException
>     {
>         // ---- parse indirect object head
>         readObjectNumber();
>         readGenerationNumber();
>         readExpectedString(OBJ_MARKER, true);
>         COSDictionary dict = parseCOSDictionary();
>         COSStream xrefStream = parseCOSStream(dict);
>         parseXrefStream(xrefStream, (int) objByteOffset, isStandalone);
>         xrefStream.close();  // <--- *** NEW LINE ***
>         return dict.getLong(COSName.PREV);
>     }
> {code}
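
One thing to keep in mind with the quoted patch: if parseXrefStream throws, the new close() call is skipped and the temp file would presumably still be left behind. Below is a minimal sketch of the usual try/finally guard. It is not PDFBox code: TempBackedStream, parse and process are hypothetical names invented for this illustration, and the sketch only assumes that closing the stream is what releases and removes the temp file.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class TempFileCloseSketch {

    /** Hypothetical stand-in for a stream object that buffers its data in a temp file. */
    static final class TempBackedStream implements Closeable {
        private final File file;
        private final OutputStream out;

        TempBackedStream() throws IOException {
            file = File.createTempFile("Sketch", ".tmp");
            out = new FileOutputStream(file);
        }

        File getFile() {
            return file;
        }

        void write(byte[] data) throws IOException {
            out.write(data);
        }

        /** Releases the file handle first, then deletes the temp file. */
        public void close() throws IOException {
            out.close();
            if (file.exists() && !file.delete()) {
                throw new IOException("Could not delete " + file);
            }
        }
    }

    /** Hypothetical parse step; optionally fails to simulate a broken document. */
    static void parse(TempBackedStream stream, boolean fail) throws IOException {
        stream.write(new byte[] {1, 2, 3});
        if (fail) {
            throw new IOException("simulated parse failure");
        }
    }

    /** Runs one parse and returns the temp file that was used, so callers can inspect it. */
    static File process(boolean fail) {
        File tempFile = null;
        try {
            TempBackedStream stream = new TempBackedStream();
            tempFile = stream.getFile();
            try {
                parse(stream, fail);
            } finally {
                stream.close(); // runs whether parse() returned or threw
            }
        } catch (IOException e) {
            System.out.println("parse failed: " + e.getMessage());
        }
        return tempFile;
    }

    public static void main(String[] args) {
        System.out.println("temp file left after success: " + process(false).exists());
        System.out.println("temp file left after failure: " + process(true).exists());
    }
}
```

try/finally is used here rather than try-with-resources so the sketch also compiles on older Java versions.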



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
