[ https://issues.apache.org/jira/browse/PDFBOX-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376727#comment-14376727 ]
Tilman Hausherr commented on PDFBOX-2723:
-----------------------------------------
I'm reverting the 1.8 change; a test with my files shows about 30 of them
failing with this exception:
{code}
SEVERE: Error converting file annots.pdf
java.io.IOException: Missing root object specification in trailer.
at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:391)
at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:887)
at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1273)
at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1256)
at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:198)
at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:349)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at junit.textui.TestRunner.doRun(TestRunner.java:116)
at junit.textui.TestRunner.start(TestRunner.java:180)
at junit.textui.TestRunner.main(TestRunner.java:138)
at org.apache.pdfbox.util.TestPDFToImage.main(TestPDFToImage.java:399)
{code}
> PDFBox*.tmp files not deleted by COSParser
> -------------------------------------------
>
> Key: PDFBOX-2723
> URL: https://issues.apache.org/jira/browse/PDFBOX-2723
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.0
> Environment: Windows and Linux, with issue being critical on Linux
> Reporter: Pascal Essiembre
> Assignee: Tilman Hausherr
> Labels: patch
> Fix For: 2.0.0
>
> Attachments: pdfbox.patch
>
> Original Estimate: 5m
> Remaining Estimate: 5m
>
> When parsing PDFs, temporary files get created under the system temp
> directory (e.g. PDFBox6525369863339991063.tmp). All but one of the files
> created for each document are deleted, so each document parsed adds a
> new tmp file that never gets deleted. That's likely due to a stream never
> being closed. When processing many PDFs on Linux in the same JVM instance,
> we get the crashing error: "Too many files open". Raising the maximum
> number of open file handles on the OS is not always an option.
> I was able to fix this by modifying the {{COSParser}} class to close a
> {{COSStream}} instance:
> {code:title=COSParser.java, starting on line 312|borderStyle=solid}
> private long parseXrefObjStream(long objByteOffset, boolean isStandalone)
>         throws IOException
> {
>     // ---- parse indirect object head
>     readObjectNumber();
>     readGenerationNumber();
>     readExpectedString(OBJ_MARKER, true);
>     COSDictionary dict = parseCOSDictionary();
>     COSStream xrefStream = parseCOSStream(dict);
>     parseXrefStream(xrefStream, (int) objByteOffset, isStandalone);
>     xrefStream.close(); // <--- *** NEW LINE ***
>     return dict.getLong(COSName.PREV);
> }
> {code}
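
For illustration only, here is a minimal, self-contained sketch of the leak pattern described above and of one way to make the cleanup robust when parsing throws. It does not use PDFBox at all: {{TempBackedStream}}, {{TempFileCleanupSketch}} and {{leaksScratchFile}} are hypothetical names, and the assumption is simply that a stream object owns a scratch file that is only removed when the stream is closed. The try-with-resources block guarantees that {{close()}} runs even if the "parse" step fails, which a plain trailing {{close()}} call would not.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a stream object backed by a scratch file. */
final class TempBackedStream implements Closeable {
    private final Path scratchFile;
    private final OutputStream out;

    TempBackedStream() throws IOException {
        // Mirrors the naming in the report: PDFBox<random>.tmp in the temp dir.
        scratchFile = Files.createTempFile("PDFBox", ".tmp");
        out = Files.newOutputStream(scratchFile);
    }

    Path scratchFile() {
        return scratchFile;
    }

    void write(byte[] data) throws IOException {
        out.write(data);
    }

    @Override
    public void close() throws IOException {
        try {
            out.close(); // release the file handle
        } finally {
            Files.deleteIfExists(scratchFile); // remove the scratch file
        }
    }
}

final class TempFileCleanupSketch {
    /**
     * Simulates one parse. Returns true if the scratch file is still on disk
     * afterwards, i.e. if it leaked.
     */
    static boolean leaksScratchFile(boolean failDuringParse) {
        Path scratch = null;
        try (TempBackedStream stream = new TempBackedStream()) {
            scratch = stream.scratchFile();
            stream.write(new byte[] {1, 2, 3});
            if (failDuringParse) {
                throw new IOException("simulated parse failure");
            }
        } catch (IOException e) {
            // Swallowed only to keep the sketch short; close() has already
            // run by the time this handler executes.
        }
        return scratch != null && Files.exists(scratch);
    }

    public static void main(String[] args) {
        System.out.println("leak after normal parse: " + leaksScratchFile(false));
        System.out.println("leak after failed parse: " + leaksScratchFile(true));
    }
}
```

Whether the same structure fits {{parseXrefObjStream}} depends on how {{COSStream}} is used afterwards, so treat this as a sketch of the general idea rather than a proposed patch.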