[
https://issues.apache.org/jira/browse/PDFBOX-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Timo Boehme reassigned PDFBOX-1099:
-----------------------------------
Assignee: Timo Boehme
While NonSequentialPDFParser works correctly, the issue still exists with
PDFParser. Furthermore, we have to use XREF information to decide whether an
object in an object stream is referenced by the XREF table or not (currently,
if an object already exists, the object from the object stream is skipped).
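
A minimal sketch of the decision described above, assuming the XREF data has
already been reduced to a map from object number to the number of the object
stream that holds it. The class, method, and parameter names are hypothetical
and are not part of the PDFBox API; this only illustrates one possible reading
of the check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not PDFBox code.
final class ObjectStreamFilterSketch {

    /**
     * Returns the object numbers found in one object stream that the XREF
     * data actually assigns to that stream. Objects that are missing from
     * the map, or that the map assigns to a different stream, are dropped.
     *
     * @param streamObjNr     object number of the object stream being parsed
     * @param objNrsInStream  object numbers listed in that stream's header
     * @param xrefObjToStream assumed map: object number -> containing stream
     */
    static List<Long> referencedObjects(long streamObjNr,
                                        List<Long> objNrsInStream,
                                        Map<Long, Long> xrefObjToStream) {
        List<Long> keep = new ArrayList<>();
        for (Long objNr : objNrsInStream) {
            Long owner = xrefObjToStream.get(objNr);
            // Keep the object only if XREF says it lives in this stream.
            if (owner != null && owner == streamObjNr) {
                keep.add(objNr);
            }
        }
        return keep;
    }
}
```

With this check the result no longer depends on whether an object with the
same number happens to exist already; only the XREF assignment decides.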
> Only parsing object streams if they are referenced by the xref table / stream
> -----------------------------------------------------------------------------
>
> Key: PDFBOX-1099
> URL: https://issues.apache.org/jira/browse/PDFBOX-1099
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Reporter: Thomas Chojecki
> Assignee: Timo Boehme
>
> Some PDF documents have object streams but don't reference them through the
> xref table / stream. To prevent the stream parser from dereferencing such
> object streams, we need to implement the type 2 part (case 2) inside the
> PDFXRefStreamParser and store the objects inside a map. This will take some
> load off the stream parser (see PDFBOX-1098) and cause fewer failures while
> parsing a document.
> A sample PDF can be obtained from issue PDFBOX-1098, and a patch is coming
> soon.
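
For illustration, a rough sketch of the "type 2" handling mentioned in the
description. In a cross-reference stream, an entry whose first field is 2
describes a compressed object: its second field is the object number of the
containing object stream and its third field is the index inside that stream.
The sketch assumes the entries have already been decoded into rows of three
numbers; the class and method names are hypothetical and this is not the
actual PDFXRefStreamParser code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not PDFBox code.
final class XrefType2Sketch {

    /**
     * Collects the type 2 (compressed object) entries of a decoded
     * cross-reference stream subsection.
     *
     * @param firstObjNr object number of the first row in the subsection
     * @param rows       decoded entries, each {type, field2, field3}
     * @return map: object number -> number of the containing object stream
     */
    static Map<Long, Long> collectCompressedObjects(long firstObjNr, long[][] rows) {
        Map<Long, Long> objToStream = new HashMap<>();
        for (int i = 0; i < rows.length; i++) {
            long[] row = rows[i];
            if (row[0] == 2) {
                // field2 = object stream number; field3 (the index) is ignored here
                objToStream.put(firstObjNr + i, row[1]);
            }
            // type 0 (free) and type 1 (uncompressed) entries are skipped
        }
        return objToStream;
    }
}
```

An object stream whose number never appears as a value in the resulting map
is not referenced by the xref stream and would not need to be parsed.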