[ 
https://issues.apache.org/jira/browse/PDFBOX-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Chojecki closed PDFBOX-1042.
-----------------------------------

    Resolution: Duplicate

Already fixed in PDFBOX-1016.

> Wrong XRefStream order while parsing incremental updated PDF with XRefStreams
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-1042
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1042
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.5.0
>            Reporter: Thomas Chojecki
>            Priority: Critical
>
> A PDF can contain two types of XRef entries.
> Most files use XRefTables for object references.
> Web-optimized (linearized) PDF documents use XRefStreams. An XRefStream is a
> compressed XRefTable stored as a stream object. The PDFParser parses these
> objects the same way as other objects and puts them into an object pool (a
> HashMap). If the document was incrementally updated, it contains several
> XRefStreams, and all of them are put into the object pool.
> The XRefStreamParser then parses the XRefStreams and tries to fetch all
> XRefStream objects from that pool. The objects returned from the pool are not
> in the order in which they were read. As a result, in some cases an older
> object overwrites a newer one, so PDFBox cannot find the right objects and
> uses the older ones instead.
> If a user tries to parse such a document, the result is an indeterminate
> state in which older and newer objects are mixed.
> In my case, the document catalog was overwritten by an old one, and I could
> not see the changes that were made with the incremental update.
> A patch and a sample PDF will follow soon.
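
The ordering problem described above can be illustrated with a minimal sketch.
It is not PDFBox code and not the fix from PDFBOX-1016: the class
XRefMergeSketch and its merge method are hypothetical, and each xref section is
simplified to a map from object number to byte offset. The sketch assumes the
caller already knows the age order of the sections and passes them oldest
first.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class XRefMergeSketch {

    // Merge xref sections that are supplied oldest first (an assumption of
    // this sketch). A later section overrides an earlier entry for the same
    // object number, so the newest offset wins.
    static Map<Integer, Long> merge(List<Map<Integer, Long>> oldestFirst) {
        Map<Integer, Long> merged = new LinkedHashMap<>();
        for (Map<Integer, Long> section : oldestFirst) {
            merged.putAll(section);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Original revision: object 1 (say, the catalog) and object 2.
        Map<Integer, Long> original = new LinkedHashMap<>();
        original.put(1, 15L);
        original.put(2, 120L);

        // Incremental update: object 1 is rewritten at a new offset.
        Map<Integer, Long> update = new LinkedHashMap<>();
        update.put(1, 4000L);

        // Correct order: the updated object 1 is found.
        System.out.println(merge(Arrays.asList(original, update)).get(1)); // 4000

        // Wrong order: the old object 1 overwrites the new one.
        System.out.println(merge(Arrays.asList(update, original)).get(1)); // 15
    }
}
```

The second call reproduces the symptom from the report: when the sections are
applied in the wrong order, the stale entry for object 1 replaces the updated
one, while object 2 is unaffected.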

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
