[ https://issues.apache.org/jira/browse/PDFBOX-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Costermans updated PDFBOX-2015:
-----------------------------------

    Attachment: XRefStm_not_updated.patch
                Word2010.pdf
                modified_Word2010.pdf

> Hybrid reference pdf still contains XRefStm info in the trailer dictionary 
> after PDDocument#save
> ----------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2015
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2015
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.8.4
>            Reporter: Tim Costermans
>         Attachments: Word2010.pdf, XRefStm_not_updated.patch, 
> modified_Word2010.pdf
>
>
> From: Tim Costermans [mailto:[email protected]] 
> Sent: Monday, 31 March 2014 12:57
> To: [email protected]
> Subject: RE: PDFBox 1.8.4 and pdf's generated by MS Word
> Hello,
> I’ve written a test case that reproduces the issue (see the attached patch).
> Could someone have a look at it and give me some pointers on how to solve 
> it? I applied the patch to the 1.8.4 tag, which I checked out locally.
> I don’t know the PDF spec, so I don’t know how to fix this in the PDFBox 
> source code.
> Word2010.pdf is the input PDF. I open the document with PDFBox and add a 
> string to it, in this case ‘Hello world!’.
> Afterwards I save the PDF. 
> Comparing the content of the PDF before and after the modification (using 
> Notepad++), I see this:
> Word2010.pdf:
> Line 647: <</Size 18/Root 1 0 R/Info 7 0 
> R/ID[<AE9AF29D5A22AE47B47C4DA29170BE64><AE9AF29D5A22AE47B47C4DA29170BE64>] 
> /Prev 81972/XRefStm 81702>>
> modified_Word2010.pdf:
> Line 791: /XRefStm 81702
> XRefStm is not updated, although the original PDF had multiple revisions 
> that were merged into a new PDF document.
> A third-party library we use depends on this XRefStm value and cannot open 
> the PDF after it was modified (for the stack trace, see my previous message).
> Any help would be much appreciated.
> Kind regards,
> Tim Costermans
> Hi Tim,
> That’s a bug. 
> Explanation: The original file uses what’s called a hybrid reference. That’s 
> for compatibility with readers which do not support compressed reference 
> streams. The file generated by PDFBox doesn’t use hybrid references any more 
> but still contains the XRefStm info in the trailer dictionary.
> Could you file an issue at https://issues.apache.org/jira/browse/PDFBOX?
> BR
> Maruan
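
The manual comparison described above (looking for /XRefStm in the trailer with a text editor) can be automated. The following is a minimal sketch using only the Java standard library, not PDFBox code and not the fix for this issue. The class name XRefStmCheck and the method findXRefStm are hypothetical, and the sample trailer is an abbreviated form of the one quoted from Word2010.pdf. It assumes the trailer text is available as a string, for example a file tail decoded as ISO-8859-1.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports whether trailer text still carries an /XRefStm entry.
public class XRefStmCheck {
    // "/XRefStm", optional whitespace, then the decimal byte offset of the stream.
    private static final Pattern XREF_STM = Pattern.compile("/XRefStm\\s*(\\d+)");

    /** Returns the offset of the last /XRefStm entry found in the text, if any. */
    public static OptionalLong findXRefStm(String text) {
        Matcher m = XREF_STM.matcher(text);
        OptionalLong result = OptionalLong.empty();
        while (m.find()) {
            // Keep the last match, since the final trailer comes last in the file.
            result = OptionalLong.of(Long.parseLong(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Abbreviated trailer from Word2010.pdf (the /ID entry is omitted here).
        String trailer = "<</Size 18/Root 1 0 R/Info 7 0 R/Prev 81972/XRefStm 81702>>";
        System.out.println(findXRefStm(trailer));
    }
}
```

A present value for a file saved by PDFBox would indicate the stale entry described in this issue. The check is purely textual, so it could also match the same byte sequence inside unrelated content.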



--
This message was sent by Atlassian JIRA
(v6.2#6252)
