W B created PDFBOX-2440:
---------------------------
Summary: xref stream is saved as table
Key: PDFBOX-2440
URL: https://issues.apache.org/jira/browse/PDFBOX-2440
Project: PDFBox
Issue Type: Bug
Components: Writing
Affects Versions: 1.8.7
Reporter: W B
When saving a PDDocument, PDFBox appears to always write a classic xref table,
even when the original file uses an xref stream.
To reproduce, load a PDF file (like the one attached) with PDDocument#load (or
PDDocument#loadNonSeq, same result) and then save it with PDDocument#save to
another file.
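To check which kind of cross-reference section a saved file ends up with, something like the following might help. It is only a rough sketch using plain Java, not PDFBox: the class name XrefKind and the method detect are hypothetical, and it relies on the PDF rule that the number after the last startxref keyword is the byte offset of either the xref keyword or the xref stream object. It does not confirm that the object found really has /Type /XRef.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
class XrefKind {
    // The last "startxref" keyword is followed by the byte offset of the xref section.
    private static final Pattern START_XREF = Pattern.compile("startxref\\s+(\\d+)");
    // Header of an indirect object, e.g. "12 0 obj".
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+\\s+\\d+\\s+obj");

    // Returns "table", "stream" or "unknown" for the most recent xref section.
    static String detect(byte[] pdf) {
        // ISO-8859-1 maps every byte to one char, so string indices equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int pos = text.lastIndexOf("startxref");
        if (pos < 0) {
            return "unknown";
        }
        Matcher m = START_XREF.matcher(text);
        if (!m.find(pos)) {
            return "unknown";
        }
        long offset;
        try {
            offset = Long.parseLong(m.group(1));
        } catch (NumberFormatException e) {
            return "unknown";
        }
        if (offset >= text.length()) {
            return "unknown";
        }
        String tail = text.substring((int) offset);
        if (tail.startsWith("xref")) {
            return "table";   // classic cross-reference table
        }
        if (OBJ_HEADER.matcher(tail).lookingAt()) {
            return "stream";  // heuristic: an indirect object sits at the offset
        }
        return "unknown";
    }
}
```

Running detect on the bytes of the original file and of the saved copy should give "stream" for the former and, with the behaviour described here, "table" for the latter.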
The problem seems to be in COSWriter#doWriteXRef. When doc#isXRefStream is
true, the xref entries should be encoded into a cross-reference stream object,
but instead they are written to the output one by one as plain table entries.
I think that branch should look more like its counterpart in
COSWriter#doWriteXRefInc.
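To illustrate the difference between the two forms, here is a minimal sketch in plain Java. It is not the COSWriter code: the class XrefEntrySketch and its methods are hypothetical, and the field widths W [1 4 2] are just an assumed example. A table entry is 20 bytes of text, while a stream entry for an in-use object is a type byte followed by the offset and generation as big-endian binary fields.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration, not part of PDFBox.
class XrefEntrySketch {
    // Classic table form: 10-digit offset, 5-digit generation, 'n', two-byte end of line.
    static byte[] tableEntry(long offset, int generation) {
        String line = String.format(Locale.ROOT, "%010d %05d n \n", offset, generation);
        return line.getBytes(StandardCharsets.US_ASCII);
    }

    // Stream form, assuming /W [1 4 2]: type 1 (in use), then offset, then generation.
    static byte[] streamEntry(long offset, int generation) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeField(out, 1, 1);
        writeField(out, offset, 4);
        writeField(out, generation, 2);
        return out.toByteArray();
    }

    // Writes the low 'width' bytes of 'value', most significant byte first.
    private static void writeField(ByteArrayOutputStream out, long value, int width) {
        for (int i = width - 1; i >= 0; i--) {
            out.write((int) ((value >>> (8 * i)) & 0xFF));
        }
    }
}
```

In a real xref stream the concatenated binary entries become the stream data (usually compressed), and the stream dictionary carries the keys that would otherwise be in the trailer.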
I changed doWriteXRef accordingly. The result seems to work for PDFs that have
never been incrementally updated, but it produces corrupt files when the PDF
has already been incrementally updated :(