[
https://issues.apache.org/jira/browse/PDFBOX-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16843526#comment-16843526
]
Tilman Hausherr commented on PDFBOX-4541:
-----------------------------------------
COSWriter is definitely too complex (but I don't have a better concept), and
your linearization preparation code would probably have to "predict" all the
optimizations done there. And any changes in COSWriter would probably break
your application. Maybe dependency injection would be a solution? I.e. that
PDDocument has a default COSWriter, but that users could also pass their own
before calling save(). Maybe have a base COSWriter that provides some very
simple classes only. If we can't come up with any, then just an interface.
I doubt that any changes made in parseDirObject() are the solution for
anything that happens much later. You are only thinking about
linearization, but what about code that loads a PDF, makes some changes, and
then saves? (Besides, you could just extend parseDirObject yourself in your
own parser, as is done in preflight.)
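
To make the dependency-injection idea concrete, here is a minimal sketch. It does not use the PDFBox API: DocumentWriter, DefaultWriter, Document and setWriter() are hypothetical names, with Document merely standing in for PDDocument and DefaultWriter for the built-in COSWriter. It only shows the shape of "default writer, replaceable before save()".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

public class WriterInjectionSketch {

    /** Hypothetical writer abstraction; not an existing PDFBox type. */
    interface DocumentWriter {
        void write(String content, OutputStream out) throws IOException;
    }

    /** Hypothetical default writer, standing in for the built-in one. */
    static class DefaultWriter implements DocumentWriter {
        @Override
        public void write(String content, OutputStream out) throws IOException {
            out.write(content.getBytes(StandardCharsets.US_ASCII));
        }
    }

    /** Hypothetical document class, standing in for PDDocument. */
    static class Document {
        private final String content;
        // A default writer is used unless the caller injects another one.
        private DocumentWriter writer = new DefaultWriter();

        Document(String content) {
            this.content = content;
        }

        /** Lets callers pass their own writer before calling save(). */
        void setWriter(DocumentWriter writer) {
            this.writer = Objects.requireNonNull(writer);
        }

        void save(OutputStream out) throws IOException {
            writer.write(content, out);
        }
    }

    /** Helper that saves into memory and returns the bytes as text. */
    static String saveToString(Document doc) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        doc.save(out);
        return out.toString("US-ASCII");
    }

    public static void main(String[] args) throws IOException {
        Document doc = new Document("body");
        System.out.println(saveToString(doc)); // uses the default writer

        // Inject a custom writer; the document class itself is unchanged.
        doc.setWriter((content, out) ->
                out.write(("custom:" + content).getBytes(StandardCharsets.US_ASCII)));
        System.out.println(saveToString(doc));
    }
}
```

With this shape, an application that needs special output (such as linearization) would own its writer and would not depend on internals of the default one.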
> Incorrect? handling of direct/indirect objects
> ----------------------------------------------
>
> Key: PDFBOX-4541
> URL: https://issues.apache.org/jira/browse/PDFBOX-4541
> Project: PDFBox
> Issue Type: Wish
> Components: Parsing, Writing
> Affects Versions: 2.0.14
> Reporter: Jonathan
> Priority: Major
> Attachments: broken_censored.pdf, linearized.pdf,
> linearized_withfix.pdf
>
>
> We ran into some issues concerning blank pages in some of our resulting PDF
> documents. Investigation showed that some objects which were referenced were
> never actually written. We then noticed that these objects were never written
> because they missed the `isDirect` flag. We were able to mitigate this issue
> by adding
> {code:java}
> if (retval != null) {
>     retval.setDirect(true);
> }
> return retval;
> {code}
> at the end of `BaseParser.parseDirObject()`.
> While the PDFs were now displayed correctly, QPDF's check reported erroneous
> hint tables. The offsets there were calculated incorrectly because the
> objects were now written not only once, but, in fact, several times in places
> where they should have been merely referenced. We eventually resolved this
> issue by replacing the if-condition
> {code:java}
> if (willEncrypt || incrementalUpdate || subValue instanceof COSDictionary ||
> subValue == null)
> {code}
> in `COSWriter.visitFromArray(COSArray)` and
> `COSWriter.visitFromDictionary(COSDictionary)` with
> {code:java}
> if (willEncrypt || incrementalUpdate || subValue == null ||
>     !(subValue instanceof COSObject))
> {code}
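
For anyone comparing the two conditions quoted above, here is a minimal sketch that evaluates both with willEncrypt and incrementalUpdate set to false. The types Base, Dict, Arr and Ref are stand-ins invented for this example, not the PDFBox classes; Ref merely plays the role of COSObject and Dict that of COSDictionary.

```java
public class ConditionComparison {

    // Stand-in types for illustration only; NOT the PDFBox classes.
    static class Base {}
    static class Dict extends Base {}  // plays the role of COSDictionary
    static class Arr extends Base {}   // any other non-wrapper value
    static class Ref extends Base {}   // plays the role of COSObject

    /** The original condition, transcribed onto the stand-in types. */
    static boolean originalCondition(boolean willEncrypt, boolean incrementalUpdate,
                                     Base subValue) {
        return willEncrypt || incrementalUpdate
                || subValue instanceof Dict || subValue == null;
    }

    /** The proposed replacement, transcribed onto the stand-in types. */
    static boolean replacementCondition(boolean willEncrypt, boolean incrementalUpdate,
                                        Base subValue) {
        return willEncrypt || incrementalUpdate
                || subValue == null || !(subValue instanceof Ref);
    }

    public static void main(String[] args) {
        Base[] samples = { null, new Dict(), new Arr(), new Ref() };
        for (Base s : samples) {
            String name = (s == null) ? "null" : s.getClass().getSimpleName();
            System.out.println(name
                    + ": original=" + originalCondition(false, false, s)
                    + ", replacement=" + replacementCondition(false, false, s));
        }
    }
}
```

With these stand-ins, the two predicates disagree only for a non-null sub-value that is neither a dictionary nor a wrapper (Arr here): the original evaluates to false and the replacement to true.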