Christian Appl created PDFBOX-5518:
--------------------------------------
Summary: "Threads" array in Document Catalog should be an indirect
reference
Key: PDFBOX-5518
URL: https://issues.apache.org/jira/browse/PDFBOX-5518
Project: PDFBox
Issue Type: Bug
Components: PDModel
Affects Versions: 2.0.26
Reporter: Christian Appl
Attachments: image-2022-09-23-09-50-30-766.png,
image-2022-09-23-10-03-15-070.png
*TL;DR:*
When a document is saved after calling either "getThreads" or "setThreads" on
class PDDocumentCatalog, Adobe Preflight reports an issue with the resulting
"Threads" array in the document catalog: it claims the entry should have been
an indirect object reference instead of a direct object.
My claim: The COSWriter should be able to create indirect objects for COSArrays
when required.
*Checking PDF-32000-1:*
In table 28 "Entries in the catalog dictionary" we can find the following
definition:
!image-2022-09-23-09-50-30-766.png!
*Determining reasons:*
1. The mentioned get and set methods create a COSArray for the entry "Threads"
of the catalog dictionary
2. The COSWriter assumes that COSArrays should always preferably be
written as a direct substructure of a dictionary.
This may be entirely true for other arrays, but in this case it causes a
syntactical error in the resulting documents. (It is plausible, but has not
been checked, that this causes issues for other structures as well.)
The COSWriter provides the means to create indirect objects for
COSDictionaries. However, as far as I can see, it does not provide the means to
flag a COSArray for the same handling.
*Possible solutions:*
As far as I can see, the COSWriter would be entirely capable of creating
COSObjects for any of the COSBase types. The only things missing are the
ability to mark a COSArray to be written indirectly and the matching handling
in the COSWriter.
Adding something like:
!image-2022-09-23-10-03-15-070.png!
at the right places in the COSWriter (similar to the handling of indirect
COSDictionaries) seems to do the trick and resolves the issue.
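To illustrate the idea, here is a minimal, self-contained sketch. It is a toy
model and not the patch from the screenshot, and it does not use PDFBox: the
names ToyArray, ToyWriter and catalogEntry are all hypothetical. It only shows
how a per-array "direct" flag could let a writer choose between emitting the
array inline and emitting a reference to a separately stored object.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only. None of these types exist in PDFBox. */
final class IndirectArraySketch {

    /** Hypothetical stand-in for an array that carries a "direct" flag. */
    static final class ToyArray {
        final List<String> items = new ArrayList<>();
        boolean direct = true; // assumed default: serialize inline
    }

    /** Hypothetical stand-in for a writer that collects indirect objects. */
    static final class ToyWriter {
        final List<String> indirectObjects = new ArrayList<>();

        /** Returns the text emitted as the dictionary value for the array. */
        String writeValue(ToyArray array) {
            String body = "[" + String.join(" ", array.items) + "]";
            if (array.direct) {
                return body; // direct object: written inline
            }
            // Indirect object: store the body separately, emit a reference.
            indirectObjects.add(body);
            int objectNumber = indirectObjects.size();
            return objectNumber + " 0 R";
        }
    }

    /** Builds a toy "/Threads" catalog entry with one thread reference. */
    static String catalogEntry(boolean direct) {
        ToyArray threads = new ToyArray();
        threads.items.add("5 0 R"); // made-up reference to a thread dictionary
        threads.direct = direct;
        return "/Threads " + new ToyWriter().writeValue(threads);
    }
}
```

With the flag left at its default, catalogEntry(true) produces an inline array
as the entry value. With the flag cleared, catalogEntry(false) produces an
object reference instead, which is the form Preflight expects for "Threads".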
*Important issue?:*
I fixed this on our end, so it is not a pressing issue. Also, "Threads" is
not as important or common as other structures, so most documents and
users won't encounter this issue at all.
However, it would be nice if this were fixed.
*Concerning a possible patch:*
I could provide a patch making the required changes, but I would have to adapt
it for the current PDFBox 2.0.27-SNAPSHOT, as I developed it as a hotfix for
our mirror of the library.
And concerning that patch I should mention:
As can be assumed, an "isDirectArray" and a "setDirectArray" method have been
added to COSArray. This is a quick and dirty solution, as it would be
preferable for COSArray to use the already existing "direct" field that other
COSBase types (COSDictionaries) already use.
As stated, the solution is quick and dirty, and for a final solution in the
PDFBox library a cleaner approach would be preferable. Hence I did not provide
that patch for now.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)