Dave Hill created PDFBOX-4007:
---------------------------------
Summary: Merged documents don't retain tags
Key: PDFBOX-4007
URL: https://issues.apache.org/jira/browse/PDFBOX-4007
Project: PDFBox
Issue Type: Bug
Components: Utilities
Affects Versions: 2.0.8
Reporter: Dave Hill
Priority: Minor
Attachments: Tagged.pdf, pdfbox.patch
Certain combinations of documents don't retain tags when merged. The document
[^Tagged.pdf] is just a basic one-word PDF created and tagged with Acrobat Pro DC. If
you try to merge this with the government [General Forbearance
form|https://studentloans.gov/myDirectLoan/downloadForm.action?searchType=library&shortName=general&localeCode=en-us]
the output crashes Acrobat DC when you try to view the tags. If you use a flattened
version of the General Forbearance form, the tags don't crash the viewer but come out garbled.
{code}
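import java.io.File;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;

// Class name is arbitrary; only the body of main() comes from the report.
public class MergeTagsRepro {
    public static void main(String[] args) throws Exception {
        PDFMergerUtility pdfMergerUtility = new PDFMergerUtility();
        // Tagged.pdf is the attachment; GeneralForbearance.pdf is the linked form saved locally.
        PDDocument src = PDDocument.load(new File("Tagged.pdf"));
        PDDocument dest = PDDocument.load(new File("GeneralForbearance.pdf"));
        pdfMergerUtility.appendDocument(dest, src);
        src.close();
        dest.save(new File("BrokenTags.pdf"));
        dest.close();
    }
}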
{code}
The included patch appears to make tagging more reliable, but I'm still relying
heavily on cloning, which can apparently cause other issues. The documents I
get out with this code seem to present correctly in Adobe readers for all
combinations of documents that I tested.
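
For readers unfamiliar with why merging tagged documents is fragile, here is a toy sketch in plain Java. It is not PDFBox code and not what the attached patch does; the class and method names are made up for illustration. In a tagged PDF, each page's /StructParents value is a key into the /ParentTree number tree under the structure tree root, so if two documents both use key 0, the source's entries have to be shifted before they are added to the destination. Whether that is what goes wrong in the case above is not established here.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a TreeMap stands in for the /ParentTree number tree.
// None of these names come from PDFBox or from the attached patch.
class ParentTreeMergeSketch {

    // Copies every source entry into dest, shifting each key by nextKey so it
    // cannot collide with an existing destination key (assumes source keys
    // are non-negative). Returns the next free key after the merge.
    static int mergeParentTrees(TreeMap<Integer, String> dest,
                                TreeMap<Integer, String> src,
                                int nextKey) {
        int maxKey = nextKey - 1;
        for (Map.Entry<Integer, String> e : src.entrySet()) {
            int shifted = e.getKey() + nextKey;
            dest.put(shifted, e.getValue());
            maxKey = Math.max(maxKey, shifted);
        }
        return maxKey + 1;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> dest = new TreeMap<>();
        dest.put(0, "form page 1");
        dest.put(1, "form page 2");
        TreeMap<Integer, String> src = new TreeMap<>();
        src.put(0, "tagged page 1");

        // 2 plays the role of the destination's /ParentTreeNextKey.
        int next = mergeParentTrees(dest, src, 2);
        System.out.println(dest + " next=" + next);
    }
}
```

Without the shift, the source's key 0 would overwrite the destination's key 0, and the page content would then point at the wrong structure elements.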
My patch is made and tested against yesterday's production head, and it includes
my changes from [PDFBOX-3999|https://issues.apache.org/jira/browse/PDFBOX-3999],
since they are in the exact same place in the code.
This is a blocker for Section 508 compliance of merged documents, but I
guessed it to be more of a minor issue in the overall scheme of things. Please
correct me if I am mistaken.
Thanks!