[
https://issues.apache.org/jira/browse/PDFBOX-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347628#comment-16347628
]
Joe Masinter commented on PDFBOX-4066:
--------------------------------------
I agree there is definitely a use case where you want merged fields with
duplicate names not to be renamed, but instead merged into the same field
entry with multiple widgets. It seems the best way to approach this would be
an option on the PDFMergerUtility or maybe an extended class. I've tinkered
with the code to try to accomplish this, but I'm not familiar enough with the
document structure or PDFBox in general.
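
To make the two behaviours concrete, here is a toy sketch in plain Java. It
does not use PDFBox and is not how the merger is implemented: the
FieldMergeSketch class, the MergeMode enum and the "_dummy" counter suffix are
all made up for illustration, and a form is modelled simply as a map from field
name to a list of widget labels.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model only: a "form" is a map of field name -> widget labels.
final class FieldMergeSketch {

    enum MergeMode {
        RENAME_DUPLICATES, // a clashing source field gets a new, unique name
        JOIN_BY_NAME       // a clashing source field adds its widgets to the existing field
    }

    static Map<String, List<String>> merge(Map<String, List<String>> dest,
                                           Map<String, List<String>> src,
                                           MergeMode mode) {
        // Copy the destination so that neither input is modified.
        Map<String, List<String>> result = new LinkedHashMap<>();
        dest.forEach((name, widgets) -> result.put(name, new ArrayList<>(widgets)));

        int counter = 0;
        for (Map.Entry<String, List<String>> entry : src.entrySet()) {
            String name = entry.getKey();
            if (mode == MergeMode.RENAME_DUPLICATES && result.containsKey(name)) {
                // Invent a name that clashes with nothing in either form.
                String renamed;
                do {
                    renamed = name + "_dummy" + (++counter);
                } while (result.containsKey(renamed) || src.containsKey(renamed));
                name = renamed;
            }
            // In JOIN_BY_NAME mode an existing entry simply collects more widgets.
            result.computeIfAbsent(name, k -> new ArrayList<>()).addAll(entry.getValue());
        }
        return result;
    }
}
```

With RENAME_DUPLICATES, two forms that both contain field_a end up as two
separate fields; with JOIN_BY_NAME they end up as one field_a entry holding
the widgets from both documents, which is the behaviour I am after.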
> Merging documents with nested fields duplicates child fields
> ------------------------------------------------------------
>
> Key: PDFBOX-4066
> URL: https://issues.apache.org/jira/browse/PDFBOX-4066
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm, Utilities
> Affects Versions: 2.0.8
> Reporter: Al Phaba
> Assignee: Maruan Sahyoun
> Priority: Major
> Fix For: 2.0.9, 3.0.0 PDFBox
>
> Attachments: TestForm-flattened.pdf, TestForm-merged.pdf,
> TestForm.pdf, flattenAndMerge.pdf
>
>
> I have a PDF with a lot of AcroForm fields. I do some manipulation on it,
> which results in a new PDF. So I have PDF-1 (the original) and PDF-2 (just a
> duplicate of PDF-1), and now I want to merge them. Both PDFs have some
> AcroForm fields, for example: field_a, field_2...
> Before I merge them I flatten PDF-1, because I only want to have the form
> fields from PDF-2. When I then check my new merged PDF, I can see that there
> are no visible fields on the pages from PDF-1 and there are fields on the
> pages from PDF-2. At first glance it seems OK, but when I inspect the fields
> I can see that the merger has renamed all the fields for PDF-2, e.g.
> field_a_dummy123, field_b_dummy232 ...
> It seems to me that flattening does not remove the fields, and that's why
> the PDFMerger from PDFBox renames the fields for PDF-2, because field names
> need to be unique. Another guess was that there is a bug in mergeAcroForm().
>
> {code:java}
> @Test
> public void flattenAndMerge() throws IOException {
>     // classLoader is assumed to be defined in the enclosing test class.
>     File testForm = new File(classLoader.getResource("./TestForm.pdf").getFile());
>     byte[] testFormAsByte = Files.readAllBytes(testForm.toPath());
>     byte[] testFormAsByte2 = Files.readAllBytes(testForm.toPath());
>
>     // Flatten the first copy and save it to a temporary file.
>     Path flattendedPdf = Files.createTempFile("flatten", ".pdf");
>     try (PDDocument pdf1 = PDDocument.load(testFormAsByte)) {
>         PDAcroForm acroform = pdf1.getDocumentCatalog().getAcroForm();
>         acroform.flatten();
>         pdf1.save(flattendedPdf.toFile());
>     }
>
>     // Merge the flattened copy with the untouched second copy.
>     PDFMergerUtility merger = new PDFMergerUtility();
>     merger.addSource(new ByteArrayInputStream(Files.readAllBytes(flattendedPdf)));
>     merger.addSource(new ByteArrayInputStream(testFormAsByte2));
>     merger.setDestinationFileName("./build/flattenAndMerge.pdf");
>     merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
> }
> {code}
> Here is my Stack Overflow question:
> [https://stackoverflow.com/questions/48271924/pdfbox-flatten-pdf-does-not-remove-acroform-elements?noredirect=1#comment83544858_48271924]
>
>