[ 
https://issues.apache.org/jira/browse/PDFBOX-5962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929358#comment-17929358
 ] 

Rune Flobakk commented on PDFBOX-5962:
--------------------------------------

[~msahyoun], I suspect [~tilman] has actually already mentioned the pattern:
{quote}Maybe this was done by "macOS Version 15.3.1 (Build 24D70) Quartz 
PDFContext"
{quote}
I have experienced a lot of problems in the past with PDFs output by the 
Quartz implementation (typically macOS Preview), but only when they were 
processed by a 3rd party service, where it is difficult to get any insight 
into what is happening, though I do know that the 3rd party processing also 
involves flattening. This is more or less the first time I have attempted any 
actual processing of PDFs myself. It didn't occur to me this time that using 
Preview (i.e. Quartz) to fill out the form could be a factor; there are so 
many moving parts 😅

I would think that acquiring such forms from various agencies, filling them 
out in macOS Preview (the default viewer on macOS), and saving them is a 
pretty common pattern.

Thank you for the very swift response! I may have limited time to follow up 
further during the weekend, but should be able to give this more dedicated 
attention from Monday :) 

> Saving PDDocument with flattened form retains fields
> ----------------------------------------------------
>
>                 Key: PDFBOX-5962
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5962
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 3.0.4 PDFBox
>         Environment: Java 21
>            Reporter: Rune Flobakk
>            Priority: Major
>         Attachments: form-problem.pdf
>
>
> I believe I may have found a bug or at least a certain change in behavior 
> introduced in v3.0.4.
> Some PDAcroForms seem to somehow retain their fields when the PDDocument is 
> saved after flattening the form. {{PDAcroForm.getFields()}} is an empty list 
> after flattening (as I believe is expected), but after saving the 
> {{PDDocument}} and re-reading the saved file, {{PDAcroForm.getFields()}} 
> contains the fields the form had before it was flattened. Opening the saved 
> file in a PDF viewer also shows the form as editable.
> The flattening works as expected in v3.0.3, and the form becomes non-editable 
> with the values displayed as expected.
> I notice that for the particular PDF I am testing with, there is a lot of 
> logging like this in v3.0.4 when invoking {{.flatten()}}:
> {code:java}
> WARN missing /P entry (page reference) in a widget for field: ... {code}
> So there are apparently some issues with the particular PDF, though it worked 
> as expected in v3.0.3. I see this logging was introduced here: 
> [https://github.com/apache/pdfbox/commit/e49649ae89c913058c1be79bec6b4f561fc1f0b6]
>  which is part of PDFBOX-5225. The {{.flatten()}} invocation succeeds in both 
> versions, but the flattening operation does not seem to take effect in the 
> saved PDF file when using v3.0.4.
> I have made a small project demonstrating the problem here:
> [https://github.com/runeflobakk/pdfbox-flatten-form-save-issue]
> There is a {{FlattenFormTest}} JUnit test demonstrating the process for both 
> a problematic PDF and one which works as expected. Changing the pdfbox 
> dependency version to 3.0.3 makes both tests pass. The saved files appear in 
> the target directory for inspection.
> Thank you, and please let me know if there are any details I may have left 
> out!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

