[ https://issues.apache.org/jira/browse/PDFBOX-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627684#comment-16627684 ]
Tilman Hausherr commented on PDFBOX-4323:
-----------------------------------------
I ran this code:
{code:java}
PDDocument doc = PDDocument.load(
        new URL("https://issues.apache.org/jira/secure/attachment/12941226/fda-form-356h-Scrubbed.pdf").openStream());
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
for (PDField field : acroForm.getFieldTree())
{
    List<PDAnnotationWidget> widgets = field.getWidgets();
    PDAnnotationWidget widget = widgets.get(0);
    if (widget != null)
    {
        // indexOf() returns -1 when the widget's /P entry is not in the page tree
        int pageNo = doc.getPages().indexOf(widget.getPage());
        if (pageNo < 0)
        {
            System.out.println(field.getFullyQualifiedName());
        }
    }
}{code}
and got many results, e.g.
db_ind_rare_disease_desg_10_y
db_ind_rare_disease_desg_10_n
db_ind_rare_disease_desg_10
I could not find these fields on any page; on page 4 the numbering only goes
up to 9. The field can be found at {{Root/AcroForm/Fields/[176]}}. The /P
dictionary there is not a page, it is a "Template", so it does not count as a
page. At the bottom of page 4 I found "Add Second Continuation Page for #15",
so I assume this is some sort of dynamic PDF, i.e. one that adds pages.
So I added this line:
{code:java}
System.out.println(widget.getPage().getCOSObject().getItem(COSName.TYPE));{code}
It turned out that for every widget with a page index < 0, the /P dictionary
is of type Template.
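To illustrate why the lookup yields -1 for these widgets, here is a minimal
plain-Java model of the situation. It is not PDFBox code: dictionaries are
modelled as maps, and the class WidgetPageModel with its methods dict,
pageIndex and describe are hypothetical names made up for this sketch. It
assumes that a widget's /P entry is matched against the page tree by object
identity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, NOT the PDFBox API: a dictionary is a Map and
// "/P" holds a reference to another dictionary.
class WidgetPageModel {

    /** Builds a dictionary that carries only a /Type entry. */
    static Map<String, Object> dict(String type) {
        Map<String, Object> d = new HashMap<>();
        d.put("Type", type);
        return d;
    }

    /** Returns the 0-based index of the widget's /P target in the page tree, or -1. */
    static int pageIndex(List<Map<String, Object>> pageTree, Map<String, Object> widget) {
        Object p = widget.get("P");
        for (int i = 0; i < pageTree.size(); i++) {
            if (pageTree.get(i) == p) { // identity comparison, like an object reference
                return i;
            }
        }
        return -1;
    }

    /** Describes the /P target: "page N" (1-based), its /Type value, or "unknown". */
    static String describe(List<Map<String, Object>> pageTree, Map<String, Object> widget) {
        int index = pageIndex(pageTree, widget);
        if (index >= 0) {
            return "page " + (index + 1);
        }
        Object p = widget.get("P");
        if (p instanceof Map) {
            Object type = ((Map<?, ?>) p).get("Type");
            return type == null ? "unknown" : String.valueOf(type);
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Page tree with four real pages.
        List<Map<String, Object>> pageTree = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            pageTree.add(dict("Page"));
        }
        // A template dictionary that is not part of the page tree.
        Map<String, Object> template = dict("Template");

        Map<String, Object> widgetOnPage = new HashMap<>();
        widgetOnPage.put("P", pageTree.get(3));
        Map<String, Object> widgetOnTemplate = new HashMap<>();
        widgetOnTemplate.put("P", template);

        System.out.println(describe(pageTree, widgetOnPage));     // prints "page 4"
        System.out.println(describe(pageTree, widgetOnTemplate)); // prints "Template"
    }
}
```

In PDFBox itself, the equivalent check would presumably inspect the /Type
entry of widget.getPage().getCOSObject(), as in the line printed above, and
treat anything other than a page as "not on a visible page".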
> Not able to determine the page (page number) of the some form fields
> --------------------------------------------------------------------
>
> Key: PDFBOX-4323
> URL: https://issues.apache.org/jira/browse/PDFBOX-4323
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm, PDModel
> Affects Versions: 2.0.2
> Reporter: Amit Maheshwari
> Priority: Major
> Attachments: fda-form-356h-Scrubbed.pdf
>
>
> I am not able to determine the page number of some form fields (especially
> those on pages 4 and 5 of the attached PDF).
> How I'm trying to get the page number:
> # First I get the list of all pages (a 'PDPageTree') of the PDF using
> 'pdDocumentCatalog.GetPages()'
> # Then I get the 'PDAcroForm' for the same PDF using the 'getAcroForm()' method
> # Then I get the list of all fields (a 'PDFieldTree') from that AcroForm
> I use all of this in the following code to get the page number:
>
> var widgets = field.getWidgets();
> var widget = (widgets.toArray()[0] as PDAnnotationWidget);
> if (widget != null)
> {
> int pageNo = pages.indexOf(widget.getPage());
> }
>
> There is no error, but for some fields I get pageNo = -1 because the list of
> pages doesn't contain the page returned by 'widget.getPage()'.
>
> Let me know if more clarification is required.