Re: Best way to deal with NULL PDAcroForm fields

Maruan Sahyoun Wed, 09 Sep 2015 12:13:51 -0700

Maybe you can share some of them (upload them to a public location) so we can 
get a better idea and provide some feedback.


BR
Maruan

> On 09.09.2015 at 21:09, Kevin Ternes <[email protected]> wrote:
> 
> Thank you!
> As I said at the start, I deal with a lot of weird documents and have had 
> a lot of problems with non-compliant PDFs.
> There is no telling how long ago they were created, by whom, or with what 
> software.  This change will help.
> 
> -----Original Message-----
> From: Maruan Sahyoun [mailto:[email protected]] 
> Sent: Tuesday, September 08, 2015 12:38 PM
> To: [email protected]
> Subject: Re: Best way to deal with NULL PDAcroForm fields
> 
> Hi,
> 
>> On 08.09.2015 at 18:38, Kevin Ternes <[email protected]> wrote:
>> 
>> 1.8.10
>> 
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:[email protected]]
>> Sent: Tuesday, September 08, 2015 11:20 AM
>> To: [email protected]
>> Subject: Re: Best way to deal with NULL PDAcroForm fields
>> 
>> Hi Kevin,
>> 
>>> On 08.09.2015 at 16:45, Kevin Ternes <[email protected]> wrote:
>>> 
>>> 
>>> I get a lot of weird documents.  When I try to set a particular field 
>>> value, some of them throw a NullPointerException from 
>>> PDAcroForm.getField(), line 291:
>>> 
>>> 287: COSArray fields =
>>> 288:    (COSArray) acroForm.getDictionaryObject(
>>> 289:        COSName.getPDFName("Fields"));
>>> 290:
>>> 291: for (int i = 0; i < fields.size() && retval == null; i++)
>>> 292: {
>>> 
>>> To avoid this, I was at first calling PDAcroForm.getFields() and checking 
>>> whether the result was NULL, but I realized that it would usually create a 
>>> new fields array to return, which seemed wasteful.
> 
> This happens when there is a /Fields entry but it has no content, in which 
> case an empty List is returned, which you can check using List.isEmpty(). 
> Unfortunately, if the /Fields entry is missing completely, null is 
> returned. This has been addressed in PDFBox 2.0.0, where an empty List is 
> always returned in both cases. Please note that /Fields is a required entry, 
> so such PDFs are not in line with the spec, but they should nevertheless be 
> handled correctly.
> 
>>> 
>>> Is the most efficient way to avoid this to first call
>>>   COSArray fields = (COSArray) acroForm.getDictionaryObject(
>>>       COSName.getPDFName("Fields"));
>>> myself and check whether that is NULL?
> 
> If there is no /Fields entry, getFields() returns null, so you could use 
> that. 
> 
>>> 
>>> 
>>> Secondary Question:
>>> The method PDAcroForm.getFields() does a not-NULL check of fields before 
>>> calling fields.size().
>>> Is there a reason that this check is not performed in getField()?
> 
> That's a bug. I've created https://issues.apache.org/jira/browse/PDFBOX-2965 
> for that.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
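
For anyone stuck on 1.8.x, the null-versus-empty distinction discussed in the 
quoted thread can be hidden behind a small helper. What follows is only a 
minimal sketch: NullSafeFields, orEmpty() and findByName() are hypothetical 
names, not part of PDFBox, and plain strings stand in for the field objects so 
that the example compiles without the library. It mimics the behaviour 
described for 2.0.0 (always an empty List, never null).

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class NullSafeFields {
    private NullSafeFields() {
    }

    // Returns the given list, or an immutable empty list when it is null,
    // so callers can iterate or call isEmpty() without a null check.
    static <T> List<T> orEmpty(List<T> fields) {
        return fields == null ? Collections.<T>emptyList() : fields;
    }

    // Looks up a name in a possibly-null list; returns null when the list
    // is null or empty, or when the name is not present, instead of
    // throwing a NullPointerException.
    static String findByName(List<String> fieldNames, String wanted) {
        for (String name : orEmpty(fieldNames)) {
            if (name.equals(wanted)) {
                return name;
            }
        }
        return null;
    }
}
```

With 1.8.x one would presumably pass the result of acroForm.getFields() 
through orEmpty() before touching it, and skip calling getField() when the 
resulting list is empty.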


