Maybe you can share some of them (upload them to a public location) so that we get a better idea and can provide some feedback.
BR
Maruan

> On 09.09.2015, at 21:09, Kevin Ternes <[email protected]> wrote:
>
> Thank you!
> As I started out saying, I deal with a lot of merkwürdig (weird) documents.
> I have had a lot of problems with non-compliant PDFs.
> There is no telling how long ago they were created, by whom, and with what
> software. This change will help.
>
> -----Original Message-----
> From: Maruan Sahyoun [mailto:[email protected]]
> Sent: Tuesday, September 08, 2015 12:38 PM
> To: [email protected]
> Subject: Re: Best way to deal with NULL PDAcroForm fields
>
> Hi,
>
>> On 08.09.2015, at 18:38, Kevin Ternes <[email protected]> wrote:
>>
>> 1.8.10
>>
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:[email protected]]
>> Sent: Tuesday, September 08, 2015 11:20 AM
>> To: [email protected]
>> Subject: Re: Best way to deal with NULL PDAcroForm fields
>>
>> Hi Kevin,
>>
>>> On 08.09.2015, at 16:45, Kevin Ternes <[email protected]> wrote:
>>>
>>> I get a lot of weird documents. When I try to set a particular field
>>> value, some of them throw NullPointerExceptions from
>>> PDAcroForm.getField(), line 291:
>>>
>>> 287: COSArray fields =
>>> 288:     (COSArray) acroForm.getDictionaryObject(
>>> 289:         COSName.getPDFName("Fields"));
>>> 290:
>>> 291: for (int i = 0; i < fields.size() && retval == null; i++)
>>> 292: {
>>>
>>> To avoid this, at first I was calling PDAcroForm.getFields() and checking
>>> whether that was NULL, but I realized that it would usually create a
>>> new fields array to return, which seemed wasteful.
>
> This happens when there is a /Fields entry but it has no content, in which
> case an empty List is returned, which you could check using List.isEmpty().
> Unfortunately, if the /Fields entry is missing completely, null is
> returned. This has been addressed in PDFBox 2.0.0, where an empty List is
> always returned in both cases. Please note that /Fields is a required
> entry, so the PDF(s) are not in line with the spec, but they should
> nevertheless be handled correctly.
>
>>> Is the most efficient way to avoid this to first call:
>>> COSArray fields = (COSArray) acroForm.getDictionaryObject(
>>>     COSName.getPDFName("Fields"));
>>> myself and check if that is NULL?
>
> If there is no /Fields entry, getFields() returns null, so you could use
> that.
>
>>> Secondary Question:
>>> The method PDAcroForm.getFields() does a not-NULL check of fields before
>>> calling fields.size().
>>> Is there a reason that this check is not performed in getField()?
>
> That's a bug. I've created https://issues.apache.org/jira/browse/PDFBOX-2965
> for that.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
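
The guard discussed in this thread (treat a missing /Fields entry like an empty one before iterating) can be sketched roughly as follows. This is only an illustration of the pattern, not PDFBox code: the class NullSafeFields, its two methods, and the plain Map used in place of the AcroForm dictionary are all hypothetical stand-ins, chosen so that the example runs without PDFBox on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not part of PDFBox.
// The Map plays the role of the AcroForm dictionary, and the String
// list under the key "Fields" plays the role of the /Fields array.
final class NullSafeFields {

    // Returns the "Fields" list, or an empty list when the entry is missing,
    // so that callers never have to deal with null.
    static List<String> fieldsOrEmpty(Map<String, List<String>> acroFormDict) {
        List<String> fields = acroFormDict.get("Fields");
        return fields == null ? Collections.<String>emptyList() : fields;
    }

    // Looks up a field by name. Returns null when the field is absent,
    // including the case where the "Fields" entry itself is missing,
    // instead of throwing a NullPointerException.
    static String findField(Map<String, List<String>> acroFormDict, String name) {
        for (String field : fieldsOrEmpty(acroFormDict)) {
            if (field.equals(name)) {
                return field;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A "non-compliant" form: the Fields entry is missing completely.
        Map<String, List<String>> broken = new HashMap<>();
        System.out.println(findField(broken, "name"));

        // A well-formed form with two fields.
        Map<String, List<String>> ok = new HashMap<>();
        ok.put("Fields", Arrays.asList("name", "date"));
        System.out.println(findField(ok, "date"));
    }
}
```

With PDFBox 1.8.x the same idea would amount to checking the result of getFields() for null (and for isEmpty()) before calling getField(), as suggested above.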

