[
https://issues.apache.org/jira/browse/PDFBOX-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281682#comment-13281682
]
Timo Boehme commented on PDFBOX-1320:
-------------------------------------
Returning an empty collection instead of null breaks PDNameTreeNode.getValue,
which tests for a null value. This could be changed, but then we would no longer
be able to tell an empty name array from no name array at all. Since
getValue is implemented to look in the kids only if no name array exists, I vote
against returning an empty collection and for documenting that null may be
returned, so that code using the method has to test for null.
If there are no objections I will make this change (document the null return
value in the JavaDoc) and fix ExtractText, which is the only caller of this
method (besides getValue).
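
To illustrate why the null/empty distinction matters, here is a minimal sketch.
It is not the PDFBox code: the Node class, its getNames/getValue methods and the
countNames helper are hypothetical stand-ins, and the only assumption carried
over is the behaviour described above (kids are searched only when there is no
name array).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NameTreeSketch {

    /** Hypothetical, simplified node; NOT the real PDNameTreeNode. */
    static class Node {
        // null means "no name array at all"; an empty map means "empty name array".
        private final Map<String, String> names;
        private final List<Node> kids = new ArrayList<>();

        Node(Map<String, String> names) {
            this.names = names;
        }

        Node addKid(Node kid) {
            kids.add(kid);
            return this;
        }

        /** Returns the name map, or null if this node has no name array. */
        Map<String, String> getNames() {
            return names;
        }

        /** Searches the kids only when this node has no name array. */
        String getValue(String key) {
            Map<String, String> own = getNames();
            if (own != null) {
                // A name array exists (possibly empty), so the kids are not consulted.
                return own.get(key);
            }
            for (Node kid : kids) {
                String value = kid.getValue(key);
                if (value != null) {
                    return value;
                }
            }
            return null;
        }
    }

    /** Caller side: test for null before using the result. */
    static int countNames(Node node) {
        Map<String, String> names = node.getNames();
        return names == null ? 0 : names.size();
    }

    public static void main(String[] args) {
        Map<String, String> leafNames = new HashMap<>();
        leafNames.put("report.pdf", "embedded file spec");
        Node leaf = new Node(leafNames);

        // Intermediate node without a name array: the lookup descends into the kids.
        Node withoutArray = new Node(null).addKid(leaf);
        System.out.println(withoutArray.getValue("report.pdf"));

        // Node with an empty name array: the lookup stops here and finds nothing.
        Node withEmptyArray = new Node(new HashMap<>()).addKid(leaf);
        System.out.println(withEmptyArray.getValue("report.pdf"));

        // Null-safe caller: no NullPointerException for a node without names.
        System.out.println(countNames(withoutArray));
    }
}
```

If getNames returned an empty map in place of null, the first lookup in main
could no longer tell that it has to descend into the kids, which is the
breakage described above.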
> NPE in extractEmbeddedDocuments
> -------------------------------
>
> Key: PDFBOX-1320
> URL: https://issues.apache.org/jira/browse/PDFBOX-1320
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.7.0
> Environment: pdfbox 1.7.0 (current trunk)
> Reporter: Samuli Saarinen
> Attachments: PDFBOX-1320.patch, PDNameTreeNode.java.patch
>
>
> While parsing a pdf document the following exception is thrown:
> java.lang.NullPointerException
> at org.apache.pdfbox.tika.PDFParser.extractEmbeddedDocuments(PDFParser.java:155)
> at org.apache.pdfbox.tika.PDFParser.parse(PDFParser.java:133)
> at test.TikaParse.main(TikaParse.java:27)
> The document I'm trying to parse is probably confidential so I cannot attach
> it until (or if) I get clearance.