[ 
https://jira.nuxeo.com/browse/NXP-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=94307#comment-94307
 ] 

Stéphane Lacoin commented on NXP-6815:
--------------------------------------

pdfbox prints a warning, but the text is extracted correctly from the PDF.

> upgrade pdfbox to 1.5.0
> -----------------------
>
>                 Key: NXP-6815
>                 URL: https://jira.nuxeo.com/browse/NXP-6815
>             Project: Nuxeo Enterprise Platform
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 5.4.1
>            Reporter: Stéphane Lacoin
>            Assignee: Stéphane Lacoin
>             Fix For: 5.4.2
>
>         Attachments: nutcracker.pdf, petipa_EN.pdf, swan_lake.pdf, tchaikovsky_EN.pdf
>
>
> The PDF converter references an outdated version of pdfbox: 0.7.3. Since
> that version, pdfbox has moved under the Apache umbrella, and its current
> release is 1.5.0.
> Version 1.5.0 of pdfbox fixes NPE problems such as:
> Caused by: org.nuxeo.ecm.core.api.WrappedException: Exception: java.lang.NullPointerException. message: null
>       at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
>       at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
>       at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
>       at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
>       at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)
>       at org.nuxeo.ecm.core.convert.plugins.text.extractors.PDF2TextConverter.convert(PDF2TextConverter.java:73)
>       ... 9 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
