[ https://issues.apache.org/jira/browse/TIKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628102#comment-14628102 ]

Tim Allison edited comment on TIKA-1588 at 7/15/15 2:13 PM:
------------------------------------------------------------

Current version of reports attached comparing PDFBox 1.8.9 vs PDFBox 1.8.10 against the PDFs in govdocs1.

Overall takeaway: no new exceptions, no fixed exceptions.

Without looking carefully at the files, it looks like there is a slight improvement in 005937.pdf and 722558.pdf.

It looks like there might be a very small regression in 167853.pdf, where 1 instance of {{respond}} has become {{respondæ}}.

I realize now that I should try this again with the PDFBOX-2823 catch blocks removed...doh!


was (Author: talli...@mitre.org):
Current version of reports attached comparing PDFBox 1.8.9 vs PDFBox 1.8.10 against the PDFs in govdocs1.

Overall takeaway: no new exceptions, no fixed exceptions.

Without looking carefully at the files, it looks like there is a slight improvement in 005937.pdf and 722558.pdf.

It looks like there might be a very small regression in 167853.pdf, where 1 instance of {{respond}} has become {{respondæ}}.

> Upgrade to PDFBox 1.8.10 when available
> ---------------------------------------
>
>                 Key: TIKA-1588
>                 URL: https://issues.apache.org/jira/browse/TIKA-1588
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>         Attachments: reports_1_8_9_vs_1_8_10.zip
>
>
> Let's use this ticket to discuss/prepare for the release and integration of
> PDFBox 1.8.10 when it is available.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)