[ 
https://issues.apache.org/jira/browse/TIKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-1912:
------------------------------
    Description: While working on TIKA-1285, we found that PDFBox 2.0.0 is not 
able to handle truncated files as well as PDFBox 1.8.11.  Let's figure out how 
to gain the benefits from 2.0.0 without losing the ability to extract some 
content from truncated files.

> Figure out how to parse truncated PDFs that were handled by PDFBox 1.8.x but 
> not by 2.0.0
> -----------------------------------------------------------------------------------------
>
>                 Key: TIKA-1912
>                 URL: https://issues.apache.org/jira/browse/TIKA-1912
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>
> While working on TIKA-1285, we found that PDFBox 2.0.0 is not able to handle 
> truncated files as well as PDFBox 1.8.11.  Let's figure out how to gain the 
> benefits from 2.0.0 without losing the ability to extract some content from 
> truncated files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
