[ https://issues.apache.org/jira/browse/TIKA-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison updated TIKA-1912:
------------------------------
Description: While working on TIKA-1285, we found that PDFBox 2.0.0 does not
handle truncated files as well as PDFBox 1.8.11 did. Let's figure out how to
gain the benefits of 2.0.0 without losing the ability to extract some content
from truncated files.
> Figure out how to parse truncated PDFs that were handled by PDFBox 1.8.x but
> not by 2.0.0
> -----------------------------------------------------------------------------------------
>
> Key: TIKA-1912
> URL: https://issues.apache.org/jira/browse/TIKA-1912
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison