[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584294#comment-13584294
]
Jukka Zitting commented on TIKA-1074:
-------------------------------------
bq. Wait, do you mean I should remove the handling entirely (not bother future
proofing)?
If POI decides to declare IE (or just generic Exception) as thrown by their
API, it'll break source compatibility, and thus in any case we'll need to
adjust our code. So adding future-proofing code here doesn't win us anything;
it just complicates the codebase for no gain.
> Extraction should continue if an exception is hit visiting an embedded
> document
> -------------------------------------------------------------------------------
>
> Key: TIKA-1074
> URL: https://issues.apache.org/jira/browse/TIKA-1074
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 1.4
>
> Attachments: TIKA-1074.patch, TIKA-1074.patch
>
>
> Spinoff from TIKA-1072.
> In that issue, a problematic document (still not sure whether the document is
> corrupt or this is a possible POI bug) caused an exception when visiting the
> embedded documents.
> If I change Tika to suppress that exception, the rest of the document
> extracts fine.
> So somehow I think we should be more robust here, and maybe log the
> exception, or save/record the exception(s) somewhere so after parsing the app
> could decide what to do about them ...
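The behaviour the issue description asks for could look roughly like the sketch below: keep visiting embedded documents when one of them throws, and record the exceptions so the application can inspect them after parsing. This is only an illustration under stated assumptions. `EmbeddedDoc`, `Result`, and `visitAll` are hypothetical names, not Tika or POI API, and the attached patch may take a different approach.

```java
import java.util.ArrayList;
import java.util.List;

public class EmbeddedVisitSketch {

    /** Hypothetical stand-in for an embedded document; not a Tika or POI type. */
    interface EmbeddedDoc {
        String extractText() throws Exception;
    }

    /** Extracted text plus the exceptions that were suppressed along the way. */
    static final class Result {
        final StringBuilder text = new StringBuilder();
        final List<Exception> suppressed = new ArrayList<>();
    }

    /** Visits every embedded document, continuing past any that fail. */
    static Result visitAll(List<EmbeddedDoc> docs) {
        Result result = new Result();
        for (EmbeddedDoc doc : docs) {
            try {
                result.text.append(doc.extractText()).append('\n');
            } catch (Exception e) {
                // Record the failure and move on to the next embedded document.
                result.suppressed.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<EmbeddedDoc> docs = new ArrayList<>();
        docs.add(() -> "first");
        docs.add(() -> { throw new IllegalStateException("corrupt embedded part"); });
        docs.add(() -> "third");

        Result r = visitAll(docs);
        System.out.print(r.text);
        System.out.println(r.suppressed.size() + " embedded document(s) failed");
    }
}
```

Returning the suppressed exceptions alongside the text leaves the decision (log, ignore, or fail) to the caller, which matches the "after parsing the app could decide" idea in the description.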