[ https://issues.apache.org/jira/browse/TIKA-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087273#comment-16087273 ]
Tim Allison commented on TIKA-2430:
-----------------------------------
[~lfcnassif], there are now two options: truncate a file at a random point, or
overwrite randomly chosen bytes with random bytes. If you see a more common
corruption pattern (for example, randomly writing a block-length chunk into a
file), please re-open this issue.
This has already revealed two areas for improvement in POI with just one test
file. I haven't yet been able to reproduce the EMF bug with the one test file
I used...
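For anyone following along, the two corruption strategies described above can
be sketched roughly as follows. This is only a minimal illustration, not the
code that was added to tika-parsers: the class name CorruptionSketch, the
method names truncate and overwriteRandomBytes, and the count parameter are
all hypothetical, and the sketch works on in-memory byte arrays rather than on
files.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper for illustration only; not Tika's actual implementation. */
final class CorruptionSketch {

    /** Returns a copy of the input cut off at a random length in [0, original.length). */
    static byte[] truncate(byte[] original, Random rnd) {
        if (original.length == 0) {
            return new byte[0]; // nothing to truncate
        }
        int newLength = rnd.nextInt(original.length); // always shorter than the input
        return Arrays.copyOf(original, newLength);
    }

    /**
     * Returns a same-length copy in which up to `count` randomly chosen
     * positions are overwritten with random bytes. Positions may repeat, and a
     * random byte may equal the original, so fewer than `count` bytes may differ.
     */
    static byte[] overwriteRandomBytes(byte[] original, int count, Random rnd) {
        byte[] corrupted = original.clone(); // leave the caller's array untouched
        if (corrupted.length == 0) {
            return corrupted;
        }
        for (int i = 0; i < count; i++) {
            int pos = rnd.nextInt(corrupted.length);
            corrupted[pos] = (byte) rnd.nextInt(256);
        }
        return corrupted;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42L); // fixed seed so a failing case can be replayed
        byte[] original = new byte[1024];
        rnd.nextBytes(original);

        byte[] truncated = truncate(original, rnd);
        byte[] overwritten = overwriteRandomBytes(original, 10, rnd);

        System.out.println("truncated length: " + truncated.length);
        System.out.println("overwritten length: " + overwritten.length);
    }
}
```

Seeding the Random explicitly is a design choice in this sketch: if a corrupted
input triggers a hang or an exception in a parser, the same seed regenerates
the same bytes for debugging.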
> Add at least dev test capability to run Tika against corrupted files in our
> test suite
> --------------------------------------------------------------------------------------
>
> Key: TIKA-2430
> URL: https://issues.apache.org/jira/browse/TIKA-2430
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Assignee: Tim Allison
> Fix For: 1.17
>
>
> [~lfcnassif] observed on TIKA-2428 that a corrupt file caused a permanent
> hang for the EMFParser. Files can be corrupted for various reasons. We can
> add some optional code to let people experiment with running Tika against
> randomly corrupted versions of the files in our test suite. I suspect that
> this will unearth too many errors, at least at first, to be run on a regular basis.
> Let's at least add some code in tika-parsers to let devs run the tests.