[
https://issues.apache.org/jira/browse/TIKA-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166589#comment-15166589
]
Tim Allison commented on TIKA-1855:
-----------------------------------
I have to admit that I lost a bit of steam on this... I agree with Bob about
the elegance of the initial design. I trust Bob's caution about unzipping more
than one test-jar, which was my original re-plan.
After moving the test files for the sub-parser modules into the appropriate
modules, I now have time to look at what's left in tika-parsers. There are a
handful of tests that could easily and appropriately be moved to the
sub-parser-modules.
If we want to keep all the mime tests together (esp. {{TestMimeTypes}} and
{{TestContainerAwareDetector}}), we'll have to copy, or keep duplicates of, at
least one file per parser. Then we also have the tests in
{{AutoDetectParserTest}}...
As I see it:
* Initial proposal -- all test-docs in test-frameworks (or, test-parsers as it
is now) module, copy that whole test-jar to each sub-parser-module.
* Move individual test docs to appropriate sub-parser-module
** Create test-jar for each module, then copy each module's test-jar back to
test-frameworks
** Duplicate files between sub-parser-modules and test-frameworks... and there
will be quite a few because of the detection tests.
** Move mime-detection tests into the appropriate sub-parser modules...
duplicate ContainerAwareDetectors between two sub-parser-modules (ooxml and
package...?). For those mimes for which we don't have a parser, leave those
files in the test-frameworks module.
* Create one test-docs directory that can be read by every module (we have that
in POI, but it doesn't have a clean Maven feel to it)
Other options?
> Tika 2.0 - Move shared test-code back to tika-core and distribute test files
> to parser modules
> ----------------------------------------------------------------------------------------------
>
> Key: TIKA-1855
> URL: https://issues.apache.org/jira/browse/TIKA-1855
> Project: Tika
> Issue Type: Sub-task
> Reporter: Tim Allison
> Assignee: Tim Allison
>
> Undo TIKA-1851, and divide test docs to appropriate parser modules.