[ 
https://issues.apache.org/jira/browse/TIKA-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166589#comment-15166589
 ] 

Tim Allison commented on TIKA-1855:
-----------------------------------

I have to admit that I lost a bit of steam on this...  I agree with Bob about 
the elegance of the initial design.  I trust Bob's caution about unzipping more 
than one test-jar, which was my original re-plan.

After moving the test files for the sub-parser modules into the appropriate 
modules, I now have time to look at what's left in tika-parsers.  There are a 
handful of tests that could easily and appropriately be moved to the 
sub-parser-modules. 

If we want to keep all the mime tests together (esp. {{TestMimeTypes}} and 
{{TestContainerAwareDetector}}), we'll have to duplicate at least one file per 
parser.  And then we also have the tests in {{AutoDetectParserTest}}...

The options as I see them:
* Initial proposal -- keep all test-docs in the test-frameworks (or 
test-parsers, as it is now) module, and copy that whole test-jar to each 
sub-parser-module.
* Move individual test docs to the appropriate sub-parser-module, and then one 
of:
** Create a test-jar for each module, then copy each module's test-jar back to 
test-frameworks.
** Duplicate files between sub-parser-modules and test-frameworks...and there 
will be quite a few because of the detection tests.
** Move mime-detection tests into the appropriate sub-parser-modules...
duplicate ContainerAwareDetectors between two sub-parser-modules (ooxml and 
package...?).  For those mimes for which we don't have a parser, leave those 
files in the test-frameworks module.
* Create one test-docs directory that can be read by every module (we have that 
in POI, but it doesn't have a clean maven feel to it).
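
For the last option, a minimal sketch of what a shared lookup helper might 
look like.  Everything here is an assumption for illustration: the class name 
{{SharedTestDocs}}, the system property {{tika.test.docs.dir}}, and the idea 
that each module's build would set that property are hypothetical, and none of 
this is existing Tika or POI code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: resolves test documents from one shared directory. */
final class SharedTestDocs {

    // Hypothetical property name; each module's build would have to set it.
    static final String DIR_PROPERTY = "tika.test.docs.dir";

    private SharedTestDocs() {
    }

    /** Returns the shared directory named by the system property. */
    static Path sharedDir() {
        String dir = System.getProperty(DIR_PROPERTY);
        if (dir == null) {
            throw new IllegalStateException(
                    "System property " + DIR_PROPERTY + " is not set");
        }
        return Paths.get(dir);
    }

    /** Resolves a document by name, failing fast if it is missing. */
    static Path resolve(Path dir, String name) throws IOException {
        Path doc = dir.resolve(name).normalize();
        // Reject names that escape the shared directory or do not exist.
        if (!doc.startsWith(dir.normalize()) || !Files.isRegularFile(doc)) {
            throw new IOException("Test document not found: " + name);
        }
        return doc;
    }
}
```

A test in any module could then call 
{{SharedTestDocs.resolve(SharedTestDocs.sharedDir(), "someFile.pdf")}}.  The 
trade-off is the one noted above: the modules depend on a path outside their 
own source tree, which is why it doesn't feel like clean maven.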

Other options?


> Tika 2.0 - Move shared test-code back to tika-core and distribute test files 
> to parser modules
> ----------------------------------------------------------------------------------------------
>
>                 Key: TIKA-1855
>                 URL: https://issues.apache.org/jira/browse/TIKA-1855
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>
> Undo TIKA-1851, and divide test docs to appropriate parser modules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
