[
https://issues.apache.org/jira/browse/TIKA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072328#comment-13072328
]
Antoni Mylka commented on TIKA-686:
-----------------------------------
FWIW I would say that fewer is better.
We (Aperture) tried this and overdid it. Long story short: version 1.4 was
split into 73 modules, with 31 external dependencies, builds took forever and
day-to-day development work was a pain. It was madness. Clearly, with a bit
more common sense it might have worked out better, but the key issue was that
nobody wanted this and everyone used a special 'onejar' assembly anyway.
I don't like optional dependencies: with them I need lots of XML in my pom just
to make my app work.
I personally like exclusions better. We just need to make sure that
{{<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <exclusions>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-scratchpad</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml</artifactId>
    </exclusion>
  </exclusions>
</dependency>}}
... works without a ClassNotFoundException or NoClassDefFoundError. (Aperture
throws them in such a case right now.)
A solution with pom-only modules for each parser is OK as long as the default
case is left as it is. The same problem will have to be solved, though. If I
only want office support with POI, then the Tika facade must not initialize the
PDFParser when the class itself is present on the classpath but its
dependencies aren't.
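To make that last point concrete, here is a minimal sketch of how a facade could skip parsers whose classes or dependencies are missing. It is not Tika's actual loading code; the class OptionalParserLoader and its methods isAvailable and instantiateAvailable are hypothetical names made up for illustration, and it assumes each parser has a public no-argument constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: loads optional classes defensively.
class OptionalParserLoader {

    // True if the named class can be loaded and initialized.
    // NoClassDefFoundError is a LinkageError, so a class that is present
    // but fails to link against an excluded library is reported as unavailable.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, true, OptionalParserLoader.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Instantiates every class that can be loaded and constructed,
    // silently skipping the ones that cannot.
    static List<Object> instantiateAvailable(List<String> classNames) {
        List<Object> instances = new ArrayList<>();
        for (String name : classNames) {
            try {
                Class<?> type = Class.forName(name);
                // Assumes a public no-argument constructor.
                instances.add(type.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | LinkageError e) {
                // Class or one of its dependencies is missing: leave it out.
            }
        }
        return instances;
    }

    public static void main(String[] args) {
        // StringBuilder stands in for a parser whose dependencies are present.
        List<Object> loaded = instantiateAvailable(Arrays.asList(
                "java.lang.StringBuilder",
                "org.apache.tika.parser.pdf.PDFParser"));
        System.out.println("Loaded " + loaded.size() + " of 2 classes");
    }
}
```

Note that loading and constructing a class surfaces many missing-dependency failures but not necessarily all of them: some only show up when a method that touches the excluded library is first called, so a real facade would probably also need to guard the parse call itself.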
> Split tika-parsers into separate components
> -------------------------------------------
>
> Key: TIKA-686
> URL: https://issues.apache.org/jira/browse/TIKA-686
> Project: Tika
> Issue Type: Wish
> Components: parser
> Affects Versions: 0.9
> Reporter: Christopher Currie
> Priority: Minor
>
> The email thread [1] from two years ago that led to splitting Tika into
> separate components also suggested splitting tika-parsers into separate
> components based on dependencies. This would be extremely useful, especially
> in cases where a given parser has no dependencies beyond tika-core. Please
> consider refactoring the parsers into separate components for 1.0.
> [1] http://markmail.org/message/tavirkqhn6r2szrz