[
https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981112#comment-15981112
]
Philip Mundt commented on TIKA-1368:
------------------------------------
Some parsers also rely on dependencies that are only brought in transitively
(and are therefore easy to overlook). Two examples:
* {{org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient}} depends on
** {{org.apache.http.*}} (httpclient and/or httpcomponents) and
** {{com.google.common.reflect.TypeToken}}, most likely
{{com.google.guava:guava}}
* {{org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser}} depends on
** {{org.apache.http.*}} (httpclient and/or httpcomponents)
{{org.apache.httpcomponents:httpclient}} and
{{org.apache.httpcomponents:httpmime}} are not declared directly but are
brought in by {{edu.ucar:httpservices}}, and {{com.google.guava:guava}} is a
transitive dependency of {{edu.ucar:cdm}}.
I believe it would be a good idea to declare these as direct dependencies.
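A minimal sketch of what that could look like in the tika-parsers pom.xml,
assuming the three artifacts named above are the right ones. The version
properties are hypothetical placeholders, not values taken from the Tika build:

```xml
<!-- Sketch only: declare the artifacts the parsers use directly, instead of
     relying on edu.ucar:httpservices and edu.ucar:cdm to bring them in. -->
<dependencies>
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>${httpclient.version}</version> <!-- hypothetical property -->
  </dependency>
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpmime</artifactId>
    <version>${httpmime.version}</version> <!-- hypothetical property -->
  </dependency>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>${guava.version}</version> <!-- hypothetical property -->
  </dependency>
</dependencies>
```

Running {{mvn dependency:analyze}} on the module should report any further
cases of this kind under "Used undeclared dependencies".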
> Improve the modularity of tika-parsers
> --------------------------------------
>
> Key: TIKA-1368
> URL: https://issues.apache.org/jira/browse/TIKA-1368
> Project: Tika
> Issue Type: Improvement
> Components: packaging, parser
> Affects Versions: 1.7
> Reporter: Sergey Beryozkin
>
> The tika-parsers module has many strong transitive dependencies. This presents
> a challenge to Maven tika-parsers users wishing to use only one or very few
> Parser(s).
> The fact that new Parsers are regularly added makes the exclusion process very
> brittle. For example, an OSGi application switching from Tika 1.6 to Tika 1.7
> and having an exclusion list in place may 'leak' a new parser lib into its
> runtime.
> https://issues.apache.org/jira/browse/TIKA-1367
> can help on its own but a more complete solution would ideally be in place.
> Proposal:
> 1. Make tika-parsers transitive dependencies optional
> 2. Introduce tika-parsers-optional pom that will depend on tika-parsers but
> exclude 3rd-party dependencies
> Both 1 and 2 will depend on the resolution of TIKA-1367. IMHO 1 is cleaner:
> users would be advised to check the documentation and add the required
> dependencies. 2 also works.