[ 
https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981112#comment-15981112
 ] 

Philip Mundt commented on TIKA-1368:
------------------------------------

Some parsers also rely on dependencies that are only brought in transitively 
(and are therefore not obvious). Two examples:
* {{org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient}} depends on 
** {{org.apache.http.*}} (httpclient and/or httpcomponents) and
** {{com.google.common.reflect.TypeToken}}, most likely 
{{com.google.guava:guava}}
* {{org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser}} depends on 
** {{org.apache.http.*}} (httpclient and/or httpcomponents)

{{org.apache.httpcomponents:httpclient}} and 
{{org.apache.httpcomponents:httpmime}} are not declared directly but are 
brought in by {{edu.ucar:httpservices}}, and {{com.google.guava:guava}} is a 
transitive dependency of {{edu.ucar:cdm}}.

I believe it would be a good idea to specify these as direct dependencies.
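
For illustration, declaring them directly in the tika-parsers {{pom.xml}} could 
look roughly like the fragment below. This is only a sketch: the coordinates 
are the ones named above, but the {{httpcomponents.version}} and 
{{guava.version}} properties are hypothetical placeholders and would need to 
match the versions that are currently resolved transitively.

```xml
<!-- Sketch only: the version properties are hypothetical placeholders,
     not taken from the actual Tika build. -->
<dependencies>
  <!-- Used via org.apache.http.* by GeoGazetteerClient and
       TensorflowRESTRecogniser -->
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>${httpcomponents.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpmime</artifactId>
    <version>${httpcomponents.version}</version>
  </dependency>
  <!-- Most likely the source of com.google.common.reflect.TypeToken,
       used by GeoGazetteerClient -->
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>${guava.version}</version>
  </dependency>
</dependencies>
```

With direct declarations like these, the two parsers should no longer depend 
on what {{edu.ucar:httpservices}} and {{edu.ucar:cdm}} happen to pull in.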

> Improve the modularity of tika-parsers
> --------------------------------------
>
>                 Key: TIKA-1368
>                 URL: https://issues.apache.org/jira/browse/TIKA-1368
>             Project: Tika
>          Issue Type: Improvement
>          Components: packaging, parser
>    Affects Versions: 1.7
>            Reporter: Sergey Beryozkin
>
> The tika-parsers module has many strong transitive dependencies. This presents a 
> challenge to Maven tika-parsers users wishing to use only one or very few 
> Parser(s).
> The fact that new Parsers are regularly added makes the exclusion process very 
> brittle. For example, an OSGi application switching from Tika 1.6 to Tika 1.7 
> and having an exclusion list in place may 'leak' a new parser lib into its 
> runtime. 
> https://issues.apache.org/jira/browse/TIKA-1367
> can help on its own but a more complete solution would ideally be in place.
> Proposal:
> 1. Make tika-parsers transitive dependencies optional
> 2. Introduce tika-parsers-optional pom that will depend on tika-parsers but 
> exclude 3rd-party dependencies
> Both 1 and 2 will depend on the resolution of TIKA-1367. IMHO 1 is cleaner: 
> users would be advised to check the documentation and add the required 
> dependencies. 2 also works.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
