[ 
https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098
 ] 

Sergey Beryozkin edited comment on TIKA-1368 at 7/15/14 2:14 PM:
-----------------------------------------------------------------

#2 is for those users who know what they want, so I disagree with your 
qualifying #2 as non-viable. With #2, tika-parsers would stay as it is now.

Speaking of this runtime exception you are referring to: I'm sorry, but it 
appears to be somewhat academic. I could just as well have asked: why don't we 
have a single Tika module only, so that users do not accidentally forget to 
include "tika-parsers", given that the source code would compile even without 
including tika-parsers?

What kind of application is it? Is it expected to have some tests :-)? I can 
only think of the completely generic Tika container, which is TikaServer. But 
TikaServer would be prepackaged. Can you offer a more realistic example, please?

It's not a huge issue, but I hope we will come up with a basic solution without 
getting locked into arguments :-). I've heard a number of times that users may 
be affected. IMHO users who would just like to do a quick experiment can 
download the whole distro or use TikaServer. IMHO this is a rather narrow space 
where we have a Tika application which can accept anything without users paying 
any attention to the actual dependencies. On the other hand, we will ship a 
simple Tika-based solution which will be exposed to our users. Who would help 
those users, who'd have to manually exclude many dependencies from tika-parsers?


Thanks, Sergey





> Improve the modularity of tika-parsers
> --------------------------------------
>
>                 Key: TIKA-1368
>                 URL: https://issues.apache.org/jira/browse/TIKA-1368
>             Project: Tika
>          Issue Type: Improvement
>          Components: packaging, parser
>    Affects Versions: 1.7
>            Reporter: Sergey Beryozkin
>
> tika-parsers module has many strong transitive dependencies. This presents a 
> challenge to Maven tika-parsers users wishing to use only one or very few 
> Parser(s).
> The fact that new Parsers are regularly added makes the exclusion process very 
> brittle. For example, an OSGI application switching from Tika 1.6 to Tika 1.7 
> and having an exclusion list in place may 'leak' a new parser lib into its 
> runtime. 
> https://issues.apache.org/jira/browse/TIKA-1367
> can help on its own but a more complete solution would ideally be in place.
> Proposal:
> 1. Make tika-parsers transitive dependencies optional
> 2. Introduce tika-parsers-optional pom that will depend on tika-parsers but 
> exclude 3rd-party dependencies
> Both 1 and 2 will depend on the resolution of TIKA-1367. IMHO 1 is cleaner, 
> users will be recommended to check the documentation and add the required 
> dependencies. 2 also works.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
