[ 
https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096668#comment-15096668
 ] 

Uwe Schindler commented on TIKA-1824:
-------------------------------------

Hi, as invited on TIKA-1830, here some comments from Apache Solr:

{quote}
As already stated in the past, we would like to only bundle parsers for text 
document formats, because images, class files or else are not really useful for 
indexing by default. Users that want to do this, can still add the missing 
parser bundles and SPI will do the rest. Currently we have disabled some 
parsers by removing the JAR files (like asm-all.jar, netcdf.jar), so TIKA's SPI 
will disable them automatically (because of ClassNotFoundEx). This was a bit 
rude, but worked.

The reason for this was partly also some version incompatibilities (ASM was old 
in TIKA, Lucene needs newest one), but ASM is not really useful for indexing 
anyways!

In Solr we don't use transitive dependencies in Ivy, so we decide for each JAR 
file which one gets bundled, so we check every release anyways during update.
{quote}

In addition, it would be a good idea to allow loading the TIKA SPI files in a 
separate classloader (to isolate the parser classes from others). The reason 
for this is JAR hell. If TIKA would load the parsers in its own classloader 
(optionally, e.g. by configuration), we could place all parsers and their 
dependencies in a separate lib directory outside the Solr's lib folder.

> Tika 2.0 -  Create Initial Parser Modules
> -----------------------------------------
>
>                 Key: TIKA-1824
>                 URL: https://issues.apache.org/jira/browse/TIKA-1824
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 2.0
>            Reporter: Bob Paulin
>            Assignee: Bob Paulin
>
> Create initial break down of parser modules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to