[ 
https://issues.apache.org/jira/browse/NUTCH-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725807#comment-17725807
 ] 

Tim Allison commented on NUTCH-2959:
------------------------------------

Separately, I'm wondering if it would be useful to add an alternative Tika 
parser that relies on tika-server or a modified version of a pipes-parser. 
This would move all of the Tika dependencies, and the jar hell that comes with 
them, into a separate process, so we wouldn't have to load any dependencies 
aside from tika-core into Nutch's JVM.
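
To make the tika-server variant concrete, here is a rough sketch of what the 
client side could look like, using only the JDK's HTTP client. It is not taken 
from the attached patch. The class and method names are hypothetical, and it 
assumes a tika-server instance is already running (by default it listens on 
port 9998) and that a PUT to its /tika endpoint with "Accept: text/plain" 
returns the extracted text.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper for illustration; not part of Nutch or Tika.
class TikaServerSketch {

    // Build a PUT request against tika-server's /tika endpoint.
    // The 30 second timeout is an arbitrary assumption.
    static HttpRequest buildParseRequest(URI serverBase, byte[] content, String contentType) {
        return HttpRequest.newBuilder(serverBase.resolve("/tika"))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", contentType)
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }

    // Send the fetched bytes to tika-server and return the extracted text.
    // Any status other than 200 is treated as a parse failure here.
    static String parse(HttpClient client, URI serverBase, byte[] content, String contentType)
            throws IOException, InterruptedException {
        HttpResponse<String> response = client.send(
                buildParseRequest(serverBase, content, contentType),
                HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IOException("tika-server returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

A caller would do something like 
TikaServerSketch.parse(HttpClient.newHttpClient(), 
URI.create("http://localhost:9998"), bytes, "text/html"). With this shape, 
Nutch itself would need no Tika jars at all; a pipes-based variant would look 
different.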

The Solr folks are working on doing this now as well (I think they've chosen 
the tika-server route).

> Upgrade to Apache Tika 2.4.1
> ----------------------------
>
>                 Key: NUTCH-2959
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2959
>             Project: Nutch
>          Issue Type: Task
>    Affects Versions: 1.19
>            Reporter: Markus Jelsma
>            Priority: Major
>             Fix For: 1.20
>
>         Attachments: NUTCH-2959.patch
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)