[ https://issues.apache.org/jira/browse/NUTCH-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725807#comment-17725807 ]
Tim Allison commented on NUTCH-2959:
------------------------------------
Separately, I'm wondering if it would be useful to add an alternative Tika
parser that relies on tika-server or a modified version of a pipes-parser.
This would isolate all of the Tika dependencies and jar hell in a separate
process, and we wouldn't have to load any dependencies other than tika-core
into Nutch's JVM.
They're working on doing this over on Solr now as well (I think they've chosen
the tika-server route).
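
As a rough illustration of the tika-server route, here is a minimal sketch of a
client that hands raw content to a separately running tika-server over HTTP, so
that no Tika parser jars are needed in the calling JVM. The class name
TikaServerClient and its methods are hypothetical and not part of Nutch or
Tika. The sketch assumes tika-server's /tika endpoint accepts a PUT with the
document bytes and returns plain text when asked via the Accept header, and
that the server listens on its usual default port 9998; the Content-Type header
is sent only as a detection hint. It uses the JDK's java.net.http client, so it
needs Java 11 or later.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical out-of-process parsing client; not an existing Nutch or Tika class. */
public class TikaServerClient {

    /** Builds a PUT request to the (assumed) /tika endpoint, asking for plain text. */
    public static HttpRequest buildParseRequest(URI serverBase, byte[] content, String contentType) {
        return HttpRequest.newBuilder(serverBase.resolve("/tika"))
                .header("Accept", "text/plain")
                .header("Content-Type", contentType) // hint only; the server may still detect the type
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }

    /** Sends the content to the server and returns the extracted text. */
    public static String parse(HttpClient client, URI serverBase, byte[] content, String contentType)
            throws IOException, InterruptedException {
        HttpResponse<String> response = client.send(
                buildParseRequest(serverBase, content, contentType),
                HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("tika-server returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: TikaServerClient <file> [server-base-url]");
            return;
        }
        // Assumed default address of a locally running tika-server.
        URI base = URI.create(args.length > 1 ? args[1] : "http://localhost:9998");
        byte[] content = Files.readAllBytes(Path.of(args[0]));
        String text = parse(HttpClient.newHttpClient(), base, content, "application/octet-stream");
        System.out.println(text);
    }
}
```

A Nutch parser plugin built along these lines would still need to map the
response back into Nutch's own parse result and handle timeouts and server
restarts; none of that is shown here.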
> Upgrade to Apache Tika 2.4.1
> ----------------------------
>
> Key: NUTCH-2959
> URL: https://issues.apache.org/jira/browse/NUTCH-2959
> Project: Nutch
> Issue Type: Task
> Affects Versions: 1.19
> Reporter: Markus Jelsma
> Priority: Major
> Fix For: 1.20
>
> Attachments: NUTCH-2959.patch
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)