[ https://issues.apache.org/jira/browse/NUTCH-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725807#comment-17725807 ]
Tim Allison commented on NUTCH-2959:
------------------------------------

Separately, I'm wondering if it would be useful to add an alternative Tika parser that relies on tika-server or a modified version of a pipes-parser. This would put all of the Tika dependencies and jar hell in its own process, and we wouldn't have to load any dependencies aside from tika-core into Nutch's jvm.

They're working on doing this over on Solr now as well (I think they've chosen the tika-server route).

> Upgrade to Apache Tika 2.4.1
> ----------------------------
>
>                 Key: NUTCH-2959
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2959
>             Project: Nutch
>          Issue Type: Task
>    Affects Versions: 1.19
>            Reporter: Markus Jelsma
>            Priority: Major
>             Fix For: 1.20
>
>         Attachments: NUTCH-2959.patch
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)