[ https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896552#action_12896552 ]
Chris A. Mattmann commented on NUTCH-874:
-----------------------------------------
Hey Julien,
I think Jukka already worked on something really similar to the ExtParser in
Tika. See:
http://tika.apache.org/0.7/api/org/apache/tika/parser/ExternalParser.html
If we go that route in Nutch, I think we should add an encoding attribute,
similar to NUTCH-564, and pass it through in parse-tika. If we can do that,
I think we're good!
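As a rough illustration of the encoding attribute idea, here is a minimal JDK-only sketch of running an external command and decoding its output with a configurable encoding. It is not the Tika ExternalParser API or the Nutch ExtParser code; the class name ExternalCommandText, its methods, and the UTF-8 fallback are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Nutch or Tika).
public class ExternalCommandText {

    // Map the configured encoding name to a Charset.
    // Assumption: fall back to UTF-8 when the name is missing, illegal, or unsupported.
    public static Charset resolveCharset(String encoding) {
        if (encoding == null || encoding.isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(encoding);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return StandardCharsets.UTF_8;
        }
    }

    // Decode the raw bytes produced by the external command.
    public static String decode(byte[] raw, String encoding) {
        return new String(raw, resolveCharset(encoding));
    }

    // Run the command, collect its stdout, and decode it with the given encoding.
    public static String runAndDecode(List<String> command, String encoding)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // keep stderr from blocking the child
                .start();
        process.getOutputStream().close(); // this sketch sends nothing on stdin

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream stdout = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stdout.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("command exited with status " + exit);
        }
        return decode(buffer.toByteArray(), encoding);
    }
}
```

In this sketch the encoding is just a string handed in by the caller; in a plugin it would presumably come from the parser's configuration and be passed along to whatever decodes the command's output.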
Cheers,
Chris
> Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
> --------------------------------------------------------------------------
>
> Key: NUTCH-874
> URL: https://issues.apache.org/jira/browse/NUTCH-874
> Project: Nutch
> Issue Type: Bug
> Components: parser
> Environment: Nutch 2.0
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Priority: Critical
> Fix For: 2.0
>
>
> I just noticed while fixing NUTCH-564 that the ExtParser hasn't been brought
> up to date with Nutch 2.0 trunk. We should review the plugins in src/plugin
> to make sure they all work with Gora/Nutchbase now.