[
https://issues.apache.org/jira/browse/NUTCH-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205750#comment-13205750
]
Lewis John McGibbney commented on NUTCH-1129:
---------------------------------------------
Hi Markus. I'm really gutted about this one; I've not had time to sort it out.
I do want to note the following, though.
- Any23 is now available on repository.apache.org [1]; however, I think we need
to change our Ivy resolver to fetch these 0.7.0 snapshots. That should be pretty
trivial, though.
- Any23 already has a crawler plugin implementation (nothing like the stuff we
offer in Nutch ;0)). I'm not familiar with the code, but it might be worth a
look [2]. Unfortunately, the documentation is not great at all, as I'm sure
you'll agree.
[1] https://repository.apache.org/index.html#nexus-search;quick~org.apache.any23
[2] https://svn.apache.org/viewvc/incubator/any23/trunk/plugins/basic-crawler/
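
For the resolver change in the first point, something along these lines in
ivysettings.xml might do it. This is only a rough sketch, not tested against
our build: the resolver names, the chain layout, and the snapshot repository
path are assumptions on my part, and the artifact name and revision in the
comment are guesses that would need checking against [1].

```xml
<ivysettings>
  <settings defaultResolver="default"/>
  <resolvers>
    <!-- Regular releases from Maven Central (Ivy's default ibiblio root). -->
    <ibiblio name="maven2" m2compatible="true"/>

    <!-- Assumed location of the Apache snapshot repository.
         changingPattern and checkmodified make Ivy re-check SNAPSHOT
         revisions instead of trusting its local cache. -->
    <ibiblio name="apache-snapshot"
             root="https://repository.apache.org/snapshots/"
             m2compatible="true"
             changingPattern=".*-SNAPSHOT"
             checkmodified="true"/>

    <!-- Try releases first, then fall back to snapshots. -->
    <chain name="default">
      <resolver ref="maven2"/>
      <resolver ref="apache-snapshot"/>
    </chain>
  </resolvers>
</ivysettings>

<!-- Hypothetical matching entry for the plugin's ivy.xml; the name and rev
     are assumptions:
     <dependency org="org.apache.any23" name="any23-core"
                 rev="0.7.0-incubating-SNAPSHOT" changing="true"/>
-->
```

Putting the release resolver first in the chain should keep ordinary
dependencies from hitting the snapshot repository unnecessarily.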
> Any23 Nutch plugin
> ------------------
>
> Key: NUTCH-1129
> URL: https://issues.apache.org/jira/browse/NUTCH-1129
> Project: Nutch
> Issue Type: New Feature
> Components: parser
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.5
>
>
> This plugin should build on the Any23 library to provide us with a plugin
> which extracts RDF data from HTTP and file resources. Although at the time of
> writing Any23 is not part of the ASF, the project is working towards integration into
> the Apache Incubator. Once the project proves its value, this would be an
> excellent addition to the Nutch 1.X codebase.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira