[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408055#comment-13408055 ]
Julien Nioche commented on NUTCH-1414:
--------------------------------------
I'm concerned about the proliferation of micro-functionalities such as this
one. What do we keep as part of the distribution, and what can be
stored/maintained somewhere else? One could imagine endless variants around
this one, e.g. all sorts of entities (people, locations, emails, etc.), which
is certainly useful for whoever wrote the plugin and possibly a few more people,
but means we have more code to maintain, document, debug, etc.

We could have a page on the wiki ("Plugin Market?") pointing to external
resources. Obviously, the ones which are well maintained, mature and widely used
could make it into our repo. With the Nutch artefacts being accessible through
Ivy/Maven, it would be trivial to write a script to build and test a standalone
plugin.
> Date extraction parse filter
> ----------------------------
>
> Key: NUTCH-1414
> URL: https://issues.apache.org/jira/browse/NUTCH-1414
> Project: Nutch
> Issue Type: New Feature
> Components: parser
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.6
>
> Attachments: NUTCH-1414-1.6-1.patch
>
>
> Date extraction parse filter for Nutch to provide a means to extract an
> arbitrary page date (article date) from the parse text.