[jira] [Assigned] (NUTCH-1403) Add default ScoringFilter for manipulating metadata
[ https://issues.apache.org/jira/browse/NUTCH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel reassigned NUTCH-1403:
--------------------------------------

    Assignee:     (was: Sebastian Nagel)

> Add default ScoringFilter for manipulating metadata
> ---------------------------------------------------
>
>                 Key: NUTCH-1403
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1403
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Julien Nioche
>            Priority: Major
>             Fix For: 1.19
>
>
> This is currently done by the urlmeta plugin, which has too vague a name and
> a redundant indexing filter now that we have the index-metadata plugin. This
> scoring filter would help define which metadata to pass from:
> - the crawl metadata to the content metadata
> - the content metadata to the parse metadata
> - the parse metadata to the crawldatum for the outlinks
> I'd make this scoring filter available by default, i.e. not in a separate
> plugin, as its functionality is commonly used.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Assigned] (NUTCH-1403) Add default ScoringFilter for manipulating metadata
[ https://issues.apache.org/jira/browse/NUTCH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel reassigned NUTCH-1403:
--------------------------------------

    Assignee: Sebastian Nagel

> Add default ScoringFilter for manipulating metadata
> ---------------------------------------------------
>
>                 Key: NUTCH-1403
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1403
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Julien Nioche
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.17
>
>
> This is currently done by the urlmeta plugin, which has too vague a name and
> a redundant indexing filter now that we have the index-metadata plugin. This
> scoring filter would help define which metadata to pass from:
> - the crawl metadata to the content metadata
> - the content metadata to the parse metadata
> - the parse metadata to the crawldatum for the outlinks
> I'd make this scoring filter available by default, i.e. not in a separate
> plugin, as its functionality is commonly used.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
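
The three metadata hops listed in the issue description could be sketched roughly as below. This is only an illustration under stated assumptions, not the filter proposed in NUTCH-1403 and not Nutch's ScoringFilter API: plain Java maps stand in for the crawl, content, parse and outlink metadata, and the class name MetadataTransferSketch, the method copyKeys and all metadata keys are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: plain maps stand in for Nutch's metadata containers.
 * None of these names come from Nutch or from the issue.
 */
public class MetadataTransferSketch {

    /** Copies only the listed keys from source to target; keys absent in source are skipped. */
    public static void copyKeys(Map<String, String> source,
                                Map<String, String> target,
                                List<String> keys) {
        for (String key : keys) {
            String value = source.get(key);
            if (value != null) {
                target.put(key, value);
            }
        }
    }

    public static void main(String[] args) {
        // Assumed per-hop key lists; a real filter would presumably read them from configuration.
        List<String> crawlToContent = List.of("seed.category", "seed.lang");
        List<String> contentToParse = List.of("seed.category");
        List<String> parseToOutlinks = List.of("seed.category");

        Map<String, String> crawlMeta = new HashMap<>();
        crawlMeta.put("seed.category", "news");
        crawlMeta.put("seed.lang", "en");
        crawlMeta.put("internal.retry", "3"); // not listed anywhere, so never propagated

        Map<String, String> contentMeta = new HashMap<>();
        Map<String, String> parseMeta = new HashMap<>();
        Map<String, String> outlinkMeta = new HashMap<>();

        // Hop 1: crawl metadata -> content metadata
        copyKeys(crawlMeta, contentMeta, crawlToContent);
        // Hop 2: content metadata -> parse metadata
        copyKeys(contentMeta, parseMeta, contentToParse);
        // Hop 3: parse metadata -> metadata of the crawldatum created for an outlink
        copyKeys(parseMeta, outlinkMeta, parseToOutlinks);

        System.out.println("content: " + contentMeta);
        System.out.println("parse:   " + parseMeta);
        System.out.println("outlink: " + outlinkMeta);
    }
}
```

Keeping a separate key list per hop means a value reaches the outlinks only if every hop explicitly lists it, which is one way to read "defining which metadata to pass" in the description.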