[ https://issues.apache.org/jira/browse/NUTCH-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797013#action_12797013 ]
Andrzej Bialecki commented on NUTCH-655:
-----------------------------------------

I'm not sure about the latest addition (the score option). If we go this route, then I suggest doing the last minor step and recognize reserved metadata keys to do also other useful things like setting fetch interval. I.e. define and recognize "nutch.score" and "nutch.fetchInterval", and document it properly somewhere ...(wiki? javadoc? cmd-line synopsis?).

> Injecting Crawl metadata
> ------------------------
>
>                 Key: NUTCH-655
>                 URL: https://issues.apache.org/jira/browse/NUTCH-655
>             Project: Nutch
>          Issue Type: Improvement
>          Components: injector
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: Injector.patch, NUTCH-655.v2
>
>
> the patch attached allows to inject metadata into the crawlDB. The input file
> has to contain fields separated by tabs, with the URL being on the first
> column. The metadata names and values are separated by '='. A input line
> might look like this:
> http://www.myurl.com \t categ=value1 \t categ2=value2
> This functionality can be useful to store external knowledge and index it
> with a custom plugin
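
For illustration only, the input format described in the issue (URL in the first column, then tab-separated name=value fields) could be parsed roughly as in the sketch below. This is not the code from Injector.patch: the class InjectLineSketch, the holder InjectedUrl and the method parse are hypothetical names, and the handling of "nutch.score" and "nutch.fetchInterval" only mirrors the reserved keys suggested in the comment, which are a proposal at this point.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the seed-line format; not the attached patch.
class InjectLineSketch {

    // Reserved keys as proposed in the comment (assumed, not confirmed).
    static final String SCORE_KEY = "nutch.score";
    static final String FETCH_INTERVAL_KEY = "nutch.fetchInterval";

    // Simple holder for one parsed input line.
    static final class InjectedUrl {
        final String url;
        final Map<String, String> metadata = new LinkedHashMap<>();
        Float score;            // null when the line does not set it
        Integer fetchInterval;  // null when the line does not set it

        InjectedUrl(String url) {
            this.url = url;
        }
    }

    // Splits on tabs: the first field is the URL, the rest are name=value pairs.
    // Throws NumberFormatException if a reserved key has a non-numeric value.
    static InjectedUrl parse(String line) {
        String[] fields = line.split("\t");
        InjectedUrl result = new InjectedUrl(fields[0].trim());
        for (int i = 1; i < fields.length; i++) {
            String field = fields[i].trim();
            int eq = field.indexOf('=');
            if (eq <= 0) {
                continue; // no '=' or empty name: ignore the field
            }
            String name = field.substring(0, eq).trim();
            String value = field.substring(eq + 1).trim();
            if (SCORE_KEY.equals(name)) {
                result.score = Float.valueOf(value);
            } else if (FETCH_INTERVAL_KEY.equals(name)) {
                result.fetchInterval = Integer.valueOf(value);
            } else {
                result.metadata.put(name, value);
            }
        }
        return result;
    }
}
```

Treating the reserved keys separately keeps them out of the generic metadata map, so a custom indexing plugin would only see the user-supplied entries such as categ and categ2.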