[ https://issues.apache.org/jira/browse/NUTCH-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Cooper-Ellis updated NUTCH-1872:
-----------------------------------------
Attachment: (was: urlmeta_propagation2.diff)
> enables control over how injected metadata is propagated
> --------------------------------------------------------
>
> Key: NUTCH-1872
> URL: https://issues.apache.org/jira/browse/NUTCH-1872
> Project: Nutch
> Issue Type: New Feature
> Reporter: Jonathan Cooper-Ellis
> Priority: Minor
>
> This builds on NUTCH-655 and NUTCH-855, allowing users some control over
> which outlinks receive injected metadata. A new configuration property
> "urlmeta.rule" has been introduced, with a default value of "all".
> The value "all" indicates that "urlmeta.tags" should be propagated to all
> outlinks. Other options include: "host" (propagated only to outlinks with the
> same host as the url with which the metadata was injected), "domain" (the same,
> but matching on domain rather than host), and "prefix" (treats the injected url
> as a prefix, so metadata is only propagated to urls that extend the injected url).
> Would appreciate feedback on whether you think this is a useful feature, and
> whether it's implemented properly.
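
For readers following along, the four "urlmeta.rule" values described above could be sketched roughly as follows. This is only an illustration of the matching logic, not the code in the attached patch: the class name UrlMetaRuleSketch and the helpers hostOf, domainOf and shouldPropagate are hypothetical, the domain is approximated naively as the last two labels of the host (so suffixes such as "co.uk" are not handled), and rejecting an unknown rule value with an exception is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical sketch of the "urlmeta.rule" matching described in the issue.
class UrlMetaRuleSketch {

    // Returns the lowercase host of a URL, or null if it cannot be parsed.
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Naive assumption: the domain is the last two labels of the host.
    // This ignores multi-part public suffixes such as "co.uk".
    static String domainOf(String url) {
        String host = hostOf(url);
        if (host == null) {
            return null;
        }
        String[] labels = host.split("\\.");
        if (labels.length <= 2) {
            return host;
        }
        return labels[labels.length - 2] + "." + labels[labels.length - 1];
    }

    // Decides whether metadata injected with injectedUrl should be
    // propagated to outlinkUrl under the given rule.
    static boolean shouldPropagate(String rule, String injectedUrl, String outlinkUrl) {
        switch (rule) {
            case "all":
                return true;
            case "host": {
                String host = hostOf(injectedUrl);
                return host != null && host.equals(hostOf(outlinkUrl));
            }
            case "domain": {
                String domain = domainOf(injectedUrl);
                return domain != null && domain.equals(domainOf(outlinkUrl));
            }
            case "prefix":
                // The outlink must extend the injected url.
                return outlinkUrl.startsWith(injectedUrl);
            default:
                // Assumption: unknown values are rejected rather than ignored.
                throw new IllegalArgumentException("Unknown urlmeta.rule value: " + rule);
        }
    }

    public static void main(String[] args) {
        String injected = "http://www.example.com/docs/";
        String[] rules = {"all", "host", "domain", "prefix"};
        String[] outlinks = {
            "http://www.example.com/docs/page.html",
            "http://www.example.com/about",
            "http://blog.example.com/post",
            "http://other.org/"
        };
        for (String rule : rules) {
            for (String outlink : outlinks) {
                System.out.println(rule + " " + outlink + " "
                        + shouldPropagate(rule, injected, outlink));
            }
        }
    }
}
```

Under these assumptions each rule is strictly narrower than the one before it for a given injected url: "prefix" matches a subset of "host", which matches a subset of "domain", which matches a subset of "all".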
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)