Jonathan Cooper-Ellis created NUTCH-1872:
--------------------------------------------
Summary: enables control over how injected metadata is propagated
Key: NUTCH-1872
URL: https://issues.apache.org/jira/browse/NUTCH-1872
Project: Nutch
Issue Type: New Feature
Reporter: Jonathan Cooper-Ellis
Priority: Minor
This builds on NUTCH-655 and NUTCH-855, allowing users some control over which
outlinks receive injected metadata. A new configuration property "urlmeta.rule"
has been introduced, with a default value of "all".
The value "all" indicates that the metadata named in "urlmeta.tags" should be
propagated to all outlinks. Other options include: "host" (propagated only to
outlinks with the same host as the url with which the metadata was injected),
"domain" (same, except matching on the domain rather than the host), and
"prefix" (treats the injected url as a prefix, so metadata is only propagated
to urls that extend the injected url).
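To make the four rules concrete, here is a minimal sketch of the matching
logic in Java. It is not the code from the patch: the class and method names
are hypothetical, the domain extraction is a naive "last two host labels"
approximation (it does not consult a public suffix list and mishandles IP
addresses), and rejecting unknown rule values is an assumption, since the
description above does not say how they are handled.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical illustration of the "urlmeta.rule" options; not the patch code.
final class UrlMetaRuleSketch {

  /**
   * Returns true if metadata injected with injectedUrl should be copied to
   * outlinkUrl under the given rule ("all", "host", "domain" or "prefix").
   */
  static boolean shouldPropagate(String rule, String injectedUrl, String outlinkUrl) {
    switch (rule) {
      case "all":
        return true;
      case "host": {
        String injectedHost = hostOf(injectedUrl);
        return injectedHost != null && injectedHost.equals(hostOf(outlinkUrl));
      }
      case "domain": {
        String injectedDomain = naiveDomainOf(hostOf(injectedUrl));
        return injectedDomain != null
            && injectedDomain.equals(naiveDomainOf(hostOf(outlinkUrl)));
      }
      case "prefix":
        // The outlink must extend the injected url.
        return outlinkUrl.startsWith(injectedUrl);
      default:
        // Assumption: unknown values are rejected rather than treated as "all".
        throw new IllegalArgumentException("Unknown urlmeta.rule value: " + rule);
    }
  }

  /** Lower-cased host of the url, or null if it cannot be parsed. */
  static String hostOf(String url) {
    try {
      String host = new URI(url).getHost();
      return host == null ? null : host.toLowerCase(Locale.ROOT);
    } catch (URISyntaxException e) {
      return null;
    }
  }

  /** Naive domain: the last two dot-separated labels of the host. */
  static String naiveDomainOf(String host) {
    if (host == null) {
      return null;
    }
    String[] labels = host.split("\\.");
    if (labels.length <= 2) {
      return host;
    }
    return labels[labels.length - 2] + "." + labels[labels.length - 1];
  }
}
```

With these assumptions, metadata injected with http://a.example.com/docs/
would reach http://b.example.com/ under "domain" but not under "host", and
would reach http://a.example.com/docs/page.html under "prefix".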
I would appreciate feedback on whether you think this is a useful feature, and
whether it is implemented properly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)