Hey Sebastian,

Opened an issue/took a stab at it: https://issues.apache.org/jira/browse/NUTCH-1872

Thanks,
jce

On Tue, Oct 7, 2014 at 9:32 AM, Sebastian Nagel <[email protected]> wrote:

> Hi,
>
> > Having looked at the wiki, NUTCH-655, and NUTCH-855, it seems like using
> > the urlmeta plugin out of the box would not achieve this, because the
> > metadata would be propagated to all outlinks (which presumably would
> > include its parent, et al.).
> >
> > Is this correct? If so, is there any built-in way to do this or do I need
> > to figure something out?
>
> Yes, that's right.
>
> But it would be easy to add the check in distributeScoreToOutlinks()
> of URLMetaScoringFilter. Maybe it's also a good idea to make this
> functionality generally available via a property and predefined
> match classes (eg, same prefix, same host, same domain). Feel free to
> open an issue for that feature.
>
> Thanks,
> Sebastian
>
> On 10/06/2014 11:00 PM, Jonathan Cooper-Ellis wrote:
> > Hello,
> >
> > I am interested in injecting metadata and propagating that to its children
> > only.
> >
> > For example, if I want to inject www.fakenews.com/boston along with some
> > metadata that is specific to Boston, so I don't want it to be propagated to
> > www.fakenews.com or www.fakenews.com/atlanta. It should only go to
> > www.fakenews.com/boston/.+
> >
> > Having looked at the wiki, NUTCH-655, and NUTCH-855, it seems like using
> > the urlmeta plugin out of the box would not achieve this, because the
> > metadata would be propagated to all outlinks (which presumably would
> > include its parent, et al.).
> >
> > Is this correct? If so, is there any built-in way to do this or do I need
> > to figure something out?
> >
> > Thanks,
> > jce
> >

--
Jonathan Cooper-Ellis
*Data Engineer*
myVBO, LLC dba Ziftr
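
For anyone following along, here is a rough sketch of the kind of check discussed above (same prefix / same host). It is not the patch attached to NUTCH-1872 and does not use any Nutch classes: the class name OutlinkMatch, the Mode enum, and shouldPropagate() are all hypothetical names made up for illustration. It assumes URLs carry a scheme (e.g. http://), ignores scheme and port differences, and leaves out the "same domain" case, which would need a domain-suffix lookup.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper, not part of Nutch: decides whether an outlink is a "child" of a URL. */
public class OutlinkMatch {

    /** Two of the match classes mentioned in the thread; the names are assumptions. */
    public enum Mode { SAME_PREFIX, SAME_HOST }

    /**
     * Returns true if metadata attached to parentUrl should be copied to outlinkUrl.
     * Unparseable URLs and URLs without a host never match.
     */
    public static boolean shouldPropagate(String parentUrl, String outlinkUrl, Mode mode) {
        try {
            URI parent = new URI(parentUrl);
            URI outlink = new URI(outlinkUrl);

            String parentHost = parent.getHost();
            String outlinkHost = outlink.getHost();
            if (parentHost == null || outlinkHost == null
                    || !parentHost.equalsIgnoreCase(outlinkHost)) {
                return false;
            }
            if (mode == Mode.SAME_HOST) {
                return true;
            }

            // SAME_PREFIX: the outlink path must extend "<parent path>/" by at least
            // one character, mirroring the www.fakenews.com/boston/.+ example.
            String parentPath = parent.getPath();
            if (!parentPath.endsWith("/")) {
                parentPath = parentPath + "/";
            }
            String outlinkPath = outlink.getPath();
            return outlinkPath.startsWith(parentPath)
                    && outlinkPath.length() > parentPath.length();
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

With SAME_PREFIX and http://www.fakenews.com/boston as the injected URL, an outlink under /boston/ would get the metadata, while http://www.fakenews.com and http://www.fakenews.com/atlanta would not. In a scoring filter, a check like this would presumably wrap the step that copies metadata onto each outlink.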

