Hi,

> Having looked at the wiki, NUTCH-655, and NUTCH-855, it seems like using
> the urlmeta plugin out of the box would not achieve this, because the
> metadata would be propagated to all outlinks (which presumably would
> include its parent, et al.).
>
> Is this correct? If so, is there any built-in way to do this or do I need
> to figure something out?

Yes, that's right.

But it would be easy to add such a check in distributeScoreToOutlinks()
of URLMetaScoringFilter, so that metadata is only passed on to outlinks
that match the parent URL. It may also be a good idea to make this
functionality generally available via a property and predefined
match classes (e.g., same prefix, same host, same domain). Feel free to
open an issue for that feature.
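
A rough sketch of what such a check could look like, written as
standalone Java rather than actual Nutch code. The class OutlinkMatch,
the Mode enum and shouldPropagate() are hypothetical names and are not
part of Nutch or the urlmeta plugin. Wiring the check into
distributeScoreToOutlinks() is left out. Only the "same prefix" and
"same host" classes are covered here, since "same domain" would need a
public suffix lookup.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, NOT part of Nutch: decides whether an outlink
// should inherit the metadata that was injected with its parent URL.
final class OutlinkMatch {

    // Assumed match classes; "same domain" is intentionally omitted.
    enum Mode { PREFIX, HOST }

    static boolean shouldPropagate(String parentUrl, String outlinkUrl, Mode mode) {
        URI parent;
        URI outlink;
        try {
            parent = URI.create(withScheme(parentUrl));
            outlink = URI.create(withScheme(outlinkUrl));
        } catch (IllegalArgumentException e) {
            return false; // unparsable URL: do not propagate
        }

        String parentHost = host(parent);
        if (parentHost.isEmpty() || !parentHost.equals(host(outlink))) {
            return false; // different (or unknown) host
        }
        if (mode == Mode.HOST) {
            return true;
        }

        // PREFIX: the outlink path must lie strictly below the parent
        // path, i.e. match <parent>/.+
        String prefix = path(parent);
        if (!prefix.endsWith("/")) {
            prefix += "/";
        }
        String candidate = path(outlink);
        return candidate.startsWith(prefix) && candidate.length() > prefix.length();
    }

    // Seed lists often omit the scheme; assume http so URI can parse the host.
    private static String withScheme(String url) {
        return url.contains("://") ? url : "http://" + url;
    }

    private static String host(URI uri) {
        String h = uri.getHost();
        return h == null ? "" : h.toLowerCase(Locale.ROOT);
    }

    private static String path(URI uri) {
        String p = uri.getPath();
        return p == null ? "" : p;
    }

    public static void main(String[] args) {
        String seed = "www.fakenews.com/boston";
        System.out.println(shouldPropagate(seed, "www.fakenews.com/boston/sports", Mode.PREFIX)); // true
        System.out.println(shouldPropagate(seed, "www.fakenews.com/atlanta", Mode.PREFIX));       // false
        System.out.println(shouldPropagate(seed, "www.fakenews.com", Mode.PREFIX));               // false
    }
}
```

Appending "/" to the parent path before comparing keeps a sibling such
as www.fakenews.com/bostonian from being treated as a child of
www.fakenews.com/boston.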

Thanks,
Sebastian

On 10/06/2014 11:00 PM, Jonathan Cooper-Ellis wrote:
> Hello,
> 
> I am interested in injecting metadata and propagating that to its children
> only.
> 
> For example, I want to inject www.fakenews.com/boston along with some
> metadata that is specific to Boston, so I don't want it to be propagated to
> www.fakenews.com or www.fakenews.com/atlanta. It should only go to
> www.fakenews.com/boston/.+
> 
> Having looked at the wiki, NUTCH-655, and NUTCH-855, it seems like using
> the urlmeta plugin out of the box would not achieve this, because the
> metadata would be propagated to all outlinks (which presumably would
> include its parent, et al.).
> 
> Is this correct? If so, is there any built-in way to do this or do I need
> to figure something out?
> 
> Thanks,
> jce
> 
