Yes, this is supported in trunk and will still be supported when switching to 
Tika for outlink extraction. Anchors with NOFOLLOW will simply be discarded.
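
For illustration only, here is a rough Python sketch of that behaviour. It is not Nutch's actual Java code, and the names OutlinkExtractor and extract_outlinks are made up for this example. It assumes that "discarded" means the anchor's href is never added to the outlink list: anchors whose rel attribute contains the nofollow token are dropped, and every other href is kept.

```python
from html.parser import HTMLParser


class OutlinkExtractor(HTMLParser):
    """Hypothetical helper: collects hrefs from <a> tags, skipping nofollow anchors."""

    def __init__(self):
        super().__init__()
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel holds a space-separated list of tokens; compare case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            return  # discard the anchor, it never becomes an outlink
        self.outlinks.append(href)


def extract_outlinks(html):
    """Return the hrefs of all anchors that are not marked rel="nofollow"."""
    parser = OutlinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.outlinks
```

Calling extract_outlinks() on a page's HTML returns only the links a crawler honouring rel="nofollow" would queue.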
 
 
-----Original message-----
> From:Lewis John Mcgibbney <[email protected]>
> Sent: Thu 16-Aug-2012 10:12
> To: [email protected]
> Subject: Re: Can Nutch process rel-tag likes rel="nofollow"?
> 
> Currently it looks like we don't have full support for such
> functionality. It is straightforward to grab the nofollow rel tag, but
> the post-processing is not currently implemented, so you would
> need to do this yourself.
> 
> Lewis
> 
> On Thu, Aug 16, 2012 at 5:27 AM, weishenyun <[email protected]> wrote:
> > I know Nutch crawls a website according to the robots protocol if you
> > configure it to, and it will not fetch and parse the links on a page that
> > contains <meta name="robots" content="nofollow">. But can Nutch process
> > rel-tags like rel="nofollow" in the tags  ......  on the page?
> >
> >
> >
> > --
> > View this message in context: 
> > http://lucene.472066.n3.nabble.com/Can-Nutch-process-rel-tag-likes-rel-nofollow-tp4001541.html
> > Sent from the Nutch - Dev mailing list archive at Nabble.com.
> 
> 
> 
> -- 
> Lewis
> 
