Yes, this is supported in trunk, and it will still be supported once we switch to
Tika for outlink extraction. Anchors with rel="nofollow" are simply discarded.
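
To illustrate what "discarded" means here, a minimal sketch in Python (not
Nutch's actual Java code; the names OutlinkExtractor and extract_outlinks are
made up for this example). It assumes the rule is simply: an anchor whose rel
attribute contains the token "nofollow" contributes no outlink.

```python
from html.parser import HTMLParser


class OutlinkExtractor(HTMLParser):
    """Hypothetical helper: collect href targets of <a> tags,
    skipping anchors marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = attr_map.get("rel", "").lower().split()
        if "nofollow" in rel_tokens:
            return  # discard the anchor, no outlink is recorded
        self.outlinks.append(href)


def extract_outlinks(html):
    """Return the hrefs of all anchors that are not rel="nofollow"."""
    parser = OutlinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.outlinks


page = (
    '<a href="/a">a</a> '
    '<a rel="NOFOLLOW" href="/b">b</a> '
    '<a rel="external nofollow" href="/c">c</a>'
)
print(extract_outlinks(page))
```

With the sample page above, only the first anchor would be kept as an outlink;
the other two carry the nofollow token and are dropped.
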
-----Original message-----
> From: Lewis John Mcgibbney <[email protected]>
> Sent: Thu 16-Aug-2012 10:12
> To: [email protected]
> Subject: Re: Can Nutch process rel-tag likes rel="nofollow"?
>
> Currently it looks like we don't have full support for such
> functionality. It is straightforward to grab the nofollow rel tag, but
> the post-processing is not currently implemented, so you would
> need to do this yourself.
>
> Lewis
>
> On Thu, Aug 16, 2012 at 5:27 AM, weishenyun <[email protected]> wrote:
> > I know Nutch crawls a website according to the robots protocol if you
> > configure it to. It will not fetch and parse the links on a page that
> > contains <meta name="robots" content="nofollow">. But can Nutch process
> > a rel tag like rel="nofollow" in the tags ...... on the page?