[ https://issues.apache.org/jira/browse/NUTCH-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847291#action_12847291 ]
Andrzej Bialecki commented on NUTCH-693:
-----------------------------------------

Thanks for the pointer to the article. Indeed, the issue is muddy at best. So far Nutch has adhered to a strict interpretation, where links with this attribute are deleted from the page's outlinks immediately (so they are not only not followed but also don't affect out-degree metrics).

If there is general agreement in the Nutch community on relaxing this behavior, we can develop this patch further; at the moment I don't see such support. Consequently, I propose that we discuss it and, in the meantime, move this issue to a later release.

> Add configurable option for treating nofollow behaviour.
> --------------------------------------------------------
>
> Key: NUTCH-693
> URL: https://issues.apache.org/jira/browse/NUTCH-693
> Project: Nutch
> Issue Type: New Feature
> Reporter: Andrew McCall
> Assignee: Otis Gospodnetic
> Priority: Minor
> Attachments: nutch.nofollow.patch
>
>
> For my purposes I'd like to follow links even if they're marked nofollow.
> Ideally I'd like to follow them, but not pass the link juice between them.
> I've attached a patch that adds a configuration element
> parser.html.outlinks.ignore_nofollow which allows the parser to ignore the
> nofollow elements on a page.
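
To make the two behaviours under discussion concrete, here is a minimal sketch of the difference between the strict interpretation (nofollow links are dropped from the outlink list, so they are neither followed nor counted) and the relaxed one the patch proposes. This is not Nutch's parser code and not the contents of nutch.nofollow.patch: the class NofollowOutlinks, the method extract, and the ignoreNofollow flag are hypothetical names chosen for illustration, with the flag standing in for parser.html.outlinks.ignore_nofollow. The regex-based tag matching is a simplification that assumes double-quoted attribute values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only; this is not how Nutch's HTML parser is implemented.
public class NofollowOutlinks {
    // Matches an opening <a ...> tag and captures its attribute text.
    private static final Pattern ANCHOR =
        Pattern.compile("<a\\s+([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Simplification: only double-quoted attribute values are recognised.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern REL =
        Pattern.compile("rel\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the href targets found in the given HTML.
     * With ignoreNofollow == false (strict), links carrying rel="nofollow"
     * are left out of the result entirely. With ignoreNofollow == true,
     * they are kept like any other link.
     */
    public static List<String> extract(String html, boolean ignoreNofollow) {
        List<String> outlinks = new ArrayList<>();
        Matcher anchor = ANCHOR.matcher(html);
        while (anchor.find()) {
            String attrs = anchor.group(1);
            Matcher href = HREF.matcher(attrs);
            if (!href.find()) {
                continue; // anchor without a usable href
            }
            if (!ignoreNofollow && hasNofollow(attrs)) {
                continue; // strict mode: drop the link from the outlinks
            }
            outlinks.add(href.group(1));
        }
        return outlinks;
    }

    // True if the rel attribute contains the token "nofollow".
    private static boolean hasNofollow(String attrs) {
        Matcher rel = REL.matcher(attrs);
        if (!rel.find()) {
            return false;
        }
        for (String token : rel.group(1).trim().split("\\s+")) {
            if (token.equalsIgnoreCase("nofollow")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://example.com/a\">a</a>"
                    + "<a rel=\"nofollow\" href=\"http://example.com/b\">b</a>";
        System.out.println(extract(html, false)); // [http://example.com/a]
        System.out.println(extract(html, true));  // [http://example.com/a, http://example.com/b]
    }
}
```

Note that the sketch covers only whether a nofollow link appears in the outlink list. The reporter's further wish, following such links without passing link score through them, would need the links to be kept but marked, which this sketch does not attempt.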