Please unsubscribe!!!

On Jun 15, 2015 4:14 PM, "Sujen Shah (JIRA)" <[email protected]> wrote:
>
> [ https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587113#comment-14587113 ]
>
> Sujen Shah commented on NUTCH-2039:
> -----------------------------------
>
> Done, updated the PR.
>
> > Relevance based scoring filter
> > ------------------------------
> >
> > Key: NUTCH-2039
> > URL: https://issues.apache.org/jira/browse/NUTCH-2039
> > Project: Nutch
> > Issue Type: New Feature
> > Reporter: Sujen Shah
> > Labels: memex, nutch
> > Fix For: 1.11
> >
> >
> > A ScoringFilter plugin that uses a similarity measure to calculate the
> > similarity between a given page(gold standard) and the currently parsed
> > page. The score obtained from this similarity is then distributed to its
> > outlinks. This filter aims to focus the crawler to crawl/explore relevant
> > pages.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)

