[ 
https://issues.apache.org/jira/browse/NUTCH-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki closed NUTCH-704.
-----------------------------------

    Resolution: Invalid

Please see the ScoringFilter framework, and the 
org.apache.nutch.scoring.webgraph package.

> ensure that more important pages are crawled first
> --------------------------------------------------
>
>                 Key: NUTCH-704
>                 URL: https://issues.apache.org/jira/browse/NUTCH-704
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>            Reporter: kr
>            Priority: Minor
>
> Implement the URL ordering algorithms described in the paper "Efficient 
> Crawling Through URL Ordering" by Lawrence Page et al., so that more 
> "important" pages are crawled first. This matters because even the most 
> powerful and successful search engines have crawled only 15% of 
> the WWW. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
