[ https://issues.apache.org/jira/browse/NUTCH-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545766#comment-13545766 ]

Ferdy Galema commented on NUTCH-1508:
-------------------------------------

NUTCH-1431 (aka the 'distance' concept) only defines a global limit. However,
for an internal branch I created a hack that allows specifying it on a
per-host basis using the host table. Not very clean.

I think NUTCH-1331 is the better approach, because it is indeed less intrusive
and because it allows defining a scoring instead of ignoring depth-exceeding
URLs. (It also keeps the differences between 1.x and 2.x to a minimum.) So when
this gets implemented for 2.x, we can throw away the changes from NUTCH-1431.
                
> Port limit crawler to defined depth to 2.x
> ------------------------------------------
>
>                 Key: NUTCH-1508
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1508
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 2.2
>            Reporter: Julien Nioche
>


