[ https://issues.apache.org/jira/browse/NUTCH-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545766#comment-13545766 ]
Ferdy Galema commented on NUTCH-1508:
-------------------------------------
NUTCH-1431 (aka the 'distance' concept) only defines a global limit. However,
for an internal branch I created a hack that allows specifying it on a
per-host basis using the host table. It is not very clean.
I think NUTCH-1331 is the better approach, because it is indeed less intrusive
and because it allows defining a scoring instead of ignoring depth-exceeding
URLs. (It also keeps the differences between 1.x and 2.x to a minimum.) So
once this is implemented for 2.x, we can throw away the changes in NUTCH-1431.
> Port limit crawler to defined depth to 2.x
> ------------------------------------------
>
> Key: NUTCH-1508
> URL: https://issues.apache.org/jira/browse/NUTCH-1508
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 2.2
> Reporter: Julien Nioche
>