Hi,

Doesn't the plugin scoring-depth already do what you need?
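
For reference, a minimal sketch of enabling it in conf/nutch-site.xml. This assumes the plugin id is scoring-depth and that the depth limit is read from the scoring.depth.max property, as listed in nutch-default.xml. Check both against your Nutch version. The plugin.includes value is only illustrative and should be merged with the list you already use:

```xml
<!-- Assumed values: adapt plugin.includes to your existing plugin list. -->
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-(opic|depth)|urlnormalizer-(pass|regex|basic)</value>
</property>

<!-- Maximum distance from a seed URL; 4 matches the depth in the question. -->
<property>
  <name>scoring.depth.max</name>
  <value>4</value>
</property>
```

With a limit like this, repeated generate/fetch/update rounds against the same crawldb should keep refetching known pages as they come due without following links beyond the configured depth.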

Julien


On 12 February 2014 14:32, Tuğcem Oral <[email protected]> wrote:

> Hi all,
>
> I'm trying to run Nutch so that it only discovers new URLs within a
> given depth (e.g. 4) and recrawls indefinitely. Once the given depth is
> finished, it restarts with the existing crawldb and adds new URLs (again
> within the given depth), so it continuously fetches the most up-to-date sites.
>
> To achieve this, I'm planning to write a custom urlfilter plugin which
> checks the current depth and behaves accordingly. Is there a simpler or
> more elegant way to solve this issue?
>
> Thanks in advance,
>
> Tugcem.
>
> --
> TO
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
