Hi all,

I'm trying to run Nutch so that it only discovers new URLs within a
given depth (e.g. 4) and recrawls indefinitely. Once the given depth is
reached, it restarts with the existing crawldb and adds new URLs (again
within the given depth), so it continuously fetches the most up-to-date sites.

To achieve this, I'm planning to write a custom urlfilter plugin which
checks the current depth and behaves accordingly. Is there a simpler or
more elegant way to solve this?
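As a rough sketch of the logic I have in mind (class and method names here are
purely illustrative and self-contained, not Nutch's actual plugin API -- a real
plugin would implement Nutch's URLFilter interface, which returns null to
reject a URL):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical depth-limited filter: tracks each URL's discovery depth
// and rejects outlinks that would exceed a configured maximum depth.
public class DepthFilter {
    private final int maxDepth;
    private final Map<String, Integer> depthOf = new HashMap<>();

    public DepthFilter(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Seed URLs start at depth 0.
    public void addSeed(String url) {
        depthOf.put(url, 0);
    }

    // Returns the outlink URL if it is within maxDepth, otherwise null
    // (mirroring the convention where null means "filtered out").
    public String filter(String fromUrl, String outlink) {
        int parentDepth = depthOf.getOrDefault(fromUrl, 0);
        if (parentDepth + 1 > maxDepth) {
            return null;
        }
        // Keep the shallowest depth seen for a URL reachable via
        // multiple paths.
        depthOf.merge(outlink, parentDepth + 1, Math::min);
        return outlink;
    }
}
```

The open question for me is how a urlfilter plugin would know the current
depth at all, since that state lives in the crawldb rather than in the URL
itself, which is why I'm hoping there's a simpler built-in way.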

Thanks in advance,

Tugcem.

-- 
TO
