Daniele Menozzi wrote:

ok, so the depth value is only used to stop the crawling at a certain
point, and proceed with the indexing, right?

Yes. Depth is in fact the number of iterations of the generate/fetch/update cycle.
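To illustrate how depth bounds the crawl, here is a toy sketch in Python. It is not Nutch code: the in-memory link graph, the `crawl` function and the dictionary standing in for the crawl database are all made up for the example. Each pass of the loop is one generate/fetch/update round, and pages discovered in the last round stay known but unfetched.

```python
def crawl(link_graph, seeds, depth):
    """Run `depth` generate/fetch/update rounds over a toy link graph.

    link_graph maps a URL to the list of URLs it links to (hypothetical data).
    Returns a dict mapping each known URL to True if it was fetched.
    """
    crawl_db = {url: False for url in seeds}

    for _ in range(depth):
        # generate: pick every known URL that has not been fetched yet
        fetch_list = sorted(url for url, fetched in crawl_db.items() if not fetched)
        if not fetch_list:
            break  # nothing left to fetch, stop early

        # fetch: "download" each page and collect its outlinks
        discovered = []
        for url in fetch_list:
            discovered.extend(link_graph.get(url, []))
            crawl_db[url] = True

        # update: add newly discovered URLs to the crawl database as unfetched
        for url in discovered:
            crawl_db.setdefault(url, False)

    return crawl_db


# A chain of four pages: a -> b -> c -> d
LINKS = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}

result = crawl(LINKS, ["a"], depth=2)
fetched = sorted(url for url, done in result.items() if done)
print(fetched)  # pages fetched within two rounds
print(result)   # "c" is known but was never fetched
```

With depth 2 the crawl stops after fetching "a" and "b"; "c" has been discovered but would only be fetched in a third round.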

But, another thing: how can I refresh old pages? What class do I have to
use?

You do not need a special class: nutch generate will include already fetched pages in a new segment for fetching again after some time (I think the default is 30 days, and you can change it in the config file). If you then deduplicate the segments, the old copy of the page is removed from the index.
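The selection rule can be sketched roughly like this in Python. Again this is only an illustration, not the Nutch implementation: `due_for_refetch`, `generate` and the sample dates are invented for the example, and the 30-day interval is just the default I mentioned above, not a verified value.

```python
from datetime import datetime, timedelta

# Assumed refetch interval, taken from the "30 days" guess above
FETCH_INTERVAL = timedelta(days=30)


def due_for_refetch(last_fetched, now, interval=FETCH_INTERVAL):
    """A page is due if it was never fetched or its last fetch is too old."""
    return last_fetched is None or now - last_fetched >= interval


def generate(crawl_db, now, interval=FETCH_INTERVAL):
    """Return the URLs that would go into the next segment's fetch list."""
    return sorted(
        url
        for url, last_fetched in crawl_db.items()
        if due_for_refetch(last_fetched, now, interval)
    )


# Hypothetical crawl database: URL -> time of last fetch (None = never fetched)
now = datetime(2006, 3, 1)
crawl_db = {
    "old": datetime(2006, 1, 1),      # fetched 59 days ago, due again
    "recent": datetime(2006, 2, 20),  # fetched 9 days ago, not due
    "new": None,                      # never fetched, always due
}

fetch_list = generate(crawl_db, now)
print(fetch_list)
```

So an old page simply shows up in a later fetch list once its interval has expired, alongside pages that were never fetched.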
regards
Piotr
