Hi Jungshik,

Sorry for resurrecting this old thread. I have to stop 'the fetch' process but I can't 'afford' to lose
all the pages that were fetched because of the time constraint. It's a relief to find this message.
Please read:
> I have a big fetchlist in my segments folder. How can I fetch only some sites at a time? <
on
http://www.nutch.org/cgi-bin/twiki/view/Main/FAQ


If you stop the fetch process and update the db with only the few pages fetched so far, the unfetched pages will be scheduled for fetching again in 7 days.

If you do not want to wait, you can use bin/nutch generate -adddays 7 (or 8, I am not sure) to create a new segment/fetchlist to fetch.
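
To illustrate why -adddays helps, here is a minimal Python sketch of the scheduling arithmetic. It is only a model, not Nutch code: I am assuming the generator compares each page's next fetch time against the current time plus the -adddays value. The names is_due and FETCH_INTERVAL_DAYS and the date are made up for this illustration.

```python
from datetime import datetime, timedelta

# Assumed re-fetch interval, taken from the "7 days" mentioned above.
FETCH_INTERVAL_DAYS = 7


def is_due(next_fetch: datetime, now: datetime, add_days: int = 0) -> bool:
    """Return True if a page would be selected for a new fetchlist.

    Assumed model: the generator behaves as if the clock read
    now + add_days and selects every page whose next fetch time
    is not later than that.
    """
    return next_fetch <= now + timedelta(days=add_days)


# Arbitrary example time; the page was rescheduled by updatedb one hour ago.
now = datetime(2005, 7, 1, 12, 0)
next_fetch = now - timedelta(hours=1) + timedelta(days=FETCH_INTERVAL_DAYS)

print(is_due(next_fetch, now))              # False: not due for almost 7 days
print(is_due(next_fetch, now, add_days=7))  # True: the shifted clock passes it
```

Under this model 7 is enough as long as generate runs after updatedb. If the real comparison is stricter, 8 would be the safe choice, which is why I am not sure which value you need.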

Pages that were already fetched should be handled correctly. Run updatedb and use the half-fetched segment like any other segment.

Matthias


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
