Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  ==== How can I recover an aborted fetch process? ====

  You have two choices:

+    1) Use the aborted output.
-    1. Use the aborted output. You'll need to touch the file fetcher.done in the segment directory. All the pages that were not crawled will be re-generated for fetch pretty soon. If you fetched lots of pages, and don't want to have to re-fetch them again, this is the best way.
+       * You'll need to touch the file fetcher.done in the segment directory. All the pages that were not crawled will be re-generated for fetch pretty soon. If you fetched lots of pages, and don't want to have to re-fetch them again, this is the best way.
-    2. Discard the aborted output. To do this, just delete the fetcher* directories in the segment and restart the fetcher.
+    2) Discard the aborted output.
+       * Delete all folders from the segment folder except the fetchlist folder and restart the fetcher.

  ==== Who changes the next fetch date? ====
    * After injecting a new url the next fetch date is set to the current time.
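
For readers who want to script the two recovery choices from the aborted-fetch entry above, here is a minimal Python sketch. It is not part of Nutch: the function names keep_aborted_output and discard_aborted_output are hypothetical, and it assumes the segment is an ordinary local directory that holds a fetchlist folder next to the fetcher output folders. Restarting the fetcher is left to you.

```python
import shutil
from pathlib import Path


def keep_aborted_output(segment_dir):
    """Choice 1: touch fetcher.done in the segment directory."""
    marker = Path(segment_dir) / "fetcher.done"
    marker.touch()  # creates the file if missing, otherwise updates its timestamp
    return marker


def discard_aborted_output(segment_dir, keep="fetchlist"):
    """Choice 2: delete every folder in the segment except the fetchlist folder.

    Plain files in the segment directory are left untouched; only
    sub-folders are removed. Returns the names of the removed folders.
    """
    removed = []
    for entry in sorted(Path(segment_dir).iterdir()):
        if entry.is_dir() and entry.name != keep:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

Call only one of the two functions for a given segment, depending on whether you want to keep or throw away the pages that were already fetched. Try it on a copy of the segment first, since the second function deletes data.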
