Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  1) Recover the pages already fetched and than restart the fetcher.

- You'll need to create a file '''fetcher.done''' in the segment directory an than: updatedb, generate and fetch.
+ You'll need to create a file fetcher.done in the segment directory an than: updatedb, generate and fetch.
  Assuming your index is at /index
  {{{
  % touch /index/segments/2005somesegment/fetcher.done
@@ -91, +91 @@
  All the pages that were not crawled will be re-generated for fetch. If you fetched lots of pages, and don't want to have to re-fetch them again, this is the best way.

  2) Discard the aborted output.
- 
+ 
  Delete all folders from the segment folder except the fetchlist folder and restart the fetcher.

  ==== Who changes the next fetch date? ====
@@ -209, +209 @@
  === Discussion ===
  [http://grub.org/ Grub] has some interesting ideas about building a search engine using distributed computing. ''And how is that relevant to nutch?''

+ ----
+ CategoryHomepage
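
For reference, option 2 in the changed FAQ text (delete every folder in the segment folder except fetchlist) could be sketched in shell roughly as follows. This is only an illustration and is not part of the wiki page: the function name `discard_aborted_output` is hypothetical, and the sketch assumes the segment is an ordinary local directory whose subfolders can be removed with `rm -rf`.

```shell
# Hypothetical helper, not part of Nutch: remove every subfolder of a
# segment directory except the one named "fetchlist".
discard_aborted_output() {
  segment="$1"
  for entry in "$segment"/*; do
    # Skip anything that is not a directory (also covers an unmatched glob).
    [ -d "$entry" ] || continue
    # Keep the fetchlist folder; delete every other folder.
    if [ "$(basename "$entry")" != "fetchlist" ]; then
      rm -rf "$entry"
    fi
  done
}
```

With the example path from the FAQ text, this would be called as `discard_aborted_output /index/segments/2005somesegment`, after which the fetcher can be restarted on that segment.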
