Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by MatthewHolt:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seeds -dir /user/nutchuser...
  
+ === ReCrawling ===
+ Here are scripts to help you with Intranet recrawling.
+ ==== Version 0.7.2 ====
+ Place in your main Nutch directory.
+ 
+ [[0.7.2-Recrawl]]
+ ==== Version 0.8.0 ====
+ Place in the bin sub-directory of Nutch.
+ 
+ [[0.8.0-Recrawl]]
  === Discussion ===
  [http://grub.org/ Grub] has some interesting ideas about building a search engine using distributed computing. ''And how is that relevant to nutch?''