Nutch only recrawls a page every 30 days by default. So set numberDays accordingly and it will recrawl. Read nutch-default.xml for the details.
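
As an illustration, here is a minimal sketch of shortening that interval by overriding it in conf/nutch-site.xml. It assumes the property is named db.fetch.interval.default and is given in seconds (2592000 = 30 days), which is how it appears in the Nutch 1.0 nutch-default.xml as far as I recall; check your own copy for the exact name and unit. The one-day value is only an example.

```xml
<!-- conf/nutch-site.xml: values here override nutch-default.xml -->
<configuration>
  <property>
    <!-- assumed property name; verify against your nutch-default.xml -->
    <name>db.fetch.interval.default</name>
    <!-- assumed unit: seconds; 86400 = 1 day (illustrative value) -->
    <value>86400</value>
  </property>
</configuration>
```

Pages already in the crawldb keep the fetch time they were given earlier, so a changed interval should only take effect once a page has been fetched again. The "numberDays" above most likely corresponds to the -adddays argument of bin/nutch generate, which treats pages as due as if that many days had already passed.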

2009/12/9, xiao yang <yangxiao9...@gmail.com>:
> What do you mean by "recrawl"?
> Does the following command meet your needs?
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
> Change the destination directory to one different from the last crawl.
>
> On Thu, Dec 10, 2009 at 1:44 AM, Peters, Vijaya <vijaya_pet...@sra.com>
> wrote:
>> I'm running Nutch 1.0 in Windows. How do I force Nutch to do a complete
>> recrawl?
>>
>> thanks,
>>
>> - Vijaya

--
-MilleBii-