I tried that and it worked a few times, but now I get 0 records selected for 
fetching.

$ bin/nutch crawl urls -dir crawl9a -depth 15 -topN 50
crawl started in: crawl9a
rootUrlDir = urls
threads = 10
depth = 15
topN = 50
Injector: starting
Injector: crawlDb: crawl9a/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl9a/segments/20091209124308
Generator: filtering: true
Generator: topN: 50
Generator: jobtracker is 'local', generating exactly one
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.
No URLs to fetch - check your seed list and URL filters.
crawl finished: crawl9a
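
The last log lines point at the seed list and URL filters. Below is a minimal sketch of a seed-list sanity check, assuming the seeds are plain-text files with one URL per line under the directory passed to the crawl (here `urls`). The helper name `count_seed_urls` is hypothetical and not part of Nutch. Treating lines that start with `#` as comments is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): count candidate seed URLs in
# every file under a seed directory. Blank lines and lines starting with
# '#' are skipped (the '#' comment convention is an assumption).
count_seed_urls() {
    seed_dir="$1"
    # Concatenate all regular files, drop blank/comment lines, count the rest.
    find "$seed_dir" -type f -exec cat {} + 2>/dev/null \
        | grep -Ev '^[[:space:]]*(#|$)' \
        | wc -l \
        | tr -d ' '
}
```

Running `count_seed_urls urls` before the crawl should print a number greater than 0. If it does, the seed list itself is probably not what leaves the Generator with nothing to select, and the URL filters are the next thing to check.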

Vijaya Peters
SRA International, Inc.
4350 Fair Lakes Court North
Room 4004
Fairfax, VA  22033
Tel:  703-502-1184

www.sra.com
Named to FORTUNE's "100 Best Companies to Work For" list for 10 consecutive 
years
Please consider the environment before printing this e-mail
This electronic message transmission contains information from SRA 
International, Inc. which may be confidential, privileged or proprietary.  The 
information is intended for the use of the individual or entity named above.  
If you are not the intended recipient, be aware that any disclosure, copying, 
distribution, or use of the contents of this information is strictly 
prohibited.  If you have received this electronic information in error, please 
notify us immediately by telephone at 866-584-2143.
-----Original Message-----
From: xiao yang [mailto:[email protected]] 
Sent: Wednesday, December 09, 2009 1:19 PM
To: [email protected]
Subject: Re: how to force nutch to do a recrawl

What do you mean by "recrawl"?
Does the following command meet your needs?
bin/nutch crawl urls -dir crawl -depth 3 -topN 50
Change the destination directory to one different from the last crawl's.

On Thu, Dec 10, 2009 at 1:44 AM, Peters, Vijaya <[email protected]> wrote:
> I'm running Nutch 1.0 on Windows.  How do I force Nutch to do a complete
> recrawl?
>
>
>
> thanks,
>
> - Vijaya
