I tried that too.
In nutch-site.xml, I added the properties below, but this had no effect.

<property>
  <name>db.default.fetch.interval</name>
  <value>0</value>
  <description>(DEPRECATED) The default number of days between re-fetches of a 
page.  value was 30
  </description>
</property>

<property>
  <name>db.fetch.interval.default</name>
  <value>3600</value>
  <description>The default number of seconds between re-fetches of a page (30 
days). value was 2592000 (30 days)
  </description>
</property>

<property>
  <name>db.fetch.interval.max</name>
  <value>3600</value>
  <description>The maximum number of seconds between re-fetches of a page
  (90 days). After this period every page in the db will be re-tried, no
  matter what is its status.  value was 7776000
  </description>
</property>

Vijaya Peters
SRA International, Inc.
4350 Fair Lakes Court North
Room 4004
Fairfax, VA  22033
Tel:  703-502-1184

www.sra.com
Named to FORTUNE's "100 Best Companies to Work For" list for 10 consecutive 
years
Please consider the environment before printing this e-mail
This electronic message transmission contains information from SRA 
International, Inc. which may be confidential, privileged or proprietary.  The 
information is intended for the use of the individual or entity named above.  
If you are not the intended recipient, be aware that any disclosure, copying, 
distribution, or use of the contents of this information is strictly 
prohibited.  If you have received this electronic information in error, please 
notify us immediately by telephone at 866-584-2143.

-----Original Message-----
From: MilleBii [mailto:mille...@gmail.com] 
Sent: Wednesday, December 09, 2009 1:27 PM
To: nutch-user@lucene.apache.org
Subject: Re: how to force nutch to do a recrawl

Nutch only recrawls every 30 days by default. So set numberDays
accordingly and it will recrawl. Read nutch-default.xml to get the
details.
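
To illustrate the point above, here is a minimal sketch of the scheduling idea, not Nutch's actual code. It assumes each page carries a next-fetch time equal to its last fetch plus the interval, and that "numberDays" refers to shifting the current time forward by some number of days when selecting pages (I believe this corresponds to the -adddays option of bin/nutch generate, but check the usage output of your version). The names days_to_seconds and is_due are hypothetical.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days):
    """Convert days to the seconds unit used by db.fetch.interval.default."""
    return days * SECONDS_PER_DAY


def is_due(next_fetch_time, now, add_days=0):
    """Hypothetical check: a page is selected when its scheduled fetch
    time is not later than 'now' shifted forward by add_days."""
    return next_fetch_time <= now + timedelta(days=add_days)


# A page fetched on 1 Dec 2009 with the default 30-day interval.
last_fetch = datetime(2009, 12, 1)
next_fetch = last_fetch + timedelta(seconds=days_to_seconds(30))
now = datetime(2009, 12, 9)

print(days_to_seconds(30))                   # 2592000, the default interval
print(is_due(next_fetch, now))               # False: not yet due
print(is_due(next_fetch, now, add_days=31))  # True: the clock is shifted past the due date
```

Under this model, a page that was already fetched keeps its stored next-fetch time, so lowering the default interval afterwards would not by itself make it due.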

2009/12/9, xiao yang <yangxiao9...@gmail.com>:
> What do you mean by "recrawl"?
> Does the following command meet your needs?
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
> Change the destination directory to one different from the last crawl.
>
> On Thu, Dec 10, 2009 at 1:44 AM, Peters, Vijaya <vijaya_pet...@sra.com>
> wrote:
>> I'm running Nutch 1.0 in windows.  How do I force Nutch to do a complete
>> recrawl?
>>
>>
>>
>> thanks,
>>
>> - Vijaya
>


-- 
-MilleBii-
