Hi John,

You can find the parameters below in conf/nutch-default.xml. To change a value, copy the property into conf/nutch-site.xml and set your own value there.


<property>
  <name>db.default.fetch.interval</name>
  <value>30</value>
  <description>(DEPRECATED) The default number of days between re-fetches of a page.
  </description>
</property>

<property>
  <name>db.fetch.interval.default</name>
  <value>2592000</value>
  <description>The default number of seconds between re-fetches of a page (30 days).
  </description>
</property>

<property>
  <name>db.fetch.interval.max</name>
  <value>7776000</value>
  <description>The maximum number of seconds between re-fetches of a page
  (90 days). After this period every page in the db will be re-tried, no
  matter what is its status.
  </description>
</property>
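
For example, to re-fetch roughly once a day, an override in conf/nutch-site.xml might look like the sketch below. The value 86400 (24 * 60 * 60 seconds) is only an illustration, not a recommendation, and this assumes your Nutch version reads the seconds-based db.fetch.interval.default rather than the deprecated days-based property.

<configuration>
  <property>
    <name>db.fetch.interval.default</name>
    <!-- Illustrative value: 86400 seconds = 1 day; choose what suits your content -->
    <value>86400</value>
    <description>Overrides the default re-fetch interval from nutch-default.xml.
    </description>
  </property>
</configuration>

Values in nutch-site.xml take precedence over those in nutch-default.xml, so nutch-default.xml itself can stay untouched.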


Justin

John Martyniak wrote:
How does Nutch determine when content needs to be re-fetched? As I understand it, this is based on the "next fetch" date, which is 7 days in the future.

Is there any way to change that, to adjust the fetching interval, or to somehow base it on how many times a piece of content is requested?

I would like to keep the content as fresh as possible, and the information changes more frequently than every 7 days.

Thanks in advance,

-John
