I'm getting search results for many pages which are now 404s.

I have set

<property>
  <name>db.default.fetch.interval</name>
  <value>1</value>
  <description>The default number of days between re-fetches of a
page.</description>
</property>

in my nutch-site.xml, but when I look at the fetch part of the logs
there is no reference to these pages being downloaded again.
I have $days set to 4, and use the following to perform the fetch:

... generate $crawl_db $segments -adddays $days
segment=`ls -d $segments/* | tail -1`
... fetch $segment
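
For reference, my understanding of the full recrawl cycle is the one
below (paths are illustrative, not my actual ones) -- including the
updatedb step after each fetch, which is what writes the new fetch
times back into the crawldb:

bin/nutch generate crawl/crawldb crawl/segments -adddays $days
segment=`ls -d crawl/segments/* | tail -1`
bin/nutch fetch $segment
bin/nutch updatedb crawl/crawldb $segment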

Is there a reason why pages might not be fetched again?

Thanks.
