Hi,

I've followed the instructions to set up an intranet search engine, but I'm wondering how to update it with new pages. Do I just have to rerun the crawl every day, or can I use nutch update in some way?

Also, I've set the following property in nutch-site.xml:

<property>
 <name>db.default.fetch.interval</name>
 <value>1</value>
 <description>The default number of days between re-fetches of a page.
 </description>
</property>

Am I right in thinking that this configures Nutch to check whether the pages it already knows about are still valid, and to remove them if they are not?

Thanks for any help.

JS.

