Hi, I want to re-crawl my sites every hour, so I wrote a script for this and edited some properties in nutch-site.xml. However, my re-crawler fetches URLs only three times, and after that it stops fetching. That means my Nutch index is no longer updated after three hours. These are my changes in nutch-site.xml:
<property>
  <name>db.fetch.interval.default</name>
  <value>30</value>
  <description>The default number of seconds between re-fetches of a page (here: 30 seconds).</description>
</property>
<property>
  <name>db.fetch.schedule.class</name>
  <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
  <description>The implementation of fetch schedule. DefaultFetchSchedule simply adds the original fetchInterval to the last fetch time, regardless of page changes.</description>
</property>
<property>
  <name>solr.commit.size</name>
  <value>10</value>
  <description>Defines the number of documents to send to Solr in a single update batch. Decrease when handling very large documents to prevent Nutch from running out of memory.</description>
</property>
<property>
  <name>db.fetch.interval.max</name>
  <value>36000</value>
  <description>The maximum number of seconds between re-fetches of a page (here: 36000 seconds = 10 hours). After this period every page in the db will be re-tried, regardless of its status.</description>
</property>

--
View this message in context: http://lucene.472066.n3.nabble.com/recrawl-sites-in-nutch-1-3-tp3470457p3470457.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.
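[Editor's note] One hedged possibility worth checking: the `generate` step only selects URLs whose next fetch time has already passed. With `AdaptiveFetchSchedule`, the interval of pages that do not change is increased on every round, so after a few hourly runs `generate` can produce an empty segment and a naive recrawl loop that does not handle that case will stop fetching. A minimal sketch of a configuration that pins the adaptive schedule to roughly hourly re-fetches, assuming the standard Nutch 1.3 property names from nutch-default.xml (values in seconds; the exact numbers are illustrative, not prescriptive):

```xml
<!-- Sketch only: keep the effective fetch interval near one hour
     so an hourly recrawl script always has something to generate. -->
<property>
  <name>db.fetch.interval.default</name>
  <value>3600</value>
  <description>Re-fetch pages after one hour by default.</description>
</property>
<property>
  <name>db.fetch.schedule.adaptive.min_interval</name>
  <value>3600</value>
  <description>Lower bound for the adaptive schedule (one hour).</description>
</property>
<property>
  <name>db.fetch.schedule.adaptive.max_interval</name>
  <value>36000</value>
  <description>Upper bound for the adaptive schedule (ten hours), so
  unchanged pages cannot drift beyond the crawl cadence indefinitely.</description>
</property>
```

It may also help to have the script check whether `bin/nutch generate` actually produced a segment before running the fetch/update steps, rather than assuming one exists on every round.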

