Dear Sir,

I am customizing Nutch 2.2 to crawl my seed list, which contains about 30 URLs. I need to crawl the mentioned URLs every 24 minutes and fetch only newly added links. I added the following configuration to the nutch-site.xml file and used the following command:
<property>
  <name>db.fetch.interval.default</name>
  <value>1800</value>
  <description>The default number of seconds between re-fetches of a page.</description>
</property>
<property>
  <name>db.update.purge.404</name>
  <value>true</value>
  <description>If true, updatedb will purge entries with status DB_GONE from the CrawlDB.</description>
</property>

./crawl urls/ testdb http://localhost:8983/solr 2

But whenever I run the mentioned command, Nutch crawls deeper and deeper. Would you please tell me where the problem is?

Regards,

