Hi Tej, You can do that in several ways. Your question has two parts.
The first part is about the fetch interval and re-fetching pages when they have changed. Set the interval with db.fetch.interval.default in your nutch-site.xml. The value is in seconds, so 7200 corresponds to your two hours. By default, Nutch checks whether a page has been modified based on the HTTP protocol. I think that covers your first requirement.
The second part is about running Nutch on a schedule. You can do this with crontab or an Oozie workflow. I will explain the crontab way here. You can put your Nutch crawl shell script in your crontab like this:
0 */2 * * * $NUTCH_HOME/runtime/deploy/bin/crawl <seedDir> <crawlID> <solrURL> <numberOfRounds>
The disadvantage of the crontab approach is that cron does not check the status of your previous job. Sometimes a job takes longer than the scheduled interval, and cron gives you no information about the job's status.
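One possible way to soften both problems is a small wrapper script around the crawl command. The following is only a sketch under assumptions: the function name run_guarded, the LOCK_DIR and LOG_FILE variables, and their default paths are made up for illustration and are not part of Nutch. It skips a run when the previous one is still active and records each exit status in a log file.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of Nutch: skip a run while the previous
# crawl is still active, and log how each crawl ended.
# LOCK_DIR and LOG_FILE are assumed names with assumed default paths.
LOCK_DIR="${LOCK_DIR:-/tmp/nutch-crawl.lock}"
LOG_FILE="${LOG_FILE:-/tmp/nutch-crawl.log}"

run_guarded() {
    # mkdir is atomic: it fails if the lock directory already exists,
    # which means an earlier run has not finished yet.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "$(date) previous crawl still running, skipping" >> "$LOG_FILE"
        return 75
    fi
    status=0
    # Run the command given as arguments and capture its exit status.
    "$@" >> "$LOG_FILE" 2>&1 || status=$?
    rmdir "$LOCK_DIR"
    echo "$(date) crawl finished with exit status $status" >> "$LOG_FILE"
    return "$status"
}

# When called with arguments, treat them as the crawl command to guard.
if [ "$#" -gt 0 ]; then
    run_guarded "$@"
fi
```

If this were saved as, say, /path/to/crawl-guarded.sh (a hypothetical location), the crontab entry would call that script and pass the original crawl command and its arguments to it. Note that the lock directory stays behind if the machine goes down in the middle of a crawl, so it would have to be removed by hand in that case.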
I think the better way to schedule this is with Oozie, but I can't explain it right now. I will write a document about that.
Talat
On 24-10-2013 05:35, Tej Kumar Ilindra wrote:
Hi, I am using Nutch 2.2.1 with HBase 0.90.4 to crawl and store the data in HBase. As of now, data is crawled from the website based on the URLs provided in seed.txt. *To Do:* I would like to write a program that crawls the entire website and, every 2 hours, checks the website for updates; if anything is new, it should crawl it. Can anyone suggest how to do this?

