Hi,

I am using Nutch 2.2.1 with HBase 0.90.4 to crawl a website and store the
data in HBase.

So far, the crawl fetches pages from the website based on the URLs provided
in seed.txt.

*To Do:*
I would like to write a program that crawls the entire website and then,
every 2 hours, checks the site for updates and crawls anything that is
new.
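
To make the requirement concrete, here is a rough sketch of the kind of
loop I have in mind. It assumes the usual Nutch 2.x command sequence
(inject, generate, fetch, parse, updatedb) run from the Nutch runtime
directory. The path bin/nutch, the seed directory "urls", the topN value
and the helper names are placeholders of mine, and I have not confirmed
that this is the right approach. I also assume the fetch interval
(db.fetch.interval.default in nutch-site.xml, given in seconds) would have
to be lowered to about 7200, since otherwise pages would not become due for
re-fetching every 2 hours.

```python
import subprocess
import time

# Placeholder values (assumptions): adjust to the local Nutch install.
NUTCH_BIN = "bin/nutch"
SEED_DIR = "urls"                 # directory that contains seed.txt
TOP_N = 1000                      # max URLs to generate per cycle
INTERVAL_SECONDS = 2 * 60 * 60    # 2 hours


def build_cycle_commands(nutch_bin=NUTCH_BIN, seed_dir=SEED_DIR, top_n=TOP_N):
    """Return the commands for one crawl cycle, in execution order."""
    return [
        [nutch_bin, "inject", seed_dir],               # add seed URLs to the web table
        [nutch_bin, "generate", "-topN", str(top_n)],  # select URLs due for fetching
        [nutch_bin, "fetch", "-all"],                  # fetch the generated batch
        [nutch_bin, "parse", "-all"],                  # parse content, extract outlinks
        [nutch_bin, "updatedb"],                       # store new links and fetch status
    ]


def run_cycle(runner=subprocess.check_call):
    """Run one crawl cycle; the default runner raises if a command fails."""
    for command in build_cycle_commands():
        runner(command)


def main():
    """Repeat the crawl cycle, sleeping 2 hours between cycles."""
    while True:
        run_cycle()
        time.sleep(INTERVAL_SECONDS)
```

Calling main() would start the loop. A cron entry that runs a single cycle
every 2 hours might be a simpler alternative to a long-running process.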

Can anyone suggest how to do this?

-- 
Regards,
Tej Ilindra
+91- 9962569369
[Always do what you are afraid to do. -Ralph Waldo Emerson]
