Crawling entire website using Nutch 2.2.1 for every 2 hours

Tej Kumar Ilindra Wed, 23 Oct 2013 21:47:32 -0700

Hi,

I am using Nutch 2.2.1 with HBase 0.90.4 to crawl and store the data in
HBase.


As of now, data is crawled from the website based on the URLs provided
in seed.txt.

*To Do:*
I would like to write a program that crawls the entire website and, every 2
hours, checks the website for updates. If anything is new, it should crawl
it.
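
The desired behaviour could be sketched roughly as follows. This is plain
Python, not the Nutch API: the names page_signature, urls_needing_recrawl and
RECRAWL_INTERVAL_SECONDS are hypothetical, and comparing content hashes is
only an assumption about how "anything new" might be detected.

```python
import hashlib

# Assumption: the 2-hour re-check interval from the requirement, in seconds.
RECRAWL_INTERVAL_SECONDS = 2 * 60 * 60


def page_signature(content: bytes) -> str:
    """Return a content hash used to detect whether a page has changed."""
    return hashlib.sha256(content).hexdigest()


def urls_needing_recrawl(known, fetched, now):
    """Return URLs that are new, or due for a re-check and changed.

    known:   dict of url -> (signature, last_check_time) from the previous run
    fetched: dict of url -> raw page bytes seen in this run
    now:     current time in seconds
    """
    changed = []
    for url, content in fetched.items():
        previous = known.get(url)
        if previous is None:
            changed.append(url)  # never seen before, so always crawl it
            continue
        old_signature, last_check = previous
        interval_elapsed = now - last_check >= RECRAWL_INTERVAL_SECONDS
        # Re-crawl only if the interval has passed and the content differs.
        if interval_elapsed and page_signature(content) != old_signature:
            changed.append(url)
    return changed
```

In this sketch a page that has never been seen is always selected, while a
known page is selected only once its 2-hour interval has elapsed and its
content hash differs from the stored one.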

Can anyone suggest how to do this?

-- 
Regards,
Tej Ilindra
+91- 9962569369
[Always do what you are afraid to do. -Ralph Waldo Emerson]
