I am in the process of setting up a production-ready environment for the Nutch crawler, and I am trying to make it fault tolerant to Hadoop node failures, typically a TaskTracker and DataNode failing together because of a network issue or an OS crash.
I tried simulating the scenario by stopping one node during a crawl. I stopped the node that was running a fetch reducer task in the 5th cycle. The task completed after hanging for a few minutes, and the NameNode UI and the MapReduce admin UI started showing a reduced node count. The crawl continued through the configured 6 cycles and finished. However, the total number of URLs crawled was lower than in previous runs, so I suspect the interrupted fetch task was never retried.

I want to understand this behavior and find a solution for node failure during a crawl. Any suggestions are welcome. I am using Nutch 2.1 with HBase 0.90.6 and Hadoop 0.20.2.

Thanks,
Raja

--
View this message in context: http://lucene.472066.n3.nabble.com/What-would-happen-when-Hadoop-tasktracker-and-data-node-fails-during-Nutch-Crawl-tp4063189.html
Sent from the Nutch - User mailing list archive at Nabble.com.
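P.S. For reference, these are the retry-related settings I have been looking at in mapred-site.xml. The property names are the ones documented for Hadoop 0.20.x; the values shown are the stock defaults as I understand them, not a tuned recommendation:

```xml
<!-- mapred-site.xml (Hadoop 0.20.x); values shown are the documented defaults -->
<configuration>
  <!-- How many times a failed map/reduce task attempt is retried
       before the whole job is failed (default 4). -->
  <property>
    <name>mapred.map.max.attempts</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.reduce.max.attempts</name>
    <value>4</value>
  </property>
  <!-- How long (in ms) the JobTracker waits without a heartbeat before
       declaring a TaskTracker lost and rescheduling its tasks
       (default 600000 = 10 minutes). This would match the few minutes
       of hanging I observed before the task "completed". -->
  <property>
    <name>mapred.tasktracker.expiry.interval</name>
    <value>600000</value>
  </property>
</configuration>
```

If lost tasks were being rescheduled correctly, I would expect the fetch reduce attempt to restart on another node within these limits, so the shortfall in crawled URLs may point at something on the Nutch side rather than plain MapReduce retry.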

