I am planning a very large crawl using Nutch (billions of URLs), so I need to understand whether Nutch can handle restarts after a crash.

On a single system, if I press Ctrl+C while Nutch is running and then restart it, can Nutch detect how far it got in the last run and resume from that point? Or will it be treated as a fresh crawl?

Also, if I have 5 nodes running Nutch and doing the crawling, and one of the nodes fails, should that be considered a total failure of Nutch itself, or should I let the other nodes proceed? Will I lose the data gathered by the failed node?

TIA,
--Hrishi