I had set up a crawl of our intranet (approximately 1.6 million pages) and had set the crawl parameters to depth 5 and MAX_INT pages per iteration.
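For reference, the invocation looked roughly like this (the urls and crawl directory names here are placeholders, and 2147483647 is just Integer.MAX_VALUE spelled out):

  bin/nutch crawl urls -dir crawl -depth 5 -topN 2147483647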
After 12 days, on the 3rd iteration, the crawl crashed with the following exception:

Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:443)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:111)

I have two questions:

1. Does anyone know the cause of this error? I looked in the Hadoop logs and saw nothing that indicates the cause of the crash.

2. Is there any way I can restart this job, so that I don't lose 12 days of fetching? (One idea I had is sketched in the P.S. below.)

Thanks,
-Charlie Williams
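P.S. One idea I had, in case it helps the discussion: my understanding is that the one-shot crawl command is just a wrapper around the individual generate/fetch/updatedb steps, so in principle I could pick up from the existing crawldb and segments on disk by running those steps by hand. A rough sketch of what I mean (the crawl directory name is a placeholder, and I'm not certain this is the right way to recover):

  # generate a new fetch list from the existing crawldb
  bin/nutch generate crawl/crawldb crawl/segments

  # fetch the segment that was just generated (segments are named by timestamp)
  s=`ls -d crawl/segments/* | tail -1`
  bin/nutch fetch $s

  # merge the fetched pages back into the crawldb
  bin/nutch updatedb crawl/crawldb $s

Would something like this work, or would it re-fetch pages the crashed iteration already got?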
