Doug, et al.,

>Chris Schneider wrote:
>>Also, since we've been running this crawl for quite some time, we'd like to 
>>preserve the segment data if at all possible. Could someone please recommend 
>>a way to recover as gracefully as possible from this condition? The Crawl 
>>.main process died with the following output:
>>
>>060129 221129 Indexer: adding segment: 
>>/user/crawler/crawl-20060129091444/segments/20060129200246
>>Exception in thread "main" java.io.IOException: timed out waiting for response
>>    at org.apache.nutch.ipc.Client.call(Client.java:296)
>>    at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>>    at $Proxy1.submitJob(Unknown Source)
>>    at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>>    at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>>    at org.apache.nutch.indexer.Indexer.index(Indexer.java:263)
>>    at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>
>>However, it definitely seems as if the JobTracker is still waiting for the 
>>job to finish (no failed jobs).
>
>Have you looked at the web ui?  It will show if things are still running.  
>This is on the jobtracker host at port 50030 by default.

Yes, this is how I know the JobTracker is still waiting for task_m_cia5po to 
complete.

>The bug here is that the RPC call times out while the map task is computing 
>splits.  The fix is that the job tracker should not compute splits until after 
>it has returned from the submitJob RPC.  Please submit a bug in Jira to help 
>remind us to fix this.

I'll be happy to log a bug for this.
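For the Jira entry, here's how I understand the shape of the race, as a
hypothetical sketch of my own (illustrative names, not the actual
JobTracker code):

import java.util.ArrayList;
import java.util.List;

public class SubmitJobSketch {
    private final List<String> queue = new ArrayList<String>();

    // Current shape: split computation runs on the RPC path, so the
    // client's call() blocks until it finishes and can trip
    // ipc.client.timeout even though the job goes on to run fine.
    public synchronized String submitJobBlocking(String jobFile) {
        computeSplits(jobFile);   // slow for a large crawl
        queue.add(jobFile);
        return "queued " + jobFile;
    }

    // Proposed shape: return from the RPC immediately and compute the
    // splits afterwards, off the RPC path.
    public synchronized String submitJobDeferred(final String jobFile) {
        queue.add(jobFile);
        new Thread(new Runnable() {
            public void run() { computeSplits(jobFile); }
        }).start();
        return "queued " + jobFile;
    }

    private void computeSplits(String jobFile) {
        // stand-in for scanning the input and building the map splits
    }
}

If that's roughly right, it would explain what we're seeing: the client's
call timed out while the tracker kept computing splits, and the job still
shows up in the web UI as running.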

Is there a workaround in the meantime? Based on some other postings, I've
increased ipc.client.timeout to 300000 (5 minutes). Does this property also
control the timeout for the submitJob RPC call you describe above? If so,
should I increase it further, or is there a better way to avoid tripping this
timeout? This crawl was only a medium-sized test; we hope to execute a much
larger crawl over the next few days.
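
For anyone who hits this later: the override uses the standard property
syntax in conf/nutch-site.xml (the value is in milliseconds):

<!-- raise the client-side RPC timeout to 5 minutes -->
<property>
  <name>ipc.client.timeout</name>
  <value>300000</value>
</property>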

>To recover, first determine if the indexing has completed.  If it has not, 
>then use the 'index' command to index things, followed by 'dedup' and 'merge'. 
> Look at the source for Crawl.java:
>
>http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/java/org/apache/nutch/crawl/Crawl.java?view=markup
>
>All you need to do to complete the crawl is to complete the last few steps 
>manually.

We've done these steps manually before, so I'll get on that now. I was mainly
worried about whether the segment data could still be trusted and how best to
restart the remaining steps.
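
For the archives, here's roughly what I plan to run, with paths taken from
the log above. I'm assuming the usual argument order for the index, dedup,
and merge commands; anyone copying this should check each command's usage
output (run it with no arguments) and the tail of Crawl.java first:

# hypothetical invocation -- verify the argument order before running
CRAWL=/user/crawler/crawl-20060129091444

# index each segment against the crawl and link databases
bin/nutch index $CRAWL/indexes $CRAWL/crawldb $CRAWL/linkdb $CRAWL/segments/*

# delete duplicate documents across the part indexes
bin/nutch dedup $CRAWL/indexes

# merge the part indexes into the final index
bin/nutch merge $CRAWL/index $CRAWL/indexes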

Thanks,

- Chris
-- 
------------------------
Chris Schneider
TransPac Software, Inc.
[EMAIL PROTECTED]
------------------------
