Doug, et al.,

>Chris Schneider wrote:
>>Also, since we've been running this crawl for quite some time, we'd like to
>>preserve the segment data if at all possible. Could someone please recommend
>>a way to recover as gracefully as possible from this condition? The
>>Crawl.main process died with the following output:
>>
>>060129 221129 Indexer: adding segment:
>>/user/crawler/crawl-20060129091444/segments/20060129200246
>>Exception in thread "main" java.io.IOException: timed out waiting for response
>>        at org.apache.nutch.ipc.Client.call(Client.java:296)
>>        at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>>        at $Proxy1.submitJob(Unknown Source)
>>        at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>>        at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>>        at org.apache.nutch.indexer.Indexer.index(Indexer.java:263)
>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>
>>However, it definitely seems as if the JobTracker is still waiting for the
>>job to finish (no failed jobs).
>
>Have you looked at the web ui? It will show if things are still running.
>This is on the jobtracker host at port 50030 by default.
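(For anyone following along: the JobTracker status page can also be checked from a shell, assuming the default port 50030. The host name below is a placeholder for your actual JobTracker machine, and the exact page layout depends on your Nutch/Hadoop version.)

```shell
# Fetch the JobTracker web UI and look for the stuck task by id.
# "jobtracker-host" is a placeholder -- substitute your JobTracker host.
curl -s http://jobtracker-host:50030/ | grep -i 'task_m_cia5po'
```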
Yes, this is how I know the JobTracker is still waiting for task_m_cia5po to complete.

>The bug here is that the RPC call times out while the map task is computing
>splits. The fix is that the job tracker should not compute splits until after
>it has returned from the submitJob RPC. Please submit a bug in Jira to help
>remind us to fix this.

I'll be happy to log a bug for this. Is there a workaround? Based on some other postings, I've increased ipc.client.timeout to 300000 (5 minutes). Does this property also control the timeout for the RPC call you describe above? If so, should I increase this timeout further? Is there a better way for us to avoid getting caught by the RPC timeout you describe? This crawl was only a medium-sized test. We hope to execute a much larger crawl over the next few days.

>To recover, first determine if the indexing has completed. If it has not,
>then use the 'index' command to index things, followed by 'dedup' and 'merge'.
>Look at the source for Crawl.java:
>
>http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/java/org/apache/nutch/crawl/Crawl.java?view=markup
>
>All you need to do to complete the crawl is to complete the last few steps
>manually.

We've done these steps manually before, so I'll get on that now. I was just worried about whether to trust these segments, how best to restart the processes, etc.

Thanks,
- Chris

--
------------------------
Chris Schneider
TransPac Software, Inc.
[EMAIL PROTECTED]
------------------------

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
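(For the archives: the manual recovery steps Doug describes correspond roughly to the shell commands below. This is a sketch only, assuming the mapred-era `bin/nutch` command line and the crawl directory from the log above; verify the exact argument order with the usage output of `bin/nutch index`, `bin/nutch dedup`, and `bin/nutch merge`, and against Crawl.java, before running.)

```shell
# Sketch only -- argument order follows Crawl.java at the time of writing.
CRAWL=/user/crawler/crawl-20060129091444

# 1. Index the fetched segments against the crawldb and linkdb.
bin/nutch index $CRAWL/indexes $CRAWL/crawldb $CRAWL/linkdb $CRAWL/segments/*

# 2. Delete duplicate documents across the per-segment indexes.
bin/nutch dedup $CRAWL/indexes

# 3. Merge the per-segment indexes into the final index.
bin/nutch merge $CRAWL/index $CRAWL/indexes
```

As for the timeout itself, ipc.client.timeout (in milliseconds) can be raised further in the configuration if split computation is expected to take longer, though the proper fix Doug describes (not computing splits inside the submitJob RPC) would remove the need.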
