Re: [Nutch-general] fetch fails at reduce stage because can not sense heartbeat for 600 seconds

Mike Smith Wed, 18 Oct 2006 13:44:04 -0700

I am in the same state again, and the same reduce jobs keep failing on
different machines. I cannot get the dump using kill -3 pid; it does not
make the thread quit. I also tried to add some logging to
FetcherOutputFormat, but because of this bug:
https://issues.apache.org/jira/browse/HADOOP-406
logging is not possible in the child threads. Do you have any idea why
the reducers don't catch the QUIT signal from the cache? I am running the
latest version from SVN; otherwise I could log some key, value and URL
filtering information at the reduce stage.
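
If the signal never produces a dump, one possible workaround is to build the
dump from inside the JVM instead. The following is only a rough sketch, not
something taken from Nutch or Hadoop: the class name ThreadDumpSketch and the
method dumpAllThreads are made up for illustration, and it assumes the child
JVM's stderr (or a file you write the string to) is actually captured
somewhere. It relies only on Thread.getAllStackTraces() from the JDK (Java 5
and later).

```java
import java.util.Map;

// Hypothetical helper (not part of Nutch or Hadoop): collects the stack
// traces of all live threads in the current JVM into one string.
public class ThreadDumpSketch {

    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        // getAllStackTraces() returns a snapshot of every live thread,
        // including the calling thread.
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: stderr is visible; otherwise write the string to a file.
        System.err.print(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() periodically from a daemon thread started in the
task would show where the reducer is sitting when it stops making progress,
without depending on the QUIT signal being delivered.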


Mike


On 10/18/06, Dennis Kubes <[EMAIL PROTECTED]> wrote:


I agree with Andrzej that a thread dump would be best. Also, what
version of Nutch are you using?

Dennis



Andrzej Bialecki wrote:
> Mike Smith wrote:
>> Hi Dennis,
>>
>> But it doesn't make sense since the reducers' keys are URLs and the
>> heartbeat cannot be sent when the reduce task is called. Since I am
>> truncating my http content to be less than 100K and I don't get any
>> file, how come reducing a single record which is a single URL and
>> writing its parsed data into DFS takes more than 10 min!! Even if you
>> load the cluster that should never happen. There should be another
>> bug involved.
>>
>
>
> Could you try to produce a thread dump of a task in such state? (kill
> -SIGQUIT pid)
>


