I enabled logging. The slow map task was stalled in a socket connection call in setupIOstreams(), triggered by the first RPC call (getProtocolVersion()) from the MapTask to the TaskTracker. If the socket connection call was made at time t1, it didn't return until roughly t1 + 200 seconds (normally each map task takes about 8 seconds). On the RPC server side, doAccept() was also not invoked until roughly t1 + 200 seconds. I ran a job with 200+ splits 10 times; on average there was one slow map task per run, and every slow map task spent ~200 seconds making the socket connection. I was using a recent 64-bit IBM JVM on SuSE.
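As a rough illustration (this is a standalone sketch, not the actual Hadoop RPC client code), a probe like the following times a plain socket connect the same way setupIOstreams() would; the host and port are placeholders for the TaskTracker's RPC address:

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class ConnectTimer {
        public static void main(String[] args) throws Exception {
            // Placeholder host/port; substitute the TaskTracker's real RPC address.
            String host = args.length > 0 ? args[0] : "localhost";
            int port = args.length > 1 ? Integer.parseInt(args[1]) : 12345;

            long t1 = System.currentTimeMillis();
            Socket s = new Socket();
            try {
                // With an explicit connect timeout, a ~200 second stall surfaces as a
                // SocketTimeoutException instead of a silent hang; 10 seconds here is
                // just an example value.
                s.connect(new InetSocketAddress(host, port), 10000);
                long elapsed = System.currentTimeMillis() - t1;
                System.out.println("connect took " + elapsed + " ms");
            } finally {
                s.close();
            }
        }
    }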
Jun

IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA 95120-6099
[EMAIL PROTECTED]
(408) 927-1886 (phone)  (408) 927-3215 (fax)


Doug Cutting <[EMAIL PROTECTED]> wrote on 06/21/2007 09:21 AM (Re: map task in initializing phase for too long):

Jun Rao wrote:
> I am wondering if anyone has experienced this problem. Sometimes when I
> run a job, a few map tasks (often just one) hang in the initializing phase
> for more than 3 minutes (it normally finishes in a couple of seconds). They
> eventually finish, but the whole job is slowed down considerably. The
> weird thing is that the slow task is not deterministic: it doesn't always
> occur, and when it does, it can occur on any split and on any host.

I have not seen this. Perhaps you can get a stack trace from the
tasktracker while this is happening? Owen described how to get such
stack traces in:

http://mail-archives.apache.org/mod_mbox/lucene-hadoop-user/200706.mbox/[EMAIL PROTECTED]

Owen wrote:
> One side note is that all of the servers have a servlet such that if
> you do http://<node>:<port>/stacks you'll get a stack trace of all
> the threads in the server. I find that useful for remote debugging.
> *smile* Although if it is a task jvm that has the problem, then there
> isn't a server for them.

(This should probably be added to the documentation or the wiki...)

Doug
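For reference, the /stacks servlet Owen mentions is just an HTTP GET against the daemon's web port, so it can be fetched from a script as well as a browser. A small sketch (the node name and port below are placeholders; use the daemon's actual web UI address):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class DumpStacks {
        public static void main(String[] args) throws Exception {
            // Placeholder node/port; point this at the TaskTracker's or
            // JobTracker's HTTP address on your cluster.
            String node = args.length > 0 ? args[0] : "localhost";
            String port = args.length > 1 ? args[1] : "50060";

            URL url = new URL("http://" + node + ":" + port + "/stacks");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream()));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            } finally {
                in.close();
            }
        }
    }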
