Thanks for the reply, Arun. We've recompiled the hadoop native binaries and they seem to be loading fine. We are rerunning the job to see if it works now.
Regards,

-vishal.

-----Original Message-----
From: Arun C Murthy [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 23, 2007 11:40 PM
To: [email protected]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Reduce task hangs when using nutch 0.9 with hadoop 0.12.3

Vishal,

On Wed, May 23, 2007 at 02:15:50PM +0530, Vishal Shah wrote:
>Hi Arun,
>
>  Thanks for the reply. We figured out the root cause of the problem. We are
>not using the hadoop native libs right now, and the Sun Java Deflater hangs
>sometimes during the reduce phase. Our glibc version is 2.3.5, whereas the
>hadoop native libs need 2.4; that's why they are not being used by hadoop.
>

Personally I've never seen Sun's Deflater hang... but ymmv.

Anyway, there is nothing in the native hadoop libs that needs glibc-2.4 -
this looks like just an artifact of the fact that the machine on which the
release was cut (i.e. on which the native libs in 0.12.3 were built) had
glibc-2.4. I have glibc-2.3.6-r3 on my machine and things work fine...

>  I was wondering if there is a version of the native libs that would work
>with glibc 2.3.

I don't have access right away to a box with glibc-2.3.5, but it's really
easy to build them yourself - details here:
http://wiki.apache.org/lucene-hadoop/NativeHadoop

hth,
Arun

>If not, we'll have to upgrade the glibc version on all
>machines to 2.4.
>
>Regards,
>
>-vishal.
>
>-----Original Message-----
>From: Arun C Murthy [mailto:[EMAIL PROTECTED]]
>Sent: Tuesday, May 22, 2007 4:24 PM
>To: [email protected]
>Cc: [EMAIL PROTECTED]
>Subject: Re: Reduce task hangs when using nutch 0.9 with hadoop 0.12.3
>
>Vishal Shah wrote:
>> Hi,
>>
>>   We upgraded our code to the nutch 0.9 stable version along with hadoop
>0.12.3,
>> which is the latest version of hadoop 0.12.
>>
>>   After the upgrade, I am sometimes seeing task failures during the
>> reduce phase for parse and fetch (without the parsing option).
>>
>>   Usually, it's just one reduce task that creates this problem. The
>> jobtracker kills this task saying "Task failed to report status for 602
>> seconds. Killing task"
>>
>>   I tried running the task using IsolationRunner, and it works fine. I
>> suspect there is a long computation happening during the reduce phase
>> for one of the keys, due to which the tasktracker isn't able to report
>> status to the jobtracker in time.
>>
>
>If you suspect a long computation, one way around it is to use the
>'reporter' parameter of your mapper/reducer to provide status updates and
>ensure that the TaskTracker doesn't kill the task, i.e. doesn't assume the
>task has been lost.
>
>hth,
>Arun
>
>>   I was wondering if anyone else has seen a similar problem and if
>there is
>> a fix for it.
>>
>>   Thanks,
>>
>>   -vishal.
>>
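
P.S. For the archives, here's a rough sketch of the reporter-based
workaround Arun describes above, written against the old
org.apache.hadoop.mapred API. The class name and the every-1000-values
cadence are made up for illustration, and the signatures are from memory
of the pre-generics interface, so check them against the 0.12.3 javadocs:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical reducer sketch: report status periodically during a long
// computation so the TaskTracker doesn't assume the task is lost and
// kill it when the report timeout expires.
public class LongComputationReducer extends MapReduceBase implements Reducer {

  public void reduce(WritableComparable key, Iterator values,
                     OutputCollector output, Reporter reporter)
      throws IOException {
    long processed = 0;
    while (values.hasNext()) {
      Writable value = (Writable) values.next();
      // ... potentially long per-value computation goes here ...
      output.collect(key, value);

      // Telling the Reporter we're alive resets the task's timeout clock.
      if (++processed % 1000 == 0) {
        reporter.setStatus("key " + key + ": processed " + processed
            + " values");
      }
    }
  }
}

If the expensive work happens once per key rather than per value, a single
setStatus call at the top of reduce() may be all that's needed.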
