Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
Hey everyone, I'm having a similar problem:
Map output lost, rescheduling: getMapOutput(task_200803281212_0001_m_00_2,0) failed : org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find task_200803281212_0001_m_00_2/file.out.index in any of the configured local directories

Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
Also, I'm running hadoop 0.16.1 :)
On Fri, Mar 28, 2008 at 1:23 PM, Bradford Stephens [EMAIL PROTECTED] wrote:
> Hey everyone, I'm having a similar problem:
> Map output lost, rescheduling: getMapOutput(task_200803281212_0001_m_00_2,0) failed :

RE: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Devaraj Das
Hi Bradford,
Could you please check what your mapred.local.dir is set to?
Devaraj.
-----Original Message-----
From: Bradford Stephens [mailto:[EMAIL PROTECTED]
Sent: Saturday, March 29, 2008 1:54 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: Re: hadoop 0.15.3 r612257
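For anyone following along, one way to answer the question above is to read the property out of the site configuration and look at each directory it names. This is only a rough sketch: it assumes the setting lives in conf/hadoop-site.xml (if it is absent there, the default from hadoop-default.xml applies), and the helper names read_property and check_local_dirs are made up for illustration, not part of Hadoop. It does not expand ${...} variables, so those entries are only flagged.

```python
import os
import sys
import xml.etree.ElementTree as ET


def read_property(conf_path, name):
    """Return the value of a named <property> in a Hadoop XML config, or None."""
    root = ET.parse(conf_path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def check_local_dirs(value):
    """Split a comma-separated directory list and report the state of each entry."""
    report = {}
    for d in (p.strip() for p in value.split(",")):
        if not d:
            continue
        if "${" in d:
            # Variables such as ${hadoop.tmp.dir} are not expanded here.
            report[d] = "unexpanded variable, check by hand"
        elif not os.path.isdir(d):
            report[d] = "missing"
        elif not os.access(d, os.W_OK):
            report[d] = "not writable"
        else:
            report[d] = "ok"
    return report


if __name__ == "__main__":
    # Assumed default location; pass another path as the first argument.
    conf = sys.argv[1] if len(sys.argv) > 1 else "conf/hadoop-site.xml"
    value = read_property(conf, "mapred.local.dir")
    if value is None:
        print("mapred.local.dir is not set in", conf, "(the default applies)")
    else:
        for d, state in check_local_dirs(value).items():
            print(d, "->", state)
```

Run it on each node as the user that runs the TaskTracker, since the permissions check depends on who is asking. A missing or unwritable entry would be one possible reason for the "Could not find ... in any of the configured local directories" error, though not the only one.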

Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-03-28 Thread Bradford Stephens
you please check what your mapred.local.dir is set to?
Devaraj.
-----Original Message-----
From: Bradford Stephens [mailto:[EMAIL PROTECTED]
Sent: Saturday, March 29, 2008 1:54 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject: Re: hadoop 0.15.3 r612257 freezes

Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-01-29 Thread Jason Venner
We are running under Linux with DFS on GigE LANs, kernel 2.6.15-1.2054_FC5smp, with a variety of Xeon steppings for our processors. Our replication factor was set to 3.
Florian Leibert wrote:
> Maybe it helps to know that we're running Hadoop inside Amazon's EC2... Thanks, Florian
-- Jason

Re: hadoop 0.15.3 r612257 freezes on reduce task

2008-01-29 Thread Jason Venner
That was the error that we were seeing in our hung reduce tasks. It went away for us, and we never figured out why. A number of things happened in our environment around the time it went away. We shifted to 0.15.2, our cluster moved to a separate switched VLAN from our main network, we started