Thanks. This is probably something trivial, but if you have any idea what could be causing it, that would be helpful. I changed mapred.local.dir to point to drives with bigger capacity. The map tasks now fail with the following message:

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200912311931_0002/attempt_200912311931_0002_m_000027_0/output/file.out.index in any of the configured local directories
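If I understand it correctly, the TaskTracker looks that file up by resolving the relative path against each directory currently listed in mapred.local.dir, along these lines (a rough sketch using the public LocalDirAllocator API, not the actual TaskTracker code; the directory values below are just placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.Path;

public class LocalDirLookup {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Only the directories listed here are searched (placeholder values).
    conf.set("mapred.local.dir", "/disk1/mapred-local,/disk2/mapred-local");

    LocalDirAllocator alloc = new LocalDirAllocator("mapred.local.dir");

    // Throws DiskChecker$DiskErrorException ("Could not find ... in any of the
    // configured local directories") unless the relative path exists under one
    // of the roots configured above.
    Path p = alloc.getLocalPathToRead(
        "taskTracker/jobcache/job_200912311931_0002/"
            + "attempt_200912311931_0002_m_000027_0/output/file.out.index",
        conf);
    System.out.println("Resolved to: " + p);
  }
}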
This is weird because the file in question exists on that machine in that directory (taskTracker/jobcache/...). The permissions are also right, so I haven't been able to work out what the problem could be. Do you have any ideas on this?

Thanks

 Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.

----- Original Message ----
From: Jason Venner <[email protected]>
To: [email protected]
Sent: Thu, December 31, 2009 1:46:47 PM
Subject: Re: large reducer output with same key

The mapred.local.dir parameter is used by each tasktracker node to provide directory(ies) to store transitory data about the tasks the tasktracker runs. This includes the map output, and can be very large.

On Thu, Dec 31, 2009 at 10:03 AM, himanshu chandola <[email protected]> wrote:

> Hi Todd,
> Are these directories supposed to be on the namenode or on each of the
> datanodes? In my case it is set to a directory inside /tmp, but
> mapred.local.dir was present only on the namenode.
>
> Thanks for the help
>
> Himanshu
>
> Morpheus: Do you believe in fate, Neo?
> Neo: No.
> Morpheus: Why Not?
> Neo: Because I don't like the idea that I'm not in control of my life.
>
> ----- Original Message ----
> From: Todd Lipcon <[email protected]>
> To: [email protected]
> Sent: Thu, December 31, 2009 10:17:05 AM
> Subject: Re: large reducer output with same key
>
> Hi Himanshu,
>
> Sounds like your mapred.local.dir doesn't have enough space. My guess is
> that you've configured it somewhere inside /tmp/. Instead you should spread
> it across all of your local physical disks by comma-separating the
> directories in the configuration. Something like:
>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/disk1/mapred-local,/disk2/mapred-local,/disk3/mapred-local</value>
> </property>
>
> (and of course make sure those directories exist and are writable by the
> user that runs your hadoop daemons, often "hadoop")
>
> Thanks
> -Todd
>
> On Thu, Dec 31, 2009 at 2:10 AM, himanshu chandola <[email protected]> wrote:
>
> > Hi Everyone,
> > My reducer output results in most of the data having the same key. The
> > reducer output is close to 16 GB, and though my cluster in total has a
> > terabyte of space in hdfs, I get errors like the following:
> >
> > > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:719)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:209)
> > >     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
> > > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException:
> > > Could not find any valid local directory for
> > > task_200808021906_0002_m_000014_2/spill4.out
> >
> > After such failures, hadoop tries to start the same reduce job a couple of
> > times on other nodes before the job fails. From the exception, it looks to
> > me like this is probably a disk error (some machines have less than 16 gigs
> > of free space on hdfs).
> >
> > So my question was whether hadoop puts values which share the same key as
> > a single block on one node? Or could something else be happening here?
> >
> > Thanks
> >
> > H

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
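On the original question of whether all values with one key end up in one place: with the default HashPartitioner, every record with a given key is sent to the same reduce partition, so a heavily skewed key funnels all of its values through a single reducer, and that reducer's intermediate data sits on local disk (mapred.local.dir), not in HDFS. That is why a nearly full /tmp can fail the job even when HDFS has plenty of space. A rough illustration using the old-API HashPartitioner (the key and reduce count below are made up):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.HashPartitioner;

public class SkewedKeyDemo {
  public static void main(String[] args) {
    // Default partitioner: (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks,
    // so every record for a given key lands in the same reduce partition.
    HashPartitioner<Text, Text> partitioner = new HashPartitioner<Text, Text>();
    Text hotKey = new Text("hot-key");   // made-up key
    int numReduceTasks = 8;              // made-up reduce count
    for (int i = 0; i < 3; i++) {
      int p = partitioner.getPartition(hotKey, new Text("value-" + i), numReduceTasks);
      System.out.println("record " + i + " for hot-key -> partition " + p);
    }
  }
}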
