Does your node5 have adequate free space, and a proper multi-disk mapred.local.dir configuration set on it?
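For intuition, the exception below comes from Hadoop's LocalDirAllocator failing to find any configured local directory that is both writable and has room for the file. A loose sketch of that check (simplified for illustration; this is not Hadoop's actual code, and the function name is made up):

```python
import os
import shutil


def get_local_path_for_write(dirs, size_needed):
    """Simplified illustration of what LocalDirAllocator does: scan the
    configured mapred.local.dir entries and return the first one that is
    writable and has enough free space; fail if none qualifies."""
    for d in dirs:
        # Skip missing or unwritable dirs (e.g. a failed or read-only disk).
        if not os.path.isdir(d) or not os.access(d, os.W_OK):
            continue
        if shutil.disk_usage(d).free >= size_needed:
            return d
    # This is the situation behind "Could not find any valid local directory":
    # every configured dir was missing, unwritable, or full.
    raise IOError("Could not find any valid local directory")
```

So a single nearly-full disk in mapred.local.dir is enough to trigger this once spill files grow, which is why it tends to surface only on multi-TB runs.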
On Tue, Apr 23, 2013 at 12:41 PM, 姚吉龙 <[email protected]> wrote:
> Hi Everyone
>
> Today I am testing with about 2T of data on my cluster, and there are several failed map tasks and reduce tasks on the same node.
> Here is the log:
>
> Map failed:
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill0.out
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
>     at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:699)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> Reduce failed:
>
> java.io.IOException: Task: attempt_201304211423_0003_r_000006_0 - The reduce copier failed
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/map_10003.out
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
>     at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:176)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2742)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2706)
>
> Does this mean something is wrong with the configuration on node5, or is this normal when testing with data over TBs? This is the first time I have run a job on data over TBs.
> Any suggestion is welcome.
>
> BRs
> Geelong
>
> --
> From Good To Great

--
Harsh J
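For reference, a multi-disk mapred.local.dir setup in mapred-site.xml looks roughly like the fragment below; the `/data1`..`/data3` mount points are illustrative placeholders, and each should be a separate physical disk with enough free space for intermediate map output:

```xml
<!-- mapred-site.xml: spread intermediate (spill/shuffle) files across
     several disks. Mount points here are examples only. -->
<property>
  <name>mapred.local.dir</name>
  <value>/data1/mapred/local,/data2/mapred/local,/data3/mapred/local</value>
</property>
```

The TaskTracker round-robins writes across these directories, so both capacity and I/O load are spread out; a directory on a full or failed disk is skipped, but if all of them are full you get the DiskErrorException seen in the logs.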
