Hi All,
When I run the teragen/terasort test on my deployed Hadoop cluster, I get the
following error:
15/05/27 06:24:36 INFO mapreduce.Job: map 57% reduce 18%
15/05/27 06:24:39 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_r_000000_0, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in InMemoryMerger - Thread to merge in-memory shuffled map-outputs
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1432720271082_0005_r_000000_0/map_38.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.mapred.YarnOutputFiles.getInputFileForWrite(YarnOutputFiles.java:213)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:457)
    at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94)
15/05/27 06:24:40 INFO mapreduce.Job: map 57% reduce 0%
15/05/27 06:24:46 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000041_0, Status : FAILED
FSError: java.io.IOException: No space left on device
15/05/27 06:24:48 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000046_0, Status : FAILED
FSError: java.io.IOException: No space left on device
15/05/27 06:24:49 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000044_0, Status : FAILED
Error: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1432720271082_0005_m_000044_0_spill_0.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1584)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1482)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:720)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:790)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
15/05/27 06:24:50 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000045_0, Status : FAILED
Error: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1432720271082_0005_m_000045_0_spill_0.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1584)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1482)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:720)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:790)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
15/05/27 06:24:51 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000041_1, Status : FAILED
mkdir of /hadoop/yarn/local/usercache/hdfs/appcache/application_1432720271082_0005/container_1432720271082_0005_01_000050 failed
15/05/27 06:24:54 INFO mapreduce.Job: Task Id : attempt_1432720271082_0005_m_000046_1, Status : FAILED
FSError: java.io.IOException: No space left on device
I attached additional mount points with enough disk space to the VMs and tried
again, but I am still facing the same issue:
[root@vmktest0001 tmp]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2        32G  4.6G   26G  16% /
tmpfs           9.3G     0  9.3G   0% /dev/shm
/dev/vdb1       246G   60M  234G   1% /rw
/dev/vdc1       1.9T   69M  1.8T   1% /disk1
/dev/vdd1       1.9T   68M  1.8T   1% /disk2
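To cross-check the df output against the directory named in the failed mkdir
in the job log, one could ask which filesystem actually backs that path. Below
is a minimal sketch in Python, assuming it is run on the affected node. The
helper names nearest_existing and backing_free_space are hypothetical, and the
path is the prefix that appears in the log:

```python
import os
import shutil


def nearest_existing(path):
    """Walk up from path until a directory that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def backing_free_space(path):
    """Return (existing ancestor, free bytes) for the filesystem holding path."""
    anchor = nearest_existing(path)
    usage = shutil.disk_usage(anchor)
    return anchor, usage.free


if __name__ == "__main__":
    # Path prefix copied from the failed mkdir in the job log; adjust as needed.
    anchor, free = backing_free_space("/hadoop/yarn/local")
    print(f"{anchor}: {free / 2**30:.1f} GiB free")
```

If the reported ancestor sits on / rather than on /disk1 or /disk2, the new
mount points are not the ones being written to.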
I added the properties below to mapred-site.xml to test it:
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>
<property>
  <name>mapreduce.jobtracker.system.dir</name>
  <value>${hadoop.tmp.dir}/mapred/system</value>
</property>
<property>
  <name>mapreduce.jobtracker.staging.root.dir</name>
  <value>${hadoop.tmp.dir}/mapred/staging</value>
</property>
<property>
  <name>mapreduce.cluster.temp.dir</name>
  <value>${hadoop.tmp.dir}/mapred/temp</value>
</property>
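For context on how these values resolve: Hadoop substitutes ${...} references
in property values, and the stock default for hadoop.tmp.dir is
/tmp/hadoop-${user.name}. Below is a rough sketch of that substitution in
Python. It is not Hadoop's own code; the expand helper and the props values are
assumptions for illustration (the user name hdfs is taken from the usercache
path in the job log, and hadoop.tmp.dir is assumed to be left at its default):

```python
import re

_VAR = re.compile(r"\$\{([^}]+)\}")


def expand(value, props, max_depth=20):
    """Substitute ${name} references using props, leaving unknown names as-is."""
    for _ in range(max_depth):
        new = _VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if new == value:  # nothing left to substitute
            break
        value = new
    return value


# Assumed values for illustration; not read from this cluster.
props = {
    "user.name": "hdfs",
    "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
}

print(expand("${hadoop.tmp.dir}/mapred/local", props))
```

Under these assumptions the result is /tmp/hadoop-hdfs/mapred/local, which
would live on the / filesystem rather than on /disk1 or /disk2.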
Please let me know if I have missed any step.
Any help is appreciated!
With Regards,
Pratik Gadiya