Srinivas created HADOOP-15898:
---------------------------------
Summary: WARN [main] org.apache.hadoop.mapred.YarnChild: Exception
running child : java.io.IOException: java.io.IOException: All datanodes
DatanodeInfoWithStorage
[[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK] are bad.
Aborting...
Key: HADOOP-15898
URL: https://issues.apache.org/jira/browse/HADOOP-15898
Project: Hadoop Common
Issue Type: Improvement
Components: performance
Affects Versions: 2.6.0
Environment: Hadoop 2.6.0-cdh5.5.1
Reporter: Srinivas
Fix For: 2.6.0
A business-critical MR job runs every day at 2:00 PM PST on about 1 to 1.5 TB
of data (depending on the business day). Its ideal elapsed time is 4 hours,
but multiple mappers of this job fail simultaneously with the following error,
so the job sometimes takes 11 or even 13 hours.
Steps taken so far to prevent this problem:
1. Migrated the environment to YARN.
2. Increased the ulimit.
3. Added extra nodes to the cluster.
4. Disks are being replaced regularly.
None of these has helped.
WARN [DataStreamer for file
/analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089]
org.apache.hadoop.hdfs.DFSClient: Error Recovery for block
BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in
pipeline DatanodeInfoWithStorage
[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK],
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode
DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK]
WARN [DataStreamer for file
/analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089]
org.apache.hadoop.hdfs.DFSClient: Error Recovery for block
BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in
pipeline
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK]
WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child :
java.io.IOException: java.io.IOException: All datanodes
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]
are bad. Aborting...
    at com.turn.platform.cheetah.storage.dmp.analytical_profile.merge.IncrementalProfileMergerMapper.close(IncrementalProfileMergerMapper.java:1185)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
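
The log above shows the write pipeline shrinking from three datanodes to one as each node is marked bad in turn, until the last one fails and the task aborts. To check whether the same datanodes recur across failed attempts, a small script along these lines may help. It is only a rough sketch: the function names (`bad_datanodes`, `aborted_on`) and the `SAMPLE` excerpt are illustrative and not part of Hadoop, and it assumes the message layout seen in this report.

```python
import re
from collections import Counter

# One datanode entry, e.g. DatanodeInfoWithStorage[10.0.1.37:50010,DS-...,DISK]
NODE_RE = re.compile(r"DatanodeInfoWithStorage\s*\[+([0-9.]+:\d+),")
# The final abort message naming the last datanode left in the pipeline.
ABORT_RE = re.compile(
    r"All datanodes\s+DatanodeInfoWithStorage\s*\[+([0-9.]+:\d+),[^\]]*\]\s*are bad"
)


def _flatten(log_text):
    # Email and terminal wrapping split entries across lines; collapse whitespace.
    return " ".join(log_text.split())


def bad_datanodes(log_text):
    """Return host:port of the node named right after each 'bad datanode' marker."""
    found = []
    for chunk in _flatten(log_text).split("bad datanode")[1:]:
        match = NODE_RE.match(chunk.lstrip())
        if match:
            found.append(match.group(1))
    return found


def aborted_on(log_text):
    """Return host:port of the node named in each 'All datanodes ... are bad' message."""
    return ABORT_RE.findall(_flatten(log_text))


# Excerpt copied from the log in this report (prefixes trimmed).
SAMPLE = """
pipeline DatanodeInfoWithStorage
[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK],
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode
DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK]
pipeline
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK]
java.io.IOException: java.io.IOException: All datanodes
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]
are bad. Aborting...
"""

if __name__ == "__main__":
    # Tally how often each datanode is reported bad, then list the abort points.
    for node, count in Counter(bad_datanodes(SAMPLE)).most_common():
        print(node, count)
    print("aborted on:", aborted_on(SAMPLE))
```

Running it over the task logs of several failed attempts would show whether the failures cluster on a few datanodes or are spread across the cluster.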