Srinivas created HADOOP-15898:
---------------------------------

             Summary: WARN [main] org.apache.hadoop.mapred.YarnChild: Exception 
running child : java.io.IOException: java.io.IOException: All datanodes 
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]
are bad. Aborting...
                 Key: HADOOP-15898
                 URL: https://issues.apache.org/jira/browse/HADOOP-15898
             Project: Hadoop Common
          Issue Type: Improvement
          Components: performance
    Affects Versions: 2.6.0
         Environment: Hadoop 2.6.0-cdh5.5.1

 

 
            Reporter: Srinivas
             Fix For: 2.6.0


There is a business-critical MR job that runs every day at 2:00 PM PST. Its 
input is about 1 to 1.5 TB, depending on the business day. The ideal elapsed 
time for this job is 4 hours, but multiple mappers fail simultaneously with 
the error below, so the job sometimes takes 11 or even 13 hours.

Steps taken so far to prevent this problem:
1. Migrated the environment to YARN.
2. Increased the ulimit.
3. Added extra nodes to the cluster.
4. Disks are replaced regularly.
None of these helped.

WARN [DataStreamer for file 
/analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089] 
org.apache.hadoop.hdfs.DFSClient: Error Recovery for block 
BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in 
pipeline 
DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK],
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode 
DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK]

 

WARN [DataStreamer for file 
/analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089] 
org.apache.hadoop.hdfs.DFSClient: Error Recovery for block 
BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in 
pipeline 
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
bad datanode 
DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK]

 

WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : 
java.io.IOException: java.io.IOException: All datanodes 
DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]
are bad. Aborting...
 at com.turn.platform.cheetah.storage.dmp.analytical_profile.merge.IncrementalProfileMergerMapper.close(IncrementalProfileMergerMapper.java:1185)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
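
Read in order, the log messages above show the write pipeline for one block 
shrinking from three datanodes to two and then to one, after which the last 
remaining node is reported bad and the mapper aborts. The following is a 
minimal Python sketch, not part of the original report, for extracting the 
pipeline and the bad datanode from such "Error Recovery for block" messages 
when triaging many failed attempts. The function names parse_recovery and 
surviving_nodes are hypothetical, and the regular expression assumes the 
message layout seen in this report rather than any documented log format.

```python
import re

# One DatanodeInfoWithStorage[ip:port,storage-id,storage-type] entry.
# Assumption: layout as seen in this report; tolerates wrapped lines and "[[".
NODE_RE = re.compile(
    r"DatanodeInfoWithStorage\s*\[+([0-9.]+:\d+),(DS-[0-9a-f-]+),(\w+)\]"
)


def parse_recovery(message):
    """Return (pipeline, bad_node) for one 'Error Recovery for block' message.

    pipeline is the list of ip:port strings listed after 'in pipeline';
    bad_node is the ip:port after 'bad datanode', or None if it is absent.
    """
    # Undo mail line wrapping so the message is one logical line.
    flat = " ".join(message.split())
    head, sep, tail = flat.partition("bad datanode")
    # Pipeline members appear between "in pipeline" and "bad datanode".
    pipeline_text = head.partition("in pipeline")[2]
    pipeline = [m.group(1) for m in NODE_RE.finditer(pipeline_text)]
    bad = NODE_RE.search(tail) if sep else None
    return pipeline, (bad.group(1) if bad else None)


def surviving_nodes(messages):
    """Apply recovery messages in order; return nodes left in the pipeline."""
    remaining = []
    for msg in messages:
        pipeline, bad = parse_recovery(msg)
        remaining = [node for node in pipeline if node != bad]
    return remaining
```

Feeding the two DFSClient messages from this report to surviving_nodes leaves 
only 74.120.143.6:50010, which is the node named in the final "All datanodes 
... are bad" exception.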

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
