[jira] [Updated] (HADOOP-15898) 1 TB Data size fails to run with the following error

Srinivas (JIRA) Sat, 03 Nov 2018 08:40:35 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Srinivas updated HADOOP-15898:
------------------------------
    Summary: 1 TB Data size fails to run with the following error   (was: 1 TB 
TeraGen fails to run with the following error )

> 1 TB Data size fails to run with the following error 
> -----------------------------------------------------
>
>                 Key: HADOOP-15898
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15898
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance
>    Affects Versions: 2.6.0
>         Environment: Hadoop 2.6.0-cdh5.5.1
>  
>  
>            Reporter: Srinivas
>            Priority: Major
>              Labels: performance
>             Fix For: 2.6.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> A business-impacting MR job runs every day at 2:00 PM PST on about 1 - 1.5 TB
> of data (depending on the business day). The ideal elapsed time of this job
> is 4 hours, but multiple mappers of the job fail simultaneously with the
> following error, so the job sometimes takes 11 or even 13 hours.
> Steps taken so far to prevent this problem: 1. Migrated the environment to
> YARN. 2. Increased the ulimit. 3. Added extra nodes to the cluster. 4. Disks
> are being replaced regularly. None of these helped.
> WARN [DataStreamer for file 
> /analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
> block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089]
> org.apache.hadoop.hdfs.DFSClient: Error Recovery for block 
> BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in 
> pipeline DatanodeInfoWithStorage
> [10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK],
>  
> DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
> DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:(
> bad datanode 
> DatanodeInfoWithStorage[10.0.1.37:50010,DS-ed333d2e-839a-4029-a1c9-b6615c322ed2,DISK]
>  
> WARN [DataStreamer for file 
> /analytical_profile/DMP_analytical_profile/Turn/SAUP/2018_11_02_tmp/tmp/part-01357.5789
>  block BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089] 
> org.apache.hadoop.hdfs.DFSClient: Error Recovery for block 
> BP-854530680-69.194.253.58-1430267558563:blk_4683766046_1108754130089 in 
> pipeline 
> DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK],
>  
> DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]:
>  bad datanode 
> DatanodeInfoWithStorage[74.120.143.19:50010,DS-5d10576e-adc3-474f-bc9d-f0d6fb3ae4c3,DISK]
>  
> WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : 
> java.io.IOException: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[74.120.143.6:50010,DS-a5299d68-2858-46c3-8e37-d2559895f979,DISK]
>  are bad. Aborting... at 
> com.turn.platform.cheetah.storage.dmp.analytical_profile.merge.IncrementalProfileMergerMapper.close(IncrementalProfileMergerMapper.java:1185)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>  
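
The warnings above show the write pipeline shrinking from three datanodes to one as each node is marked bad in turn. To check whether the same datanodes recur across failed mappers, the task logs could be tallied with something like the sketch below. It is only an illustration: the function name count_bad_datanodes is hypothetical, and it assumes the raw task log text (without the "> " mail quoting) in the "bad datanode DatanodeInfoWithStorage[host:port,...]" form shown above.

```python
import re
from collections import Counter

# Matches the address of the node that the DFSClient reports as bad, e.g.
# "bad datanode DatanodeInfoWithStorage[10.0.1.37:50010,DS-...,DISK]".
BAD_NODE = re.compile(
    r"bad datanode\s+DatanodeInfoWithStorage\[([0-9.]+:[0-9]+)"
)


def count_bad_datanodes(log_text):
    """Return a Counter mapping 'host:port' to how often it was reported bad."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flattened = " ".join(log_text.split())
    return Counter(BAD_NODE.findall(flattened))
```

Calling count_bad_datanodes on the concatenated logs of the failed map attempts and printing most_common() would show whether a few nodes account for most of the pipeline failures or whether the failures are spread evenly across the cluster.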



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

