[jira] [Updated] (HDDS-3155) Improved ozone flush implementation to make it faster.

mingchao zhao (Jira) Tue, 10 Mar 2020 21:03:24 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


mingchao zhao updated HDDS-3155:
--------------------------------
    Description: 
Background:

    When we run MapReduce jobs on Ozone, the job stays stuck for a long time 
after Map and Reduce have both reached 100%. The log is as follows:
{code:java}
//Refer to the attachment: stdout
20/03/05 14:43:30 INFO mapreduce.Job: map 100% reduce 33% 
20/03/05 14:43:33 INFO mapreduce.Job: map 100% reduce 100% 
20/03/05 15:29:52 INFO mapreduce.Job: Job job_1583385253878_0002 completed successfully
{code}
    The AM's log shows that this gap of more than 40 minutes is spent by the 
AM writing task logs into Ozone.

At present, after an MR job finishes, the AM records the task information in a 
log on HDFS or Ozone. Moreover, the task information is flushed to HDFS or 
Ozone one entry at a time 
([details|https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L1640]).
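
To make the cost of this pattern concrete, here is a minimal sketch that counts flush calls when every task entry is flushed on its own. It is not the JobHistoryEventHandler code: the class name, the method name and the event text are hypothetical, and an in-memory stream stands in for the HDFS or Ozone output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the JobHistoryEventHandler code.
class PerEventFlushSketch {

    /** Wraps a stream and counts how many times flush() is called. */
    static class FlushCountingStream extends FilterOutputStream {
        int flushCount = 0;

        FlushCountingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void flush() throws IOException {
            flushCount++;
            super.flush();
        }
    }

    /** Writes one line per task and flushes after every line. */
    static int writeEventsOneByOne(int taskCount) {
        FlushCountingStream out = new FlushCountingStream(new ByteArrayOutputStream());
        try {
            for (int i = 0; i < taskCount; i++) {
                byte[] event = ("TASK_FINISHED task_" + i + "\n")
                        .getBytes(StandardCharsets.UTF_8);
                out.write(event);
                out.flush(); // one flush per task entry
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.flushCount;
    }

    public static void main(String[] args) {
        System.out.println("flush calls for 1000 tasks: " + writeEventsOneByOne(1000));
    }
}
```

The number of flush calls grows linearly with the number of tasks, so any fixed per-flush cost on the storage side is multiplied by the task count.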
 

     The problem occurs when the number of map tasks is large. 

     Currently, each flush operation in Ozone generates a new chunk file on 
disk in real time, which is inefficient. We can follow the HDFS flush 
implementation here: instead of writing to disk on every call, flush writes 
the contents of the buffer to the datanode's OS buffer. First, we need to 
ensure that this content can be read by other datanodes.
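
The difference between handing data to the OS buffer and forcing it to disk can be sketched with plain Java file I/O. This is a local-file analogy only, not Ozone or HDFS code: the class and method names are hypothetical, and a temporary file stands in for a chunk file on a datanode. In HDFS the corresponding client calls are hflush() and hsync() on the output stream.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-file analogy only; not Ozone or HDFS code.
class FlushVersusSyncSketch {

    /**
     * Writes a record, flushes it, optionally forces it to the storage
     * device, and reads the file back while the writer is still open.
     */
    static String writeAndReadBack(String record, boolean forceToDisk) {
        try {
            Path file = Files.createTempFile("flush-sketch", ".log");
            try (FileOutputStream fileOut = new FileOutputStream(file.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(fileOut)) {
                out.write(record.getBytes(StandardCharsets.UTF_8));
                // flush() empties the Java-side buffer into the OS;
                // it does not ask the OS to write to the device.
                out.flush();
                if (forceToDisk) {
                    // force(true) asks the OS to persist data and metadata.
                    fileOut.getChannel().force(true);
                }
                // Read through a separate handle before the writer is closed.
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print("flush only:    " + writeAndReadBack("record 1\n", false));
        System.out.print("flush + force: " + writeAndReadBack("record 1\n", true));
    }
}
```

In both modes a reader sees the flushed record. Only the second mode pays for a device sync on every call, which is the cost the proposal aims to avoid on each flush.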

 

  was:
Background:

When we run MapReduce jobs on Ozone, the job stays stuck for a long time after 
Map and Reduce have both reached 100%. The log is as follows:
{code:java}
// Code placeholder
20/03/05 14:43:03 INFO mapreduce.Job:  map 91% reduce 30%
20/03/05 14:43:03 INFO mapreduce.Job:  map 91% reduce 30%
20/03/05 14:43:05 INFO mapreduce.Job:  map 92% reduce 30%
20/03/05 14:43:07 INFO mapreduce.Job:  map 93% reduce 30%
20/03/05 14:43:08 INFO mapreduce.Job:  map 93% reduce 31%
20/03/05 14:43:11 INFO mapreduce.Job:  map 94% reduce 31%
20/03/05 14:43:14 INFO mapreduce.Job:  map 95% reduce 31%
20/03/05 14:43:18 INFO mapreduce.Job:  map 96% reduce 31%
20/03/05 14:43:20 INFO mapreduce.Job:  map 97% reduce 32%
20/03/05 14:43:24 INFO mapreduce.Job:  map 98% reduce 32%
20/03/05 14:43:26 INFO mapreduce.Job:  map 99% reduce 33%
20/03/05 14:43:30 INFO mapreduce.Job:  map 100% reduce 33%
20/03/05 14:43:33 INFO mapreduce.Job:  map 100% reduce 100%
20/03/05 15:29:52 INFO mapreduce.Job: Job job_1583385253878_0002 completed successfully
20/03/05 15:29:52 INFO mapreduce.Job: Counters: 51
    File System Counters
        FILE: Number of bytes read=84602
        FILE: Number of bytes written=162626320
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        O3FS: Number of bytes read=237780
        O3FS: Number of bytes written=134217728089
        O3FS: Number of read operations=4008
        O3FS: Number of large read operations=0
        O3FS: Number of write operations=1002
    Job Counters
        Killed map tasks=1
        Launched map tasks=1000
        Launched reduce tasks=1
        Data-local map tasks=979
        Rack-local map tasks=21
        Total time spent by all maps in occupied slots (ms)=149515400
        Total time spent by all reduces in occupied slots (ms)=449288
        Total time spent by all map tasks (ms)=7475770
        Total time spent by all reduce tasks (ms)=112322
        Total vcore-milliseconds taken by all map tasks=7475770
        Total vcore-milliseconds taken by all reduce tasks=112322
        Total megabyte-milliseconds taken by all map tasks=153103769600
        Total megabyte-milliseconds taken by all reduce tasks=460070912
{code}


> Improved ozone flush implementation to make it faster.
> ------------------------------------------------------
>
>                 Key: HDDS-3155
>                 URL: https://issues.apache.org/jira/browse/HDDS-3155
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: mingchao zhao
>            Priority: Major
>         Attachments: stdout, syslog
>
>
> Background:
>     When we run MapReduce jobs on Ozone, the job stays stuck for a long time 
> after Map and Reduce have both reached 100%. The log is as follows:
> {code:java}
> //Refer to the attachment: stdout
> 20/03/05 14:43:30 INFO mapreduce.Job: map 100% reduce 33% 
> 20/03/05 14:43:33 INFO mapreduce.Job: map 100% reduce 100% 
> 20/03/05 15:29:52 INFO mapreduce.Job: Job job_1583385253878_0002 completed successfully
> {code}
>     The AM's log shows that this gap of more than 40 minutes is spent by the 
> AM writing task logs into Ozone.
> At present, after an MR job finishes, the AM records the task information in 
> a log on HDFS or Ozone. Moreover, the task information is flushed to HDFS or 
> Ozone one entry at a time 
> ([details|https://github.com/apache/hadoop/blob/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L1640]).
>  
>      The problem occurs when the number of map tasks is large. 
>      Currently, each flush operation in Ozone generates a new chunk file on 
> disk in real time, which is inefficient. We can follow the HDFS flush 
> implementation here: instead of writing to disk on every call, flush writes 
> the contents of the buffer to the datanode's OS buffer. First, we need to 
> ensure that this content can be read by other datanodes.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

