captainzmc opened a new pull request #716: HDDS-3155. Improved ozone client flush implementation to make it faster.
URL: https://github.com/apache/hadoop-ozone/pull/716

## What changes were proposed in this pull request?

When we run an MR job (with 1000 maps) on OzoneFileSystem, the appmaster pauses for nearly an hour after the tasks complete:

`20/03/05 14:43:30 INFO mapreduce.Job: map 100% reduce 33%`
`20/03/05 14:43:33 INFO mapreduce.Job: map 100% reduce 100%`
`20/03/05 15:29:52 INFO mapreduce.Job: Job job_1583385253878_0002 completed successfully`

It turns out that the appmaster writes all task events to the log one by one, calling flush once for each event. This operation is very time-consuming in Ozone.

HDFS currently has two flush methods, flush() and hflush():

- flush(): flushes the data from the client buffer into the client packet (dfs.write.packet.size, default 64k). If the packet is not full, it is not sent to the datanode.
- hflush(): each invocation sends the data in the buffer to the datanode.

Currently, Ozone's flush behaves more like HDFS's hflush. This PR adds a flush implementation similar to HDFS's flush(), controlled by ozone.client.stream.buffer.flush.delay (disabled by default). When it is enabled, a call to flush() checks whether the data in the current buffer is greater than ozone.client.stream.buffer.size. If it is, the data is sent to the datanode; otherwise, it is not sent.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-3155

## How was this patch tested?

Using the existing unit tests.
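
For illustration only, the flush behaviour proposed in this PR can be sketched as a small in-memory stream. This is not the Ozone client code: the class `DelayedFlushStream`, its fields, and the `send()` helper are hypothetical names, the "datanode" is just a counter, and the inclusive threshold (`>=`) is an assumption, since the description only says "greater than".

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

/** Hypothetical sketch of a delayed flush; not Ozone's actual classes. */
class DelayedFlushStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int bufferSize;      // stands in for ozone.client.stream.buffer.size
    private final boolean flushDelay;  // stands in for ozone.client.stream.buffer.flush.delay
    private int sendCount = 0;         // number of simulated sends to the datanode
    private long bytesSent = 0;        // total bytes handed to the simulated datanode

    DelayedFlushStream(int bufferSize, boolean flushDelay) {
        this.bufferSize = bufferSize;
        this.flushDelay = flushDelay;
    }

    @Override
    public void write(int b) {
        buffer.write(b);
    }

    @Override
    public void write(byte[] data, int off, int len) {
        buffer.write(data, off, len);
    }

    @Override
    public void flush() {
        // With the delay enabled, skip the send while the buffer is below the
        // threshold. Assumption: the boundary is treated as inclusive here.
        if (flushDelay && buffer.size() < bufferSize) {
            return;
        }
        send();
    }

    @Override
    public void close() {
        // Always push out whatever is left so no data is lost on close.
        send();
    }

    private void send() {
        if (buffer.size() == 0) {
            return;
        }
        bytesSent += buffer.size();
        buffer.reset();
        sendCount++;
    }

    int getSendCount() { return sendCount; }

    long getBytesSent() { return bytesSent; }

    public static void main(String[] args) {
        // Mimic an event log: many small writes, each followed by flush().
        for (boolean delay : new boolean[] {false, true}) {
            DelayedFlushStream out = new DelayedFlushStream(64, delay);
            for (int i = 0; i < 10; i++) {
                out.write(new byte[10], 0, 10);
                out.flush();
            }
            out.close();
            System.out.println("flushDelay=" + delay + " sends=" + out.getSendCount());
        }
    }
}
```

In this sketch, ten 10-byte writes with a 64-byte threshold result in ten sends when the delay is off and two sends when it is on (one once the buffer passes the threshold, one on close), which is the kind of reduction the PR is aiming for with per-event flushes.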
