1996fanrui commented on pull request #13885: URL: https://github.com/apache/flink/pull/13885#issuecomment-724092762
> Are the HDFS input streams generally not buffered? Would it make sense to adjust the `HadoopDataInputStream` class to be buffered?

Hi @StephanEwen, sorry for the late reply. Over the past two days I read the relevant HDFS read-path code and did some performance analysis. The conclusions are as follows:

The HDFS input stream is buffered by default, with a default buffer size of 64KB. In theory, wrapping hdfsInputStream in FSDataBufferedInputStream should not reduce the number of disk accesses. The measurements, however, show that with the extra buffer the restore time drops to one third of the original.

So I profiled CPU usage. Adding the buffer reduces it significantly: reading through hdfsInputStream directly, CPU usage is 60~70%; wrapping hdfsInputStream in FSDataBufferedInputStream, CPU usage is 20~25%.

Why does hdfsInputStream consume so much CPU? The HDFS client maintains a lot of statistics and has a relatively deep method call stack, and each call adds a small overhead. See the annotated flame graph: [flame graph remark link](https://drive.google.com/file/d/1zTwHdmSybAgyBGIIP71FLfMQ5DvK1eON/view?usp=sharing)

Wrapping hdfsInputStream in FSDataBufferedInputStream places a buffer outside the HDFS client, so most reads avoid the deep call stack and the HDFS statistics bookkeeping.

Original flame graph links: [restore with buffer](https://drive.google.com/file/d/1jxBNzh2iIsrX__wfFjvTWiC9LPCCDS3A/view?usp=sharing), [restore without buffer](https://drive.google.com/file/d/1qDbqQC4bG34_ZsCrOpDaNouNQpvVXRWH/view?usp=sharing)
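For illustration, below is a minimal sketch of the kind of read-ahead wrapper discussed above, built on Flink's `org.apache.flink.core.fs.FSDataInputStream`. The class name and buffer handling are hypothetical and are not the actual implementation in this PR; the point is only that small `read()` calls are served from a local array instead of crossing into the HDFS client each time.

```java
import java.io.IOException;
import org.apache.flink.core.fs.FSDataInputStream;

/**
 * Hypothetical sketch of a buffered wrapper around an FSDataInputStream.
 * Not the implementation in this PR; names and details are illustrative.
 */
public class BufferedFSDataInputStreamSketch extends FSDataInputStream {

    private final FSDataInputStream in;
    private final byte[] buffer;
    private int count; // number of valid bytes in the buffer
    private int pos;   // index of the next byte to return from the buffer

    public BufferedFSDataInputStreamSketch(FSDataInputStream in, int bufferSize) {
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    @Override
    public int read() throws IOException {
        if (pos >= count) {
            // Refill with one bulk read into the underlying (HDFS) stream.
            int n = in.read(buffer, 0, buffer.length);
            if (n <= 0) {
                return -1; // end of stream
            }
            count = n;
            pos = 0;
        }
        return buffer[pos++] & 0xFF;
    }

    @Override
    public void seek(long desired) throws IOException {
        // Simplest possible behavior: drop the buffer and seek the wrapped stream.
        // A real implementation would also seek within the buffer when possible
        // and override read(byte[], int, int).
        pos = 0;
        count = 0;
        in.seek(desired);
    }

    @Override
    public long getPos() throws IOException {
        // Underlying position minus the bytes still sitting in our buffer.
        return in.getPos() - (count - pos);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

With a 64KB (or larger) wrapper buffer, most single-byte and small reads never reach the HDFS client, which matches the CPU-usage difference visible in the flame graphs above.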
