[
https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865737#comment-17865737
]
Jufang He commented on FLINK-25646:
-----------------------------------
After investigating, I found two possible causes of performance degradation:
(1) When the buffer size is reduced, the total size of the available buffers
becomes smaller. Under severe backpressure, the channels' requests for buffers
compete with each other and slow down, which makes operators wait longer for
data and further reduces the overall throughput. The buffer size is calculated
as throughput * expected time to consume / total number of buffers, so as the
throughput decreases, the buffer size keeps shrinking, buffer requests become
even slower, and performance falls back further (see the sketch after these two
points).
(2) Each buffer corresponds to a Netty ByteBuf. When the buffer size is
reduced, Netty sends more frequently, and each ByteBuf carries some fixed
per-send overhead, so CPU usage increases significantly, which has a certain
negative impact on performance.
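To make both effects a bit more concrete, here is a small self-contained sketch
(plain Java; all names and constants, including the per-send overhead, are
made-up illustration values, and this is not Flink's actual buffer-debloating
code):
{code:java}
/** Illustrative only: not Flink's actual buffer-debloating code. */
public class DebloatFeedbackSketch {

    /**
     * Buffer size as described above:
     * throughput * expected time to consume / total number of buffers.
     */
    static long bufferSizeBytes(long throughputBytesPerSec,
                                double expectedConsumptionTimeSec,
                                int totalBuffers) {
        return (long) (throughputBytesPerSec * expectedConsumptionTimeSec) / totalBuffers;
    }

    public static void main(String[] args) {
        double targetSec = 1.0;   // expected time to consume the in-flight data
        int totalBuffers = 100;   // hypothetical number of buffers at the gate

        // Effect (1): as throughput drops, the computed buffer size drops with it,
        // which makes buffer requests slower and throughput drop further.
        long throughput = 100 * 1024 * 1024; // 100 MiB/s
        for (int round = 0; round < 3; round++) {
            long size = bufferSizeBytes(throughput, targetSec, totalBuffers);
            System.out.printf("round %d: throughput=%d B/s -> buffer size=%d B%n",
                    round, throughput, size);
            throughput = (long) (throughput * 0.5); // pretend backpressure halves it
        }

        // Effect (2): smaller buffers mean more Netty sends for the same data,
        // so any fixed per-ByteBuf cost is paid more often.
        long bytesToShip = 1L * 1024 * 1024 * 1024; // 1 GiB
        double perSendOverheadMicros = 5.0;          // made-up fixed cost per send
        for (long size : new long[] {32 * 1024, 8 * 1024, 1024}) {
            long sends = bytesToShip / size;
            System.out.printf("buffer size=%d B -> %d sends, ~%.1f ms of fixed overhead%n",
                    size, sends, sends * perSendOverheadMicros / 1000.0);
        }
    }
}
{code}
The second loop is just point (2) in numbers: the same amount of data split
into smaller buffers means proportionally more sends, so any fixed per-ByteBuf
cost grows in the same proportion.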
I tried to optimize these two factors:
(1) The previous throughput calculation was the amount of data processed /
processing time, where the processing time included the time spent waiting for
data. So I removed the waiting time to obtain the theoretical maximum
throughput and recalculated the buffer size from it. After this adjustment, the
performance fallback improved from 33% to around 7%, but Netty's CPU usage is
still high because the buffer size is still small (both adjustments are
sketched after point (2) below).
(2) As the buffer size becomes smaller, Netty's CPU consumption grows, so the
idea here is not to adjust the buffer size but to adjust the number of buffers:
number of buffers = throughput * expected time to consume the buffers / single
buffer size. Testing still showed a performance fallback, because the buffer
size/number is estimated at the input gate level and the estimated buffer
number can be quite small compared to the parallelism. Say the predicted number
is 10 but the parallelism is 100: one input channel holding a single buffer
already consumes 10% of the total budget, which leads to serious buffer
contention and further performance degradation.
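For reference, a rough sketch of the two attempts above (illustrative Java
only, with made-up numbers; not the actual patch):
{code:java}
/** Illustrative only: a sketch of the two attempted adjustments, not the actual patch. */
public class DebloatAdjustmentSketch {

    /** Attempt (1): exclude the time spent waiting for data from the processing time. */
    static long adjustedThroughput(long bytesProcessed, double totalTimeSec, double waitTimeSec) {
        double busyTimeSec = totalTimeSec - waitTimeSec;
        return (long) (bytesProcessed / busyTimeSec); // "theoretical maximum" throughput
    }

    /** Attempt (2): keep the buffer size fixed and compute the number of buffers instead. */
    static int bufferCount(long throughputBytesPerSec,
                           double expectedConsumptionTimeSec,
                           long singleBufferSizeBytes) {
        return (int) Math.max(1, throughputBytesPerSec * expectedConsumptionTimeSec
                / singleBufferSizeBytes);
    }

    public static void main(String[] args) {
        // Attempt (1): 1 GiB processed in 10 s, 6 s of which were spent waiting for data.
        long bytes = 1L << 30;
        long naive = (long) (bytes / 10.0);               // includes the waiting time
        long adjusted = adjustedThroughput(bytes, 10.0, 6.0);
        System.out.printf("naive throughput = %d B/s, adjusted = %d B/s%n", naive, adjusted);

        // Attempt (2): number of buffers = throughput * expected consumption time / buffer size.
        int buffers = bufferCount(adjusted, 0.001, 32 * 1024); // 1 ms target, 32 KiB buffers (made up)
        System.out.printf("estimated buffers per gate = %d%n", buffers);

        // The contention problem from the comment: an estimate of 10 buffers at the
        // gate level, but 100 input channels. One channel holding a single buffer
        // already owns 10% of the whole budget, and only 10 channels can hold a
        // buffer at the same time.
        int estimatedBuffers = 10;
        int channels = 100;
        System.out.printf("with %d buffers and %d channels, one buffer = %.0f%% of the budget%n",
                estimatedBuffers, channels, 100.0 / estimatedBuffers);
    }
}
{code}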
It seems that the ideal approach would be to adjust both the buffer size and
the buffer count, so that the in-flight data size is reduced as much as
possible without a significant performance fallback or higher Netty CPU usage.
But I haven't found an algorithm for that yet; it looks a bit complicated.
> Document buffer debloating issues with high parallelism
> -------------------------------------------------------
>
> Key: FLINK-25646
> URL: https://issues.apache.org/jira/browse/FLINK-25646
> Project: Flink
> Issue Type: Improvement
> Components: Documentation, Runtime / Network
> Affects Versions: 1.14.0
> Reporter: Anton Kalashnikov
> Assignee: Anton Kalashnikov
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> According to the latest benchmarks, there are some problems with buffer
> debloating when a job has high parallelism. High parallelism means a
> different value from job to job, but in general it is more than 200. So it
> makes sense to document that problem and propose the solution - increasing
> the number of buffers.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)