[
https://issues.apache.org/jira/browse/FLINK-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036714#comment-17036714
]
Yingjie Cao commented on FLINK-16012:
-------------------------------------
Theoretically, reducing the number of buffers may interrupt the data processing
pipeline, which can hurt performance. To verify this, I have tested the change
using the Flink micro benchmarks and a simple benchmark job. Unfortunately,
both tests show regressions.
For the micro benchmarks, the following are some of the results that show a
regression (because the results are unstable, I ran each test three times):
Using 2 buffers:
{code:java}
Benchmark          (channelsFlushTimeout)  (writers)   Mode  Cnt      Score      Error  Units
networkThroughput              1000,100ms          1  thrpt   30  15972.952 ±  752.985  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  27650.498 ±  713.728  ops/ms
networkThroughput              1000,100ms          1  thrpt   30  15566.705 ± 2007.335  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  27769.195 ± 1632.614  ops/ms
networkThroughput              1000,100ms          1  thrpt   30  15598.175 ± 1671.515  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  27499.901 ± 1035.415  ops/ms{code}
Using 1 buffer:
{code:java}
Benchmark          (channelsFlushTimeout)  (writers)   Mode  Cnt      Score      Error  Units
networkThroughput              1000,100ms          1  thrpt   30  13116.610 ±  325.587  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  22837.502 ± 1024.360  ops/ms
networkThroughput              1000,100ms          1  thrpt   30  11924.883 ± 1038.508  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  22823.586 ±  892.918  ops/ms
networkThroughput              1000,100ms          1  thrpt   30  12960.345 ± 1596.465  ops/ms
networkThroughput              1000,100ms          4  thrpt   30  23028.803 ±  933.609  ops/ms{code}
From the above results, we can see a performance regression of about 20%. For
the benchmark job, there are also regressions (about 10% - 20%) in some cases
where the number of input channels is small, for example 2 input channels,
which means the number of buffers that can be used is limited.
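As a rough cross-check of the 20% figure, the averages per writer count can be computed from the scores listed above. This is only a minimal sketch in Python: the helper name relative_regression is made up for illustration and is not part of the Flink benchmarks, and the reported error margins are ignored.

```python
from statistics import mean

# Scores (ops/ms) copied from the networkThroughput results above,
# keyed by the number of writers, one entry per run.
two_buffers = {
    1: [15972.952, 15566.705, 15598.175],
    4: [27650.498, 27769.195, 27499.901],
}
one_buffer = {
    1: [13116.610, 11924.883, 12960.345],
    4: [22837.502, 22823.586, 23028.803],
}


def relative_regression(baseline, candidate):
    """Fraction by which mean throughput drops from baseline to candidate.

    Hypothetical helper for this illustration only.
    """
    return 1.0 - mean(candidate) / mean(baseline)


# Regression per writer count when going from 2 exclusive buffers to 1.
results = {
    writers: relative_regression(two_buffers[writers], one_buffer[writers])
    for writers in two_buffers
}

for writers, drop in results.items():
    print(f"writers={writers}: {drop:.1%} lower throughput with 1 buffer")
```

With these numbers the drop comes out at roughly 19% for 1 writer and roughly 17% for 4 writers, which is in line with the "about 20%" stated above.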
> Reduce the default number of exclusive buffers from 2 to 1 on receiver side
> ---------------------------------------------------------------------------
>
> Key: FLINK-16012
> URL: https://issues.apache.org/jira/browse/FLINK-16012
> Project: Flink
> Issue Type: Improvement
> Reporter: Zhijiang
> Assignee: Yingjie Cao
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In order to reduce the inflight buffers for checkpoint in the case of back
> pressure, we can reduce the number of exclusive buffers for remote input
> channel from default 2 to 1 as the first step. Besides that, the total
> required buffers are also reduced as a result. We can further verify the
> performance effect via various benchmarks.