[
https://issues.apache.org/jira/browse/FLINK-24553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435850#comment-17435850
]
Anton Kalashnikov commented on FLINK-24553:
-------------------------------------------
I propose the following values:
* taskmanager.memory.min-segment-size = 256, since it works well when the
record size is small and backpressure is high. At the same time, no problems
were found in other cases.
* taskmanager.network.memory.buffer-debloat.period = 200. According to the
math, the value stabilizes at the 70% level after 10 attempts (with the default
configuration). Since the debloat target is 1000ms, it would make sense to set
the debloat period to 100ms (100 * 10 = 1000ms). On the other hand, a
stabilization time equal to the debloat target time is a little too sensitive
to throughput volatility. According to experiments, 200ms is a good trade-off
between reaction time and ignoring spikes.
* taskmanager.network.memory.buffer-debloat.threshold-percentages = 25.
Considering the default values such as the buffer size (32Kb), the current
threshold (50) requires the value to change too significantly before it is
applied (from 32Kb to 16Kb). In my opinion, it makes sense to apply the new
value even if it has only changed from 32Kb to ~20Kb. Of course, the situation
is different for small values, but perhaps it is even better to change the
buffer size more frequently when the backpressure is high.
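
The arithmetic behind the last two values can be sketched as follows. This is
only an illustration of the reasoning above, not Flink code: the function names
are hypothetical, the 10 attempts are taken from the estimate above, and the
exact comparison used for the threshold (">=" against the old size) is an
assumption.

```python
KIB = 1024  # bytes per kilobyte, used for the 32Kb default buffer size


def stabilization_time_ms(period_ms: int, attempts: int = 10) -> int:
    """Time needed for `attempts` recalculations at the given debloat period."""
    return period_ms * attempts


def shrink_bound_bytes(current_bytes: int, threshold_percent: int) -> int:
    """Largest smaller size whose relative change reaches the threshold."""
    return current_bytes * (100 - threshold_percent) // 100


def would_apply(current_bytes: int, new_bytes: int, threshold_percent: int) -> bool:
    """Assumed rule: apply when the relative change reaches the threshold."""
    change = abs(new_bytes - current_bytes)
    return change * 100 >= current_bytes * threshold_percent


# Compare the candidate periods against the 1000ms debloat target.
for period in (100, 200, 500):
    print(f"period={period}ms -> stabilization={stabilization_time_ms(period)}ms")

# Compare the thresholds for the default 32Kb buffer.
for threshold in (50, 25):
    bound = shrink_bound_bytes(32 * KIB, threshold)
    applies = would_apply(32 * KIB, 20 * KIB, threshold)
    print(f"threshold={threshold}% -> bound={bound // KIB}Kb, 20Kb applied: {applies}")
```

With these assumptions, a change from 32Kb to ~20Kb is ignored at a threshold
of 50 but applied at 25, which is the behaviour argued for above.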
> Change buffer debloating default configuration values
> -----------------------------------------------------
>
> Key: FLINK-24553
> URL: https://issues.apache.org/jira/browse/FLINK-24553
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.14.0
> Reporter: Anton Kalashnikov
> Assignee: Anton Kalashnikov
> Priority: Major
>
> After investigating the effectiveness of buffer debloating, there are some
> conclusions:
> * taskmanager.memory.min-segment-size can be decreased from 1024 to at least
> 256, because in some cases even 1024 is too big a value, and at the same time
> a low min value is not a problem.
> * taskmanager.network.memory.buffer-debloat.samples can be decreased from 20
> to 10, or taskmanager.network.memory.buffer-debloat.period can be decreased
> from 500ms to 100ms or 200ms. According to the investigation, the current
> speed of reaction is too slow, so it is better to increase it by changing one
> of these parameters.
> * taskmanager.network.memory.buffer-debloat.threshold-percentages can be
> decreased from 50 to 10, because no problems were found when the buffer size
> was announced more frequently, and it can actually have a positive influence
> on the checkpoint time.
> It is better to change the default value for min-segment-size only after the
> task https://issues.apache.org/jira/browse/FLINK-24190 is done.
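
To illustrate why either fewer samples or a shorter period speeds up the
reaction, here is a rough sketch. It assumes the buffer size is smoothed with a
plain exponential moving average whose factor is 2 / (samples + 1); that
formula and the function names are assumptions made for illustration, not
something stated in this issue, so the exact percentages may differ from the
real implementation.

```python
def ema_alpha(samples: int) -> float:
    """Assumed smoothing factor for an exponential moving average."""
    return 2.0 / (samples + 1)


def fraction_closed(samples: int, updates: int) -> float:
    """Share of the gap to a new steady value closed after `updates` steps."""
    return 1.0 - (1.0 - ema_alpha(samples)) ** updates


def reaction_time_ms(period_ms: int, updates: int) -> int:
    """Wall-clock time taken by `updates` recalculations."""
    return period_ms * updates


# Same number of updates, different sample counts: fewer samples react faster.
for samples in (20, 10):
    share = fraction_closed(samples, updates=10)
    print(f"samples={samples}: {share:.0%} of the gap closed after 10 updates")

# Same number of updates, different periods: shorter periods finish sooner.
for period in (500, 200, 100):
    print(f"period={period}ms: 10 updates take {reaction_time_ms(period, 10)}ms")
```

Lowering the sample count changes how far each update moves the value, while
lowering the period changes how often updates happen; either one shortens the
time to react.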
--
This message was sent by Atlassian Jira
(v8.3.4#803005)