[
https://issues.apache.org/jira/browse/FLINK-23974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438189#comment-17438189
]
Anton Kalashnikov commented on FLINK-23974:
-------------------------------------------
As the conclusion of this task:
* Several bugs were found (see dependent tickets) that caused some performance
drop.
* A couple of improvements were proposed (see dependent tickets) that should
improve performance slightly.
* Even after all dependent tickets, the performance for static load is still
slightly lower than without buffer debloat, but the difference is
insignificant.
* The scenario with data skew looks pretty good after all of the proposed
changes.
* Sinus-shaped and erratic loads require further investigation in separate
tickets, since we need to determine what our target scenario is. For example,
if the load is sinus-shaped because of the source, then it works without any
problems even now. But if the load is sinus-shaped over time because of the
sink (e.g. the throughput of the sink decreases for 10 sec and then increases
again for 10 sec), then first, it isn't clear what could cause this behavior,
and second, it doesn't work well right now (the throughput is significantly
lower) and it looks too expensive to fix.
> Decreased throughput with enabled buffer debloat
> ------------------------------------------------
>
> Key: FLINK-23974
> URL: https://issues.apache.org/jira/browse/FLINK-23974
> Project: Flink
> Issue Type: Sub-task
> Affects Versions: 1.14.0
> Reporter: Anton Kalashnikov
> Assignee: Anton Kalashnikov
> Priority: Major
>
> According to task https://issues.apache.org/jira/browse/FLINK-23456, we see
> some performance drop when buffer debloat is enabled:
> * for static load with throttling in source - 1-3%
> * for sinus shape load with throttling in source - 2-3%
> * for erratic load with throttling in source - 5-6%
> We need to investigate the reason for this and improve it if possible.
> It makes sense to write microbenchmarks for these scenarios to reproduce
> the problem locally.
> Problem assumption:
> Most likely the problem is the speed of reaction to increasing load. During
> the investigation, we need to pay attention to the throughput calculation
> for different load profiles. It is possible that when the load is low we
> decrease the buffer size to a small value (maybe the minimum value), and when
> the load gets back to high we need several iterations of the throughput
> calculation to reach the appropriate buffer size.
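
The problem assumption above can be illustrated with a toy model. This is a
sketch only, not Flink's actual implementation: it assumes that the buffer size
moves toward the desired size by exponential smoothing on each throughput
calculation, and the bounds (256 bytes and 32 KiB), the sample count (20), and
all class and method names are hypothetical values chosen for illustration.

```java
// Toy model of buffer-size reaction speed. NOT Flink code; all names and
// constants here are illustrative assumptions.
class DebloatReactionSketch {
    static final double MIN_BUFFER_SIZE = 256;        // bytes, assumed lower bound
    static final double MAX_BUFFER_SIZE = 32 * 1024;  // bytes, assumed upper bound

    // One throughput-calculation step: move the current size toward the
    // desired size with an exponential moving average, then clamp to bounds.
    static double nextBufferSize(double current, double desired, int samples) {
        double alpha = 2.0 / (samples + 1);
        double next = current + alpha * (desired - current);
        return Math.max(MIN_BUFFER_SIZE, Math.min(MAX_BUFFER_SIZE, next));
    }

    // Counts how many steps are needed until the buffer size reaches the
    // given fraction of the desired size (capped to avoid an endless loop).
    static int iterationsToReach(double start, double desired, int samples, double fraction) {
        double size = start;
        int iterations = 0;
        while (size < fraction * desired && iterations < 10_000) {
            size = nextBufferSize(size, desired, samples);
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Load was low, so the buffer shrank to the minimum; then load jumps
        // back and the desired size becomes the maximum.
        int slow = iterationsToReach(MIN_BUFFER_SIZE, MAX_BUFFER_SIZE, 20, 0.9);
        int fast = iterationsToReach(MIN_BUFFER_SIZE, MAX_BUFFER_SIZE, 2, 0.9);
        System.out.println("iterations with 20 samples: " + slow);
        System.out.println("iterations with 2 samples:  " + fast);
    }
}
```

Under these assumptions, heavier smoothing (more samples) needs more
iterations to get back to the appropriate buffer size after the load rises,
which is the kind of delayed reaction described in the problem assumption.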