[
https://issues.apache.org/jira/browse/FLINK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anton Kalashnikov updated FLINK-24578:
--------------------------------------
Parent: FLINK-24589
Issue Type: Sub-task (was: Bug)
> Unexpected erratic load shape for channel skew load profile
> -----------------------------------------------------------
>
> Key: FLINK-24578
> URL: https://issues.apache.org/jira/browse/FLINK-24578
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing
> Affects Versions: 1.14.0
> Reporter: Anton Kalashnikov
> Priority: Major
> Attachments: antiphaseBufferSize.png, erraticBufferSize1.png,
> erraticBufferSize2.png
>
>
> given:
> A job with 5 maps (with keyBy).
> All channels are remote. Parallelism is 80.
> The first task produces only two keys: `indexOfThisSubtask` and
> `indexOfThisSubtask + 1`. So every subtask has a constant number of active
> channels (the number depends on the hash rebalance).
> Every record has the same size and takes the same time to process.
>
> when:
> Buffer debloating is enabled with the default configuration.
>
> then:
> The buffer size becomes synchronized across all subtasks of the first map for
> some reason. The synchronization can be strong, as shown in the
> erraticBufferSize1 picture, but usually it is less explicit, as in
> erraticBufferSize2.
> !erraticBufferSize1.png!
>
> Expected:
> After the stabilization period, the buffer size should be mostly constant with
> small fluctuations, or the different tasks should be in antiphase to each
> other (when one subtask has a small buffer size, another should have a big
> buffer size), as in the antiphaseBufferSize picture:
> !antiphaseBufferSize.png!
>
> Unfortunately, it is not reproduced every time, which means that this problem
> may be connected to the environment. But at least it makes sense to try to
> understand why the load shape is so strange when only a few input channels
> are active.
>
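The "given" setup above can be approximated without a cluster to see how many input channels end up active per downstream subtask. The following is only a rough sketch under stated assumptions: the class `ChannelSkewSketch`, its method names, and the `mix` hash are hypothetical and are not Flink's actual key-group assignment. It only illustrates that, when upstream subtask i emits just the keys i and i + 1, the number of active channels per subtask is fixed by the hash.

```java
class ChannelSkewSketch {

    // Hypothetical stand-in for the key-to-channel hash (NOT Flink's actual
    // key-group assignment): the murmur3 32-bit finalizer.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Downstream channel index chosen for a key (assumed mapping).
    static int channelFor(int key, int parallelism) {
        return Math.floorMod(mix(key), parallelism);
    }

    // For each downstream subtask, count the upstream subtasks that send to it
    // when upstream subtask i emits only the keys i and i + 1.
    static int[] activeInputChannels(int parallelism) {
        boolean[][] active = new boolean[parallelism][parallelism];
        for (int upstream = 0; upstream < parallelism; upstream++) {
            active[channelFor(upstream, parallelism)][upstream] = true;
            active[channelFor(upstream + 1, parallelism)][upstream] = true;
        }
        int[] counts = new int[parallelism];
        for (int downstream = 0; downstream < parallelism; downstream++) {
            for (int upstream = 0; upstream < parallelism; upstream++) {
                if (active[downstream][upstream]) {
                    counts[downstream]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Parallelism 80, as in the report.
        int[] counts = activeInputChannels(80);
        for (int subtask = 0; subtask < counts.length; subtask++) {
            System.out.println("subtask " + subtask + ": "
                    + counts[subtask] + " active input channels");
        }
    }
}
```

With a different hash the per-subtask counts change, but each upstream subtask still feeds at most two downstream channels, so most of the 80 input channels of any subtask stay idle.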