[ https://issues.apache.org/jira/browse/FLINK-37572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938908#comment-17938908 ]
Yanquan Lv commented on FLINK-37572:
------------------------------------
I would like to take this.
> Shuffle Event by bucket and tableid to avoid data skew
> ------------------------------------------------------
>
> Key: FLINK-37572
> URL: https://issues.apache.org/jira/browse/FLINK-37572
> Project: Flink
> Issue Type: Improvement
> Components: Flink CDC
> Affects Versions: cdc-3.3.0
> Reporter: Yanquan Lv
> Priority: Minor
>
> We have an operator, BucketAssign Operator, that assigns buckets to
> DataChangeEvents and shuffles them to different subtasks to avoid write
> conflicts.
> However, this can cause data skew: most tables have only 1 or 2 buckets,
> and those buckets are all assigned to the same subtasks.
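
To illustrate the skew described in the issue, here is a minimal sketch, not
the Flink CDC implementation. It assumes the current shuffle picks a subtask
as bucket % parallelism, and compares that with a key built from both the
table id and the bucket, as the issue title suggests. The names PARALLELISM,
stable_hash, by_bucket, by_table_and_bucket and subtask_load, as well as the
table ids, are hypothetical and chosen only for this example.

```python
PARALLELISM = 4  # hypothetical number of downstream writer subtasks


def stable_hash(table_id: str) -> int:
    """Toy deterministic hash (sum of UTF-8 bytes), used for illustration only.

    Python's built-in hash() is salted per process for strings, so it is
    avoided here to keep the routing reproducible.
    """
    return sum(table_id.encode("utf-8"))


def by_bucket(table_id: str, bucket: int) -> int:
    # Assumed current behaviour: only the bucket number decides the subtask.
    return bucket % PARALLELISM


def by_table_and_bucket(table_id: str, bucket: int) -> int:
    # Sketch of the proposal: offset the bucket by a per-table hash so that
    # bucket 0 of different tables can land on different subtasks.
    return (stable_hash(table_id) + bucket) % PARALLELISM


def subtask_load(assign, tables, buckets_per_table):
    """Count how many (table, bucket) pairs each subtask receives."""
    load = {subtask: 0 for subtask in range(PARALLELISM)}
    for table_id in tables:
        for bucket in range(buckets_per_table):
            load[assign(table_id, bucket)] += 1
    return load


# Eight hypothetical tables with two buckets each.
TABLES = [f"db.t{i}" for i in range(8)]

if __name__ == "__main__":
    print("bucket only:      ", subtask_load(by_bucket, TABLES, 2))
    print("table id + bucket:", subtask_load(by_table_and_bucket, TABLES, 2))
```

In this sketch every event of a given (table, bucket) pair still goes to
exactly one subtask, so the write-conflict guarantee is kept, while the
bucket-only variant leaves subtasks 2 and 3 idle.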
--
This message was sent by Atlassian Jira
(v8.20.10#820010)