Yanquan Lv created FLINK-37572:
----------------------------------

             Summary: Shuffle Event by bucket and tableid to avoid data skew
                 Key: FLINK-37572
                 URL: https://issues.apache.org/jira/browse/FLINK-37572
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
    Affects Versions: cdc-3.3.0
            Reporter: Yanquan Lv


The BucketAssignOperator assigns a bucket to each DataChangeEvent and 
shuffles events to different subtasks by bucket to avoid write conflicts.

However, this can lead to data skew: most tables have only 1 or 2 buckets, 
so events with the same bucket number, from any table, are all routed to 
the same subtask.
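
To illustrate the skew, here is a minimal sketch, not the actual Flink CDC 
code. It assumes the current routing is roughly "bucket modulo parallelism" 
and that the proposed routing hashes the table identifier together with the 
bucket. The class ShuffleSketch, both method names, and the table names are 
hypothetical and exist only for this example.

```java
import java.util.Objects;

public class ShuffleSketch {

    // Assumed current behaviour: only the bucket number picks the subtask,
    // so bucket 0 of every table lands on the same subtask.
    static int byBucketOnly(int bucket, int parallelism) {
        return Math.floorMod(bucket, parallelism);
    }

    // Assumed proposed behaviour: mix the table identifier into the key,
    // so bucket 0 of different tables can land on different subtasks.
    // All events of one (tableId, bucket) pair still go to one subtask,
    // which is what avoids write conflicts.
    static int byTableAndBucket(String tableId, int bucket, int parallelism) {
        return Math.floorMod(Objects.hash(tableId, bucket), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        // Hypothetical tables, each with a single bucket (bucket 0).
        String[] tables = {"db.orders", "db.users", "db.items", "db.payments"};
        for (String table : tables) {
            System.out.println(table
                    + ": bucketOnly=" + byBucketOnly(0, parallelism)
                    + ", tableAndBucket=" + byTableAndBucket(table, 0, parallelism));
        }
    }
}
```

With bucket-only routing, every single-bucket table maps to subtask 0 
regardless of parallelism. Keying on (tableId, bucket) keeps the mapping 
deterministic per table and bucket, but how evenly it spreads load depends 
on the hash function and on the number of tables.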



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
