Yanquan Lv created FLINK-37572:
----------------------------------

             Summary: Shuffle Event by bucket and tableid to avoid data skew
                 Key: FLINK-37572
                 URL: https://issues.apache.org/jira/browse/FLINK-37572
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
    Affects Versions: cdc-3.3.0
            Reporter: Yanquan Lv

The BucketAssign operator assigns a bucket to each DataChangeEvent and shuffles events to different subtasks by bucket, so that no two subtasks write to the same bucket and write conflicts are avoided. However, this can lead to data skew: most tables have only 1 to 2 buckets, so the events of all such tables are assigned to the same one or two subtasks while the remaining subtasks stay idle. Shuffling by both bucket and table ID would spread these events across subtasks.
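
To illustrate the skew, here is a minimal Java sketch. It is not the Flink CDC implementation: the class ShuffleKeySketch, its methods, the table names, and the modulo-based routing are all hypothetical and only model the idea of keying the shuffle on the bucket alone versus on (tableId, bucket).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical model of the shuffle key choice; not Flink CDC code.
final class ShuffleKeySketch {

    // Assumed current behaviour: the target subtask depends on the bucket only.
    static int byBucket(int bucket, int parallelism) {
        return Math.floorMod(bucket, parallelism);
    }

    // Proposed behaviour: the target subtask depends on table ID and bucket.
    static int byTableAndBucket(String tableId, int bucket, int parallelism) {
        return Math.floorMod(Objects.hash(tableId, bucket), parallelism);
    }

    // Made-up table identifiers: db.t0, db.t1, ...
    static List<String> sampleTables(int count) {
        List<String> tables = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            tables.add("db.t" + i);
        }
        return tables;
    }

    // Counts how many distinct subtasks receive data for the given tables.
    static int distinctSubtasks(
            List<String> tables, int bucketsPerTable, int parallelism, boolean includeTable) {
        Set<Integer> used = new HashSet<>();
        for (String table : tables) {
            for (int bucket = 0; bucket < bucketsPerTable; bucket++) {
                used.add(includeTable
                        ? byTableAndBucket(table, bucket, parallelism)
                        : byBucket(bucket, parallelism));
            }
        }
        return used.size();
    }

    public static void main(String[] args) {
        List<String> tables = sampleTables(10);
        System.out.println("bucket only:      " + distinctSubtasks(tables, 1, 4, false));
        System.out.println("tableId + bucket: " + distinctSubtasks(tables, 1, 4, true));
    }
}
```

With ten single-bucket tables and a parallelism of 4, the bucket-only key sends every event to one subtask, while the composite key reaches all four subtasks in this sketch. Each (tableId, bucket) pair still maps to exactly one subtask, so the property that prevents write conflicts is kept.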