Yanquan Lv created FLINK-36682:
----------------------------------
Summary: Add split assign strategy to avoid OOM error in
TaskManager
Key: FLINK-36682
URL: https://issues.apache.org/jira/browse/FLINK-36682
Project: Flink
Issue Type: Bug
Components: Flink CDC
Affects Versions: cdc-3.3.0
Reporter: Yanquan Lv
Fix For: cdc-3.3.0
During the snapshot reading phase, we split the table into chunks and assign
them to the split readers in the TaskManager.
With even chunk splitting, the chunks are assigned in ascending order. For
example, a table whose primary key is id may be split into chunks like
[-∞, 10000), [10000, 20000), [20000, 30000), ..., [1500000, +∞). However,
during the snapshot reading phase, more records may be inserted and id may
grow much larger, so the last split may need to fetch too many records. For
example, the last split may need to fetch records in the range
[1500000, 3000000], which can cause the TaskManager to run out of memory.
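
To make the skew concrete, here is a minimal sketch of the example above. It
is not Flink CDC's actual splitter: split_evenly and rows_in_chunk are
hypothetical helpers, the chunk size of 10000 is taken from the example, and
ids are assumed to be dense integers starting at 0.

```python
from typing import List, Optional, Tuple

# A chunk is a half-open key range [start, end); None means unbounded.
Chunk = Tuple[Optional[int], Optional[int]]


def split_evenly(min_id: int, max_id: int, chunk_size: int) -> List[Chunk]:
    """Cut the key space at fixed intervals; first and last chunks are unbounded."""
    boundaries = list(range(min_id + chunk_size, max_id + 1, chunk_size))
    starts: List[Optional[int]] = [None] + boundaries
    ends: List[Optional[int]] = boundaries + [None]
    return list(zip(starts, ends))


def rows_in_chunk(chunk: Chunk, current_max_id: int, min_id: int = 0) -> int:
    """Rows a chunk covers, assuming dense ids from min_id to current_max_id."""
    start, end = chunk
    lo = min_id if start is None else max(start, min_id)
    hi = current_max_id + 1 if end is None else min(end, current_max_id + 1)
    return max(0, hi - lo)


# Chunks are planned while max(id) is 1500000.
chunks = split_evenly(0, 1_500_000, 10_000)

# Bounded chunks keep their size no matter how far id grows afterwards.
bounded_rows = rows_in_chunk(chunks[1], 3_000_000)

# The unbounded last chunk absorbs every row inserted after planning.
tail_rows_at_planning = rows_in_chunk(chunks[-1], 1_500_000)
tail_rows_when_read_late = rows_in_chunk(chunks[-1], 3_000_000)
```

Under these assumptions a bounded chunk stays at 10000 rows, while the last
chunk grows from a single row at planning time to about 1.5 million rows if it
is read only after id has reached 3000000.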
So I propose adding a strategy that lets users configure how splits are
assigned. By default, we can send the last split to the split reader first.
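
One way the proposed ordering could look is sketched below. The names
AssignStrategy and order_for_assignment are illustrative assumptions, not
existing Flink CDC classes or options, and the actual change may be shaped
differently.

```python
from enum import Enum
from typing import List, Sequence, TypeVar

T = TypeVar("T")


class AssignStrategy(Enum):
    # Hypothetical option values, for illustration only.
    ASCENDING_ORDER = "ascending"         # behaviour described in this issue
    LAST_SPLIT_FIRST = "last_split_first" # proposed default


def order_for_assignment(splits: Sequence[T], strategy: AssignStrategy) -> List[T]:
    """Return a new list in the order the splits would be handed to readers."""
    ordered = list(splits)
    if strategy is AssignStrategy.LAST_SPLIT_FIRST and len(ordered) > 1:
        # Move the unbounded tail chunk to the front; the rest stay ascending.
        ordered.insert(0, ordered.pop())
    return ordered


splits = ["[-inf, 10000)", "[10000, 20000)", "[20000, +inf)"]
ascending = order_for_assignment(splits, AssignStrategy.ASCENDING_ORDER)
last_first = order_for_assignment(splits, AssignStrategy.LAST_SPLIT_FIRST)
```

Reading the tail chunk first means it is fetched soon after the chunks are
planned, before many newly inserted rows have accumulated in its open-ended
range.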
--
This message was sent by Atlassian Jira
(v8.20.10#820010)