[
https://issues.apache.org/jira/browse/FLINK-36682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896837#comment-17896837
]
Yanquan Lv commented on FLINK-36682:
------------------------------------
I'm willing to support this feature.
> Add split assign strategy to avoid OOM error in TaskManager
> -----------------------------------------------------------
>
> Key: FLINK-36682
> URL: https://issues.apache.org/jira/browse/FLINK-36682
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Affects Versions: cdc-3.3.0
> Reporter: Yanquan Lv
> Priority: Major
> Fix For: cdc-3.3.0
>
>
> During the snapshot reading phase, we split each table into chunks and assign
> them to the split readers in the TaskManagers.
> For evenly split chunks, the splits are assigned in ascending order. For example, a
> table whose primary key is id may be split into chunks like [-∞, 10000),
> [10000, 20000), [20000, 30000), ..., [1500000, +∞). However, during the snapshot
> reading phase, more records may be inserted and id may grow to a relatively
> high value, so the last split may need to fetch too many records. For example, the
> last split may need to fetch records in the range [1500000, 3000000], which can
> cause the TaskManager to run out of memory.
> So I propose adding a strategy that lets users configure how splits are assigned;
> by default, we can send the last split to the split reader first.
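
A minimal sketch of the assignment order proposed above, for illustration only. The names SplitAssignSketch, AssignStrategy, ChunkRange and order are hypothetical and are not taken from the Flink CDC code base; the sketch only assumes that chunks arrive in ascending key order and that the unbounded last chunk is the one that keeps growing during the snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Flink CDC's actual API.
class SplitAssignSketch {

    // Hypothetical strategy names: keep ascending order, or send the last chunk first.
    enum AssignStrategy { ASCENDING_ORDER, LAST_SPLIT_FIRST }

    // A key range [start, end); null stands for an unbounded side (-∞ or +∞).
    static final class ChunkRange {
        final Long start;
        final Long end;

        ChunkRange(Long start, Long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + (start == null ? "-inf" : start) + ", "
                    + (end == null ? "+inf" : end) + ")";
        }
    }

    // Returns a new list in the order the chunks would be handed to split readers.
    // The input is assumed to be sorted by ascending key range and is not modified.
    static List<ChunkRange> order(List<ChunkRange> chunks, AssignStrategy strategy) {
        List<ChunkRange> ordered = new ArrayList<>(chunks);
        if (strategy == AssignStrategy.LAST_SPLIT_FIRST && ordered.size() > 1) {
            // Move the unbounded tail chunk to the front so it is read
            // before newly inserted rows have a chance to inflate it.
            ChunkRange last = ordered.remove(ordered.size() - 1);
            ordered.add(0, last);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<ChunkRange> chunks = Arrays.asList(
                new ChunkRange(null, 10000L),
                new ChunkRange(10000L, 20000L),
                new ChunkRange(20000L, null));
        System.out.println(order(chunks, AssignStrategy.LAST_SPLIT_FIRST));
    }
}
```

Reading the tail chunk first does not bound its size; it only shortens the window in which new rows can accumulate in it, which is the effect the proposal relies on.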
--
This message was sent by Atlassian Jira
(v8.20.10#820010)