[ https://issues.apache.org/jira/browse/FLINK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607074#comment-17607074 ]
Xintong Song commented on FLINK-25328:
--------------------------------------
[~zjureel], I'll take a look ASAP. Btw, it seems the design doc is private now.
Please make it publicly available.
> Improvement of reuse segments for join/agg/sort operators in TaskManager for
> flink olap queries
> -----------------------------------------------------------------------------------------------
>
> Key: FLINK-25328
> URL: https://issues.apache.org/jira/browse/FLINK-25328
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.14.0, 1.12.5, 1.13.3
> Reporter: Shammon
> Priority: Major
>
> We submit batch jobs to a Flink session cluster as OLAP queries. The
> subtasks of these jobs are created and destroyed frequently in the
> TaskManager because they finish their work quickly. Each slot in the
> TaskManager manages a `MemoryManager` for the multiple tasks of one job, and
> that `MemoryManager` is closed when all the subtasks have finished. Join,
> aggregate, sort, and similar operators in the subtasks allocate
> `MemorySegment`s via the `MemoryManager`, and these segments are freed when
> the operators finish.
>
> This causes excessive allocation and freeing of `MemorySegment`s in the
> TaskManager. For example, suppose a TaskManager has 50 slots, a job runs 3
> join/agg operators in each slot, and each operator allocates and initializes
> 2000 segments. If the subtasks of a job take 100 ms to execute, the
> TaskManager executes the subtasks of 10 jobs per second, so it allocates and
> frees 2000 * 3 * 50 * 10 = 3,000,000 segments per second. Allocating and
> freeing this many segments causes two issues:
> 1) It increases the CPU usage of the TaskManager.
> 2) It increases the cost of subtasks in the TaskManager, which increases job
> latency and decreases QPS.
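
The estimate above can be reproduced with a few lines of Java. This is only a
sketch of the arithmetic in the description; the class and method names
(`SegmentChurn`, `segmentsPerSecond`) are made up for illustration and are not
part of Flink. It assumes every slot is busy and that each slot runs jobs back
to back.

```java
public final class SegmentChurn {

    /**
     * Segments allocated (and later freed) per second on one TaskManager,
     * assuming every slot runs one job's subtasks back to back.
     */
    static long segmentsPerSecond(
            int segmentsPerOperator, int operatorsPerSlot, int slots, long subtaskMillis) {
        // A 100 ms subtask means each slot can serve 10 jobs per second.
        long jobsPerSecond = 1000L / subtaskMillis;
        return (long) segmentsPerOperator * operatorsPerSlot * slots * jobsPerSecond;
    }

    public static void main(String[] args) {
        // Numbers from the issue description: 2000 segments per operator,
        // 3 operators per slot, 50 slots, 100 ms per job.
        System.out.println(segmentsPerSecond(2000, 3, 50, 100)); // prints 3000000
    }
}
```
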
> To improve the reuse of memory segments between jobs in the same slot, we
> propose not dropping the memory manager when all the subtasks in the slot
> have finished. The slot will hold on to the `MemoryManager` and not
> immediately free the `MemorySegment`s allocated in it. When subtasks of
> another job are assigned to the slot, they do not need to allocate new
> segments and can reuse the `MemoryManager` and its `MemorySegment`s. WDYT?
> [~xtsong] THX
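
To make the proposal concrete, here is a minimal sketch of a slot-scoped
segment pool. It is not Flink's `MemoryManager` API: `SlotSegmentPool` and its
methods are hypothetical, and `java.nio.ByteBuffer` stands in for
`MemorySegment`. The idea it illustrates is only that segments released by one
job stay in the slot and are handed to the next job instead of being freed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical slot-scoped pool; ByteBuffer stands in for MemorySegment. */
public final class SlotSegmentPool {
    private final int segmentSize;
    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private long freshAllocations = 0;

    public SlotSegmentPool(int segmentSize) {
        this.segmentSize = segmentSize;
    }

    /** Hands out pooled segments first and allocates only when the pool is empty. */
    public List<ByteBuffer> allocate(int count) {
        List<ByteBuffer> segments = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ByteBuffer segment = free.pollFirst();
            if (segment == null) {
                segment = ByteBuffer.allocate(segmentSize);
                freshAllocations++;
            } else {
                // Resets position and limit; the old contents are not zeroed.
                segment.clear();
            }
            segments.add(segment);
        }
        return segments;
    }

    /** Called when a job's subtasks finish: keep the segments for the next job. */
    public void release(List<ByteBuffer> segments) {
        free.addAll(segments);
    }

    public long freshAllocations() {
        return freshAllocations;
    }

    public static void main(String[] args) {
        // Small segment size to keep the demo light; the real size is configurable.
        SlotSegmentPool pool = new SlotSegmentPool(1024);
        for (int job = 0; job < 10; job++) {
            List<ByteBuffer> segments = pool.allocate(2000);
            pool.release(segments);
        }
        System.out.println(pool.freshAllocations()); // prints 2000
    }
}
```

In this sketch, ten consecutive jobs trigger 2000 fresh allocations instead of
20000. A real implementation would also need to decide when an idle slot
finally gives its segments back, which the sketch leaves out.
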
--
This message was sent by Atlassian Jira
(v8.20.10#820010)