Yu Li created FLINK-12699:
-----------------------------

             Summary: Reduce CPU consumption when snapshot/restore the spilled key-group
                 Key: FLINK-12699
                 URL: https://issues.apache.org/jira/browse/FLINK-12699
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / State Backends
            Reporter: Yu Li
            Assignee: Yu Li

We need to avoid unnecessary de/serialization when snapshotting and restoring spilled state key-groups. To achieve this, we need to:
1. Add meta information to the {{HeapKeyedStateBackend}} checkpoint on DFS that separates the on-heap part from the on-disk part
2. Write the off-heap bytes directly to DFS when checkpointing, and mark them as on-disk
3. When restoring the data from DFS, write the bytes directly to disk if they are marked as on-disk

Note that we cannot simply copy the file, since we use mmap and also support copy-on-write.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
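The three steps in the description can be illustrated with a minimal sketch. It is not Flink code: the class {{SpilledKeyGroupSnapshotSketch}}, the {{ON_HEAP}}/{{ON_DISK}} markers, the entry layout (key-group id, marker, length, bytes) and all method names are assumptions made for illustration, and in-memory streams stand in for DFS and the local spill file. The mmap and copy-on-write handling mentioned in the note is not modeled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: none of these names are Flink APIs.
final class SpilledKeyGroupSnapshotSketch {

    // Assumed per-key-group markers kept in the checkpoint meta information (step 1).
    static final byte ON_HEAP = 0;
    static final byte ON_DISK = 1;

    /** One key-group as stored in the checkpoint stream. */
    static final class KeyGroupEntry {
        final int keyGroupId;
        final byte location;
        final byte[] payload;

        KeyGroupEntry(int keyGroupId, byte location, byte[] payload) {
            this.keyGroupId = keyGroupId;
            this.location = location;
            this.payload = payload;
        }
    }

    // Step 2: write the marker and the payload. For ON_DISK the payload is the
    // already-serialized spilled bytes, passed through without deserialization.
    static void writeKeyGroup(DataOutputStream out, int keyGroupId, byte location, byte[] payload) {
        try {
            out.writeInt(keyGroupId);
            out.writeByte(location);
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one entry back in the same layout used by writeKeyGroup.
    static KeyGroupEntry readKeyGroup(DataInputStream in) {
        try {
            int keyGroupId = in.readInt();
            byte location = in.readByte();
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return new KeyGroupEntry(keyGroupId, location, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 3: bytes marked ON_DISK are copied to the spill target as-is.
    // Returns false for on-heap entries, which the caller would deserialize instead.
    static boolean restoreSpilled(KeyGroupEntry entry, ByteArrayOutputStream spillTarget) {
        if (entry.location != ON_DISK) {
            return false;
        }
        spillTarget.write(entry.payload, 0, entry.payload.length);
        return true;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream dfs = new ByteArrayOutputStream(); // stands in for DFS
        writeKeyGroup(new DataOutputStream(dfs), 7, ON_DISK, new byte[] {1, 2, 3});

        KeyGroupEntry entry =
                readKeyGroup(new DataInputStream(new ByteArrayInputStream(dfs.toByteArray())));
        ByteArrayOutputStream spillFile = new ByteArrayOutputStream(); // stands in for local disk
        boolean copied = restoreSpilled(entry, spillFile);
        System.out.println("copied directly: " + copied + ", bytes: " + spillFile.size());
    }
}
```

The point of the marker is that the restore path can branch before any state object is materialized, so spilled key-groups cost only a byte copy in both directions.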