[ https://issues.apache.org/jira/browse/FLINK-35624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863540#comment-17863540 ]
Rui Fan commented on FLINK-35624:
---------------------------------
File merging doesn't seem to work. I'm not sure whether I misused or missed
a configuration option, or whether it's a bug.
Hi [~zakelly], would you mind providing a job I can test with, in case my job
or configuration isn't correct? Thanks a lot.
h2. My test job:
[https://github.com/1996fanrui/fanrui-learning/blob/ac0e15e511fb88faf3dba9a0f1c50c37bec52d23/module-flink/src/main/java/com/dream/flink/uc/UnalignedCheckpointAndKeyedStateDemo.java]
The code includes all options. I set
execution.checkpointing.file-merging.enabled = true and didn't set any other
file-merging options.
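To double-check that claim on any job, it can help to list which file-merging options are explicitly set. The following is only a minimal sketch and not part of Flink or of the linked test job: it assumes the job's options are available as a plain key/value map (however that map was obtained), and the class and method names (FileMergingOptions, fileMergingOptions) are hypothetical. The interval value in main is illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink.
final class FileMergingOptions {

    // Prefix under which the issue description says all file-merging options live.
    static final String PREFIX = "execution.checkpointing.file-merging.";

    /** Returns only the entries whose key starts with PREFIX, sorted by key. */
    static Map<String, String> fileMergingOptions(Map<String, String> allOptions) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : allOptions.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                result.put(entry.getKey(), entry.getValue().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("execution.checkpointing.interval", "10 s"); // illustrative value
        options.put("execution.checkpointing.file-merging.enabled", "true");

        // Only the file-merging entries are printed; everything else is filtered out.
        System.out.println(fileMergingOptions(options));
    }
}
```

If the printed map contains only the "enabled" key, every other file-merging option is running with its default value.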
h2. Flink version:
I downloaded flink-1.20 from
[https://lists.apache.org/thread/05nrhxxbv9bxkq4bh6sndd28fc3k35lw]
h2. Exception:
When I start the job, it throws an exception immediately.
!image-2024-07-07-14-04-47-065.png!
> Release Testing: Verify FLIP-306 Unified File Merging Mechanism for
> Checkpoints
> -------------------------------------------------------------------------------
>
> Key: FLINK-35624
> URL: https://issues.apache.org/jira/browse/FLINK-35624
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing
> Reporter: Zakelly Lan
> Assignee: Rui Fan
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.20.0
>
> Attachments: image-2024-07-07-14-04-47-065.png
>
>
> Follow up the test for https://issues.apache.org/jira/browse/FLINK-32070
>
> 1.20 is the MVP version for FLIP-306. It is a little bit complex and should
> be tested carefully. The main idea of FLIP-306 is to merge checkpoint files
> on the TM side and provide new {{StateHandle}}s to the JM. There will be a
> TM-managed directory under the 'shared' checkpoint directory for each
> subtask, and a TM-managed directory under the 'taskowned' checkpoint
> directory for each Task Manager. Under those newly introduced directories,
> the checkpoint files will be merged into a smaller set of files. The
> following scenarios need to be tested, including but not limited to:
> # With file merging enabled, periodic checkpoints perform properly, and
> failover, restore and rescale also work well.
> # When file merging is switched on and off across jobs, checkpoints and
> recovery also work properly.
> # No TM-managed directory is left over, especially when no checkpoint
> completes before the job is cancelled.
> # File merging takes no effect in (native) savepoints.
> Besides the behaviors above, it is better to validate the function of space
> amplification control and metrics. All the config options can be found under
> 'execution.checkpointing.file-merging'.
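For the directory layout and the left-over check (scenario 3) described above, a small stand-alone helper can make manual inspection easier. The following is only a sketch and not part of Flink: it assumes the checkpoints are on a local file system and that the job's checkpoint directory contains 'shared' and 'taskowned' sub-directories. It makes no assumption about how the TM-managed directories are named and simply lists every sub-directory under each scope with its file count. The class and method names (CheckpointDirInspector, subDirectories, countFiles) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink. Works on local file systems only.
final class CheckpointDirInspector {

    /** Lists the directories directly under jobCheckpointDir/scope ("shared" or "taskowned"). */
    static List<Path> subDirectories(Path jobCheckpointDir, String scope) {
        Path scopeDir = jobCheckpointDir.resolve(scope);
        if (!Files.isDirectory(scopeDir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> children = Files.list(scopeDir)) {
            return children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts regular files at any depth below dir; returns 0 if dir does not exist. */
    static long countFiles(Path dir) {
        if (!Files.isDirectory(dir)) {
            return 0L;
        }
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java CheckpointDirInspector <job-checkpoint-dir>");
            return;
        }
        Path root = Paths.get(args[0]);
        for (String scope : new String[] {"shared", "taskowned"}) {
            List<Path> dirs = subDirectories(root, scope);
            System.out.println(scope + ": " + dirs.size() + " sub-directories");
            for (Path dir : dirs) {
                System.out.println("  " + dir.getFileName() + " -> " + countFiles(dir) + " file(s)");
            }
        }
    }
}
```

Running it once while the job is checkpointing and once after cancellation gives a quick before/after view. For remote storage such as HDFS or S3, the listing would have to be done with that file system's own tooling instead.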