[ https://issues.apache.org/jira/browse/FLINK-38325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18018277#comment-18018277 ]
Zakelly Lan commented on FLINK-38325:
-------------------------------------
By the way, I noticed you are writing a DataStream job, right? Are you using
the new State API with asynchronous state access, or still using the original
state API?
> Checkpoints are hanging and timing out frequently
> -------------------------------------------------
>
> Key: FLINK-38325
> URL: https://issues.apache.org/jira/browse/FLINK-38325
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 2.0.0, 2.1.0
> Environment: Flink version 2.1 (also observed on 2.0) with the ForSt
> state backend.
> Running on Kubernetes using the Apache Flink Kubernetes Operator.
> Reporter: Lucas Borges
> Priority: Major
> Attachments: Screenshot 2025-09-03 at 14.53.56.png, Screenshot
> 2025-09-03 at 14.54.21.png, Screenshot 2025-09-03 at 14.54.36.png
>
>
> This issue is being observed on a Flink 2.1 job running with the ForSt state
> backend. We noticed that checkpoints fail due to timeouts/hanging more
> frequently than in other Flink 1.x jobs.
> We suspect there may be a deadlock somewhere, based on one task-manager's
> thread dump (could not attach it to the Jira issue due to size limits).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)