[jira] [Commented] (BEAM-10927) Beam Flink Runner 1.10 checkpoint failure

Beam JIRA Bot (Jira) Sat, 29 May 2021 10:27:37 -0700


    [ https://issues.apache.org/jira/browse/BEAM-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353805#comment-17353805 ]


Beam JIRA Bot commented on BEAM-10927:
--------------------------------------

This issue was marked "stale-P2" and has not received a public comment in 14 
days. It is now automatically moved to P3. If you are still affected by it, you 
can comment and move it back to P2.

> Beam Flink Runner 1.10 checkpoint failure
> -----------------------------------------
>
>                 Key: BEAM-10927
>                 URL: https://issues.apache.org/jira/browse/BEAM-10927
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.23.0
>            Reporter: Omkar Deshpande
>            Priority: P3
>
> I recently upgraded from beam-runners-flink-1.9 v2.23.0 to
> beam-runners-flink-1.10 v2.23.0, and upgraded the Flink server from 1.9.3
> to 1.10.2.
> The Beam pipeline reads from KafkaIO and writes to KafkaIO, with an
> in-memory ParDo between PBegin and PDone. The application is configured to
> use S3 for checkpointing, and the state backend is RocksDB.
> This pipeline worked as expected with beam-runners-flink-1.9, but after
> upgrading to beam-runners-flink-1.10 the checkpoints keep timing out. I
> have tried increasing the timeout to several hours, but the checkpoints
> still time out.
> There are no exceptions in the log. Based on the logs, neither the
> synchronous nor the asynchronous phase of checkpointing is happening.
> Usually, a "Trigger checkpoint" log statement is followed by "Confirm
> checkpoint" when the checkpoint succeeds. But with 1.10, I only see
> "Trigger checkpoint", with no confirmation of completion or even an
> indication of progress. There is enough CPU and memory available, and
> there is no deadlock.
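
The symptom described above (a "Trigger checkpoint" line with no matching
"Confirm checkpoint") can be checked mechanically across a large log. The
following is a minimal sketch, not part of the original report. It assumes the
log lines contain messages shaped like "Trigger checkpoint <id>@<timestamp>"
and "Confirm checkpoint <id>@<timestamp>"; the exact format may differ between
Flink versions and logging configurations. The function name
unconfirmed_checkpoints and the sample lines are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed message shape: "<Trigger|Confirm> checkpoint <id>@<timestamp> ..."
# Adjust the pattern if your log layout differs.
_EVENT = re.compile(r"\b(Trigger|Confirm) checkpoint (\d+)@")


def unconfirmed_checkpoints(lines: Iterable[str]) -> List[int]:
    """Return sorted IDs of checkpoints that were triggered but never confirmed."""
    triggered = set()
    confirmed = set()
    for line in lines:
        match = _EVENT.search(line)
        if match is None:
            continue  # not a checkpoint trigger/confirm line
        checkpoint_id = int(match.group(2))
        if match.group(1) == "Trigger":
            triggered.add(checkpoint_id)
        else:
            confirmed.add(checkpoint_id)
    return sorted(triggered - confirmed)


if __name__ == "__main__":
    # Hypothetical sample lines, made up for illustration only.
    sample = [
        "DEBUG Trigger checkpoint 1@1600000000000 for attempt-a.",
        "DEBUG Confirm checkpoint 1@1600000000000 for attempt-a.",
        "DEBUG Trigger checkpoint 2@1600000060000 for attempt-a.",
    ]
    print(unconfirmed_checkpoints(sample))  # prints [2]
```

To run it against a real log, pass an open file handle instead of the sample
list, for example unconfirmed_checkpoints(open(path)). A non-empty result
covering every triggered ID would match the behavior reported here.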



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
