[jira] [Updated] (FLINK-31766) Restoring from a retained checkpoint that was generated with changelog backend enabled might fail due to missing files

2023-04-25 Thread Yanfei Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanfei Lei updated FLINK-31766:
---
Component/s: Runtime / Checkpointing

> Restoring from a retained checkpoint that was generated with changelog 
> backend enabled might fail due to missing files
> --
>
> Key: FLINK-31766
> URL: https://issues.apache.org/jira/browse/FLINK-31766
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing, Runtime / Coordination
> Affects Versions: 1.17.0, 1.16.1, 1.18.0
> Reporter: Matthias Pohl
> Priority: Major
> Attachments: 
> FLINK-31593.StatefulJobSavepointMigrationITCase.create_snapshot.log, 
> FLINK-31593.StatefulJobSavepointMigrationITCase.verify_snapshot.log
>
>
> In FLINK-31593 we discovered an instability when generating the test data for
> {{StatefulJobSavepointMigrationITCase}} and
> {{StatefulJobWBroadcastStateMigrationITCase}}. It appears that files which
> should be retained are being deleted (see [~Yanfei Lei]'s [comment in
> FLINK-31593|https://issues.apache.org/jira/browse/FLINK-31593?focusedCommentId=17706679&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17706679]).
> The failure is quite reproducible when generating the 1.17 test data for
> {{StatefulJobWBroadcastStateMigrationITCase}} and then doing a test run to
> verify it.
> I'm also attaching to this issue the debug logs of two such runs, which I
> generated for FLINK-31593.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31766) Restoring from a retained checkpoint that was generated with changelog backend enabled might fail due to missing files

2023-04-11 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-31766:
--
Attachment: FLINK-31593.StatefulJobSavepointMigrationITCase.verify_snapshot.log
            FLINK-31593.StatefulJobSavepointMigrationITCase.create_snapshot.log

> Restoring from a retained checkpoint that was generated with changelog 
> backend enabled might fail due to missing files
> --
>
> Key: FLINK-31766
> URL: https://issues.apache.org/jira/browse/FLINK-31766
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.17.0, 1.16.1, 1.18.0
> Reporter: Matthias Pohl
> Priority: Major
> Attachments: 
> FLINK-31593.StatefulJobSavepointMigrationITCase.create_snapshot.log, 
> FLINK-31593.StatefulJobSavepointMigrationITCase.verify_snapshot.log
>
>
> In FLINK-31593 we discovered an instability when generating the test data for
> {{StatefulJobSavepointMigrationITCase}} and
> {{StatefulJobWBroadcastStateMigrationITCase}}. It appears that files which
> should be retained are being deleted (see [~Yanfei Lei]'s [comment in
> FLINK-31593|https://issues.apache.org/jira/browse/FLINK-31593?focusedCommentId=17706679&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17706679]).
> The failure is quite reproducible when generating the 1.17 test data for
> {{StatefulJobWBroadcastStateMigrationITCase}} and then doing a test run to
> verify it.
> I'm also attaching to this issue the debug logs of two such runs, which I
> generated for FLINK-31593.


