rkhachatryan commented on PR #20404:
URL: https://github.com/apache/flink/pull/20404#issuecomment-1205045700

   > I'm not sure the root cause. It can be seen from the log that there is a checkpoint completed, I guess CheckpointStatsHistory is deleted after miniCluster.cancelJob()?
   
   That shouldn't be the case, because then this call would throw an NPE:
   ```
   cluster.getExecutionGraph(jobID).get().getCheckpointStatsSnapshot()
   ```
   Besides, cancelled jobs should be archived.
   
   My guess is that there's a race condition when archiving the job:
   - `CheckpointStatsTracker.createSnapshot` checks the condition `dirty && statsReadWriteLock.tryLock()`
   - if the lock is being held at that moment, then `ArchivedExecutionGraph` might not receive the latest checkpoint
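
To illustrate the suspected race, here is a minimal, self-contained sketch. It is not Flink's actual code: `SnapshotRaceSketch` and all of its members are hypothetical stand-ins, and the only thing taken from the real tracker is the `dirty && tryLock()` shape of the check. It shows that a snapshot requested while another thread holds the lock silently falls back to the previous (stale) snapshot.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a stats tracker; NOT Flink's CheckpointStatsTracker. */
public class SnapshotRaceSketch {
    private final ReentrantLock statsLock = new ReentrantLock();
    private volatile boolean dirty;
    private volatile long latestCompletedCheckpointId = -1L;
    private volatile long snapshotCheckpointId = -1L; // -1 means "no checkpoint in snapshot"

    /** Records a completed checkpoint and marks the cached snapshot as outdated. */
    void reportCompletedCheckpoint(long id) {
        statsLock.lock();
        try {
            latestCompletedCheckpointId = id;
            dirty = true;
        } finally {
            statsLock.unlock();
        }
    }

    /** Refreshes the snapshot only if the lock is free; otherwise returns the old one. */
    long createSnapshot() {
        if (dirty && statsLock.tryLock()) {
            try {
                snapshotCheckpointId = latestCompletedCheckpointId;
                dirty = false;
            } finally {
                statsLock.unlock();
            }
        }
        return snapshotCheckpointId;
    }

    /** Takes a snapshot from a second thread while the calling thread holds the lock. */
    static long snapshotWhileLockHeld() {
        SnapshotRaceSketch tracker = new SnapshotRaceSketch();
        tracker.reportCompletedCheckpoint(1L);
        long[] result = new long[1];
        Thread archiver = new Thread(() -> result[0] = tracker.createSnapshot());
        tracker.statsLock.lock(); // simulates a concurrent stats update in progress
        try {
            archiver.start();
            archiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            tracker.statsLock.unlock();
        }
        return result[0];
    }

    /** Takes a snapshot with no contention on the lock. */
    static long snapshotWithoutContention() {
        SnapshotRaceSketch tracker = new SnapshotRaceSketch();
        tracker.reportCompletedCheckpoint(1L);
        return tracker.createSnapshot();
    }
}
```

In this sketch `snapshotWhileLockHeld()` misses checkpoint 1 even though it was already reported, while `snapshotWithoutContention()` sees it. If archiving takes its snapshot in the first situation, the archived graph would end up without the latest checkpoint.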
    
   The current PR tries to fix the issue by replacing job cancellation with `ControlledSource` + `requestJobResult`, right?
   I think that might also work.
   
   ----
   
   > Agreed, how about renaming it to ChangelogSwitchStateBackendITCase?
   
   `ChangelogPeriodicMaterializationRescaleITCase` doesn't switch the backend.
   
   The current hierarchy is as follows:
   ```
   ChangelogPeriodicMaterializationTestBase 
       ChangelogPeriodicMaterializationITCase 
       ChangelogPeriodicMaterializationSwitchEnvTestBase 
           ChangelogPeriodicMaterializationSwitchStateBackendITCase 
           ChangelogPeriodicMaterializationRescaleITCase 
   ```
   
   Except for `testFailedMaterialization`, all of the tests exercise recovery with some settings altered.
   So I think `ChangelogRecoveryITCase` is a suitable name (`ChangelogRecoveryITCaseBase`, `ChangelogRecoverySwitchEnvTestBase`, etc).
   
   However, if the refactoring is part of a fix, as you mentioned, then this renaming is probably out of scope for this PR.
   
   WDYT?

