rkhachatryan commented on code in PR #19331:
URL: https://github.com/apache/flink/pull/19331#discussion_r844886734
##########
flink-runtime/src/main/java/org/apache/flink/runtime/state/SharedStateRegistry.java:
##########
@@ -66,10 +67,25 @@ StreamStateHandle registerReference(
/**
* Register given shared states in the registry.
*
+ * <p>NOTE: For state from checkpoints from other jobs or runs (i.e. after recovery), please use
+ * {@link #registerAllAfterRestored(CompletedCheckpoint, RestoreMode)}
+ *
* @param stateHandles The shared states to register.
* @param checkpointID which uses the states.
*/
void registerAll(Iterable<? extends CompositeStateHandle> stateHandles, long checkpointID);
+ /**
+ * Set the lowest checkpoint ID below which no state is discarded, inclusive.
+ *
+ * <p>After recovery from an incremental checkpoint, its state should NOT be discarded, even if
+ * {@link #unregisterUnusedState(long) not used} anymore (unless recovering in {@link
+ * RestoreMode#CLAIM CLAIM} mode).
+ *
+ * <p>This should hold for both cases: when recovering from that initial checkpoint; and from
+ * any subsequent checkpoint derived from it.
+ */
+ void registerAllAfterRestored(CompletedCheckpoint checkpoint, RestoreMode mode);
Review Comment:
I agree that the restored checkpoint ID should be persisted durably somewhere.
That place could also be the subsequent checkpoints themselves. I think we need some
more analysis to choose the right option and implement it correctly.
Because this is not strictly a regression, I strongly prefer to fix it
separately and not block the release.
Another aspect is that we should address this in 1.14 too (for `LEGACY`
mode).
WDYT?
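
To make the concern concrete, here is a minimal sketch of the invariant described in the Javadoc above. It is a toy model, not the actual `SharedStateRegistry` implementation: the class `ToySharedStateRegistry`, the enum `ToyRestoreMode`, the string keys, and the field `highestNotClaimedCheckpointId` are all hypothetical names chosen for illustration. The sketch assumes that state created at or before a restored, non-claimed checkpoint must never be discarded.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model only; NOT Flink's SharedStateRegistry. All names are hypothetical. */
final class ToySharedStateRegistry {

    /** Simplified stand-in for RestoreMode; only CLAIM vs. non-CLAIM matters here. */
    enum ToyRestoreMode { CLAIM, NO_CLAIM, LEGACY }

    /** Tracks which checkpoint created a shared state and which one used it last. */
    private static final class SharedEntry {
        final long createdBy;
        long lastUsedBy;

        SharedEntry(long checkpointId) {
            this.createdBy = checkpointId;
            this.lastUsedBy = checkpointId;
        }
    }

    private final Map<String, SharedEntry> entries = new HashMap<>();
    private final List<String> discarded = new ArrayList<>();

    // Assumption: this bound lives only in memory, so it is lost on the next restart.
    private long highestNotClaimedCheckpointId = -1L;

    /** Registers a shared state, or bumps its last-use checkpoint if already known. */
    void register(String key, long checkpointId) {
        SharedEntry entry = entries.get(key);
        if (entry == null) {
            entries.put(key, new SharedEntry(checkpointId));
        } else {
            entry.lastUsedBy = Math.max(entry.lastUsedBy, checkpointId);
        }
    }

    /** Registers restored state; unless claimed, protects it from later discard. */
    void registerAllAfterRestored(long checkpointId, Iterable<String> keys, ToyRestoreMode mode) {
        for (String key : keys) {
            register(key, checkpointId);
        }
        if (mode != ToyRestoreMode.CLAIM) {
            highestNotClaimedCheckpointId =
                    Math.max(highestNotClaimedCheckpointId, checkpointId);
        }
    }

    /** Drops entries unused since lowestCheckpointId; discards only unprotected ones. */
    void unregisterUnusedState(long lowestCheckpointId) {
        Iterator<Map.Entry<String, SharedEntry>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, SharedEntry> mapEntry = it.next();
            String key = mapEntry.getKey();
            SharedEntry entry = mapEntry.getValue();
            if (entry.lastUsedBy < lowestCheckpointId) {
                it.remove();
                if (entry.createdBy > highestNotClaimedCheckpointId) {
                    discarded.add(key);
                }
            }
        }
    }

    /** Keys whose state would have been deleted from storage. */
    List<String> discarded() {
        return discarded;
    }
}
```

In this sketch, restoring checkpoint 5 in a non-`CLAIM` mode and later calling `unregisterUnusedState(7)` drops the restored entry from the map without discarding its state, while state created by checkpoint 6 is discarded. Because the bound is only an in-memory field here, a second recovery from a subsequent checkpoint would start without it, which is why the restored checkpoint ID needs to be persisted somewhere durable.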