curcur commented on a change in pull request #16606:
URL: https://github.com/apache/flink/pull/16606#discussion_r688319196
##########
File path:
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java
##########
@@ -329,37 +332,47 @@ public boolean deregisterKeySelectionListener(KeySelectionListener<K> listener)
// materialization may truncate only a part of the previous result and the backend would
// have to split it somehow for the former option, so the latter is used.
lastCheckpointId = checkpointId;
- lastUploadedFrom = materializedTo;
+ lastUploadedFrom = periodicMaterializer.getMaterializedState().lastMaterializedTo();
lastUploadedTo = stateChangelogWriter.lastAppendedSequenceNumber().next();
LOG.debug(
"snapshot for checkpoint {}, change range: {}..{}",
checkpointId,
lastUploadedFrom,
lastUploadedTo);
+
+ MaterializedState materializedStateCopy = periodicMaterializer.getMaterializedState();
+
return toRunnableFuture(
stateChangelogWriter
.persist(lastUploadedFrom)
- .thenApply(this::buildSnapshotResult));
+ .thenApply(delta -> buildSnapshotResult(delta, materializedStateCopy)));
}
- private SnapshotResult<KeyedStateHandle> buildSnapshotResult(ChangelogStateHandle delta) {
- // Can be called by either task thread during the sync checkpoint phase (if persist future
- // was already completed); or by the writer thread otherwise. So need to synchronize.
- // todo: revisit after FLINK-21357 - use mailbox action?
- synchronized (materialized) {
- // collections don't change once started and handles are immutable
- List<ChangelogStateHandle> prevDeltaCopy = new ArrayList<>(restoredNonMaterialized);
- if (delta != null && delta.getStateSize() > 0) {
- prevDeltaCopy.add(delta);
- }
- if (prevDeltaCopy.isEmpty() && materialized.isEmpty()) {
- return SnapshotResult.empty();
- } else {
- return SnapshotResult.of(
- new ChangelogStateBackendHandleImpl(
- materialized, prevDeltaCopy, getKeyGroupRange()));
- }
+ @Override
+ @VisibleForTesting
+ public void triggerMaterialization() {
+ periodicMaterializer.triggerMaterialization();
+ }
Review comment:
I've played with `ManuallyTriggeredScheduledExecutorService` a bit.
It does not make things cleaner, and it does not reduce the number of exposed
methods/APIs. Instead, it adds complexity to the `PeriodicMaterializer`
constructor: I would now have to expose a way to substitute the executor all
the way down from `createKeyedStateBackend`.
It does not make things cleaner because I want to test that materialization
itself works correctly. To do that, I have to expose the
`triggerMaterialization` method and trigger it manually through the
`ManuallyTriggeredScheduledExecutorService`.
So `triggerMaterialization` has to be exposed anyway. I do not see much
difference between calling it directly in the test thread and calling it
through the `ManuallyTriggeredScheduledExecutorService`; both basically do the
same thing.
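As a rough illustration, here is a minimal, self-contained sketch (not the PR
code). `SketchMaterializer` and `ManualExecutor` are hypothetical, simplified
stand-ins for `PeriodicMaterializer` and
`ManuallyTriggeredScheduledExecutorService`. It only shows that both paths end
up invoking the same method, which has to be visible to the test either way:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class TriggerPathsSketch {

    /** Hypothetical stand-in for PeriodicMaterializer; it only counts runs. */
    static class SketchMaterializer {
        private int materializations = 0;

        // Must be visible to the test in both variants below.
        void triggerMaterialization() {
            materializations++;
        }

        int getMaterializations() {
            return materializations;
        }
    }

    /** Hypothetical, simplified manual executor: queues tasks until told to run them. */
    static class ManualExecutor {
        private final Queue<Runnable> queued = new ArrayDeque<>();

        void execute(Runnable task) {
            queued.add(task);
        }

        void triggerAll() {
            Runnable task;
            while ((task = queued.poll()) != null) {
                task.run();
            }
        }
    }

    /** Variant 1: the test thread calls the method directly. */
    static int runDirectly() {
        SketchMaterializer materializer = new SketchMaterializer();
        materializer.triggerMaterialization();
        return materializer.getMaterializations();
    }

    /** Variant 2: the same method is submitted to the executor, then triggered manually. */
    static int runThroughExecutor() {
        SketchMaterializer materializer = new SketchMaterializer();
        ManualExecutor executor = new ManualExecutor();
        executor.execute(materializer::triggerMaterialization);
        executor.triggerAll();
        return materializer.getMaterializations();
    }

    public static void main(String[] args) {
        System.out.println("direct: " + runDirectly());
        System.out.println("via executor: " + runThroughExecutor());
    }
}
```

In the second variant the executor additionally has to be injected into the
materializer under test, which is the extra constructor plumbing mentioned
above.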
Please let me know whether this makes sense to you, or whether you have a
better suggestion.