mynameborat commented on a change in pull request #1489:
URL: https://github.com/apache/samza/pull/1489#discussion_r616993155
##########
File path: samza-api/src/main/java/org/apache/samza/checkpoint/Checkpoint.java
##########
@@ -19,54 +19,27 @@
package org.apache.samza.checkpoint;
+import java.util.Map;
import org.apache.samza.system.SystemStreamPartition;
-import java.util.Collections;
-import java.util.Map;
-
-/**
- * A checkpoint is a mapping of all the streams a job is consuming and the
most recent current offset for each.
- * It is used to restore a {@link org.apache.samza.task.StreamTask}, either as
part of a job restart or as part
- * of restarting a failed container within a running job.
- */
-public class Checkpoint {
- private final Map<SystemStreamPartition, String> offsets;
+public interface Checkpoint {
/**
- * Constructs a new checkpoint based off a map of Samza stream offsets.
- * @param offsets Map of Samza streams to their current offset.
+ * Gets the version number of the Checkpoint
+ * @return Short indicating the version number
*/
- public Checkpoint(Map<SystemStreamPartition, String> offsets) {
- this.offsets = offsets;
- }
+ short getVersion();
/**
- * Gets a unmodifiable view of the current Samza stream offsets.
- * @return A unmodifiable view of a Map of Samza streams to their recorded
offsets.
+ * Gets a unmodifiable view of the last processed offsets for {@link
SystemStreamPartition}s.
+ * The returned value differs based on the Checkpoint version:
+ * <ol>
+ * <li>For {@link CheckpointV1}, returns the input {@link
SystemStreamPartition} offsets, as well
+ * as the latest KafkaStateChangelogOffset for any store changelog
{@link SystemStreamPartition} </li>
+ * <li>For {@link CheckpointV2} returns the input offsets only.</li>
+ * </ol>
Review comment:
I should have been a bit more clear. There are some niche use cases that
end up consuming their own state changelogs. so what is the expectation there?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]