szehon-ho commented on a change in pull request #3775:
URL: https://github.com/apache/iceberg/pull/3775#discussion_r772695828
##########
File path: core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java
##########
@@ -102,43 +102,37 @@ public static Snapshot oldestAncestor(Table table) {
}
/**
- * Traverses the history of the table's current snapshot and:
- * 1. returns null, if no snapshot exists or target timestamp is more recent than the current snapshot.
- * 2. else return the first snapshot which satisfies {@literal >=} targetTimestamp.
- * <p>
- * Given the snapshots (with timestamp): [S1 (10), S2 (11), S3 (12), S4 (14)]
- * <p>
- * firstSnapshotAfterTimestamp(table, x {@literal <=} 10) = S1
- * firstSnapshotAfterTimestamp(table, 11) = S2
- * firstSnapshotAfterTimestamp(table, 13) = S4
- * firstSnapshotAfterTimestamp(table, 14) = S4
- * firstSnapshotAfterTimestamp(table, x {@literal >} 14) = null
- * <p>
- * where x is the target timestamp in milliseconds and Si is the snapshot
+ * Traverses the history of the table's current snapshot and finds the first snapshot committed after the given time.
*
* @param table a table
- * @param targetTimestampMillis a timestamp in milliseconds
- * @return the first snapshot which satisfies {@literal >=} targetTimestamp, or null if the current snapshot is
- * more recent than the target timestamp
+ * @param timestampMillis a timestamp in milliseconds
+ * @return the first snapshot after the given timestamp, or null if the current snapshot is older than the timestamp
+ * @throws IllegalStateException if the first ancestor after the given time can't be determined
*/
- public static Snapshot firstSnapshotAfterTimestamp(Table table, Long targetTimestampMillis) {
- Snapshot currentSnapshot = table.currentSnapshot();
- // Return null if no snapshot exists or target timestamp is more recent than the current snapshot
- if (currentSnapshot == null || currentSnapshot.timestampMillis() < targetTimestampMillis) {
+ public static Snapshot oldestAncestorAfter(Table table, long timestampMillis) {
+ if (table.currentSnapshot() == null) {
+ // there are no snapshots or ancestors
return null;
}
- // Return the oldest snapshot which satisfies >= targetTimestamp
Snapshot lastSnapshot = null;
for (Snapshot snapshot : currentAncestors(table)) {
- if (snapshot.timestampMillis() < targetTimestampMillis) {
+ if (snapshot.timestampMillis() < timestampMillis) {
return lastSnapshot;
+ } else if (snapshot.timestampMillis() == timestampMillis) {
+ return snapshot;
Review comment:
OK, thanks for the context that this is a new method rather than a change to
the existing one.
The previous code is slightly simpler (one less case for the reader to
understand, and as far as I can tell it still works, it just takes one more
iteration), but it's not a big deal since the new case is straightforward.
If it were me, I'd prefer the clarity of Stream methods, but I guess we mostly
do manual traversals in Iceberg for performance reasons.
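```
// needs java.util.stream.StreamSupport; assumes currentAncestors(table) returns an Iterable<Snapshot>
StreamSupport.stream(currentAncestors(table).spliterator(), false)
    .filter(snapshot -> snapshot.timestampMillis() <= timestampMillis)
    .findFirst()
```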
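To make the comparison concrete, here is a rough, self-contained sketch of the three variants being discussed: the loop with the early return on an exact match, the same loop without that branch, and a Stream pipeline. Everything in it is an assumption for illustration: `Snap` is a hypothetical stand-in for a snapshot, the ancestors are a plain list ordered newest to oldest, and the fall-through after the loop simply returns the last snapshot seen instead of doing whatever check the real method does there. The Stream variant uses `takeWhile` (Java 9+) plus `reduce`, since the value wanted is the last ancestor that is still at or after the timestamp.

```java
import java.util.Arrays;
import java.util.List;

public class AncestorSketch {
  /** Hypothetical stand-in for a snapshot: an id and a commit timestamp. */
  static final class Snap {
    final String id;
    final long timestampMillis;

    Snap(String id, long timestampMillis) {
      this.id = id;
      this.timestampMillis = timestampMillis;
    }
  }

  /** Manual traversal (newest to oldest) with the early return on an exact match. */
  static Snap viaLoop(List<Snap> ancestors, long timestampMillis) {
    Snap last = null;
    for (Snap snap : ancestors) {
      if (snap.timestampMillis < timestampMillis) {
        return last;
      } else if (snap.timestampMillis == timestampMillis) {
        return snap;
      }
      last = snap;
    }
    // Simplification for this sketch: history exhausted, return the oldest one seen.
    return last;
  }

  /** Same traversal without the equality branch; an exact match costs one more iteration. */
  static Snap viaLoopNoEquality(List<Snap> ancestors, long timestampMillis) {
    Snap last = null;
    for (Snap snap : ancestors) {
      if (snap.timestampMillis < timestampMillis) {
        return last;
      }
      last = snap;
    }
    return last;
  }

  /** Stream variant: keep ancestors while they are at or after the timestamp, take the last. */
  static Snap viaStream(List<Snap> ancestors, long timestampMillis) {
    return ancestors.stream()
        .takeWhile(snap -> snap.timestampMillis >= timestampMillis)
        .reduce((newer, older) -> older)
        .orElse(null);
  }

  static String name(Snap snap) {
    return snap == null ? "null" : snap.id;
  }

  public static void main(String[] args) {
    // Snapshots from the removed Javadoc example, listed newest first.
    List<Snap> ancestors = Arrays.asList(
        new Snap("S4", 14), new Snap("S3", 12), new Snap("S2", 11), new Snap("S1", 10));
    for (long t : new long[] {5, 11, 13, 15}) {
      System.out.println(t + " -> " + name(viaLoop(ancestors, t))
          + " / " + name(viaLoopNoEquality(ancestors, t))
          + " / " + name(viaStream(ancestors, t)));
    }
  }
}
```

On this sample history all three variants agree: S1 for 5, S2 for 11, S4 for 13, and null for 15. The equality branch only saves one iteration on an exact match, so the choice between them is about readability and allocation cost rather than behavior.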
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]