stevenzwu opened a new pull request, #16293:
URL: https://github.com/apache/iceberg/pull/16293

   ## Summary
   
   For format version 4 and above, this change makes `SnapshotProducer` produce 
snapshot timestamps that are strictly greater than the parent snapshot's 
`timestamp-ms`, using a Lamport-clock style fast-forward when the wall clock 
has drifted backward.
   
   - Adds `TableMetadata.MIN_FORMAT_VERSION_MONOTONIC_TIMESTAMPS = 4`.
   - In `SnapshotProducer.commit()`, replaces `System.currentTimeMillis()` with 
`snapshotTimestampMillis(parentSnapshot)`. The new helper returns 
`clock.millis()` for V1–V3 (and for V4 root snapshots with no parent), and 
`max(clock.millis(), parentSnapshot.timestampMillis() + 1)` for V4+ with a 
parent on the target branch. This preserves wall-clock timestamps in the steady 
state and only fast-forwards by the minimum amount needed when the clock has 
drifted backward.
   - Adds an `@VisibleForTesting` `setClock(Clock)` hook so the Lamport 
behavior can be exercised deterministically.
   
   This is a behavior change for V4 and above only. V1–V3 commits continue to use 
the wall clock unchanged.
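   
For reviewers, the timestamp rule described above can be sketched in isolation. This is a minimal stand-alone illustration, not the code in this PR: the class name `MonotonicTimestampSketch`, the static method signature, and the nullable `parentTimestampMillis` parameter are assumptions made for the example. Only the constant name and the `max(clock.millis(), parent + 1)` rule come from the description.

```java
import java.time.Clock;

final class MonotonicTimestampSketch {
  // Same name and value as the constant this PR adds to TableMetadata.
  static final int MIN_FORMAT_VERSION_MONOTONIC_TIMESTAMPS = 4;

  /**
   * Hypothetical stand-alone version of the helper.
   *
   * @param parentTimestampMillis the parent snapshot's timestamp-ms, or null for a root snapshot
   */
  static long snapshotTimestampMillis(
      Clock clock, int formatVersion, Long parentTimestampMillis) {
    long now = clock.millis();

    // V1-V3, or a snapshot with no parent: plain wall clock.
    if (formatVersion < MIN_FORMAT_VERSION_MONOTONIC_TIMESTAMPS
        || parentTimestampMillis == null) {
      return now;
    }

    // V4+ with a parent: keep the wall clock when it is already ahead,
    // otherwise fast-forward to one millisecond past the parent.
    return Math.max(now, parentTimestampMillis + 1);
  }
}
```

In the steady state the first argument of `Math.max` wins, so timestamps still track the wall clock. The `parent + 1` branch only applies when the clock reads at or before the parent's timestamp.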
   
   ## Why
   
   The format-v4 row-timestamp feature exposes each manifest entry's 
`commit_timestamp_ms` as the `_last_updated_timestamp_ms` metadata column. 
Time-travel and "latest update" queries on that column become incoherent if a 
backward jump in the writer's wall clock produces snapshots whose 
`timestamp-ms` does not strictly increase with sequence number. Enforcing 
monotonicity at write time keeps the per-row "last updated" timestamp 
consistent with snapshot order.
   
   The change is also defensive against clock skew between writers in 
distributed environments: a stale clock that briefly reports a past time no 
longer rewrites history.
   
   ## Test plan
   
   - [x] `./gradlew :iceberg-core:test --tests 
org.apache.iceberg.TestSnapshotProducer` (29 tests, 6 skipped for pre-V4 by 
`assumeThat`, 0 failures)
   - [x] `./gradlew :iceberg-core:spotlessCheck`
   - [ ] CI
   
   New tests in `TestSnapshotProducer`:
   - `testSnapshotTimestampsAreMonotonicallyIncreasing`: three 
consecutive V4 appends produce strictly increasing `timestamp-ms`.
   - `testV4LamportClockFastForwardsDriftedClock`: with the producer's 
`Clock` pinned 10s in the past, the second V4 snapshot's `timestamp-ms` equals 
`firstTs + 1`, confirming the fast-forward branch is exercised.
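   
The drifted-clock scenario can be reproduced outside the test suite with `java.time.Clock` alone. The sketch below is only an illustration under stated assumptions: `DriftedClockSketch`, `nextTimestamp`, and `commitTwice` are hypothetical names, and the real test drives `SnapshotProducer` through the `setClock(Clock)` hook rather than calling a helper like this.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;

final class DriftedClockSketch {
  // Stand-in for the V4+ rule when a parent snapshot exists.
  static long nextTimestamp(Clock clock, long parentTimestampMillis) {
    return Math.max(clock.millis(), parentTimestampMillis + 1);
  }

  // Simulates a root commit followed by a second commit made with another clock.
  static long[] commitTwice(Clock firstClock, Clock secondClock) {
    long firstTs = firstClock.millis(); // root snapshot: wall clock as-is
    long secondTs = nextTimestamp(secondClock, firstTs);
    return new long[] {firstTs, secondTs};
  }

  public static void main(String[] args) {
    Clock base = Clock.fixed(Instant.ofEpochMilli(1_700_000_000_000L), ZoneOffset.UTC);
    // Second writer's clock is pinned 10 seconds behind the first commit.
    Clock drifted = Clock.offset(base, Duration.ofSeconds(-10));

    long[] ts = commitTwice(base, drifted);
    System.out.println("first=" + ts[0] + " second=" + ts[1]);
  }
}
```

Because the drifted clock reads earlier than the first snapshot, the second timestamp falls back to `firstTs + 1`, which is the condition the new test asserts.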
   
   ## Related
   
   A separate PR will propose the corresponding [table 
spec](../tree/main/format/spec.md) change documenting the 
monotonic-`timestamp-ms` requirement for V4.
   
   Made with [Cursor](https://cursor.com)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

