gengliangwang commented on code in PR #55636:
URL: https://github.com/apache/spark/pull/55636#discussion_r3174335852


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Changelog.java:
##########
@@ -35,8 +35,34 @@
  *       {@code update_preimage}, or {@code update_postimage}</li>
  *   <li>{@code _commit_version} (connector-defined type, e.g. LONG) — the version containing
  *       this change</li>
- *   <li>{@code _commit_timestamp} (TIMESTAMP) — the timestamp of the commit</li>
+ *   <li>{@code _commit_timestamp} (TIMESTAMP) -- the timestamp of the commit. All rows
+ *       belonging to a single {@code _commit_version} must share the same
+ *       {@code _commit_timestamp}. For streaming reads with post-processing enabled,
+ *       two additional requirements apply:
+ *       <ol>
+ *         <li>All rows of a single commit must appear in the same micro-batch (i.e.

Review Comment:
   You're right on both counts. Updated in e8db78a:
   
   Replaced requirement 2 ("distinct commit versions must have distinct 
timestamps") with the actual invariant: each micro-batch's rows must carry 
`_commit_timestamp` strictly greater than the maximum `_commit_timestamp` of 
any prior micro-batch. The new wording explicitly mentions out-of-order commits 
as a covered case (the `v2@ts=20`, `v3@ts=10` example you gave would now be a 
contract violation).
   
   Also clarified that multiple distinct commits with equal `_commit_timestamp` 
are allowed within a single micro-batch -- only *across* batches does timestamp 
progression need to be strictly increasing. That's strictly weaker than the 
previous "distinct versions must have distinct timestamps" requirement and 
avoids the unrealistic ms-resolution edge case you flagged.
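   
   For illustration, here is a minimal sketch of the timestamp rules described above as a standalone check. It is not code from this PR: `CommitTimestampCheck`, `Row`, and `firstViolation` are hypothetical names, and timestamps are modeled as plain `long` values rather than Spark's TIMESTAMP type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: checks the timestamp rules
// described in the comment above against a sequence of micro-batches.
final class CommitTimestampCheck {

    // Stand-in for a changelog row; only the two metadata columns matter here.
    static final class Row {
        final long commitVersion;
        final long commitTimestamp;

        Row(long commitVersion, long commitTimestamp) {
            this.commitVersion = commitVersion;
            this.commitTimestamp = commitTimestamp;
        }
    }

    // Returns null if no rule is violated, otherwise a description of the
    // first violation found.
    static String firstViolation(List<List<Row>> microBatches) {
        Map<Long, Long> timestampByVersion = new HashMap<>();
        long maxPriorTimestamp = 0L;
        boolean seenPriorBatch = false;

        for (int i = 0; i < microBatches.size(); i++) {
            List<Row> batch = microBatches.get(i);
            long batchMax = Long.MIN_VALUE;

            for (Row row : batch) {
                // All rows of one _commit_version must share one _commit_timestamp.
                Long known = timestampByVersion.putIfAbsent(
                        row.commitVersion, row.commitTimestamp);
                if (known != null && known.longValue() != row.commitTimestamp) {
                    return "batch " + i + ": version " + row.commitVersion
                            + " carries more than one timestamp";
                }
                // Every row must be strictly newer than all rows of prior batches.
                if (seenPriorBatch && row.commitTimestamp <= maxPriorTimestamp) {
                    return "batch " + i + ": timestamp " + row.commitTimestamp
                            + " is not greater than prior maximum " + maxPriorTimestamp;
                }
                batchMax = Math.max(batchMax, row.commitTimestamp);
            }

            // Equal timestamps inside one batch are fine; only the batch
            // maximum is carried forward for the cross-batch comparison.
            if (!batch.isEmpty()) {
                maxPriorTimestamp = seenPriorBatch
                        ? Math.max(maxPriorTimestamp, batchMax)
                        : batchMax;
                seenPriorBatch = true;
            }
        }
        return null;
    }
}
```

   Under this sketch, two commits with equal timestamps in the same batch pass the check, while `v2@ts=20` in one batch followed by `v3@ts=10` in a later batch is reported as a violation.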



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

