alexprosak commented on code in PR #13824:
URL: https://github.com/apache/iceberg/pull/13824#discussion_r2303227600


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java:
##########
@@ -523,6 +523,11 @@ public ReadLimit getDefaultReadLimit() {
     }
   }
 
+  @Override
+  public void prepareForTriggerAvailableNow() {

Review Comment:
   Yeah, it could be that if AvailableNow is started and new snapshots are added 
while the trigger is still active, they will be processed when they shouldn't 
be. I was seeing if a more trivial no-op would work, but agreed here.
   
   Added logic to record the last offset the stream should process, and to use 
it in `latestOffset` to break out of the `shouldContinueReading` loop, as well 
as a test for this case.
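   The idea above (record an upper-bound offset when the AvailableNow trigger starts, then have `latestOffset` stop advancing once that bound is hit) can be sketched roughly as below. This is a hypothetical, simplified stand-in, not the actual `SparkMicroBatchStream` code: the class name, the `List<Long>` snapshot model, and the offset arithmetic are all assumptions for illustration.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   // Hypothetical sketch: bound an AvailableNow run to the snapshots that
   // existed when the trigger started, so later snapshots are excluded.
   public class AvailableNowBound {
     private final List<Long> snapshots = new ArrayList<>(); // stand-in for table snapshots
     private Long availableNowEndOffset = null;              // bound recorded at trigger start
   
     void appendSnapshot(long id) {
       snapshots.add(id);
     }
   
     // Analogous to prepareForTriggerAvailableNow(): record the last offset
     // the stream should process in this run.
     void prepareForTriggerAvailableNow() {
       availableNowEndOffset = snapshots.isEmpty() ? null : (long) snapshots.size();
     }
   
     // Analogous to latestOffset(): advance while shouldContinueReading holds,
     // breaking out early once the recorded AvailableNow bound is reached.
     long latestOffset(long startOffset) {
       long offset = startOffset;
       while (shouldContinueReading(offset)) {
         offset++;
       }
       return offset;
     }
   
     private boolean shouldContinueReading(long offset) {
       if (availableNowEndOffset != null && offset >= availableNowEndOffset) {
         return false; // stop at the bound captured when the trigger started
       }
       return offset < snapshots.size();
     }
   
     public static void main(String[] args) {
       AvailableNowBound stream = new AvailableNowBound();
       stream.appendSnapshot(1L);
       stream.appendSnapshot(2L);
       stream.prepareForTriggerAvailableNow(); // bound = 2
       stream.appendSnapshot(3L);              // arrives after the trigger started
       System.out.println(stream.latestOffset(0L)); // prints 2, not 3
     }
   }
   ```
   
   The test case described in the comment would follow the same shape: append snapshots, start the trigger, append another snapshot, and assert the reported offset excludes it.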



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java:
##########
@@ -523,6 +523,11 @@ public ReadLimit getDefaultReadLimit() {
     }
   }
 
+  @Override
+  public void prepareForTriggerAvailableNow() {
+    LOG.info("The streaming query reports to use Trigger.AvailableNow");

Review Comment:
   Added a log line recording the last offset.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

