amogh-jahagirdar commented on code in PR #14824:
URL: https://github.com/apache/iceberg/pull/14824#discussion_r2615601664


##########
core/src/main/java/org/apache/iceberg/rest/ScanTaskIterable.java:
##########
@@ -240,10 +239,14 @@ public void close() {
     }
 
     private boolean isDone() {
-      return taskQueue.isEmpty()
+      // Reorder the conditions to make sure TaskQueue is empty is checked last.
+      // It may happen that a worker is about to add a new task to the queue, but before
+      // that happens, taskQueue.isEmpty() is checked then it completes fast before the
+      // activeWorker is decremented. This would lead to a false negative.
+      return activeWorkers.get() == 0
           && planTasks.isEmpty()
-          && activeWorkers.get() == 0
-          && initialFileScanTasks.isEmpty();
+          && initialFileScanTasks.isEmpty()
+          && taskQueue.isEmpty();

Review Comment:
   I was searching around, and offline we discussed a fork-join pool, but 
maybe another approach is to use a "poison pill" 
https://github.com/singhpk234/iceberg/pull/275/files from the producer side to 
indicate termination?
   
   Basically, the producer knows best when task production is done, and it 
effectively emits a final "notification" via the same queue at the end of the 
stream to indicate that things are truly done. That way I think we avoid any 
synchronization of "is this done" state between the consumer thread and the 
producer? The producer just tells the consumer it's done.
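
To make the poison-pill idea concrete, here is a minimal, standalone sketch. It is not the code from the linked PR or from `ScanTaskIterable`; the class name `PoisonPillSketch`, the `DONE` sentinel, and the `drain` helper are all hypothetical, and it assumes a fixed number of producers known up front. The last producer to finish puts the sentinel on the same queue, so the consumer never has to inspect worker counts or queue emptiness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class PoisonPillSketch {
  // Hypothetical sentinel. It is compared by reference (==), so a real task
  // that happens to have the same text can never be mistaken for it.
  private static final String DONE = new String("DONE");

  static List<String> drain(int producers, int tasksPerProducer)
      throws InterruptedException {
    List<String> consumed = new ArrayList<>();
    if (producers <= 0) {
      return consumed; // nobody would ever emit the sentinel
    }

    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    AtomicInteger remaining = new AtomicInteger(producers);
    List<Thread> threads = new ArrayList<>();

    for (int p = 0; p < producers; p++) {
      final int id = p;
      Thread t = new Thread(() -> {
        try {
          for (int i = 0; i < tasksPerProducer; i++) {
            queue.put("task-" + id + "-" + i);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          // Every producer enqueues all of its tasks before decrementing, so
          // when the count reaches zero the queue already holds every task
          // and the sentinel is guaranteed to be the last element.
          if (remaining.decrementAndGet() == 0) {
            queue.offer(DONE); // unbounded queue, offer always succeeds
          }
        }
      });
      threads.add(t);
      t.start();
    }

    // Consumer side: block on take() until the sentinel arrives.
    while (true) {
      String next = queue.take();
      if (next == DONE) {
        break;
      }
      consumed.add(next);
    }

    for (Thread t : threads) {
      t.join();
    }
    return consumed;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(drain(3, 4).size() + " tasks consumed");
  }
}
```

The counter still exists in this sketch, but only producers touch it, and the consumer's single termination signal travels through the queue in order with the data. If workers can discover additional plan tasks while running, the "last producer" condition would need to account for that, which this sketch does not attempt.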



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

