amogh-jahagirdar commented on code in PR #14824:
URL: https://github.com/apache/iceberg/pull/14824#discussion_r2615601664
##########
core/src/main/java/org/apache/iceberg/rest/ScanTaskIterable.java:
##########
@@ -240,10 +239,14 @@ public void close() {
}
private boolean isDone() {
- return taskQueue.isEmpty()
+ // Reorder the conditions to make sure TaskQueue is empty is checked last.
+ // It may happen that a worker is about to add a new task to the queue, but before
+ // that happens, taskQueue.isEmpty() is checked then it completes fast before the
+ // activeWorker is decremented. This would lead to a false negative.
+ return activeWorkers.get() == 0
&& planTasks.isEmpty()
- && activeWorkers.get() == 0
- && initialFileScanTasks.isEmpty();
+ && initialFileScanTasks.isEmpty()
+ && taskQueue.isEmpty();
Review Comment:
I was searching around, and offline we discussed a fork-join pool, but
maybe another approach is to use a "poison pill"
(https://github.com/singhpk234/iceberg/pull/275/files) from the producer side
to indicate termination?
Basically, the producer knows best when task production is done, so it
emits a final "notification" via the same queue at the end of the stream to
indicate that things are truly done. That way I think we avoid any
synchronization of "is this done" state between the consumer thread and the
producer? The producer just tells the consumer it's done?
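
To make the idea concrete, here is a minimal sketch of the poison-pill pattern. It is not the code in the linked PR or in `ScanTaskIterable`: the class name `PoisonPillSketch`, the method `produceAndConsume`, and the use of `String` as the task type are all hypothetical, and it assumes a single producer thread. With several workers, the pill would have to be emitted only after the last worker finishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from ScanTaskIterable.
class PoisonPillSketch {
  // Sentinel object compared by identity, so it can never collide with a real task.
  private static final String POISON_PILL = new String("POISON_PILL");

  static List<String> produceAndConsume(List<String> tasks) throws InterruptedException {
    // Unbounded queue, so the producer never blocks when enqueueing.
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    Thread producer = new Thread(() -> {
      try {
        for (String task : tasks) {
          queue.put(task);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        // The producer signals completion itself, as the last item in the queue.
        queue.add(POISON_PILL);
      }
    });
    producer.start();

    List<String> consumed = new ArrayList<>();
    while (true) {
      // take() blocks until an item arrives, so no isDone() polling is needed.
      String next = queue.take();
      if (next == POISON_PILL) {
        break; // Everything enqueued before the pill has already been consumed.
      }
      consumed.add(next);
    }
    producer.join();
    return consumed;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(produceAndConsume(List.of("task-1", "task-2", "task-3")));
  }
}
```

Because the queue is FIFO and the pill is enqueued last, the consumer needs no separate check of `activeWorkers` or `taskQueue.isEmpty()` to decide that it is finished.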
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]