kbendick commented on a change in pull request #1446:
URL: https://github.com/apache/iceberg/pull/1446#discussion_r488308006



##########
File path: core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
##########
@@ -98,6 +99,22 @@ private boolean checkTasks() {
 
       for (int i = 0; i < taskFutures.length; i += 1) {
         if (taskFutures[i] == null || taskFutures[i].isDone()) {
+          if (taskFutures[i] != null) {
+            // check for task failure and re-throw any exception
+            try {
+              taskFutures[i].get();

Review comment:
       It is definitely the appropriate way to get the result from a future, 
but I'm used to blocking on all of them or on a batch of them at one time (say 
2 or so at one time) via some kind of join. However, this is absolutly just 
because I reviewed this when I was too tired. I should have definitely waited 
until I had fresh eyes to consider. I completely overlooked the part where you 
have `taskFutures[i].isDone()`. So that's totally my bad.
   
   But now that you explain how this class is utilized, in particular that the 
consumer is single threaded, I better understand why this is written the way it 
is. 
   
   I do very much appreciate you taking the time to explain to me how the class 
is utilized:
   - The single threaded consumer
   - The avoidance of large memory costs in scan planning for presto, 
particularly with limit queries.
   - As well as the fact that the consumer might stop calling next.
   
   I had mistakenly assumed that by this point in time, the iterator would 
definitely iterate over all manifests.
   
   To further explain what I had thinking (since you graciously took the time 
to explain this classes utilization in detail), what I would have expected 
would be to loop over two of them or N of them, then await (via a join or some 
call to .get) on N at once. But looking at it again after getting more rest, I 
can see where I was mistaken. Especially since you have the check for 
`taskFutures[i].isDone()`. So I do apologize for leaving this review when I 
should have probably slept on it and then looked at it again with fresh eyes.
   
   Again, thanks for taking the time to help me better understand this classes 
utilization and the unique concerns of various query engines. I don't have much 
presto experience, so this was definitely enlightening.
   
   Thanks @rdblue!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to