This is an automated email from the ASF dual-hosted git repository.

findepi pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/main by this push:
     new bcb32818da Drop ParallelIterable's queue low water mark (#10978)
bcb32818da is described below

commit bcb32818dab866a81539e04ae807c3a14e0e625c
Author: Piotr Findeisen <[email protected]>
AuthorDate: Thu Aug 22 00:04:57 2024 +0200

    Drop ParallelIterable's queue low water mark (#10978)
    
    As part of the change in commit
    7831a8dfc3a2de546ca069f4fc1e7afd03777554, queue low water mark was
    introduced. However, it resulted in increased number of manifests being
    read when planning LIMIT queries in Trino Iceberg connector. To avoid
    increased I/O, back out the change for now.
---
 core/src/main/java/org/apache/iceberg/util/ParallelIterable.java | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/core/src/main/java/org/apache/iceberg/util/ParallelIterable.java 
b/core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
index 27cd96a397..40bdf1e0c4 100644
--- a/core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
+++ b/core/src/main/java/org/apache/iceberg/util/ParallelIterable.java
@@ -195,12 +195,9 @@ public class ParallelIterable<T> extends CloseableGroup 
implements CloseableIter
       // If the consumer is processing records more slowly than the producers, 
the producers will
       // eventually fill the queue and yield, returning continuations. 
Continuations and new tasks
       // are started by checkTasks(). The check here prevents us from 
restarting continuations or
-      // starting new tasks too early (when queue is almost full) or too late 
(when queue is already
-      // emptied). Restarting too early would lead to tasks yielding very 
quickly (CPU waste on
-      // scheduling). Restarting too late would mean the consumer may need to 
wait for the tasks
-      // to produce new items. A consumer slower than producers shouldn't need 
to wait.
-      int queueLowWaterMark = maxQueueSize / 2;
-      if (queue.size() > queueLowWaterMark) {
+      // starting new tasks before the queue is emptied. Restarting too early 
would lead to tasks
+      // yielding very quickly (CPU waste on scheduling).
+      if (!queue.isEmpty()) {
         return true;
       }
 

Reply via email to