stevenzwu commented on a change in pull request #4177:
URL: https://github.com/apache/iceberg/pull/4177#discussion_r824859312



##########
File path: core/src/main/java/org/apache/iceberg/util/ThreadPools.java
##########
@@ -61,6 +62,26 @@ public static ExecutorService getWorkerPool() {
     return WORKER_POOL;
   }
 
+  public static ExecutorService newWorkerPool(String namePrefix, Integer parallelism) {
+    return MoreExecutors.getExitingExecutorService(
+        (ThreadPoolExecutor) Executors.newFixedThreadPool(
+            Optional.ofNullable(parallelism).orElse(WORKER_THREAD_POOL_SIZE),
+            new ThreadFactoryBuilder()
+                .setDaemon(true)
+                .setNameFormat(namePrefix + "-%d")
+                .build()));
+  }
+
+  public static ExecutorService newKeyedWorkerPool(String key, String namePrefix, Integer parallelism) {

Review comment:
       After reviewing the usages of the thread pools, I am also in favor of not sharing thread pools, so that we can avoid the static cache. None of the usages run in parallel tasks:
   * source: split planning (running on the jobmanager or in the single-parallelism StreamingMonitorFunction)
   * sink: the single-parallelism committer
   
   But we do need to add some user doc to clarify the behavior change regarding the I/O thread pool. Previously, there was a single thread pool shared globally per JVM; now there is one pool per source/sink. For example, internally we had a rather unique setup where a single Flink job (running on many taskmanagers) ingests data into dozens or hundreds of Iceberg tables. For such setups, users would need to tune the pool size down, probably to 1, to avoid creating an excessive number of threads in the JVM.
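   As a rough illustration (not part of this PR; everything here other than `ThreadPools.newWorkerPool` is a hypothetical name), a per-source/sink owner could create its own small, user-tuned pool and shut it down when it closes:

```java
// Minimal sketch of per-source/sink pool ownership, assuming the
// newWorkerPool(String, Integer) method added in this PR.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.iceberg.util.ThreadPools;

public class ExamplePerSinkIo implements AutoCloseable {
  private final ExecutorService workerPool;

  public ExamplePerSinkIo(String tableName, Integer poolSize) {
    // For jobs writing to dozens or hundreds of tables, poolSize could be
    // tuned down to 1 to limit the total number of threads in the JVM.
    this.workerPool = ThreadPools.newWorkerPool("iceberg-" + tableName + "-io", poolSize);
  }

  public ExecutorService workerPool() {
    return workerPool;
  }

  @Override
  public void close() throws InterruptedException {
    // The pool is no longer shared across sources/sinks, so each owner
    // shuts down its own pool when it is closed.
    workerPool.shutdown();
    if (!workerPool.awaitTermination(1, TimeUnit.MINUTES)) {
      workerPool.shutdownNow();
    }
  }
}
```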




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


