[GitHub] [iceberg] pvary commented on a change in pull request #2228: Hive: Implement multi-table inserts

GitBox Mon, 22 Mar 2021 02:59:29 -0700


pvary commented on a change in pull request #2228:
URL: https://github.com/apache/iceberg/pull/2228#discussion_r598571069




##########
File path: mr/src/main/java/org/apache/iceberg/mr/InputFormatConfig.java
##########
@@ -58,8 +58,11 @@ private InputFormatConfig() {
   public static final String LOCATION_PROVIDER = "iceberg.mr.location.provider";
   public static final String ENCRYPTION_MANAGER = "iceberg.mr.encription.manager";
 
-  public static final String COMMIT_THREAD_POOL_SIZE = "iceberg.mr.commit.thread.pool.size";
-  public static final int COMMIT_THREAD_POOL_SIZE_DEFAULT = 10;
+  public static final String OUTPUT_TABLES = "iceberg.mr.output.tables";
+  public static final String COMMIT_TABLE_THREAD_POOL_SIZE = "iceberg.mr.commit.table.thread.pool.size";

Review comment:
   I can be convinced not to do it in parallel (my first patch contained no 
parallel execution). It really depends on how long an Iceberg commit takes. 
I have seen generated BI queries with more than 10 target tables, but 
usually we will have only a single table as a target.
   
   If we decide to keep the option to run the Iceberg commits in parallel, I 
would like to keep it as a separate pool with its own sizing. So if 
`iceberg.mr.commit.table.thread.pool.size` is set to `<=1`, or we have only a 
single target table, then we will run the commit in the same thread and will 
not create an executor pool for the Iceberg commits.
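
   A rough sketch of the rule described above, to make the proposal concrete. 
This is not the code in the PR: the class `TableCommitSketch`, the methods 
`useCallerThread` and `commitTables`, and the `commitOne` callback are all 
hypothetical names, and the pool size is assumed to have already been read 
from `iceberg.mr.commit.table.thread.pool.size`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical illustration only; names do not come from the Iceberg code base.
final class TableCommitSketch {

  // No pool is needed when the configured size is <= 1 or there is at most one target table.
  static boolean useCallerThread(int poolSize, int tableCount) {
    return poolSize <= 1 || tableCount <= 1;
  }

  // Commits every table, either inline or on a dedicated, separately sized pool.
  static void commitTables(List<String> tables, int poolSize, Consumer<String> commitOne) {
    if (useCallerThread(poolSize, tables.size())) {
      // Common case: single target table, so commit in the calling thread.
      for (String table : tables) {
        commitOne.accept(table);
      }
      return;
    }

    // Multi-table case: a separate pool, never larger than the number of tables.
    ExecutorService pool = Executors.newFixedThreadPool(Math.min(poolSize, tables.size()));
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String table : tables) {
        futures.add(pool.submit(() -> commitOne.accept(table)));
      }
      // Wait for all commits and surface the first failure, if any.
      for (Future<?> future : futures) {
        future.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while committing tables", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Table commit failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

   With this shape the single-table path pays no executor start-up cost, and 
the sizing of the table-level pool stays independent of any other pool used 
during the commit.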




