okumin commented on code in PR #4477:
URL: https://github.com/apache/hive/pull/4477#discussion_r1265275497


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java:
##########
@@ -148,6 +148,14 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
     // TODO: remove once we have both Fanout and ClusteredWriter available: HIVE-25948
     HiveConf.setIntVar(configuration, HiveConf.ConfVars.HIVEOPTSORTDYNAMICPARTITIONTHRESHOLD, 1);
     HiveConf.setVar(configuration, HiveConf.ConfVars.DYNAMICPARTITIONINGMODE, "nonstrict");
+
+    Context.Operation operation = HiveCustomStorageHandlerUtils.getWriteOperation(configuration,
+            serDeProperties.getProperty(Catalogs.NAME));
+
+    if (operation != null) {
+      HiveConf.setFloatVar(configuration, HiveConf.ConfVars.TEZ_MAX_PARTITION_FACTOR, 1f);

Review Comment:
   Some random thoughts. I would say these are minor.
   - Would it be better to disable over-provisioning only for DELETE reducers, using a Hive hook or something similar?
       - Over-provisioning might still work for INSERT or UPDATE.
       - It could also work for DELETE if rows are filtered.
       - Maybe the optimization should be applied only to the last reducer in a case like Map -> Reduce -> Reduce?
   - Could `hive.tez.auto.reducer.parallelism.min.threshold=0.0` be an option?
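
For context on what "over-provisioning" means here, below is a minimal sketch of how the max partition factor scales the upper bound on reducer partitions. It is a simplified model, not Hive's actual code: the class `PartitionFactorSketch` and the helper `maxPartitions` are hypothetical names, and the values 2.0 (default max partition factor) and 1009 (default reducer cap) are assumptions for illustration.

```java
public class PartitionFactorSketch {

  /**
   * Hypothetical helper: upper bound on reducer partitions under auto reducer
   * parallelism. The estimate is scaled by the max partition factor, floored
   * at 1, and capped by the configured reducer limit.
   */
  static int maxPartitions(int estimatedReducers, float maxPartitionFactor, int maxReducers) {
    int scaled = Math.max(1, (int) (estimatedReducers * maxPartitionFactor));
    return Math.min(scaled, maxReducers);
  }

  public static void main(String[] args) {
    int estimated = 10;  // reducers estimated by the planner (made-up number)
    int cap = 1009;      // assumed reducer cap

    // Assumed default factor of 2.0: twice the estimated reducers are provisioned.
    System.out.println(maxPartitions(estimated, 2f, cap)); // prints 20

    // Factor forced to 1, as in the diff above: no over-provisioning.
    System.out.println(maxPartitions(estimated, 1f, cap)); // prints 10
  }
}
```

With the factor at 1, the upper bound equals the estimate, which is why setting it for every write operation also removes over-provisioning for INSERT and UPDATE.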



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

