okumin commented on a change in pull request #1744:
URL: https://github.com/apache/hive/pull/1744#discussion_r571351354



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java
##########
@@ -81,6 +82,12 @@ public static ReduceWork createReduceWork(
     boolean isAutoReduceParallelism =
         
context.conf.getBoolVar(HiveConf.ConfVars.TEZ_AUTO_REDUCER_PARALLELISM);
 
+    float slowStartMaxSrcFraction = context.conf.getFloat(
+        ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION,
+        ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION_DEFAULT);
+    float slowStartMinSrcFraction = context.conf.getFloat(
+        ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_MIN_SRC_FRACTION,
+        ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_MIN_SRC_FRACTION_DEFAULT);

Review comment:
   I wonder whether we should add new parameters and use them instead of the
ones defined in ShuffleVertexManager.
   Looking through this class, it would be consistent to introduce new
parameters and to use `TEZ_SHUFFLE_VERTEX_MANAGER_{MIN, MAX}_SRC_FRACTION` only
in `DagUtils.java`.
   However, Hive on Tez can still pick up `TEZ_SHUFFLE_VERTEX_MANAGER_{MIN,
MAX}_SRC_FRACTION` in other cases, as in the Tez code linked below. So it might
also be confusing to have two ways to tweak slow-start behavior: one for
auto-reduce parallelism and another for all other cases.
   
https://github.com/apache/tez/blob/73bcabd2bca2536bf4f3673443a8dcdaaf79a4eb/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2893-L2897
   
   My ideas are:
   - use `TEZ_SHUFFLE_VERTEX_MANAGER_{MIN, MAX}_SRC_FRACTION`, as this diff does
   - add params for auto parallelism, like `TEZ_AUTO_REDUCER_PARALLELISM_{MIN, MAX}_SRC_FRACTION`, and use them only for auto parallelism
   - add params to configure slow-start behavior, like `TEZ_SLOW_START_{MIN, MAX}_SRC_FRACTION`, and use them in all cases, meaning Hive on Tez ignores any `TEZ_SHUFFLE_VERTEX_MANAGER_{MIN, MAX}_SRC_FRACTION` configured by a user
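
   To make the second idea concrete, here is a rough, standalone sketch of how the lookup could behave. It uses a plain `Map` in place of `HiveConf`, and the `hive.tez.auto.reducer.parallelism.*` key names are hypothetical, not existing settings. Falling back to the Tez keys when the Hive-specific key is unset is also my assumption; the idea could equally be implemented without a fallback.

```java
import java.util.HashMap;
import java.util.Map;

public class SlowStartSketch {
  // Tez keys behind ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_{MIN, MAX}_SRC_FRACTION.
  public static final String TEZ_MIN = "tez.shuffle-vertex-manager.min-src-fraction";
  public static final String TEZ_MAX = "tez.shuffle-vertex-manager.max-src-fraction";

  // Hypothetical Hive-side keys for the "auto parallelism only" option.
  // These names are illustrative and are not defined in HiveConf.
  public static final String HIVE_MIN = "hive.tez.auto.reducer.parallelism.min.src.fraction";
  public static final String HIVE_MAX = "hive.tez.auto.reducer.parallelism.max.src.fraction";

  /**
   * Resolves one slow-start fraction.
   * With auto reduce parallelism enabled, the Hive-specific key wins when set.
   * Otherwise (or when it is unset) the Tez key is used, then the given default.
   */
  public static float resolve(Map<String, String> conf, boolean isAutoReduceParallelism,
      String hiveKey, String tezKey, float defaultValue) {
    String value = isAutoReduceParallelism ? conf.get(hiveKey) : null;
    if (value == null) {
      value = conf.get(tezKey); // assumed fallback, see the note above
    }
    return value == null ? defaultValue : Float.parseFloat(value);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(TEZ_MIN, "0.4");
    conf.put(HIVE_MIN, "0.1");

    // Defaults are passed in by the caller; 0.25 and 0.75 are used here for illustration.
    System.out.println(resolve(conf, true, HIVE_MIN, TEZ_MIN, 0.25f));
    System.out.println(resolve(conf, false, HIVE_MIN, TEZ_MIN, 0.25f));
    System.out.println(resolve(conf, true, HIVE_MAX, TEZ_MAX, 0.75f));
  }
}
```

   The sketch also shows the downside I mentioned: the same vertex can end up with different fractions depending on whether auto parallelism is on, which a user has to know about.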




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



