[ https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126125#comment-16126125 ]

Xuefu Zhang commented on HIVE-17291:
------------------------------------

For test result stability, I'm fine with whatever solution. It would be 
interesting to find out why this is an issue even with Rui's magic in QTestUtil.

As for production, dynamic allocation is most likely enabled in order to share 
resources among users. The problem I see with deriving parallelism from the 
currently available executors is the dynamic nature of those executors. 
Typically, each user or user group has a queue, which is specified when a query 
is submitted, and the capacity of the queue is decided in YARN. Further, the 
availability of YARN containers is not guaranteed, which becomes apparent when 
the cluster is busy. In this mode, using available executors can underestimate 
the number of reducers needed; the same happens when the Spark client is 
initially launched. In addition, with dynamic allocation, minExecutors (usually 
0) and initialExecutors (usually a small number) are not guaranteed either, and 
maxExecutors (usually a large number) only puts an upper limit on parallelism. 
In a production env, users are unlikely to override what the admin sets as 
defaults. Given such uncertainty, I don't think we should determine parallelism 
based on available executors. I'd propose that we use "size per reducer" to 
decide the number of reducers, possibly further constrained by what 
maxExecutors allows.
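
To make the proposal concrete, here is a minimal sketch of what I have in mind 
(not actual Hive code; the helper and its parameters are hypothetical, mirroring 
{{hive.exec.reducers.bytes.per.reducer}}, {{spark.dynamicAllocation.maxExecutors}} 
and {{spark.executor.cores}}):

{code:java}
// Minimal sketch (hypothetical helper, not Hive's actual code): derive reducer
// parallelism from "size per reducer" and cap it by what maxExecutors allows.
public final class ReducerParallelismSketch {

    /**
     * @param totalInputBytes  estimated input size feeding the reduce stage
     * @param bytesPerReducer  e.g. hive.exec.reducers.bytes.per.reducer
     * @param maxExecutors     e.g. spark.dynamicAllocation.maxExecutors
     * @param coresPerExecutor e.g. spark.executor.cores
     */
    public static int estimateReducers(long totalInputBytes, long bytesPerReducer,
                                       int maxExecutors, int coresPerExecutor) {
        // Size-based estimate: one reducer per "size per reducer" worth of data.
        int bySize = (int) Math.max(1, (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer);
        // Upper bound: the most parallelism dynamic allocation could ever hand out.
        int cap = Math.max(1, maxExecutors * coresPerExecutor);
        return Math.min(bySize, cap);
    }

    public static void main(String[] args) {
        // 100 GB input, 256 MB per reducer, at most 50 executors with 4 cores each -> 200 reducers.
        System.out.println(estimateReducers(100L << 30, 256L << 20, 50, 4));
    }
}
{code}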

Static allocation is mostly useful for benchmarking and is less likely to be 
used in a multi-tenant env. Even in this mode, {{spark.executor.instances}} is 
not guaranteed either. However, once the client gets an executor, it never gives 
it up, so it is reasonable to determine the number of reducers from the 
available executors. This comes with a catch: for the first query, the executors 
may still be starting up. In that case, I think it should be okay to use 
{{spark.executor.instances}}. In short, with static allocation we can use 
available executors to determine reducer parallelism, and fall back to 
{{spark.executor.instances}} when info about available executors is not yet 
known.
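
For the static case, the fallback could look roughly like this (again a 
hypothetical sketch, not Hive's actual code):

{code:java}
// Sketch of the static-allocation fallback (hypothetical helper, not Hive's actual code).
public final class StaticAllocationSketch {

    /**
     * @param availableExecutors  executor count reported by the Spark client, or 0 if not yet known
     * @param configuredInstances value of spark.executor.instances
     * @param coresPerExecutor    value of spark.executor.cores
     */
    public static int reducerParallelism(int availableExecutors, int configuredInstances,
                                         int coresPerExecutor) {
        // With static allocation the client never gives an executor back, so the live
        // count is trustworthy; before the first executors register, fall back to the
        // configured instance count.
        int executors = availableExecutors > 0 ? availableExecutors : configuredInstances;
        return Math.max(1, executors * coresPerExecutor);
    }
}
{code}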

Any thoughts?

> Set the number of executors based on config if client does not provide 
> information
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-17291
>                 URL: https://issues.apache.org/jira/browse/HIVE-17291
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>         Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores, if the client does not provide the 
> information, we should try to use the configured defaults. This can happen 
> on startup, when {{spark.dynamicAllocation.enabled}} is not enabled.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
