[
https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126125#comment-16126125
]
Xuefu Zhang commented on HIVE-17291:
------------------------------------
For test result stability, I'm fine with whatever solution. It would be
interesting to find out why this is an issue even with Rui's magic in QTestUtil.
As to production, most likely dynamic allocation is enabled for sharing
resources among users. The problem I see with dynamically determining
parallelism with available executors is the dynamic nature of the executors.
Typically, each user or user group has a queue, which is specified when a query
is submitted. The capacity of the queue is configured in YARN. Further, the
availability of YARN containers is not guaranteed, which is especially
apparent when the cluster is busy. In this mode, using available executors can
underestimate the number of reducers needed. The same happens when the Spark
client is initially launched. In addition, with dynamic allocation,
minExecutors (usually 0) and initialExecutors (usually a small number) are not
guaranteed either, and maxExecutors (usually a large number) only puts an upper
limit. In a production env, users are less likely to override what the admin
sets as defaults. Given such uncertainty, I don't think we should determine
parallelism based on available executors. I'd propose that we use "size per
reducer" to decide the number of reducers, and then cap that number by what
maxExecutors allows.
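
To make that concrete, here is a rough sketch (not actual Hive code; the class
and parameter names below are made up for illustration) of deriving the
reducer count from bytes per reducer and capping it by what maxExecutors
allows:

{code:java}
// Illustrative sketch only -- estimateReducers, totalInputBytes, bytesPerReducer,
// maxExecutors and coresPerExecutor are hypothetical names, not Hive APIs.
public final class ReducerEstimator {

  /**
   * Derive reducer parallelism from "size per reducer"
   * (hive.exec.reducers.bytes.per.reducer) and cap it by the parallelism
   * that spark.dynamicAllocation.maxExecutors allows.
   */
  static int estimateReducers(long totalInputBytes, long bytesPerReducer,
                              int maxExecutors, int coresPerExecutor) {
    // Base estimate: one reducer per bytesPerReducer of input, at least 1.
    int bySize =
        (int) Math.max(1, (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer);
    // Upper bound derived from the admin-configured maxExecutors, not from
    // the currently available (and fluctuating) executors.
    int maxParallelism = Math.max(1, maxExecutors * coresPerExecutor);
    return Math.min(bySize, maxParallelism);
  }

  public static void main(String[] args) {
    // e.g. 100 GB of input, 256 MB per reducer, maxExecutors=100, 2 cores each
    System.out.println(estimateReducers(100L << 30, 256L << 20, 100, 2)); // 200
  }
}
{code}
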
Static allocation is mostly useful for benchmarking and is less likely to be
used in a multi-tenant env. Even under this mode, {{spark.executor.instances}}
is not guaranteed either. However, once the client gets an executor, it never
gives it up. Thus, it's useful to determine the number of reducers from the
available executors. This comes with a catch, though, concerning the first
query run while the executors are still starting. For this case, I think it
should be okay to use {{spark.executor.instances}}. In short, with static
allocation, we can use available executors to determine reducer parallelism
and fall back to {{spark.executor.instances}} when the available executor
count is not yet known.
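
As a sketch of that fallback (the SparkClient interface and getExecutorCount()
below are hypothetical stand-ins for whatever actually reports the live
executor count):

{code:java}
import org.apache.spark.SparkConf;

public final class StaticAllocationParallelism {

  // Hypothetical stand-in for the component that reports live executors.
  interface SparkClient {
    int getExecutorCount();
  }

  static int reducerParallelism(SparkConf conf, SparkClient client) {
    int available = client.getExecutorCount();  // may be 0 right after startup
    if (available > 0) {
      // Under static allocation an executor, once obtained, is never given up,
      // so the live count is a reliable basis for reducer parallelism.
      return available;
    }
    // First query, executors still coming up: fall back to the configured value.
    return conf.getInt("spark.executor.instances", 1);
  }
}
{code}
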
Any thoughts?
> Set the number of executors based on config if client does not provide
> information
> ----------------------------------------------------------------------------------
>
> Key: HIVE-17291
> URL: https://issues.apache.org/jira/browse/HIVE-17291
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 3.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores, if the client does not provide the
> information, we should try to fall back to the configured defaults. This can
> happen on startup, when {{spark.dynamicAllocation.enabled}} is not enabled
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)