[ https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125508#comment-16125508 ]

Rui Li commented on HIVE-17291:
-------------------------------

[~pvary], thanks so much for working in the middle of the night - I'll make 
sure to take time difference into consideration next time I use any magic :) :)
bq. When we change the spark configuration the RpcServer is killed, and a new 
one is started
More precisely, it's the Spark session that gets killed and restarted, not the 
RpcServer. The RPC configs were made immutable by HIVE-16876.
bq. What happens when a query uses a wrong number of reducers? The query will 
run, but will result in slower execution? Out of memory?
Yes. But I don't think using mem/cores will lead to a "wrong number of reducers". 
We always compute numReducers based on data size first; mem/cores is then used 
as an attempt to have more reducers, not fewer:
{code}
            // If there are more cores, use the number of cores
            numReducers = Math.max(numReducers, sparkMemoryAndCores.getSecond());
{code}
The reason is that Spark tasks are cheap to launch but tend to require more 
memory, so it's safer to have more of them when we can. On the other hand, we 
still don't want many more than are really needed, so it's better to use the 
available cores than the configured number - that way, when the adjustment 
takes effect, all the reducers finish in one round (assuming static allocation). 
So I think this is a useful, or at least harmless, optimization when 
automatically deciding numReducers.
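To make the ordering concrete, here is a tiny sketch with made-up numbers 
(hypothetical values, not the actual Hive code): the data-size estimate comes 
first, and the available-core count can only raise it, never lower it.
{code}
// Hypothetical sketch of the ordering described above, not the real code.
long totalInputSize = 10L * 1024 * 1024 * 1024;   // say 10GB feeding the reduce stage
long bytesPerReducer = 256L * 1024 * 1024;        // hive.exec.reducers.bytes.per.reducer
int maxReducers = 1009;                           // hive.exec.reducers.max

// Step 1: data size decides the baseline number of reducers.
int numReducers = (int) Math.min(maxReducers,
    (totalInputSize + bytesPerReducer - 1) / bytesPerReducer);   // -> 40

// Step 2: available cores (sparkMemoryAndCores.getSecond()) can only bump it up.
int availableCores = 60;
numReducers = Math.max(numReducers, availableCores);             // -> 60, never fewer
{code}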
For stable tests, I'd prefer to stick to what we do in QTestUtil. Let me know 
your opinions. Thanks.

> Set the number of executors based on config if client does not provide 
> information
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-17291
>                 URL: https://issues.apache.org/jira/browse/HIVE-17291
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>         Attachments: HIVE-17291.1.patch
>
>
> When calculating the memory and cores, if the client does not provide the 
> information, we should try to use the values configured by default. This can 
> happen on startup, when {{spark.dynamicAllocation.enabled}} is not enabled.
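For illustration only, here is a rough sketch of that fallback (hypothetical 
helper and defaults, not the actual HIVE-17291 patch): when dynamic allocation 
is disabled and the client has not reported its resources, the total core count 
could be derived from the static executor settings.
{code}
// Hypothetical helper, not the actual patch: fall back to the configured
// static allocation when the client provides no information.
static int estimateTotalCores(java.util.Map<String, String> sparkConf, int reportedCores) {
  if (reportedCores > 0) {
    return reportedCores;  // client-provided information wins
  }
  boolean dynamicAllocation =
      Boolean.parseBoolean(sparkConf.getOrDefault("spark.dynamicAllocation.enabled", "false"));
  if (!dynamicAllocation) {
    int instances = Integer.parseInt(sparkConf.getOrDefault("spark.executor.instances", "2"));
    int coresPerExecutor = Integer.parseInt(sparkConf.getOrDefault("spark.executor.cores", "1"));
    return instances * coresPerExecutor;  // static allocation: instances * cores
  }
  return -1;  // unknown; caller keeps the data-size based estimate only
}
{code}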


