[ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263821#comment-14263821
 ] 

Rui Li commented on HIVE-9251:
------------------------------

Basically, the following problems lead to too small a number of reducers:
# At start-up, it may take some time for executors to register with the driver. 
Therefore we may get an inaccurate # of executors.
# We rely on {{spark.executor.cores}} to get the # of cores per executor, which is 
not available in standalone mode. Therefore, the # of total cores will be the same 
as the # of executors in standalone mode.
# We didn't consider the maximum data size a reducer can handle (a sketch of how 
this could be factored in follows below).
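
For the third point, here is a rough sketch of how the data size could be folded into 
the estimate. This is not the actual SetSparkReducerParallelism code; the config names 
{{hive.exec.reducers.bytes.per.reducer}} and {{hive.exec.reducers.max}} are the existing 
Hive knobs, but the helper itself is hypothetical:

{code:java}
// Hypothetical helper: combine the cluster-parallelism estimate with a
// data-size-based lower bound, so a large input no longer gets only a
// handful of reducers just because few executors/cores were observed.
static int estimateReducers(long totalInputBytes,
                            long bytesPerReducer,   // hive.exec.reducers.bytes.per.reducer
                            int maxReducers,        // hive.exec.reducers.max
                            int availableCores) {   // executors * cores, possibly understated
  // Reducers needed so that no single reducer handles more than bytesPerReducer.
  long bySize = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer;
  // Take whichever demands more parallelism, capped by maxReducers.
  long estimate = Math.max(bySize, availableCores);
  return (int) Math.max(1, Math.min(maxReducers, estimate));
}
{code}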

To solve the first two problems, we may need to ask Spark to expose more 
information about granted resources to users.
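
For reference, a minimal illustration of what is visible today and why it can understate 
parallelism. It assumes direct access to a live {{JavaSparkContext}} named {{jsc}}, which 
is not how Hive's Spark client actually obtains this information:

{code:java}
// Executors that have registered so far; getExecutorMemoryStatus() also includes
// the driver's block manager, hence the -1. Right after start-up this can be far
// smaller than the number of executors actually requested.
int registeredExecutors = jsc.sc().getExecutorMemoryStatus().size() - 1;

// spark.executor.cores is usually set on YARN but often unset in standalone mode,
// so falling back to 1 makes the total-core estimate equal to the executor count.
int coresPerExecutor = jsc.sc().getConf().getInt("spark.executor.cores", 1);

int estimatedTotalCores = registeredExecutors * coresPerExecutor;
{code}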

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-9251
>                 URL: https://issues.apache.org/jira/browse/HIVE-9251
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>



