[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267213#comment-14267213 ]
Rui Li commented on HIVE-9251:
------------------------------

I quickly checked the failed tests. Most of the failures are in the query plans, because the number of reducers changed; some tests may also need a SORT_QUERY_RESULT tag.

If we decide the number of reducers based on input size and cluster info, maybe we shouldn't expose it in the query plan, given that the input size may change and we currently need some hacks/workarounds to get the Spark cluster info. Any ideas?

> SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-9251
>                 URL: https://issues.apache.org/jira/browse/HIVE-9251
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's netty-based shuffle limits the max frame size to 2G.
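
To illustrate the idea under discussion, here is a minimal sketch of size-based reducer estimation. It is not the actual SetSparkReducerParallelism code; the class and method names are hypothetical, though hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max are the real Hive settings such logic typically reads, and the 2G cap mirrors the netty shuffle frame limit mentioned in the issue description.

{code:java}
// Hypothetical sketch of input-size-based reducer estimation, similar in
// spirit to SetSparkReducerParallelism; names and constants are illustrative,
// not Hive's actual implementation.
public final class ReducerEstimator {

    // Spark's netty-based shuffle caps a single frame at roughly 2G, so each
    // reducer's share of the shuffled data should stay below that bound.
    private static final long MAX_FRAME_SIZE = 2L * 1024 * 1024 * 1024;

    /**
     * Estimate the number of reducers for a given total input size.
     *
     * @param totalInputBytes estimated bytes flowing into the shuffle
     * @param bytesPerReducer target data per reducer (e.g. the value of
     *                        hive.exec.reducers.bytes.per.reducer)
     * @param maxReducers     upper bound (e.g. hive.exec.reducers.max)
     */
    public static int estimateReducers(long totalInputBytes,
                                       long bytesPerReducer,
                                       int maxReducers) {
        // Never target more data per reducer than one shuffle frame allows.
        long target = Math.min(bytesPerReducer, MAX_FRAME_SIZE);
        // Ceiling division, then clamp to [1, maxReducers].
        long reducers = (totalInputBytes + target - 1) / target;
        return (int) Math.max(1, Math.min(reducers, maxReducers));
    }

    public static void main(String[] args) {
        // 100 GB of shuffle input, 256 MB per reducer, capped at 999 reducers.
        System.out.println(estimateReducers(100L << 30, 256L << 20, 999));
        // -> 400
    }
}
{code}

Clamping the per-reducer target at the frame limit is what keeps a too-small estimated reducer count from pushing more than 2G through a single shuffle frame, which is the failure mode the issue describes.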