Lianhui Wang created SPARK-15616:
------------------------------------
Summary: Metastore relation should fall back to HDFS size of
partitions that are involved in query if statistics are not available.
Key: SPARK-15616
URL: https://issues.apache.org/jira/browse/SPARK-15616
Project: Spark
Issue Type: Improvement
Components: SQL
Reporter: Lianhui Wang
Currently, if only some partitions of a partitioned table are used in a join,
we still rely on the size of the whole table returned by the Metastore to
decide whether the join can be converted to a broadcast join.
If a Filter can prune some partitions, Hive prunes them before deciding
whether to use a broadcast join, based on the HDFS size of the partitions that
are actually involved in the query. Spark SQL should do the same, which can
improve join performance for partitioned tables.
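
To make the proposed behaviour concrete, here is a minimal sketch of the size
estimate in plain Python. It is not Spark code: the Partition class and the
functions estimated_size and can_broadcast are hypothetical names used only
for illustration, and the rule "use the summed HDFS size of the surviving
partitions whenever pruning removed something or the Metastore has no
statistics" is an assumption about how the fallback could work. The only
value taken from Spark is the 10 MB default of
spark.sql.autoBroadcastJoinThreshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Default value of spark.sql.autoBroadcastJoinThreshold (10 MB).
AUTO_BROADCAST_JOIN_THRESHOLD = 10 * 1024 * 1024


@dataclass
class Partition:
    """Hypothetical stand-in for one partition of a Metastore table."""
    spec: Dict[str, str]      # partition column values, e.g. {"dt": "2016-05-01"}
    hdfs_size_bytes: int      # size of the partition's directory on HDFS


def estimated_size(
    partitions: List[Partition],
    metastore_total_size: Optional[int],
    partition_filter: Callable[[Dict[str, str]], bool],
) -> int:
    """Estimate the bytes a scan will read after partition pruning.

    Assumed rule (not confirmed by the issue text): if the filter pruned
    at least one partition, or the Metastore has no size statistic, sum
    the HDFS sizes of the partitions that remain. Otherwise keep the
    table-level size reported by the Metastore.
    """
    selected = [p for p in partitions if partition_filter(p.spec)]
    pruned = len(selected) < len(partitions)
    if pruned or metastore_total_size is None:
        return sum(p.hdfs_size_bytes for p in selected)
    return metastore_total_size


def can_broadcast(size_bytes: int,
                  threshold: int = AUTO_BROADCAST_JOIN_THRESHOLD) -> bool:
    """A relation is a broadcast candidate if its estimate fits the threshold."""
    return size_bytes <= threshold
```

With a table made of one 4 MB partition and one 500 MB partition, a filter
that keeps only the small partition yields an estimate below the threshold,
so the join side becomes a broadcast candidate, whereas the table-level size
would rule it out.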