[
https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102084#comment-16102084
]
Ashutosh Chauhan commented on HIVE-16811:
-----------------------------------------
Can you please create an RB for the latest patch?
> Estimate statistics in absence of stats
> ---------------------------------------
>
> Key: HIVE-16811
> URL: https://issues.apache.org/jira/browse/HIVE-16811
> Project: Hive
> Issue Type: Improvement
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch
>
>
> Currently, join ordering bails out completely in the absence of statistics,
> which can lead to bad joins such as cross joins.
> For example, the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING,
> S_NATIONKEY INT,
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY INT,
> L_PARTKEY INT,
> L_SUPPKEY INT,
> L_LINENUMBER INT,
> L_QUANTITY DOUBLE,
> L_EXTENDEDPRICE DOUBLE,
> L_DISCOUNT DOUBLE,
> L_TAX DOUBLE,
> L_RETURNFLAG STRING,
> L_LINESTATUS STRING,
> l_shipdate STRING,
> L_COMMITDATE STRING,
> L_RECEIPTDATE STRING,
> L_SHIPINSTRUCT STRING,
> L_SHIPMODE STRING,
> L_COMMENT STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
> p_partkey INT,
> p_name STRING,
> p_mfgr STRING,
> p_brand STRING,
> p_type STRING,
> p_size INT,
> p_container STRING,
> p_retailprice DOUBLE,
> p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey =
> l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent the join ordering algorithm from bailing out
> and let it come up with a join order that is at least better than a cross join.
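One way to picture the kind of estimate the description asks for is to derive a row count from a table's raw data size and its schema. The sketch below is only an illustration under stated assumptions: the per-type byte widths, the function name, and the data size are all hypothetical and are not taken from Hive or from the attached patches.

```python
# Hypothetical sketch: none of these names or numbers come from Hive
# or from the HIVE-16811 patches.

# Assumed average width in bytes for each column type (illustrative guesses).
ASSUMED_TYPE_WIDTH = {"INT": 4, "DOUBLE": 8, "STRING": 20}


def estimate_row_count(column_types, total_bytes):
    """Guess a table's row count from its data size and column types."""
    row_width = sum(ASSUMED_TYPE_WIDTH[t] for t in column_types)
    # Never report zero rows, so that an empty estimate cannot make a join look free.
    return max(1, total_bytes // row_width)


# Column types of the `part` table from the DDL above.
part_types = ["INT", "STRING", "STRING", "STRING", "STRING",
              "INT", "STRING", "DOUBLE", "STRING"]

# 136000 bytes is a made-up size chosen only to keep the arithmetic simple.
part_rows = estimate_row_count(part_types, 136_000)
```

Even a rough count like this would give a cost-based join ordering something to compare, which is the point of the improvement requested here.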