[ https://issues.apache.org/jira/browse/HIVE-28572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Krisztian Kasa updated HIVE-28572:
----------------------------------
Status: Patch Available (was: Open)
> Support Distribute by and Cluster by clauses in CBO
> ---------------------------------------------------
>
> Key: HIVE-28572
> URL: https://issues.apache.org/jira/browse/HIVE-28572
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
>
> If a query has a {{distribute by}} or {{cluster by}} clause, CBO is turned off
> and only non-CBO optimizations are applied to the query plan.
> One impact of not using CBO is that implicit type conversions are not added.
> Example:
> {code:java}
> create table t1 (a string, b int);
> insert into t1 values ('2014-03-14 10:10:12', 10);
> select * from t1 where a between date_add('2014-03-14', -1) and '2014-03-14'
> distribute by a;
> {code}
> {code:java}
> TableScan
> alias: t1
> filterExpr: a BETWEEN DATE'2014-03-13' AND '2014-03-14' (type: boolean)
> {code}
> vs
> {code:java}
> select * from t1 where a between date_add('2014-03-14', -1) and '2014-03-14';
> {code}
> {code:java}
> TableScan
> alias: t1
> filterExpr: CAST( a AS DATE) BETWEEN DATE'2014-03-13' AND DATE'2014-03-14' (type: boolean)
> {code}
> Moreover, if vectorization is turned off, the two queries above return
> different results, which leads to data corruption.
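
To illustrate why the missing cast matters for the example row, here is a minimal Python sketch. It is not Hive code, and the helper names are hypothetical. It only shows that a plain text comparison and a date comparison disagree for the value '2014-03-14 10:10:12'. How Hive actually evaluates the mixed-type predicate in the non-CBO plan is not stated in the issue, so the text comparison below is an assumption used for illustration.

```python
from datetime import date, datetime

# The row inserted into t1 in the example above.
a = "2014-03-14 10:10:12"

# Bounds of the BETWEEN predicate once date_add('2014-03-14', -1) is folded.
lower = date(2014, 3, 13)
upper = date(2014, 3, 14)


def between_as_strings(value: str, lo: date, hi: date) -> bool:
    """Compare everything as text (assumed semantics, illustration only)."""
    return lo.isoformat() <= value <= hi.isoformat()


def between_as_dates(value: str, lo: date, hi: date) -> bool:
    """Mimic CAST(a AS DATE): keep only the date part, then compare as dates."""
    as_date = datetime.strptime(value, "%Y-%m-%d %H:%M:%S").date()
    return lo <= as_date <= hi


string_result = between_as_strings(a, lower, upper)
date_result = between_as_dates(a, lower, upper)
print(string_result, date_result)
```

As text, the value sorts after '2014-03-14' because it is longer and shares that prefix, so it falls outside the range. As a date, it equals the upper bound and falls inside it.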
--
This message was sent by Atlassian Jira
(v8.20.10#820010)