[ https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Garg updated HIVE-20266:
-------------------------------
Description:
{code:sql}
CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING
NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}
{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value,
value, key as p1, 3 as p2 from src limit 10;
{code}
*Without CBO*
{noformat}
Map 1
    Map Operator Tree:
        TableScan
          alias: src
          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: key (type: string), value (type: string), value (type: string), key (type: string), 3 (type: int)
            outputColumnNames: _col0, _col1, _col2, _col3, _col4
            Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
            Limit
              Number of rows: 10
              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                sort order:
                Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
{noformat}
*With CBO*
{noformat}
Map 1
    Map Operator Tree:
        TableScan
          alias: src
          Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: key (type: string), value (type: string), value (type: string), key (type: string)
            outputColumnNames: _col0, _col1, _col2, _col3
            Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE
            Limit
              Number of rows: 10
              Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                sort order:
                Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string)
{noformat}
CBO has 4 columns being shuffled as compared to 3 in non-cbo
was:
{code:sql}
CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING
NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}
{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value,
value, key as p1, 3 as p2 from src limit 10;
{code}
> Extra column is being shuffled in cbo as compared to non-cbo
> ------------------------------------------------------------
>
> Key: HIVE-20266
> URL: https://issues.apache.org/jira/browse/HIVE-20266
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
>