[
https://issues.apache.org/jira/browse/HIVE-21778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajesh Balamohan updated HIVE-21778:
------------------------------------
Description:
{noformat}
drop table if exists test_struct;
CREATE external TABLE test_struct
(
  f1 string,
  demo_struct struct<f1:string, f2:string, f3:string>,
  datestr string
);

set hive.cbo.enable=true;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: (datestr = '2019-01-01') (type: boolean) <----- Note that demo_struct filter is not added here
          Filter Operator
            predicate: (datestr = '2019-01-01') (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct<f1:string,f2:string,f3:string>), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink

set hive.cbo.enable=false;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean) <----- Note that demo_struct filter is added when CBO is turned off
          Filter Operator
            predicate: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct<f1:string,f2:string,f3:string>), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink
{noformat}
In CalcitePlanner::genFilterRelNode, the following code fails to evaluate this
filter.
{noformat}
RexNode factoredFilterExpr = RexUtil
    .pullFactors(cluster.getRexBuilder(), convertedFilterExpr);
{noformat}
Note that if the predicate is written against `demo_struct.f1`, the filter is
pushed down correctly.
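A sketch of that variant, assuming "adding `demo_struct.f1`" means testing the struct field rather than the whole struct. The exact predicate below is an assumption and is not taken from the attached test_null.q:
{noformat}
-- Same table and CBO setting as the failing case above.
set hive.cbo.enable=true;

-- Hypothetical variant: the null check is on the field demo_struct.f1
-- instead of on the struct column itself.
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct.f1 is not null;
{noformat}
Comparing the filterExpr line of this plan with the one for `demo_struct is not null` shows whether only the whole-struct null check is being dropped.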
was:
{noformat}
drop table if exists test_struct;
CREATE external TABLE test_struct
(
  f1 string,
  demo_struct struct<f1:string, f2:string, f3:string>,
  datestr string
);

set hive.cbo.enable=true;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: (datestr = '2019-01-01') (type: boolean) <----- Note that demo_struct filter is not added here
          Filter Operator
            predicate: (datestr = '2019-01-01') (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct<f1:string,f2:string,f3:string>), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink

set hive.cbo.enable=false;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean) <----- Note that demo_struct filter is added when CBO is turned off
          Filter Operator
            predicate: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct<f1:string,f2:string,f3:string>), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink
{noformat}
In CalcitePlanner::genFilterRelNode, the following code fails to evaluate this
filter.
{noformat}
RexNode factoredFilterExpr = RexUtil
    .pullFactors(cluster.getRexBuilder(), convertedFilterExpr);
{noformat}
Note that if the predicate is written against `demo_struct.f1`, the filter is
pushed down correctly. {code}RexCall::isAlwaysTrue{code} is suspected to
evaluate to true in this case.
> CBO: "Struct is not null" gets evaluated as `nullable` always causing filter
> miss in the query
> ----------------------------------------------------------------------------------------------
>
> Key: HIVE-21778
> URL: https://issues.apache.org/jira/browse/HIVE-21778
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 4.0.0, 2.3.5
> Reporter: Rajesh Balamohan
> Priority: Major
> Attachments: test_null.q, test_null.q.out
>