[
https://issues.apache.org/jira/browse/CALCITE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
hezhang updated CALCITE-3609:
-----------------------------
Description:
student table:
{code:java}
CREATE TABLE `student`(
`id` int,
`info` struct<name:string,age:int>)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'colelction.delim'=':',
'field.delim'=',',
'serialization.format'=',',
'serialization.null.format'='NULL')
{code}
the sql:
{code:java}
select * from student where info is not null;
{code}
result:
{code:java}
1 {"name":"zhou","age":30}
2 {"name":"yan","age":30}
3 {"name":"chen","age":20}
4 {"name":"li","age":80}
NULL NULL
NULL {"name":null,"age":null}
{code}
cause:calcite Ineffective optimization
{code:java}
HiveProject(id=[$0], info=[$1])
HiveTableScan(table=[[default.student]], table:alias=[student])
{code}
after fix RelRecordType isNullable,the plan is
{code:java}
HiveProject(id=[$0], info=[$1])
HiveFilter(condition=[IS NOT NULL($1)])
HiveTableScan(table=[[default.student]], table:alias=[student])
{code}
then the result is:
{code:java}
1 {"name":"zhou","age":30}
2 {"name":"yan","age":30}
3 {"name":"chen","age":20}
4 {"name":"li","age":80}
NULL {"name":null,"age":null}
{code}
was:
student table:
{code:java}
CREATE TABLE `student`(
`id` int,
`info` struct<name:string,age:int>)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'colelction.delim'=':',
'field.delim'=',',
'serialization.format'=',',
'serialization.null.format'='NULL')
{code}
the sql:
{code:java}
select * from student where info is not null;
{code}
result:
{code:java}
1 {"name":"zhou","age":30}
2 {"name":"yan","age":30}
3 {"name":"chen","age":20}
4 {"name":"li","age":80}
NULL NULL
NULL {"name":null,"age":null}
{code}
cause:calcite Ineffective optimization
{code:java}
HiveProject(id=[$0], info=[$1])
HiveTableScan(table=[[default.student]], table:alias=[student])
{code}
after fix RelRecordType isNullable,the plan is
> calite for hive struct Ineffective optimization
> ------------------------------------------------
>
> Key: CALCITE-3609
> URL: https://issues.apache.org/jira/browse/CALCITE-3609
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.10.0, 1.16.0
> Reporter: hezhang
> Priority: Blocker
> Fix For: 1.16.0
>
>
> student table:
> {code:java}
> CREATE TABLE `student`(
> `id` int,
> `info` struct<name:string,age:int>)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES (
> 'colelction.delim'=':',
> 'field.delim'=',',
> 'serialization.format'=',',
> 'serialization.null.format'='NULL')
> {code}
> the sql:
>
> {code:java}
> select * from student where info is not null;
> {code}
> result:
>
> {code:java}
> 1 {"name":"zhou","age":30}
> 2 {"name":"yan","age":30}
> 3 {"name":"chen","age":20}
> 4 {"name":"li","age":80}
> NULL NULL
> NULL {"name":null,"age":null}
> {code}
> cause:calcite Ineffective optimization
> {code:java}
> HiveProject(id=[$0], info=[$1])
> HiveTableScan(table=[[default.student]], table:alias=[student])
> {code}
>
> after fix RelRecordType isNullable,the plan is
> {code:java}
> HiveProject(id=[$0], info=[$1])
> HiveFilter(condition=[IS NOT NULL($1)])
> HiveTableScan(table=[[default.student]], table:alias=[student])
> {code}
> then the result is:
> {code:java}
> 1 {"name":"zhou","age":30}
> 2 {"name":"yan","age":30}
> 3 {"name":"chen","age":20}
> 4 {"name":"li","age":80}
> NULL {"name":null,"age":null}
> {code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)