Hi Julian,

During my work on subquery support in Hive I am running into similar issue with 
almost same query where the CASE statement [NOT(CASE(=($1, 0), false, IS NOT 
NULL($5), true, IS NULL($3), null, <($2, $1), null, false))] is generated with 
a null constant expression but with NULL type. Is that a valid thing for 
calcite to do ? Or should it be generating a typed null constant ?

I noticed most of the time case statement is simplified and typed (within 
Subquery remove rule). But there are statements which are left un-simplified 
and with untyped expression. I wonder if this is a bug. 

Thanks,
Vineet

On 10/31/16, 12:42 PM, "Julian Hyde" <[email protected]> wrote:

>The behavior of NOT IN in SQL is complicated when there are NULL
>values around. In particular, if one "word" value from the sub-query
>is null, then the outer query must return 0 rows. (Why? Because "word
>NOT IN ('foo', 'bar' null)" would evaluate to UNKNOWN for every row.)
>
>It is valid to deduce that "word" in the sub-query is never null,
>because of the "WHERE word = 'hello'" condition. I would have hoped
>that a constant reduction could do that, and then maybe the CASE
>expression can be simplified.
>
>By the way, to be pedantic, what we are talking about here is the
>RelNode tree, the relational algebra, which comes out of the
>SqlToRelConverter. The AST is the SqlNode tree, which comes out of the
>parser and goes into the SqlToRelConverter.
>
>On Mon, Oct 31, 2016 at 8:46 AM, Alexander Shoshin
><[email protected]> wrote:
>> Hello, everybody.
>>
>> Trying to resolve an Apache Flink issue I got some troubles with Calcite. 
>> Can you help me to understand is there a problem in Calcite or just in wrong 
>> settings passed to Calcite functions?
>>
>> I have a simple table "Words" with one column named "word" and a query with 
>> NOT IN operator:
>> val query = "SELECT word FROM Words WHERE word NOT IN (SELECT word FROM 
>> Words WHERE word = 'hello')"
>>
>> This query parsed by org.apache.calcite.sql.parser.SqlParser.parseStmt() and 
>> then transformed to a relational tree by 
>> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(...).
>>
>> As a result I see the following abstract syntax tree
>> LogicalProject(word=[$0])
>>   LogicalFilter(condition=[NOT(CASE(=($1, 0), false, IS NOT NULL($5), true, 
>> IS NULL($3), null, <($2, $1), null, false))])
>>     LogicalJoin(condition=[=($3, $4)], joinType=[left])
>>       LogicalProject($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$0])
>>         LogicalJoin(condition=[true], joinType=[inner])
>>           EnumerableTableScan(table=[[Words]])
>>           LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
>>             LogicalProject($f0=[$0], $f1=[true])
>>               LogicalProject(word=[$0])
>>                 LogicalFilter(condition=[=($0, 'hello')])
>>                   EnumerableTableScan(table=[[Words]])
>>       LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>>         LogicalProject($f0=[$0], $f1=[true])
>>           LogicalProject(word=[$0])
>>             LogicalFilter(condition=[=($0, 'hello')])
>>               EnumerableTableScan(table=[[Words]])
>>
>> which fails later during query plan optimization (while calling 
>> org.apache.calcite.tools.Programs.RuleSetProgram.run()).
>>
>> I think it might be because of a very complex abstract syntax tree generated 
>> by Calcite. Shouldn't it be more simple? This one looks good for me:
>> LogicalProject(word=[$0])
>>   LogicalFilter(condition=[IS NULL($2)])
>>     LogicalJoin(condition=[=($0, $1)], joinType=[left])
>>       EnumerableTableScan(table=[[Words]])
>>       LogicalProject($f0=[$0], $f1=[true])
>>         LogicalProject(word=[$0])
>>           LogicalFilter(condition=[=($0, 'hello')])
>>             EnumerableTableScan(table=[[Words]])
>>
>> And when I write a query using LEFT OUTER JOIN to receive this syntax tree - 
>> the optimization works fine. And the query execution result is the same as 
>> must be in case of using NOT IN. So am I wrong with a supposition about bad 
>> abstract syntax tree or not? I will be glad to receive any comments.
>>
>> Regards,
>> Alexander
>

Reply via email to