[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888558#comment-16888558 ] Danny Chan commented on CALCITE-3128: - [~julianhyde] Because you seem to have objections for this PR, can you take a look for this issue again ? > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874592#comment-16874592 ] Danny Chan commented on CALCITE-3128: - [~julianhyde] thanks, {quote}if the condition is '(t1.c is null) = (t2.c is null)', I think the problem would still occur. {quote} I checked this query: {code:sql} select dept.deptno, emp.ename from dept join emp on (dept.dname is null) = (emp.sal is null) order by emp.ename limit 10; {code} and finally plan is {code:xml} EnumerableLimit(fetch=[10]) EnumerableSort(sort0=[$1], dir0=[ASC]) EnumerableCalc(expr#0..4=[{inputs}], DEPTNO=[$t0], ENAME=[$t3]) EnumerableHashJoin(condition=[=($1, $4)], joinType=[inner]) EnumerableCalc(expr#0..2=[{inputs}], expr#3=[IS NULL($t1)], DEPTNO=[$t0], $f3=[$t3]) EnumerableTableScan(table=[[scott, DEPT]]) EnumerableCalc(expr#0..7=[{inputs}], expr#8=[IS NULL($t5)], proj#0..1=[{exprs}], $f8=[$t8]) EnumerableTableScan(table=[[scott, EMP]]) {code} The condition '(t1.c is null) = (t2.c is null)' for hash join are all boolean keys, which means they will never be null, so i don't think there is any problem for this case. {quote}The key factor, I think, is that one side has just one column, and a null in that column. {quote} This is not true, like the case: {code:sql} SELECT * FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) b {code} > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874480#comment-16874480 ] Julian Hyde commented on CALCITE-3128: -- Does this problem occur if and only if the condition is always true? I think not. if the condition is '(t1.c is null) = (t2.c is null)', I think the problem would still occur. The key factor, I think, is that one side has just one column, and a null in that column. So I don't think {{boolean isConditionAlwaysTrue}} should be part of the solution. > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873015#comment-16873015 ] Danny Chan commented on CALCITE-3128: - [~mgelbana] Thanks for your test case, i have applied a PR [https://github.com/apache/calcite/pull/1279] [~donnyzone] i also checked the Postgres and the behavior is same as you proposed. While i think the MYSQL 5.6 behavior is wrong. > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872709#comment-16872709 ] Julian Hyde commented on CALCITE-3128: -- Current behavior is wrong. Cartesian product of one row to one row should produce one row, regardless of whether columns are null. MySQL and PostgreSQL are correct. I think it's very plausible that the problem is in Linq4j. Linq4j tries to be clever and represents 1-column collections as raw values. Which is fine. But if "null" also means "was empty" there's a problem. Probably the fix should include a test for Linq4j hashJoin_ in addition to a SQL-level test. > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Priority: Major > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871116#comment-16871116 ] Feng Zhu commented on CALCITE-3128: --- [~danny0405] I ever executed the query in PG. It will produce a (NULL, NULL) result. {code:java} SELECT *FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b {code} I am not sure whether it is the problem with Linq4q[1]. When the current value is _null_ in left side (i.e., outers), Linq4q directly sets (innerEnumerable=null), resulting an emptyEnumerator. Q1: result expected (1, NULL, NULL) SELECT *FROM (SELECT 1, NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b Q2: 0 rowsSELECT *FROM (SELECT NULLIF(5, 5)) a, (SELECT 1, NULLIF(5, 5)) b [1][https://github.com/apache/calcite/blob/4e89fddab415a1e04b82c7d69960e399f608949f/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1115] > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Priority: Major > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871051#comment-16871051 ] Danny Chan commented on CALCITE-3128: - While i execute the query in MYSQL 5.6: {code:sql} SELECT *FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b {code} the result are (null) which is same from query: {code:java} SELECT *FROM (SELECT NULLIF(5, 5)) a {code} Does this behavior is expected ? > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Priority: Major > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870827#comment-16870827 ] Feng Zhu commented on CALCITE-3128: --- The generated code for query seems to be OK. {code:java} public org.apache.calcite.linq4j.Enumerable bind(final org.apache.calcite.DataContext root) { final org.apache.calcite.linq4j.Enumerable _inputEnumerable = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {0}); final org.apache.calcite.linq4j.AbstractEnumerable left = new org.apache.calcite.linq4j.AbstractEnumerable(){ public org.apache.calcite.linq4j.Enumerator enumerator() { return new org.apache.calcite.linq4j.Enumerator(){ public final org.apache.calcite.linq4j.Enumerator inputEnumerator = _inputEnumerable.enumerator(); public void reset() {inputEnumerator.reset();} public boolean moveNext() {return inputEnumerator.moveNext();} public void close() {inputEnumerator.close();} public Object current() {return null;} }; } }; final org.apache.calcite.linq4j.Enumerable _inputEnumerable0 = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {0}); final org.apache.calcite.linq4j.AbstractEnumerable right = new org.apache.calcite.linq4j.AbstractEnumerable(){ public org.apache.calcite.linq4j.Enumerator enumerator() { return new org.apache.calcite.linq4j.Enumerator(){ public final org.apache.calcite.linq4j.Enumerator inputEnumerator = _inputEnumerable0.enumerator(); public void reset() {inputEnumerator.reset();} public boolean moveNext() {return inputEnumerator.moveNext();} public void close() {inputEnumerator.close();} public Object current() {return null;} }; } }; return left.hashJoin(right, new org.apache.calcite.linq4j.function.Function1() { public org.apache.calcite.runtime.FlatLists.ComparableEmptyList apply(Integer v1) { return org.apache.calcite.runtime.FlatLists.COMPARABLE_EMPTY_LIST; } public Object apply(Object v1) { return apply((Integer) v1); } } , new org.apache.calcite.linq4j.function.Function1() { public org.apache.calcite.runtime.FlatLists.ComparableEmptyList apply(Integer v1) { return org.apache.calcite.runtime.FlatLists.COMPARABLE_EMPTY_LIST; } public Object apply(Object v1) { return apply((Integer) v1); } } , new org.apache.calcite.linq4j.function.Function2() { public Object[] apply(Integer left, Integer right) { return new Object[] {left, right}; } public Object[] apply(Object left, Object right) { return apply((Integer) left, (Integer) right); } } , null, false, false); } {code} > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Priority: Major > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-3128) Joining two tables producing only NULLs will return 0 rows
[ https://issues.apache.org/jira/browse/CALCITE-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864265#comment-16864265 ] Muhammad Gelbana commented on CALCITE-3128: --- My analysis so far is that {{leftKeys}} and {{rightKeys}} for EnumerableHashJoin (Recently named EnumerableJoin) are blank, which illudes {{PhysTypeImpl.generateAccessor(List)}} to return an expression of an empty list. {{leftKeys}} and {{rightKeys}} are blank because while constructing a Join node, {{JoinInfo.of(RelNode, RelNode, RexNode)}} uses {{RelOptUtil.splitJoinCondition(RelNode, RelNode, RexNode, List, List, List)}} to fill up {{leftKeys}} and {{rightKeys}} but it doesn't because the condition is always true (i.e cartesian product). I'm still not sure where or how to fix this. > Joining two tables producing only NULLs will return 0 rows > -- > > Key: CALCITE-3128 > URL: https://issues.apache.org/jira/browse/CALCITE-3128 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.20.0 >Reporter: Muhammad Gelbana >Priority: Major > > The following queries will return 0 rows while they're expected to ruturn > rows with NULLs in them. > {code:sql} > SELECT * > FROM (SELECT NULLIF(5, 5)) a, (SELECT NULLIF(5, 5)) b > {code} > {code:sql} > SELECT * > FROM (VALUES (NULLIF(5, 5)), (NULLIF(5, 5))) a, (VALUES (NULLIF(5, 5)), > (NULLIF(5, 5))) b > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)