[ 
https://issues.apache.org/jira/browse/DRILL-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610664#comment-14610664
 ] 

Aman Sinha commented on DRILL-2235:
-----------------------------------

This query now plans successfully because Drill has supported NestedLoopJoin
with scalar subqueries since 1.0. Here is the plan for a query against TPC-H:

{code}
0: jdbc:drill:zk=local> explain plan for select n1.n_name from 
cp.`tpch/nation.parquet` n1 where (n1.n_nationkey, n1.n_regionkey) not in 
(select n2.n_nationkey, n2.n_regionkey from cp.`tpch/nation.parquet` n2 where 
n2.n_regionkey < 10);
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(n_name=[$0])
00-02        SelectionVectorRemover
00-03          Filter(condition=[NOT(CASE(=($1, 0), false, IS NOT NULL($7), 
true, IS NULL($3), null, IS NULL($4), null, <($2, $1), null, false))])
00-04            HashJoin(condition=[AND(=($3, $5), =($4, $6))], 
joinType=[left])
00-06              Project(n_name=[$2], $f0=[$3], $f1=[$4], f5=[$0], f6=[$1])
00-08                NestedLoopJoin(condition=[true], joinType=[inner])
00-11                  Project(n_nationkey=[$2], n_regionkey=[$0], n_name=[$1])
00-14                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], 
selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, 
columns=[`n_nationkey`, `n_regionkey`, `n_name`]]])
00-10                  StreamAgg(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0, 
$1)])
00-13                    Project($f0=[$0], $f1=[$1], $f2=[true])
00-16                      SelectionVectorRemover
00-18                        Filter(condition=[<($1, 10)])
00-20                          Project(n_nationkey=[$1], n_regionkey=[$0])
00-21                            Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], 
selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, 
columns=[`n_nationkey`, `n_regionkey`]]])
00-05              Project($f00=[$0], $f10=[$1], $f2=[$2])
00-07                HashAgg(group=[{0, 1}], agg#0=[MIN($2)])
00-09                  Project($f0=[$0], $f1=[$1], $f2=[true])
00-12                    SelectionVectorRemover
00-15                      Filter(condition=[<($1, 10)])
00-17                        Project(n_nationkey=[$1], n_regionkey=[$0])
00-19                          Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], 
selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, 
columns=[`n_nationkey`, `n_regionkey`]]])
{code} 

However, note that the StreamAgg computes COUNT($0, $1); it seems Calcite
generates this aggregate expression. I am not sure what the semantics of
COUNT(a, b) are. Running this query fails during execution because we don't
support this function.
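
For reference, a minimal sketch of one plausible reading, not Drill or Calcite
code. It assumes that a multi-argument COUNT(a, b) counts the rows in which
every argument is non-null, which is how the Calcite SQL reference appears to
describe COUNT over several values. The function names are hypothetical, and
None stands in for SQL NULL. The not_in function mirrors the CASE expression in
the Filter at 00-03, where $1 is COUNT(), $2 is COUNT($0, $1), $3 and $4 are
the outer keys, and $7 is the match indicator from the HashJoin.

```python
# Sketch only: assumes COUNT(a, b) counts rows where all arguments are non-null.
T1 = [(1, "aaaaa"), (2, "bbbbb"), (3, "ccccc"), (4, None), (5, "eeeee"),
      (6, "fffff"), (7, "ggggg"), (None, "hhhhh"), (9, "iiiii"), (10, "jjjjj")]
T2 = [(0, "zzz"), (1, "aaaaa"), (2, "bbbbb"), (2, "bbbbb"), (2, "bbbbb"),
      (3, "ccccc"), (4, "ddddd"), (5, "eeeee"), (6, "fffff"), (7, "ggggg"),
      (7, "ggggg"), (8, "hhhhh"), (9, "iiiii")]


def count_all_non_null(rows):
    """Assumed COUNT(a, b): number of rows with no NULL in any argument."""
    return sum(1 for row in rows if all(v is not None for v in row))


def not_in(left, sub_rows):
    """Three-valued (a, b) NOT IN (subquery); returns True, False or None."""
    c = len(sub_rows)                   # COUNT()
    ck = count_all_non_null(sub_rows)   # COUNT($0, $1), under the assumption
    left_has_null = any(v is None for v in left)
    # An equi-join never matches on NULL keys, so guard before comparing.
    matched = (not left_has_null) and left in sub_rows
    if c == 0:
        in_result = False               # empty subquery: IN is false
    elif matched:
        in_result = True                # IS NOT NULL($7)
    elif left_has_null:
        in_result = None                # IS NULL($3) or IS NULL($4)
    elif ck < c:
        in_result = None                # subquery contains NULL keys
    else:
        in_result = False
    return None if in_result is None else not in_result


# A WHERE clause keeps only rows for which the predicate is True.
kept = [row for row in T1 if not_in(row, T2) is True]
```

Under these assumptions, applying the sketch to the t1 and t2 rows from the
issue description keeps only the row (10, 'jjjjj'); the rows with a null key
evaluate to NULL and are filtered out.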

> Assert when NOT IN clause contains multiple columns
> ---------------------------------------------------
>
>                 Key: DRILL-2235
>                 URL: https://issues.apache.org/jira/browse/DRILL-2235
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.8.0
>            Reporter: Victoria Markman
>            Assignee: Aman Sinha
>             Fix For: 1.2.0
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select * from t1;
> +------------+------------+------------+
> |     a1     |     b1     |     c1     |
> +------------+------------+------------+
> | 1          | aaaaa      | 2015-01-01 |
> | 2          | bbbbb      | 2015-01-02 |
> | 3          | ccccc      | 2015-01-03 |
> | 4          | null       | 2015-01-04 |
> | 5          | eeeee      | 2015-01-05 |
> | 6          | fffff      | 2015-01-06 |
> | 7          | ggggg      | 2015-01-07 |
> | null       | hhhhh      | 2015-01-08 |
> | 9          | iiiii      | null       |
> | 10         | jjjjj      | 2015-01-10 |
> +------------+------------+------------+
> 10 rows selected (0.056 seconds)
> 0: jdbc:drill:schema=dfs> select * from t2;
> +------------+------------+------------+
> |     a2     |     b2     |     c2     |
> +------------+------------+------------+
> | 0          | zzz        | 2014-12-31 |
> | 1          | aaaaa      | 2015-01-01 |
> | 2          | bbbbb      | 2015-01-02 |
> | 2          | bbbbb      | 2015-01-02 |
> | 2          | bbbbb      | 2015-01-02 |
> | 3          | ccccc      | 2015-01-03 |
> | 4          | ddddd      | 2015-01-04 |
> | 5          | eeeee      | 2015-01-05 |
> | 6          | fffff      | 2015-01-06 |
> | 7          | ggggg      | 2015-01-07 |
> | 7          | ggggg      | 2015-01-07 |
> | 8          | hhhhh      | 2015-01-08 |
> | 9          | iiiii      | 2015-01-09 |
> +------------+------------+------------+
> 13 rows selected (0.069 seconds)
> {code}
> IN clause returns correct result:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from t1 where (a1, b1) in (select 
> a2, b2 from t2);
> +------------+
> |   EXPR$0   |
> +------------+
> | 7          |
> +------------+
> 1 row selected (0.258 seconds)
> {code}
> NOT IN clause asserts:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from t1 where (a1, b1) not in 
> (select a2, b2 from t2);
> Query failed: AssertionError: AND(AND(NOT(IS TRUE($7)), IS NOT NULL($3)), IS 
> NOT NULL($4))
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> {code}
> #Thu Feb 12 12:13:26 EST 2015
> git.commit.id.abbrev=de89f36
> {code}
> drillbit.log
> {code}
> 2015-02-12 22:47:11,730 [2b22d290-315e-4450-8b3f-9b3590eb20c3:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: AND(AND(NOT(IS TRUE($7)), IS NOT NULL($3)), 
> IS NOT NULL($4))
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:197) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.AssertionError: AND(AND(NOT(IS TRUE($7)), IS NOT 
> NULL($3)), IS NOT NULL($4))
>         at org.eigenbase.rel.FilterRelBase.<init>(FilterRelBase.java:56) 
> ~[optiq-core-0.9-drill-r18.jar:na]
>         at org.eigenbase.rel.FilterRel.<init>(FilterRel.java:50) 
> ~[optiq-core-0.9-drill-r18.jar:na]
>         at org.eigenbase.rel.CalcRel.createFilter(CalcRel.java:212) 
> ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertWhere(SqlToRelConverter.java:840)
>  ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:497)
>  ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474)
>  ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2657)
>  ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:432)
>  ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> net.hydromatic.optiq.prepare.PlannerImpl.convert(PlannerImpl.java:186) 
> ~[optiq-core-0.9-drill-r18.jar:na]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:163)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:126)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:145)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:515) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:188) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         ... 4 common frames omitted
> 2015-02-12 22:47:11,747 [2b22d290-315e-4450-8b3f-9b3590eb20c3:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - Error 
> 8f0bb8dd-deac-4846-9608-e941da9035e8: AssertionError: AND(AND(NOT(IS 
> TRUE($7)), IS NOT NULL($3)), IS NOT NULL($4))
> {code}



