[
https://issues.apache.org/jira/browse/DRILL-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028700#comment-14028700
]
Aman Sinha commented on DRILL-485:
----------------------------------
As mentioned in my previous comment, this has been fixed. I tested it again on
latest build (commit level 27a9c98) with the following queries .. both Explain
plans and correctness of results w.r.t Postgres.
Query 1: (Note the Filter above the HashJoin. The Filter has the inequality
'<' condition)
0: jdbc:drill:zk=local> explain plan for select cast(c.c_name as varchar(20))
from orders o, customer c where o.o_custkey = c.c_custkey and o.o_orderkey <
c.c_nationkey;
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 UnionExchange
01-01 Project(EXPR$0=[CAST($4):VARCHAR(20) CHARACTER SET "ISO-8859-1"
COLLATE "ISO-8859-1$en_US$primary"])
01-02 SelectionVectorRemover
01-03 Filter(condition=[<($1, $3)])
01-04 HashJoin(condition=[=($0, $2)], joinType=[inner])
01-06 Project(o_custkey=[$1], o_orderkey=[$0])
01-07 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpchmulti/orders]],
selectionRoot=/Users/asinha/data/tpchmulti/orders, columns=[SchemaPath
[`o_custkey`], SchemaPath [`o_orderkey`]]]])
01-05 BroadcastExchange
02-01 Project(c_custkey=[$2], c_nationkey=[$1], c_name=[$0])
02-02 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpchmulti/customer]],
selectionRoot=/Users/asinha/data/tpchmulti/customer, columns=[SchemaPath
[`c_custkey`], SchemaPath [`c_nationkey`], SchemaPath [`c_name`]]]])
0: jdbc:drill:zk=local> select cast(c.c_name as varchar(20)) from orders o,
customer c where o.o_custkey = c.c_custkey and o.o_orderkey < c.c_nationkey;
+------------+
| EXPR$0 |
+------------+
| Customer#000000370 |
| Customer#000000781 |
| Customer#000001369 |
| Customer#000000445 |
| Customer#000000557 |
| Customer#000000392 |
+------------+
6 rows selected
Query 2: (Note the Filter above the HashJoin. The Filter has the OR
condition).
0: jdbc:drill:zk=local> explain plan for select cast(c.c_name as varchar(20))
from orders o, customer c where o.o_custkey = c.c_custkey and (o.o_orderkey =
c.c_nationkey or o.o_custkey = c.c_nationkey);
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 UnionExchange
01-01 Project(EXPR$0=[CAST($4):VARCHAR(20) CHARACTER SET "ISO-8859-1"
COLLATE "ISO-8859-1$en_US$primary"])
01-02 SelectionVectorRemover
01-03 Filter(condition=[OR(=($1, $3), =($0, $3))])
01-04 HashJoin(condition=[=($0, $2)], joinType=[inner])
01-06 Project(o_custkey=[$1], o_orderkey=[$0])
01-07 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpchmulti/orders]],
selectionRoot=/Users/asinha/data/tpchmulti/orders, columns=[SchemaPath
[`o_custkey`], SchemaPath [`o_orderkey`]]]])
01-05 BroadcastExchange
02-01 Project(c_custkey=[$2], c_nationkey=[$1], c_name=[$0])
02-02 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpchmulti/customer]],
selectionRoot=/Users/asinha/data/tpchmulti/customer, columns=[SchemaPath
[`c_custkey`], SchemaPath [`c_nationkey`], SchemaPath [`c_name`]]]])
0: jdbc:drill:zk=local> select cast(c.c_name as varchar(20)) from orders o,
customer c where o.o_custkey = c.c_custkey and (o.o_orderkey = c.c_nationkey or
o.o_custkey = c.c_nationkey);
+------------+
| EXPR$0 |
+------------+
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
| Customer#000000004 |
+------------+
31 rows selected
> Support non-equijoins as long as there is at least 1 equijoin condition
> between the same 2 tables
> -------------------------------------------------------------------------------------------------
>
> Key: DRILL-485
> URL: https://issues.apache.org/jira/browse/DRILL-485
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Fix For: 1.0.0-BETA1
>
>
> Currently, if the query has a non-equijoin between tables t1, t2 we throw an
> error that it is unsupported. However, if there is at least one equijoin
> between the tables t1, t2 in addition to the non-equijoin, then we should
> perform the join and then do a filter on top using the non-equijoin
> condition.
> example queries where this could be applicable:
> SELECT a1 FROM t1, t2 WHERE b1 = b2 AND c1 < c2;
> SELECT a1 FROM t1, t2 WHERE b1 = b2 AND (c1 = c2 OR d1 = d2);
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)