Jinfeng Ni created DRILL-2087:
---------------------------------
Summary: Some planner option setting might produce a wrong plan
and wrong query result for join query.
Key: DRILL-2087
URL: https://issues.apache.org/jira/browse/DRILL-2087
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Reporter: Jinfeng Ni
Assignee: Jinfeng Ni
I have the following parquet data ( essentially tpch's nation/region tables,
but break into multiple parts).
{code}
ls -l nation
total 24
-rw-r--r--@ 1 jni staff 1376 Jan 27 16:47 01.parquet
-rw-r--r--@ 1 jni staff 1400 Jan 27 16:47 02.parquet
-rw-r--r--@ 1 jni staff 1279 Jan 27 16:47 03.parquet
ls -l region
total 24
-rw-r--r--@ 1 jni staff 564 Jan 27 16:54 01.parquet
-rw-r--r--@ 1 jni staff 491 Jan 27 16:54 02.parquet
-rw-r--r--@ 1 jni staff 506 Jan 27 16:54 03.parquet
{code}
With the default planner setting, the following query produce the correct
result :
{code}
0: jdbc:drill:zk=local> select t1.n_nationkey from
dfs.`/Users/jni/work/data/parquet/region` t2 join
dfs.`/Users/jni/work/data/parquet/nation` t1 on t1.n_regionkey = t2.r_regionkey;
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
+-------------+
| n_nationkey |
+-------------+
| 18 |
| 12 |
| 9 |
| 8 |
| 21 |
| 19 |
| 7 |
| 6 |
| 23 |
| 22 |
| 20 |
| 13 |
| 11 |
| 10 |
| 4 |
| 0 |
| 16 |
| 15 |
| 14 |
| 5 |
| 24 |
| 17 |
| 3 |
| 2 |
| 1 |
+-------------+
25 rows selected (2.057 seconds)
{code}
However, if I modify the planner option (`planner.slice_target` and
`planner.join.row_count_estimate_factor` ) as following, then the same query
would return 0 row, which is not correct.
{code}
0: jdbc:drill:zk=local> alter session set `planner.slice_target` = 10;
+------------+------------+
| ok | summary |
+------------+------------+
| true | planner.slice_target updated. |
+------------+------------+
1 row selected (0.093 seconds)
0: jdbc:drill:zk=local> alter session set
`planner.join.row_count_estimate_factor` = 0.1;
+------------+------------+
| ok | summary |
+------------+------------+
| true | planner.join.row_count_estimate_factor updated. |
+------------+------------+
1 row selected (0.041 seconds)
0: jdbc:drill:zk=local> select t1.n_nationkey from
dfs.`/Users/jni/work/data/parquet/region` t2 join
dfs.`/Users/jni/work/data/parquet/nation` t1 on t1.n_regionkey = t2.r_regionkey;
+-------------+
| n_nationkey |
+-------------+
+-------------+
No rows selected (0.71 seconds)
{code}
If we look at explain plan result, after changing the planner option, we will
see for the JOIN operator, the LEFT does not have any EXCHANGE operator, while
the RIGHT has HashToRandomExchange operator. That seems to be not a valid plan.
{code}
00-00 Screen
00-01 UnionExchange
01-01 Project(n_nationkey=[$2])
01-02 HashJoin(condition=[=($1, $0)], joinType=[inner])
01-04 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=file:/Users/jni/work/data/parquet/region]],
selectionRoot=/Users/jni/work/data/parquet/region, numFiles=1,
columns=[`r_regionkey`]]])
01-03 HashToRandomExchange(dist0=[[$0]])
02-01 Project(n_regionkey=[$1], n_nationkey=[$0])
02-02 Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=file:/Users/jni/work/data/parquet/nation]],
selectionRoot=/Users/jni/work/data/parquet/nation, numFiles=1,
columns=[`n_regionkey`, `n_nationkey`]]])
{code}
The cause of this problem seems to be that Drill will remove EXCHANGE operator
under some conditions. That condition does not guarantee that a JOIN operator
always either has EXCHANGE on both sides, or none.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)