Hi All,
Filter is seen on top of Project in query plan for a null equality join.
This is over CSV data, shouldn't the filter appear below the project in the
query plan ?
I am on Drill 1.9.0 git commit id: a29f1e29
Note : t1 has some nulls in it
t2 does not have any nulls in it.
{noformat}
0: jdbc:drill:schema=dfs.tmp> explain plan for
select * from `oneColDupsWnulls.csv` t1 JOIN `oneColWOnulls.csv` t2
ON t1.columns[0] = t2.columns[0]
WHERE t1.columns[0] IS NOT DISTINCT FROM t2.columns[0]
OR ( t1.columns[0] IS NULL AND t2.columns[0] IS NULL );
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 ProjectAllowDup(*=[$0], *0=[$1])
00-02 Project(T43¦¦*=[$0], T44¦¦*=[$2])
00-03 SelectionVectorRemover
00-04 Filter(condition=[OR(CAST(CASE(IS NULL(ITEM($1, 0)), IS
NULL(ITEM($3, 0)), IS NULL(ITEM($3, 0)), IS NULL(ITEM($1, 0)), =(ITEM($1,
0), ITEM($3, 0)))):BOOLEAN NOT NULL, AND(IS NULL(ITEM($1, 0)), IS
NULL(ITEM($3, 0))))])
00-05 Project(T43¦¦*=[$0], columns=[$1], T44¦¦*=[$3],
columns0=[$4])
00-06 HashJoin(condition=[=($2, $5)], joinType=[inner])
00-07 Project(T44¦¦*=[$0], columns0=[$1], $f20=[$2])
00-09 Project(T44¦¦*=[$0], columns=[$1], $f2=[ITEM($1,
0)])
00-11 Project(T44¦¦*=[$0], columns=[$1])
00-13 Scan(groupscan=[EasyGroupScan
[selectionRoot=maprfs:/tmp/oneColWOnulls.csv, numFiles=1, columns=[`*`],
files=[maprfs:///tmp/oneColWOnulls.csv]]])
00-08 Project(T43¦¦*=[$0], columns=[$1], $f2=[ITEM($1, 0)])
00-10 Project(T43¦¦*=[$0], columns=[$1])
00-12 Scan(groupscan=[EasyGroupScan
[selectionRoot=maprfs:/tmp/oneColDupsWnulls.csv, numFiles=1, columns=[`*`],
files=[maprfs:///tmp/oneColDupsWnulls.csv]]])
{noformat}
Results returned by query
{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from `oneColDupsWnulls.csv` t1 JOIN
`oneColWOnulls.csv` t2 ON t1.columns[0] = t2.columns[0] WHERE t1.columns[0]
IS NOT DISTINCT FROM t2.columns[0] OR ( t1.columns[0] IS NULL AND
t2.columns[0] IS NULL );
+-------------+-------------+
| columns | columns0 |
+-------------+-------------+
| ["test"] | ["test"] |
| ["foo"] | ["foo"] |
| ["foo"] | ["foo"] |
| ["bar"] | ["bar"] |
| ["yes"] | ["yes"] |
| ["yes"] | ["yes"] |
| ["no"] | ["no"] |
| ["no"] | ["no"] |
| ["foobar"] | ["foobar"] |
| ["foobar"] | ["foobar"] |
| ["never"] | ["never"] |
| ["never"] | ["never"] |
| ["ever"] | ["ever"] |
| ["ever"] | ["ever"] |
| ["here"] | ["here"] |
| ["there"] | ["there"] |
| ["no"] | ["no"] |
| ["no"] | ["no"] |
| ["yes"] | ["yes"] |
| ["yes"] | ["yes"] |
| ["foobar"] | ["foobar"] |
| ["foobar"] | ["foobar"] |
| ["temp"] | ["temp"] |
+-------------+-------------+
23 rows selected (0.341 seconds)
{noformat}
Thanks,
Khurram