Ok, if there is not much of a performance gain as suggested in the
responses, then there is not much that can be done in this case.

Thanks,
Khurram

On Thu, Oct 27, 2016 at 3:41 AM, Jinfeng Ni <[email protected]> wrote:

> Also, if the project operator is not doing any expression evaluation,
> the project operator itself would not introduce any big overhead.
> There probably is no big benefit if we push filter past the project
> operator, as Zelaine said.
>
>
> On Wed, Oct 26, 2016 at 3:07 PM, Jinfeng Ni <[email protected]> wrote:
> > The project under filter is for dynamic expansion of * column. Since
> > the join filter is referring to columns in the project's output, it's
> > not possible to push filter past that project.
> >
> >
> >
> > On Wed, Oct 26, 2016 at 1:50 PM, Zelaine Fong <[email protected]>
> wrote:
> >> The filter, I assume you' referring to, is a join filter.  So, at a
> >> minimum, it needs to be applied after the hash join.  I'm not sure
> there's
> >> a lot of benefit in pushing that filter past the project that's on top
> of
> >> the hash join.
> >>
> >> -- Zelaine
> >>
> >> On Wed, Oct 26, 2016 at 8:59 AM, Khurram Faraaz <[email protected]>
> >> wrote:
> >>
> >>> Hi All,
> >>>
> >>> Filter is seen on top of Project in query plan for a null equality
> join.
> >>> This is over CSV data, shouldn't the filter appear below the project
> in the
> >>> query plan ?
> >>> I am on Drill 1.9.0 git commit id: a29f1e29
> >>>
> >>> Note : t1 has some nulls in it
> >>>            t2 does not have any nulls in it.
> >>>
> >>> {noformat}
> >>> 0: jdbc:drill:schema=dfs.tmp> explain plan for
> >>> select * from `oneColDupsWnulls.csv` t1 JOIN `oneColWOnulls.csv` t2
> >>> ON t1.columns[0] = t2.columns[0]
> >>> WHERE t1.columns[0] IS NOT DISTINCT FROM t2.columns[0]
> >>> OR ( t1.columns[0] IS NULL AND t2.columns[0] IS NULL );
> >>> +------+------+
> >>> | text | json |
> >>> +------+------+
> >>> | 00-00    Screen
> >>> 00-01      ProjectAllowDup(*=[$0], *0=[$1])
> >>> 00-02        Project(T43¦¦*=[$0], T44¦¦*=[$2])
> >>> 00-03          SelectionVectorRemover
> >>> 00-04            Filter(condition=[OR(CAST(CASE(IS NULL(ITEM($1, 0)),
> IS
> >>> NULL(ITEM($3, 0)), IS NULL(ITEM($3, 0)), IS NULL(ITEM($1, 0)),
> =(ITEM($1,
> >>> 0), ITEM($3, 0)))):BOOLEAN NOT NULL, AND(IS NULL(ITEM($1, 0)), IS
> >>> NULL(ITEM($3, 0))))])
> >>> 00-05              Project(T43¦¦*=[$0], columns=[$1], T44¦¦*=[$3],
> >>> columns0=[$4])
> >>> 00-06                HashJoin(condition=[=($2, $5)], joinType=[inner])
> >>> 00-07                  Project(T44¦¦*=[$0], columns0=[$1], $f20=[$2])
> >>> 00-09                    Project(T44¦¦*=[$0], columns=[$1],
> $f2=[ITEM($1,
> >>> 0)])
> >>> 00-11                      Project(T44¦¦*=[$0], columns=[$1])
> >>> 00-13                        Scan(groupscan=[EasyGroupScan
> >>> [selectionRoot=maprfs:/tmp/oneColWOnulls.csv, numFiles=1,
> columns=[`*`],
> >>> files=[maprfs:///tmp/oneColWOnulls.csv]]])
> >>> 00-08                  Project(T43¦¦*=[$0], columns=[$1], $f2=[ITEM($1,
> >>> 0)])
> >>> 00-10                    Project(T43¦¦*=[$0], columns=[$1])
> >>> 00-12                      Scan(groupscan=[EasyGroupScan
> >>> [selectionRoot=maprfs:/tmp/oneColDupsWnulls.csv, numFiles=1,
> >>> columns=[`*`],
> >>> files=[maprfs:///tmp/oneColDupsWnulls.csv]]])
> >>> {noformat}
> >>>
> >>> Results returned by query
> >>>
> >>> {noformat}
> >>> 0: jdbc:drill:schema=dfs.tmp> select * from `oneColDupsWnulls.csv` t1
> JOIN
> >>> `oneColWOnulls.csv` t2 ON t1.columns[0] = t2.columns[0] WHERE
> t1.columns[0]
> >>> IS NOT DISTINCT FROM t2.columns[0] OR ( t1.columns[0] IS NULL AND
> >>> t2.columns[0] IS NULL );
> >>> +-------------+-------------+
> >>> |   columns   |  columns0   |
> >>> +-------------+-------------+
> >>> | ["test"]    | ["test"]    |
> >>> | ["foo"]     | ["foo"]     |
> >>> | ["foo"]     | ["foo"]     |
> >>> | ["bar"]     | ["bar"]     |
> >>> | ["yes"]     | ["yes"]     |
> >>> | ["yes"]     | ["yes"]     |
> >>> | ["no"]      | ["no"]      |
> >>> | ["no"]      | ["no"]      |
> >>> | ["foobar"]  | ["foobar"]  |
> >>> | ["foobar"]  | ["foobar"]  |
> >>> | ["never"]   | ["never"]   |
> >>> | ["never"]   | ["never"]   |
> >>> | ["ever"]    | ["ever"]    |
> >>> | ["ever"]    | ["ever"]    |
> >>> | ["here"]    | ["here"]    |
> >>> | ["there"]   | ["there"]   |
> >>> | ["no"]      | ["no"]      |
> >>> | ["no"]      | ["no"]      |
> >>> | ["yes"]     | ["yes"]     |
> >>> | ["yes"]     | ["yes"]     |
> >>> | ["foobar"]  | ["foobar"]  |
> >>> | ["foobar"]  | ["foobar"]  |
> >>> | ["temp"]    | ["temp"]    |
> >>> +-------------+-------------+
> >>> 23 rows selected (0.341 seconds)
> >>> {noformat}
> >>>
> >>> Thanks,
> >>> Khurram
> >>>
>

Reply via email to