[
https://issues.apache.org/jira/browse/DRILL-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020930#comment-14020930
]
Ramana Inukonda Nagaraj edited comment on DRILL-928 at 6/8/14 12:31 AM:
------------------------------------------------------------------------
git.commit.id.abbrev=b5ba202
{code}
Physical plan without merge join being forced (works)
00-00 Screen
00-01   StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02     UnionExchange
01-01       Project($f0=[0])
01-02         HashJoin(condition=[=($4, $2)], joinType=[inner])
01-04           HashToRandomExchange(dist0=[[$2]])
02-01             HashJoin(condition=[=($0, $1)], joinType=[inner])
02-03               HashToRandomExchange(dist0=[[$0]])
04-01                 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/supplier]], selectionRoot=/drill/testdata/tpch-multi/supplier, columns=[SchemaPath [`s_suppkey`]]]])
02-02               HashToRandomExchange(dist0=[[$0]])
05-01                 Filter(condition=[AND(>=($2, 1995-01-01), <=($2, 1995-12-31))])
05-02                   Project(l_suppkey=[$0], l_orderkey=[$2], l_shipdate=[$1])
05-03                     Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/lineitem]], selectionRoot=/drill/testdata/tpch-multi/lineitem, columns=[SchemaPath [`l_suppkey`], SchemaPath [`l_orderkey`], SchemaPath [`l_shipdate`]]]])
01-03           HashToRandomExchange(dist0=[[$0]])
03-01             Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/orders]], selectionRoot=/drill/testdata/tpch-multi/orders, columns=[SchemaPath [`o_orderkey`]]]])
{code}
{code}
Physical plan with merge join forced
00-00 Screen
00-01   StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02     UnionExchange
01-01       Project($f0=[0])
01-02         MergeJoin(condition=[=($4, $2)], joinType=[inner])
01-04           SelectionVectorRemover
01-06             Sort(sort0=[$2], dir0=[ASC])
01-08               HashToRandomExchange(dist0=[[$2]])
02-01                 MergeJoin(condition=[=($0, $1)], joinType=[inner])
02-03                   SelectionVectorRemover
02-05                     Sort(sort0=[$0], dir0=[ASC])
02-07                       HashToRandomExchange(dist0=[[$0]])
04-01                         Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/supplier]], selectionRoot=/drill/testdata/tpch-multi/supplier, columns=[SchemaPath [`s_suppkey`]]]])
02-02                   SelectionVectorRemover
02-04                     Sort(sort0=[$0], dir0=[ASC])
02-06                       HashToRandomExchange(dist0=[[$0]])
05-01                         Filter(condition=[AND(>=($2, 1995-01-01), <=($2, 1995-12-31))])
05-02                           Project(l_suppkey=[$0], l_orderkey=[$2], l_shipdate=[$1])
05-03                             Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/lineitem]], selectionRoot=/drill/testdata/tpch-multi/lineitem, columns=[SchemaPath [`l_suppkey`], SchemaPath [`l_orderkey`], SchemaPath [`l_shipdate`]]]])
01-03           SelectionVectorRemover
01-05             Sort(sort0=[$0], dir0=[ASC])
01-07               HashToRandomExchange(dist0=[[$0]])
03-01                 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/tpch-multi/orders]], selectionRoot=/drill/testdata/tpch-multi/orders, columns=[SchemaPath [`o_orderkey`]]]])
{code}
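To make the difference between the two plans explicit, here is a minimal sketch that counts operators in each plan dump and reports which ones differ. It is only an illustration: the helper names (operator_counts, plan_diff) and the constants HASH_PLAN and MERGE_PLAN are hypothetical, and the plan text is an abridged excerpt of the plans above (the scan, filter and inner exchange fragments are omitted), not the full output.

```python
import re
from collections import Counter

# Matches the start of a plan line such as "01-02 MergeJoin(...)" or "00-00 Screen".
# Wrapped continuation lines (e.g. "[entries=...") do not match and are skipped.
OPERATOR_LINE = re.compile(r"^\s*(\d{2}-\d{2})\s+([A-Za-z]+)")


def operator_counts(plan_text):
    """Count operators by name in a textual physical plan."""
    counts = Counter()
    for line in plan_text.splitlines():
        match = OPERATOR_LINE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def plan_diff(before, after):
    """Return (operators dropped from `before`, operators added in `after`)."""
    a, b = operator_counts(before), operator_counts(after)
    # Counter subtraction keeps only positive counts.
    return a - b, b - a


# Abridged excerpts of the two plans in this comment (assumption: trimmed for brevity).
HASH_PLAN = """\
00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02 UnionExchange
01-01 Project($f0=[0])
01-02 HashJoin(condition=[=($4, $2)], joinType=[inner])
01-04 HashToRandomExchange(dist0=[[$2]])
02-01 HashJoin(condition=[=($0, $1)], joinType=[inner])
01-03 HashToRandomExchange(dist0=[[$0]])
"""

MERGE_PLAN = """\
00-00 Screen
00-01 StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02 UnionExchange
01-01 Project($f0=[0])
01-02 MergeJoin(condition=[=($4, $2)], joinType=[inner])
01-04 SelectionVectorRemover
01-06 Sort(sort0=[$2], dir0=[ASC])
01-08 HashToRandomExchange(dist0=[[$2]])
02-01 MergeJoin(condition=[=($0, $1)], joinType=[inner])
01-03 SelectionVectorRemover
01-05 Sort(sort0=[$0], dir0=[ASC])
01-07 HashToRandomExchange(dist0=[[$0]])
"""

if __name__ == "__main__":
    removed, added = plan_diff(HASH_PLAN, MERGE_PLAN)
    print("removed:", dict(removed))
    print("added:", dict(added))
```

On these excerpts the diff reports HashJoin as removed and MergeJoin, Sort and SelectionVectorRemover as added, while the exchanges are unchanged. That narrows the candidates for the failure to the operators the forced plan introduces.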
> Regression: Forcing a merge join instead of a hash join results in a Failure
> while reading vector. Expected vector class of
> org.apache.drill.exec.vector.BigIntVector but was holding vector class
> org.apache.drill.exec.vector.VarBinaryVector.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: DRILL-928
> URL: https://issues.apache.org/jira/browse/DRILL-928
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Ramana Inukonda Nagaraj
> Priority: Critical
>
> Test case:
> alter session set `planner.enable_hashjoin` = false;
> select count(*)
> from supplier s, lineitem l, orders o
> where s.s_suppkey = l.l_suppkey
> and o.o_orderkey = l.l_orderkey
> and l.l_shipdate between date '1995-01-01' and date '1995-12-31' ;
> If the alter session statement is removed, the query works fine, which leads
> me to believe the problem is in either the merge join or the sort that
> precedes the merge join.