[
https://issues.apache.org/jira/browse/PHOENIX-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934214#comment-15934214
]
chenglei edited comment on PHOENIX-3745 at 3/21/17 2:11 PM:
------------------------------------------------------------
{quote}
1. Checking if the sub-query is already ordered on the join key might be
redundant, since that should be taken care of by the sub-query compilation
itself later on. So could you please verify?
{quote}
After verifying I think checking if the sub-query is already ordered on the
join key is not redundant , just see following unit tests:
{noformat}
@Test
1 public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws
Exception {
2 Connection conn = null;
3 try {
4 conn= DriverManager.getConnection(getUrl());
5
6 String tableName1="MERGE1";
7 String tableName2="MERGE2";
8
9 conn.createStatement().execute("DROP TABLE if exists
"+tableName1);
10
11 String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+
12 "AID INTEGER PRIMARY KEY,"+
13 "AGE INTEGER"+
14 ")";
15 conn.createStatement().execute(sql);
16 conn.createStatement().execute("DROP TABLE if exists
"+tableName2);
17 sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+
18 "BID INTEGER PRIMARY KEY,"+
19 "CODE INTEGER"+
20 ")";
21 conn.createStatement().execute(sql);
22
23 sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select
aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a
inner join"+
"(select bid,code from "+tableName2+" order by bid,code limit 1) b on
a.aid=b.bid ";
24
25 QueryPlan queryPlan=getQueryPlan(conn, sql);
26 SortMergeJoinPlan
sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate();
27
28 ClientScanPlan
rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate();
29 OrderBy orderBy=rhsOuterPlan.getOrderBy();
30 assertTrue(orderBy.getOrderByExpressions().isEmpty());
31 }
32 finally {
33 if(conn!=null) {
34 conn.close();
35 }
36 }
37 }
{noformat}
the line 30 in above code is failed after I remove the
SubselectRewriter.isOrderByPrefix method in my patch, the RHS:
"(select bid,code from (select bid,code from merge2 order by bid,code limit 1)
order by bid)" still has {{order by bid}} after compilation even if the inner
query " (select bid,code from merge2 order by bid,code limit 1)". is already
ordered on the bid.
The problem is caused by following line 520 in
QueryCompiler.compileSingleQuery method,the judgment of {{isInRowKeyOrder}} is
too coarse, and obviously inner query " (select bid,code from merge2 order by
bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer
{{order by bid}} can not be compiled out. May be we can file a new JIRA to
improve it :
{code}
515 TableRef tableRef = context.getResolver().getTables().get(0);
516 ColumnResolver resolver =
FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(),
tableRef, innerPlan.getProjector());
517 context.setResolver(resolver);
518 tableRef = resolver.getTables().get(0);
519 context.setCurrentTable(tableRef);
520 boolean isInRowKeyOrder = innerPlan.getGroupBy() ==
GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
521
522 return compileSingleFlatQuery(context, select, binds, asSubquery,
allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
{code}
\\
{quote}
2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is
if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use
assert or Preconditions check instead?
{quote}
Yes, the {{subselectAsTableNode}} is always a {{DerivedTableNode}}, we indeed
should use assert instead of if.
I uploaded my second patch with two minor modifications:
1. change the if statement to "assert subselectAsTableNode instanceof
DerivedTableNode".
2. add more unit tests and IT tests for checking if the sub-query is already
ordered on the join key.
I also create a new pull request,please help me have a review, thanks.
was (Author: comnetwork):
{quote}
1. Checking if the sub-query is already ordered on the join key might be
redundant, since that should be taken care of by the sub-query compilation
itself later on. So could you please verify?
{quote}
I think checking if the sub-query is already ordered on the join key is not
redundant after verifying , just see following unit tests:
{noformat}
@Test
1 public void testSortMergeJoinSubQueryBugForOrderByPrefix() throws
Exception {
2 Connection conn = null;
3 try {
4 conn= DriverManager.getConnection(getUrl());
5
6 String tableName1="MERGE1";
7 String tableName2="MERGE2";
8
9 conn.createStatement().execute("DROP TABLE if exists
"+tableName1);
10
11 String sql="CREATE TABLE IF NOT EXISTS "+tableName1+" ( "+
12 "AID INTEGER PRIMARY KEY,"+
13 "AGE INTEGER"+
14 ")";
15 conn.createStatement().execute(sql);
16 conn.createStatement().execute("DROP TABLE if exists
"+tableName2);
17 sql="CREATE TABLE IF NOT EXISTS "+tableName2+" ( "+
18 "BID INTEGER PRIMARY KEY,"+
19 "CODE INTEGER"+
20 ")";
21 conn.createStatement().execute(sql);
22
23 sql="select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from (select
aid,age from "+tableName1+" where age >=11 and age<=33 order by age limit 3) a
inner join"+
"(select bid,code from "+tableName2+" order by bid,code limit 1) b on
a.aid=b.bid ";
24
25 QueryPlan queryPlan=getQueryPlan(conn, sql);
26 SortMergeJoinPlan
sortMergeJoinPlan=(SortMergeJoinPlan)((ClientScanPlan)queryPlan).getDelegate();
27
28 ClientScanPlan
rhsOuterPlan=(ClientScanPlan)((TupleProjectionPlan)(sortMergeJoinPlan.getRhsPlan())).getDelegate();
29 OrderBy orderBy=rhsOuterPlan.getOrderBy();
30 assertTrue(orderBy.getOrderByExpressions().isEmpty());
31 }
32 finally {
33 if(conn!=null) {
34 conn.close();
35 }
36 }
37 }
{noformat}
the line 30 in above code is failed after I remove the
SubselectRewriter.isOrderByPrefix method in my patch, the RHS:
"(select bid,code from (select bid,code from merge2 order by bid,code limit 1)
order by bid)" still has {{order by bid}} after compilation even if the inner
query " (select bid,code from merge2 order by bid,code limit 1)". is already
ordered on the bid.
The problem is caused by following line 520 in
QueryCompiler.compileSingleQuery method,the judgment of {{isInRowKeyOrder}} is
too coarse, and obviously inner query " (select bid,code from merge2 order by
bid,code limit 1)" is not {{isInRowKeyOrder}} by this judgment ,so the outer
{{order by bid}} can not be compiled out. May be we can file a new JIRA to
improve it :
{code}
515 TableRef tableRef = context.getResolver().getTables().get(0);
516 ColumnResolver resolver =
FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(),
tableRef, innerPlan.getProjector());
517 context.setResolver(resolver);
518 tableRef = resolver.getTables().get(0);
519 context.setCurrentTable(tableRef);
520 boolean isInRowKeyOrder = innerPlan.getGroupBy() ==
GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
521
522 return compileSingleFlatQuery(context, select, binds, asSubquery,
allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
{code}
\\
{quote}
2. Shouldn't the subselectAsTableNode always be a DerivedTableNode? Is
if(subselectAsTableNode instanceof DerivedTableNode) necessary? Shall we use
assert or Preconditions check instead?
{quote}
Yes, the {{subselectAsTableNode}} always is a {{DerivedTableNode}}, we indeed
should use assert instead of if.
I uploaded my second version patch, the modifications of patch are minor:
1. change the if statement to "assert subselectAsTableNode instanceof
DerivedTableNode".
2. add more unit tests and IT tests for Checking if the sub-query is already
ordered on the join key.
> SortMergeJoin might incorrectly override the OrderBy of LHS or RHS
> ------------------------------------------------------------------
>
> Key: PHOENIX-3745
> URL: https://issues.apache.org/jira/browse/PHOENIX-3745
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Reporter: chenglei
> Assignee: chenglei
> Attachments: PHOENIX-3745_v2.patch
>
>
> Let us look a simple test case:
> h4. 1. Create two tables
> {noformat}
> CREATE TABLE IF NOT EXISTS MERGE1 (
> AID INTEGER PRIMARY KEY
> AGE INTEGER
> );
> CREATE TABLE IF NOT EXISTS MERGE2 (
> BID INTEGER PRIMARY KEY,
> CODE INTEGER
> );
> {noformat}
> h4. 2. Upsert values
> {noformat}
> UPSERT INTO MERGE1(AID,AGE) VALUES (1,11);
> UPSERT INTO MERGE1(AID,AGE) VALUES (2,22);
> UPSERT INTO MERGE1 (AID,AGE) VALUES (3,33);
> UPSERT INTO MERGE2 (BID,CODE) VALUES (1,66);
> UPSERT INTO MERGE2 (BID,CODE) VALUES (2,55);
> UPSERT INTO MERGE2 (BID,CODE) VALUES (3,44);
> {noformat}
> h4. 3. Execute query
> {noformat}
> select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from
> (select aid,age from merge1 where age >=11 and age<=33) a inner join
> (select bid,code from merge2 order by code limit 1) b on a.aid=b.bid
> {noformat}
> h4. (/) Expected result
> {noformat}
> 3,44
> {noformat}
> h4. (!) Incorrect actual result
> {noformat}
> 1,66
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)