[
https://issues.apache.org/jira/browse/PHOENIX-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
chenglei updated PHOENIX-5148:
------------------------------
Description:
Given a table
{code:java}
create table test (
pk1 varchar not null ,
pk2 varchar not null,
pk3 varchar not null,
v1 varchar,
v2 varchar,
CONSTRAINT TEST_PK PRIMARY KEY (
pk1,
pk2,
pk3 ))
{code}
Consider following three cases:
*1. OrderBy of ClientScanPlan*
for sql:
{code:java}
select v1 from (select v1,v2,pk3 from test t where pk1 = '6' order by
t.v2,t.pk3,t.v1 limit 10) a order by v2,pk3
{code}
Obviously, the outer {{OrderBy}} "order by v2,pk3" should be compiled out
because it matchs the inner query {{OrderBy}} "order by t.v2,t.pk3,t.v1" , but
unfortunately it is not compiled out.
*2. GroupBy of ClientAggregatePlan*
for sql :
{code:java}
select v1 from (select v1,pk2,pk1 from test t where pk1 = '6' order by
t.pk2,t.v1,t.pk1 limit 10) a group by pk2,v1
{code}
Obviously, the outer {{GroupBy}} "group by pk2,v1" should be orderPreserving
because it matchs the inner query {{OrderBy}} "order by t.pk2,t.v1,t.pk1" , but
unfortunately the {{isOrderPreserving()}} of outer {{GroupBy}} return false.
*3. OrderBy of SortMergeJoinPlan(from PHOENIX-4618)*
for sql:
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b{code}
The result of the sort-merge-join is sorted on (T1.a, T1.b) and (T2.a, T2.b) at
the same time.
Thus, both 1)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T1.a,
T1.b{code}
and 2)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T2.a,
T2.b{code}
should avoid doing an extra order-by after the sort-merge-join operation.
All the above three cases are caused by the same problem that the
{{OrderPreservingTracker}} relies solely on primary keys for inferring
alignment between the target {{OrderByExpression}} and the source sortedness.
For following {{QueryCompiler.compileSingleQuery}}, because the inner query of
above cases has {{OrderBy}}, so in line 534, local variable {{isInRowKeyOrder}}
is false:
{code:java}
518 protected QueryPlan compileSingleQuery(StatementContext context,
SelectStatement select, List<Object> binds, boolean asSubquery, boolean
allowPageFilter) throws SQLException{
519 SelectStatement innerSelect = select.getInnerSelectStatement();
520 if (innerSelect == null) {
521 return compileSingleFlatQuery(context, select, binds,
asSubquery, allowPageFilter, null, null, true);
522 }
523
524 QueryPlan innerPlan = compileSubquery(innerSelect, false);
525 TupleProjector tupleProjector = new
TupleProjector(innerPlan.getProjector());
526 innerPlan = new TupleProjectionPlan(innerPlan, tupleProjector,
null);
527
528 // Replace the original resolver and table with those having
compiled type info.
529 TableRef tableRef = context.getResolver().getTables().get(0);
530 ColumnResolver resolver =
FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(),
tableRef, innerPlan.getProjector());
531 context.setResolver(resolver);
532 tableRef = resolver.getTables().get(0);
533 context.setCurrentTable(tableRef);
534 boolean isInRowKeyOrder = innerPlan.getGroupBy() ==
GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
535
536 return compileSingleFlatQuery(context, select, binds, asSubquery,
allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
537 }
{code}
was:
Given a table
{code:java}
create table test (
pk1 varchar not null ,
pk2 varchar not null,
pk3 varchar not null,
v1 varchar,
v2 varchar,
CONSTRAINT TEST_PK PRIMARY KEY (
pk1,
pk2,
pk3 ))
{code}
Consider following three cases:
*1. OrderBy of ClientScanPlan*
for sql:
{code:java}
select v1 from (select v1,v2,pk3 from test t where pk1 = '6' order by
t.v2,t.pk3,t.v1 limit 10) a order by v2,pk3
{code}
Obviously, the outer {{OrderBy}} "order by v2,pk3" should be compiled out
because it matchs the inner query {{OrderBy}} "order by t.v2,t.pk3,t.v1" , but
unfortunately it is not compiled out.
*2. GroupBy of ClientAggregatePlan*
for sql :
{code:java}
select v1 from (select v1,pk2,pk1 from test t where pk1 = '6' order by
t.pk2,t.v1,t.pk1 limit 10) a group by pk2,v1
{code}
Obviously, the outer {{GroupBy}} "group by pk2,v1" should be orderPreserving
because it matchs the inner query {{OrderBy}} "order by t.pk2,t.v1,t.pk1" , but
unfortunately the {{isOrderPreserving()}} of outer {{GroupBy}} return false.
*3. OrderBy of SortMergeJoinPlan(from PHOENIX-4618)*
for sql:
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b{code}
The result of the sort-merge-join is sorted on (T1.a, T1.b) and (T2.a, T2.b) at
the same time.
Thus, both 1)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T1.a,
T1.b{code}
and 2)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T2.a,
T2.b{code}
should avoid doing an extra order-by after the sort-merge-join operation.
All the above three cases are caused by the same problem that the
{{OrderPreservingTracker}} relies solely on primary keys for inferring
alignment between the target \{{OrderByExpression}}s and the source sortedness.
> Improve OrderPreservingTracker to optimize OrderBy/GroupBy for ClientScanPlan
> and ClientAggregatePlan
> -----------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5148
> URL: https://issues.apache.org/jira/browse/PHOENIX-5148
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.14.1
> Reporter: chenglei
> Priority: Major
>
> Given a table
> {code:java}
> create table test (
> pk1 varchar not null ,
> pk2 varchar not null,
> pk3 varchar not null,
> v1 varchar,
> v2 varchar,
> CONSTRAINT TEST_PK PRIMARY KEY (
> pk1,
> pk2,
> pk3 ))
> {code}
> Consider following three cases:
> *1. OrderBy of ClientScanPlan*
> for sql:
> {code:java}
> select v1 from (select v1,v2,pk3 from test t where pk1 = '6' order by
> t.v2,t.pk3,t.v1 limit 10) a order by v2,pk3
> {code}
> Obviously, the outer {{OrderBy}} "order by v2,pk3" should be compiled out
> because it matchs the inner query {{OrderBy}} "order by t.v2,t.pk3,t.v1" ,
> but unfortunately it is not compiled out.
> *2. GroupBy of ClientAggregatePlan*
> for sql :
> {code:java}
> select v1 from (select v1,pk2,pk1 from test t where pk1 = '6' order by
> t.pk2,t.v1,t.pk1 limit 10) a group by pk2,v1
> {code}
> Obviously, the outer {{GroupBy}} "group by pk2,v1" should be orderPreserving
> because it matchs the inner query {{OrderBy}} "order by t.pk2,t.v1,t.pk1" ,
> but unfortunately the {{isOrderPreserving()}} of outer {{GroupBy}} return
> false.
> *3. OrderBy of SortMergeJoinPlan(from PHOENIX-4618)*
> for sql:
> {code:java}
> SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b{code}
> The result of the sort-merge-join is sorted on (T1.a, T1.b) and (T2.a, T2.b)
> at the same time.
> Thus, both 1)
> {code:java}
> SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T1.a,
> T1.b{code}
> and 2)
> {code:java}
> SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T2.a,
> T2.b{code}
> should avoid doing an extra order-by after the sort-merge-join operation.
>
> All the above three cases are caused by the same problem that the
> {{OrderPreservingTracker}} relies solely on primary keys for inferring
> alignment between the target {{OrderByExpression}} and the source sortedness.
> For following {{QueryCompiler.compileSingleQuery}}, because the inner query
> of above cases has {{OrderBy}}, so in line 534, local variable
> {{isInRowKeyOrder}} is false:
> {code:java}
> 518 protected QueryPlan compileSingleQuery(StatementContext context,
> SelectStatement select, List<Object> binds, boolean asSubquery, boolean
> allowPageFilter) throws SQLException{
> 519 SelectStatement innerSelect = select.getInnerSelectStatement();
> 520 if (innerSelect == null) {
> 521 return compileSingleFlatQuery(context, select, binds,
> asSubquery, allowPageFilter, null, null, true);
> 522 }
> 523
> 524 QueryPlan innerPlan = compileSubquery(innerSelect, false);
> 525 TupleProjector tupleProjector = new
> TupleProjector(innerPlan.getProjector());
> 526 innerPlan = new TupleProjectionPlan(innerPlan, tupleProjector,
> null);
> 527
> 528 // Replace the original resolver and table with those having
> compiled type info.
> 529 TableRef tableRef = context.getResolver().getTables().get(0);
> 530 ColumnResolver resolver =
> FromCompiler.getResolverForCompiledDerivedTable(statement.getConnection(),
> tableRef, innerPlan.getProjector());
> 531 context.setResolver(resolver);
> 532 tableRef = resolver.getTables().get(0);
> 533 context.setCurrentTable(tableRef);
> 534 boolean isInRowKeyOrder = innerPlan.getGroupBy() ==
> GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
> 535
> 536 return compileSingleFlatQuery(context, select, binds,
> asSubquery, allowPageFilter, innerPlan, tupleProjector, isInRowKeyOrder);
> 537 }
> {code}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)