[ 
https://issues.apache.org/jira/browse/PHOENIX-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726691#comment-16726691
 ] 

chenglei edited comment on PHOENIX-4820 at 12/21/18 12:33 PM:
--------------------------------------------------------------

[~tdsilva], thank you for review. The {{Order By}} could be compiled out 
because {{Group By}} is executed by sort in {{ClientAggregatePlan}},just as 
following code in {{ClientAggregatePlan.iterator}}:

{code:java}
158    } else {
159                iterator = new OrderedResultIterator(iterator, 
keyExpressionOrderBy, thresholdBytes, null, null, 
projector.getEstimatedRowByteSize());
160                aggResultIterator = new 
ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator), 
serverAggregators, keyExpressions);
161              }
162       }
163        aggResultIterator = new 
GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
 clientAggregators);
164   }
{code:java}


So if the {{Order By}} matchs the order of {{Group By}}, the  {{Order By}} 
could be compiled out.

In fact, when we compile the expression If the statement has a {{Group By}}, we 
replace the expression matchs {{Group By}} expressions as 
RowKeyColumnExpression, just as following code in {{ExpressionCompiler}}:

{code:java}
296    private Expression wrapGroupByExpression(Expression expression) {
297             // If we're in an aggregate function, don't wrap a group by 
expression,
298              // since in that case we're aggregating over the 
regular/ungrouped
299              // column.
300             if (aggregateFunction == null) {
301                  int index = groupBy.getExpressions().indexOf(expression);
302                  if (index >= 0) {
303                        isAggregate = true;
304                        RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
305                        expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
306                  }
307              }
308             return expression;
309          }
{code:java}


BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when 
{{Order By}} could be compiled out, {{Group By}} is still  executed by sort, 
just as following code in {{ClientHashAggregatingResultIterator.next}}:

{{code:java}}
102     if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy == 
OrderBy.REV_ROW_KEY_ORDER_BY) {
103                 keyList = sortKeys();
104                 keyIterator = keyList.iterator();
105            } else {
106                keyIterator = hash.keySet().iterator();
107            }
{{code:java}}



was (Author: comnetwork):
[~tdsilva], thank you for review. The {{Order By}} could be compiled out 
because {{Group By}} is executed by sort in {{ClientAggregatePlan}},just as 
following code in {{ClientAggregatePlan.iterator}}:

{code}
158    } else {
159                iterator = new OrderedResultIterator(iterator, 
keyExpressionOrderBy, thresholdBytes, null, null, 
projector.getEstimatedRowByteSize());
160                aggResultIterator = new 
ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator), 
serverAggregators, keyExpressions);
161                }
162       }
163        aggResultIterator = new 
GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
 clientAggregators);
164   }
{code}


So if the {{Order By}} matchs the order of {{Group By}}, the  {{Order By}} 
could be compiled out.

In fact, when we compile the expression If the statement has a {{Group By}}, we 
replace the expression matchs {{Group By}} expressions as 
RowKeyColumnExpression, just as following code in {{ExpressionCompiler}}:

{code}
296    private Expression wrapGroupByExpression(Expression expression) {
297         // If we're in an aggregate function, don't wrap a group by 
expression,
298              // since in that case we're aggregating over the 
regular/ungrouped
299              // column.
300             if (aggregateFunction == null) {
301                  int index = groupBy.getExpressions().indexOf(expression);
302                  if (index >= 0) {
303                        isAggregate = true;
304                        RowKeyValueAccessor accessor = new 
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
305                        expression = new RowKeyColumnExpression(expression, 
accessor, groupBy.getKeyExpressions().get(index).getDataType());
306                  }
307              }
308             return expression;
309          }
{code}


BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when 
{{Order By}} could be compiled out, {{Group By}} is still  executed by sort, 
just as following code in {{ClientHashAggregatingResultIterator.next}}

{{code}}
102     if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy == 
OrderBy.REV_ROW_KEY_ORDER_BY) {
103                 keyList = sortKeys();
104                 keyIterator = keyList.iterator();
105            } else {
106                keyIterator = hash.keySet().iterator();
107            }
{{code}}


> Optimize OrderBy for ClientAggregatePlan
> ----------------------------------------
>
>                 Key: PHOENIX-4820
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4820
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.0
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>             Fix For: 4.15.0
>
>         Attachments: PHOENIX-4820-4.x-HBase-1.3.patch
>
>
> Given a table
> {code}
>   create table test ( 
>        pk1 varchar not null , 
>        pk2 varchar not null, 
>        pk3 varchar not null,
>         v1 varchar, 
>         v2 varchar, 
>         CONSTRAINT TEST_PK PRIMARY KEY ( 
>               pk1,
>               pk2,
>               pk3 ))
> {code}
> for following sql :
> {code}
>      select a.ak3 
>      from (select substr(pk1,1,1) ak1,substr(pk2,1,1) ak2,substr(pk3,1,1) 
> ak3,substr(v1,1,1) av1,substr(v2,1,1) av2 from test order by pk2,pk3 limit 
> 10) a  group by a.ak3,a.av1 order by a.ak3,a.av1
> {code}
> Intuitively, the above OrderBy statement {{order by a.ak3,a.av1}} should be 
> compiled out because it match the group by statement, but in fact it is not.
> The problem is caused by the {{QueryCompiler.compileSingleQuery}} and 
> {{QueryCompiler.compileSingleFlatQuery}},for  
> {{QueryCompiler.compileSingleQuery}} method,because the inner query has order 
> by, so in line 520, local variable {{isInRowKeyOrder}} is false:
> {code}
> 519        context.setCurrentTable(tableRef);
> 520        boolean isInRowKeyOrder = innerPlan.getGroupBy() == 
> GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
> {code}
> In {{QueryCompiler.compileSingleFlatQuery}},when {{OrderByCompiler.compile}} 
> method is invoked, the last parameter {{isInRowKeyOrder}} is false:
> {code}
> 562        OrderBy orderBy = OrderByCompiler.compile(context, select, 
> groupBy, limit, offset, projector,
> 563                groupBy == GroupBy.EMPTY_GROUP_BY ? 
> innerPlanTupleProjector : null, isInRowKeyOrder);
> {code}
> So in following line 156 for  {{OrderByCompiler.compile}},even though the 
> {{tracker.isOrderPreserving}} is true, the OrderBy  statement could not be 
> compiled out.
> {code}
> 156       if (isInRowKeyOrder && tracker.isOrderPreserving()) {
> {code}      
> In my opinion, with GroupBy, in following line 563 for 
> {{QueryCompiler.compileSingleFlatQuery}} method, when we call 
> {{OrderByCompiler.compile}} method, we no need to conside the 
> {{isInRowKeyOrder}},  just like the previous parameter  {{tupleProjector}} 
> does.
> {code}
>  562        OrderBy orderBy = OrderByCompiler.compile(context, select, 
> groupBy, limit, offset, projector,
>  563                groupBy == GroupBy.EMPTY_GROUP_BY ? 
> innerPlanTupleProjector : null, isInRowKeyOrder);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to