[
https://issues.apache.org/jira/browse/PHOENIX-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726691#comment-16726691
]
chenglei edited comment on PHOENIX-4820 at 12/21/18 12:36 PM:
--------------------------------------------------------------
[~tdsilva], thank you for review. The {{Order By}} could be compiled out
because {{Group By}} is executed by sort in {{ClientAggregatePlan}},just as
following code in {{ClientAggregatePlan.iterator}}:
{code}
158 } else {
159 iterator = new OrderedResultIterator(iterator,
keyExpressionOrderBy, thresholdBytes, null, null,
projector.getEstimatedRowByteSize());
160 aggResultIterator = new
ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator),
serverAggregators, keyExpressions);
161 }
162 }
163 aggResultIterator = new
GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
clientAggregators);
164 }
{code}
So if the {{Order By}} matchs the order of {{Group By}}, the {{Order By}}
could be compiled out.
In fact, when we compile the expression If the statement has a {{Group By}}, we
replace the expression matchs {{Group By}} expressions as
RowKeyColumnExpression, just as following code in {{ExpressionCompiler}}:
{code}
296 private Expression wrapGroupByExpression(Expression expression) {
297 // If we're in an aggregate function, don't wrap a group by
expression,
298 // since in that case we're aggregating over the
regular/ungrouped
299 // column.
300 if (aggregateFunction == null) {
301 int index = groupBy.getExpressions().indexOf(expression);
302 if (index >= 0) {
303 isAggregate = true;
304 RowKeyValueAccessor accessor = new
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
305 expression = new RowKeyColumnExpression(expression,
accessor, groupBy.getKeyExpressions().get(index).getDataType());
306 }
307 }
308 return expression;
309 }
{code}
BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when
{{Order By}} could be compiled out, {{Group By}} is still executed by sort,
just as following code in {{ClientHashAggregatingResultIterator.next}}:
{code}
102 if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy ==
OrderBy.REV_ROW_KEY_ORDER_BY) {
103 keyList = sortKeys();
104 keyIterator = keyList.iterator();
105 } else {
106 keyIterator = hash.keySet().iterator();
107 }
{code}
was (Author: comnetwork):
[~tdsilva], thank you for review. The {{Order By}} could be compiled out
because {{Group By}} is executed by sort in {{ClientAggregatePlan}},just as
following code in {{ClientAggregatePlan.iterator}}:
{code:java}
158 } else {
159 iterator = new OrderedResultIterator(iterator,
keyExpressionOrderBy, thresholdBytes, null, null,
projector.getEstimatedRowByteSize());
160 aggResultIterator = new
ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator),
serverAggregators, keyExpressions);
161 }
162 }
163 aggResultIterator = new
GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
clientAggregators);
164 }
{code:java}
So if the {{Order By}} matchs the order of {{Group By}}, the {{Order By}}
could be compiled out.
In fact, when we compile the expression If the statement has a {{Group By}}, we
replace the expression matchs {{Group By}} expressions as
RowKeyColumnExpression, just as following code in {{ExpressionCompiler}}:
{code:java}
296 private Expression wrapGroupByExpression(Expression expression) {
297 // If we're in an aggregate function, don't wrap a group by
expression,
298 // since in that case we're aggregating over the
regular/ungrouped
299 // column.
300 if (aggregateFunction == null) {
301 int index = groupBy.getExpressions().indexOf(expression);
302 if (index >= 0) {
303 isAggregate = true;
304 RowKeyValueAccessor accessor = new
RowKeyValueAccessor(groupBy.getKeyExpressions(), index);
305 expression = new RowKeyColumnExpression(expression,
accessor, groupBy.getKeyExpressions().get(index).getDataType());
306 }
307 }
308 return expression;
309 }
{code:java}
BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when
{{Order By}} could be compiled out, {{Group By}} is still executed by sort,
just as following code in {{ClientHashAggregatingResultIterator.next}}:
{{code:java}}
102 if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy ==
OrderBy.REV_ROW_KEY_ORDER_BY) {
103 keyList = sortKeys();
104 keyIterator = keyList.iterator();
105 } else {
106 keyIterator = hash.keySet().iterator();
107 }
{{code:java}}
> Optimize OrderBy for ClientAggregatePlan
> ----------------------------------------
>
> Key: PHOENIX-4820
> URL: https://issues.apache.org/jira/browse/PHOENIX-4820
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.14.0
> Reporter: chenglei
> Assignee: chenglei
> Priority: Major
> Fix For: 4.15.0
>
> Attachments: PHOENIX-4820-4.x-HBase-1.3.patch
>
>
> Given a table
> {code}
> create table test (
> pk1 varchar not null ,
> pk2 varchar not null,
> pk3 varchar not null,
> v1 varchar,
> v2 varchar,
> CONSTRAINT TEST_PK PRIMARY KEY (
> pk1,
> pk2,
> pk3 ))
> {code}
> for following sql :
> {code}
> select a.ak3
> from (select substr(pk1,1,1) ak1,substr(pk2,1,1) ak2,substr(pk3,1,1)
> ak3,substr(v1,1,1) av1,substr(v2,1,1) av2 from test order by pk2,pk3 limit
> 10) a group by a.ak3,a.av1 order by a.ak3,a.av1
> {code}
> Intuitively, the above OrderBy statement {{order by a.ak3,a.av1}} should be
> compiled out because it match the group by statement, but in fact it is not.
> The problem is caused by the {{QueryCompiler.compileSingleQuery}} and
> {{QueryCompiler.compileSingleFlatQuery}},for
> {{QueryCompiler.compileSingleQuery}} method,because the inner query has order
> by, so in line 520, local variable {{isInRowKeyOrder}} is false:
> {code}
> 519 context.setCurrentTable(tableRef);
> 520 boolean isInRowKeyOrder = innerPlan.getGroupBy() ==
> GroupBy.EMPTY_GROUP_BY && innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
> {code}
> In {{QueryCompiler.compileSingleFlatQuery}},when {{OrderByCompiler.compile}}
> method is invoked, the last parameter {{isInRowKeyOrder}} is false:
> {code}
> 562 OrderBy orderBy = OrderByCompiler.compile(context, select,
> groupBy, limit, offset, projector,
> 563 groupBy == GroupBy.EMPTY_GROUP_BY ?
> innerPlanTupleProjector : null, isInRowKeyOrder);
> {code}
> So in following line 156 for {{OrderByCompiler.compile}},even though the
> {{tracker.isOrderPreserving}} is true, the OrderBy statement could not be
> compiled out.
> {code}
> 156 if (isInRowKeyOrder && tracker.isOrderPreserving()) {
> {code}
> In my opinion, with GroupBy, in following line 563 for
> {{QueryCompiler.compileSingleFlatQuery}} method, when we call
> {{OrderByCompiler.compile}} method, we no need to conside the
> {{isInRowKeyOrder}}, just like the previous parameter {{tupleProjector}}
> does.
> {code}
> 562 OrderBy orderBy = OrderByCompiler.compile(context, select,
> groupBy, limit, offset, projector,
> 563 groupBy == GroupBy.EMPTY_GROUP_BY ?
> innerPlanTupleProjector : null, isInRowKeyOrder);
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)