[jira] [Commented] (PHOENIX-2989) Allow DistinctPrefixFilter optimization when HAVING clause only reference COUNT(DISTINCT)

James Taylor (JIRA) Sun, 12 Jun 2016 09:13:30 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326507#comment-15326507
 ]


James Taylor commented on PHOENIX-2989:
---------------------------------------

We'll need to prevent any aggregators from being created from an ORDER BY for 
an ungrouped aggregation, otherwise our check won't work:
{code}
        plan.getGroupBy().getKeyExpressions().size() ==  
        context.getAggregationManager().getAggregators().getAggregatorCount()
{code}
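To make the failure mode concrete, here's a minimal sketch of that size comparison. It doesn't use Phoenix classes: {{PrefixFilterCheck}}, {{canUseDistinctPrefixFilter}} and the string lists are hypothetical stand-ins for the GROUP BY key expressions and the registered aggregators.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not Phoenix code: models only the size comparison.
final class PrefixFilterCheck {
    // The check passes only when the number of GROUP BY key expressions
    // equals the number of registered aggregators.
    static boolean canUseDistinctPrefixFilter(List<String> keyExpressions,
                                              List<String> aggregators) {
        return keyExpressions.size() == aggregators.size();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("COUNT(DISTINCT a)");
        // Only the COUNT(DISTINCT) aggregator is registered: sizes match.
        System.out.println(canUseDistinctPrefixFilter(
            keys, Arrays.asList("COUNT(DISTINCT a)")));
        // An ORDER BY contributed a second aggregator: sizes differ.
        System.out.println(canUseDistinctPrefixFilter(
            keys, Arrays.asList("COUNT(DISTINCT a)", "SUM(b)")));
    }
}
```

In this model, any aggregator that comes only from the ORDER BY makes the two counts differ, which is why it has to be kept out.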

The easiest way to do that is to override the addExpression method in the 
ExpressionCompiler created in OrderByCompiler:
{code}
        ExpressionCompiler compiler = groupBy.isUngroupedAggregate()
            ? new ExpressionCompiler(context, groupBy) {
                   @Override
                   protected Expression addExpression(Expression expression) {
                       // Assumed body: hand the expression back without registering it
                       return expression;
                   }
               }
            : new ExpressionCompiler(context, groupBy);
{code}
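The override pattern itself can be tried in isolation. The sketch below is not Phoenix code: {{SketchCompiler}}, {{OrderBySketch}} and the string-based expressions are hypothetical, and the pass-through body of the override is an assumption about what the overridden method should do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a compiler with an overridable registration hook.
class SketchCompiler {
    final List<String> registered = new ArrayList<>();

    // Default hook: register each compiled expression once.
    protected String addExpression(String expression) {
        if (!registered.contains(expression)) {
            registered.add(expression);
        }
        return expression;
    }

    String compile(String expression) {
        return addExpression(expression);
    }
}

final class OrderBySketch {
    // Picks a compiler whose hook registers nothing for an ungrouped aggregate.
    static SketchCompiler compilerFor(boolean ungroupedAggregate) {
        return ungroupedAggregate
            ? new SketchCompiler() {
                  @Override
                  protected String addExpression(String expression) {
                      return expression; // pass through without registering
                  }
              }
            : new SketchCompiler();
    }
}
```

With this shape, compiling an ORDER BY expression still runs through the compiler, so it is still validated, but nothing new ends up in the registered list for the ungrouped case.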

Then in the loop doing validation, just don't add the expression to the list of 
OrderBy nodes:
{code}
            // Skip every ORDER BY expression for an ungrouped aggregate
            if (!expression.isStateless() && !groupBy.isUngroupedAggregate()) {
{code}

and we'll end up returning an EMPTY_ORDER_BY after the validation has been done.
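A rough sketch of such a loop, reading "don't add the expression" as "skip every expression when the aggregation is ungrouped". Everything here is hypothetical and simplified: {{OrderByLoopSketch}} is not a Phoenix class, expressions are plain strings, and "stateless" is modelled by a toy rule (a bare integer literal).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the ORDER BY validation loop.
final class OrderByLoopSketch {
    static final List<String> EMPTY_ORDER_BY = Collections.emptyList();

    static List<String> compile(List<String> nodes, boolean ungroupedAggregate) {
        List<String> orderBy = new ArrayList<>();
        for (String node : nodes) {
            // Validation still runs for every node.
            if (node.trim().isEmpty()) {
                throw new IllegalArgumentException("invalid ORDER BY expression");
            }
            boolean stateless = node.matches("\\d+"); // toy rule: integer literal
            // Keep the node only if it matters and the aggregation is grouped.
            if (!stateless && !ungroupedAggregate) {
                orderBy.add(node);
            }
        }
        return orderBy.isEmpty() ? EMPTY_ORDER_BY : orderBy;
    }
}
```

For an ungrouped aggregation nothing is ever added, so the method falls through to the empty result once every node has been validated.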


> Allow DistinctPrefixFilter optimization when HAVING clause only reference 
> COUNT(DISTINCT)
> -----------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2989
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2989
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>             Fix For: 4.8.0
>
>
> The DistinctPrefixFilter optimization can still be used if a HAVING clause 
> only references COUNT(DISTINCT) expressions. One way to detect this is to 
> collect a Set<ParseNode> using a visitor for the SELECT and HAVING which only 
> collects COUNT(DISTINCT) expressions. This set will then be used as the GROUP 
> BY nodes if there's no existing GROUP BY.
> The check for whether or not to add the filter can then change to something 
> like this:
> {code}
>     if (... &&
>     ( context.getAggregationManager().isEmpty() ||
>       ( plan.getGroupBy().isUngroupedAggregate() &&
>         plan.getGroupBy().getKeyExpressions().size() ==  
>         context.getAggregationManager().getAggregators().getAggregatorCount() 
> ) ) )
> {code}
> That way, it'll only add the filter if all expressions pulled in as a GROUP 
> BY expression (only the count distinct ones) account for all of the 
> aggregators.
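
The collection step from the quoted description could look roughly like this. It's a sketch over a made-up mini parse tree, not Phoenix's ParseNode visitor API: {{CountDistinctCollector}}, its {{Node}} type and the text-based de-duplication are all assumptions for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical mini parse tree plus a collector for COUNT(DISTINCT ...) nodes.
final class CountDistinctCollector {
    static final class Node {
        final String text;            // e.g. "COUNT(DISTINCT a)"
        final boolean countDistinct;  // true only for COUNT(DISTINCT ...) nodes
        final List<Node> children;

        Node(String text, boolean countDistinct, Node... children) {
            this.text = text;
            this.countDistinct = countDistinct;
            this.children = Arrays.asList(children);
        }
    }

    // Depth-first walk that keeps only COUNT(DISTINCT ...) nodes.
    static void collect(Node node, Set<String> out) {
        if (node.countDistinct) {
            out.add(node.text);
        }
        for (Node child : node.children) {
            collect(child, out);
        }
    }

    // Walks the SELECT and HAVING roots into one de-duplicated set.
    static Set<String> collectAll(Node... roots) {
        Set<String> out = new LinkedHashSet<>();
        for (Node root : roots) {
            collect(root, out);
        }
        return out;
    }
}
```

Because the result is a set, a COUNT(DISTINCT) that appears in both the SELECT and the HAVING contributes a single entry.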



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
