[
https://issues.apache.org/jira/browse/PHOENIX-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765712#comment-16765712
]
Xinyi Yan commented on PHOENIX-2988:
------------------------------------
[~jamestaylor] Has this already been implemented? If not, I'd like to take
this task. Thanks.
> Replace COUNT(DISTINCT...) with COUNT(...) when possible
> --------------------------------------------------------
>
> Key: PHOENIX-2988
> URL: https://issues.apache.org/jira/browse/PHOENIX-2988
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Priority: Major
>
> An optimization that would really benefit the SELECT COUNT(DISTINCT pkCol)
> case: if there's only a single COUNT(DISTINCT pkCol) and the GroupBy ends up
> being order preserving, you can replace the COUNT(DISTINCT pkCol) with a
> COUNT(pkCol) in the SELECT, HAVING, and ORDER BY clauses. That prevents the
> DistinctValueWithCountServerAggregator, which keeps a Map of all unique
> values, from being used. Instead, only a single overall count is kept, which
> is all we need thanks to your DistinctPrefixFilter.
> A few considerations in the implementation:
> * Pass through select in the call to groupBy.compile() in QueryCompiler and
> change the return type to return a new select (as the SELECT, HAVING, and
> ORDER BY may have been rewritten). Probably easiest if the GroupBy object is
> just mutated in place.
> * Within the groupBy.compile() call, use a visitor on the SELECT, HAVING and
> ORDER BY clauses to do the rewriting. You can do that by deriving a class
> from ParseNodeRewriter, overriding the {{visitLeave(final FunctionParseNode
> node, List<ParseNode> nodes)}} method to return a new COUNT parse node with
> the {{nodes}} passed in as children if {{node}} equals the
> DistinctCountParseNode that you replaced in the select statement.
> * Because the HAVING clause may have been rewritten in the groupBy.compile()
> call, its compilation should be moved after that call in QueryCompiler, like
> this:
> {code}
> select = groupBy.compile(context, select, innerPlanTupleProjector);
> Expression having = HavingCompiler.compile(context, select, groupBy);
> {code}
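
The visitor-based rewrite described in the list above can be sketched roughly
as follows. This is a self-contained illustration, not Phoenix code: Node,
node(), rewrite() and visitLeave() are hypothetical stand-ins for Phoenix's
parse node classes and ParseNodeRewriter, and the real signatures may differ.
It only shows the shape of the idea: walk each clause bottom-up and, on
leaving a node that equals the targeted distinct count, return a plain COUNT
node built from the already-rewritten children.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative only: Node, rewrite and visitLeave are hypothetical stand-ins,
// not Phoenix's ParseNode / ParseNodeRewriter classes.
final class CountDistinctRewriteSketch {

    /** Minimal parse node: a name plus ordered children. */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, List<Node> children) {
            this.name = name;
            this.children = new ArrayList<>(children);
        }

        @Override public boolean equals(Object o) {
            return o instanceof Node
                    && name.equals(((Node) o).name)
                    && children.equals(((Node) o).children);
        }

        @Override public int hashCode() {
            return Objects.hash(name, children);
        }

        @Override public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Convenience factory so expression trees read close to the SQL. */
    static Node node(String name, Node... children) {
        return new Node(name, Arrays.asList(children));
    }

    /** Bottom-up walk: children are rewritten first, then the node itself. */
    static Node rewrite(Node node, Node target) {
        List<Node> rewritten = new ArrayList<>();
        for (Node child : node.children) {
            rewritten.add(rewrite(child, target));
        }
        return visitLeave(node, rewritten, target);
    }

    /** Swap the targeted distinct count for a plain COUNT over the same children. */
    static Node visitLeave(Node node, List<Node> nodes, Node target) {
        if (node.equals(target)) {
            return new Node("COUNT", nodes);
        }
        return new Node(node.name, nodes);
    }

    public static void main(String[] args) {
        Node target = node("DISTINCT_COUNT", node("PK"));
        Node having = node(">", target, node("10"));
        Node orderBy = node("DESC", target);
        // The same rewrite is applied to the SELECT, HAVING and ORDER BY trees.
        System.out.println(rewrite(target, target));
        System.out.println(rewrite(having, target));
        System.out.println(rewrite(orderBy, target));
    }
}
```

Applying one rewriter to all three clauses keeps them consistent, which is
why the HAVING clause has to be compiled only after the rewrite has run.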