[
https://issues.apache.org/jira/browse/PHOENIX-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325041#comment-15325041
]
James Taylor commented on PHOENIX-2965:
---------------------------------------
No, the logic is wrong in GroupByCompiler. We shouldn't be adding all the
select expressions to the group by list; we should do that only if
statement.isDistinct() is true. I think that for your optimization to kick in,
we can check that all select expressions are instances of
DistinctCountAggregateFunction, and then we need to add the *child node* of
each DistinctCountAggregateFunction as the group by expression (I'm surprised
it worked before, as you'd have a GROUP BY COUNT(DISTINCT ...)). I'm not sure
whether this will be more performant in the non-order-preserving case (it'll
probably be more or less the same), so I suppose it's ok to always do it, but
only for DistinctCountAggregateFunction.
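
The rule described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual GroupByCompiler code:
Expr, Column, DistinctCount and implicitGroupBy are hypothetical stand-ins
invented for this sketch, with DistinctCount playing the role of
DistinctCountAggregateFunction and the boolean flag playing the role of
statement.isDistinct().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GroupBySketch {
    /** Hypothetical stand-in for a select expression; not a Phoenix type. */
    interface Expr {}

    /** A plain column reference such as "a". */
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Hypothetical stand-in for COUNT(DISTINCT child). */
    static final class DistinctCount implements Expr {
        final Expr child;
        DistinctCount(Expr child) { this.child = child; }
        @Override public String toString() { return "COUNT(DISTINCT " + child + ")"; }
    }

    /** Derives the implicit group by list for a statement with no explicit GROUP BY. */
    static List<Expr> implicitGroupBy(boolean isDistinct, List<? extends Expr> select) {
        List<Expr> groupBy = new ArrayList<>();
        if (isDistinct) {
            // SELECT DISTINCT: every select expression becomes a group by key.
            groupBy.addAll(select);
            return groupBy;
        }
        // Otherwise rewrite only when every select expression is a distinct count.
        boolean allDistinctCounts = !select.isEmpty();
        for (Expr e : select) {
            if (!(e instanceof DistinctCount)) {
                allDistinctCounts = false;
                break;
            }
        }
        if (allDistinctCounts) {
            for (Expr e : select) {
                // Group by the child of the aggregate, never by the aggregate itself.
                groupBy.add(((DistinctCount) e).child);
            }
        }
        return groupBy;
    }

    public static void main(String[] args) {
        Expr a = new Column("a");
        Expr b = new Column("b");
        // SELECT COUNT(DISTINCT a): group by the child node a.
        System.out.println(implicitGroupBy(false, Arrays.asList(new DistinctCount(a))));
        // SELECT COUNT(DISTINCT a), b: mixed select list, nothing is added.
        System.out.println(implicitGroupBy(false, Arrays.asList(new DistinctCount(a), b)));
        // SELECT DISTINCT a, b: all select expressions are added.
        System.out.println(implicitGroupBy(true, Arrays.asList(a, b)));
    }
}
```

In this sketch a lone COUNT(DISTINCT a) yields the group by list [a], a mixed
select list yields an empty list, and SELECT DISTINCT a, b yields [a, b].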
> Use DistinctPrefixFilter logic for COUNT(DISTINCT ...) and COUNT(...) GROUP BY
> ------------------------------------------------------------------------------
>
> Key: PHOENIX-2965
> URL: https://issues.apache.org/jira/browse/PHOENIX-2965
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 4.8.0
>
> Attachments: 2965-v2.txt, 2965-v3.txt, 2965-v4.txt, 2965-v5.txt,
> 2965-v6.txt, 2965.txt, PHOENIX-2965_wip.patch
>
>
> Parent uses skip scanning to optimize DISTINCT and certain GROUP BY
> operations along the row key.
> COUNT queries are optimized differently and could be sped up significantly as
> well.
> [~giacomotaylor], I might need some help finding where COUNT(DISTINCT) queries
> are planned and optimized.