[jira] [Commented] (PHOENIX-2344) Implement partial stream aggregate

Anoop Sam John (JIRA) Thu, 22 Oct 2015 23:37:10 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970532#comment-14970532
 ]


Anoop Sam John commented on PHOENIX-2344:
-----------------------------------------

Nice one.

> Implement partial stream aggregate
> ----------------------------------
>
>                 Key: PHOENIX-2344
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2344
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>
> We now have ordered group-by (stream aggregate) and unordered group-by (hash 
> aggregate) in Phoenix. Stream aggregate usually uses less memory and 
> pipelines better than hash aggregate, but it requires the aggregate's input 
> to be ordered on the group-by expressions, i.e. the group-by expressions 
> must form the leading part of the input's collation (ordering).
> However, there is a middle ground: a stream/hash hybrid aggregate for when 
> the group-by expressions and the input collation share a common leading 
> part. For example, if we group table T1 by columns A, B and T1 is sorted on 
> columns A, C, the ordered part is A and the hash part is B. 
> Thus, within a run of rows that share the same A, a hash table collects all 
> the distinct Bs; when A changes, we can purge the intermediate hash table 
> and feed the result for the previous A to the next operator.
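
For illustration only, the hybrid described in the issue might be sketched as
below. This is not Phoenix code: the function name, the (a, b, value) row
shape, and the choice of SUM as the aggregate are all assumptions made for the
example, and column C is left out because it plays no part in the grouping.

```python
from itertools import groupby
from operator import itemgetter


def partial_stream_aggregate(rows):
    """Sum `value` per (a, b) group, assuming rows arrive sorted on `a`.

    Hypothetical row shape: (a, b, value). Only the ordering on `a` is
    relied upon; `b` may appear in any order within a run of equal `a`.
    """
    # Stream part: groupby yields one run per stretch of consecutive equal a.
    for a, run in groupby(rows, key=itemgetter(0)):
        # Hash part: a fresh table per run, one entry per distinct b.
        partial = {}
        for _, b, value in run:
            partial[b] = partial.get(b, 0) + value
        # a has changed (or input ended): emit this run's results, after
        # which the table is dropped and rebuilt for the next run.
        for b, total in partial.items():
            yield (a, b, total)
```

The hash table never holds more than the distinct Bs of a single A, and
results for one A can flow to the next operator before the following A is
read, which is where the memory and pipelining benefit would come from.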



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
