[jira] [Created] (PHOENIX-2344) Implement partial stream aggregate

Maryann Xue (JIRA) Thu, 22 Oct 2015 10:35:35 -0700

Maryann Xue created PHOENIX-2344:
------------------------------------

             Summary: Implement partial stream aggregate
                 Key: PHOENIX-2344
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2344
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Maryann Xue
            Assignee: Maryann Xue



We now have ordered group-by (stream aggregate) and unordered group-by (hash 
aggregate) in Phoenix. Stream aggregate is usually preferable to hash aggregate 
in terms of memory usage and pipelining, but it requires that the aggregate's 
input be ordered on the group-by expressions, i.e. the group-by expressions 
form the leading part of the input's collation (ordering).
However, we could have something in between: a stream/hash hybrid aggregate, 
for when the group-by expressions and the input collation share a common 
prefix. For example, if we group table T1 by columns A, B and T1 is sorted on 
columns A, C, the ordered part is A and the hash part is B. Thus, within the 
range of rows with the same A, a hash table collects all distinct values of B; 
at the point where A changes, we can purge the intermediate hash table and feed 
the result for the previous A to the next operator.
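
The hybrid described above could be sketched roughly as follows. This is only 
an illustration in Python, not Phoenix code: the function name 
partial_stream_aggregate, the dict-based rows, and the choice of SUM as the 
aggregate are all assumptions made for the example.

```python
def partial_stream_aggregate(rows, ordered_key, hashed_key, value_key):
    """Group rows by (ordered_key, hashed_key) and sum value_key.

    Assumes rows arrive already sorted on ordered_key (hypothetical setup).
    Only one hash table, covering a single ordered_key value, is alive at a time.
    """
    current = None   # ordered-key value currently being accumulated
    table = {}       # hash part: hashed_key value -> running sum
    started = False
    for row in rows:
        a = row[ordered_key]
        if started and a != current:
            # The ordered key changed: emit the finished groups and purge the table.
            for b, total in table.items():
                yield (current, b, total)
            table = {}
        current = a
        started = True
        b = row[hashed_key]
        table[b] = table.get(b, 0) + row[value_key]
    if started:
        # Flush the groups of the last ordered-key value.
        for b, total in table.items():
            yield (current, b, total)


# T1 sorted on (A, C), grouped by (A, B), aggregating SUM(V).
rows = [
    {"A": 1, "C": 1, "B": "x", "V": 10},
    {"A": 1, "C": 2, "B": "y", "V": 5},
    {"A": 1, "C": 3, "B": "x", "V": 1},
    {"A": 2, "C": 1, "B": "y", "V": 7},
    {"A": 2, "C": 4, "B": "y", "V": 3},
]
result = list(partial_stream_aggregate(rows, "A", "B", "V"))
```

Because groups are emitted each time A changes, memory is bounded by the number 
of distinct Bs within one A rather than across the whole input, and downstream 
operators can start consuming results before the input is exhausted.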



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
