Hi Maryann,

I filed PHOENIX-4751 <https://issues.apache.org/jira/browse/PHOENIX-4751>.
Is this likely to be reviewed soon (say, in the next few weeks), or should I
look at the Phoenix source to estimate the scope / impact?

Thanks,
Gerald

On Tue, May 22, 2018 at 11:12 AM, Maryann Xue <maryann....@gmail.com> wrote:

> Since the performance running a group-by aggregation on client side is
> most likely bad, it’s usually not desired. The original implementation was
> for functionality completeness only so it chose the easiest way, which
> reused some existing classes. In some cases, though, the client group-by
> can still be tolerable if there aren’t many distinct keys. So yes, please
> open a JIRA for implementing hash aggregation on client side. Thank you!
>
>
> Thanks,
> Maryann
>
> On Tue, May 22, 2018 at 10:50 AM Gerald Sangudi <gsang...@23andme.com>
> wrote:
>
>> Hello,
>>
>> Any guidance or thoughts on the thread below?
>>
>> Thanks,
>> Gerald
>>
>>
>> On Fri, May 18, 2018 at 11:39 AM, Gerald Sangudi <gsang...@23andme.com>
>> wrote:
>>
>>> Maryann,
>>>
>>> Can Phoenix provide hash aggregation on the client side? Are there
>>> design / implementation reasons not to, or should I file a ticket for this?
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Fri, May 18, 2018 at 11:29 AM, Maryann Xue <maryann....@gmail.com>
>>> wrote:
>>>
>>>> Hi Gerald,
>>>>
>>>> Phoenix does have hash aggregation. The reason why sort-based
>>>> aggregation is used in your query plan is that the aggregation happens on
>>>> the client side. And that is because sort-merge join is used (as hinted)
>>>> which is a client-driven join, and after that join stage all operations can
>>>> only be on the client side.
>>>>
>>>>
>>>> Thanks,
>>>> Maryann
>>>>
>>>> On Fri, May 18, 2018 at 10:57 AM, Gerald Sangudi <gsang...@23andme.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Does Phoenix provide hash aggregation? If not, is it on the roadmap,
>>>>> or should I file a ticket? We have aggregation queries that do not require
>>>>> sorted results.
>>>>>
>>>>> For example, this EXPLAIN plan shows a CLIENT SORT.
>>>>>
>>>>> CREATE TABLE unsalted (
>>>>>     keyA BIGINT NOT NULL,
>>>>>     keyB BIGINT NOT NULL,
>>>>>     val SMALLINT,
>>>>>     CONSTRAINT pk PRIMARY KEY (keyA, keyB));
>>>>>
>>>>> EXPLAIN
>>>>> SELECT /*+ USE_SORT_MERGE_JOIN */ t1.val v1, t2.val v2, COUNT(*) c
>>>>> FROM unsalted t1 JOIN unsalted t2 ON (t1.keyA = t2.keyA)
>>>>> GROUP BY t1.val, t2.val;
>>>>>
>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>> | PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  |  |
>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>> | SORT-MERGE-JOIN (INNER) TABLES                             | null            | null           |  |
>>>>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED      | null            | null           |  |
>>>>> | AND                                                        | null            | null           |  |
>>>>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED      | null            | null           |  |
>>>>> | CLIENT SORTED BY [TO_DECIMAL(T1.VAL), T2.VAL]              | null            | null           |  |
>>>>> | CLIENT AGGREGATE INTO DISTINCT ROWS BY [T1.VAL, T2.VAL]    | null            | null           |  |
>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>
>>>>> Thanks,
>>>>> Gerald
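
For anyone reading this thread later, here is a minimal sketch of the two strategies being discussed, applied to a GROUP BY ... COUNT(*) over already-joined rows. It is not Phoenix code: the function names and the sample rows are made up for illustration, and it only shows the general technique, not how Phoenix implements either path.

```python
from collections import Counter
from itertools import groupby

# Hypothetical joined rows, one (t1.val, t2.val) pair per row.
rows = [(1, 2), (3, 4), (1, 2), (3, 4), (1, 2), (5, 6)]


def sort_aggregate(rows):
    """Sort-based group-by: sort every row by key, then count runs of equal keys."""
    ordered = sorted(rows)  # O(n log n) over all rows, which must all be buffered
    return {key: sum(1 for _ in run) for key, run in groupby(ordered)}


def hash_aggregate(rows):
    """Hash-based group-by: single pass, one counter per distinct key, no sort."""
    counts = Counter()
    for key in rows:
        counts[key] += 1
    return dict(counts)


if __name__ == "__main__":
    print(sort_aggregate(rows))
    print(hash_aggregate(rows))
```

Both functions return the same counts per key. The hash version never sorts, and its memory grows with the number of distinct keys rather than the number of rows, which is why it suits queries that do not need sorted results.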