Re: Hash aggregation

James Taylor Thu, 14 Jun 2018 06:30:48 -0700

Hi Gerald,
No further suggestions beyond my comments on the JIRA. Maybe a good next step
would be a patch?
Thanks,
James


On Tue, Jun 12, 2018 at 8:15 PM, Gerald Sangudi <gsang...@23andme.com>
wrote:

> Hi Maryann and James,
>
> Any further guidance on PHOENIX-4751
> <https://issues.apache.org/jira/browse/PHOENIX-4751>?
>
> Thanks,
> Gerald
>
> On Wed, May 23, 2018 at 11:00 AM, Gerald Sangudi <gsang...@23andme.com>
> wrote:
>
>> Hi Maryann,
>>
>> I filed PHOENIX-4751 <https://issues.apache.org/jira/browse/PHOENIX-4751>.
>>
>> Is this likely to be reviewed soon (say, in the next few weeks), or should
>> I look at the Phoenix source to estimate the scope / impact?
>>
>> Thanks,
>> Gerald
>>
>> On Tue, May 22, 2018 at 11:12 AM, Maryann Xue <maryann....@gmail.com>
>> wrote:
>>
>>> Running a group-by aggregation on the client side most likely performs
>>> poorly, so it’s usually not desired. The original implementation was for
>>> functional completeness only, so it took the easiest route and reused
>>> some existing classes. In some cases, though, a client-side group-by can
>>> still be tolerable if there aren’t many distinct keys. So yes, please
>>> open a JIRA for implementing hash aggregation on the client side. Thank you!
>>>
>>>
>>> Thanks,
>>> Maryann
>>>
>>> On Tue, May 22, 2018 at 10:50 AM Gerald Sangudi <gsang...@23andme.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Any guidance or thoughts on the thread below?
>>>>
>>>> Thanks,
>>>> Gerald
>>>>
>>>>
>>>> On Fri, May 18, 2018 at 11:39 AM, Gerald Sangudi <gsang...@23andme.com>
>>>> wrote:
>>>>
>>>>> Maryann,
>>>>>
>>>>> Can Phoenix provide hash aggregation on the client side? Are there
>>>>> design / implementation reasons not to, or should I file a ticket for 
>>>>> this?
>>>>>
>>>>> Thanks,
>>>>> Gerald
>>>>>
>>>>> On Fri, May 18, 2018 at 11:29 AM, Maryann Xue <maryann....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Gerald,
>>>>>>
>>>>>> Phoenix does have hash aggregation. Your query plan uses sort-based
>>>>>> aggregation because the aggregation happens on the client side. That,
>>>>>> in turn, is because a sort-merge join is used (as hinted), which is a
>>>>>> client-driven join, and after that join stage all operations can only
>>>>>> run on the client side.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Maryann
>>>>>>
>>>>>> On Fri, May 18, 2018 at 10:57 AM, Gerald Sangudi <
>>>>>> gsang...@23andme.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Does Phoenix provide hash aggregation? If not, is it on the roadmap,
>>>>>>> or should I file a ticket? We have aggregation queries that do not
>>>>>>> require sorted results.
>>>>>>>
>>>>>>> For example, this EXPLAIN plan shows a CLIENT SORT.
>>>>>>>
>>>>>>> CREATE TABLE unsalted (
>>>>>>>     keyA BIGINT NOT NULL,
>>>>>>>     keyB BIGINT NOT NULL,
>>>>>>>     val SMALLINT,
>>>>>>>     CONSTRAINT pk PRIMARY KEY (keyA, keyB));
>>>>>>>
>>>>>>>
>>>>>>> EXPLAIN
>>>>>>> SELECT /*+ USE_SORT_MERGE_JOIN */ t1.val v1, t2.val v2, COUNT(*) c
>>>>>>> FROM unsalted t1 JOIN unsalted t2 ON (t1.keyA = t2.keyA)
>>>>>>> GROUP BY t1.val, t2.val;
>>>>>>>
>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>> | PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  |  |
>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>> | SORT-MERGE-JOIN (INNER) TABLES                             | null            | null           |  |
>>>>>>> |     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED  | null            | null           |  |
>>>>>>> | AND                                                        | null            | null           |  |
>>>>>>> |     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED  | null            | null           |  |
>>>>>>> | CLIENT SORTED BY [TO_DECIMAL(T1.VAL), T2.VAL]              | null            | null           |  |
>>>>>>> | CLIENT AGGREGATE INTO DISTINCT ROWS BY [T1.VAL, T2.VAL]    | null            | null           |  |
>>>>>>> +------------------------------------------------------------+-----------------+----------------+--+
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Gerald
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>
>
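
The thread contrasts sort-based and hash-based GROUP BY on the client. As a
rough illustration only (plain Python, not Phoenix code; the sample rows and
both function names are hypothetical), the two strategies for a
GROUP BY ... COUNT(*) might be sketched like this:

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical (t1.val, t2.val) pairs, standing in for rows produced by the join.
rows = [(1, 2), (3, 4), (1, 2), (3, 4), (1, 2), (5, 6)]


def sort_aggregate(rows):
    """Sort-based aggregation: sort every row, then count each run of equal keys."""
    return {key: sum(1 for _ in run) for key, run in groupby(sorted(rows))}


def hash_aggregate(rows):
    """Hash-based aggregation: a single pass with one counter per distinct key."""
    counts = defaultdict(int)
    for key in rows:
        counts[key] += 1
    return dict(counts)


print(sort_aggregate(rows))
print(hash_aggregate(rows))
```

Both functions return the same counts. The sort-based version has to order
all input rows first, while the hash-based version holds one entry per
distinct key, which fits the remark above that a client-side group-by can be
tolerable when there aren't many distinct keys.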
