[jira] [Commented] (PHOENIX-3773) Implement FIRST_VALUES aggregate function

Loknath Priyatham Teja Singamsetty (JIRA) Fri, 02 Jun 2017 12:51:44 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035310#comment-16035310
 ]


Loknath Priyatham Teja Singamsetty  commented on PHOENIX-3773:
--------------------------------------------------------------

bq. You might want to check out ArrayAppendFunctionIT which exercises 
ARRAY_APPEND 
Thanks for the pointer, James. I was looking into the same test to understand 
how things work.

[~jamestaylor] Looks like I found the reason. 
PArrayDataType.appendItemToArray can only be used when you already have an 
array serialized to bytes with at least one element in it. We cannot leverage 
it without a pre-constructed array.

In our case, the requirement is to convert multiple <T>PDataType values into a 
single <T>PArrayDataType. There is no util method that constructs the array 
from scratch when given its elements one by one.

We have to perform serialization/deserialization for one element in order to 
construct the initial array, after which we can make use of 
PArrayDataType.appendItemToArray. This would save the 
serialization/deserialization cost for the remaining items in the FIRST_VALUES 
result array.
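
To make the idea concrete, here is a minimal, self-contained sketch of the 
"seed with one element, then append the rest" pattern. It does not call 
Phoenix: the length-prefixed byte layout and the names ArrayBuildSketch, 
singletonArray, appendItem, build and decode are all hypothetical stand-ins, 
and Phoenix's real array serialization differs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayBuildSketch {
    // Hypothetical layout (NOT Phoenix's real format): every element is stored
    // as a 4-byte big-endian length followed by the element's bytes.

    /** Builds the initial one-element array from a single serialized value. */
    public static byte[] singletonArray(byte[] element) {
        return ByteBuffer.allocate(4 + element.length)
                .putInt(element.length)
                .put(element)
                .array();
    }

    /** Appends one serialized element without decoding the existing ones. */
    public static byte[] appendItem(byte[] array, byte[] element) {
        if (array == null || array.length == 0) {
            // Mirrors the constraint described above: appending needs an
            // already-serialized array holding at least one element.
            throw new IllegalArgumentException("array must hold at least one element");
        }
        return ByteBuffer.allocate(array.length + 4 + element.length)
                .put(array)
                .putInt(element.length)
                .put(element)
                .array();
    }

    /** Seeds the array with the first element, then appends the rest. */
    public static byte[] build(List<byte[]> elements) {
        if (elements.isEmpty()) {
            throw new IllegalArgumentException("need at least one element");
        }
        byte[] array = singletonArray(elements.get(0));
        for (byte[] element : elements.subList(1, elements.size())) {
            array = appendItem(array, element);
        }
        return array;
    }

    /** Decodes the hypothetical layout back into strings, for checking only. */
    public static List<String> decode(byte[] array) {
        List<String> out = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(array);
        while (buf.hasRemaining()) {
            byte[] element = new byte[buf.getInt()];
            buf.get(element);
            out.add(new String(element, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] array = build(Arrays.asList(
                "u1".getBytes(StandardCharsets.UTF_8),
                "u2".getBytes(StandardCharsets.UTF_8),
                "u3".getBytes(StandardCharsets.UTF_8)));
        System.out.println(decode(array));
    }
}
```

Only the first element goes through the "construct an array" path. Every later 
element is appended directly to the serialized bytes.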

Let me know if this approach is fine with you. 

  

> Implement FIRST_VALUES aggregate function
> -----------------------------------------
>
>                 Key: PHOENIX-3773
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3773
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: Loknath Priyatham Teja Singamsetty 
>              Labels: SFDC
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3773_4.x-HBase-0.98.patch, 
> PHOENIX-3773_master.patch, PHOENIX-3773.patch, PHOENIX-3773.v2.patch, 
> PHOENIX-3773.v3.patch
>
>
> Similar to FIRST_VALUE, but would allow the user to specify how many values 
> to keep. This could use a MinMaxPriorityQueue under the covers and be much 
> more efficient than using multiple NTH_VALUE calls to do the same like this:
> {code}
> SELECT entity_id,
>        NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as nth1_user_id,
>        NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as nth2_user_id,
>        NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as nth3_user_id,
>        count(*)
> FROM  MY_TABLE 
> WHERE tenant_id='00Dx0000000XXXX'
> AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE')
> GROUP BY entity_id;
> {code}
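
For illustration, the bounded "keep only the first N values" idea from the 
description can be sketched as below. This is an assumption-laden sketch, not 
the Phoenix implementation: it swaps in java.util.PriorityQueue for Guava's 
MinMaxPriorityQueue so that it runs with the JDK alone, and the names 
FirstValuesSketch, Row and firstValues are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class FirstValuesSketch {
    /** One input row: the value to return and the key it is ordered by. */
    public static final class Row {
        final String userId;
        final long lastReadDate;

        public Row(String userId, long lastReadDate) {
            this.userId = userId;
            this.lastReadDate = lastReadDate;
        }
    }

    /**
     * Returns the user ids of the `limit` rows with the largest lastReadDate,
     * newest first (i.e. ORDER BY last_read_date DESC), in a single pass.
     */
    public static List<String> firstValues(Iterable<Row> rows, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        Comparator<Row> byDate = Comparator.comparingLong(r -> r.lastReadDate);
        // Min-heap on the date: the head is the oldest row currently kept,
        // so it is the one to evict when a newer row arrives.
        PriorityQueue<Row> heap = new PriorityQueue<>(limit, byDate);
        for (Row row : rows) {
            if (heap.size() < limit) {
                heap.add(row);
            } else if (byDate.compare(row, heap.peek()) > 0) {
                heap.poll();
                heap.add(row);
            }
        }
        List<Row> kept = new ArrayList<>(heap);
        kept.sort(byDate.reversed());
        List<String> result = new ArrayList<>();
        for (Row row : kept) {
            result.add(row.userId);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", 10L), new Row("b", 30L),
                new Row("c", 20L), new Row("d", 40L));
        System.out.println(firstValues(rows, 3));
    }
}
```

The heap never holds more than `limit` rows, so memory stays bounded per group 
and the input is scanned once instead of once per NTH_VALUE call.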



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
