[ 
https://issues.apache.org/jira/browse/PHOENIX-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022503#comment-16022503
 ] 

Loknath Priyatham Teja Singamsetty  commented on PHOENIX-3773:
--------------------------------------------------------------

[~jamestaylor] 
bq. Have FIRST_VALUES return an ARRAY type so that you can return all values in 
a single row. It's not going to work to change the semantics of SQL (it's 
pretty well established). A function can't return multiple rows like that.

Gone through Oracle/SQL rank over and grouped_concat behaviours. Probably what 
you are suggesting here is to implement Group_concat 
https://stackoverflow.com/questions/2129693/using-limit-within-group-by-to-get-n-results-per-group.

 Kindly help me with the expectation here.

TEST.TEST
||id||page_id||date||val||
|2|8|1|7|
|3|8|2|9|
|4|8|3|4|
|5|8|4|2|
|6|9|5|10|
|7|9|6|13|

For the above table with below queries,

a) FIRST_VALUES With Group By Clause
SELECT page_id, FIRST_VALUES(val, 2) WITHIN GROUP (ORDER BY dates DESC) as 
first_values FROM TEST.TEST GROUP BY page_id

Expected Output?
-----------------------
||page_id||first_values||
|8|2,4|
|9|13,10|

b) FIRST_VALUES without group by
SELECT FIRST_VALUES(val, 2) as first_values WITHIN GROUP (ORDER BY dates DESC) 
as first_values FROM TEST.TEST 
||first_values||
|13,10|

[~jamestaylor] Let me know if the above looks as expected behaviour.

 

> Implement FIRST_VALUES aggregate function
> -----------------------------------------
>
>                 Key: PHOENIX-3773
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3773
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: Loknath Priyatham Teja Singamsetty 
>              Labels: SFDC
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3773.patch, PHOENIX-3773.v2.patch
>
>
> Similar to FIRST_VALUE, but would allow the user to specify how many values 
> to keep. This could use a MinMaxPriorityQueue under the covers and be much 
> more efficient than using multiple NTH_VALUE calls to do the same like this:
> {code}
> SELECT entity_id,
>        NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as 
> nth1_user_id,
>        NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as 
> nth2_user_id,
>        NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as 
> nth3_user_id,
>        count(*)
> FROM  MY_TABLE 
> WHERE tenant_id='00Dx0000000XXXX'
> AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE')
> GROUP BY entity_id;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to