[ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839393#action_12839393
 ] 

Zheng Shao commented on HIVE-259:
---------------------------------

> (1) I am not familiar with the exact definition of the percentile function. 
> Must the result of percentile() be a member of the input data?
See the link above.

> (2) HashMap and ArrayList are used to copy and sort. Can we use a TreeMap 
> here? This is a small point and can be ignored.
I think HashMap is better here. The number of "iterate" calls is usually much 
higher than the number of unique values (the size of the HashMap). A HashMap 
update takes expected constant time, while a TreeMap update takes O(log n), so 
using HashMap reduces the cost of "iterate". The sort is only needed once, 
over the unique values.
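
To illustrate the trade-off, here is a minimal sketch of the count-then-sort 
approach. It is not the code from the HIVE-259 patches: the class name 
PercentileSketch and its methods are hypothetical, and the percentile 
definition used (linear interpolation at position p * (n - 1)) is an 
assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the implementation attached to HIVE-259.
class PercentileSketch {
    // Maps each distinct input value to the number of times it was seen.
    private final Map<Long, Long> counts = new HashMap<>();
    private long total = 0;

    // Called once per input row: expected O(1) with a HashMap.
    public void iterate(long value) {
        counts.merge(value, 1L, Long::sum);
        total++;
    }

    // Called once at the end: copies the unique values and sorts them.
    // Assumed definition: linear interpolation at position p * (total - 1).
    public double terminate(double p) {
        if (total == 0) {
            throw new IllegalStateException("no input values");
        }
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        List<Long> keys = new ArrayList<>(counts.keySet());
        Collections.sort(keys);

        double position = p * (total - 1);
        long lower = (long) Math.floor(position);
        long upper = (long) Math.ceil(position);
        long lowerValue = valueAt(keys, lower);
        if (upper == lower) {
            return lowerValue;
        }
        long upperValue = valueAt(keys, upper);
        return lowerValue + (position - lower) * (upperValue - lowerValue);
    }

    // Returns the value at a zero-based rank in the sorted multiset.
    private long valueAt(List<Long> sortedKeys, long rank) {
        long seen = 0;
        for (Long key : sortedKeys) {
            seen += counts.get(key);
            if (rank < seen) {
                return key;
            }
        }
        throw new IllegalStateException("rank out of range");
    }
}
```

With this sketch, each call to iterate touches only the HashMap, and the 
O(u log u) sort over the u unique values is paid once in terminate. A TreeMap 
would keep the keys sorted at all times but would pay O(log u) on every row.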

> In the beginning of new test case, .. appears two times
Fixed in HIVE-259.5.patch


> Add PERCENTILE aggregate function
> ---------------------------------
>
>                 Key: HIVE-259
>                 URL: https://issues.apache.org/jira/browse/HIVE-259
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Jerome Boulon
>         Attachments: HIVE-259-2.patch, HIVE-259-3.patch, HIVE-259.1.patch, 
> HIVE-259.4.patch, HIVE-259.5.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx
>
>
> Compute at least the 25th, 50th, and 75th percentiles

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.