[ https://issues.apache.org/jira/browse/HIVE-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879894#action_12879894 ]
Mayank Lahiri commented on HIVE-1387:
-------------------------------------
This is what I suggest we do to resolve this issue:
1. Create a new percentile_approx() function that builds on
GenericUDAFHistogramNumeric to approximate a fine-grained histogram with many
bins (say 10,000, though I'll run some experiments to pick a value), and then
use the histogram to estimate the percentile value.
2. Convert the existing simple percentile() UDAF to a generic UDAF. When the
input is byte, short, int, or long, use the existing code (with some
modifications, such as replacing the linear scan with a binary search). When
the input is float or double, automatically use the percentile_approx()
function.
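
To make the two steps concrete, here is a rough standalone sketch. It is not
the Hive code and not the attached patch. The class and method names are made
up for illustration, and it assumes a fixed equal-width histogram built over an
in-memory array, which is simpler than whatever GenericUDAFHistogramNumeric
does internally. The exact lookup also returns the value at a rank without
interpolating between neighbours.

```java
import java.util.Arrays;

/** Hypothetical illustration only; none of these names come from Hive. */
final class PercentileSketch {

    /**
     * Step 1 (approximate): bucket the values into numBins equal-width bins,
     * then walk the bins until the cumulative count reaches p * n and
     * interpolate linearly inside that bin. Expects 0 <= p <= 1.
     */
    static double approxPercentile(double[] values, int numBins, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no data");
        }
        double lo = Arrays.stream(values).min().getAsDouble();
        double hi = Arrays.stream(values).max().getAsDouble();
        if (lo == hi) {
            return lo; // all values identical, nothing to bucket
        }
        double width = (hi - lo) / numBins;
        long[] counts = new long[numBins];
        for (double v : values) {
            // Clamp so that v == hi lands in the last bin.
            int bin = (int) Math.min(numBins - 1, (long) ((v - lo) / width));
            counts[bin]++;
        }
        double target = p * values.length;
        long seen = 0;
        for (int i = 0; i < numBins; i++) {
            if (counts[i] > 0 && seen + counts[i] >= target) {
                double fraction = (target - seen) / counts[i];
                return lo + (i + fraction) * width;
            }
            seen += counts[i];
        }
        return hi;
    }

    /**
     * Step 2 (exact, integral inputs): given distinct values in ascending
     * order and their running counts, binary-search for the first entry
     * whose cumulative count exceeds the zero-based target rank.
     */
    static long exactValueAtRank(long[] sortedValues, long[] cumulativeCounts, long rank) {
        int lo = 0;
        int hi = cumulativeCounts.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulativeCounts[mid] > rank) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return sortedValues[lo];
    }
}
```

The accuracy of the approximate path is bounded by the bin width, which is why
the bin count needs experimenting with. The binary search replaces an O(n) scan
over the distinct values with O(log n) per requested percentile.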
> Make PERCENTILE work with double data type
> ------------------------------------------
>
> Key: HIVE-1387
> URL: https://issues.apache.org/jira/browse/HIVE-1387
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Vaibhav Aggarwal
> Assignee: Mayank Lahiri
> Attachments: patch-1387-1.patch
>
>
> The PERCENTILE UDAF does not work with the double data type.