Shyam created SPARK-31310:
-----------------------------

             Summary: percentile_approx function not working as expected
                 Key: SPARK-31310
                 URL: https://issues.apache.org/jira/browse/SPARK-31310
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.3, 2.4.1, 2.4.0
         Environment: spark-sql-2.4.1v with Java 8
            Reporter: Shyam
             Fix For: 2.4.3


I'm using spark-sql-2.4.1v with Java 8 and I'm trying to find quantiles,
i.e. percentile 0, percentile 25, etc., on a given column of a DataFrame.

The column values are: 23456.55, 34532.55, 23456.55

When I use the percentile_approx() function, the results do not match those of
Excel's percentile_inc() function.

Example:

For the above data set, i.e. 23456.55, 34532.55, 23456.55, the values of
percentile_0,percentile_10,percentile_25,percentile_50,percentile_75,percentile_90,percentile_100
are, respectively:

Using the percentile_approx() function:
23456.55,23456.55,23456.55,23456.55,23456.55,23456.55,23456.55

Using Excel, i.e. percentile_inc():
23456.55,23456.55,23456.55,23456.55,28994.550000000003,32317.350000000002,34532.55

How can I get the same percentiles as Excel using the percentile_approx() function?
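
For reference, the Excel figures quoted above are consistent with linear
interpolation between the two nearest ranks of the sorted data. The following
is a minimal plain-Java sketch of that rule under that assumption. It is not
Spark's or Excel's actual implementation, and the names PercentileIncSketch
and percentileInc are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

class PercentileIncSketch {

    // Inclusive percentile with linear interpolation between closest ranks.
    // Assumption: this is the rule behind the Excel figures in this report.
    static double percentileInc(double[] values, double p) {
        if (values.length == 0 || p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("need non-empty data and 0 <= p <= 1");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);

        // Fractional zero-based rank of the requested percentile.
        double rank = p * (sorted.length - 1);
        int lower = (int) Math.floor(rank);
        if (lower >= sorted.length - 1) {
            return sorted[sorted.length - 1]; // p == 1.0 or a single value
        }
        double fraction = rank - lower;
        // Interpolate between the two neighbouring sorted values.
        return sorted[lower] + fraction * (sorted[lower + 1] - sorted[lower]);
    }

    public static void main(String[] args) {
        double[] data = {23456.55, 34532.55, 23456.55};
        double[] percentages = {0.0, 0.10, 0.25, 0.50, 0.75, 0.90, 1.0};
        for (double p : percentages) {
            System.out.println("p=" + p + " -> " + percentileInc(data, p));
        }
    }
}
```

For the three values in this report, the sketch gives approximately 28994.55
at p=0.75, 32317.35 at p=0.90 and 34532.55 at p=1.0, in line with the Excel
row. As far as I understand the Spark SQL documentation, percentile_approx()
returns a value that is actually present in the column rather than an
interpolated one, while the exact percentile() aggregate interpolates, so the
latter may be the closer match to Excel.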

Please see the details above.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

