[ 
https://issues.apache.org/jira/browse/MATH-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046272#comment-13046272
 ] 

Christopher Nix commented on MATH-582:
--------------------------------------

I believe the implementation of percentiles within the library is in accordance 
with the NIST definition of percentiles.  To address your examples separately:

1.  What is missing from the API description of the implementation is the rule 
"If pos < 1 then return the smallest element in the array".  With p = 25 and 
n = 2, pos = 0.75 < 1, so the value of 0.0 returned in your first example is 
indeed correct for this implementation.

2.  In this definition of percentiles, the value of pos is a position in the 
sorted array to be interpolated, but with array indices starting at 1.  So with 
pos = 1.25, the value returned is correctly a quarter of the way between the 
1st and 2nd array values (0 and 1), i.e. 0.25.
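
To make the two points above concrete, here is a minimal sketch of the rule as 
I have described it.  It is an illustration only, not the library source: the 
class and method names are hypothetical, and the handling of pos >= n (return 
the largest element) is my assumption about the upper edge.

```java
import java.util.Arrays;

// Illustrative sketch only; NOT the Commons Math source.
// Class and method names are hypothetical.
final class NistPercentileSketch {

    /** Estimates the p-th percentile (0 < p <= 100) using pos = p*(n+1)/100. */
    static double percentile(double[] values, double p) {
        if (values.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need non-empty data and 0 < p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) {
            return sorted[0];
        }
        double pos = p * (n + 1) / 100.0;   // 1-based position in the sorted array
        if (pos < 1) {
            return sorted[0];               // the rule missing from the API text
        }
        if (pos >= n) {
            return sorted[n - 1];           // assumed behaviour at the upper edge
        }
        int k = (int) Math.floor(pos);      // 1-based index of the lower neighbour
        double d = pos - k;                 // fractional part used to interpolate
        double lower = sorted[k - 1];       // convert 1-based position to 0-based
        double upper = sorted[k];
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[] {0d, 1d}, 25));         // prints 0.0
        System.out.println(percentile(new double[] {0d, 1d, 1d, 1d}, 25)); // prints 0.25
    }
}
```

Both of your examples come out as the library reports them: 0.0 because 
pos = 0.75 < 1, and 0.25 because positions 1 and 2 hold 0 and 1.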

Percentiles often behave counter-intuitively on small datasets.  Other 
definitions, for example one with pos = 1 + p*(n-1)/100 (as in MS Excel), may 
match your expectations better for the datasets above, but not so well for 
medium-sized ones.  With large datasets, the two definitions converge.
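
For comparison, a rough sketch of that alternative position formula follows.  
Again this is only an illustration: the class and method names are 
hypothetical, and I am not claiming it reproduces any particular 
spreadsheet function exactly.

```java
import java.util.Arrays;

// Illustrative sketch of the alternative definition pos = 1 + p*(n-1)/100.
// Class and method names are hypothetical.
final class AlternativePercentileSketch {

    /** Estimates the p-th percentile (0 <= p <= 100) by linear interpolation. */
    static double percentile(double[] values, double p) {
        if (values.length == 0 || p < 0 || p > 100) {
            throw new IllegalArgumentException("need non-empty data and 0 <= p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double pos = 1 + p * (n - 1) / 100.0; // 1-based position, always in [1, n]
        int k = (int) Math.floor(pos);        // 1-based index of the lower neighbour
        if (k >= n) {
            return sorted[n - 1];             // p = 100 lands exactly on the maximum
        }
        double d = pos - k;                   // fractional part used to interpolate
        return sorted[k - 1] + d * (sorted[k] - sorted[k - 1]);
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[] {0d, 1d}, 25));         // prints 0.25
        System.out.println(percentile(new double[] {0d, 1d, 1d, 1d}, 25)); // prints 0.75
    }
}
```

Under this definition the position never falls below 1, so the first example 
yields 0.25 instead of 0.0, while the second yields 0.75 instead of 0.25.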

Hope this helps,

Chris N

> Percentile does not work as described in API
> --------------------------------------------
>
>                 Key: MATH-582
>                 URL: https://issues.apache.org/jira/browse/MATH-582
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 2.2
>            Reporter: Andre Herbst
>
> example call:
> StatUtils.percentile(new double[]{0d, 1d}, 25)   returns 0.0
> The API says that there is a position being computed:  p*(n+1)/100 -> we have 
> p=25 and n=2
> I would expect position 0.75 as result. Next step according to the API is: 
> interpolation between both values at floor(0.25) and at ceil(0.25). Those 
> values are 0d and 1d ... so lower + d * (upper - lower) should give 0d + 
> 0.25*(1d - 0d) = 0.25
> But the above call returns 0 as result. This does not make sense to me.
> another example where I think the result is not correct:
> StatUtils.percentile(new double[]{0d, 1d, 1d, 1d}, 25)   returns 0.25
> we have pos = 25*5/100 = 1.25  ... so d = 0.25
> values at position floor(1.25) and ceil(1.25) are 1d and 1d. How comes that 
> the result is not between 1d?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
