[ https://issues.apache.org/jira/browse/MATH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283756#comment-14283756 ]
Thomas Neidhart commented on MATH-1197:
---------------------------------------
One observation: the samples contain a lot of equal values.
The KS test statistic is implemented using Arrays.binarySearch, but that method
does not specify which index is returned when the sorted array contains several
elements equal to the search key.
E.g. if you have the sample [0, 0, 0, 0, 0, 1] and you search for 0, you might
get any index in the range [0, 4]. As far as I understand the KS statistic, it
is based on the empirical distribution function, whose value at an observation
is the fraction of sample values less than or equal to that observation. With
ties, that count cannot be reliably derived from the index returned by
Arrays.binarySearch.
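
To illustrate the point about ties, here is a minimal Java sketch. It is not
the Commons Math implementation and not a proposed patch; the class name
EcdfSketch and the methods countLessOrEqual and ksStatistic are hypothetical
names chosen for this example. It replaces Arrays.binarySearch with an
upper-bound binary search, so that the count of values less than or equal to
an observation is well defined even when the sample contains duplicates, and
it assumes the samples are non-empty and contain no NaN values.

```java
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class EcdfSketch {

    // Returns how many elements of the sorted array are <= value.
    // Upper-bound binary search: the result is the index of the first
    // element strictly greater than value, so ties are always counted.
    static int countLessOrEqual(double[] sorted, double value) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= value) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Two-sample KS statistic: the largest absolute difference between the
    // two empirical distribution functions, evaluated at every observation.
    // O((n + m) log(n + m)); assumes non-empty inputs without NaN.
    static double ksStatistic(double[] x, double[] y) {
        double[] sx = x.clone();
        double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);
        double d = 0.0;
        for (double[] sample : new double[][] {sx, sy}) {
            for (double t : sample) {
                double fx = countLessOrEqual(sx, t) / (double) sx.length;
                double fy = countLessOrEqual(sy, t) / (double) sy.length;
                d = Math.max(d, Math.abs(fx - fy));
            }
        }
        return d;
    }

    public static void main(String[] args) {
        double[] sample = {0, 0, 0, 0, 0, 1};
        // All five zeros are counted, whichever one a plain binary search hits.
        System.out.println(countLessOrEqual(sample, 0.0)); // 5
        System.out.println(ksStatistic(sample, new double[] {0, 1, 1, 1}));
    }
}
```

For the sample [0, 0, 0, 0, 0, 1], countLessOrEqual returns 5 for the key 0,
which corresponds to an empirical distribution value of 5/6, whereas an index
from Arrays.binarySearch could be anywhere in [0, 4].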
> Incorrect Kolmogorov–Smirnov Statistic for two samples
> -------------------------------------------------------
>
> Key: MATH-1197
> URL: https://issues.apache.org/jira/browse/MATH-1197
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.4.1
> Environment: Ubuntu 14.04
> Reporter: Danaja Thiyunuwan Maldeniya
>
> kolmogorovSmirnovTest(double[],double[]) against the samples given below
> gives 5.699107852308316E-12 instead of 0.9793 (approx.). I traced the issue to
> kolmogorovSmirnovStatistic(double[],double[]), which gives 0.49507389162561577
> instead of 0.064 (verified with ks.test in R and JDistlib).
> double[] x =
> {0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.202653,2.202653,2.202653
>
> ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653
>
> ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653
>
> ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.181199,3.181199,3.181199,3.181199,3.181199,3.181199,3.723539
>
> ,3.723539,3.723539,3.723539,4.383482,4.383482,4.383482,4.383482,5.320671,5.320671,5.320671,5.717284,6.964001,7.352165
>
> ,8.710510,8.710510,8.710510,8.710510,8.710510,8.710510,9.539004,9.539004,
> 10.720619, 17.726077, 17.726077, 17.726077, 17.726077
> ,22.053875 ,23.799144 ,27.355308 ,30.584960 ,30.584960
> ,30.584960, 30.584960, 30.751808};
> double[] y =
> {0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
>
> ,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,2.202653
>
> ,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,2.202653,3.061758,3.723539,5.628420,5.628420,5.628420,5.628420
> ,5.628420,6.916982,6.916982,6.916982, 10.178538, 10.178538,
> 10.178538, 10.178538, 10.178538 };