[
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614601#comment-13614601
]
Phil Steitz commented on MATH-437:
----------------------------------
I think we should bump this to 4.0 or at least 3.3. It was probably a mistake
to put K-S in the distribution package. The K-S distribution itself is of
little practical usefulness (to my knowledge at least). I have never seen it
used for anything but performing K-S tests. It is tricky enough to compute the
distribution function itself with any kind of numerical stability, as the
comments above and the literature around K-S tests confirm. Computing moments
is, as the reference where Luc (resourcefully!) found test data states,
"intractable." I think it may be best to steer clear of this and focus on just
getting a good implementation of the test itself, which should move to
.inference. I would prefer to do a little more research, though, to decide how
best to set up the API and implementation for the test. It could be that we
would be better off not using the cdfs in the current impl, and instead using
a beta approximation to compute p-values as in [1]. Note also that since the
discussion above and the initial implementation, Simard has published [2]
with some empirical findings on how the various K-S approximation methods
perform.
So to summarize, I think the first step is to agree on the K-S test API. After
that, we can deprecate the class in .distribution and move the test class to
.inference.
[1] http://www.ism.ac.jp/editsec/aism/pdf/054_3_0577.pdf
[2] http://www.jstatsoft.org/v39/i11/paper
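For concreteness, here is a minimal sketch of what "just the test" involves:
the one-sample statistic D_n = sup |F_n(x) - F(x)| and the classical asymptotic
(Kolmogorov series) two-sided p-value. This is only an illustration under
stated assumptions. The class and method names (KsSketch, statistic,
asymptoticPValue) are hypothetical and are not Commons Math API. The p-value
is the textbook large-sample series, not the beta approximation of [1] nor any
of the exact or hybrid methods compared in [2], and it can be inaccurate for
small samples.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a one-sample K-S test; not part of Commons Math. */
final class KsSketch {

    private KsSketch() {}

    /**
     * One-sample statistic D_n = sup_x |F_n(x) - F(x)|, where F_n is the
     * empirical cdf of the sample and F is a continuous reference cdf.
     */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(x[i]);
            // F_n jumps from i/n to (i+1)/n at x[i], so check both sides.
            double above = (i + 1.0) / n - f;
            double below = f - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    /**
     * Asymptotic two-sided p-value:
     * 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * n * d^2).
     * Assumption: n is large enough for the limiting distribution to apply.
     */
    static double asymptoticPValue(double d, int n) {
        double t = n * d * d;
        // For sqrt(n)*d below about 0.2 the limiting cdf is numerically zero,
        // and the alternating series converges poorly, so return 1 directly.
        if (t < 0.04) {
            return 1.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k < 100; k++) {
            double term = Math.exp(-2.0 * k * k * t);
            sum += sign * term;
            if (term < 1e-16) {
                break;
            }
            sign = -sign;
        }
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }
}
```

With something like this, a caller would pass the sample and the reference
cdf as a function (for example x -> x for the uniform distribution on [0, 1])
and never needs a K-S distribution class in .distribution at all.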
> Kolmogorov Smirnov Distribution
> -------------------------------
>
> Key: MATH-437
> URL: https://issues.apache.org/jira/browse/MATH-437
> Project: Commons Math
> Issue Type: New Feature
> Reporter: Mikkel Meyer Andersen
> Assignee: Phil Steitz
> Priority: Minor
> Fix For: 3.2
>
> Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
> Original Estimate: 0.25h
> Remaining Estimate: 0.25h
>
> The Kolmogorov-Smirnov test (see [1]) is used to test one sample against a
> known probability density function, or to test whether two samples are from
> the same distribution. To evaluate the test statistic, the
> Kolmogorov-Smirnov distribution is used. Quite good asymptotics exist for
> the one-sided test, but it's more difficult for the two-sided test.
> [1]: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test