[ 
https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614601#comment-13614601
 ] 

Phil Steitz commented on MATH-437:
----------------------------------

I think we should bump this to 4.0 or at least 3.3. It was probably a mistake 
to put K-S in the distribution package.  The K-S distribution itself is of 
little practical use (to my knowledge at least); I have never seen it used 
for anything but performing K-S tests.  It is tricky enough to compute the 
distribution function itself with any kind of numerical stability, as the 
comments above and the literature around K-S tests confirm.  Computing moments 
is "intractable," as stated in the reference where Luc (resourcefully!) found 
test data.  I think it may be best to steer clear of this and focus on just 
getting a good implementation of the test itself, which should move to 
.inference.  I would prefer to do a little more research, though, to decide how 
best to set up the API and implementation for the test.  It could be that we 
would be better off not using the cdfs in the current impl, and instead using 
a beta approximation to compute p-values as in [1].  Note also that since the 
discussion above and the initial implementation, Simard has published [2] with 
some empirical findings on how the various K-S approximation methods perform.
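
To make the p-value question concrete, here is a minimal sketch of the 
classical large-sample (Kolmogorov limiting series) p-value for the two-sided 
one-sample statistic.  It is neither the beta approximation of [1] nor any of 
the methods compared in [2], and it can be noticeably off for small n.  The 
class and method names are hypothetical and not part of Commons Math.

```java
/** Hypothetical helper, for illustration only; not a Commons Math class. */
final class KolmogorovAsymptotic {
    private static final int MAX_TERMS = 100;

    private KolmogorovAsymptotic() {}

    /**
     * Large-sample p-value for a two-sided one-sample K-S statistic d
     * computed from n observations:
     *   Q(x) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2),  x = sqrt(n) * d.
     */
    static double asymptoticPValue(double d, int n) {
        if (n <= 0 || d < 0.0 || d > 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 <= d <= 1");
        }
        double x = Math.sqrt(n) * d;
        // The alternating series converges slowly for small x, where
        // Q(x) differs from 1 by less than 1e-12 anyway.
        if (x < 0.2) {
            return 1.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= MAX_TERMS; k++) {
            double term = Math.exp(-2.0 * k * k * x * x);
            sum += sign * term;
            if (term < 1e-16) {
                break; // remaining terms are below double precision
            }
            sign = -sign;
        }
        // Clamp to [0, 1] to guard against rounding in the alternating sum.
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }
}
```

For example, asymptoticPValue(0.136, 100) evaluates the series at x = 1.36, 
the familiar 5% critical value, and returns roughly 0.0495.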

So to summarize, I think the first step is to agree on the K-S test API.  Then 
we can deprecate the class in .distribution and move the test class to 
.inference.

[1] http://www.ism.ac.jp/editsec/aism/pdf/054_3_0577.pdf
[2] http://www.jstatsoft.org/v39/i11/paper
                
> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Phil Steitz
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The Kolmogorov-Smirnov test (see [1]) is used to test one sample against a 
> known probability density function, or to test whether two samples are from 
> the same distribution. To evaluate the test statistic, the Kolmogorov-Smirnov 
> distribution is used. Quite good asymptotics exist for the one-sided test, 
> but it's more difficult for the two-sided test.
> [1]: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
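
The test statistic mentioned in the issue description can be sketched as 
follows: the largest gap between the empirical distribution function and a 
reference CDF (one-sample case), or between two empirical distribution 
functions (two-sample case).  This is only an illustration under simple 
assumptions (no NaN values, non-empty samples); the names KsStatistic, 
oneSample and twoSample are hypothetical and do not reflect any agreed 
Commons Math API.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, for illustration only; not a Commons Math class. */
final class KsStatistic {
    private KsStatistic() {}

    /** Two-sided one-sample statistic: sup |F_n(x) - F(x)| for a reference CDF F. */
    static double oneSample(double[] sample, DoubleUnaryOperator cdf) {
        int n = sample.length;
        if (n == 0) {
            throw new IllegalArgumentException("sample must not be empty");
        }
        double[] sorted = sample.clone();
        Arrays.sort(sorted);
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(sorted[i]);
            double above = (i + 1.0) / n - f;   // ECDF just after the jump
            double below = f - (double) i / n;  // ECDF just before the jump
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    /** Two-sample statistic: sup |F_n(x) - G_m(x)| over the pooled sample. */
    static double twoSample(double[] x, double[] y) {
        if (x.length == 0 || y.length == 0) {
            throw new IllegalArgumentException("samples must not be empty");
        }
        double[] a = x.clone();
        double[] b = y.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0;
        int j = 0;
        double d = 0.0;
        while (i < a.length && j < b.length) {
            double v = Math.min(a[i], b[j]);
            // Advance both indices past v so that ties are handled together.
            while (i < a.length && a[i] <= v) { i++; }
            while (j < b.length && b[j] <= v) { j++; }
            d = Math.max(d, Math.abs((double) i / a.length - (double) j / b.length));
        }
        return d;
    }
}
```

For instance, testing the sample {0.25, 0.5, 0.75} against the uniform CDF on 
[0, 1] gives a statistic of 0.25, and two samples with no overlap in range 
give the maximum value of 1.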

