[ https://issues.apache.org/jira/browse/MATH-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046973#comment-14046973 ]
Phil Steitz commented on MATH-1131:
-----------------------------------
I think the patch definitely improves things, so +1 to commit that for now. I
am not sure that the Marsaglia-Tsang method is best for large n, though. It
might be best to either a) just use the Kolmogorov approximation or b) use what
Simard-L'Ecuyer ([2] in the class javadoc) refer to as the Pelz-Good method for
large n (or more precisely large n*d). I think R does a). The two-sample
tests do a).
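
For reference, option a) can be sketched roughly as below. This is only a
minimal standalone illustration of the asymptotic Kolmogorov approximation,
p ~ 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 t^2) with t = sqrt(n) * d, and not
the code in the patch or in the class. The class name, method name, constants
and the small-t cutoff are all assumptions made for the example.

```java
/** Hypothetical helper; the names here are illustrative, not Commons Math API. */
public final class KolmogorovAsymptotic {

    /** Terms smaller than this are treated as negligible (assumed value). */
    private static final double TOLERANCE = 1e-20;

    /** Safety cap on the number of series terms (assumed value). */
    private static final int MAX_TERMS = 100;

    private KolmogorovAsymptotic() {}

    /**
     * Approximates P(D_n >= d) using the limiting Kolmogorov distribution.
     *
     * @param d observed one-sample KS statistic, must be non-negative
     * @param n sample size, must be positive
     * @return approximate p-value in [0, 1]
     */
    public static double approximatePValue(double d, int n) {
        if (n <= 0 || d < 0) {
            throw new IllegalArgumentException("need n > 0 and d >= 0");
        }
        double t = Math.sqrt(n) * d;
        // For small t the alternating series converges slowly, while the
        // limiting CDF is negligibly different from 0 there, so p is taken as 1.
        if (t < 0.2) {
            return 1.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= MAX_TERMS; k++) {
            double term = Math.exp(-2.0 * (double) k * (double) k * t * t);
            sum += sign * term;
            if (term < TOLERANCE) {
                break; // remaining terms are smaller still
            }
            sign = -sign;
        }
        // Clamp to [0, 1] to guard against rounding.
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }
}
```

The cost is a handful of exp() calls regardless of n, which is why it is
attractive for large n; the trade-off is that it is only an approximation and
is less accurate when n is small.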
> Kolmogorov-Smirnov Tests takes 'forever' on 10,000 item dataset
> ---------------------------------------------------------------
>
> Key: MATH-1131
> URL: https://issues.apache.org/jira/browse/MATH-1131
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.3
> Environment: Java 8
> Reporter: Schalk W. Cronjé
> Attachments: 1.txt, MATH-1131.patch, ReproduceKsIssue.groovy,
> ReproduceKsIssue.java
>
>
> I have code simplified to the following:
> KolmogorovSmirnovTest kst = new KolmogorovSmirnovTest();
> NormalDistribution nd = new NormalDistribution(mean, stddev);
> kst.kolmogorovSmirnovTest(nd, dataset);
> I find that for my dataset of 10,000 items, the call to kolmogorovSmirnovTest
> takes 'forever'. It has not returned after nearly 15 minutes, and in one of my
> tests memory usage has gone over 150 MB.