The monte carlo approach I developed for 2-sample Kolmogorov-Smirnov
tests converges too slowly to be practical. I suspect full
enumeration of n - m partitions of n + m will actually be faster for
small m + n. To do this, I need to enumerate combinations. I have
implemented a fast, non-recursive algorithm to do this (Knuth's
algorithm T from 7.2.1.3 of TACP 4A), exposed as
class LexicographicCombinationIterator implements Iterator<int[]>
The constructor takes <n, k> and the int[] arrays returned by next()
are increasing k-length arrays from {0, ..., n - 1}, iterated in
lexicographic order. The array-based approach is what I need in K-S
and also fastest.
Initially, I implemented this as a private inner class in
KolmogorovSmirnovTest, but this is making testing inconvenient and
it also seems like a generically useful class. The question is
where to put it. One possibility is to make it a public inner class
of MathArrays. Of just add it as a class in util. Advice appreciated.
Phil
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]