The monte carlo approach I developed for 2-sample Kolmogorov-Smirnov
tests converges too slowly to be practical.  I suspect full
enumeration of n - m partitions of n + m will actually be faster for
small m + n.  To do this, I need to enumerate combinations.  I have
implemented a fast, non-recursive algorithm to do this (Knuth's
algorithm T from 7.2.1.3 of TACP 4A), exposed as

class LexicographicCombinationIterator implements Iterator<int[]>

The constructor takes <n, k> and the int[] arrays returned by next()
are increasing k-length arrays from {0, ..., n - 1}, iterated in
lexicographic order.  The array-based approach is what I need in K-S
and also fastest. 

Initially, I implemented this as a private inner class in
KolmogorovSmirnovTest, but this is making testing inconvenient and
it also seems like a generically useful class.  The question is
where to put it.  One possibility is to make it a public inner class
of MathArrays.  Of just add it as a class in util.  Advice appreciated.

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to