Re: random cover of a range

Jason House Fri, 13 Feb 2009 06:05:32 -0800

Jason House Wrote:

> Andrei Alexandrescu Wrote:
> 
> > dsimcha wrote:
> > > == Quote from Andrei Alexandrescu ([email protected])'s 
> > > article
> > >> Ok, following Leonardo's and Denis' ideas on covering an immutable array
> > >> in random order, I thought about it and came up with one algorithm, in
> > >> addition to the obvious one that stores one integer per element (see
> > >> Denis' code).
> > >> Here's the algorithm. It stores one bit per element and takes O(n log n)
> > >> time to cover a range of length n.
> > >> 1. Given a random-access range r of length n, create a bitmap with one
> > >> bit per element in r. The range is initially all zeros.
> > >> 2. At each step, select a random index in the _original_ array, i.e. a
> > >> random number in 0 .. r.length. Then walk upwards (with wrapping at the
> > >> end of the range) until you find a bit = 0 in the bitmap. Mark that bit
> > >> = 1 and return the found element.
> > >> 3. The cover ends when all bits in the bitmap are 1.
> > >> I did a ballpark complexity estimate by approximating the average number
> > >> of steps in (2) with n / (n - k), where k is the number of 1s in the
> > >> bitmap. IIRC the average number of steps to hit a 1 in a binomial
> > >> distribution with p = (n - k) / n is 1/p. Right? In that case, the total
> > >> number of steps taken by the algo above is:
> > >> N = n / 1 + n / 2 + ... + n / (n - 1) = n * (1 + 1/2 + 1/3 + ... + 1/(n
> > >> - 1)) < O (n log n)
> > >> Makes sense?
> > >> If the range does not offer random access, I think this algorithm will
> > >> have quadratic complexity because it needs to make a number of steps
> > >> proportional to n at each iteration.
> > >> Andrei
> > > 
> > > The concern I have with this scheme is that iterating over an array with 
> > > this
> > > cover would not produce an unbiased distribution of permutations.  A good
> > > shuffling algorithm will produce a perfectly uniform distribution over all
> > > possible permutations of the original range, assuming the underlying 
> > > random number
> > > generator is unbiased.  In other words, given an array of length N, every
> > > permutation should have an equal probability 1/N!.
> > > 
> > > To get this unbiasedness, after selecting M values from a larger set N, 
> > > the
> > > probabilities of selecting any of the remaining N - M elements as element 
> > > M + 1
> > > must be uniform.  In this case, it is clearly not, since elements just 
> > > after
> > > elements that have already been selected are "bigger targets".  Similar 
> > > problems
> > > exist for naive implementations of the Fisher-Yates algorithm
> > > (http://en.wikipedia.org/wiki/Fisher-Yates_shuffle).  In other words, 
> > > unless one
> > > is very careful, one will not get a truly unbiased, random permutation.
> > > 
> > > That said, for certain applications, what you propose may be "random 
> > > enough."
> > > However, if you do include it, please at least document that it is biased 
> > > and
> > > should not be used for things like monte carlo simulations or gambling, 
> > > where
> > > people might care.
> > 
> > I can't believe I fell for this. I knew Fisher-Yates and used it in 
> > randomShuffle!
> > 
> > I won't put a subtly skewed distribution in the standard library. The 
> > quest for finding a random cover of an array with as little extra memory 
> > consumed and with good complexity is on! I'd appreciate any ideas.
> 
> 
> The obvious baseline is to make an array with 0..n. Start on the left and 
> swap with a random element to the right. That's O(n) for both speed and 
> memory.


Oops, that's wrong. It's O(n log(n)) for memory.

 
> O(n^2) runtime algorithms are easy with a bit array.

It's also possible to use a bit array for remembering what was picked but pick 
f(n) elements at a time. The memory is then O(n + log(n)f(n)) and the runtime 
>= O(n^2/f(n) + f(n)log(f(n))
 
> I bet a special random generator could be built for O(1) memory at the 
> sacrifice of less random sequences. I think it should be possible to pick 
> seed numbers to a generator that will cycle through all values in a set order.
> 
> I'll think more to see if I can come up with something creative.

Re: random cover of a range

Reply via email to