dsimcha wrote:
== Quote from Andrei Alexandrescu ([email protected])'s article
I'm writing a range to select a random sampling of a forward range. The
number of elements in the forward range is already known.
The strategy is: at any moment, I know I need to select k random samples
out of n. To implement popFront, I generate a random number in [0, n -
k]. That's the probability that the next item in the input will be part
of the selection. If the random number is 0, then the element got
selected. Otherwise, I must ignore that guy so I popFront the input
range, decrease n by one, and repeat.
This seems reasonable but somehow the selection comes skewed: the
elements towards the beginning of the range are less represented than
the ones towards the end. Where am I wrong?
Andrei

I'm confused.  As far as I can tell, RandomCover is correct in 2.030.  I both 
read
the code and did a monte carlo test for some short sequences to see if the
distribution was uniform on the space of possible permutations.  Isn't this 
just a
case of using RandomCover, but stopping before iterating through the entire 
range?
 In other words, wouldn't the following work:

auto selectK(R)(R someRange, uint K, Random gen) {
    assert(K <= someRange.length);
    return take(K, randomCover(someRange, gen));
}

You are right. I'm just less than happy with randomCover because it needs additional memory, so I eliminated it as a possibility from the get-go.

Andrei

Reply via email to