On 03.05.2017 01:09, MysticZach wrote:
Counterexample: [1,1,1,0,0]
Your algorithm returns 0 with probability 7/20, 1 with probability
6/20 and 2 with probability 7/20.
Note that there is a simple reason why your algorithm cannot work for
this case: it picks one of 20 numbers at random. The resulting
probability mass cannot be evenly distributed among the three
elements, because 20 is not divisible by 3.
That's true. Two points, though: If the range of error is within
1/(n*(n-1)), with array length n,
It's not though. For example, [1,1,1,0,...,0] (length 29), you get 0 and
2 each with probability 43/116, but 1 only with probability 30/116.
It might be interesting to figure out how far from uniform the
distribution can get.
it may be close enough for a given
real world use case. 2. n is known to be large, which makes 1/(n*(n-1))
even smaller. You'd probably have to trade accuracy for performance. I
wonder how much less performant a truly accurate algorithm would be.
Also, whether there's a way to make an algorithm of this style
completely accurate.
For an accurate algorithm based on (unconditional) uniformly random
sampling, the number of outcomes from sampling needs to be divisible by
all numbers from 1 to n. (Because each of those could conceivably be the
number of elements satisfying the predicate.)