When the n'th number (n>k) is received, apply the following algorithm: 1. Generate random real r uniformly distributed in [0..1]. 2. If r > k/n, throw away the new number, 3. else generate random integer i in [1..k] and replace the number in the i'th storage location with the new one.
This is working properly if after completion each number in storage is there with probability k/n. Certainly this is true of the new number. The rest of the numbers must have been in storage with probability k/(n-1) before our algorithm ran. In order to remain in storage, they had not to be replaced. The probability of replacement was k/n * 1/k = 1/n. So they remained through this execution of the algorithm with probability 1-1/n = (n-1)/n. This execution was independent of previous ones, so the overall probability of remaining in storage through all executions was k/(n-1) * (n-1)/n = k/n as desired. On Mar 17, 2:23 pm, karthikeya s <[email protected]> wrote: > You are given a dynamic stream of numbers and numbers are coming one > by one. At a time you can store at max k numbers(coz you have only > that much memory available). Task is that we have to select the random > subset of size k from the numbers we have got so far with equal > probability. -- You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/algogeeks?hl=en.
