Re: sampling from frequency distribution / histogram without replacement

2019-01-18 Thread duncan smith
On 14/01/2019 20:11, duncan smith wrote:
> Hello,
> Just checking to see if anyone has attacked this problem before for cases where the population size is unfeasibly large, i.e. the number of categories is manageable, but the sum of the frequencies, N, precludes simple solutions such

Re: sampling from frequency distribution / histogram without replacement

2019-01-15 Thread duncan smith
On 15/01/2019 17:59, Ian Hobson wrote:
> Hi,
>
> If I understand your problem you can do it in two passes through the population.
>

The thing is that I start with the population histogram and I want to generate a sample histogram. The population itself is too large to deal with each

Re: sampling from frequency distribution / histogram without replacement

2019-01-15 Thread Ian Hobson
Hi,

If I understand your problem, you can do it in two passes through the population. First, however, let's work through taking a sample of 2 from 7 to demonstrate the method. Take the first element with a probability of 2/7. (Note 1). If you took it, you only want 1 more, so the probability
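The "2 from 7" walk-through above reads like sequential selection sampling: take each element with probability (still needed) / (still remaining). The post is cut off, so the sketch below assumes the method simply continues that pattern for every later element. The names `selection_sample` and `rng` are illustrative and do not come from the thread.

```python
import random


def selection_sample(items, k, rng=random):
    """Pick k items from a sequence of known length in a single pass.

    Each item is taken with probability needed / remaining, so for
    k = 2 from 7 the first item is taken with probability 2/7; if it
    is taken, the second is taken with probability 1/6, otherwise 2/6.
    """
    remaining = len(items)   # items not yet examined
    needed = k               # items still to be selected
    chosen = []
    for item in items:
        # rng.random() is in [0, 1), so this is always true once
        # needed == remaining, which guarantees exactly k picks.
        if rng.random() * remaining < needed:
            chosen.append(item)
            needed -= 1
        remaining -= 1
        if needed == 0:
            break
    return chosen


if __name__ == "__main__":
    print(selection_sample(list(range(1, 8)), 2))
```

This visits population elements one at a time, so on its own it does not avoid the cost of a very large N; it only illustrates the probabilities being described.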

Re: sampling from frequency distribution / histogram without replacement

2019-01-15 Thread duncan smith
On 15/01/2019 02:41, Spencer Graves wrote:
> On 2019-01-14 18:40, duncan smith wrote:
>> On 14/01/2019 22:59, Gregory Ewing wrote:
>>> duncan smith wrote:
>>>> Hello,
>>>> Just checking to see if anyone has attacked this problem before for cases where the population size is

Re: sampling from frequency distribution / histogram without replacement

2019-01-14 Thread Spencer Graves
On 2019-01-14 18:40, duncan smith wrote:
> On 14/01/2019 22:59, Gregory Ewing wrote:
>> duncan smith wrote:
>>> Hello,
>>> Just checking to see if anyone has attacked this problem before for cases where the population size is unfeasibly large.
>>
>> The fastest way I know of is to create a list of

Re: sampling from frequency distribution / histogram without replacement

2019-01-14 Thread duncan smith
On 14/01/2019 22:59, Gregory Ewing wrote:
> duncan smith wrote:
>> Hello,
>> Just checking to see if anyone has attacked this problem before for cases where the population size is unfeasibly large.
>
> The fastest way I know of is to create a list of cumulative frequencies, then

Re: sampling from frequency distribution / histogram without replacement

2019-01-14 Thread Gregory Ewing
duncan smith wrote:
> Hello,
> Just checking to see if anyone has attacked this problem before for cases where the population size is unfeasibly large.

The fastest way I know of is to create a list of cumulative frequencies, then generate uniformly distributed numbers and use a binary search
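A minimal sketch of the cumulative-frequency idea, under assumptions of my own since the message is truncated: the uniform numbers are taken to be distinct unit indices in `range(N)`, drawn with `random.sample` (which does not materialise the range), and each index is mapped to its category with `bisect`. That is one way to get sampling without replacement, not necessarily what the poster went on to describe. `sample_histogram` is a hypothetical name, and the function expects a non-empty mapping of category to non-negative integer count.

```python
import bisect
import itertools
import random
from collections import Counter


def sample_histogram(freqs, n, rng=random):
    """Draw n units without replacement from a population histogram.

    freqs maps category -> count. Returns a Counter holding the
    sample histogram. Memory use grows with n and the number of
    categories, not with the population size N.
    """
    categories = list(freqs)
    # cumulative[j] is the number of units in categories[0..j].
    cumulative = list(itertools.accumulate(freqs[c] for c in categories))
    total = cumulative[-1]
    # n distinct unit indices in [0, total), chosen uniformly.
    picks = rng.sample(range(total), n)
    sample = Counter()
    for i in picks:
        # First category whose cumulative count exceeds index i.
        j = bisect.bisect_right(cumulative, i)
        sample[categories[j]] += 1
    return sample


if __name__ == "__main__":
    population = {"a": 3, "b": 5_000_000_000, "c": 10**12}
    print(sample_histogram(population, 1000))
```

The cost is roughly O(n log K) for K categories, so it suits a large N with a moderate sample size; if n itself is huge, drawing category counts directly (for example, successive hypergeometric draws) may be a better fit.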

sampling from frequency distribution / histogram without replacement

2019-01-14 Thread duncan smith
Hello,

Just checking to see if anyone has attacked this problem before for cases where the population size is unfeasibly large, i.e. the number of categories is manageable, but the sum of the frequencies, N, precludes simple solutions such as creating a list, shuffling it and using the first