Re: [R] Random # generator accuracy

2009-07-24 Thread Duncan Murdoch
Sent: Thursday, July 23, 2009 12:00 PM To: r-help@r-project.org Subject: [R] Random # generator accuracy Dan Nordlund wrote: It would be necessary to see the code for your 'brief test' before anyone could meaningfully comment on your results. But your results for a single test could have been

[R] Random # generator accuracy

2009-07-23 Thread Jim Bouldin
Dan Nordlund wrote: It would be necessary to see the code for your 'brief test' before anyone could meaningfully comment on your results. But your results for a single test could have been a valid random result. I've re-created what I did below. The problem appears to be with the weighting

Re: [R] Random # generator accuracy

2009-07-23 Thread Greg Snow
[mailto:r-help-boun...@r-project.org] On Behalf Of Jim Bouldin Sent: Thursday, July 23, 2009 12:00 PM To: r-help@r-project.org Subject: [R] Random # generator accuracy Dan Nordlund wrote: It would be necessary to see the code for your 'brief test' before anyone could meaningfully comment

Re: [R] Random # generator accuracy

2009-07-23 Thread Jim Bouldin
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Bouldin Sent: Thursday, July 23, 2009 12:00 PM To: r-help@r-project.org Subject: [R] Random # generator accuracy Dan Nordlund wrote: It would be necessary to see the code for your

Re: [R] Random # generator accuracy

2009-07-23 Thread Greg Snow
From: Jim Bouldin [mailto:jrboul...@ucdavis.edu] Sent: Thursday, July 23, 2009 12:49 PM To: Greg Snow; r-help@r-project.org Subject: RE: [R] Random # generator accuracy Thanks Greg, that most definitely was it. So apparently the default is sampling without replacement. Fine

Re: [R] Random # generator accuracy

2009-07-23 Thread Ted Harding
On 23-Jul-09 17:59:56, Jim Bouldin wrote: Dan Nordlund wrote: It would be necessary to see the code for your 'brief test' before anyone could meaningfully comment on your results. But your results for a single test could have been a valid random result. I've re-created what I did below.

Re: [R] Random # generator accuracy

2009-07-23 Thread Ted Harding
OOPS! The result of a calculation below somehow got omitted!

  (325820+326140+325289+325098+325475+325916)/
  (174873+175398+174196+174445+173240+174110)
  # [1] 1.867351

to be compared (as at the end) with the ratio 1.867471 of the expected number of weight=2 to expected number of weight=1.

Re: [R] Random # generator accuracy

2009-07-23 Thread Jim Bouldin
You are absolutely correct Ted. When no weights are applied it doesn't matter if you sample with or without replacement, because the probability of choosing any particular value is equally distributed among all such. But when they're weighted unequally that's not the case. It is also

Re: [R] Random # generator accuracy

2009-07-23 Thread Ted Harding
Indeed, Jim! And that's why I said to read carefully what is said about prob in '?sample': If 'replace' is false, these probabilities are applied sequentially, that is the probability of choosing the next item is proportional to the weights amongst the remaining items. Whereas, if you
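
The sequential rule quoted from '?sample' can be spelled out by hand. The following is only a sketch: the helper name draw_sequential is made up for illustration, and the weights (1 for the values 1 to 6, 2 for the values 7 to 12) are assumed from the weight=1 / weight=2 discussion in this thread. It follows the documented description of 'prob' and is not meant to reproduce the exact random stream of sample().

```r
# Assumed setup: values 1..12, weight 1 for 1..6 and weight 2 for 7..12.
x <- 1:12
w <- c(rep(1, 6), rep(2, 6))

# Hypothetical helper: draw 'size' items without replacement, one at a time.
# At each step the chance of picking an item is its weight divided by the
# total weight of the items that are still available.
draw_sequential <- function(x, w, size) {
  picked <- integer(0)
  for (i in seq_len(size)) {
    p <- w / sum(w)                          # renormalise over remaining items
    j <- sample.int(length(x), 1, prob = p)  # index of the item drawn now
    picked <- c(picked, x[j])
    x <- x[-j]                               # remove the drawn item ...
    w <- w[-j]                               # ... and its weight
  }
  picked
}

set.seed(1)
draw_sequential(x, w, 3)
```

Because the weights are renormalised after every draw, the chance that a later draw lands on a weight=2 item depends on what was removed earlier, which is why the with-replacement intuition no longer applies.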

Re: [R] Random # generator accuracy

2009-07-23 Thread Thomas Lumley
On Thu, 23 Jul 2009 ted.hard...@manchester.ac.uk wrote: The general problem, of sampling without replacement in such a way that for each item the probability that it is included in the sample is proportional to a pre-assigned weight (sampling with probability proportional to size) is quite
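
A small exact calculation may help show why this is not trivial. The sketch below assumes the weights discussed in this thread (1 for six items, 2 for the other six) and a sample of size 2 drawn sequentially without replacement; the sample size is an assumption chosen to keep the arithmetic short. It computes each item's inclusion probability and compares the two groups.

```r
# Assumed weights: six items of weight 1, six items of weight 2.
w <- c(rep(1, 6), rep(2, 6))
n <- length(w)

# Probability of being picked on the first draw.
p_first <- w / sum(w)

# Inclusion probability for a sample of size 2:
# P(picked first) + sum over j != i of P(j first) * P(i second | j removed).
incl <- p_first
for (i in seq_len(n)) {
  for (j in seq_len(n)[-i]) {
    incl[i] <- incl[i] + p_first[j] * w[i] / (sum(w) - w[j])
  }
}

sum(incl)            # inclusion probabilities add up to the sample size
incl[7] / incl[1]    # ratio for a weight-2 item versus a weight-1 item
```

Under these assumptions the ratio comes out below 2, so sequential weighted sampling does not give inclusion probabilities proportional to the weights.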

Re: [R] Random # generator accuracy

2009-07-23 Thread Jim Bouldin
Perfectly explained Ted. One might, at first reflection, consider that simply repeating the values 7 through 12 and sampling (w/o replacement) from among the 18 resulting values, would be similar to just doubling the selection probabilities for 7 through 12 and then sampling. That would clearly
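
One way to see that the two schemes differ is to check for repeated values. This is a rough sketch only: the sample size of 3, the seed, and the number of replications are arbitrary assumptions, not taken from the thread.

```r
x    <- 1:12
pool <- c(x, 7:12)                   # 18 values: 7..12 appear twice
w    <- c(rep(1, 6), rep(2, 6))      # doubled weights for 7..12

set.seed(2009)
reps <- 1000

# Does a draw of 3 contain the same value twice?
dup_pool    <- replicate(reps, anyDuplicated(sample(pool, 3)) > 0)
dup_weights <- replicate(reps, anyDuplicated(sample(x, 3, prob = w)) > 0)

mean(dup_pool)      # fraction of pooled draws with a repeated value
any(dup_weights)    # FALSE: weighted draws from 1:12 cannot repeat a value
```

Sampling from the 18-value pool can return both copies of, say, 9 in one sample, whereas sample(x, 3, prob = w) removes a value entirely once it has been drawn.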

Re: [R] Random # generator accuracy

2009-07-23 Thread Ted Harding
On 23-Jul-09 22:16:39, Thomas Lumley wrote: On Thu, 23 Jul 2009 ted.hard...@manchester.ac.uk wrote: The general problem, of sampling without replacement in such a way that for each item the probability that it is included in the sample is proportional to a pre-assigned weight (sampling with