Re: [R] sample function and memory usage
On Tue, 8 May 2007, Victor Gravenholt wrote:

> As a part of a simulation, I need to sample from a large vector repeatedly. For some reason sample() builds up the memory usage (500 MB for this example) when used inside a for loop, as illustrated here:
>
>   X <- 1:10
>   P <- runif(10)
>   for(i in 1:500) Xsamp <- sample(X,3,replace=TRUE,prob=P)
>
> Even worse, I am not able to free up memory without quitting R. I quickly run out of memory when trying to perform the simulation. Is there any way to prevent this from happening?
>
> The problem seems to appear only when specifying both replace=TRUE and probability weights for the vector being sampled, and this happens both on Windows XP and Linux (Ubuntu).

And for 1 size = 10. There was a typo causing memory not to be freed in that range. It is now fixed in 2.5.0 patched.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
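[Editorial sketch, not part of the original thread.] One way to check whether a loop like the one quoted above accumulates memory is to compare the figures reported by gc() before and after it. The sketch below reuses X, P and the loop as they appear in the post; the names mem_before and mem_after are illustrative only. On an R version that includes the fix mentioned above, no sustained growth would be expected.

```r
# Sketch: watch memory in use across repeated weighted sampling with replacement.
# X, P and the loop follow the post; mem_before/mem_after are illustrative names.
X <- 1:10
P <- runif(10)

# gc() returns a matrix; its second column is the memory currently in use (Mb)
# for Ncells and Vcells. Summing the two gives a rough total.
mem_before <- sum(gc()[, 2])

for (i in 1:500) {
  Xsamp <- sample(X, 3, replace = TRUE, prob = P)
}

mem_after <- sum(gc()[, 2])
cat("Mb in use before:", mem_before, " after:", mem_after, "\n")
```

Calling gc() also triggers a garbage collection, so a large difference between the two figures points to memory that R could not reclaim rather than to garbage waiting to be collected.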
[R] sample function and memory usage
As a part of a simulation, I need to sample from a large vector repeatedly. For some reason sample() builds up the memory usage (500 MB for this example) when used inside a for loop, as illustrated here:

  X <- 1:10
  P <- runif(10)
  for(i in 1:500) Xsamp <- sample(X,3,replace=TRUE,prob=P)

Even worse, I am not able to free up memory without quitting R. I quickly run out of memory when trying to perform the simulation. Is there any way to prevent this from happening?

The problem seems to appear only when specifying both replace=TRUE and probability weights for the vector being sampled, and this happens both on Windows XP and Linux (Ubuntu).

Victor
Re: [R] sample function
On 11-Mar-05 Martin C. Martin wrote:

> hist is lumping things together. Try:
>
>   sum(temp == 0)
>
> compare to the height of the left most bar.
>
> Is this a bug in hist?
>
> - Martin

Well, not a bug strictly speaking, since it works as documented, but I do think it's not necessarily a happy choice. The unsuspecting (like Martin) will step into holes even after reading ?hist, since the truths are rather deeply (and I think somewhat obliquely) hidden. ?hist leads you to look up ?nclass.Sturges, which in turn only mentions Sturges' formula and invites you to read VR's MASS book and other references in the hope of further clarification: all a bit much when you just want to draw a histogram, which ought to be kid's stuff! Not to mention the parameters include.lowest and right, whose combined effect is not too obvious.

I'd like to repeat the sort of hint I occasionally give: in using R, if there's any doubt, it is best to spell out exactly what you want rather than expecting the functions to agree with what you want. R functions are often more complex and subtle than you might suspect.

In this particular case,

  hist(temp,breaks= -0.5+(0:14) )

will produce the sort of thing which is wanted.

One could interpret the results which Martin reported as due to a sort of confusion (but on whose part -- R or Martin?) over the fact that hist is designed to deal with continuous values, while his sample consists of integers.

For that particular case, one could also use table or barchart, as has been suggested by David Scott, which would produce a plot of similar appearance; but this is not in the histogram family despite appearances, since it is not primarily a quantitative plot (i.e. one respecting the numerical values and their numerical comparisons), but more a category count.

In particular, natural variants of the above hist command such as

  hist(temp,breaks= -0.5+2*(0:7) )

(which corresponds to binning by different intervals) do not lie so easily in the table or barchart domain.

And I don't agree with David's comment that

> No, hist is the wrong thing to use to display this data.

In so far as these data are considered to be numerical values of which one wants a view of their distribution, then hist is entirely appropriate, as for any other numerical variable. The only question is how to get this to happen appropriately. Would David make the same comment about data sampled from (0:5000) instead of (0:12)?

Best wishes to all,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 11-Mar-05 Time: 10:59:55
-- XFMail --
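[Editorial sketch, not part of the original thread.] The effect of spelling out the breaks can be checked numerically rather than by eye. The sketch below assumes a fresh sample drawn the same way as in the original question, with an arbitrary seed added only for repeatability. It uses breaks -0.5 + (0:13), which gives exactly one unit-width bin per integer 0..12 (the 0:14 variant in the message adds one empty bin on the right), and compares the bin counts with a plain tabulation.

```r
# Sketch: explicit breaks centred on the integers make hist() agree with table().
set.seed(42)  # arbitrary seed, only so the sketch is repeatable
temp <- sample(0:12, 2000, replace = TRUE)

# One bin per integer: (-0.5, 0.5], (0.5, 1.5], ..., (11.5, 12.5]
h <- hist(temp, breaks = -0.5 + (0:13), plot = FALSE)

# Per-value counts; factor() keeps all 13 levels even if one never occurs
counts_by_value <- as.vector(table(factor(temp, levels = 0:12)))
print(rbind(hist = h$counts, table = counts_by_value))

# Coarser bins of width 2, as in the second hist() call in the message
h2 <- hist(temp, breaks = -0.5 + 2 * (0:7), plot = FALSE)
print(h2$counts)
```

With plot = FALSE, hist() returns the breaks and counts without drawing anything, which makes it easy to see which values fall into which bin.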
[R] sample function
Hi everyone,

I need help. I want to have a uniform kind of distribution. When I used the sample function I got almost twice as many zeros compared to other numbers. What's wrong with my command?

  temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
  hist(temp)

Thanks in advance,
Taka
Re: [R] sample function
hist is lumping things together. Try:

  sum(temp == 0)

and compare it to the height of the leftmost bar.

Is this a bug in hist?

- Martin

mirage sell wrote:

> Hi everyone,
>
> I need help. I want to have a uniform kind of distribution. When I used the sample function I got almost twice as many zeros compared to other numbers. What's wrong with my command?
>
>   temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
>   hist(temp)
>
> Thanks in advance,
> Taka
RE: [R] sample function
It's not the simulated data, but how hist() handled it. If you use truehist() in the MASS package, you don't see the problem. Nor would you see it like this:

  table(temp)/length(temp)
  temp
       0      1      2      3      4      5      6      7      8      9     10     11
  0.0745 0.0745 0.0830 0.0755 0.0760 0.0750 0.0700 0.0765 0.0775 0.0805 0.0830 0.0765
      12
  0.0775

Andy

From: mirage sell

> Hi everyone,
>
> I need help. I want to have a uniform kind of distribution. When I used the sample function I got almost twice as many zeros compared to other numbers. What's wrong with my command?
>
>   temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
>   hist(temp)
>
> Thanks in advance,
> Taka
Re: [R] sample function
On Thu, 10 Mar 2005, mirage sell wrote:

> Hi everyone,
>
> I need help. I want to have a uniform kind of distribution. When I used the sample function I got almost twice as many zeros compared to other numbers. What's wrong with my command?

Nothing is wrong with your sampling; it is the display in the histogram. Try

  temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
  table(temp)

David Scott

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]
Graduate Officer, Department of Statistics
Re: [R] sample function
On Thu, 10 Mar 2005, Martin C. Martin wrote:

> hist is lumping things together. Try:
>
>   sum(temp == 0)
>
> compare to the height of the left most bar.
>
> Is this a bug in hist?

No, hist is the wrong thing to use to display this data. Try

  temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
  barplot(table(temp))

David Scott

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]
Graduate Officer, Department of Statistics
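[Editorial sketch, not part of the original thread.] A small variation on the barplot suggestion, assuming the same kind of sample as in the question (the seed is arbitrary). Wrapping the data in factor() with explicit levels is an addition here, not something proposed in the thread; it guarantees a bar for every value 0..12 even if one of them happens not to be drawn.

```r
# Sketch: bar chart of per-value counts with all 13 categories always present.
set.seed(7)  # arbitrary seed, only so the sketch is repeatable
temp <- sample(0:12, 2000, replace = TRUE)

# factor(..., levels = 0:12) keeps zero-count values as categories
tab <- table(factor(temp, levels = 0:12))
print(tab)

# One bar per value; names under the bars are the values themselves
barplot(tab, xlab = "value", ylab = "count")
```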
Re: [R] sample function
On Thu, 2005-03-10 at 20:54 -0600, mirage sell wrote:

> Hi everyone,
>
> I need help. I want to have a uniform kind of distribution. When I used the sample function I got almost twice as many zeros compared to other numbers. What's wrong with my command?
>
>   temp <- sample(0:12, 2000, replace=T,prob=(rep(1/13,13)))
>   hist(temp)
>
> Thanks in advance,

Hint: take note that there are only 12 cells in the plot, not 13...

However, note that the frequencies of the 13 elements are appropriate:

  table(sample(0:12, 2000, replace=T))

    0   1   2   3   4   5   6   7   8   9  10  11  12
  158 156 151 163 156 158 146 154 134 158 146 147 173

Review the details of how the breaks are selected in ?hist.

BTW, you do not need to specify the 'prob' argument if you want equal probabilities, as per my example above.

HTH,

Marc Schwartz
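[Editorial sketch, not part of the original thread.] The hint about the number of cells can be verified by inspecting what hist() computes without plotting. The sketch assumes a sample drawn as in the question, with an arbitrary seed. With the defaults (right = TRUE, include.lowest = TRUE) the first cell is closed on both ends, so it collects every value from the first break up to and including the second, while each later cell excludes its left endpoint.

```r
# Sketch: inspect the default breaks and see what lands in the first cell.
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
temp <- sample(0:12, 2000, replace = TRUE)

h <- hist(temp, plot = FALSE)   # default breaks, nothing drawn
print(h$breaks)                 # the break points hist() chose
n_cells <- length(h$counts)     # number of bars that would be drawn
cat("cells:", n_cells, " distinct values:", length(unique(temp)), "\n")

# The first cell is [breaks[1], breaks[2]], closed on both ends
first_cell <- sum(temp >= h$breaks[1] & temp <= h$breaks[2])
cat("first bar:", h$counts[1], " zeros alone:", sum(temp == 0), "\n")
```

If the breaks come out at the integers, the first bar holds the zeros and the ones together, which is the doubling reported in the question.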
Re: [R] sample function
R is not S-PLUS, and you need Modern Applied Statistics in S (4th ed) for a description including R.

sample in R uses a PRNG: see ?RNG in R for the details of PRNGs in R.

On Fri, 27 Jun 2003, Ramzi Feghali wrote:

> I have a question about the sample function used in R: does it work as a pseudo-random number generator programmed in C, as described in Modern Applied Statistics with S-Plus, 3rd edition, chapter 5, section 2?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
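[Editorial sketch, not part of the original thread.] Because sample() draws from R's pseudo-random number generator, resetting the seed reproduces the same draws. The sketch below shows this with set.seed() and reports the generator in use with RNGkind(); the seed value and the names a and b are arbitrary choices for illustration.

```r
# Sketch: sample() is driven by R's PRNG, so the same seed gives the same draws.
print(RNGkind())   # names of the generators currently in use

set.seed(2003)     # arbitrary seed
a <- sample(0:12, 5, replace = TRUE)

set.seed(2003)     # reset to the same state
b <- sample(0:12, 5, replace = TRUE)

print(identical(a, b))   # the two draws coincide
```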