Re: [R] sample function and memory usage

2007-05-09 Thread Prof Brian Ripley
On Tue, 8 May 2007, Victor Gravenholt wrote:

> As a part of a simulation, I need to sample from a large vector repeatedly.
> For some reason sample() builds up the memory usage (> 500 MB for this
> example) when used inside a for loop as illustrated here:
>
> X <- 1:10
> P <- runif(10)
> for(i in 1:500) Xsamp <- sample(X,3,replace=TRUE,prob=P)
>
> Even worse, I am not able to free up memory without quitting R.
> I quickly run out of memory when trying to perform the simulation. Is
> there any way to avoid this?
>
> The problem seems to appear only when specifying both replace=TRUE and
> probability weights for the vector being sampled, and this happens both
> on Windows XP and Linux (Ubuntu).

And for 1 < size <= 10.  There was a typo causing memory not to be
freed in that range.  It is now fixed in 2.5.0 patched.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
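
For anyone who cannot move to the patched build, one possible workaround is to avoid the replace=TRUE plus prob= branch of sample() and draw by inverse-CDF lookup instead. The sketch below is not from the thread: weighted_sample is a hypothetical helper name, and the approach is a substitute technique (cumulative weights plus findInterval), not what sample() does internally.

```r
# Hypothetical helper (not part of base R): weighted sampling with
# replacement via inverse-CDF lookup, without calling sample(prob = ...).
weighted_sample <- function(x, size, prob) {
  stopifnot(length(x) == length(prob), all(prob >= 0), sum(prob) > 0)
  cp <- cumsum(prob)
  cp <- cp / cp[length(cp)]      # normalise so the last value is exactly 1
  u <- runif(size)               # uniform draws in (0, 1)
  x[findInterval(u, cp) + 1L]    # index of the first cumulative value > u
}

set.seed(1)                      # arbitrary seed, for reproducibility only
X <- 1:10
P <- runif(10)
for (i in 1:500) Xsamp <- weighted_sample(X, 3, P)
```

The weights do not need to sum to one, since the helper normalises them, and elements with zero weight are never drawn.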


[R] sample function and memory usage

2007-05-08 Thread Victor Gravenholt
As a part of a simulation, I need to sample from a large vector repeatedly.
For some reason sample() builds up the memory usage (> 500 MB for this
example) when used inside a for loop as illustrated here:

X <- 1:10
P <- runif(10)
for(i in 1:500) Xsamp <- sample(X,3,replace=TRUE,prob=P)

Even worse, I am not able to free up memory without quitting R.
I quickly run out of memory when trying to perform the simulation. Is
there any way to avoid this?

The problem seems to appear only when specifying both replace=TRUE and
probability weights for the vector being sampled, and this happens both 
on Windows XP and Linux (Ubuntu).


Victor



Re: [R] sample function

2005-03-11 Thread Ted Harding
On 11-Mar-05 Martin C. Martin wrote:
> hist is lumping things together.
>
> Try:
> sum(temp == 0)
>
> compare to the height of the left most bar.
>
> Is this a bug in hist?
>
> - Martin

Well, not a bug strictly speaking since it works as documented,
but I do think it's not necessarily a happy choice.

The unsuspecting (like Martin) will step into holes even after
reading ?hist, since the truths are rather deeply (and I think
somewhat obliquely) hidden (?hist leads you to look up
?nclass.Sturges which in turn only mentions Sturges' formula
and invites you to read VR's MASS book and other references
in the hope of further clarification -- all a bit much when
you just want to draw a histogram, which ought to be kid's
stuff! Not to mention the things to do with parameters
include.lowest and right whose combined effect is not
too obvious).

I'd like to repeat the sort of hint I occasionally give:

In using R, if there's any doubt it is best to spell out exactly
what you want rather than expecting the functions to agree with
what you want. R functions are often more complex and subtle
than you might suspect.

In this particular case,

  hist(temp,breaks= -0.5+(-0:14) )

will produce the sort of thing which is wanted. One could
interpret the results which Martin reported as due to a
sort of confusion (but on whose part -- R or Martin?)
over the fact that hist is designed to deal with
continuous values, while his sample consists of integers.
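
The effect described above can be checked without drawing anything. The following is a minimal sketch, not code from the thread: it assumes the default breaks for data like this fall on the integers 0:12 (consistent with the 12 cells reported elsewhere in the thread) and spells those breaks out explicitly so that the comparison does not depend on hist's defaults. The seed is arbitrary.

```r
set.seed(123)                                  # arbitrary seed, for reproducibility
temp <- sample(0:12, 2000, replace = TRUE)

# Breaks at the integers 0:12 give 12 cells; with the defaults
# include.lowest = TRUE and right = TRUE the first cell is [0, 1],
# so it counts the zeros and the ones together.
h_int <- hist(temp, breaks = 0:12, plot = FALSE)
h_int$counts[1] == sum(temp == 0) + sum(temp == 1)

# Breaks at -0.5, 0.5, ..., 12.5 put each integer in its own cell.
h_mid <- hist(temp, breaks = seq(-0.5, 12.5, by = 1), plot = FALSE)
counts_by_value <- as.vector(table(factor(temp, levels = 0:12)))
all(h_mid$counts == counts_by_value)
```

Both comparisons come out TRUE: the integer breaks merge 0 and 1 into the leftmost bar, while the half-integer breaks reproduce the per-value counts that table gives.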

For that particular case, one could also use table or
barplot, as has been suggested by David Scott, which
would produce a plot of similar appearance; but this is
not in the histogram family despite appearances, since
it is not primarily a quantitative plot (i.e. respecting
the numerical values and their numerical comparisons), but
more a category count. In particular, natural variants
of the above hist command such as

  hist(temp,breaks= -0.5+2*(0:7) )

(which corresponds to binning by different intervals) do
not lie so easily in the table or barplot domain.

And I don't agree with David's comment that "No, hist
is the wrong thing to use to display this data."

In so far as these data are considered to be numerical
values of which one wants a view of their distribution,
then hist is entirely appropriate, as for any other
numerical variable. The only question is how to get
this to happen appropriately.

Would David make the same comment about data sampled
from (0:5000) instead of (0:12)?

Best wishes to all,
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 11-Mar-05   Time: 10:59:55
-- XFMail --



[R] sample function

2005-03-10 Thread mirage sell
Hi everyone, I need help.
I want to have a uniform kind of distribution. When I used the sample function I
got almost twice as many zeros compared to other numbers. What's wrong with my
command?

temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
hist(temp)
Thanks in advance,
Taka,


Re: [R] sample function

2005-03-10 Thread Martin C. Martin
hist is lumping things together.
Try:
sum(temp == 0)
compare to the height of the left most bar.
Is this a bug in hist?
- Martin
mirage sell wrote:
> Hi everyone, I need help.
> I want to have a uniform kind of distribution. When I used the sample
> function I got almost twice as many zeros compared to other numbers. What's
> wrong with my command?
>
> temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
> hist(temp)
> Thanks in advance,
> Taka,


RE: [R] sample function

2005-03-10 Thread Liaw, Andy
It's not the simulated data, but how hist() handled it.  If you use
truehist() in the MASS package, you don't see the problem.  Nor would you
see it like this:

> table(temp)/length(temp)
temp
     0      1      2      3      4      5      6      7      8      9     10     11     12
0.0745 0.0745 0.0830 0.0755 0.0760 0.0750 0.0700 0.0765 0.0775 0.0805 0.0830 0.0765 0.0775

Andy

> From: mirage sell
>
> Hi everyone, I need help.
> I want to have a uniform kind of distribution. When I used the sample function I
> got almost twice as many zeros compared to other numbers. What's wrong with my
> command?
>
> temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
> hist(temp)
>
> Thanks in advance,
>
> Taka,


Re: [R] sample function

2005-03-10 Thread David Scott
On Thu, 10 Mar 2005, mirage sell wrote:
> Hi everyone, I need help.
> I want to have a uniform kind of distribution. When I used the sample function I
> got almost twice as many zeros compared to other numbers. What's wrong with my
> command?

Nothing is wrong with your sampling, it is the display in the histogram.
Try

temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
table(temp)

David Scott
_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]
Graduate Officer, Department of Statistics


Re: [R] sample function

2005-03-10 Thread David Scott
On Thu, 10 Mar 2005, Martin C. Martin wrote:
> hist is lumping things together.
> Try:
> sum(temp == 0)
> compare to the height of the left most bar.
> Is this a bug in hist?

No, hist is the wrong thing to use to display this data.
Try

temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
barplot(table(temp))

David Scott
_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]
Graduate Officer, Department of Statistics


Re: [R] sample function

2005-03-10 Thread Marc Schwartz
On Thu, 2005-03-10 at 20:54 -0600, mirage sell wrote:
> Hi everyone, I need help.
> I want to have a uniform kind of distribution. When I used the sample function I
> got almost twice as many zeros compared to other numbers. What's wrong with my
> command?
>
> temp <- sample(0:12, 2000, replace=T, prob=(rep(1/13,13)))
> hist(temp)
>
> Thanks in advance,

Hint: take note that there are only 12 cells in the plot, not 13...

However, note that the frequencies of the 13 elements are appropriate:

> table(sample(0:12, 2000, replace=T))

  0   1   2   3   4   5   6   7   8   9  10  11  12
158 156 151 163 156 158 146 154 134 158 146 147 173


Review the details of how the breaks are selected in ?hist.

BTW, you do not need to specify the 'prob' argument if you want equal
probabilities as per my example above.
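
To put a number on "appropriate", one option is a chi-squared goodness-of-fit test of the counts against equal probabilities. This is a sketch added for illustration, not something proposed in the thread; the seed is arbitrary, and factor() with explicit levels is used only to guarantee that all 13 values appear in the table.

```r
set.seed(2005)                                 # arbitrary seed
temp <- sample(0:12, 2000, replace = TRUE)     # no 'prob' needed for equal weights

# Count every value in 0:12, including any that happened not to occur.
counts <- table(factor(temp, levels = 0:12))

# For a one-dimensional table, chisq.test() defaults to testing
# against equal probabilities for all categories.
fit <- chisq.test(counts)
fit$expected[1]    # expected count per value: 2000 / 13
fit$p.value        # a small value would cast doubt on uniformity
```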

HTH,

Marc Schwartz



Re: [R] sample function

2003-06-27 Thread Prof Brian D Ripley
R is not S-PLUS, and you need Modern Applied Statistics in S (4th ed) for a
description including R.

sample in R uses a PRNG: see ?RNG in R for the details of PRNGs in R.

On Fri, 27 Jun 2003, Ramzi Feghali wrote:

> i have a question about the sample function used in R, does it work as
> a pseudo-random number generator programmed with C, like it is described
> in Modern Applied Statistics with S-Plus, 3rd edition, chapter 5 section 2?
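
A short illustration of the PRNG point, added as a sketch and not part of the original reply: because sample() draws from R's pseudo-random number generator, resetting the seed reproduces the same draws. The seed value 42 is arbitrary.

```r
RNGkind()                              # names of the generators currently in use

set.seed(42)
a <- sample(0:12, 5, replace = TRUE)

set.seed(42)                           # same seed again
b <- sample(0:12, 5, replace = TRUE)

identical(a, b)                        # same seed, same draws
```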


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595
