Re: [R] Questions about generating samples in R

2006-11-27 Thread Peter Dalgaard
Christian Schulz [EMAIL PROTECTED] writes:

 split <- sample(2, nrow(dataframe), replace=TRUE, prob=c(0.04,0.96))
 
 dataframe[split==1,]  # about 200 rows (4% on average)
 dataframe[split==2,]  # about 4800 rows
 
 regards, christian
 
  ?sample should tell you what you need to know.

It does, but the above is not how: it assigns each row independently, so
the first part has only about 200 rows. To get exactly 200 rows, use

sel <- sample(nrow(dataframe), 200)
dataframe[sel,]
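
A minimal, self-contained sketch of drawing exactly 200 rows. The toy data frame here is made up purely for illustration and only stands in for the 5000 x 8 data set; note that the first argument of sample() is the population (here the number of rows) and the second is the sample size.

```r
# Toy stand-in for the 5000 x 8 data set (hypothetical data, illustration only)
set.seed(1)
dataframe <- as.data.frame(matrix(rnorm(5000 * 8), ncol = 8))

# Draw 200 distinct row indices from 1..nrow(dataframe), then subset
sel <- sample(nrow(dataframe), 200)
smp <- dataframe[sel, ]

nrow(smp)            # 200
anyDuplicated(sel)   # 0: sample() draws without replacement by default
```

Calling set.seed() first is optional; it only makes the draw reproducible.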

  On 26/11/06, Alexander Geisler [EMAIL PROTECTED] wrote:

  Hello!
 
  I have a data set with 8 columns and about 5000 rows. I want to draw
  random samples of a given size from it, for example 200 rows.
 
  What is the easiest way to do this? Nothing special is needed, only
  the random selection of 200 rows of the data set.
 
  Thanks
  Alex
 
  --
  Alexander Geisler * Kaltenbach 151 * A-6272 Kaltenbach
  email: [EMAIL PROTECTED] | [EMAIL PROTECTED]
  phone: +43 650 / 811 61 90 | skype: al1405ex
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
  
 
 
 
 
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Questions about generating samples in R

2006-11-27 Thread Mark Na
Further to Alexander's question ... could anyone provide assistance
with random stratified sampling? Let's say we have Alex's dataframe
and we want to stratify the random selection by group membership
(which is contained in one of the eight columns).

We might want to randomly select:

1) a constant number (e.g., 5) of rows from each group, or
2) a percentage (e.g. 10%) of rows from each group resulting in groups
being represented proportionally in the sample (with respect to the
population).

I am aware of stratsrs but this function does not seem to allow the
second of the above two options.

Any ideas how to achieve this in R?

Thanks, Mark





Re: [R] Questions about generating samples in R

2006-11-27 Thread Charles C. Berry
On Mon, 27 Nov 2006, Mark Na wrote:

 Further to Alexander's question ... could anyone provide assistance
 with random stratified sampling? Let's say we have Alex's dataframe
 and we want to stratify the random selection by group membership
 (which is contained in one of the eight columns).

 We might want to randomly select:

 1) a constant number (e.g., 5) of rows from each group, or
 2) a percentage (e.g. 10%) of rows from each group resulting in groups
 being represented proportionally in the sample (with respect to the
 population).

 I am aware of stratsrs but this function does not seem to allow the
 second of the above two options.

 Any ideas how to achieve this in R?


Suppose 'grp.numbers' holds the group identities.

Define wrappers for sample():

sample.just.5 <- function(x) sample(x, size = 5)

sample.10.pct <- function(x) sample(x, size = round(0.10 * length(x)))

Then use tapply:

samples.of.5 <- tapply(seq(along=grp.numbers), grp.numbers,
                       sample.just.5)

Check this with:

table( grp.numbers[ unlist( samples.of.5 ) ] )

Again use tapply:

samples.of.10.pct <- tapply(seq(along=grp.numbers), grp.numbers,
                            sample.10.pct)

Check this with:

table( grp.numbers[ unlist( samples.of.10.pct ) ] )


There are loads of variations ...
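
One such variation, as a hedged sketch rather than a definitive recipe: it returns the sampled rows of a data frame and tolerates small groups. Two facts motivate it: sample(x, 5) stops with an error when a group has fewer than 5 rows, and sample(x, ...) treats a single number k as 1:k. The data frame, its 'grp' column, and the helper 'sample.up.to' are all hypothetical names introduced here for illustration.

```r
# Hypothetical example data: grouping column 'grp' plus one value column
set.seed(42)
dataframe <- data.frame(grp = rep(c("a", "b", "c"), times = c(50, 30, 3)),
                        x   = rnorm(83))

# Pick up to n elements of idx by position; sampling positions with
# sample.int() avoids the case where a single index k is read as 1:k
sample.up.to <- function(idx, n) {
  idx[sample.int(length(idx), min(n, length(idx)))]
}

# Row indices of the data frame, split by group
rows.by.grp <- split(seq_len(nrow(dataframe)), dataframe$grp)

# Option 1: a constant number of rows per group (fewer if the group is small)
idx5 <- unlist(lapply(rows.by.grp, sample.up.to, n = 5))

# Option 2: 10% of each group, rounded (a tiny group may contribute 0 rows)
idx10 <- unlist(lapply(rows.by.grp,
                       function(i) sample.up.to(i, round(0.10 * length(i)))))

strat5  <- dataframe[idx5, ]
strat10 <- dataframe[idx10, ]

table(strat5$grp)
table(strat10$grp)
```

Whether a small group should contribute fewer rows, zero rows, or at least one row is a design choice; replacing round() with ceiling() would be one way to guarantee at least one row per non-empty group.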




Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717



[R] Questions about generating samples in R

2006-11-26 Thread Alexander Geisler
Hello!

I have a data set with 8 columns and about 5000 rows. I want to draw
random samples of a given size from it, for example 200 rows.

What is the easiest way to do this? Nothing special is needed, only
the random selection of 200 rows of the data set.

Thanks
Alex

-- 
Alexander Geisler * Kaltenbach 151 * A-6272 Kaltenbach
email: [EMAIL PROTECTED] | [EMAIL PROTECTED]
phone: +43 650 / 811 61 90 | skype: al1405ex



Re: [R] Questions about generating samples in R

2006-11-26 Thread David Barron
?sample should tell you what you need to know.




-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



Re: [R] Questions about generating samples in R

2006-11-26 Thread Christian Schulz

split <- sample(2, nrow(dataframe), replace=TRUE, prob=c(0.04,0.96))

dataframe[split==1,]  # about 200 rows (4% on average)
dataframe[split==2,]  # about 4800 rows
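
A small simulation, offered only as a sketch, of how many rows end up in part 1 when each of 5000 rows is assigned independently with probability 0.04. The row count 'n' and the number of repetitions are assumptions chosen for illustration.

```r
set.seed(123)
n <- 5000   # assumed number of rows, as in the question

# Size of part 1 in each of 1000 repetitions of the probabilistic split
sizes <- replicate(1000,
                   sum(sample(2, n, replace = TRUE, prob = c(0.04, 0.96)) == 1))

range(sizes)   # smallest and largest part-1 size observed
mean(sizes)    # close to 0.04 * n
```

The part-1 size follows a binomial distribution with mean 0.04 * n, so it is 200 only on average; a fixed size requires sampling row indices without replacement instead.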

regards, christian

 ?sample should tell you what you need to know.
