Re: [R] Questions about generating samples in R
Christian Schulz [EMAIL PROTECTED] writes:

> split <- sample(2, nrow(dataframe), replace=TRUE, prob=c(0.04,0.96))
> dataframe[split==1,] # about 200 rows
> dataframe[split==2,] # about 4800 rows
>
> regards, christian
>
>> ?sample should tell you what you need to know.

It does, but the above is not how. To get exactly 200 samples, use

    sel <- sample(nrow(dataframe), 200)
    dataframe[sel,]

>> On 26/11/06, Alexander Geisler [EMAIL PROTECTED] wrote:
>>> Hello!
>>>
>>> I have a data set with 8 columns and about 5000 rows. I want to
>>> generate samples of this data set of a given size, for example 200.
>>>
>>> What is the easiest way to do this? Nothing special is needed, only
>>> a random selection of 200 rows of the data set.
>>>
>>> Thanks
>>> Alex

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
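For readers who want to try the exact-size draw, here is a minimal sketch. The data frame below is a made-up stand-in for the poster's data (5000 rows, 8 columns), and the names `sampled` and `remainder` are hypothetical. It relies only on `sample(n, size)` drawing `size` distinct values from `1:n` when `replace` is left at its default.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical stand-in for the poster's data: 5000 rows, 8 columns
dataframe <- as.data.frame(matrix(rnorm(5000 * 8), nrow = 5000))

# Draw exactly 200 distinct row indices (sampling without replacement)
sel <- sample(nrow(dataframe), 200)

sampled   <- dataframe[sel, ]    # the 200 selected rows
remainder <- dataframe[-sel, ]   # the other 4800 rows, if needed

nrow(sampled)    # 200
nrow(remainder)  # 4800
```

Negative indexing with `-sel` gives the complementary rows, which is handy when the sample is meant as a hold-out set.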
Re: [R] Questions about generating samples in R
Further to Alexander's question ... could anyone provide assistance with
random stratified sampling?

Let's say we have Alex's dataframe and we want to stratify the random
selection by group membership (which is contained in one of the eight
columns). We might want to randomly select:

1) a constant number (e.g., 5) of rows from each group, or
2) a percentage (e.g., 10%) of rows from each group, so that groups are
   represented in the sample in proportion to their size in the
   population.

I am aware of stratsrs, but this function does not seem to allow the
second of the above two options.

Any ideas how to achieve this in R?

Thanks, Mark

On 11/26/06, Alexander Geisler [EMAIL PROTECTED] wrote:
> Hello!
>
> I have a data set with 8 columns and about 5000 rows. I want to
> generate samples of this data set of a given size, for example 200.
>
> What is the easiest way to do this? Nothing special is needed, only
> a random selection of 200 rows of the data set.
>
> Thanks
> Alex
Re: [R] Questions about generating samples in R
On Mon, 27 Nov 2006, Mark Na wrote:

> Further to Alexander's question ... could anyone provide assistance
> with random stratified sampling?
>
> Let's say we have Alex's dataframe and we want to stratify the random
> selection by group membership (which is contained in one of the eight
> columns). We might want to randomly select:
>
> 1) a constant number (e.g., 5) of rows from each group, or
> 2) a percentage (e.g., 10%) of rows from each group, so that groups
>    are represented in the sample in proportion to their size in the
>    population.
>
> I am aware of stratsrs, but this function does not seem to allow the
> second of the above two options.
>
> Any ideas how to achieve this in R?

Suppose 'grp.numbers' holds the group identities.

Define wrappers for sample():

    sample.just.5 <- function(x) sample(x, size = 5)
    sample.10.pct <- function(x) sample(x, size = round(0.10 * length(x)))

Then use tapply:

    samples.of.5 <- tapply(seq(along = grp.numbers), grp.numbers, sample.just.5)

Check this with:

    table(grp.numbers[unlist(samples.of.5)])

Again use tapply:

    samples.of.10.pct <- tapply(seq(along = grp.numbers), grp.numbers, sample.10.pct)

Check this with:

    table(grp.numbers[unlist(samples.of.10.pct)])

There are loads of variations ...

> Thanks, Mark

Charles C. Berry                        (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]              UC San Diego
http://biostat.ucsd.edu/~cberry/        La Jolla, San Diego 92093-0717
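The recipe above returns row indices per group; a possible way to carry it through to the data frame itself is sketched below. This is a variation, not the code from the post: it uses split() and lapply() instead of tapply(), and the data frame, the column name `grp`, and the helpers `sample.n` and `sample.pct` are all hypothetical. The helpers index through sample.int() so that a group with a single row is not misread by sample() as the range 1:x, and min() guards against groups smaller than the requested size.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical data: 8 columns, one of which ("grp") holds group membership
dataframe <- data.frame(matrix(rnorm(5000 * 7), nrow = 5000),
                        grp = sample(c("a", "b", "c"), 5000, replace = TRUE,
                                     prob = c(0.5, 0.3, 0.2)))

# Pick n row indices (or a fraction p of them) from one group's indices
sample.n   <- function(idx, n) idx[sample.int(length(idx), min(n, length(idx)))]
sample.pct <- function(idx, p) idx[sample.int(length(idx), round(p * length(idx)))]

# List of row indices, one element per group
rows.by.grp <- split(seq_len(nrow(dataframe)), dataframe$grp)

fixed.sample <- dataframe[unlist(lapply(rows.by.grp, sample.n, n = 5)), ]
prop.sample  <- dataframe[unlist(lapply(rows.by.grp, sample.pct, p = 0.10)), ]

table(fixed.sample$grp)  # 5 rows per group
table(prop.sample$grp)   # 10% of each group's rows, rounded
```

Because the sampling is done on indices, the same index vectors can be reused to subset any other object aligned with the rows of the data frame.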
[R] Questions about generating samples in R
Hello!

I have a data set with 8 columns and about 5000 rows. I want to
generate samples of this data set of a given size, for example 200.

What is the easiest way to do this? Nothing special is needed, only
a random selection of 200 rows of the data set.

Thanks
Alex

--
Alexander Geisler * Kaltenbach 151 * A-6272 Kaltenbach
email: [EMAIL PROTECTED] | [EMAIL PROTECTED]
phone: +43 650 / 811 61 90 | skype: al1405ex
Re: [R] Questions about generating samples in R
?sample should tell you what you need to know.

On 26/11/06, Alexander Geisler [EMAIL PROTECTED] wrote:
> Hello!
>
> I have a data set with 8 columns and about 5000 rows. I want to
> generate samples of this data set of a given size, for example 200.
>
> What is the easiest way to do this? Nothing special is needed, only
> a random selection of 200 rows of the data set.
>
> Thanks
> Alex

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] Questions about generating samples in R
    split <- sample(2, nrow(dataframe), replace=TRUE, prob=c(0.04,0.96))
    dataframe[split==1,] # about 200 rows
    dataframe[split==2,] # about 4800 rows

regards,
christian

> ?sample should tell you what you need to know.
>
> On 26/11/06, Alexander Geisler [EMAIL PROTECTED] wrote:
>> Hello!
>>
>> I have a data set with 8 columns and about 5000 rows. I want to
>> generate samples of this data set of a given size, for example 200.
>>
>> What is the easiest way to do this? Nothing special is needed, only
>> a random selection of 200 rows of the data set.
>>
>> Thanks
>> Alex
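A short sketch of why the probability-based split only gives about 200 rows, plus one possible way to get an exact 200/4800 split. The variable names `split.random` and `split.exact` are hypothetical, and `n` simply stands in for nrow(dataframe). The exact variant shuffles a label vector that already contains the right number of each label; it is an alternative to the code in the post, not a correction taken from the thread.

```r
set.seed(1)  # only to make the sketch reproducible
n <- 5000    # number of rows, as in the original question

# Independent draws: each row lands in group 1 with probability 0.04,
# so the size of group 1 is Binomial(5000, 0.04), not a fixed 200
split.random <- sample(2, n, replace = TRUE, prob = c(0.04, 0.96))
table(split.random)

# Exact split: shuffle a label vector that contains exactly 200 ones
split.exact <- sample(rep(1:2, times = c(200, n - 200)))
table(split.exact)   # always 200 and 4800
```

Either vector can then be used as in the post, for example dataframe[split.exact == 1, ].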