Re: [R] Subset and sumerize
Ashta,

## I may have misunderstood your question, and if so I apologize.
## I had to remove the extra line after "45", before the
## ",sep=", to use your code.
## You could have used dput(dat) to send a more reliable (robust) version.

dat <- structure(list(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  x1 = structure(c(1L, 5L, 5L, 5L, 2L, 5L, 3L, 4L, 5L, 5L),
                 .Label = c("a", "d", "g", "h", "x"), class = "factor"),
  x2 = structure(c(1L, 6L, 1L, 4L, 6L, 6L, 5L, 2L, 6L, 3L),
                 .Label = c("b", "e", "g", "k", "t", "z"), class = "factor"),
  y = c(15L, 21L, 16L, 25L, 31L, 28L, 41L, 32L, 38L, 45L)),
  .Names = c("ID", "x1", "x2", "y"),
  class = "data.frame", row.names = c(NA, -10L))

## In your proposed solution "newdat" is never defined, yet you are using it as if it were.
## It is my understanding that your goal is to define newdat as a
## subset of dat where x1 == "x" and x2 == "z".
## This can be done with one line.

newdat <- dat[dat$x1 == "x" & dat$x2 == "z", ]
newdat

> On Oct 14, 2016, at 1:26 PM, Ashta wrote:
>
> For each unique ID, I want to select a data when x1= "x" and x2="z"
> Here is the selected data (newdat)
> ID,x1,x2,y
> 1,x,z,21
> 2,x,z,28
> 3,x,z,38

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
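
The loop in the original post could be repaired along the following lines. This is only a sketch, assuming the goal is one record count and one summary of y per ID. The names ids, n_records and summaries are illustrative and do not come from the thread, and stringsAsFactors = TRUE is passed only to reproduce the factor columns seen in the dput() output.

```r
# Sample data from the original post. stringsAsFactors = TRUE is an
# assumption made here to mirror the factor columns in the dput() output.
dat <- read.table(text = "ID,x1,x2,y
1,a,b,15
1,x,z,21
1,x,b,16
1,x,k,25
2,d,z,31
2,x,z,28
2,g,t,41
3,h,e,32
3,x,z,38
3,x,g,45", sep = ",", header = TRUE, stringsAsFactors = TRUE)

# Loop over the IDs that actually occur instead of a hard-coded 1:28.
ids <- unique(dat$ID)

# Preallocate the containers that the loop fills (hypothetical names).
n_records <- integer(length(ids))
summaries <- vector("list", length(ids))

for (i in seq_along(ids)) {
  # Select the rows for this ID directly from dat.
  rows <- dat[dat$ID == ids[i] & dat$x1 == "x" & dat$x2 == "z", ]
  n_records[i] <- nrow(rows)
  # A summary is a multi-element object, so store it in a list with [[ ]].
  summaries[[i]] <- summary(rows$y)
}
names(summaries) <- ids

print(n_records)
print(summaries)
```

Two things differ from the original attempt: the result containers exist before the loop writes to them, and the per-ID summary is stored with [[i]] in a list, because a single vector element cannot hold a six-number summary.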
Re: [R] Subset and sumerize
For the data you provide, it's simply:

summary(subset(dat, x1 == "x" & x2 == "z")$y)

Note that x1 and x2 are factors in your example. We also don't know
what you want to do if there is more than one row with that
combination per ID, or if there are ID values with no matching rows.

Sarah

On Fri, Oct 14, 2016 at 2:26 PM, Ashta wrote:
> For each unique ID, I want to select a data when x1= "x" and x2="z"
> Here is the selected data (newdat)
> ID,x1,x2,y
> 1,x,z,21
> 2,x,z,28
> 3,x,z,38
>
> Then I want summarize Y values and out put as follows
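
One possible way to get a per-ID table while keeping both of those cases visible is sketched below. It assumes that an ID without a matching row should still appear (as a row of NA/NaN values) and that several matching rows for one ID should be summarized together. The names sel, id_f, n_per_id and tab are illustrative, not from the thread.

```r
# Sample data from the original post. stringsAsFactors = TRUE is an
# assumption made here so that x1 and x2 are factors, as noted above.
dat <- read.table(text = "ID,x1,x2,y
1,a,b,15
1,x,z,21
1,x,b,16
1,x,k,25
2,d,z,31
2,x,z,28
2,g,t,41
3,h,e,32
3,x,z,38
3,x,g,45", sep = ",", header = TRUE, stringsAsFactors = TRUE)

# Keep only the rows of interest.
sel <- subset(dat, x1 == "x" & x2 == "z")

# Use every ID in dat as a factor level, so IDs without a match are not
# silently dropped by table() or split().
id_f <- factor(sel$ID, levels = sort(unique(dat$ID)))

# Number of matching rows per ID (zero for IDs with no match).
n_per_id <- table(id_f)

# One row per ID, one column per statistic returned by summary().
tab <- t(sapply(split(sel$y, id_f), summary))

print(n_per_id)
print(tab)
```

In the sample data every ID has exactly one matching row, so all six statistics in a row equal that row's y value.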
[R] Subset and sumerize
Hi all,

I am trying to summarize a big data set by selecting rows
conditionally, and tried to do it in a loop.

Here is a sample of my data and my attempt:

dat<-read.table(text=" ID,x1,x2,y
1,a,b,15
1,x,z,21
1,x,b,16
1,x,k,25
2,d,z,31
2,x,z,28
2,g,t,41
3,h,e,32
3,x,z,38
3,x,g,45
",sep=",",header=TRUE)

For each unique ID, I want to select the data where x1 = "x" and x2 = "z".
Here is the selected data (newdat):

ID,x1,x2,y
1,x,z,21
2,x,z,28
3,x,z,38

Then I want to summarize the y values and output them as follows:

Summarize
summary(newdat[i])
##
ID  Min.  1st Qu.  Median  Mean  3rd Qu.  Max.
1
2
3
.
.
.
28

Here is my attempt, which did not work:

trt=c(1:28)
for(i in 1:length (trt))
{
  day[i]= newdat[which(newdat$ID== trt[i] & newdat$x1 =="x" &
  newdat$x2 =="z"),]
  NR[i]=dim(day[i])[1]
  print(paste("Number of Records :", NR[i]))
  sm[i]=summary(day[i])
}

Thank you in advance