[R] Group several variables and apply a function to the group
Dear R-experts,

I am struggling with the following problem, and I am looking for advice from more experienced R users. I have a data frame with 2 identifying variables (comn and mi) and an output variable (x). comn is a variable for a company and mi is a variable for a month.

    comn <- c("abc", "abc", "abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz")
    mi <- c(1, 1, 1, 2, 2, 2, 1, 1, 3, 3)
    x <- c(-0.0031, 0.0009, -0.007, 0.1929, 0.0087, 0.099, -0.089, 0.005, -0.0078, 0.67)
    df <- data.frame(comn = comn, mi = mi, x = x)

For each company, within a particular month, I would like to compute the standard deviation of x. For example, for abc, I would like to compute the sd of x for month 1 (when mi = 1) and for month 2 (when mi = 2).

In other languages (Stata, for instance), I would create a grouping variable (grouping on comn and mi) and then apply the sd function to each group. However, I cannot find an elegant way to do the same in R. I was thinking about the following: I could subset my data frame by mi and create one file per month, then loop over the files and, in each file, use a by operator for each comn. I am sure it would work, but it feels like killing an ant with a tank.

Does anyone know a more straightforward way to implement this kind of operation?

Thanks a lot,

Best,
Aurelien

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
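For readers coming from Stata, the "create a grouping variable, then apply sd to each group" idea in the question can be written almost literally in base R. The following is only a minimal sketch: the column name grp and the object name sd_by_group are made up for illustration, and the quotes around the company names are assumed (the archived post shows them without quotes).

```r
# Data from the question (quotes around the company names are assumed)
comn <- c("abc", "abc", "abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz")
mi <- c(1, 1, 1, 2, 2, 2, 1, 1, 3, 3)
x <- c(-0.0031, 0.0009, -0.007, 0.1929, 0.0087, 0.099, -0.089, 0.005, -0.0078, 0.67)
df <- data.frame(comn = comn, mi = mi, x = x)

# One grouping factor per (company, month) pair. drop = TRUE discards
# combinations that never occur in the data (e.g. abc in month 3).
# "grp" is a hypothetical column name used only for this sketch.
df$grp <- interaction(df$comn, df$mi, drop = TRUE)

# Apply sd() to x within each level of the grouping factor
sd_by_group <- tapply(df$x, df$grp, sd)
print(sd_by_group)
```

The result has one entry per company/month pair that actually occurs, so no subsetting into separate files or explicit loop is needed.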
Re: [R] Group several variables and apply a function to the group
Like this?

    library(plyr)
    ddply(df, .(comn, mi), summarise, stDEV = sd(x))

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx

From: Aurélien PHILIPPOT aurelien.philip...@gmail.com
To: R-help@r-project.org
Sent: Sunday, December 4, 2011 12:32 PM
Subject: [R] Group several variables and apply a function to the group
Re: [R] Group several variables and apply a function to the group
Exactly like that! Thanks a lot.

Aurelien

2011/12/4 Felipe Carrillo mazatlanmex...@yahoo.com

> Like this?
>
> library(plyr)
> ddply(df, .(comn, mi), summarise, stDEV = sd(x))
Re: [R] Group several variables and apply a function to the group
One way would be to use the aggregate function.

    # Your data
    # Note: I have removed the quotes from the output variable x
    comn <- c("abc", "abc", "abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz")
    mi <- c(1, 1, 1, 2, 2, 2, 1, 1, 3, 3)
    x <- c(-0.0031, 0.0009, -0.007, 0.1929, 0.0087, 0.099, -0.089, 0.005, -0.0078, 0.67)
    df <- data.frame(comn = comn, mi = mi, x = x)

    # Aggregate function
    aggregate(df$x, by = list(df$comn, df$mi), FUN = sd)

HTH

Pete

--
View this message in context: http://r.789695.n4.nabble.com/Group-several-variables-and-apply-a-function-to-the-group-tp4158017p4158090.html
Sent from the R help mailing list archive at Nabble.com.
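A small variation on the aggregate() call above: the formula interface of aggregate() keeps the original column names in the result instead of Group.1, Group.2 and x. This is only a sketch of the same computation; the names res and sd_x are chosen here for illustration, and the quotes around the company names are assumed.

```r
# Same data as in the thread (quotes around the company names are assumed)
df <- data.frame(
  comn = c("abc", "abc", "abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz"),
  mi   = c(1, 1, 1, 2, 2, 2, 1, 1, 3, 3),
  x    = c(-0.0031, 0.0009, -0.007, 0.1929, 0.0087, 0.099, -0.089, 0.005, -0.0078, 0.67)
)

# Formula interface: "x grouped by comn and mi", with sd applied to each group
res <- aggregate(x ~ comn + mi, data = df, FUN = sd)

# Rename the result column so it is clear that it holds standard deviations
# ("sd_x" is a hypothetical name used only for this sketch)
names(res)[names(res) == "x"] <- "sd_x"
print(res)
```

Only the company/month combinations that occur in the data appear as rows, so the result can be merged back onto df by comn and mi if needed.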
Re: [R] Group several variables and apply a function to the group
?aggregate should do it

    aggregate(df$x, list(df$comn, df$mi), sd)

There are other ways of course. Using the reshape2 package:

    library(reshape2)
    x1 <- melt(df, id = c("comn", "mi"))
    dcast(x1, comn + mi ~ variable, sd)
Re: [R] Group several variables and apply a function to the group
... with() is useful here: e.g., in base R, simply use tapply() or ave() with with():

    with(df, ave(x, comn, mi, FUN = sd))

-- Bert

On Sun, Dec 4, 2011 at 1:07 PM, John Kane jrkrid...@yahoo.ca wrote:

> ?aggregate should do it
>
> aggregate(df$x, list(df$comn, df$mi), sd)
>
> There are other ways of course. Using the reshape2 package:
>
> library(reshape2)
> x1 <- melt(df, id = c("comn", "mi"))
> dcast(x1, comn + mi ~ variable, sd)

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
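One point worth spelling out about the two base R functions mentioned above: tapply() returns one value per group, whereas ave() returns one value per row of the data (each row gets the sd of its own group). A minimal sketch, assuming the data from the original post with quoted company names; group_sd and row_sd are names made up for this illustration.

```r
# Data from the original post (quotes around the company names are assumed)
df <- data.frame(
  comn = c("abc", "abc", "abc", "abc", "abc", "abc", "xyz", "xyz", "xyz", "xyz"),
  mi   = c(1, 1, 1, 2, 2, 2, 1, 1, 3, 3),
  x    = c(-0.0031, 0.0009, -0.007, 0.1929, 0.0087, 0.099, -0.089, 0.005, -0.0078, 0.67)
)

# tapply(): one sd per group, laid out as a company-by-month table.
# Combinations with no rows (abc in month 3, xyz in month 2) are NA.
group_sd <- with(df, tapply(x, list(comn, mi), sd))
print(group_sd)

# ave(): one value per row, so the result has the same length as x
# and can be stored as a new column next to the original data.
row_sd <- with(df, ave(x, comn, mi, FUN = sd))
print(cbind(df, sd_x = row_sd))
```

Which of the two is preferable depends on whether a summary table or a per-row column is wanted; for the summary asked for in the original post, tapply() or aggregate() is the closer match.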