[R] cumsum function with data frame
Dear list,

I have a problem with the cumsum function. I have a data frame like the following one:

variable  Year  value
EC01      2005      5
EC01      2006     10
AAO1      2005      2
AAO1      2006      4

What I would like to obtain is:

variable  Year  value  cumsum
EC01      2005      5       5
EC01      2006     10      15
AAO1      2005      2       2
AAO1      2006      4       6

If I use the by function or the aggregate function, the result is a list or something else; what I want is a data frame as shown above. Does anyone know how to get it?

THANKS A LOT

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumsum function with data frame
See ?split and ?unsplit.

Data <- read.table(textConnection("variable Year value
EC01 2005 5
EC01 2006 10
AAO1 2005 2
AAO1 2006 4"), header = TRUE)

# Split into one data frame per level of 'variable'
Datalist <- split(Data, Data$variable)
resultlist <- lapply(Datalist, function(x) {
  x$cumul <- cumsum(x$value)  # running total within this group only
  return(x)
})
# Reassemble the pieces in the original row order
result <- unsplit(resultlist, Data$variable)
result

  variable Year value cumul
1     EC01 2005     5     5
2     EC01 2006    10    15
3     AAO1 2005     2     2
4     AAO1 2006     4     6

On a side note: I've used this construction now for a number of problems. Some could be better solved using more specific functions (e.g. ave() for adding a column with means). I'm not sure, however, that this is the best approach to applying a function to subsets of a data frame and adding the result of that function as an extra variable. Anybody care to elaborate on how the R masters had it in mind?

Cheers
Joris

On Thu, Jun 3, 2010 at 5:58 PM, n.via...@libero.it wrote:
> Dear list, I have a problem with the cumsum function. [...]

--
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
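The side note above mentions ave(). A minimal sketch of how it could be applied to this problem follows, assuming the data frame is rebuilt with data.frame() instead of read.table() (an assumption made only to keep the example self-contained). ave() applies FUN within each level of the grouping variable and returns a vector in the original row order, so it can be assigned directly as a new column.

```r
# Same data as in the thread, built directly for a self-contained example
Data <- data.frame(
  variable = c("EC01", "EC01", "AAO1", "AAO1"),
  Year     = c(2005, 2006, 2005, 2006),
  value    = c(5, 10, 2, 4)
)

# Cumulative sum of 'value' within each level of 'variable';
# FUN must be passed by name because ave() takes grouping factors via '...'
Data$cumul <- ave(Data$value, Data$variable, FUN = cumsum)
Data
```

Because the result keeps the input row order, no split/unsplit step is needed for this particular case.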
Re: [R] cumsum function with data frame
You can also use ddply from the plyr package:

library(plyr)
Data <- read.table(textConnection("variable Year value
EC01 2005 5
EC01 2006 10
AAO1 2005 2
AAO1 2006 4"), header = TRUE)
Data
ddply(Data, .(variable), summarise, Year = Year, value = value, CUMSUM = cumsum(value))

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

----- Original Message -----
From: Joris Meys jorism...@gmail.com
To: n.via...@libero.it
Cc: r-help@r-project.org
Sent: Thu, June 3, 2010 9:26:17 AM
Subject: Re: [R] cumsum function with data frame

> See ?split and ?unsplit. [...]
Re: [R] cumsum function with data frame
Better yet, it is shorter using transform instead of summarise:

library(plyr)
Data <- read.table(textConnection("variable Year value
EC01 2005 5
EC01 2006 10
AAO1 2005 2
AAO1 2006 4"), header = TRUE)
ddply(Data, .(variable), transform, CUMSUM = cumsum(value))

----- Original Message -----
From: Felipe Carrillo mazatlanmex...@yahoo.com
To: Joris Meys jorism...@gmail.com; n.via...@libero.it
Cc: r-help@r-project.org
Sent: Thu, June 3, 2010 11:28:58 AM
Subject: Re: [R] cumsum function with data frame

> You can also use ddply from the plyr package: [...]
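For readers without plyr installed, the same split-transform-combine idea can be sketched in base R. This is an illustration, not an exact replacement for ddply: the names `pieces` and `out` are hypothetical, and the data frame is rebuilt with data.frame() to keep the example self-contained.

```r
Data <- data.frame(
  variable = c("EC01", "EC01", "AAO1", "AAO1"),
  Year     = c(2005, 2006, 2005, 2006),
  value    = c(5, 10, 2, 4)
)

# One data frame per level of 'variable', each with its own running total
pieces <- lapply(split(Data, Data$variable),
                 function(d) transform(d, CUMSUM = cumsum(value)))

# Stack the pieces back together; rows come out grouped by sorted level
out <- do.call(rbind, pieces)
rownames(out) <- NULL  # drop the "level.row" names that rbind generates
out
```

Note that the rows are returned grouped by level (AAO1 before EC01 here), not in the original row order, so this suits cases where the final ordering does not matter.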
Re: [R] cumsum function with data frame
Or better yet, you can use transform only (in base):

transform(Data, CUMSUM = cumsum(value))

HTH,
Jorge

On Thu, Jun 3, 2010 at 3:30 PM, Felipe Carrillo wrote:
> Better yet, it is shorter using transform instead of summarise: [...]
Re: [R] cumsum function with data frame
But then you don't apply cumsum within each factor level. Hence the ddply.

Cheers
Joris

On Thu, Jun 3, 2010 at 9:35 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
> Or better yet, you can use transform only (in base):
> transform(Data, CUMSUM = cumsum(value))
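The objection can be made concrete with a short comparison. This is a sketch assuming the thread's data rebuilt with data.frame(); the names `ungrouped` and `grouped` are hypothetical, and ave() stands in here for any per-group method.

```r
Data <- data.frame(
  variable = c("EC01", "EC01", "AAO1", "AAO1"),
  Year     = c(2005, 2006, 2005, 2006),
  value    = c(5, 10, 2, 4)
)

# Plain transform(): one running total over all rows, ignoring 'variable'
ungrouped <- transform(Data, CUMSUM = cumsum(value))$CUMSUM

# Per-group running total: restarts at the first row of each level
grouped <- ave(Data$value, Data$variable, FUN = cumsum)

ungrouped  # 5 15 17 21
grouped    # 5 15  2  6
```

The ungrouped total carries the EC01 sum of 15 into the AAO1 rows, which is not what the original question asked for.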
Re: [R] cumsum function with data frame
Thanks Joris, you are absolutely right. Apologies to all for that. Maybe this time I can get it right :-)

transform(Data, CUMSUM = do.call(c, with(Data, tapply(value, rev(variable), cumsum))))

Regards,
Jorge

On Thu, Jun 3, 2010 at 8:04 PM, Joris Meys wrote:
> But then you don't apply cumsum within each factor level. Hence the ddply.
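One caveat about the tapply() construction above: it appears to depend on how the rows happen to be laid out, because tapply() returns its groups in sorted level order and rev() is used to line that order up with the data. The sketch below checks this on a hypothetical reordered copy of the data (`Data2`, with the two groups interleaved) and compares it with an order-preserving alternative built from split() and unsplit(). The names `Data2`, `fragile` and `robust` are assumptions introduced for illustration.

```r
# Same values as in the thread, but with the two groups interleaved
Data2 <- data.frame(
  variable = c("EC01", "AAO1", "EC01", "AAO1"),
  Year     = c(2005, 2005, 2006, 2006),
  value    = c(5, 2, 10, 4)
)

# The tapply()/rev() construction from the message above
fragile <- unname(do.call(c, with(Data2, tapply(value, rev(variable), cumsum))))

# Split the values by group, take cumsum per group, then put each
# result back at the row positions it came from
robust <- unsplit(lapply(split(Data2$value, Data2$variable), cumsum),
                  Data2$variable)

fragile  # 5 15 2 6  (does not match the row order of Data2)
robust   # 5 2 15 6  (per-group running totals in the original row order)
```

On the data as posted in the thread both approaches give the same column, so the difference only shows up when rows of the same group are not contiguous.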