Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Rich Shepard wrote: Nope. Can't. Dan et al.: I apologize for the certainty; anything can happen to data as un-intended consequences of processing. I will write that there _should_ be no duplicates. What I'm going to do is re-read the data frame from the source text file that was written out from the postgres table, then subset only a single stream's worth of data as a test set. I'll obfuscate the identification (although the data have been submitted to the regulator over the decades) and make the set available to use if this new approach still has issues with dcast(). May not be until Monday that I return with results on this effort as I'm going to be carfully checking and documenting each step and its results. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt() [RESOLVED]
On Fri, 28 Oct 2011, Rich Shepard wrote: May not be until Monday that I return with results on this effort as I'm going to be carfully checking and documenting each step and its results. Dan, et al.: Got it working properly now. Took a different approach after re-reading the source data into an R data frame. I applied the subset() function to extract all rows for a single stream (all commands are on a single line in emacs but wrap when copied into alpine): burns - subset(chemdata, stream == 'BurnsCrk', select = c(site, sampdate, param, quant), drop = T) Then I applied the reshape2 melt() function: burns.melt - melt(burns, id.vars = c('site', 'sampdate', 'param')) which produced this data frame: head(burns.melt) site sampdate param variable value 1 BC-0.5 1996-06-02OHquant NA 2 BC-0.5 1996-06-02 SO4quant 194.00 3 BC-0.5 1996-06-02 TDSquant 530.00 4 BC-0.5 1996-06-02 TSSquant NA 5 BC-0.5 1996-04-19Asquant 0.01 6 BC-1.5 1996-09-19Asquant NA Then the reshape2 dcast() function produced the desired results: burns.cast - dcast(burns.melt, site + sampdate ~ param) head(burns.cast) site sampdate Acid Ag Al Alk-HO Alk-Tot As Ba Be Bo CO3Ca Cd 1 BC-0.5 1996-04-19 NA NA 0.07 NA 162 0.01 0.18 NA NA NA 76.56 NA 2 BC-0.5 1996-05-21 NA NA NA NA 152 NA NA NA NA NANA NA 3 BC-0.5 1996-06-02 NA NA NA NA 212 NA NA NA NA NANA NA 4 BC-1 1988-06-26 NA NA NA NA NA 0.00 NA NA NA NANA NA 5 BC-1 1988-07-30 NA NA NA NA NA 0.00 NA NA NA NANA NA 6 BC-1 1989-05-15 NA NA NA NA NA 0.00 NA NA NA NANA NA et cetera. Thank you all for your patient help. Now I should be able to produce scatter plots and run linear regressions on selected parameter pairs by site and date. Carpe weekend, all, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Syntax Check: rshape2 melt()
This is my first excursion into using reshape2 and I want to ensure that the melt() function call is syntactically correct. The unmodifed data frame is organized this way: head(tds.anal) site sampdate param quant 1UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7UDS-O 2007-10-04Mg 1620 9UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21Ca79 20 JCM-10B 2007-06-21Cl 114 What I want looks like this: sitesampdate TDS Cond Mg Ca Cl Na SO4 UDS-O 2006-12-06 10800 NA 1620 NA NA NA 7580 with the actual data for each param, of course. I've read the reshape.pdf, reshape2.pdf, the ?melt help page, and the ?melt.data.frame help page. I'm still unclear on the differences among measure.vars, variable.name, and value.name. After several attempts I have what may be what the melted tds.anal should look like: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param'), \ measure.vars = 'quant', value.name = 'quant', na.rm = F) head(m.tds.anal) site sampdate param variable quant 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 Is the melt() function call correct? Should the melted result look like the unmelted (long form in Paul Dalgaard's book) data with the additional 'variable' column containing 'quant' for each row? Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 8:26 AM To: r-help@r-project.org Subject: [R] Syntax Check: rshape2 melt() This is my first excursion into using reshape2 and I want to ensure that the melt() function call is syntactically correct. The unmodifed data frame is organized this way: head(tds.anal) site sampdate param quant 1UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7UDS-O 2007-10-04Mg 1620 9UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21Ca79 20 JCM-10B 2007-06-21Cl 114 What I want looks like this: sitesampdate TDS Cond Mg Ca Cl Na SO4 UDS-O 2006-12-06 10800 NA 1620 NA NA NA 7580 with the actual data for each param, of course. I've read the reshape.pdf, reshape2.pdf, the ?melt help page, and the ?melt.data.frame help page. I'm still unclear on the differences among measure.vars, variable.name, and value.name. After several attempts I have what may be what the melted tds.anal should look like: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param'), \ measure.vars = 'quant', value.name = 'quant', na.rm = F) head(m.tds.anal) site sampdate param variable quant 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 Is the melt() function call correct? Should the melted result look like the unmelted (long form in Paul Dalgaard's book) data with the additional 'variable' column containing 'quant' for each row? Rich Rich, What I think you want is as simple as test.melted - melt(test) wanted - cast(test.melted, site + sampdate ~ param) Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: What I think you want is as simple as test.melted - melt(test) Dan, I see the difference this syntax makes; sampdate is the variable associated with the value (quant) for a specific ID pair of site and param. wanted - cast(test.melted, site + sampdate ~ param) This, unfortunately, doesn't work: c.tds.anal - cast(m.tds.anal, site + sampdate ~ param) Error: could not find function cast ?cast has an example using acast() so I tried that: c.tds.anal - acast(m.tds.anal, site + sampdate ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Changing 'sampdate' to 'variable' produces results, but not what's needed: c.tds.anal - acast(m.tds.anal, site + variable ~ param) Aggregation function missing: defaulting to length head(c.tds.anal) Ca Cl Cond Mg Na SO4 TDS BC-0.5_sampdate 1 13 1 0 3 3 BC-0.5_quant 1 13 1 0 3 3 BC-1_sampdate8 8 11 8 7 10 10 BC-1_quant 8 8 11 8 7 10 10 BC-1.5_sampdate 3 46 3 0 6 6 BC-1.5_quant 3 46 3 0 6 6 Now that I better understand melt(), I'll read ?cast. Thanks very much, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 10:22 AM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: What I think you want is as simple as test.melted - melt(test) Dan, I see the difference this syntax makes; sampdate is the variable associated with the value (quant) for a specific ID pair of site and param. wanted - cast(test.melted, site + sampdate ~ param) This, unfortunately, doesn't work: c.tds.anal - cast(m.tds.anal, site + sampdate ~ param) Error: could not find function cast ?cast has an example using acast() so I tried that: c.tds.anal - acast(m.tds.anal, site + sampdate ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Changing 'sampdate' to 'variable' produces results, but not what's needed: c.tds.anal - acast(m.tds.anal, site + variable ~ param) Aggregation function missing: defaulting to length head(c.tds.anal) Ca Cl Cond Mg Na SO4 TDS BC-0.5_sampdate 1 13 1 0 3 3 BC-0.5_quant 1 13 1 0 3 3 BC-1_sampdate8 8 11 8 7 10 10 BC-1_quant 8 8 11 8 7 10 10 BC-1.5_sampdate 3 46 3 0 6 6 BC-1.5_quant 3 46 3 0 6 6 Now that I better understand melt(), I'll read ?cast. Thanks very much, Rich Rich, I should apologize. I saw rshape2 and just glossed over that and read rshape. So I was using the rshape package rather than rshape2. I don't know the relationship between those two packages and/or how they differ. I am sure that there are others that can help you out here. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Nordlund, Dan (DSHS/RDA) Sent: Thursday, October 27, 2011 10:47 AM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 10:22 AM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: What I think you want is as simple as test.melted - melt(test) Dan, I see the difference this syntax makes; sampdate is the variable associated with the value (quant) for a specific ID pair of site and param. wanted - cast(test.melted, site + sampdate ~ param) This, unfortunately, doesn't work: c.tds.anal - cast(m.tds.anal, site + sampdate ~ param) Error: could not find function cast ?cast has an example using acast() so I tried that: c.tds.anal - acast(m.tds.anal, site + sampdate ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Changing 'sampdate' to 'variable' produces results, but not what's needed: c.tds.anal - acast(m.tds.anal, site + variable ~ param) Aggregation function missing: defaulting to length head(c.tds.anal) Ca Cl Cond Mg Na SO4 TDS BC-0.5_sampdate 1 13 1 0 3 3 BC-0.5_quant 1 13 1 0 3 3 BC-1_sampdate8 8 11 8 7 10 10 BC-1_quant 8 8 11 8 7 10 10 BC-1.5_sampdate 3 46 3 0 6 6 BC-1.5_quant 3 46 3 0 6 6 Now that I better understand melt(), I'll read ?cast. Thanks very much, Rich Rich, I should apologize. I saw rshape2 and just glossed over that and read rshape. So I was using the rshape package rather than rshape2. I don't know the relationship between those two packages and/or how they differ. I am sure that there are others that can help you out here. Dan I need more coffee. That should have been reshape and reshape2, not rshape and rshape2. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: I should apologize. I saw rshape2 and just glossed over that and read rshape. Dan, Apparently the 'e' in the subject line fell off. It's 'reshape2'. So I was using the rshape package rather than rshape2. I don't know the relationship between those two packages and/or how they differ. I am sure that there are others that can help you out here. I, too, don't know how the two packages 'reshape, The Orignal' and 'reshape2, Rebooted' differ. The two articles by Hadley that I read (Statistical Computing and Graphics, 2005, and Journal of Statistical Software, 2007) do not explicitly specify which version of reshape is addressed. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 10:56 AM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: I should apologize. I saw rshape2 and just glossed over that and read rshape. Dan, Apparently the 'e' in the subject line fell off. It's 'reshape2'. So I was using the rshape package rather than rshape2. I don't know the relationship between those two packages and/or how they differ. I am sure that there are others that can help you out here. I, too, don't know how the two packages 'reshape, The Orignal' and 'reshape2, Rebooted' differ. The two articles by Hadley that I read (Statistical Computing and Graphics, 2005, and Journal of Statistical Software, 2007) do not explicitly specify which version of reshape is addressed. Rich Rich, You should be able to get what you want using the dcast function (I think). test.melted - melt(tds.anal) dcast(test.melted, site + sampdate ~ param) Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: You should be able to get what you want using the dcast function (I think). test.melted - melt(tds.anal) dcast(test.melted, site + sampdate ~ param) Dan, I don't think the data frame melted properly. I tried the dcast() but sampdate is not found: c.tds.anal - dcast(m.tds.anal, site + sampdate ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Perhaps I still need to better understand the parameters in the melt() function. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 11:17 AM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: You should be able to get what you want using the dcast function (I think). test.melted - melt(tds.anal) dcast(test.melted, site + sampdate ~ param) Dan, I don't think the data frame melted properly. I tried the dcast() but sampdate is not found: c.tds.anal - dcast(m.tds.anal, site + sampdate ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Perhaps I still need to better understand the parameters in the melt() function. Rich Rich, I read in the sample data that you put in an earlier post. tds.anal - read.table('clipboard',header=TRUE) tds.anal site sampdate param quant 1UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7UDS-O 2007-10-04Mg 1620 9UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21Ca79 20 JCM-10B 2007-06-21Cl 114 Then ran the following code library(reshape2) melt(tds.anal) Using site, sampdate, param as id variables site sampdate param variable value 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 dcast(melt(tds.anal), site + sampdate ~ param) Using site, sampdate, param as id variables site sampdate Ca Cl Cond Mg SO4 TDS 1 JCM-10B 2007-06-21 79 114 NA NA NANA 2 STC-FS 1996-06-14 NA NA 280 NA NANA 3 UDS-O 2006-12-06 NA NA NA NA NA 10800 4 UDS-O 2007-10-04 NA NA NA 1620 7580NA So you can see that at least it worked for me. Can you show us the actual output you get when running the melt and dcast functions on a sample of your data? I am not sure why sampdate is not found since it appears to be in your original sample data. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: What I think you want is as simple as test.melted - melt(test) Dan, et al.: No, it's not that simple. Unfortunately. I've tried various combinations of id.vars, measure.vars, and variable, but just cannot get the dates to display in the melted data frame. For example, m.tds.anal - melt(tds.anal, id.vars = c('site', 'param'), measure.vars = 'sampdate', variable.name = 'variable', na.rm = T, value.name = 'quant') head(m.tds.anal) site param variable quant 1 UDS-O TDS sampdate 13488 2 STC-FS Cond sampdate 9661 3 UDS-OMg sampdate 13790 4 UDS-O SO4 sampdate 13790 5 JCM-10BCa sampdate 13685 6 JCM-10BCl sampdate 13685 No matter where I put 'sampdate' it displays as that string rather than as the date itself. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: I read in the sample data that you put in an earlier post. tds.anal - read.table('clipboard',header=TRUE) tds.anal site sampdate param quant 1UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7UDS-O 2007-10-04Mg 1620 9UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21Ca79 20 JCM-10B 2007-06-21Cl 114 Dan, In your 'tds.anal' is sampdate a factor or a date? Here it's a date: str(tds.anal) 'data.frame': 11434 obs. of 4 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 134 127 134 134 73 73 91 91 91 91 ... $ sampdate: Date, format: 2006-12-06 1996-06-14 ... $ param : Factor w/ 7 levels Ca,Cl,Cond,..: 7 3 4 6 1 2 4 5 1 4 ... $ quant : num 10800 280 1620 7580 79 114 301 27.5 381 274 ... I wonder if that's what's not working here. Then ran the following code melt(tds.anal) Using site, sampdate, param as id variables site sampdate param variable value 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 When I do this I see: melt(tds.anal) Using site, param as id variables site param variable value 1 UDS-O TDS sampdate 13488.000 2 STC-FS Cond sampdate 9661.000 3 UDS-OMg sampdate 13790.000 4 UDS-O SO4 sampdate 13790.000 5JCM-10BCa sampdate 13685.000 6JCM-10BCl sampdate 13685.000 Notice there's no sampdate here. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 12:00 PM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: I read in the sample data that you put in an earlier post. tds.anal - read.table('clipboard',header=TRUE) tds.anal site sampdate param quant 1UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7UDS-O 2007-10-04Mg 1620 9UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21Ca79 20 JCM-10B 2007-06-21Cl 114 Dan, In your 'tds.anal' is sampdate a factor or a date? Here it's a date: str(tds.anal) 'data.frame': 11434 obs. of 4 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 134 127 134 134 73 73 91 91 91 91 ... $ sampdate: Date, format: 2006-12-06 1996-06-14 ... $ param : Factor w/ 7 levels Ca,Cl,Cond,..: 7 3 4 6 1 2 4 5 1 4 ... $ quant : num 10800 280 1620 7580 79 114 301 27.5 381 274 ... I wonder if that's what's not working here. Then ran the following code melt(tds.anal) Using site, sampdate, param as id variables site sampdate param variable value 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 When I do this I see: melt(tds.anal) Using site, param as id variables site param variable value 1 UDS-O TDS sampdate 13488.000 2 STC-FS Cond sampdate 9661.000 3 UDS-OMg sampdate 13790.000 4 UDS-O SO4 sampdate 13790.000 5JCM-10BCa sampdate 13685.000 6JCM-10BCl sampdate 13685.000 Notice there's no sampdate here. Rich Rich, You were right about sampdate as I read it in. How about trying this on your data m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param'), variable.name = 'variable', na.rm = T, value.name = 'quant') wanted - dcast(m.tds.anal, site + sampdate ~ param) Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
Try this, based on your small example: tds.a - read.table(textConnection( + site sampdate param quant + 1UDS-O 2006-12-06 TDS 10800 + 4 STC-FS 1996-06-14 Cond 280 + 7UDS-O 2007-10-04Mg 1620 + 9UDS-O 2007-10-04 SO4 7580 + 19 JCM-10B 2007-06-21Ca79 + 20 JCM-10B 2007-06-21Cl 114), header = TRUE, stringsAsFactors = FALSE) closeAllConnections() # Define param so that all of its levels are represented: tds.a - within(tds.a, { param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate = as.Date(sampdate) } ) library('reshape2') dcast(tds.a, site + sampdate ~ param, value_var = 'quant') # Result: site sampdate TDS Cond Mg Ca Cl SO4 1 JCM-10B 2007-06-21NA NA NA 79 114 NA 2 STC-FS 1996-06-14NA 280 NA NA NA NA 3 UDS-O 2006-12-06 10800 NA NA NA NA NA 4 UDS-O 2007-10-04NA NA 1620 NA NA 7580 HTH, Dennis On Thu, Oct 27, 2011 at 8:26 AM, Rich Shepard rshep...@appl-ecosys.com wrote: This is my first excursion into using reshape2 and I want to ensure that the melt() function call is syntactically correct. The unmodifed data frame is organized this way: head(tds.anal) site sampdate param quant 1 UDS-O 2006-12-06 TDS 10800 4 STC-FS 1996-06-14 Cond 280 7 UDS-O 2007-10-04 Mg 1620 9 UDS-O 2007-10-04 SO4 7580 19 JCM-10B 2007-06-21 Ca 79 20 JCM-10B 2007-06-21 Cl 114 What I want looks like this: site sampdate TDS Cond Mg Ca Cl Na SO4 UDS-O 2006-12-06 10800 NA 1620 NA NA NA 7580 with the actual data for each param, of course. I've read the reshape.pdf, reshape2.pdf, the ?melt help page, and the ?melt.data.frame help page. I'm still unclear on the differences among measure.vars, variable.name, and value.name. After several attempts I have what may be what the melted tds.anal should look like: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param'), \ measure.vars = 'quant', value.name = 'quant', na.rm = F) head(m.tds.anal) site sampdate param variable quant 1 UDS-O 2006-12-06 TDS quant 10800 2 STC-FS 1996-06-14 Cond quant 280 3 UDS-O 2007-10-04 Mg quant 1620 4 UDS-O 2007-10-04 SO4 quant 7580 5 JCM-10B 2007-06-21 Ca quant 79 6 JCM-10B 2007-06-21 Cl quant 114 Is the melt() function call correct? Should the melted result look like the unmelted (long form in Paul Dalgaard's book) data with the additional 'variable' column containing 'quant' for each row? Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
So I was using the rshape package rather than rshape2. I don't know the relationship between those two packages and/or how they differ. I am sure that there are others that can help you out here. I, too, don't know how the two packages 'reshape, The Orignal' and 'reshape2, Rebooted' differ. The two articles by Hadley that I read (Statistical Computing and Graphics, 2005, and Journal of Statistical Software, 2007) do not explicitly specify which version of reshape is addressed. See here: https://github.com/hadley/reshape -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Fri, 28 Oct 2011, Hadley Wickham wrote: See here: https://github.com/hadley/reshape Thanks, Hadley. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Dennis Murphy wrote: # Define param so that all of its levels are represented: tds.a - within(tds.a, { param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate = as.Date(sampdate) } ) Dennis, This produces an error: tds.a - within(tds.anal, {param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate = as.Date(sampdate)}) Error: unexpected symbol in tds.a - within(tds.anal, {param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate So I did the conversion and melting the long way: tds.a - tds.anal tds.a$sampdate - factor(tds.a$sampdate) Now sampdate is a factor, and quant has the proper values: str(tds.a) 'data.frame': 11434 obs. of 4 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 134 127 134 134 73 73 $ sampdate: Factor w/ 890 levels 1981-11-30,1982-04-28,..: 686 268 727 $ param : Factor w/ 7 levels Ca,Cl,Cond,..: 7 3 4 6 1 2 4 5 1 4 ... $ quant : num 10800 280 1620 7580 79 114 301 27.5 381 274 ... However, ... dcast(tds.a, site + sampdate ~ param, value_var = 'quant') c.tds.anal - dcast(tds.a, site + sampdate ~ param, value_var = 'quant') Aggregation function missing: defaulting to length. That's because c.tds.anal has lost the quant values: str(c.tds.anal) 'data.frame': 2309 obs. of 9 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 1 1 1 2 2 2 2 2 2 2 ... $ sampdate: Factor w/ 890 levels 1981-11-30,1982-04-28,..: 239 252 260 99 101 102 155 163 170 179 ... $ Ca : int 1 0 0 1 1 1 1 1 0 1 ... $ Cl : int 1 0 0 1 1 1 1 1 0 1 ... $ Cond: int 1 1 1 1 1 1 1 1 1 1 ... $ Mg : int 1 0 0 1 1 1 1 1 0 1 ... $ Na : int 0 0 0 1 1 1 1 1 0 1 ... $ SO4 : int 1 1 1 1 1 1 1 1 0 1 ... $ TDS : int 1 1 1 1 1 1 1 1 0 1 ... I don't see what I did differently from your sample code. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Rich Shepard wrote: So I did the conversion and melting the long way: tds.a - tds.anal tds.a$sampdate - factor(tds.a$sampdate) I didn't copy the melt command: tds.anal.m - melt(tds.a) However, ... c.tds.anal - dcast(tds.a, site + sampdate ~ param, value_var = 'quant') And the above was actually c.tds.anal - dcast(m.tds.anal ...) but produces the same error. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 2:41 PM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Dennis Murphy wrote: # Define param so that all of its levels are represented: tds.a - within(tds.a, { param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate = as.Date(sampdate) } ) Dennis, This produces an error: tds.a - within(tds.anal, {param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate = as.Date(sampdate)}) Error: unexpected symbol in tds.a - within(tds.anal, {param = factor(param, levels = c('TDS', 'Cond', 'Mg', 'Ca', 'Cl', 'Na', 'SO4')) sampdate So I did the conversion and melting the long way: tds.a - tds.anal tds.a$sampdate - factor(tds.a$sampdate) Now sampdate is a factor, and quant has the proper values: str(tds.a) 'data.frame': 11434 obs. of 4 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 134 127 134 134 73 73 $ sampdate: Factor w/ 890 levels 1981-11-30,1982-04-28,..: 686 268 727 $ param : Factor w/ 7 levels Ca,Cl,Cond,..: 7 3 4 6 1 2 4 5 1 4 ... $ quant : num 10800 280 1620 7580 79 114 301 27.5 381 274 ... However, ... dcast(tds.a, site + sampdate ~ param, value_var = 'quant') c.tds.anal - dcast(tds.a, site + sampdate ~ param, value_var = 'quant') Aggregation function missing: defaulting to length. That's because c.tds.anal has lost the quant values: str(c.tds.anal) 'data.frame': 2309 obs. of 9 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 1 1 1 2 2 2 2 2 2 2 ... $ sampdate: Factor w/ 890 levels 1981-11-30,1982-04-28,..: 239 252 260 99 101 102 155 163 170 179 ... $ Ca : int 1 0 0 1 1 1 1 1 0 1 ... $ Cl : int 1 0 0 1 1 1 1 1 0 1 ... $ Cond: int 1 1 1 1 1 1 1 1 1 1 ... $ Mg : int 1 0 0 1 1 1 1 1 0 1 ... $ Na : int 0 0 0 1 1 1 1 1 0 1 ... $ SO4 : int 1 1 1 1 1 1 1 1 0 1 ... $ TDS : int 1 1 1 1 1 1 1 1 0 1 ... I don't see what I did differently from your sample code. Rich Rich, It appears that what you did differently was to dcast the un-melted data. In addition, you shouldn't need to turn your dates into factors. I assume that your unaltered data is in tds.anal. Melt the data this way: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param')) Then, cast the data this way: wanted - dcast(m.tds.anal, site + sampdate ~ param) Does this do what you wanted? Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: It appears that what you did differently was to dcast the un-melted data. In addition, you shouldn't need to turn your dates into factors. I assume that your unaltered data is in tds.anal. Dan, It's been that sort of a day when that might well have been what I did. So I checked and re-melted the data with similar lack of results. Melt the data this way: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param')) OK. This yields: head(m.tds.anal) site sampdate param variable value 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 Then, cast the data this way: wanted - dcast(m.tds.anal, site + sampdate ~ param) Does this do what you wanted? I still get this error: wanted - dcast(m.tds.anal, site + sampdate ~ param) Aggregation function missing: defaulting to length and the values are replaced by lengths in the output: head(wanted) site sampdate Ca Cl Cond Mg Na SO4 TDS 1 BC-0.5 1996-04-19 1 11 1 0 1 1 2 BC-0.5 1996-05-21 0 01 0 0 1 1 3 BC-0.5 1996-06-02 0 01 0 0 1 1 4 BC-1 1994-05-24 1 11 1 1 1 1 5 BC-1 1994-06-02 1 11 1 1 1 1 6 BC-1 1994-06-10 1 11 1 1 1 1 This is what happens each time I try. That's why I thought that sampdate needed to be a factor rather than a date. I certainly have no idea why I'm not getting valid results, but appreciate the help that you and others provide so I can learn to do this correctly. Thanks again, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, October 27, 2011 4:29 PM To: r-help@r-project.org Subject: Re: [R] Syntax Check: rshape2 melt() On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: It appears that what you did differently was to dcast the un-melted data. In addition, you shouldn't need to turn your dates into factors. I assume that your unaltered data is in tds.anal. Dan, It's been that sort of a day when that might well have been what I did. So I checked and re-melted the data with similar lack of results. Melt the data this way: m.tds.anal - melt(tds.anal, id.vars = c('site', 'sampdate', 'param')) OK. This yields: head(m.tds.anal) site sampdate param variable value 1 UDS-O 2006-12-06 TDSquant 10800 2 STC-FS 1996-06-14 Condquant 280 3 UDS-O 2007-10-04Mgquant 1620 4 UDS-O 2007-10-04 SO4quant 7580 5 JCM-10B 2007-06-21Caquant79 6 JCM-10B 2007-06-21Clquant 114 Then, cast the data this way: wanted - dcast(m.tds.anal, site + sampdate ~ param) Does this do what you wanted? I still get this error: wanted - dcast(m.tds.anal, site + sampdate ~ param) Aggregation function missing: defaulting to length and the values are replaced by lengths in the output: head(wanted) site sampdate Ca Cl Cond Mg Na SO4 TDS 1 BC-0.5 1996-04-19 1 11 1 0 1 1 2 BC-0.5 1996-05-21 0 01 0 0 1 1 3 BC-0.5 1996-06-02 0 01 0 0 1 1 4 BC-1 1994-05-24 1 11 1 1 1 1 5 BC-1 1994-06-02 1 11 1 1 1 1 6 BC-1 1994-06-10 1 11 1 1 1 1 This is what happens each time I try. That's why I thought that sampdate needed to be a factor rather than a date. I certainly have no idea why I'm not getting valid results, but appreciate the help that you and others provide so I can learn to do this correctly. Thanks again, Rich Rich, I will be leaving work shortly, and won't be able to look at this again until tomorrow some time. However, I went back and reread the help page on dcast. Here is part of that page. fun.aggregate aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified. It looks like your data may be slightly different than I expected. The fact that dcast is looking for an aggregation function suggests that you have more than one record for some combination of site|sampdate|param. No param value ('TDS', 'SO4', ...) should occur more than once for any combination of site and sampdate (i.e. no multiple measures of the same parameter on the same day in the same site). So, you might want to check your data to make sure you have no more than 1 measurement per param per sampdate per site. If that isn't the problem, then I would start thinking about restarting R and then re-installing reshape2. Good luck. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Syntax Check: rshape2 melt()
On Thu, 27 Oct 2011, Nordlund, Dan (DSHS/RDA) wrote: I will be leaving work shortly, and won't be able to look at this again until tomorrow some time. However, I went back and reread the help page on dcast. Here is part of that page. fun.aggregate aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified. I read that, too, Dan. But did not see how that applies to these data. It looks like your data may be slightly different than I expected. The fact that dcast is looking for an aggregation function suggests that you have more than one record for some combination of site|sampdate|param. No param value ('TDS', 'SO4', ...) should occur more than once for any combination of site and sampdate (i.e. no multiple measures of the same parameter on the same day in the same site). So, you might want to check your data to make sure you have no more than 1 measurement per param per sampdate per site. Nope. Can't. In the database table site, date, param are the compound primary key. Besides, it makes no sense to have two concentration measurements of the same chemical on the same day from the same location. I spent a lot of time getting rid of duplicate records because the same site was listed with two spellings (one with a hyphen, one without) or some records were listed as TDS while others from the same day and location were entered in their spreadsheet as Total Dissolved Solids. That table is clean, and the .csv copy didn't change any values on its way into R. If that isn't the problem, then I would start thinking about restarting R and then re-installing reshape2. Good luck. I'll do that tomorrow. It's sorta' a Microsoft solution: shut down computer, spin around it your chair three times, restart computer. Too bad there is not more detailed diagnostic information about the source of the error, or a way to stick a stub of code in the function call to illucidate just what dcast() thinks it sees. Thanks, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.