Re: [R] data format
In reply to: minikg min...@cmfri.org.in, 19 Aug 2015, 16:29

This is the kind of problem the tidyr package was designed for.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] data format
Hi,

I have a dataset consisting of landmarks of each sample's coordinates, as given below.

landmark   X   Y   X   Y   X   Y
P1         5  34   7  26   7  32
P2        46  45  48  42  44  48
P3        73  45  72  44  71  46
P4        92  43  90  43  89  42

Please help me to change my data format to

sample  p1x1  p1y1  p2x2  p2y2  p3x3  p3y3  p4x4  p4y4
1          5    34    46    45    73    45    92    43
2          7    26    48    42    72    44    90    43
3          7    32    44    48    71    46    89    42

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/data-format-tp4711278.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] data format
In reply to: minikg, Wednesday, August 19, 2015 9:29 AM

This looks like data for a morphometrics analysis, so you should know about package geomorph. Data like yours is often stored as a three-dimensional array, so we switch to that format and then use the two.d.array() function in package geomorph.

Assuming your dataset is called dat:

    arr <- array(as.matrix(dat[, -1]), dim=c(4, 2, 3))
    library(geomorph)
    mat <- two.d.array(arr)
    colnames(mat) <- paste0("p", rep(1:4, each=2),
        rep(c("x", "y"), 4), rep(1:4, each=2))
    mat
         p1x1 p1y1 p2x2 p2y2 p3x3 p3y3 p4x4 p4y4
    [1,]    5   34   46   45   73   45   92   43
    [2,]    7   26   48   42   72   44   90   43
    [3,]    7   32   44   48   71   46   89   42
    dat2 <- data.frame(sample=1:3, mat)
    dat2
      sample p1x1 p1y1 p2x2 p2y2 p3x3 p3y3 p4x4 p4y4
    1      1    5   34   46   45   73   45   92   43
    2      2    7   26   48   42   72   44   90   43
    3      3    7   32   44   48   71   46   89   42

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
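For readers without geomorph installed, the same reshaping can probably be done in base R alone. The following is only a sketch, not the method used in the reply above: it swaps two.d.array() for aperm() plus matrix(), and the column names X1 ... Y3 are an assumption, since the header in the question simply repeats X and Y.

```r
# Example data copied from the question. The names X1..Y3 are assumed
# for illustration; the original header repeats "X" and "Y".
dat <- data.frame(
  landmark = c("P1", "P2", "P3", "P4"),
  X1 = c(5, 46, 73, 92), Y1 = c(34, 45, 45, 43),
  X2 = c(7, 48, 72, 90), Y2 = c(26, 42, 44, 43),
  X3 = c(7, 44, 71, 89), Y3 = c(32, 48, 46, 42)
)

# 4 landmarks x 2 coordinates x 3 samples
arr <- array(as.matrix(dat[, -1]), dim = c(4, 2, 3))

# Reorder to coordinate x landmark x sample, so that flattening one
# sample yields x1, y1, x2, y2, ...; each sample becomes one column of
# an 8-row matrix, which is then transposed to one row per sample.
mat <- t(matrix(aperm(arr, c(2, 1, 3)), nrow = 8))
colnames(mat) <- paste0("p", rep(1:4, each = 2), c("x", "y"),
                        rep(1:4, each = 2))

dat2 <- data.frame(sample = 1:3, mat)
```

The aperm() call is the only non-obvious step: without it the flattened order would be all x values followed by all y values, rather than alternating x and y per landmark.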
Re: [R] data format setting
In reply to: fr...@vestas.com, Sat, 14 Jun 2014 06:09:08 +0200

Thanks Frede, it helped a lot.

Eliza
[R] data format setting
Dear R family,

I hope you are all doing great. I have a dataset in the following format:

      st year month day discharge
1      A 2004     1   1  6.752828
2      A 2004     1   2  7.602053
3      A 2004     1   3  5.583619
4      A 2004     1   4  5.019562
5      A 2004     1   5  4.804489
6      A 2004     1   6  4.363541
7      A 2004     1   7  3.801333
8      A 2004     1   8  3.455991
9      A 2004     1   9  3.402634
10     A 2004     1  10  3.250693
..
.. continue ..
..
      st year month day discharge
20000 AY 1967    10   3      0.56
20001 AY 1967    10   4      0.56
20002 AY 1967    10   5      0.48
20003 AY 1967    10   6      0.56
20004 AY 1967    10   7      0.48
20005 AY 1967    10   8      0.40
20006 AY 1967    10   9      0.40
20007 AY 1967    10  10      0.56
20008 AY 1967    10  11      0.56
20009 AY 1967    10  12      0.65
20010 AY 1967    10  13      0.85

As you can see, there are five columns. The first column has the name of the station. I want to split the data by station name. Each station has data for certain years: for example, A has data for 2004 to 2010, and AY for 1967 to 2000. Similarly, the other stations have data for different numbers of years. I want to make a list of matrices, each containing the data for one station, in the following format:

$A
2004 2005 2006 2007 2008 2009 2010
..   ...  ...  ..   ...

$AY
1967 1968 ... 2000

Each column should have 365 or 366 values, depending on whether the year is a leap year. For non-leap years the 366th row should be NA. Kindly help me with this.

Thank you very much in advance.

Eliza
Re: [R] data format setting
In reply to: eliza botto, Fri, 13 Jun 2014

?sort, ?unique, and subset come to mind.

Clint Bowman              INTERNET: cl...@ecy.wa.gov
Air Quality Modeler       INTERNET: cl...@math.utah.edu
Department of Ecology     VOICE:    (360) 407-6815
PO Box 47600              FAX:      (360) 407-7534
Olympia, WA 98504-7600

USPS:    PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274
Re: [R] data format setting
Thanks Dennis,

It worked, but I had to make some simple modifications to get to the final format. Now I have a list in the following format:

$A
2004 2005 2006 2007 2008 2009 2010
..   ...  ...  ..   ...

$AY
1967 1968 ... 2000 ...

Some columns had 365 rows and some 366; those with 365 rows had NA in row 366. Now I want to use the approx command to interpolate the 366 values to 365, but approx returns something with $x and $y components and, frankly speaking, it messed everything up. Is there a way to do this neatly without losing the format?

In any case, thank you very much indeed.

Eliza

> Date: Fri, 13 Jun 2014 11:11:37 -0700
> Subject: Re: [R] data format setting
> From: djmu...@gmail.com
> To: eliza_bo...@hotmail.com
>
> Hi:
>
> Maybe something like this:
>
>     library(reshape2)
>     library(plyr)  # llply() comes from plyr
>     L <- split(DF, DF$year)
>     L2 <- llply(L, function(d) dcast(d, month + day ~ year, value.var = "discharge"))
>
> Obviously untested, so caveat emptor. The idea is to use the dcast function to reshape the data from long to wide format within year.
>
> Dennis
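The long-to-wide idea can also be sketched in base R, splitting by station rather than by year. This is an illustration under stated assumptions, not the code that was actually used in the thread: the toy DF below is made up, station_matrix() is a hypothetical helper name, and rows are indexed by day of year so that row 366 stays NA in non-leap years, as requested in the original question.

```r
# Toy data standing in for the poster's DF (values invented for illustration)
DF <- data.frame(
  st        = c("A", "A", "A", "A", "AY", "AY"),
  year      = c(2004, 2004, 2004, 2005, 1967, 1968),
  month     = c(1, 2, 12, 12, 10, 10),
  day       = c(1, 29, 31, 31, 3, 3),
  discharge = c(6.75, 7.60, 5.58, 5.02, 0.56, 0.48)
)

# Hypothetical helper: one station's rows -> 366 x n_years matrix
station_matrix <- function(d) {
  yrs <- sort(unique(d$year))
  m <- matrix(NA_real_, nrow = 366, ncol = length(yrs),
              dimnames = list(NULL, as.character(yrs)))
  dates <- as.Date(sprintf("%d-%02d-%02d", as.integer(d$year),
                           as.integer(d$month), as.integer(d$day)))
  doy <- as.integer(format(dates, "%j"))          # day of year, 1..366
  m[cbind(doy, match(d$year, yrs))] <- d$discharge
  m
}

# One matrix per station, named by station
L <- lapply(split(DF, DF$st), station_matrix)
```

Each element of L keeps the years as column names, so `L$A[, "2004"]` is one year's series and non-leap years end with an NA in row 366.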
Re: [R] data format setting
In reply to: eliza botto eliza_bo...@hotmail.com, June 13, 2014 12:46:01 PM MDT

As always, you are requested to post in plain text and to provide a reproducible example. "Messed things up" is quite vague.

FWIW: In general, processing in sequence is best done BEFORE you cast your data to wide format.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] data format setting
In reply to: eliza botto, 13/06/2014 20.48 (GMT+01:00)

Hi Eliza,

It seems to me that you are not thinking things through before reshaping the data for analysis. The years with data for 366 days are leap years. A leap year occurs every fourth year, and the extra day falls on February 29th. I guess it is the result from the dcast function that makes you believe the extra day is day number 366.

The best thing to do is to run your analysis on the complete data, with missing values for February 29th in the years between leap years. Alternatively, you can discard February 29th from the leap years and run the analysis with 365 days for all years.

What is the rationale for imputing missing data with the approx function? A non-leap year has only 365 days. If for some reason you do want to fill in values for the NAs, one natural way is to replace the NA on February 29th with the mean of the values on February 28th and March 1st. I think there is an na.approx function for that in some package (perhaps zoo). Other methods are available in R: google for "R + impute".

Best regards
Frede

Sent from Samsung mobile
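The "mean of the neighbouring days" suggestion can be sketched with base R's approx(), which returns a list with components x and y; taking only `$y` keeps the result a plain vector, and apply() keeps the matrix layout. This is a rough sketch: fill_gaps() is a hypothetical helper name, the small matrix is invented, and each column is assumed to hold at least two non-missing values.

```r
# Hypothetical helper: linearly interpolate interior NAs in one column.
# Leading and trailing NAs are left as NA (approx's default rule = 1).
fill_gaps <- function(y) {
  idx <- seq_along(y)
  ok  <- !is.na(y)
  approx(idx[ok], y[ok], xout = idx)$y   # keep only the interpolated values
}

# Toy matrix: rows are days, columns are years (values made up)
m <- cbind(a = c(1, 2, NA, 4, NA),
           b = c(2, NA, 6, 8, 10))

# Apply column by column; the result has the same dimensions as m
filled <- apply(m, 2, fill_gaps)
```

For a single missing day between two observed days, linear interpolation gives exactly the mean of the two neighbours, which is the substitution described above.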
Re: [R] data format
Hi Elisa,

Try this:

    mat1 <- matrix(signif(c(1.200407,1.861941,1.560613,2.129241,2.047772,1.784105,
                            1.777159,1.988596,2.163199,2.446993,3.593623,5.706672),
                          digits=3), ncol=1)
    list1 <- list(mat1, mat1, mat1)
    # attach monthly dates starting in January 1911
    list2 <- lapply(list1, function(x)
        data.frame(date1=format(seq.Date(as.Date("1911.01.01", format="%Y.%m.%d"),
                                         by="month", length.out=12),
                                format="%Y.%m.%d"),
                   value=x, stringsAsFactors=FALSE))
    # replace a leading zero in month or day by a space
    list3 <- lapply(list2, function(x) {
        substr(x[,1],6,6) <- ifelse(substr(x[,1],6,6)=="0", " ", substr(x[,1],6,6))
        substr(x[,1],9,9) <- ifelse(substr(x[,1],9,9)=="0", " ", substr(x[,1],9,9))
        x})
    # format the values and put the two header lines on top
    list4 <- lapply(list3, function(x) {
        x[,2] <- sprintf("%.2f", x[,2])
        data.frame(col1=c("EXACT DATA", "FROM 1911 1 1 TO 1911 12 1", do.call(paste, x)),
                   stringsAsFactors=FALSE)})
    list4[[1]]
    #                         col1
    #1                  EXACT DATA
    #2  FROM 1911 1 1 TO 1911 12 1
    #3             1911. 1. 1 1.20
    #4             1911. 2. 1 1.86
    #5             1911. 3. 1 1.56
    #6             1911. 4. 1 2.13
    #7             1911. 5. 1 2.05
    #8             1911. 6. 1 1.78
    #9             1911. 7. 1 1.78
    #10            1911. 8. 1 1.99
    #11            1911. 9. 1 2.16
    #12            1911.10. 1 2.45
    #13            1911.11. 1 3.59
    #14            1911.12. 1 5.71

A.K.

> From: eliza botto eliza_bo...@hotmail.com
> To: smartpink...@yahoo.com
> Sent: Wednesday, February 20, 2013 8:25 AM
> Subject: RE: data format
>
> Dear Arun,
>
> I have a slight inquiry, and I hope you won't mind. I have a list of 124 elements like the following:
>
> [[1]]
>           [,1]
>  [1,] 1.200407
>  [2,] 1.861941
>  [3,] 1.560613
>  [4,] 2.129241
>  [5,] 2.047772
>  [6,] 1.784105
>  [7,] 1.777159
>  [8,] 1.988596
>  [9,] 2.163199
> [10,] 2.446993
> [11,] 3.593623
> [12,] 5.706672
>
> and I want them all in the following manner:
>
> [[1]]
> EXACT DATA
> FROM 1911 1 1 TO 1911 12 1
> 1911. 1. 1 1.20
> 1911. 2. 1 1.86
> 1911. 3. 1 1.56
> 1911. 4. 1 2.12
> 1911. 5. 1 2.04
> 1911. 6. 1 1.78
> 1911. 7. 1 1.77
> 1911. 8. 1 1.98
> 1911. 9. 1 2.16
> 1911.10. 1 2.44
> 1911.11. 1 3.59
> 1911.12. 1 5.70
>
> The date pattern should be the same as before, and the following two lines should be inserted at the top of every list element:
>
> EXACT DATA
> FROM 1911 1 1 TO 1911 12 1
>
> Thank you so very much in advance. I hope you won't mind my frequent questions.
>
> elisa
>
> Date: Tue, 19 Feb 2013 08:18:25 -0800
> From: smartpink...@yahoo.com
> Subject: Re: data format
> To: eliza_bo...@hotmail.com
>
> Hi Elisa,
> No problem.
> Arun
>
> From: eliza botto eliza_bo...@hotmail.com
> To: smartpink...@yahoo.com
> Sent: Tuesday, February 19, 2013 11:10 AM
> Subject: RE: data format
>
> Thanks Arun. It worked!! I am so glad.
> elisa
Re: [R] data format
Date: Tue, 19 Feb 2013 07:22:20 -0800
From: smartpink...@yahoo.com
To: eliza_bo...@hotmail.com
CC: r-help@r-project.org

Hi,

Try this:

    el <- read.csv("el.csv", header=TRUE, sep="\t", stringsAsFactors=FALSE)
    elsplit <- split(el, el$st)
    # full daily calendar from 1930 to 2010
    datetrial <- data.frame(date1=seq.Date(as.Date("1930.1.1", format="%Y.%m.%d"),
                                           as.Date("2010.12.31", format="%Y.%m.%d"),
                                           by="day"))
    elsplit1 <- lapply(elsplit, function(x)
        data.frame(date1=as.Date(paste(x[,2],x[,3],x[,4],sep="-"), format="%Y-%m-%d"),
                   discharge=x[,5]))
    elsplit2 <- lapply(elsplit1, function(x) x[order(x[,1]),])
    library(plyr)
    elsplit3 <- lapply(elsplit2, function(x) join(datetrial, x, by="date1", type="full"))
    # missing days get the character placeholder
    elsplit4 <- lapply(elsplit3, function(x) {x[,2][is.na(x[,2])] <- "-.000"; x})
    elsplit5 <- lapply(elsplit4, function(x) {x[,1] <- format(x[,1], "%Y.%m.%d"); x})
    # replace a leading zero in month or day by a space
    elsplit6 <- lapply(elsplit5, function(x) {
        substr(x[,1],6,6) <- ifelse(substr(x[,1],6,6)=="0", " ", substr(x[,1],6,6))
        substr(x[,1],9,9) <- ifelse(substr(x[,1],9,9)=="0", " ", substr(x[,1],9,9))
        x})
    elsplit6[[1]][1:4,]
    #       date1 discharge
    #1 1930. 1. 1     -.000
    #2 1930. 1. 2     -.000
    #3 1930. 1. 3     -.000
    #4 1930. 1. 4     -.000
    length(elsplit6)
    #[1] 124
    tail(elsplit6[[124]],25)
    #           date1 discharge
    #29561 2010.12. 7     -.000
    #29562 2010.12. 8     -.000
    #29563 2010.12. 9     -.000
    #29564 2010.12.10     -.000
    #29565 2010.12.11     -.000
    #29566 2010.12.12     -.000
    #29567 2010.12.13     -.000
    #29568 2010.12.14     -.000
    #29569 2010.12.15     -.000
    #29570 2010.12.16     -.000
    #29571 2010.12.17     -.000
    #29572 2010.12.18     -.000
    #29573 2010.12.19     -.000
    #29574 2010.12.20     -.000
    #29575 2010.12.21     -.000
    #29576 2010.12.22     -.000
    #29577 2010.12.23     -.000
    #29578 2010.12.24     -.000
    #29579 2010.12.25     -.000
    #29580 2010.12.26     -.000
    #29581 2010.12.27     -.000
    #29582 2010.12.28     -.000
    #29583 2010.12.29     -.000
    #29584 2010.12.30     -.000
    #29585 2010.12.31     -.000
    str(head(elsplit6,3))
    #List of 3
    # $ AGOMO:'data.frame': 29585 obs. of 2 variables:
    #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
    #  ..$ discharge: chr [1:29585] "-.000" "-.000" "-.000" "-.000" ...
    # $ AGONO:'data.frame': 29585 obs. of 2 variables:
    #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
    #  ..$ discharge: chr [1:29585] "-.000" "-.000" "-.000" "-.000" ...
    # $ ANZMA:'data.frame': 29585 obs. of 2 variables:
    #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
    #  ..$ discharge: chr [1:29585] "-.000" "-.000" "-.000" "-.000" ...

Regarding the space between date1 and discharge, I haven't checked it, as you didn't mention whether it is needed in the data.frame or not.

A.K.

> From: eliza botto eliza_bo...@hotmail.com
> To: smartpink...@yahoo.com
> Sent: Tuesday, February 19, 2013 10:01 AM
> Subject: RE:
>
> THANKS ARUN.. IT'S A CHARACTER. SORRY FOR NOT TELLING YOU IN ADVANCE.
> ELISA
>
> Date: Tue, 19 Feb 2013 07:00:03 -0800
> From: smartpink...@yahoo.com
> Subject: Re:
> To: eliza_bo...@hotmail.com
>
> Hi,
> One more doubt. You mentioned -.000. Is it going to be a number or a character string like "-.000"? If it is a number, the final product will be -.
> Arun
>
> From: eliza botto eliza_bo...@hotmail.com
> To: smartpink...@yahoo.com
> Sent: Tuesday, February 19, 2013 9:16 AM
> Subject: RE:
>
> How can you be wrong, Arun?? You are right.
> elisa
>
> Date: Tue, 19 Feb 2013 06:15:31 -0800
> From: smartpink...@yahoo.com
> Subject: Re:
> To: eliza_bo...@hotmail.com
>
> Hi Elisa,
> Just a doubt regarding the format of the date. Is it the same format as the previous one, with 0 replaced by one space if either month or day is less than 10? Also, if I am correct, the list elements are for the different station names, right?
> Arun
>
> From: eliza botto eliza_bo...@hotmail.com
> To: smartpink...@yahoo.com
> Sent: Tuesday, February 19, 2013 8:35 AM
> Subject:
>
> Dear Arun,
> [A text file is also attached in case the format is changed; el is the data file.]
> Attached to this email is the Excel file which contains the data. The data has the following form:
>
> col1.        col2.  col3.  col4.  col5.
> stationname  year   month  day    discharge
> A            2004   1      1      232
> A            2004   1      2      334
> .
> B            2009   1      1      323
> B            2009   1      2      332
>
> The stations have data starting and ending in different years, but I want each station's series to start in 1930 and end in 2010, with -.000 for those days when data is missing. I want to make a list which should appear like the following:
>
> [[A]]
> 1930. 1. 1 -.000
> 1930. 1. 2 -.000
> 1930. 1. 3 -.000
> 1930. 1. 4 -.000
> 1930. 1. 5 -.000
> 1930. 1. 6 -.000
> 1930. 1. 7 -.000
> 1930. 1. 8 -.000
> 1930. 1. 9 -.000
> 1930. 1.10 -.000
> 1930. 1.11 -.000
> 1930. 1.12 -.000
> 1930. 1.13 -.000
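The padding and date-formatting steps discussed in this thread can be condensed into a small base-R sketch. It is not the code from the reply: merge() stands in for plyr's join(), fmt_date() is a hypothetical helper, the five-day calendar and the discharge values are made up, and the "%.3f" formatting of observed values is an assumption.

```r
# Hypothetical helper: "1930-01-02" -> "1930. 1. 2"
# (a zero directly after a dot is replaced by a space)
fmt_date <- function(d) gsub(".0", ". ", format(d, "%Y.%m.%d"), fixed = TRUE)

# Toy observations for one station (values invented for illustration)
obs <- data.frame(
  date1     = as.Date(c("1930-01-02", "1930-01-04")),
  discharge = c(232, 334)
)

# Full calendar; shortened to five days here instead of 1930 to 2010
calendar <- data.frame(
  date1 = seq(as.Date("1930-01-01"), as.Date("1930-01-05"), by = "day")
)

# Left join onto the calendar so that missing days appear as NA
full <- merge(calendar, obs, by = "date1", all.x = TRUE)

# Missing days get the character placeholder from the thread;
# observed values are formatted with three decimals (an assumption)
full$discharge <- ifelse(is.na(full$discharge), "-.000",
                         sprintf("%.3f", full$discharge))
full$date1 <- fmt_date(full$date1)
```

Wrapping the last three steps in a function and calling it through lapply() over split(el, el$st) would give one padded data frame per station.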
[R] data format for ordination
Hello,

I want to do an unconstrained ordination to look at my plant community data, but I don't know how to account for the fact that I have multiple visits per site. Do I need to look at each month separately?

My data set is of 30 field sites that I visited 5 times in the year, and the abundance of each plant species is in its own column:

site  month  species a  species b  species c  ...etc
1     may            1          0          7
1     june           5          0          9
1     july           2          8          0
1     aug           12          6          0
1     sept          14          5          0
2     may            2  ...etc

(I have additional environmental data for later analyses.)

Also, is there much difference between analysis in Canoco and R?

Thanks
Cathy
[R] data format
Dear all,

I have an input file like the following:

T
TTTAG
TTAAC
GGATT
ACGTA

How can I make a single vector from this, like the following:

AGTTAACGGATTACGTA

Best regards
Albert
Re: [R] data format
On Thu, Jul 7, 2011 at 4:37 PM, albert coster albertcoster2...@gmail.com wrote:

> Dear all,
>
> I have a input file like following :
>
> T
> TTTAG
> TTAAC
> GGATT
> ACGTA
>
> How can I make a single vector with this like following:
>
> AGTTAACGGATTACGTA

See ?paste, specifically the collapse argument.

Rainer

> Best regards
> Albert

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax (F): +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: rai...@krugs.de
Skype: RMkrug
Re: [R] data format
# read the lines into a character vector, one element per line
x <- readLines(textConnection("T
TTTAG
TTAAC
GGATT
ACGTA"))
closeAllConnections()
# collapse = '' joins all elements of x into a single string with no separator
paste(x, collapse = '')

On Thu, Jul 7, 2011 at 10:37 AM, albert coster albertcoster2...@gmail.com wrote:

> Dear all,
>
> I have a input file like following :
>
> T
> TTTAG
> TTAAC
> GGATT
> ACGTA
>
> How can I make a single vector with this like following:
>
> AGTTAACGGATTACGTA
>
> Best regards
> Albert

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
[R] Data format question for triangle.plot package ade4
Hello,

I am trying to develop a triangle plot but am having difficulty assigning the row.names to the 3 columns in the data.frame.

Here is what I've done:

attach(SoilVegHydro)
dim(SoilVegHydro)
[1] 1292   39

# now take 3 variables from main data.frame for plotting
# These are variables held in the data frame SoilVegHydro
dat <- cbind.data.frame(TP, meanAnnualDepthAve, BulkDensity)

# following the syntax from the triangle.plot help page
row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3), rep(1292, 3)), sep = "")

This is returned when the last line is submitted:

row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3), rep(1292,3)), sep="")
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "1", "1", "1", :
  invalid 'row.names' length

I'm not certain how to define the row.names. If anyone can help I'd appreciate it.

I'm using R 2.11.1 (2010-5-31) on Windows XP.

Thanks
Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
Re: [R] Data format question for triangle.plot package ade4
On Jul 8, 2010, at 4:41 PM, steve_fried...@nps.gov wrote:

> hello,
>
> I am trying to develop a triangle plot but am having difficultly assigning the row.names to the 3 columns in the data.frame
>
> Here is what I've done,
>
> attach(SoilVegHydro)
> dim(SoilVegHydro)
> [1] 1292   39
>
> # now take 3 variables from main data.frame for plotting
> dat <- cbind.data.frame(TP, meanAnnualDepthAve, BulkDensity)
> # These are variables held in the data frame SoilVegHydro

Did that dat object have what you wanted? The function call did not make any reference to SoilVegHydro. What does str(dat) return? Oh, never mind, I now see you use attach.

> row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy),

Generally row.names is used on a data frame rather than on a column vector.

dat <- data.frame(1:3, LETTERS[1:3])
row.names(dat$X1)
NULL
row.names(dat)
[1] "1" "2" "3"
length(row.names(dat$X1))
[1] 0

> rep(c(1,2,3), rep(1292, 3)), sep = "")
> # following the syntax from the help triangle.plot page
>
> this is returned when the last line is submitted.
>
> row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3), rep(1292,3)), sep="")
> Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "1", "1", "1", :
>   invalid 'row.names' length
>
> I'm not certain how to define the row.names . If anyone can help I'd appreciate it.
>
> I'm using R 2.11.1 (2010-5-31) on Windows XP
>
> Thanks
> Steve

David Winsemius, MD
West Hartford, CT
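David's point can be illustrated with a small stand-in data frame. This is a sketch only: `soil`, its column names and its values are hypothetical, not Steve's SoilVegHydro data. It shows that row.names() of a single column is NULL, and that the replacement vector has to contain exactly one label per row of the data frame being renamed.

```r
# Hypothetical stand-in for SoilVegHydro; names and values are made up
soil <- data.frame(TP = c(0.2, 0.5, 0.3),
                   depth = c(10, 20, 30),
                   density = c(1.1, 1.3, 1.2),
                   row.names = c("siteA", "siteB", "siteC"))

# A column is a plain vector, so it carries no row names
col_names <- row.names(soil$TP)

# Take the three plotting variables; the subset keeps the parent's row names
dat <- soil[, c("TP", "depth", "density")]

# One label per row of dat: the length must equal nrow(dat)
row.names(dat) <- paste(row.names(soil), "1", sep = "")
```

In the failing call, the label vector had 3 * 1292 elements while dat has only 1292 rows, which is consistent with the "invalid 'row.names' length" error.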
Re: [R] Data format for KSVM
Noah Silverman wrote:

> Hi,
>
> I have a process using svm from the e1071 library.

It's called a *package*, which is probably installed in a *library* of packages.

> it works.
>
> I want to try using the KSVM library instead. The same data used with e1071 gives me an error with KSVM.

I guess you are talking about the ksvm *function* in *package* kernlab now, right?

> My data is a data.frame.
>
> sample code:
>
> svm_formula <- formula(y ~ a + B + C)

You do not use svm_formula below, do you?

> svm_model <- ksvm(formula, data=train_data, type="C-svc", kernel="rbfdot", C=1)
>
> I get the following error: object is not a matrix

ksvm works for me. Please specify a reproducible example (including the data) or give us at least the output of str(data), and specify which versions of R and kernlab you are talking about.

Uwe Ligges

> So I tried this:
>
> svm_model <- ksvm(formula, data=as.matrix(train_data), type="C-svc", kernel="rbfdot", C=1, scaled=FALSE)
>
> Now I get this error:
> Error in model.fram.definition(data = list(v1 = c(1.1234, -2.3232: Object is not a matrix
>
> My data was previously scaled with the scale() function so that the mean is centered at 0. and the range is {-1,1}
>
> Can anyone provide some suggestions as to why I'm getting an error?
>
> Thanks!
>
> -N
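A minimal sketch of a ksvm call on a data.frame, assuming the kernlab package is installed. The training data here is simulated and purely hypothetical; only the column names follow the thread. It passes the stored formula object (svm_formula) rather than the name `formula`, which in base R refers to a function. That is one plausible reading of Uwe's hint, not a confirmed diagnosis of the original error.

```r
library(kernlab)

set.seed(1)
# Hypothetical training data; column names follow the thread, values are simulated
train_data <- data.frame(a = rnorm(40), B = rnorm(40), C = rnorm(40))
# C-svc is a classification method, so the response is stored as a factor
train_data$y <- factor(ifelse(train_data$a + train_data$B > 0, "pos", "neg"))

svm_formula <- y ~ a + B + C

# Pass the formula object itself, together with the data.frame
svm_model <- ksvm(svm_formula, data = train_data,
                  type = "C-svc", kernel = "rbfdot", C = 1)

# Predicted classes for the training rows
pred <- predict(svm_model, train_data)
```

With the formula interface there should be no need to convert the data.frame with as.matrix().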
[R] Data format for KSVM
Hi,

I have a process using svm from the e1071 library. It works.

I want to try using the KSVM library instead. The same data used with e1071 gives me an error with KSVM.

My data is a data.frame.

Sample code:

svm_formula <- formula(y ~ a + B + C)
svm_model <- ksvm(formula, data=train_data, type="C-svc", kernel="rbfdot", C=1)

I get the following error: object is not a matrix

So I tried this:

svm_model <- ksvm(formula, data=as.matrix(train_data), type="C-svc", kernel="rbfdot", C=1, scaled=FALSE)

Now I get this error:
Error in model.fram.definition(data = list(v1 = c(1.1234, -2.3232: Object is not a matrix

My data was previously scaled with the scale() function so that the mean is centered at 0 and the range is {-1,1}.

Can anyone provide some suggestions as to why I'm getting an error?

Thanks!

-N
[R] data format issue
Dear all-

I have a dataset (see a sample below, but the whole dataset is June 2005 - June 2008). The LST format is YYMMDDHHmm and I would like to get the hourly average of the mph for the summer months (spanning all years). I have been trying to use aggregate but am not having much success at all! Any thoughts would be greatly appreciated.

thanks-
sherri

LST        inch mph  Deg   DegF DegF %    volts Deg  mph  w/m2
0506010000 0.00 13.6 218.1 36.8 -999 65.1 -999  -999 18.2 0.2
0506010005 0.00 12.9 214.3 36.8 -999 65.5 -999  -999 16.9 0.2
0506010010 0.00 14.4 215.7 36.9 -999 65.4 -999  -999 20.4 0.2
0506010015 0.00 13.8 215.8 36.8 -999 65.7 -999  -999 19.7 0.3
0506010020 0.00 11.9 213.4 36.8 -999 65.6 -999  -999 14.6 0.2
0506010025 0.00 12.7 212.4 36.8 -999 65.4 -999  -999 16.9 0.2
0506010030 0.00 14.1 215.8 36.8 -999 65.9 -999  -999 19.1 0.2
0506010035 0.00 14.8 217.2 36.7 -999 66.2 -999  -999 20.4 0.2
0506010040 0.00 16.2 222.0 36.8 -999 66.6 -999  -999 20.2 0.2
0506010045 0.00 13.6 219.5 36.7 -999 66.6 -999  -999 18.4 0.2
0506010050 0.00 14.8 217.6 36.7 -999 66.2 -999  -999 20.0 0.2
0506010055 0.00 13.1 214.8 36.7 -999 65.9 -999  -999 20.2 0.2
0506010100 0.00 12.2 214.3 36.7 -999 65.2 -999  -999 15.6 0.2
0506010105 0.00 14.2 207.8 36.7 -999 65.0 -999  -999 19.9 0.2
0506010110 0.00 15.4 207.0 36.7 -999 64.4 -999  -999 20.2 0.2
0506010115 0.00 17.2 205.9 36.7 -999 64.5 -999  -999 22.1 0.2
0506010120 0.00 16.8 208.9 36.8 -999 65.0 -999  -999 21.9 0.2
0506010125 0.00 18.4 214.0 36.9 -999 65.1 -999  -999 26.4 0.2
0506010130 0.00 17.3 214.7 37.0 -999 65.5 -999  -999 24.0 0.2
0506010135 0.00 18.4 214.3 37.1 -999 65.2 -999  -999 24.9 0.2
0506010140 0.00 19.6 216.6 37.3 -999 65.3 -999  -999 26.7 0.2
0506010145 0.00 19.7 220.5 37.5 -999 65.1 -999  -999 27.5 0.2
0506010150 0.00 19.6 215.5 37.6 -999 64.6 -999  -999 26.4 0.2
0506010155 0.00 21.8 220.1 37.8 -999 64.1 -999  -999 31.2 0.2
0506010200 0.00 23.4 222.9 37.9 -999 63.8 -999  -999 31.8 0.2
0506010205 0.00 24.0 221.7 37.9 -999 63.7 -999  -999 30.3 0.2
0506010210 0.00 24.2 223.4 38.0 -999 63.5 -999  -999 28.2 0.2
0506010215 0.00 23.8 224.9 38.0 -999 63.4 -999  -999 30.3 0.2
0506010220 0.00 23.9 225.1 38.1 -999 63.5 -999  -999 29.5 0.2
0506010225 0.00 23.9 227.4 38.1 -999 63.5 -999  -999 30.3 0.2
0506010230 0.00 23.9 226.0 38.0 -999 63.6 -999  -999 27.5 0.2
0506010235 0.00 21.5 221.4 38.0 -999 63.7 -999  -999 28.4 0.2
0506010240 0.00 22.3 222.6 37.9 -999 63.8 -999  -999 27.9 0.2
0506010245 0.00 21.5 223.9 37.9 -999 64.0 -999  -999 28.4 0.2
0506010250 0.00 22.2 226.7 37.8 -999 64.2 -999  -999 27.7 0.2
0506010255 0.00 21.9 223.5 37.8 -999 64.3 -999  -999 26.9 0.2
0506010300 0.00 22.0 223.2 37.7 -999 64.3 -999  -999 28.0 0.2
Re: [R] data format issue
Does this do it for you?

# quick and dirty: remove the 'mm' from the data and then aggregate
x$hours <- (x$LST %/% 100) * 100
aggregate(x$mph, list(x$hours), mean)
    Group.1        x
1 506010000 13.82500
2 506010100 17.55000
3 506010200 23.04167
4 506010300 22.00000

You can also 'filter' out the months for only the summer.

On Sat, Dec 20, 2008 at 9:09 PM, Sherri Heck sh...@ucar.edu wrote:

> Dear all-
>
> I have a dataset (see a sample below - but the whole dataset is June 2005 - June 2008). The LST format is YYMMDDHHmm and I would like to get the hourly average of the mph for the summer months (spanning all years). I have been trying to use aggregate but am not having much success at all! any thoughts would be greatly appreciated.
>
> thanks-
> sherri

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
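The "filter out the months" step mentioned above could look roughly like this in base R. It is a sketch under stated assumptions, not Jim's code: the tiny data frame is made up, LST is assumed to have been read as character so that the leading zero survives, and "summer" is taken to mean June to August. It averages by hour of day across all years, rather than by individual date and hour.

```r
# Hypothetical sample in the same YYMMDDHHmm layout; the mph values are made up
x <- data.frame(
  LST = c("0506010000", "0506010005", "0506010100",
          "0512010000", "0607150000", "0607150100"),
  mph = c(10, 14, 20, 99, 30, 40),
  stringsAsFactors = FALSE
)

# Characters 3-4 hold the month, characters 7-8 the hour
x$month <- as.integer(substr(x$LST, 3, 4))
x$hour  <- as.integer(substr(x$LST, 7, 8))

# Keep June, July and August only
summer <- x[x$month %in% 6:8, ]

# Mean mph for each hour of the day, pooled over all summer days and years
hourly <- aggregate(mph ~ hour, data = summer, FUN = mean)
```

If the -999 codes also occur in the mph column, they would need to be set to NA before averaging.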
Re: [R] data format issue
Use read.zoo and aggregate.zoo from zoo, and months, hours and as.chron from chron. Note that we must read in col 1 as character to ensure leading zeros don't get dropped. There are two mph columns and it is assumed you want both:

Lines <- "LST inch mph Deg DegF DegF % volts Deg mph w/m2
0506010000 0.00 13.6 218.1 36.8 -999 65.1 -999 -999 18.2 0.2
0506010005 0.00 12.9 214.3 36.8 -999 65.5 -999 -999 16.9 0.2
0506010010 0.00 14.4 215.7 36.9 -999 65.4 -999 -999 20.4 0.2
0506010015 0.00 13.8 215.8 36.8 -999 65.7 -999 -999 19.7 0.3
0506010020 0.00 11.9 213.4 36.8 -999 65.6 -999 -999 14.6 0.2
0506010025 0.00 12.7 212.4 36.8 -999 65.4 -999 -999 16.9 0.2
0506010030 0.00 14.1 215.8 36.8 -999 65.9 -999 -999 19.1 0.2
0506010035 0.00 14.8 217.2 36.7 -999 66.2 -999 -999 20.4 0.2
0506010040 0.00 16.2 222.0 36.8 -999 66.6 -999 -999 20.2 0.2
0506010045 0.00 13.6 219.5 36.7 -999 66.6 -999 -999 18.4 0.2
0506010050 0.00 14.8 217.6 36.7 -999 66.2 -999 -999 20.0 0.2
0506010055 0.00 13.1 214.8 36.7 -999 65.9 -999 -999 20.2 0.2
0506010100 0.00 12.2 214.3 36.7 -999 65.2 -999 -999 15.6 0.2
0506010105 0.00 14.2 207.8 36.7 -999 65.0 -999 -999 19.9 0.2
0506010110 0.00 15.4 207.0 36.7 -999 64.4 -999 -999 20.2 0.2
0506010115 0.00 17.2 205.9 36.7 -999 64.5 -999 -999 22.1 0.2
0506010120 0.00 16.8 208.9 36.8 -999 65.0 -999 -999 21.9 0.2
0506010125 0.00 18.4 214.0 36.9 -999 65.1 -999 -999 26.4 0.2
0506010130 0.00 17.3 214.7 37.0 -999 65.5 -999 -999 24.0 0.2
0506010135 0.00 18.4 214.3 37.1 -999 65.2 -999 -999 24.9 0.2
0506010140 0.00 19.6 216.6 37.3 -999 65.3 -999 -999 26.7 0.2
0506010145 0.00 19.7 220.5 37.5 -999 65.1 -999 -999 27.5 0.2
0506010150 0.00 19.6 215.5 37.6 -999 64.6 -999 -999 26.4 0.2
0506010155 0.00 21.8 220.1 37.8 -999 64.1 -999 -999 31.2 0.2
0506010200 0.00 23.4 222.9 37.9 -999 63.8 -999 -999 31.8 0.2
0506010205 0.00 24.0 221.7 37.9 -999 63.7 -999 -999 30.3 0.2
0506010210 0.00 24.2 223.4 38.0 -999 63.5 -999 -999 28.2 0.2
0506010215 0.00 23.8 224.9 38.0 -999 63.4 -999 -999 30.3 0.2
0506010220 0.00 23.9 225.1 38.1 -999 63.5 -999 -999 29.5 0.2
0506010225 0.00 23.9 227.4 38.1 -999 63.5 -999 -999 30.3 0.2
0506010230 0.00 23.9 226.0 38.0 -999 63.6 -999 -999 27.5 0.2
0506010235 0.00 21.5 221.4 38.0 -999 63.7 -999 -999 28.4 0.2
0506010240 0.00 22.3 222.6 37.9 -999 63.8 -999 -999 27.9 0.2
0506010245 0.00 21.5 223.9 37.9 -999 64.0 -999 -999 28.4 0.2
0506010250 0.00 22.2 226.7 37.8 -999 64.2 -999 -999 27.7 0.2
0506010255 0.00 21.9 223.5 37.8 -999 64.3 -999 -999 26.9 0.2
0506010300 0.00 22.0 223.2 37.7 -999 64.3 -999 -999 28.0 0.2"

library(zoo)
library(chron)

# -999 marks missing values; the first column is the YYMMDDHHmm time stamp
z <- read.zoo(textConnection(Lines), header = TRUE, na.strings = "-999",
  format = "%y%m%d%H%M", FUN = as.chron,
  colClasses = c("character", rep("numeric", 10)))

# keep the summer months and the two mph columns
mph <- z[months(time(z)) %in% c("Jun", "Jul", "Aug"), grep("mph", colnames(z))]

# average by hour of day
aggregate(mph, hours, mean)

On Sat, Dec 20, 2008 at 9:09 PM, Sherri Heck sh...@ucar.edu wrote:

> Dear all-
>
> I have a dataset (see a sample below - but the whole dataset is June 2005 - June 2008). The LST format is YYMMDDHHmm and I would like to get the hourly average of the mph for the summer months (spanning all years). I have been trying to use aggregate but am not having much success at all! any thoughts would be greatly appreciated.
>
> thanks-
> sherri
Re: [R] Data format for BiodiversityR
Hello,

Maybe it is better if you copy an extract of your dataset file into the message, because the attached file didn't seem to get through.

Margherita

2008/9/14 Ndoh Innocent (Holy) [EMAIL PROTECTED]

> Greetings dear friends.
> Please, I really find problems having the program read my datasets (here attached). Have converted datasets to csv, imported but always not reaching the target. Would be very happy if some one out can help me on time.
> Thanks
>
> Ndoh Mbue Innocent
> International corporation office
> China University of Geosciences
> 388 Lumo road 430074, Wuhan-China
> Tel: 0086 27 67885947/0086 15927262962
[R] Data format for BiodiversityR
Greetings dear friends.

Please, I am having real problems getting the program to read my datasets (here attached). I have converted the datasets to csv and imported them, but I am still not reaching the target. I would be very happy if someone out there could help me in time.

Thanks

Ndoh Mbue Innocent
International corporation office
China University of Geosciences
388 Lumo road 430074, Wuhan-China
Tel: 0086 27 67885947/0086 15927262962

A gentleman should be truly a moral person, a straightforward and reliable personality, in solidarity with the community and rooted in self respect
[R] data format
Hi,

How can I analyze data collected in database format (with labels) rather than split into individual columns (as is usual in Excel)? For example (comma separated data):

Label,Value
Good,10
Bad,12
Good,15
Good,18
Good,12
Bad,15
Bad,10
etc...

ks.test or chisq.test can be done. Splitting the data into new columns is not applicable because I'll use R integration in another software.

Thanks for your concern

Emre

--
Emre ÜNAL
Re: [R] data format
Here is how to read it in. Then you can run your analysis:

x <- read.csv(textConnection("Label,Value
Good,10
Bad,12
Good,15
Good,18
Good,12
Bad,15
Bad,10"))
x
  Label Value
1  Good    10
2   Bad    12
3  Good    15
4  Good    18
5  Good    12
6   Bad    15
7   Bad    10

On 10/30/07, Emre Unal [EMAIL PROTECTED] wrote:

> Hi,
>
> How can I analyze the data collected in database formatting (with labels) rather than splitted by individual columns (almost in excel)?
>
> For example (comma separated data);
>
> Label,Value
> Good,10
> Bad,12
> Good,15
> Good,18
> Good,12
> Bad,15
> Bad,10
> etc...
>
> ks.test or chisq.test can be done. Splitting the data into new columns is not applicable cos' I'll use R-integration in another software.
>
> Thanks for your concern
>
> Emre
>
> --
> Emre ÜNAL

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] data format
There are many ways. A simple one is to use split() to divide your 'Value' column using your 'Label' column as index. For example,

# Create dataset
mydata=data.frame(Label=c('Good','Bad','Good','Good','Good','Bad','Bad'),
                  Value=c(10,12,15,18,12,15,10))

# Split the data
mydata=split(mydata$Value,mydata$Label)

# Do a ks test
ks.test(mydata[[1]],mydata[[2]])

Julian

Emre Unal wrote:

> Hi,
>
> How can I analyze the data collected in database formatting (with labels) rather than splitted by individual columns (almost in excel)?
>
> For example (comma separated data);
>
> Label,Value
> Good,10
> Bad,12
> Good,15
> Good,18
> Good,12
> Bad,15
> Bad,10
> etc...
>
> ks.test or chisq.test can be done. Splitting the data into new columns is not applicable cos' I'll use R-integration in another software.
>
> Thanks for your concern
>
> Emre
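As a complement to split(), many base R functions accept the labelled ("long") layout directly through a formula, so no reshaping is needed. The sketch below uses the seven example rows from the question; the choice of aggregate() and t.test() is only an illustration of the formula interface, not a recommendation of a particular test for these data.

```r
# The example rows from the question, kept in Label/Value layout
mydata <- data.frame(
  Label = c("Good", "Bad", "Good", "Good", "Good", "Bad", "Bad"),
  Value = c(10, 12, 15, 18, 12, 15, 10)
)

# Group means: Value described by Label, no manual splitting required
means <- aggregate(Value ~ Label, data = mydata, FUN = mean)

# Two-sample test using the same formula notation
tt <- t.test(Value ~ Label, data = mydata)
```

The result `means` has one row per label, ordered alphabetically (Bad, then Good).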
Re: [R] Data format problem
Hello,

First of all, thanks to everyone who replied to my questions, and sorry for taking so long to reply.

I found a way to treat this problem using the zoo package:

bbrass = scan("C:/Program Files/R/data PTIN/bbrass_client_2471_pool_72644_percent_in_use_500_NA.dat")
regts.start = ISOdatetime(2006, 7, 1, hour=0, min=0, sec=0, tz="GMT")   # 2006 07 01 00
regts.end = ISOdatetime(2006, 7, 22, hour=2, min=0, sec=0, tz="GMT")    # 2006 07 22 02
regts.zoo <- zooreg(bbrass, regts.start, regts.end, deltat=3600)        # I want the time to the hour...
summary(regts.zoo)
     Index                       regts.zoo
 Min.   :2006-07-01 00:00:00   Min.   :48
 1st Qu.:2006-07-06 06:15:00   1st Qu.:58
 Median :2006-07-11 12:30:00   Median :62
 Mean   :2006-07-11 12:30:00   Mean   :65
 3rd Qu.:2006-07-16 18:45:00   3rd Qu.:74
 Max.   :2006-07-22 01:00:00   Max.   :81
                               NA's   :40

Then I create a time series and the regul function (pastecs package) works fine.

Thanks again for the replies,

Joao Santos

Joao Santos wrote:

> Hello,
>
> My problem is in the format of the date. My time series is like this:
>
> 2006070100 1244 61 62
> 2006070101 1221 60 60
> 2006070102 1214 60 60
> 2006070103 1194 59 59
> 2006070104 1182 58 58
> 2006070105 1178 58 58
> 2006070106 1176 58 58
> 2006070107 1173 58 58
> 2006070108 1179 58 59
> 2006070109 1246 61 62
> .
>
> When I attempt to format the time like this:
>
> A <- read.table(file, sep="\t", col.names=c("date", "my1", "my2", "my3"))
> temp <- as.Date(A$date, format="%Y%m%d%H")
> temp
>
> I get
>
> [1] "4403-05-21" "4403-05-22" "4403-05-23" "4403-05-24" "4403-05-25"
> [6] "4403-05-26" "4403-05-27" "4403-05-28" "4403-05-29" "4403-05-30"
>
> Another problem is in REGUL. I am using the variables created in the extraction of the data, but the regulation is not possible:
>
> Ts.regul <- regul(A$date, y=A$my2, xmin=2006070100, n=800, units="hours", frequency=1, deltat=1/3600, datemin=NULL, dateformat="m/d/Y", tol=NULL, tol.type="both", methods="linear", rule=1, f=0, periodic=FALSE, window=(2006080316 - 2006070100)/(800 - 1), split=100, specs=NULL)
>
> I think if this question is resolved the function REGUL will work too.
>
> Can someone help me?
> I am new to this forum and to the use of R.
>
> Thanks for the help in advance,
>
> João Santos
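For the date-format part of the question, one possible approach is to parse the YYYYMMDDHH stamps as date-times rather than dates. This is a sketch, not what Joao ended up using (he generated the time index with zooreg instead). The three stamps are taken from the layout in the post, and the tz setting is an assumption carried over from his ISOdatetime calls.

```r
# Stamps in the YYYYMMDDHH layout from the post, read as numbers by read.table
stamp <- c(2006070100, 2006070101, 2006070102)

# Turn the numbers into 10-character strings before parsing.
# as.Date() stores whole days only, so a date-time class is needed to keep the hour.
when <- as.POSIXct(sprintf("%.0f", stamp), format = "%Y%m%d%H", tz = "GMT")

# Spacing between consecutive observations
step <- diff(when)
```

Reading the first column as character in the first place, for example with colClasses in read.table, would avoid the numeric detour.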