Re: [R] Trouble converting hourly data into daily data
Hi Jean,

Thanks for the help. I couldn't quite get the results I needed with the merge command, but I ended up using the following work-around:

Weather <- read.csv("Weather.csv")
Weather$diff.time <- abs(.5 - Weather$TimeNumeric)
# within-day position of the observation closest to noon
agg <- aggregate(diff.time ~ Date, data = Weather, FUN = which.min)
# number of rows preceding each day's block of observations
n.obs <- cumsum(rle(as.double(Weather$Date))$lengths)
n.obs <- c(0, n.obs[1:(length(n.obs) - 1)])
# row index of each day's closest-to-noon observation
noon.ind <- agg$diff.time + n.obs
subset <- Weather[noon.ind,]

Cheers,
Sean

On Mon, Dec 19, 2011 at 6:03 AM, Jean V Adams jvad...@usgs.gov wrote:

Sean Baumgarten wrote on 12/14/2011 06:38:08 PM:

Hello,

I have a data frame with hourly or sub-hourly weather records that span several years, and from that data frame I'm trying to select only the records taken closest to noon for each day. Here's what I've done so far:

#Add a column to the data frame showing the difference between noon and the observation time (I converted time to a 0-1 scale so 0.5 represents noon):
data$Diff_from_noon <- abs(0.5-data$Time)

#Find the minimum value of Diff_from_noon for each Date:
aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)

The problem is that the aggregated data frame only has two columns: Date and Diff_from_noon. I can't figure out how to get the columns with the actual weather variables to carry over from the original data frame.

Any suggestions you have would be much appreciated.

Thanks,
Sean

You don't provide any example data, so I will use data from R datasets, airquality. After using the aggregate() function to find the minimum Day for each Month, merge the resulting data frame with the original data frame to see all the columns corresponding to the selected minimums.

aggregated <- aggregate(Day ~ Month, airquality, FUN=min)
aggregated
  Month Day
1     5   1
2     6   1
3     7   1
4     8   1
5     9   1

merge(aggregated, airquality)
  Month Day Ozone Solar.R Wind Temp
1     5   1    41     190  7.4   67
2     6   1    NA     286  8.6   78
3     7   1   135     269  4.1   84
4     8   1    39      83  6.9   81
5     9   1    96     167  6.9   91

For your data, the code would look like this:

aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
merge(aggregated, data)

I recommend that you use a name other than data for your data frame, since data() is a built-in R function.

Jean

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
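The index-offset work-around from Sean's follow-up can be tried on a small made-up data frame. This is only a sketch: the values, the Temp column, and the names run.len, offset and noon are assumptions for illustration and do not come from the thread. It also assumes the rows are already sorted so that each Date forms one contiguous block, since the offsets depend on that.

```r
# Hypothetical toy data: two days, three observations each, sorted by Date.
Weather <- data.frame(
  Date        = rep(c("2011-12-01", "2011-12-02"), each = 3),
  TimeNumeric = c(0.25, 0.48, 0.75, 0.10, 0.55, 0.90),
  Temp        = c(5, 9, 7, 2, 8, 4),
  stringsAsFactors = FALSE
)
Weather$diff.time <- abs(0.5 - Weather$TimeNumeric)

# Position, within each day, of the observation closest to noon
agg <- aggregate(diff.time ~ Date, data = Weather, FUN = which.min)

# Number of rows that come before each day's block of observations
run.len <- rle(as.character(Weather$Date))$lengths
offset  <- c(0, cumsum(run.len)[-length(run.len)])

# Within-day position plus offset gives the row index in the full data frame
noon.ind <- agg$diff.time + offset
noon <- Weather[noon.ind, ]
print(noon)
```

aggregate() returns its groups in sorted Date order, so the offsets line up with the groups only when the data frame itself is in that order.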
Re: [R] Trouble converting hourly data into daily data
Sean Baumgarten wrote on 12/14/2011 06:38:08 PM:

Hello,

I have a data frame with hourly or sub-hourly weather records that span several years, and from that data frame I'm trying to select only the records taken closest to noon for each day. Here's what I've done so far:

#Add a column to the data frame showing the difference between noon and the observation time (I converted time to a 0-1 scale so 0.5 represents noon):
data$Diff_from_noon <- abs(0.5-data$Time)

#Find the minimum value of Diff_from_noon for each Date:
aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)

The problem is that the aggregated data frame only has two columns: Date and Diff_from_noon. I can't figure out how to get the columns with the actual weather variables to carry over from the original data frame.

Any suggestions you have would be much appreciated.

Thanks,
Sean

You don't provide any example data, so I will use data from R datasets, airquality. After using the aggregate() function to find the minimum Day for each Month, merge the resulting data frame with the original data frame to see all the columns corresponding to the selected minimums.

aggregated <- aggregate(Day ~ Month, airquality, FUN=min)
aggregated
  Month Day
1     5   1
2     6   1
3     7   1
4     8   1
5     9   1

merge(aggregated, airquality)
  Month Day Ozone Solar.R Wind Temp
1     5   1    41     190  7.4   67
2     6   1    NA     286  8.6   78
3     7   1   135     269  4.1   84
4     8   1    39      83  6.9   81
5     9   1    96     167  6.9   91

For your data, the code would look like this:

aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
merge(aggregated, data)

I recommend that you use a name other than data for your data frame, since data() is a built-in R function.

Jean
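To see how the aggregate-then-merge approach behaves on weather-like data, here is a small sketch with made-up values. The data frame weather, its Temp column, and noon_unique are hypothetical names, not from the thread. One thing it illustrates: when two observations are equally far from noon, merge() returns both rows for that day, so an extra de-duplication step may be wanted. Whether this is the issue Sean ran into is not stated in the thread.

```r
# Hypothetical toy data; times are chosen to be exactly representable so
# that equality matching on Diff_from_noon is exact.
weather <- data.frame(
  Date = rep(c("2011-12-01", "2011-12-02"), each = 3),
  Time = c(0.25, 0.5, 0.75, 0.125, 0.375, 0.625),
  Temp = c(5, 9, 7, 2, 8, 4),
  stringsAsFactors = FALSE
)
weather$Diff_from_noon <- abs(0.5 - weather$Time)

# Smallest distance from noon for each day
aggregated <- aggregate(Diff_from_noon ~ Date, weather, FUN = min)

# merge() joins on the shared columns (Date and Diff_from_noon), which
# brings back every other column of the matching rows
noon <- merge(aggregated, weather)
print(noon)

# The second day has a tie (0.375 and 0.625 are both 0.125 from noon),
# so it contributes two rows; keep the first row per Date if one is enough
noon_unique <- noon[!duplicated(noon$Date), ]
print(noon_unique)
```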
[R] Trouble converting hourly data into daily data
Hello,

I have a data frame with hourly or sub-hourly weather records that span several years, and from that data frame I'm trying to select only the records taken closest to noon for each day. Here's what I've done so far:

#Add a column to the data frame showing the difference between noon and the observation time (I converted time to a 0-1 scale so 0.5 represents noon):
data$Diff_from_noon <- abs(0.5-data$Time)

#Find the minimum value of Diff_from_noon for each Date:
aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)

The problem is that the aggregated data frame only has two columns: Date and Diff_from_noon. I can't figure out how to get the columns with the actual weather variables to carry over from the original data frame.

Any suggestions you have would be much appreciated.

Thanks,
Sean
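The situation described in this post can be reproduced on a few made-up rows. Everything specific here is an assumption: the data frame name wx, the Clock, Temp and Wind columns, and the "HH:MM" clock strings, since the thread does not show the real data or how time was rescaled. The sketch converts clock time to a 0-1 day fraction and shows that aggregate() with this formula returns only the grouping column and the summarized column.

```r
# Hypothetical toy data with two weather variables
wx <- data.frame(
  Date  = rep(c("2011-12-01", "2011-12-02"), each = 3),
  Clock = c("06:00", "12:00", "18:00", "03:00", "09:00", "15:00"),
  Temp  = c(5, 9, 7, 2, 8, 4),
  Wind  = c(3.1, 4.0, 2.2, 5.5, 1.8, 2.9),
  stringsAsFactors = FALSE
)

# Convert "HH:MM" to a fraction of the day, so 0.5 represents noon
parts   <- strsplit(wx$Clock, ":", fixed = TRUE)
hours   <- as.numeric(sapply(parts, `[`, 1))
minutes <- as.numeric(sapply(parts, `[`, 2))
wx$Time <- (hours * 60 + minutes) / (24 * 60)

# Distance from noon, then the per-day minimum
wx$Diff_from_noon <- abs(0.5 - wx$Time)
aggregated <- aggregate(Diff_from_noon ~ Date, wx, FUN = min)

# Only the two columns named in the formula survive; Temp and Wind do not
print(names(aggregated))
```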