I think this does what you want. It uses 'findInterval' to locate, for each missing value, the garmin record at or just before it in time:

> myvscan<-data.frame(c(1,NA,1.5),as.POSIXct(c("12:00:00","12:14:00","12:20:00"), format="%H:%M:%S"))
> # convert to numeric
> names(myvscan)<-c("Latitude","DateTime")
> myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval
> mygarmin<-data.frame(c(20,30,40),as.POSIXct(c("12:00:00","12:10:00","12:15:00"), format="%H:%M:%S"))
> names(mygarmin)<-c("Latitude","DateTime")
> mygarmin$tn <- as.numeric(mygarmin$DateTime)
>
> # use 'findInterval'
> na.indx <- which(is.na(myvscan$Latitude))  # find NAs
> # replace with garmin latitude
> myvscan$Latitude[na.indx] <- mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
>
>
> myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-22 12:00:00 1243008000
2     30.0 2009-05-22 12:14:00 1243008840
3      1.5 2009-05-22 12:20:00 1243009200
>

On Fri, May 22, 2009 at 12:45 AM, Tim Clark <mudiver1...@yahoo.com> wrote:

>
> Dear List,
>
> I need some help in coming up with a function that will take two data sets,
> determine if a value is missing in one, find a value in the second that was
> taken at about the same time, and substitute the second value in for where
> the first should have been. My problem is from a fish tracking study. We
> put acoustic tags in fish and track them for several days. Location data is
> supposed to be automatically recorded every time we detect a "ping" from the
> fish. Unfortunately the GPS had some problems and sometimes the fish's
> depth was recorded but not its location. I fortunately had a back-up GPS
> that was taking location data every five minutes. I would like to merge the
> two files, replacing the missing value in the vscan (automatic) file with
> the location from the garmin file. Since we were getting vscan records
> every 1-2 seconds and garmin records every 5 minutes, I need to find the
> right place in the vscan file to place the garmin record - i.e. the
> closest in time, but not greater than 5 minutes. I have written a
> function that does this. However, it works with my test data but locks up my
> computer with my real data. I have several million vscan records and
> several thousand garmin records. Is there a better way to do this?
>
>
> My function and test data:
>
> myvscan<-data.frame(c(1,NA,1.5),times(c("12:00:00","12:14:00","12:20:00")))
> names(myvscan)<-c("Latitude","DateTime")
> mygarmin<-data.frame(c(20,30,40),times(c("12:00:00","12:10:00","12:15:00")))
> names(mygarmin)<-c("Latitude","DateTime")
>
> minute.diff<-1/24/12   #Time diff is in days, so this is 5 minutes
> for (k in 1:nrow(myvscan))
> {
>   if (is.na(myvscan$Latitude[k]))
>   {
>     if ((min(abs(mygarmin$DateTime-myvscan$DateTime[k]))) < minute.diff )
>     {
>       index.min.date<-which.min(abs(mygarmin$DateTime-myvscan$DateTime[k]))
>       myvscan$Latitude[k]<-mygarmin$Latitude[index.min.date]
>     }}}
>
> I appreciate your help and advice.
>
> Aloha,
>
> Tim
>
>
>
>
> Tim Clark
> Department of Zoology
> University of Hawaii
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
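
One caveat on the findInterval approach above: it picks the most recent garmin record at or before each vscan time, which is not necessarily the nearest one, and it does not enforce the five-minute limit from the original question. It also returns 0 when a vscan time falls before the first garmin record, which would break the assignment. A minimal sketch of one way to cover those cases, still vectorized, is below. The helper name fill_nearest and its max.gap argument are made up for illustration, and it assumes the garmin times are sorted in increasing order and that times are plain numeric seconds.

```r
# Hypothetical helper (not from the original thread): fill missing vscan
# latitudes with the garmin latitude nearest in time, but only when that
# record is within max.gap seconds. Assumes garmin.t is sorted increasing,
# as findInterval requires.
fill_nearest <- function(vscan.lat, vscan.t, garmin.lat, garmin.t, max.gap = 300) {
  na.indx <- which(is.na(vscan.lat))
  if (length(na.indx) == 0 || length(garmin.t) == 0) return(vscan.lat)

  t  <- vscan.t[na.indx]
  n  <- length(garmin.t)
  lo <- findInterval(t, garmin.t)   # last garmin record at or before t (0 if none)

  before <- pmax(lo, 1)             # clamp both candidates into 1..n
  after  <- pmin(lo + 1, n)

  # pick whichever neighbour is closer in time
  use.after <- abs(garmin.t[after] - t) < abs(t - garmin.t[before])
  nearest   <- ifelse(use.after, after, before)

  # only substitute when the nearest record is close enough
  ok <- abs(garmin.t[nearest] - t) <= max.gap
  vscan.lat[na.indx[ok]] <- garmin.lat[nearest[ok]]
  vscan.lat
}

# Same test data as in the thread, with times as seconds since midnight
vscan.t  <- c(12*3600, 12*3600 + 14*60, 12*3600 + 20*60)
garmin.t <- c(12*3600, 12*3600 + 10*60, 12*3600 + 15*60)
filled   <- fill_nearest(c(1, NA, 1.5), vscan.t, c(20, 30, 40), garmin.t)
```

On this test data the 12:14 gap is filled from the 12:15 garmin record (one minute away) rather than the 12:10 record, which matches what the original loop would choose. Whether "nearest" or "most recent" is the right rule depends on how the fish positions are meant to be interpreted.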