Hello! I have a loop-based solution for my task, but it is too slow for my real-life problem, which is much larger in scope. I cannot use merge. Any advice on how to speed it up? Thanks a lot for any hints!
# I have a 'mydata' data frame:
set.seed(123)
mydata <- data.frame(myid = 1001:1100,
                     version = sample(1:20, 100, replace = TRUE))
head(mydata)
table(mydata$version)

# I have a 'myinfo' data frame that contains information for each 'version':
set.seed(12)
myinfo <- data.frame(version = sort(rep(1:20, 30)),
                     a = rnorm(60), b = rnorm(60), c = rnorm(60), d = rnorm(60))
head(myinfo, 40)

### MY SOLUTION WITH A LOOP:
### Looping through each id of mydata and grabbing
### all columns from 'myinfo' for the corresponding 'version':

# 1. Creating a placeholder list for the results:
result <- split(mydata[c("myid", "version")], f = list(mydata$myid))
length(result)
(result)[1:3]

# 2. Looping through each element of 'result':
for(i in seq_along(result)){
  id <- result[[i]]$myid
  result[[i]] <- myinfo[myinfo$version == result[[i]]$version, ]
  result[[i]]$myid <- id
  # Put 'myid' first: it is the 6th column once appended to the 5 columns of 'myinfo'
  result[[i]] <- result[[i]][c(6, 1:5)]
}
result <- do.call(rbind, result)
head(result) # This is the desired result

--
Dimitri Liakhovitski
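
One possible direction that avoids both merge and the per-id loop is to group the row numbers of 'myinfo' by version once, then pull all needed rows in a single indexing step. This is only a minimal sketch on the toy data above, not a tested solution for the real problem; the names 'rows_by_version', 'idx' and 'fast' are made up for illustration, and it assumes 'myid' should be the first column of the result.

```r
set.seed(123)
mydata <- data.frame(myid = 1001:1100,
                     version = sample(1:20, 100, replace = TRUE))
set.seed(12)
myinfo <- data.frame(version = sort(rep(1:20, 30)),
                     a = rnorm(60), b = rnorm(60), c = rnorm(60), d = rnorm(60))

# Row numbers of 'myinfo' grouped by version (computed once, not per id).
rows_by_version <- split(seq_len(nrow(myinfo)), myinfo$version)

# For each row of 'mydata', the 'myinfo' row numbers of its version.
# Lookup is by name, so versions are converted to character.
idx <- rows_by_version[as.character(mydata$version)]

# Repeat each id once per matching row and extract all rows in one step.
fast <- cbind(myid = rep(mydata$myid, lengths(idx)),
              myinfo[unlist(idx, use.names = FALSE), ])
rownames(fast) <- NULL
head(fast)
```

The expensive part of the loop is creating and then rbind-ing many small data frames; here the data frame is subset only once, so the cost should grow much more slowly with the number of ids.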