Hey all; I'm a beginner++ user of R, trying to use it to process data sets of over 1M rows, and running into a snafu. Imagine that my input is a huge table of transactions, each linked to a specific user id. As I run through the transactions, I need to update a separate table of users, but I'm finding that the traditional ways of doing a table lookup are far too slow to support this kind of operation.
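To make the question reproducible (per the posting guide), here is a toy version of my setup; the column names match my real data, but the sizes and values are made up:

    set.seed(1)
    n.trans <- 1000000   # ~1M transactions
    n.users <- 50000     # tens of thousands of users
    users <- data.frame(id = seq_len(n.users), amt = 0)
    transactions <- data.frame(
        userid  = sample(users$id, n.trans, replace = TRUE),
        amounts = runif(n.trans, 0, 100)
    )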
i.e.:

    for (i in 1:1000000) {
        userid <- transactions$userid[i]
        amt <- transactions$amounts[i]
        users[users$id == userid, "amt"] <- users[users$id == userid, "amt"] + amt
    }

I assume the subsetting here is a linear scan through the users table (which has tens of thousands of rows), when what I really need is O(1), or at worst O(log(# users)). Is there any way to manage a list of IDs (be they numeric, string, etc.) and have them efficiently mapped to some other table's index? I see the CRAN package for SQLite hashes, but that seems to be going a bit too far.

thanks,
Mike
Intern, Oyster Card Group, Transport for London
(feel free to email back to this address; I'm posting through Nabble, so I hope it works)
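P.S. To show the kind of keyed lookup I'm imagining, here is a sketch using an R environment as a hash table; I gather new.env(hash = TRUE) hashes on names, but I don't know whether this is the idiomatic tool (or actually faster), so please treat it as a guess rather than a benchmark. The name amt.by.user is just something I made up for the sketch:

    # hypothetical hashed lookup: an environment keyed by user id
    amt.by.user <- new.env(hash = TRUE, size = nrow(users))
    for (i in seq_len(nrow(transactions))) {
        key <- as.character(transactions$userid[i])
        prev <- if (exists(key, envir = amt.by.user, inherits = FALSE))
                    get(key, envir = amt.by.user) else 0
        assign(key, prev + transactions$amounts[i], envir = amt.by.user)
    }
    # copy the accumulated totals back into the users table
    users$amt <- unlist(mget(as.character(users$id), envir = amt.by.user,
                             ifnotfound = 0))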