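Don't include the key setting within the benchmark. benDTWithKey() calls setkey(dt) on every replication, so the time spent sorting the table is measured along with the grouping.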
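A minimal sketch of what that could look like, reusing the data and column names from the quoted post. The table is built (and, for the keyed copy, sorted with setkey()) once, outside the timed call, so only the grouping itself is measured. The names dtNoKey, dtKey and sumByFile are made up for illustration, the data size is reduced for a quick run, and base system.time() stands in for rbenchmark. Whether the keyed table actually comes out faster will depend on the data and the data.table version.

```r
library(data.table)

# Same shape of data as in the quoted post, but smaller
# (assumption: sizes reduced so the sketch runs quickly)
num.files <- 200
num.words <- 100000
set.seed(1)
logical.vector <- sample(c(TRUE, FALSE), num.words, replace = TRUE)
file.names <- rep(1:num.files, length.out = num.words)

# Build both tables once, outside anything that is timed
dtNoKey <- data.table(V1 = as.numeric(logical.vector), bb = file.names)
dtKey <- copy(dtNoKey)
setkey(dtKey, bb)  # the sort happens here, not inside the benchmark

# Hypothetical helper: the grouping step only
sumByFile <- function(dt) dt[, sum(V1), by = bb][, V1]

# Time only the grouping, 10 replications each
system.time(for (i in 1:10) sumByFile(dtNoKey))
system.time(for (i in 1:10) sumByFile(dtKey))

# Results for comparison against tapply()
resNoKey <- sumByFile(dtNoKey)
resKey <- sumByFile(dtKey)
resTapply <- as.numeric(tapply(logical.vector, file.names, sum))
```

To keep using rbenchmark, the same idea applies: build the tables first, then pass sumByFile(dtKey) and sumByFile(dtNoKey) to benchmark().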
________________________________________
From: [email protected] [[email protected]] on behalf of ekbrown [[email protected]]
Sent: Friday, 22 March 2013 1:39 PM
To: [email protected]
Subject: [datatable-help] Quicker w/o keys set

Hello. I'm new to data.table(). I am apparently not setting the keys correctly to get the increase in speed talked about in the vignettes, as I get a (much) quicker time *without* keys set. Take a look at the following benchmarking tests. Any ideas? Thanks. Earl Brown

> library("data.table")
> library("rbenchmark")
>
> # generates random data
> num.files <- 2000
> num.words <- 1000000
> logical.vector <- sample(c(TRUE, FALSE), num.words, replace=T)
> file.names <- rep(1:num.files, length.out=num.words)
>
> # defines functions
> benDTNoKey <- function(aa, bb) {
+   dt <- data.table(as.numeric(aa), bb)
+   dt[,sum(V1), by = bb][,V1]
+ }
>
> benDTWithKey <- function(aa, bb) {
+   dt <- data.table(as.numeric(aa), bb)
+   setkey(dt)
+   dt[,sum(V1), by = bb][,V1]
+ }
>
> benTapply <- function(aa, bb) tapply(aa, bb, sum)
>
> # runs benchmarking
> benchmark(benTapply(logical.vector, file.names),
> benDTWithKey(logical.vector, file.names), benDTNoKey(logical.vector,
> file.names), replications = 10, columns = c("test", "replications",
> "elapsed"))
                                      test replications elapsed
3   benDTNoKey(logical.vector, file.names)           10 *0.753*
2 benDTWithKey(logical.vector, file.names)           10 *4.776*
1    benTapply(logical.vector, file.names)           10   6.218
>
> # tests for sameness among results
> one <- benTapply(logical.vector, file.names)
> two <- benDTWithKey(logical.vector, file.names)
> three <- benDTNoKey(logical.vector, file.names)
> identical(as.integer(one), as.integer(two))
[1] TRUE
> identical(as.integer(two), as.integer(three))
[1] TRUE

--
View this message in context: http://r.789695.n4.nabble.com/Quicker-w-o-keys-set-tp4662157.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
