Hi all,

Slightly less dense question (hopefully). In the code below I have two versions of the same function: one uses operator[] and the other uses iterators. Following the Rcpp introduction, I had expected the iterator version to be substantially faster, but I'm only seeing a minor improvement (~10%). Why doesn't using iterators help me much here? Possible explanations:

* I'm using iterators incorrectly in my code

* Iterators help most when the vector access is sequential, and here the
  counts index is bouncing all over the place, so I shouldn't expect much
  improvement.

Any ideas would be much appreciated. Thanks!

Hadley

library(inline)

count_bin <- cxxfunction(signature(x = "numeric", binwidth = "numeric",
  origin = "numeric", nbins = "integer"), '
  int nbins_ = as<int>(nbins);
  double binwidth_ = as<double>(binwidth);
  double origin_ = as<double>(origin);

  Rcpp::NumericVector counts(nbins_);
  Rcpp::NumericVector x_(x);
  int n = x_.size();

  for(int i = 0; i < n; i++) {
    counts[(int) ((x_[i] - origin_) / binwidth_)]++;
  }

  return counts;
', plugin = "Rcpp")

count_bini <- cxxfunction(signature(x = "numeric", binwidth = "numeric",
  origin = "numeric", nbins = "integer"), '
  int nbins_ = as<int>(nbins);
  double binwidth_ = as<double>(binwidth);
  double origin_ = as<double>(origin);

  Rcpp::NumericVector counts(nbins_);
  Rcpp::NumericVector x_(x);
  int n = x_.size();

  Rcpp::NumericVector::iterator x_i = x_.begin();
  Rcpp::NumericVector::iterator counts_i = counts.begin();

  for(int i = 0; i < n; i++) {
    counts_i[(int) ((x_i[i] - origin_) / binwidth_)]++;
  }

  return counts;
', plugin = "Rcpp")

x <- rnorm(1e7, sd = 3)
origin <- min(x)
binwidth <- 1
n <- ceiling((max(x) - origin) / binwidth)

system.time(y1 <- count_bin(x, binwidth, origin, nbins = n))
system.time(y2 <- count_bini(x, binwidth, origin, nbins = n))
all.equal(y1, y2)

library(microbenchmark)
microbenchmark(
  operator = count_bin(x, binwidth, origin, nbins = n),
  iterator = count_bini(x, binwidth, origin, nbins = n)
)

# The real reason I'm exploring this is as a more efficient version
# of tabulate for doing equal bin counts. The Rcpp version is about 10x
# faster, mainly (I think) because it avoids creating a modified copy of the
# vector
system.time(y3 <- tabulate((x - origin) / binwidth + 1, nbins = n))
all.equal(y1, y3)

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
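
For anyone who wants to poke at the second explanation outside of R, here is a minimal standalone sketch of the same binning loop using plain std::vector instead of Rcpp. The function names count_bin_index and count_bin_iter are hypothetical, and the bounds guard is an addition for the sketch, not something the code above does. It only shows that walking x with an iterator leaves the write into counts as a computed, random-access index in both versions; it makes no claim about timings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indexed access, mirroring the operator[] loop.
std::vector<double> count_bin_index(const std::vector<double>& x,
                                    double binwidth, double origin,
                                    std::size_t nbins) {
  assert(binwidth > 0.0);  // precondition assumed for this sketch
  std::vector<double> counts(nbins, 0.0);
  for (std::size_t i = 0; i < x.size(); ++i) {
    double pos = (x[i] - origin) / binwidth;
    // Bounds guard (an addition): skip values that fall outside the bins.
    if (pos >= 0.0 && pos < static_cast<double>(nbins)) {
      counts[static_cast<std::size_t>(pos)] += 1.0;
    }
  }
  return counts;
}

// Hypothetical helper: x is read sequentially through an iterator,
// but counts is still written through a computed (random-access) index.
std::vector<double> count_bin_iter(const std::vector<double>& x,
                                   double binwidth, double origin,
                                   std::size_t nbins) {
  assert(binwidth > 0.0);  // precondition assumed for this sketch
  std::vector<double> counts(nbins, 0.0);
  for (std::vector<double>::const_iterator it = x.begin(); it != x.end(); ++it) {
    double pos = (*it - origin) / binwidth;
    if (pos >= 0.0 && pos < static_cast<double>(nbins)) {
      counts[static_cast<std::size_t>(pos)] += 1.0;
    }
  }
  return counts;
}
```

Both functions should return identical counts for the same input; the only difference is how x is traversed, which is the part of the loop that was already sequential.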