Hi all,

The rcpp-devel list has been very helpful to me so far, so I hope you don't mind another question! I'm trying to speed up my 2d convolution function:
library(inline)

# 2d convolution -------------------------------------------------------------
convolve_2d <- cxxfunction(signature(sampleS = "numeric", kernelS = "numeric"),
                           plugin = "Rcpp", '
    Rcpp::NumericMatrix sample(sampleS), kernel(kernelS);
    int x_s = sample.nrow(), x_k = kernel.nrow();
    int y_s = sample.ncol(), y_k = kernel.ncol();

    Rcpp::NumericMatrix output(x_s + x_k - 1, y_s + y_k - 1);

    for (int row = 0; row < x_s; row++) {
      for (int col = 0; col < y_s; col++) {
        for (int i = 0; i < x_k; i++) {
          for (int j = 0; j < y_k; j++) {
            output(row + i, col + j) += sample(row, col) * kernel(i, j);
          }
        }
      }
    }
    return output;
')

x <- diag(1000)
k <- matrix(runif(20 * 20), ncol = 20)
system.time(convolve_2d(x, k))
#   user  system elapsed
# 14.759   0.028  15.524

I have a vague idea that to get better performance I need to get closer to bare pointers, and I need to be careful about the order of my loops to ensure that I'm working over contiguous chunks of memory as often as possible, but otherwise I've basically exhausted my C++/Rcpp knowledge. Where should I start looking to improve the performance of this function?

The example data basically matches the real problem: x is not usually going to be much bigger than 1000 x 1000, and k will typically be much smaller. (And hence, based on what I've read, this technique should be faster than doing it via a discrete fft.)

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
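
P.S. To make the "bare pointers + loop order" idea a bit more concrete, here is a rough, untested sketch of the direction I had in mind (the convolve_2d_ptr name and this particular loop nesting are just guesses on my part, not something I have benchmarked). Since R matrices are column-major, the row index becomes the innermost loop so the inner pass walks down contiguous columns:

convolve_2d_ptr <- cxxfunction(signature(sampleS = "numeric", kernelS = "numeric"),
                               plugin = "Rcpp", '
    // untested sketch: same arithmetic as convolve_2d, reordered loops + raw pointers
    Rcpp::NumericMatrix sample(sampleS), kernel(kernelS);
    int x_s = sample.nrow(), x_k = kernel.nrow();
    int y_s = sample.ncol(), y_k = kernel.ncol();
    int x_o = x_s + x_k - 1;
    Rcpp::NumericMatrix output(x_o, y_s + y_k - 1);

    double* samp = sample.begin();
    double* kern = kernel.begin();
    double* out  = output.begin();

    // column-major layout: keep the row index innermost so each inner pass
    // walks down a contiguous column of both sample and output
    for (int col = 0; col < y_s; col++) {
      for (int j = 0; j < y_k; j++) {
        double* out_col  = out  + (col + j) * x_o;
        double* samp_col = samp + col * x_s;
        double* kern_col = kern + j * x_k;
        for (int i = 0; i < x_k; i++) {
          double k_ij = kern_col[i];
          for (int row = 0; row < x_s; row++) {
            out_col[row + i] += samp_col[row] * k_ij;
          }
        }
      }
    }
    return output;
')

It should produce the same output as convolve_2d(); whether it is actually faster on the 1000 x 1000 case above is exactly what I'm unsure about.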
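
P.P.S. For reference, the discrete fft route I was alluding to would be something like the following (again just an illustrative sketch; convolve_2d_fft is a name I'm making up here). It zero-pads both matrices to the full output size and applies the convolution theorem:

convolve_2d_fft <- function(x, k) {
  nr <- nrow(x) + nrow(k) - 1
  nc <- ncol(x) + ncol(k) - 1
  pad <- function(m) {
    out <- matrix(0, nr, nc)
    out[seq_len(nrow(m)), seq_len(ncol(m))] <- m
    out
  }
  # convolution theorem: multiply the 2d ffts of the zero-padded inputs,
  # invert, and rescale (R's fft() is unnormalised)
  Re(fft(fft(pad(x)) * fft(pad(k)), inverse = TRUE)) / (nr * nc)
}

all.equal(convolve_2d(x, k), convolve_2d_fft(x, k)) should be TRUE up to floating-point error, which would at least make the timing comparison easy to run.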