On 7 January 2012 at 11:40, baptiste auguie wrote: | Hi, | | (Rcpp)Armadillo has a 1D convolution function < | http://arma.sourceforge.net/docs.html#conv >. It might be a very | inefficient way, to use it sequentially for 2D; I haven't tried.
Good point. Off-list, I already suggested to Hadley to look at what the image processing folks have done. And as Conrad is now proud author of several published papers in the area (kudos!), he may have a pointers. Conrad? Dirk | HTH, | | baptiste | | | | On 7 January 2012 07:43, Hadley Wickham <had...@rice.edu> wrote: | > Hi all, | > | > The rcpp-devel list has been very helpful to me so far, so I hope you | > don't mind another question! I'm trying to speed up my 2d convolution | > function: | > | > | > | > library(inline) | > # 2d convolution ------------------------------------------------------------- | > | > convolve_2d <- cxxfunction(signature(sampleS = "numeric", kernelS = | > "numeric"), plugin = "Rcpp", ' | > Rcpp::NumericMatrix sample(sampleS), kernel(kernelS); | > int x_s = sample.nrow(), x_k = kernel.nrow(); | > int y_s = sample.ncol(), y_k = kernel.ncol(); | > | > Rcpp::NumericMatrix output(x_s + x_k - 1, y_s + y_k - 1); | > for (int row = 0; row < x_s; row++) { | > for (int col = 0; col < y_s; col++) { | > for (int i = 0; i < x_k; i++) { | > for (int j = 0; j < y_k; j++) { | > output(row + i, col + j) += sample(row, col) * kernel(i, j); | > } | > } | > } | > } | > return output; | > ') | > | > | > x <- diag(1000) | > k <- matrix(runif(20* 20), ncol = 20) | > system.time(convolve_2d(x, k)) | > # user system elapsed | > # 14.759 0.028 15.524 | > | > I have a vague idea that to get better performance I need to get | > closer to bare pointers, and I need to be careful about the order of | > my loops to ensure that I'm working over contiguous chunks of memory | > as often as possible, but otherwise I've basically exhausted my | > C++/Rcpp knowledge. Where should I start looking to improve the | > performance of this function? | > | > The example data basically matches the real problem - x is not usually | > going to be much bigger than 1000 x 1000 and k typically will be much | > smaller. (And hence, based on what I've read, this technique should | > be faster than doing it via a discrete fft) | > | > Hadley | > | > -- | > Assistant Professor / Dobelman Family Junior Chair | > Department of Statistics / Rice University | > http://had.co.nz/ | > _______________________________________________ | > Rcpp-devel mailing list | > Rcpp-devel@lists.r-forge.r-project.org | > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel | _______________________________________________ | Rcpp-devel mailing list | Rcpp-devel@lists.r-forge.r-project.org | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- "Outside of a dog, a book is a man's best friend. Inside of a dog, it is too dark to read." -- Groucho Marx _______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel