I just saw that I left a syntax error in the .R file and the first _Rout.txt file. Notice that in the second _Rout.txt file the arguments to the MMatrixXd and MVectorXd constructors are in a different order than in the .R and the first _Rout.txt files. The correct order has the pointer first, then the dimensions. In the first _Rout.txt file this part of the code is not used.
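For reference, assuming MMatrixXd and MVectorXd are typedefs for Eigen::Map<Eigen::MatrixXd> and Eigen::Map<Eigen::VectorXd> (the attached files are not reproduced here, and the SEXP names AA and xx below are illustrative), the corrected constructor calls would look like:

---------------------------------------
// pointer first, then the dimension(s)
const MMatrixXd A(REAL(AA), Rf_nrows(AA), Rf_ncols(AA));
const MVectorXd x(REAL(xx), Rf_length(xx));
---------------------------------------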
On Tue, Jul 19, 2011 at 10:00 AM, Douglas Bates <ba...@stat.wisc.edu> wrote:
> On Thu, Jul 14, 2011 at 10:21 AM, Alireza Mahani
> <alireza.s.mah...@gmail.com> wrote:
>> (I am using a Linux machine.)
>>
>> Jeff,
>>
>> In creating reproducible results, I 'partially' answered my question. I
>> have attached two scripts, 'mvMultiply.r' and 'mvMultiply.cc'. Please
>> copy both files into your chosen directory, then run 'Rscript
>> mvMultiply.r' in that directory while changing the two boolean
>> parameters 'INCLUDE_DATAPREP' and 'ROWMAJOR' through all four
>> permutations. (The variable 'diffVec' is there to verify that the two
>> methods produce the same output vector.)
>>
>> Below are the results that I get, along with discussion (tR and tCall
>> are in sec):
>>
>> INCLUDE_DATAPREP,ROWMAJOR,tR,tCall
>> F,F,13.536,13.875
>> F,T,13.824,14.299
>> T,F,13.688,18.167
>> T,T,13.982,30.730
>>
>> Interpretation: The execution time for the .Call line is nearly
>> identical to that of R's '%*%' operator. Two data-preparation lines for
>> the .Call method create the overhead:
>>
>> A <- t(A)                        (~12 sec, or 12 usec per call)
>> dim(A) <- dim(A)[1] * dim(A)[2]  (~4 sec, or 4 usec per call)
>>
>> While the first line can be avoided by providing options in the C++
>> function (as is done in the BLAS API), I wonder whether the second line
>> can be avoided, aside from the obvious option of rewriting the R
>> scripts to use vectors instead of matrices. That, however, defeats one
>> of the primary advantages of using R, which is succinct, high-level
>> coding. In particular, if one has several matrices as input to a .Call
>> function, the overhead from matrix-to-vector transformations can add
>> up. To summarize, my questions are:
>>
>> 1- Do the above results seem reasonable to you? Is there a similar
>> penalty in R's '%*%' operator for transforming matrices to vectors
>> before calling BLAS functions?
>> 2- Are there techniques for reducing the above overhead for developers
>> looking to augment their R code with compiled code?
>>
>> Regards,
>> Alireza
>>
>> ---------------------------------------
>> # mvMultiply.r
>> # Compare the performance of matrix multiplication in R (using the
>> # '%*%' operator) vs. calling compiled code (using the .Call function):
>> # y [m x 1] = A [m x n] %*% x [n x 1]
>>
>> rm(list = ls())
>> system("R CMD SHLIB mvMultiply.cc")
>> dyn.load("mvMultiply.so")
>>
>> INCLUDE_DATAPREP <- F
>> # whether the C++ function treats A in a row-major or column-major fashion
>> ROWMAJOR <- F
>>
>> m <- 100
>> n <- 10
>> N <- 1000000
>>
>> diffVec <- array(0, dim = N)
>> tR <- 0.0
>> tCall <- 0.0
>> for (i in 1:N) {
>>   A <- runif(m*n); dim(A) <- c(m,n)
>>   x <- runif(n)
>>
>>   t1 <- proc.time()[3]
>>   y1 <- A %*% x
>>   tR <- tR + proc.time()[3] - t1
>>
>>   if (INCLUDE_DATAPREP) {
>>     t1 <- proc.time()[3]
>>   }
>>   if (ROWMAJOR) {
>>     # R converts a matrix to a vector in column-major order, so if the
>>     # C++ function expects row-major format we must transpose A before
>>     # converting it to a vector
>>     A <- t(A)
>>   }
>>   dim(A) <- dim(A)[1] * dim(A)[2]
>>   if (!INCLUDE_DATAPREP) {
>>     t1 <- proc.time()[3]
>>   }
>>   y2 <- .Call("matvecMultiply", as.double(A), as.double(x),
>>               as.logical(c(ROWMAJOR)))
>>   tCall <- tCall + proc.time()[3] - t1
>>
>>   diffVec[i] <- max(abs(y2 - y1))
>> }
>> cat("Data prep time for '.Call' included: ", INCLUDE_DATAPREP, "\n")
>> cat("C++ function expects row-major matrix: ", ROWMAJOR, "\n")
>> cat("Time - Using '%*%' operator in R: ", tR, "sec\n")
>> cat("Time - Using '.Call' function: ", tCall, "sec\n")
>> cat("Maximum difference between methods: ", max(diffVec), "\n")
>>
>> dyn.unload("mvMultiply.so")
>> ---------------------------------------
>> // mvMultiply.cc
>> #include <Rinternals.h>
>> #include <R.h>
>>
>> extern "C" {
>>
>> SEXP matvecMultiply(SEXP A, SEXP x, SEXP rowmajor) {
>>   double *rA = REAL(A), *rx = REAL(x), *ry;
>>   int *rrm = LOGICAL(rowmajor);
>>   int n = length(x);
>>   int m = length(A) / n;
>>   SEXP y;
>>   PROTECT(y = allocVector(REALSXP, m));
>>   ry = REAL(y);
>>   for (int i = 0; i < m; i++) {
>>     ry[i] = 0.0;
>>     for (int j = 0; j < n; j++) {
>>       if (rrm[0] == 1) {
>>         ry[i] += rA[i * n + j] * rx[j];  /* row-major indexing */
>>       } else {
>>         ry[i] += rA[j * m + i] * rx[j];  /* column-major (R's layout) */
>>       }
>>     }
>>   }
>>   UNPROTECT(1);
>>   return y;
>> }
>>
>> } // extern "C"
>
> I realize that you are just beginning to use the .C and .Call
> interfaces and your example is therefore a simple one. However, if you
> plan to continue with such development it is worthwhile learning about
> some of the tools available. I think one of the most important is the
> "inline" package, which can take a C or C++ code segment as a text
> string and go through all the steps of creating and loading a
> .Call'able compiled function.
>
> Second, if you are going to use C++ (the code you show could be C, as
> it doesn't use any C++ extensions), you should look at the Rcpp package
> written by Dirk Eddelbuettel and Romain Francois, which allows for
> comparatively painless interfacing of R objects and C++ objects. The
> Rcpp-devel list, which I have copied on this reply, is for questions
> related to that system. The inline package allows for various "plugin"
> constructions that wrap your code in the appropriate headers and point
> the compiler to the locations of header files and libraries. There are
> two extensions to Rcpp for numerical linear algebra in C++,
> RcppArmadillo and RcppEigen. I show the use of RcppEigen here.
>
> Third, there are several packages in R that do the busy work of
> benchmarking expressions and neatly formatting the results. I use the
> rbenchmark package.
>
> Putting all these together yields the enclosed script and results.
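The enclosed script and results are not reproduced in this archive. Purely as an illustrative sketch of how the three tools just mentioned fit together (the function name 'mvProd' and its signature are hypothetical, not taken from the enclosure; the body uses the intermediate-Rcpp-object style for released versions), the workflow might look like this:

---------------------------------------
library(inline)      # cxxfunction(): compile and load C++ from a string
library(RcppEigen)   # supplies Eigen headers via an inline plugin
library(rbenchmark)  # benchmark(): timings with neatly formatted output

src <- '
// intermediate Rcpp objects, as used with the released versions of
// Rcpp and RcppEigen; they also check the types of the R arguments
Rcpp::NumericMatrix Ar(AA);
Rcpp::NumericVector xr(xx);
// mapped Eigen objects reuse the storage R allocated (no copies)
const Eigen::Map<Eigen::MatrixXd> A(Ar.begin(), Ar.nrow(), Ar.ncol());
const Eigen::Map<Eigen::VectorXd> x(xr.begin(), xr.size());
Eigen::VectorXd y = A * x;   // Eigen checks dimension consistency
return Rcpp::wrap(y);
'
mvProd <- cxxfunction(signature(AA = "matrix", xx = "numeric"),
                      src, plugin = "RcppEigen")

A <- matrix(runif(100 * 10), 100, 10)
x <- runif(10)
stopifnot(all.equal(c(A %*% x), c(mvProd(A, x))))

benchmark("%*%" = A %*% x, RcppEigen = mvProd(A, x),
          replications = 100000,
          columns = c("test", "replications", "elapsed", "relative"))
---------------------------------------

Note that A is passed to the compiled function as a matrix: neither the transpose nor the dim() stripping from mvMultiply.r is needed.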
>
> In Eigen, a MatrixXd object is the equivalent of R's numeric matrix
> (similarly MatrixXi for integer and MatrixXcd for complex) and a
> VectorXd object is the equivalent of a numeric vector. A "mapped"
> matrix or vector is one that uses the storage allocated by R, thereby
> avoiding a copy operation (similar to your accessing elements of the
> arrays through the pointer returned by REAL()). To adhere to R's
> functional programming semantics it is a good idea to declare such
> objects as const. The 'as' and 'wrap' functions are provided by Rcpp,
> with extensions to the Eigen classes in the development version of
> RcppEigen; in the released versions of Rcpp and RcppEigen we use
> intermediate Rcpp objects. These functions have the advantage of
> checking the types of the R objects being passed. The Eigen code for
> matrix multiplication will also check the consistency of the dimensions
> in the operation.
>
> Given the size of the matrix you are working with, it is not
> surprising that interpretation overhead and checking make up a large
> part of the elapsed time; hence the relative differences between the
> various methods of doing the numerical calculation will be small. The
> operation of multiplying a 100 x 10 matrix by a 10-vector involves
> "only" 1000 floating point operations. Furthermore, each element of
> the matrix is used only once, so sophisticated methods of manipulating
> cache contents won't buy you much. These benchmark results are from a
> system that uses the Atlas BLAS (basic linear algebra subroutines);
> other systems will give different results. Interestingly, I found that
> on some systems using R's own BLAS, which is not accelerated, the R
> code is closer in speed to the code using Eigen. An example is given
> in the second version of the output.
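To make the two styles just described concrete (a sketch, not code from the enclosure; AA stands for the SEXP received from .Call):

---------------------------------------
// Development version of RcppEigen: as() produces the mapped Eigen
// object directly from the SEXP (assumes the Map specializations of
// as() are available).
const Eigen::Map<Eigen::MatrixXd> A1(Rcpp::as<Eigen::Map<Eigen::MatrixXd> >(AA));

// Released versions: go through an intermediate Rcpp object, then
// construct the map -- pointer first, then the dimensions, matching
// the correction at the top of this message.
Rcpp::NumericMatrix Ar(AA);
const Eigen::Map<Eigen::MatrixXd> A2(Ar.begin(), Ar.nrow(), Ar.ncol());
---------------------------------------

In both cases the matrix is adopted in place: R stores matrices column-major as one contiguous block, which is also Eigen's default layout, so no transpose or reshaping is required on the R side.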