You're right about the code preceding .C. I stripped down the .C and .Call codes to be as similar as possible, and the timings were much closer.
R code: Call_matrix_multiply = function(A,B){ C <- matrix(0,nrow(A),ncol(B)) .Call("R_CALL_matrix_multiply", A, B, C) return(C) } C_matrix_multiply = function(A,B){ C <- matrix(0,nrow(A),ncol(B)) .C("R_matrix_multiply",A,B,nrow(A),ncol(A),ncol(B),C,DUP=FALSE,NAOK=TRUE) return(C) } C code: void R_CALL_matrix_multiply(SEXP A, SEXP B, SEXP C) { double one = 1.0; double zero = 0.0; int *dimA = INTEGER(getAttrib(A, R_DimSymbol)); int *dimB = INTEGER(getAttrib(B, R_DimSymbol)); F77_CALL(dgemm)("N","N",dimA,dimB+1,dimA+1,&one,REAL(A),dimA,REAL(B),dimA+1,&zero,REAL(C),dimA); } void R_matrix_multiply(double * A, double * B, int * m, int *n, int * p, double * C){ double one = 1.0; double zero = 0.0; //Here is the actual multiplication F77_CALL(dgemm)("N","N",m,p,n,&one,A,m,B,n,&zero,C,m); } Timings: > m = 2000 > n = 2000 > p = 2000 > A = matrix(rnorm(m*n),m,n) > B = matrix(rnorm(n*p),n,p) > library(CMATRIX) > system.time(C_matrix_multiply(A,B)) user system elapsed 2.782 0.035 1.611 > system.time(Call_matrix_multiply(A,B)) user system elapsed 2.789 0.032 1.629 > > m = 2000 > n = 2000 > p = 2000 > A = matrix(rnorm(m*n),m,n) > B = matrix(rnorm(n*p),n,p) > library(CMATRIX) > system.time(C_matrix_multiply(A,B)) user system elapsed 2.793 0.029 1.609 > system.time(Call_matrix_multiply(A,B)) user system elapsed 2.787 0.029 1.586 Even so, it seems the .Call interface has a lot of advantages. I think I will use it for this project and see how I like it. Jason On Tue, Feb 22, 2011 at 7:27 AM, Simon Urbanek <simon.urba...@r-project.org> wrote: > On Feb 22, 2011, at 1:55 AM, Jason Rudy wrote: > >> I just tried that myself and the .Call version is substantially >> faster. It seems like there is a lot more going on in the .Call C >> code than the .C. Why is .Call faster? Does it have to do with the >> way arguments are passed into the C function? I tried with DUP=FALSE >> and NAOK=TRUE, but .Call was still the winner. >> > > .Call is the "native" interface so it passes all objects directly as > references - essentially the same way that R uses internally (.C with > DUP=FALSE and NAOK=TRUE comes close to that, though; the fastest is .External > in this respect). Also you're allocating objects as you need them so there > will be less copying afterwards. However, I suspect that major part of the > difference is also in the code preceding the C code involved in this -- for > example as.double() will implicitly create a copy since it needs to strip > attributes whereas coerceVector is a no-op if the type matches. > > .C is there for historical compatibility - it is essentially equivalent to > .Fortran which was the original (and only) interface way back when. .Call is > in comparison a more recent addition, that's why you still find a lot of .C > code. Personally I don't use .C at all because compared to .Call it is so > cumbersome and error-prone (you can't even tell the length of the passed > vectors in C!), but others have different preferences. > > Cheers, > Simon > > >> On Sun, Feb 20, 2011 at 4:27 PM, Simon Urbanek >> <simon.urba...@r-project.org> wrote: >>> Jason, >>> >>> FWIW the direct interface (.Call) is more efficient and makes passing >>> things from R simpler: >>> >>> C_matrix_multiply = function(A,B) .Call("R_matrix_multiply", A, B) >>> >>> The drawback is a bit more legwork on the C side, but it also gives you >>> more flexibility: >>> >>> SEXP R_matrix_multiply(SEXP A, SEXP B) { >>> double one = 1.0; >>> double zero = 0.0; >>> int *dimA = INTEGER(getAttrib(A, R_DimSymbol)); >>> int *dimB = INTEGER(getAttrib(B, R_DimSymbol)); >>> SEXP sDimC = PROTECT(allocVector(INTSXP, 2)); >>> int *dimC = INTEGER(sDimC); >>> SEXP C = PROTECT(allocVector(REALSXP, dimA[0] * dimB[1])); >>> if (dimA[1] != dimB[0]) error("incompatible matrices!"); >>> dimC[0] = dimA[0]; >>> dimC[1] = dimB[1]; >>> setAttrib(C, R_DimSymbol, sDimC); >>> A = PROTECT(coerceVector(A, REALSXP)); >>> B = PROTECT(coerceVector(B, REALSXP)); >>> >>> F77_CALL(dgemm)("N","N",dimA,dimB+1,dimA+1,&one,REAL(A),dimA,REAL(B),dimA+1,&zero,REAL(C),dimA); >>> UNPROTECT(4); >>> return C; >>> } >>> >>> For comparison: >>>> A=matrix(rnorm(1e5),500) >>>> B=matrix(rnorm(1e5),,500) >>> >>> .Call: >>> >>>> system.time(for (i in 1:10) C_matrix_multiply(A,B)) >>> user system elapsed >>> 0.656 0.008 0.686 >>> >>> .C: >>> >>>> system.time(for (i in 1:10) CC_matrix_multiply(A,B)) >>> user system elapsed >>> 0.886 0.044 0.943 >>> >>> >>> in fact .Call is even a tiny bit faster than %*%: >>> >>>> system.time(for (i in 1:10) A %*% B) >>> user system elapsed >>> 0.658 0.004 0.665 >>> >>> (it's not just a measurement error - it's consistent for more replications >>> etc. - but it's really negligible - possibly just due to dispatch of %*%) >>> >>> Cheers, >>> Simon >>> >>> >>> On Feb 20, 2011, at 5:23 PM, Jason Rudy wrote: >>> >>>> It was indeed a simple problem! I took a look at that array.c as you >>>> suggested and that cleared it right up. So, the correct C code is: >>>> >>>> #include <R.h> >>>> #include <R_ext/Utils.h> >>>> #include <R_ext/Lapack.h> >>>> #include <R_ext/BLAS.h> >>>> >>>> void R_matrix_multiply(double * A, double * B, int * m, int *n, int * >>>> p, double * C){ >>>> >>>> double one = 1.0; >>>> double zero = 0.0; >>>> >>>> //Just printing the input arguments >>>> Rprintf("m = %d, n = %d, p = %d\n",*m,*n,*p); >>>> int i; >>>> for(i=0;i<(*m**n);i++){ >>>> Rprintf("%f ",A[i]); >>>> } >>>> Rprintf("\n"); >>>> for(i=0;i<(*n**p);i++){ >>>> Rprintf("%f ",B[i]); >>>> } >>>> Rprintf("\n"); >>>> for(i=0;i<(*m**p);i++){ >>>> Rprintf("%f ",C[i]); >>>> } >>>> Rprintf("\n"); >>>> >>>> //Here is the actual multiplication >>>> F77_CALL(dgemm)("N","N",m,p,n,&one,A,m,B,n,&zero,C,m); >>>> } >>>> >>>> The only difference being that I had the 4th and 5th arguments (n and >>>> p) mixed up. There was also a problem in my R code after the >>>> multiplication took place. For the record, the correct R code is: >>>> >>>> C_matrix_multiply = function(A,B){ >>>> C <- matrix(0,nrow(A),ncol(B)) >>>> cout <- >>>> .C("R_matrix_multiply",as.double(A),as.double(B),nrow(A),ncol(A),ncol(B),as.double(C)) >>>> return(matrix(cout[[6]],nrow(A),ncol(B))) >>>> } >>>> >>>> Thanks for the help. Now that I have a functioning example I am well >>>> on my way to completing this project. >>>> >>>> -Jason >>>> >>>> On Sun, Feb 20, 2011 at 7:42 AM, Prof Brian Ripley >>>> <rip...@stats.ox.ac.uk> wrote: >>>>> Look a close look at matprod in src/main/array in the R sources. >>>>> Hint: it is the other dimensions you have wrong. >>>>> >>>>> And as BLAS is Fortran, counts do start at 1. >>>>> >>>>> On Sat, 19 Feb 2011, Jason Rudy wrote: >>>>> >>>>>> Dear R-devel, >>>>>> >>>>>> I've written a numerical solver for SOCPs (second order cone programs) >>>>>> in R, and now I want to move most of the solver code into C for speed. >>>>>> I've written combined R/C packages before, but in this case I need to >>>>>> do matrix operations in my C code. As I have never done that before, >>>>>> I'm trying to write some simple examples to make sure I understand the >>>>>> basics. I am stuck on the first one. I'm trying to write a function >>>>>> to multiply two matrices using the blas routine dgemm. The name of my >>>>>> example package is CMATRIX. My code is as follows. >>>>>> >>>>>> I have a file matrix.c in my src directory: >>>>>> >>>>>> #include <R.h> >>>>>> #include <R_ext/Utils.h> >>>>>> #include <R_ext/Lapack.h> >>>>>> #include <R_ext/BLAS.h> >>>>>> >>>>>> //Computes C = A*B >>>>>> void R_matrix_multiply(double * A, double * B, int * m, int *n, int * >>>>>> p, double * C){ >>>>>> double one = 1.0; >>>>>> double zero = 0.0; >>>>>> >>>>>> //Just printing the input arguments >>>>>> Rprintf("m = %d, n = %d, p = %d\n",*m,*n,*p); >>>>>> int i; >>>>>> for(i=0;i<(*m**n);i++){ >>>>>> Rprintf("%f ",A[i]); >>>>>> } >>>>>> Rprintf("\n"); >>>>>> for(i=0;i<(*n**p);i++){ >>>>>> Rprintf("%f ",B[i]); >>>>>> } >>>>>> Rprintf("\n"); >>>>>> for(i=0;i<(*m**p);i++){ >>>>>> Rprintf("%f ",C[i]); >>>>>> } >>>>>> Rprintf("\n"); >>>>>> >>>>>> >>>>>> //Here is the actual multiplication >>>>>> F77_CALL(dgemm)("N","N",m,n,p,&one,A,m,B,n,&zero,C,m); >>>>>> } >>>>>> >>>>>> And the file C_matrix_multiply.R in my R directory: >>>>>> >>>>>> C_matrix_multiply = function(A,B){ >>>>>> C <- matrix(0,nrow(A),ncol(B)) >>>>>> cout <- >>>>>> .C("R_matrix_multiply",as.double(A),as.double(B),nrow(A),ncol(A),ncol(B),as.double(C)) >>>>>> return(matrix(cout$C,nrowA,ncol(B))) >>>>>> >>>>>> } >>>>>> >>>>>> My namespace file is: >>>>>> >>>>>> export("C_matrix_multiply") >>>>>> useDynLib(CMATRIX.so,R_matrix_multiply) >>>>>> >>>>>> I'm not sure if it's necessary, but I've also included a Makevars.in >>>>>> file in my src directory: >>>>>> >>>>>> PKG_CPPFLAGS=@PKG_CPPFLAGS@ >>>>>> PKG_CFLAGS=@PKG_CFLAGS@ >>>>>> PKG_LIBS=@PKG_LIBS@ ${LAPACK_LIBS} ${BLAS_LIBS} ${FLIBS} >>>>>> >>>>>> which I simply copied from the diversitree package, which seems to use >>>>>> a lot of fortran. I have the same problem (which I am getting to) >>>>>> with or without this Makevars.in file. >>>>>> >>>>>> I install my package using: >>>>>> >>>>>> R CMD INSTALL CMATRIX >>>>>> >>>>>> Then I start up R and attempt to run the following code: >>>>>> >>>>>> #Make some random matrices >>>>>> A = matrix(rnorm(8),4,2) >>>>>> B = matrix(rnorm(6),2,3) >>>>>> >>>>>> #Load my package >>>>>> library(CMATRIX) >>>>>> >>>>>> #Print the matrices >>>>>> A >>>>>> B >>>>>> >>>>>> #Try to multiply them >>>>>> product = C_matrix_multiply(A,B) >>>>>> >>>>>> What I want, and what according to my understanding should happen, is >>>>>> for product to contain the same matrix as would result from A %*% B. >>>>>> Instead, I get the following: >>>>>> >>>>>>> A = matrix(rnorm(8),4,2) >>>>>>> B = matrix(rnorm(6),2,3) >>>>>>> library(CMATRIX) >>>>>>> A >>>>>> >>>>>> [,1] [,2] >>>>>> [1,] -0.4981664 -0.7243532 >>>>>> [2,] 0.1428766 -1.5501623 >>>>>> [3,] -2.0624701 1.5104507 >>>>>> [4,] -0.5871962 0.3049442 >>>>>>> >>>>>>> B >>>>>> >>>>>> [,1] [,2] [,3] >>>>>> [1,] 0.02477964 0.5827084 1.8434375 >>>>>> [2,] -0.20200104 1.7294264 0.9071397 >>>>>>> >>>>>>> C_matrix_multiply(A,B) >>>>>> >>>>>> m = 4, n = 2, p = 3 >>>>>> -0.498166 0.142877 -2.062470 -0.587196 -0.724353 -1.550162 1.510451 >>>>>> 0.304944 >>>>>> 0.024780 -0.202001 0.582708 1.729426 1.843437 0.907140 >>>>>> 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 >>>>>> 0.000000 0.000000 0.000000 0.000000 0.000000 >>>>>> Parameter 10 to routine DGEMM was incorrect >>>>>> Mac OS BLAS parameter error in DGEMM , parameter #0, (unavailable), is 0 >>>>>> >>>>>> and R immediately dies. I know the arguments are being passed into >>>>>> the C code and everything up to my F77_CALL is functioning based on >>>>>> the printed output. The problem is definitely something to do with my >>>>>> F77_CALL(dgemm) line. My understanding is that parameter 10 should be >>>>>> the leading dimension of the matrix B, which in this case should be >>>>>> equal to 2, the number of rows in that matrix, which is what I am >>>>>> doing. I have also considered that parameter numbering starts at 0, >>>>>> in which case the incorrect parameter is &zero, but again that seems >>>>>> correct to me. All of my reading and research suggests I am doing >>>>>> everything correctly, so I am somewhat stumped. Perhaps I am missing >>>>>> something simple or obvious, as I have never done this before and am >>>>>> proceeding with only google and the R docs as my guide. I am >>>>>> wondering if anybody can see what I'm doing wrong here, or perhaps >>>>>> something I could do to try to fix it. Any assistance would be >>>>>> greatly appreciated. >>>>>> >>>>>> Best Regards, >>>>>> >>>>>> Jason Rudy >>>>>> Graduate Student >>>>>> Bioinformatics and Medical Informatics Program >>>>>> San Diego State University >>>>>> >>>>>> ______________________________________________ >>>>>> R-devel@r-project.org mailing list >>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>>>> >>>>> >>>>> -- >>>>> Brian D. Ripley, rip...@stats.ox.ac.uk >>>>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >>>>> University of Oxford, Tel: +44 1865 272861 (self) >>>>> 1 South Parks Road, +44 1865 272866 (PA) >>>>> Oxford OX1 3TG, UK Fax: +44 1865 272595 >>>>> >>>> >>>> ______________________________________________ >>>> R-devel@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>> >>>> >>> >>> >> >> > > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel