On 14 June 2012 at 08:01, Douglas Bates wrote: | On Wed, Jun 13, 2012 at 6:53 PM, Julian Smith <[email protected]> wrote: | > Doesn't svd in R by default compute D, U and V? | | > http://stat.ethz.ch/R-manual/R-patched/library/base/html/svd.html | | You're right but the default is the 'thin' U when X is n by p and n >= | p. Does the svd in Armadillo return the full n by n matrix U?
Thanks for that earlier hint re 'thin' and 'full' SVDs. | Another thing to check is what underlying Lapack routine is called. | There are two: dgesvd, which is the older algorithm and the one with | the expected name if you know the Lapack conventions, and dgesdd which | is a newer and faster algorithm. R uses dgesdd. Good call. It looks like Armadillo uses dgesv(d), grep did not find dgesdd it seems: : edd@max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ grep dges * atlas_bones.hpp: int wrapper_clapack_dgesv(const enum CBLAS_ORDER Order, const int N, const int NRHS, double *A, const int lda, int *ipiv, double *B, const int ldb); atlas_wrapper.hpp: return arma_atlas(clapack_dgesv)(Order, N, NRHS, (T*)A, lda, ipiv, (T*)B, ldb); glue_histc_meat.hpp: "histc(): parameter 'edges' must be a vector" lapack_bones.hpp: #define arma_dgesvd dgesvd lapack_bones.hpp: #define arma_dgesv dgesv lapack_bones.hpp: #define arma_dgesvd DGESVD lapack_bones.hpp: #define arma_dgesv DGESV lapack_bones.hpp: void arma_fortran(arma_dgesvd)(char* jobu, char* jobvt, blas_int* m, blas_int* n, double* a, blas_int* lda, double* s, double* u, blas_int* ldu, double* vt, blas_int* ldvt, double* work, blas_int* lwork, blas_int* info); lapack_bones.hpp: void arma_fortran(arma_dgesv)(blas_int* n, blas_int* nrhs, double* a, blas_int* lda, blas_int* ipiv, double* b, blas_int* ldb, blas_int* info); lapack_wrapper.hpp: arma_fortran(arma_dgesvd)(jobu, jobvt, m, n, (T*)a, lda, (T*)s, (T*)u, ldu, (T*)vt, ldvt, (T*)work, lwork, info); lapack_wrapper.hpp: arma_fortran(arma_dgesv)(n, nrhs, (T*)a, lda, ipiv, (T*)b, ldb, info); edd@max:~/svn/rcpp/pkg/RcppArmadillo/inst/include/armadillo_bits$ Conrad, any interest in switching to dgesdd? Dirk, at useR and across the room from Doug | > On Wed, Jun 13, 2012 at 4:07 PM, Douglas Bates <[email protected]> wrote: | >> | >> On Wed, Jun 13, 2012 at 5:16 PM, Dirk Eddelbuettel <[email protected]> wrote: | >> > | >> > On 13 June 2012 at 15:05, Julian Smith wrote: | >> > | I agree that RcppEigen is a little bit faster, but ease of use is | >> > important to | >> > | me, so I feel like RcppArmadillo might win out in my application. | >> > | >> > Yup, that my personal view too. | >> > | >> > | | RcppArmadillo will use the very same LAPACK and BLAS libs your R | >> > session | >> > | | uses. So MKL, OpenBlas, ... are all options. Eigen actually has its | >> > own | >> > | code | >> > | | outperforming LAPACK, so it doesn't as much there. | >> > | | >> > | Why do you think R outperforms RcppArmadillo in this example below? | >> > Anyway to | >> > | speed this up? | >> > | >> > That is odd. "I guess it shouldn't." I shall take another look -- as I | >> > understand it both should go to the same underlying Lapack routine. I | >> > may | >> > have to consult with Conrad on this. | >> > | >> > Thanks for posting a full and reproducible example! | >> > | >> > Dirk | >> > | >> > | require(RcppArmadillo) | >> > | require(inline) | >> > | | >> > | arma.code <- ' | >> > | using namespace arma; | >> > | NumericMatrix Xr(Xs); | >> > | int n = Xr.nrow(), k = Xr.ncol(); | >> > | mat X(Xr.begin(), n, k, false); | >> > | mat U; | >> > | vec s; | >> > | mat V; | >> > | svd(U, s, V, X); | >> > | return wrap(s); | >> > | ' | >> | >> Because the arma code is evaluating the singular vectors (U and V) as | >> well as the singular values (S) whereas the R code is only evaluating | >> the singular values. There is considerably more effort required to | >> evaluate the singular vectors in addition to the singular values. | >> | >> > | rcppsvd <- cxxfunction(signature(Xs="numeric"), | >> > | arma.code, | >> > | plugin="RcppArmadillo") | >> > | | >> > | A<-matrix(rnorm(5000^2), 5000) | >> > | | >> > | > system.time(rcppsvd(A)) | >> > | user system elapsed | >> > | 1992.406 4.862 1988.737 | >> > | | >> > | > system.time(svd(A)) | >> > | user system elapsed | >> > | 652.496 2.641 652.614 | >> > | | >> > | On Wed, Jun 13, 2012 at 11:43 AM, Dirk Eddelbuettel <[email protected]> | >> > wrote: | >> > | | >> > | | >> > | On 13 June 2012 at 10:57, Julian Smith wrote: | >> > | | I've been toying with both RcppArmadillo and RcppEigen the past | >> > few days | >> > | and | >> > | | don't know which library to continue using. RcppEigen seems | >> > really slick, | >> > | but | >> > | | appears to be lacking some of the decompositions I want and | >> > isn't nearly | >> > | as | >> > | | fast to code. RcppArmadillo seems about as fast, easier to code | >> > up etc. | >> > | What | >> > | | are some of the advantages/disadvantages of both? | >> > | | >> > | That's pretty close. I have been a fan of [Rcpp]Armadillo which I | >> > find | >> > | easier to get my head around. Doug, however, moved from | >> > [Rcpp]Armadillo | >> > | to | >> > | [Rcpp]Eigen as it has some things he needs. Eigen should have a | >> > "larger" | >> > | API | >> > | than Armadillo, but I find the code and docs harder to navigate. | >> > | | >> > | And you should find Eigen to be a little faster. Andreas Alfons | >> > went as far | >> > | as building 'robustHD' using RcppArmadillo with a drop-in for | >> > RcppEigen (in | >> > | package 'sparseLTSEigen'; both package names from memmory and I | >> > may have | >> > | mistyped). He reported a performance gain of around 25% for his | >> > problem | >> > | sets. On the 'fastLm' benchmark, we find the fast Eigen-based | >> > | decompositions | >> > | to be much faster than Armadillo. | >> > | | >> > | | Can you call LAPACK or BLAS from either? Is there a wrapper in | >> > RcppEigen | >> > | to | >> > | | call LAPACK functions? Want some other decomposition methods, | >> > dont like | >> > | the | >> > | | JacobiSVD method in Eigen. | >> > | | >> > | You need to differentiate between the Eigen and Armadillo docs | >> > _for their | >> > | libraries_ and what happens when you access the Rcpp* variant from | >> > R. | >> > | | >> > | RcppArmadillo will use the very same LAPACK and BLAS libs your R | >> > session | >> > | uses. So MKL, OpenBlas, ... are all options. Eigen actually has | >> > its own | >> > | code | >> > | outperforming LAPACK, so it doesn't as much there. | >> > | | >> > | Hope this helps, Dirk (at useR!) | >> > | | >> > | | | >> > | | | >> > ---------------------------------------------------------------------- | >> > | | _______________________________________________ | >> > | | Rcpp-devel mailing list | >> > | | [email protected] | >> > | | | >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel | >> > | -- | >> > | Dirk Eddelbuettel | [email protected] | http://dirk.eddelbuettel.com | >> > | | >> > | | >> > | >> > -- | >> > Dirk Eddelbuettel | [email protected] | http://dirk.eddelbuettel.com | >> > _______________________________________________ | >> > Rcpp-devel mailing list | >> > [email protected] | >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel | > | > -- Dirk Eddelbuettel | [email protected] | http://dirk.eddelbuettel.com _______________________________________________ Rcpp-devel mailing list [email protected] https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
