I am sorry. I think you are right. I know little about computation, and it would be great if you could write out what you wanted to explain to me. I am really learning a lot from this discussion. Before this I had not even heard of profiling a program. Thanks.
Best wishes!

Honglang Wang
Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangh...@msu.edu

On Wed, Dec 5, 2012 at 1:31 PM, Douglas Bates <ba...@stat.wisc.edu> wrote:

> On Tue, Dec 4, 2012 at 9:39 PM, Honglang Wang <wanghonglang2...@gmail.com> wrote:
>
>> Yes, the main issue for my coding is the allocation of memory. And I have
>> fixed one of the biggest memory allocation issues: a 4000 by 4000 diagonal
>> matrix. And since I am not familiar with Rcpp and RcppArmadillo, I have no
>> idea how to reuse the memory. I hope I can find some materials to learn
>> this. Thanks.
>
> No, your main issue is not thinking about the computation. As soon as you
> write something like
>
>     arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
>
> you are in theory land, which has very little relationship to practical
> numerical linear algebra. If you want to perform linear algebra
> calculations like weighted least squares, you should first take a bit of
> time to learn about numerical linear algebra as opposed to theoretical
> linear algebra. They are very different disciplines. In theoretical
> linear algebra you write the solution to a system of linear equations as
> above, using the inverse of the system matrix. The first rule of numerical
> linear algebra is that you never calculate the inverse of a matrix, unless
> you only plan to do toy examples. You mentioned sizes of 4000 by 4000,
> which means that the method you have chosen is doing thousands of times
> more work than necessary (hint: how do you think the inverse of a matrix
> is calculated in practice? Answer: by solving n systems of equations,
> which is what you are doing here when you could be solving only one).
>
> Dirk and I wrote about 7 different methods of solving least squares
> problems in our vignette on RcppEigen. None of those methods involve
> taking the inverse of an n by n matrix.
> R and Rcpp and whatever other programming technologies come along will
> never be a "special sauce" that takes the place of thinking about what you
> are trying to do in a computation.
>
> I could explain about using matrix decompositions, especially the Cholesky
> and QR decompositions, to solve such problems, but you have already
> ignored all the other suggestions to think about the problem you are
> addressing and decompose it into manageable chunks, so I have no
> expectation that you would pay attention to what I would write.
>
>>> What exactly do these timings show? A single call to your function?
>>> How many calls?
>>
>> Here I called my function 100 times.
>
>>> Building on Romain's point: a portion of your function's runtime is in
>>> memory allocation (and you have a lot of allocations here). If you're
>>> calling your function thousands or millions of times, then it might pay
>>> to closely examine your memory allocation strategies and figure out
>>> what's temporary, for example. It looks like you're already using
>>> copy_aux_mem = false in a number of places, but you're allocating a lot
>>> of objects -- of approximately what size?
>>>
>>> For example, wouldn't this work just as well with one less allocation?
>>>
>>>     arma::vec kk = t;
>>>     arma::uvec q1 = arma::find(arma::abs(tp)<h);
>>>     kk.elem(q1) = ((1-arma::pow(tp.elem(q1)/h,2))/h)*0.75;
>>>     // done with q1. let's reuse it.
>>>     q1 = arma::find(arma::abs(tp)>=h);  // was q2
>>>     kk.elem(q1).zeros();
>>>
>>> You could potentially allocate memory for temporary working space in R,
>>> grab it with copy_aux_mem = false, write your temp results there, and
>>> reuse these objects in subsequent function calls. It doesn't make sense
>>> to go to this trouble, though, if your core algorithm consumes the bulk
>>> of the runtime.
>>>
>>> Have you looked at the Armadillo notes re: inv? Matrix inversion costs
>>> more than O(n^2). You may be aided by pencil-and-paper math here.
>>> http://arma.sourceforge.net/docs.html#inv
>>
>> Here my matrix for inverse is only 4 by 4, so I think it's OK.
>
>>> best,
>>> Christian
>>>
>>> > Dear All,
>>> > I have tried out the first example using RcppArmadillo, but I am not
>>> > sure whether the code is efficient or not. And I did a comparison of
>>> > the computation times:
>>> >
>>> > 1) R code using a for loop in R: 87.22s
>>> > 2) R code using apply: 77.86s
>>> > 3) RcppArmadillo using a for loop in C++: 53.102s
>>> > 4) RcppArmadillo together with apply in R: 47.310s
>>> >
>>> > That is not a very big improvement. I am wondering whether I used an
>>> > inefficient way for the C++ coding:
>>>
>>> --
>>> A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal – Panama!
>>
>> _______________________________________________
>> Rcpp-devel mailing list
>> Rcpp-devel@lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel