Thank you all for the responses. Christian, I didn't know about the copy_aux_mem option. I will have to take a look at that.
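If I understand the option correctly, the idea is roughly the following (untested sketch; use_r_storage is just a placeholder name): the arma::mat is constructed on top of the memory R has already allocated, so the matrix is not copied on its way into C++.

#include <RcppArmadillo.h>

// Wrap an R matrix in an arma::mat without copying it (copy_aux_mem = false).
// The R object must stay alive while X is in use, and writing to X writes
// into the R matrix.
RcppExport SEXP use_r_storage(SEXP Xs) {
    Rcpp::NumericMatrix Xr(Xs);
    arma::mat X(Xr.begin(), Xr.nrow(), Xr.ncol(), false);  // copy_aux_mem = false
    return Rcpp::wrap(arma::accu(X));   // e.g. the sum of all elements
}

Compiled into a package (or via the inline package) and called with .Call(), that should avoid copying a 1,000,000 by 500 matrix on the way in.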
Dirk, thanks for looking into the 64-bit matrix indices.

Doug, the place in my code where I get the error is when I multiply matrices. I might have matrices X and Y, where X is 300000x500 and Y is 500x300000, and I want Z = X * Y. I could break the matrices into smaller chunks and do the multiplication, but Z is later used in several other multi-step calculations (mostly addition and multiplication), so I think that would be a last resort. If I can get the 64-bit matrix indices working in RcppArmadillo, I think that will solve much of the problem, because I will only need to return very long vectors and not big matrices.

Joshua

--
Joshua French, Ph.D.
Assistant Professor
Department of Mathematical and Statistical Sciences
University of Colorado Denver
joshua.fre...@ucdenver.edu
http://math.ucdenver.edu/~jfrench/
Ph: 303-556-6265
Fax: 303-556-8550


On 7/17/12 8:56 AM, "Douglas Bates" <ba...@stat.wisc.edu> wrote:

>On Tue, Jul 17, 2012 at 8:14 AM, Dirk Eddelbuettel <e...@debian.org> wrote:
>>
>> On 16 July 2012 at 23:30, French, Joshua wrote:
>> | I am doing some linear algebra on large matrices in R and receiving the following error: "allocMatrix: too many elements specified". From what I understand, the error is caused by the fact that R uses 32-bit ints and not 64-bit ints for matrix indices, so R doesn't have a way to represent all the elements in the very large matrix.
>> |
>> | My two questions:
>> |
>> | 1. Armadillo (and presumably RcppArmadillo) will not have this issue, since Armadillo has provided support for 64-bit indices as of version 2.4.0. Is there a way to easily utilize this functionality from within RcppArmadillo?
>>
>> I need to double check, but this may have been a compile-time option you need to enable. In any event ... R indices are still limited, so you may not be able to pass these back and forth.
>>
>> | 2. I have found in the past that some of the speed gains from RcppArmadillo in comparison to pure R are lost when passing large matrices as arguments. There will always be overhead when passing arguments (especially large matrix arguments) to pretty much any function. Are there any tricks to minimize the overhead when passing a non-sparse matrix argument of, say, 1,000,000 by 500 from R to Armadillo?
>>
>> I defer all questions concerning sparse matrices to Doug and other users of sparse matrix code. I live mostly in a small-to-medium-size dense matrix world.
>
>Actually the question was about non-sparse matrices. It looks as if it is the number of rows in the matrix that will be problematic. An upper bound on the number of rows is the maximum integer value divided by the number of columns.
>
>> .Machine$integer.max / 500
>[1] 4294967
>
>I would try not to exceed about 1/2 to 1/4 of that bound.
>
>A simple way of handling data sets that are much larger than that is to work with a sample of the rows. If that is not feasible, then I would create a list of matrices, each of size 1,000,000 by 500 or so, and vertically concatenate them in the C++ code. Of course, this means a copying operation. Also, when you are finished, if you need to pass results back to R, then you face a similar problem getting a large matrix in C++ back into R storage.
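A rough sketch of that list-of-chunks idea might look like this (untested; stack_chunks is just a placeholder name, and it assumes every chunk has the same number of columns):

#include <RcppArmadillo.h>

// Stack a list of R matrices into a single arma::mat, one block of rows at a
// time.  Each chunk is wrapped without copying (copy_aux_mem = false) and
// then copied once into its block of rows in 'big'.
RcppExport SEXP stack_chunks(SEXP chunks_) {
    Rcpp::List chunks(chunks_);

    // first pass: work out the overall dimensions
    arma::uword total_rows = 0, n_cols = 0;
    for (int i = 0; i < chunks.size(); i++) {
        Rcpp::NumericMatrix Ci = Rcpp::as<Rcpp::NumericMatrix>(chunks[i]);
        total_rows += Ci.nrow();
        n_cols = Ci.ncol();
    }

    arma::mat big(total_rows, n_cols);

    // second pass: copy each chunk into place
    arma::uword at = 0;
    for (int i = 0; i < chunks.size(); i++) {
        Rcpp::NumericMatrix Ci = Rcpp::as<Rcpp::NumericMatrix>(chunks[i]);
        arma::mat C(Ci.begin(), Ci.nrow(), Ci.ncol(), false);
        big.rows(at, at + C.n_rows - 1) = C;
        at += C.n_rows;
    }

    // ... do the real work with 'big' here; returning it to R as one matrix
    // would run into the same allocMatrix limit.
    return Rcpp::wrap((double) big.n_rows);
}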
>You can create a read-only matrix in C++ using the storage from R as described by Christian for RcppArmadillo, or using the Eigen::Map class in RcppEigen.
>
>What are you (Joshua) doing with these large matrices? If the main calculations involve X'X-type calculations, you can carry out the calculations on horizontal chunks and then assemble the results.
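For what it's worth, I think the X'X-on-chunks idea would look roughly like this in RcppArmadillo (untested sketch; chunked_crossprod is a made-up name, and X is assumed to arrive from R as a list of row chunks):

#include <RcppArmadillo.h>

// X'X accumulated from horizontal (row) chunks of X:
//   X'X = sum over chunks of X_i' X_i
// so only a p x p matrix is ever formed, no matter how many rows X has.
RcppExport SEXP chunked_crossprod(SEXP chunks_) {
    Rcpp::List chunks(chunks_);
    Rcpp::NumericMatrix first = Rcpp::as<Rcpp::NumericMatrix>(chunks[0]);
    int p = first.ncol();

    arma::mat XtX = arma::zeros<arma::mat>(p, p);
    for (int i = 0; i < chunks.size(); i++) {
        Rcpp::NumericMatrix Ci = Rcpp::as<Rcpp::NumericMatrix>(chunks[i]);
        arma::mat Xi(Ci.begin(), Ci.nrow(), Ci.ncol(), false);  // reuse R's memory
        XtX += Xi.t() * Xi;
    }
    return Rcpp::wrap(XtX);   // p x p, so small enough to hand back to R
}

Called from R as something like XtX <- .Call("chunked_crossprod", chunk_list) once the function is compiled and loaded.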