Re: [Rcpp-devel] Forcing a shallow versus deep copy
| Is it a big deal that we would cheat on what reference passing means?

If you want to implement this sort of semantics, I think at a _minimum_ the type should be const (otherwise it looks like you are going to actually modify the matrix in place, which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier, but to a reader of the code it sure looks like you intend to.

    Rcpp::RNGScope __rngScope;
    arma::mat& m = Rcpp::as<arma::mat&>(mSEXP);
    test_ref(m);

It looks like this behavior changed as of rev 4400, when the full_name() method was introduced. I may not understand the mechanism you established 100%, but to me this generated code looks potentially problematic if you are taking a reference to a stack variable established within the as method.

My guess is that you have something more sophisticated going on here and there is no memory problem; however, I'd love to understand things a bit better to be 100% sure there isn't something to drill into further.

___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 13/09/13 14:00, JJ Allaire wrote:
| | Is it a big deal that we would cheat on what reference passing means?
|
| If you want to implement this sort of semantics, I think at a _minimum_ the type should be const (otherwise it looks like you are going to actually modify the matrix in place, which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier, but to a reader of the code it sure looks like you intend to.

arma::mat has the ability to use auxiliary memory. We might want something that modifies the underlying memory of the object, e.g.

    void double_me( arma::mat& x ){
        x += x ;
    }

and have the changes to x brought back to the R object we pass in.

But I realize this might be a stretch, and we can definitely have only const references. That is easier to implement anyway, and we would not need the reference counting stuff I was talking about before.

The arma::mat ctor I'd use enforces that the memory stays bound to what we pass in for the lifetime of the matrix. From the docs:

    mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true)

    Create a matrix using data from writeable auxiliary memory. By default the matrix allocates its own memory and copies data from the auxiliary memory (for safety). However, if copy_aux_mem is set to false, the matrix will instead directly use the auxiliary memory (ie. no copying). This is faster, but can be dangerous unless you know what you're doing!

    The strict variable comes into effect only if copy_aux_mem is set to false (ie. the matrix is directly using auxiliary memory). If strict is set to true, the matrix will be bound to the auxiliary memory for its lifetime; the number of elements in the matrix can't be changed (directly or indirectly). If strict is set to false, the matrix will not be bound to the auxiliary memory for its lifetime, ie., the size of the matrix can be changed. If the requested number of elements is different to the size of the auxiliary memory, new memory will be allocated and the auxiliary memory will no longer be used.

|     Rcpp::RNGScope __rngScope;
|     arma::mat& m = Rcpp::as<arma::mat&>(mSEXP);
|     test_ref(m);
|
| It looks like this behavior changed as of rev 4400, when the full_name() method was introduced. I may not understand the mechanism you established 100%, but to me this generated code looks potentially problematic if you are taking a reference to a stack variable established within the as method.

This was to support additional calling capabilities for classes handled by modules. If we have a module-exposed class, we don't want to have to pass it by value as we used to. That change allowed me to pass the object by reference, by const reference, by pointer or by const pointer. With module objects, what is really stored is a pointer to the object, so from a T* we can get T&, const T&, T* and const T*.

| My guess is that you have something more sophisticated going on here and there is no memory problem; however, I'd love to understand things a bit better to be 100% sure there isn't something to drill into further.

What we used to do before was to trim the const and the reference out of the parameters, so if we had a function like this:

    void foo( const arma::mat& x ){
        // do stuff
    }

we had an implicit as call, but it was not a call to as<const arma::mat&>, it was a call to as<arma::mat>, which created a copy. So we were implementing pass by reference by using pass by value. Not good.

--
Romain Francois
Professional R Enthusiast
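The difference between copying and borrowing auxiliary memory, as described in the Armadillo docs quoted above, can be sketched with the standard library alone. This is only an illustration under stated assumptions: `AuxBuffer` and its members are hypothetical names, it is not how Armadillo is implemented, and it ignores the `strict` flag entirely.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix type that can either copy from or
// borrow "auxiliary" memory. NOT Armadillo's implementation.
class AuxBuffer {
public:
    // copy_aux_mem = true  : allocate own storage and copy the data (safe).
    // copy_aux_mem = false : keep using the caller's memory (no copy).
    AuxBuffer(double* aux_mem, std::size_t n, bool copy_aux_mem = true)
        : owned_(copy_aux_mem ? std::vector<double>(aux_mem, aux_mem + n)
                              : std::vector<double>()),
          data_(copy_aux_mem ? owned_.data() : aux_mem),
          size_(n) {}

    // Copying would leave data_ pointing into another object's storage,
    // so this sketch simply forbids it.
    AuxBuffer(const AuxBuffer&) = delete;
    AuxBuffer& operator=(const AuxBuffer&) = delete;

    // Double every element in place, in the spirit of double_me().
    void double_in_place() {
        for (std::size_t i = 0; i < size_; ++i) data_[i] += data_[i];
    }

    double sum() const {
        double total = 0.0;
        for (std::size_t i = 0; i < size_; ++i) total += data_[i];
        return total;
    }

    // True when the buffer works directly on the given memory.
    bool borrows(const double* p) const { return data_ == p; }

private:
    std::vector<double> owned_;  // empty when borrowing
    double* data_;               // points into owned_ or into the caller's memory
    std::size_t size_;
};
```

With `copy_aux_mem = false`, calling `double_in_place()` changes the caller's array, which is the behaviour wanted for a non-const reference parameter; with the default, the caller's array is left alone at the price of one copy.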
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 13/09/13 14:15, Romain Francois wrote:
| But I realize this might be a stretch, and we can definitely have only const references. That is easier to implement anyway, and we would not need the reference counting stuff I was talking about before.

Spoke too soon. We would need it; otherwise we run into the problem of passing a reference to a temporary.

--
Romain Francois
Professional R Enthusiast
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 13/09/13 14:00, JJ Allaire wrote:
| If you want to implement this sort of semantics, I think at a _minimum_ the type should be const (otherwise it looks like you are going to actually modify the matrix in place, which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier, but to a reader of the code it sure looks like you intend to.
|
|     Rcpp::RNGScope __rngScope;
|     arma::mat& m = Rcpp::as<arma::mat&>(mSEXP);
|     test_ref(m);
|
| It looks like this behavior changed as of rev 4400, when the full_name() method was introduced. I may not understand the mechanism you established 100%, but to me this generated code looks potentially problematic if you are taking a reference to a stack variable established within the as method.
|
| My guess is that you have something more sophisticated going on here and there is no memory problem; however, I'd love to understand things a bit better to be 100% sure there isn't something to drill into further.

Here is where I am now. To wrap up this function:

    // [[Rcpp::export]]
    void test_const_ref( const arma::mat& m ){}

this code gets created by the attributes parser:

    RcppExport SEXP sourceCpp_71975_test_const_ref(SEXP mSEXP) {
    BEGIN_RCPP
        {
            Rcpp::RNGScope __rngScope;
            Rcpp::InputParameter< const arma::mat& > m(mSEXP );
            test_const_ref(m);
        }
        return R_NilValue;
    END_RCPP
    }

The difference is this line:

    Rcpp::InputParameter< const arma::mat& > m(mSEXP );

instead of this line:

    const arma::mat& m = Rcpp::as< const arma::mat& >( mSEXP ) ;

The InputParameter template class needs to be able to take a SEXP as input and have a conversion operator to the requested type. So the default implementation obviously uses Rcpp::as. This is how the default class is implemented:

    template <typename T>
    class InputParameter {
    public:
        InputParameter(SEXP x_) : x(x_){}
        inline operator T() { return as<T>(x) ; }
    private:
        SEXP x ;
    } ;

So we get exactly the same as before. What we gain, however, is that we can redefine InputParameter for other types, and we can take advantage of its destructor to do something when the InputParameter object goes out of scope.

Here is how I implemented a custom version for const reference to arma::Mat:

    template <typename T>
    class InputParameter< const arma::Mat<T>& > {
    public:
        typedef const typename arma::Mat<T>& const_reference ;
        InputParameter( SEXP x_ ) : m(x_), mat( m.begin(), m.nrow(), m.ncol(), false ){}
        inline operator const_reference(){ return mat ; }
    private:
        Rcpp::Matrix< Rcpp::traits::r_sexptype_traits<T>::rtype > m ;
        arma::Mat<T> mat ;
    } ;

The arma::Mat is a member of InputParameter, constructed via the advanced constructor, so it uses the same memory as the R object, and we retrieve a reference to this object with the operator const_reference.

This is simple and elegant. And now we can pass down references and const references of armadillo matrices from R without performance penalty.

This makes using RcppArmadillo even more compelling.

It leaves the issue of what happens when we return an armadillo matrix. At the moment, this still makes a copy of the data. I don't see a way around that just yet. If we want to avoid making a copy, we need to construct the arma::mat out of R memory and return that R object.

I also have to deal with references and const references of other arma types (arma::rowvec, etc.).

I'm happy to discuss the changes I've made in Rcpp and RcppArmadillo for this. For now I've included the version for non-const references too, but maybe I should not, although it does work perfectly.

This is much better than what we used to have, where we would allow passing references but still make lots of data copies, which sort of goes against using references. When I see a function that takes an object by reference, I tend to think that calling the function is cheap. Now it is.

I'd specifically like to hear from Gabor and Baptiste about the simplification of being able to just use (const) references as inputs and have RcppArmadillo simply borrow memory from the R object:

    // [[Rcpp::export]]
    arma::mat plus( const arma::mat& m1, const arma::mat& m2 ){
        return m1 + m2 ;
    }

Romain

--
Romain Francois
Professional R Enthusiast
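The holder idea described in this message can be tried out without Rcpp or Armadillo. The sketch below is a standard-library analogue under stated assumptions: `RObject`, `View`, `convert`, `InputHolder` and `sum_view` are hypothetical names standing in for the SEXP-backed object, the borrowed-memory matrix, `Rcpp::as`, `InputParameter` and a user function. It is not the Rcpp source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the memory behind an R object.
struct RObject {
    std::vector<double> values;
};

// Hypothetical stand-in for a matrix that borrows memory instead of copying.
struct View {
    const double* data;
    std::size_t size;
};

// Stand-in for a generic conversion: it produces a fresh object.
template <typename T>
T convert(const RObject& x);

template <>
inline std::vector<double> convert<std::vector<double>>(const RObject& x) {
    return x.values;  // deep copy
}

// Generic holder: remembers the input and converts on demand.
template <typename T>
class InputHolder {
public:
    explicit InputHolder(const RObject& x) : x_(x) {}
    operator T() const { return convert<T>(x_); }
private:
    const RObject& x_;
};

// Specialization for `const View&`: the View is a member of the holder, so
// the reference handed out stays valid while the holder is in scope.
template <>
class InputHolder<const View&> {
public:
    explicit InputHolder(const RObject& x)
        : view_{x.values.data(), x.values.size()} {}
    operator const View&() const { return view_; }
private:
    View view_;
};

// A user function that takes its argument by const reference.
inline double sum_view(const View& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size; ++i) total += v.data[i];
    return total;
}
```

A call such as `InputHolder<const View&> m(obj); sum_view(m);` mirrors the generated wrapper: the implicit conversion yields a reference to a member of `m`, not to a temporary, which is why no dangling reference arises.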
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 13 September 2013 at 17:56, Romain Francois wrote:
| Here is where I am now. To wrap up this function:
[...]
| This is simple and elegant. And now we can pass down references and
| const references of armadillo matrices from R without performance penalty.
|
| This makes using RcppArmadillo even more compelling.

Love it. Thanks a bunch for making that change. Really nice.

Dirk

PS Baptiste is AFAIK on vacation

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On Fri, Jul 12, 2013 at 1:42 AM, Dirk Eddelbuettel <e...@debian.org> wrote:
| On 11 July 2013 at 19:21, Gabor Grothendieck wrote:
| | 1. Just to be clear, what we have been discussing here is not just how to avoid copying, but how to avoid copying while using as and wrap, or approaches that automatically generate as and wrap. I was already aware of how to avoid copying using Armadillo and of how to use Armadillo types as arguments and return values to autogenerate as and wrap. The problem is not that, but that these two things cannot be done at once: it's either-or.
|
| I must still be misunderstanding, as this still reads to me as if you are suspecting that we somehow keep layers making extra copies. We're not. And I've known you long enough to know that you are not likely to suspect this either. So what is it then?
|
| As Romain said, some of the choices have to do with the representation on both the R and C++ side. For Rcpp itself we can be lightweight and efficient via proxy classes, but this does not mean we can do this for _any arbitrary C++ class_ coming from another project, such as Armadillo.
|
| RcppArmadillo already does pretty well, and code review may make it better. We do not know of any fat to cut, or we'd cut it ourselves. We care about a few things, but performance is clearly among them.

I think Romain's proposal will clarify this.

| | 2. Regarding the question of performance impact, there are two situations which should be distinguished:
| |
| | i. We call C++ from R, it does some processing and then returns, and we don't call it again. In that case it's likely that copying or not won't make a big difference, or at least it won't if the actual C++ computation time is large compared to the time spent in copying.
| |
| | ii. We factor out the inner loop of the code, recode only that in C++ and call it many times. In that case the copying is multiplied by the number of iterations and might very well have a significant impact.
|
| In case ii) I'd try to use a different design and make it more like i): You generally do not want to call down from R to object code a bazillion times as there is always some overhead, and multiplying even something rather efficient by a veryBigNumber can make small times large in the aggregate.

Sure, and sugar, RcppArmadillo and other facilities do make it easier to move more functionality into C++. Nevertheless, it can be the case that a relatively small amount of R code, repeatedly invoked, is responsible for the performance hit in a program. From the viewpoint of reducing complexity and increasing maintainability, it can be desirable to move just that minimum portion to the C++ side, minimizing the dual-language aspect of the code. Making call overhead as low as one can while retaining the automatic Rcpp features facilitates this. If it's not possible in general, then having it just for Armadillo objects and selected other situations would still be nice.
Re: [Rcpp-devel] Forcing a shallow versus deep copy
I apologize if my emails were badly phrased or disrespectful. I had no intention of saying anything was broken, suspicious or wrong.

I second Gabor. His described use case matches mine. The outer loop is an optimization routine coming from other libraries. Rcpp is used to speed up the objective, gradient and hessian computations, and hence the data is constantly passed along to all of these functions.

Another use case to consider is recursion with data passed along. A toy example is

    gib(0) = values(0)
    gib(1) = values(1)
    gib(x) = gib(x-1) + gib(x-2) + values(x)

where values is a vector of non-negative integers. A naive implementation with aux memory allocation may cause the number of copies in memory to grow exponentially in x.
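To make the recursion concern concrete, here is a small standard-library sketch that counts copies for the toy recurrence above. It is an illustration under assumptions: `Values`, `gib_by_value` and `gib_by_ref` are hypothetical names, and a copy-counting wrapper stands in for a matrix type whose conversion or copy constructor duplicates its data.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical payload that counts how often it is copied.
struct Values {
    std::vector<long> data;
    static int copies;
    explicit Values(std::vector<long> d) : data(std::move(d)) {}
    Values(const Values& other) : data(other.data) { ++copies; }
};
int Values::copies = 0;

// Pass by value: every call, including each recursive one, copies the data.
long gib_by_value(Values v, std::size_t x) {
    if (x < 2) return v.data[x];
    return gib_by_value(v, x - 1) + gib_by_value(v, x - 2) + v.data[x];
}

// Pass by const reference: the same recursion without any copy.
long gib_by_ref(const Values& v, std::size_t x) {
    if (x < 2) return v.data[x];
    return gib_by_ref(v, x - 1) + gib_by_ref(v, x - 2) + v.data[x];
}
```

The by-value version makes one copy per call, and the number of calls follows the same Fibonacci-like recurrence as gib itself, so the total number of copies made grows exponentially in x. How many of those copies are alive at the same time depends on the recursion depth, which this sketch does not measure.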
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Hi,

That's great, thanks for considering this!

Following this discussion, I went to browse through my code looking for wrap() and as() statements that could benefit from a speed-up from memory reuse. Of course I didn't find any. I switched to using Modules when they were introduced, the code being much nicer to read, and these conversions only happen behind the scenes. My C++ functions thus only deal with native Armadillo / C++ objects, and I leave it up to the modules to magically do the required conversions in and out. It's a brilliant interface, very readable.

From what I understand, however, the resulting code can often lose a factor 2-3 in speed, compared to the now much more verbose alternative of explicitly converting and sharing the memory with this type of code:

    arma::mat A(M.begin(), M.rows(), M.cols(), false);

From this perspective, the possibility of setting copy_aux_mem to false in as(), as used by modules, would be very welcome.

Best regards,

baptiste

On 11 July 2013 10:22, rom...@r-enthusiasts.com wrote:
| Hello,
|
| This comes up every now and then. I think we can find a syntax to initiate an arma::mat that would allow what you want.
|
| It is not likely it will come via attributes. The idea is to keep them simple. The solutions I see below would eventually lead to clutter, and we are heading in the less-clutter direction.
|
| I'll think about it and propose something.
|
| Romain
|
| On 2013-07-11 14:32, Changi Han wrote:
| | Hello,
| |
| | I think I (superficially) understand the difference between:
| |
| |     // [[Rcpp::export]]
| |     double sum1(Rcpp::NumericMatrix M) {
| |         arma::mat A(M.begin(), M.rows(), M.cols(), false);
| |         return sum(sum(A));
| |     }
| |
| |     // [[Rcpp::export]]
| |     double sum2(arma::mat A) {
| |         return sum(sum(A));
| |     }
| |
| | Partly out of laziness, partly because sum2 is more elegant, and partly to avoid namespace pollution, I was wondering if there is a way to force a shallow copy in sum2.
| |
| | If not, then may I submit a low priority feature request. An attribute? Something like:
| |
| |     // [[Rcpp::export]]
| |     double sum2(arma::mat A) {
| |         // [[ Rcpp::shallow ( A ) ]]
| |         return sum(sum(A));
| |     }
| |
| | Or (akin to C++11 generalized attributes)
| |
| |     // [[Rcpp::export]] { [[ Rcpp::shallow ( A ) ]] }
| |     double sum2(arma::mat A) {
| |         return sum(sum(A));
| |     }
| |
| | An alternative is to have an argument in sourceCpp that takes a list/vector of objects that are to be shallow or deep copied.
| |
| | For example in sum1, if M is changed within the function before casting to the arma::mat, then it might be cleaner to add M to a list/vector of objects to be deep copied rather than cloning M within sum1: that leads to one fewer variable name.
| |
| | Just a thought. I can certainly live with the additional step. As always, thanks for all the Rcpp goodness.
| |
| | Cheers,
| | Changi Han
Re: [Rcpp-devel] Forcing a shallow versus deep copy
I am sure there are better ways to achieve the goal. I would suggest that these two be similar if possible. I think the naive expectation is for them to be consistent.

    // [[Rcpp::export]]
    stuff function(Rcpp::stuff) {
    }

    // [[Rcpp::export]]
    stuff function(arma::stuff) {
    }

Thank you again. Cheers.

On Thu, Jul 11, 2013 at 9:22 PM, rom...@r-enthusiasts.com wrote:
[...]
Re: [Rcpp-devel] Forcing a shallow versus deep copy
These __are__ similar. The difference is in the classes themselves.

Rcpp classes are proxy classes, so the C++ copy mechanism does not apply to them. arma classes are proper C++ classes, so C++ semantics apply.

I'm at useR right now, so I can't really work on this. I'll submit at least ideas later.

Romain

On 2013-07-11 15:34, Changi Han wrote:
| I am sure there are better ways to achieve the goal. I would suggest that these two be similar if possible. I think the naive expectation is for them to be consistent.
|
|     // [[Rcpp::export]]
|     stuff function(Rcpp::stuff) {
|     }
|
|     // [[Rcpp::export]]
|     stuff function(arma::stuff) {
|     }
|
| Thank you again. Cheers.
[...]
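The proxy-versus-value distinction made in this reply can be sketched in plain C++. The classes below are hypothetical and only illustrate the copy semantics: `ProxyVector` uses a `std::shared_ptr` as an assumed stand-in for a handle to memory owned elsewhere (Rcpp itself wraps a SEXP and does not work this way internally), while `ValueVector` behaves like an ordinary C++ container.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical proxy: copying it copies only a handle to shared storage.
class ProxyVector {
public:
    explicit ProxyVector(std::size_t n)
        : storage_(std::make_shared<std::vector<double>>(n, 0.0)) {}
    double& operator[](std::size_t i) { return (*storage_)[i]; }
    double operator[](std::size_t i) const { return (*storage_)[i]; }
private:
    std::shared_ptr<std::vector<double>> storage_;
};

// Ordinary value class: the implicit copy constructor duplicates the data.
class ValueVector {
public:
    explicit ValueVector(std::size_t n) : data_(n, 0.0) {}
    double& operator[](std::size_t i) { return data_[i]; }
    double operator[](std::size_t i) const { return data_[i]; }
private:
    std::vector<double> data_;
};

// Both functions take their argument by value.
// With the proxy, the write lands in the storage the caller also sees.
void set_first(ProxyVector v, double x) { v[0] = x; }
// With the value class, the write lands in a private copy.
void set_first(ValueVector v, double x) { v[0] = x; }
```

So two signatures that look alike behave differently: passing the proxy by value is cheap and shares data, whereas passing the value class by value pays for a full copy and isolates the caller.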
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 11 July 2013 at 10:33, baptiste auguie wrote:
| Hi,
|
| That's great, thanks for considering this!
|
| Following this discussion, I went to browse through my code looking for wrap()
| and as() statements that could benefit from a speed-up of memory reuse. Of
| course I didn't find any.
| I switched to using Modules when they were introduced, the code being much
| nicer to read, and these conversions only happen behind the scene.
| My c++ functions thus only deal with native Armadillo / C++ objects, and I
| leave it up to the modules to magically do the required conversions in and out.
| It's a brilliant interface, very readable.
|
| From what I understand, however, the resulting code can often lose a factor 2-3
| in speed, compared to the now much more verbose alternative of explicitly
| converting and sharing the memory with this type of code:

No way. I have seen 2 to 3 __per cent__, which is very different from a factor of 2 or 3.

This whole discussion is mostly a non-issue, really, as best as I can tell, because the cost is really not that large.

Dirk

| arma::mat A(M.begin(), M.rows(), M.cols(), false);
|
| From this perspective, the possibility of setting copy_aux_mem to false in as
| (), as used by modules, would be very welcome.
|
| Best regards,
|
| baptiste
[...]

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Everybody has this existing example in their copy of RcppArmadillo.

I am running it here from SVN rather than the installed directory, but this should not make a difference. Machine is my not-overly-powerful thinkpad used for traveling:

    edd@don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$ r fastLm.r
    Loading required package: methods

    Attaching package: ‘Rcpp’

    The following object is masked from ‘package:inline’:

        registerPlugin

                           test replications relative elapsed user.self sys.self
    2         fLmTwoCasts(X, y)         5000    1.000   0.184     0.204    0.164
    1          fLmOneCast(X, y)         5000    1.011   0.186     0.200    0.172
    4   fastLmPureDotCall(X, y)         5000    1.141   0.210     0.236    0.184
    3          fastLmPure(X, y)         5000    2.027   0.373     0.412    0.332
    6              lm.fit(X, y)         5000    2.685   0.494     0.528    0.456
    5 fastLm(frm, data = trees)         5000   36.380   6.694     7.332    6.028
    7     lm(frm, data = trees)         5000   42.734   7.863     8.628    7.068
    edd@don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$

What we are talking about here is the difference between 'fLmTwoCasts' and 'fLmOneCast'. If you use larger objects, the difference will be larger. But the relative differences are tiny.

It would be nice to make this more elegant, and I look forward to Romain's proposals, but methinks that we may well have bigger fish to fry.

Dirk, still in Sydney

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 11 July 2013 at 19:21, Gabor Grothendieck wrote:
| 1. Just to be clear, what we have been discussing here is not just how to
| avoid copying, but how to avoid copying while using as and wrap,
| or approaches that automatically generate as and wrap. I was already
| aware of how to avoid copying using Armadillo and of how to use Armadillo types
| as arguments and return values to autogenerate as and wrap. The problem is
| not that, but that these two things cannot be done at once: it's either-or.

I must still be misunderstanding, as this still reads to me as if you are suspecting that we somehow keep layers making extra copies. We're not. And I've known you long enough to know that you are not likely to suspect this either. So what is it then?

As Romain said, some of the choices have to do with the representation on both the R and C++ side. For Rcpp itself we can be lightweight and efficient via proxy classes, but this does not mean we can do this for _any arbitrary C++ class_ coming from another project, such as Armadillo.

RcppArmadillo already does pretty well, and code review may make it better. We do not know of any fat to cut, or we'd cut it ourselves. We care about a few things, but performance is clearly among them.

| 2. Regarding the question of performance impact, there are two situations
| which should be distinguished:
|
| i. We call C++ from R, it does some processing and then returns, and
| we don't call it again. In that case it's likely that copying or not won't
| make a big difference, or at least it won't if the actual C++ computation
| time is large compared to the time spent in copying.
|
| ii. We factor out the inner loop of the code, recode only that in C++
| and call it many times. In that case the copying is multiplied
| by the number of iterations and might very well have a significant impact.

In case ii) I'd try to use a different design and make it more like i): You generally do not want to call down from R to object code a bazillion times as there is always some overhead, and multiplying even something rather efficient by a veryBigNumber can make small times large in the aggregate.

Dirk

| On Thu, Jul 11, 2013 at 6:55 PM, Dirk Eddelbuettel <e...@debian.org> wrote:
[...]

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com