Re: [Rcpp-devel] Forcing a shallow versus deep copy
On 13 September 2013 at 17:56, Romain Francois wrote: | Here is where I am now. To wrap up this function: [...] | This is simple and elegant. And now we can pass down references and | const references of armadillo matrices from R without performance penalty. | | This makes using RcppArmadillo even more compelling. Love it. Thanks a bunch for making that change. Really nice. Dirk PS Baptiste is AFAIK on vacation -- Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Le 13/09/13 14:00, JJ Allaire a écrit : Is it a big deal that we would cheat on chat reference passing means ? If you want to implement these sort of semantics I think at a _minimum_ the type should be const & (otherwise it looks like you are going to actually modify the matrix in place which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier but it sure looks like you intend to for a reader of the code. Rcpp::RNGScope __rngScope; arma::mat& m = Rcpp::as(mSEXP); test_ref(m); It looks like this behavior changed as of rev 4400 when the full_name() method was introduced. I may not understand the mechanism you established 100% but to me this generated code looks potentially problematic if you are taking a reference to a stack variable establish within the as<> method. My guess is that you have something more sophisticated going on here and there is no memory problem, however I'd love to understand things a bit better to be 100% sure there isn't something to drill into further. Here is where I am now. To wrap up this function: // [[Rcpp::export]] void test_const_ref( const arma::mat& m ){} This code gets created by the attributes parser: RcppExport SEXP sourceCpp_71975_test_const_ref(SEXP mSEXP) { BEGIN_RCPP { Rcpp::RNGScope __rngScope; Rcpp::InputParameter< const arma::mat&> m(mSEXP ); test_const_ref(m); } return R_NilValue; END_RCPP } The difference is this line: Rcpp::InputParameter< const arma::mat&> m(mSEXP ); instead of this line: const arma::mat& m = Rcpp::as< const arma::mat& >( mSEXP ) ; The InputParameter template class need to be able to take a SEXP asinput and have a conversion operator to the requested type. So the default implementation obvisouly used Rcpp::as, this is how the default class is implemented: template class InputParameter { public: InputParameter(SEXP x_) : x(x_){} inline operator T() { return as(x) ; } private: SEXP x ; } ; So we get exactly the same as before. What we gain however is that we can redefine InputParameter for other types and we can take advantage of its destructor to do something when the InputParameter object goes out of scope. Here is how I implemented a custom version for const reference to arma::Mat : template class InputParameter< const arma::Mat& > { public: typedef const typename arma::Mat& const_reference ; InputParameter( SEXP x_ ) : m(x_), mat( m.begin(), m.nrow(), m.ncol(), false ){} inline operator const_reference(){ return mat ; } private: Rcpp::Matrix< Rcpp::traits::r_sexptype_traits::rtype > m ; arma::Mat mat ; } ; The arma::mat is a member of InputParameter, constructed via the advanced constructor, so using the same memory as the R object, and we retrieve a reference to this object with the operator const_reference This is simple and elegant. And now we can pass down references and const references of armadillo matrices from R without performance penalty. This makes using RcppArmadillo even more compelling. It leaves the issue of what happens when we return an armadillo matrix. At the moment, this still makes a copy of the data. I don't see a way around that just yet. If we want to avoid making a copy, we need to construct the arma::mat out of R memory and return that R object. I also have to deal with references and const references of other arma types (arma::rowvec, etc ...). I'm happy to discuss the changes I've made in Rcpp and RcppArmadillo for this. For now I've included the version for non const references too, but maybe I should not, although it does work perfectly. This is much better ythan what we used to have where we would allow passing references but still make lots of data copies which sort of goes against using references. When I see a function that passes an object by reference, I tend to think that calling the function is cheap. Now it is. I'd specifically would like to hear from Gabor and Baptiste about the simplification of being able to just use (const) references as inputs and have RcppArmadillo simply borrow memory from the R object : // [[Rcpp::export]] arma::mat plus( const arma::mat& m1, const arma::mat& m2){ return m1 + m2 ; } Romain -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Le 13/09/13 14:15, Romain Francois a écrit : But I realize this might be a strech and we can definitely only have const references. Which is easier to implement anyway and we would not need the reference counting stuff I was talking about before. spoke too soon. We would need it otherwise we run into the passing reference to a temporary problem. -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Le 13/09/13 14:00, JJ Allaire a écrit : Is it a big deal that we would cheat on chat reference passing means ? If you want to implement these sort of semantics I think at a _minimum_ the type should be const & (otherwise it looks like you are going to actually modify the matrix in place which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier but it sure looks like you intend to for a reader of the code. arma::mat has the ability to use auxiliary memory. We might want something that modifies the underlying memory of the object, e.g. void double_me( arma::mat& x){ x += x ; } and changes to x be brought back to the R object we pass in. But I realize this might be a strech and we can definitely only have const references. Which is easier to implement anyway and we would not need the reference counting stuff I was talking about before. The arma::mat ctor I'd use enforces memory to be bound to what we pass in for the lifetime of the matrix. From the docs; mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true) Create a matrix using data from writeable auxiliary memory. By default the matrix allocates its own memory and copies data from the auxiliary memory (for safety). However, if copy_aux_mem is set to false, the matrix will instead directly use the auxiliary memory (ie. no copying). This is faster, but can be dangerous unless you know what you're doing! The strict variable comes into effect only if copy_aux_mem is set to false (ie. the matrix is directly using auxiliary memory). If strict is set to true, the matrix will be bound to the auxiliary memory for its lifetime; the number of elements in the matrix can't be changed (directly or indirectly). If strict is set to false, the matrix will not be bound to the auxiliary memory for its lifetime, ie., the size of the matrix can be changed. If the requested number of elements is different to the size of the auxiliary memory, new memory will be allocated and the auxiliary memory will no longer be used. Rcpp::RNGScope __rngScope; arma::mat& m = Rcpp::as(mSEXP); test_ref(m); It looks like this behavior changed as of rev 4400 when the full_name() method was introduced. I may not understand the mechanism you established 100% but to me this generated code looks potentially problematic if you are taking a reference to a stack variable establish within the as<> method. This was to support additional calling capabilities for classes handled by modules. If we have a module exposed class, we don't want to have to pass it by value as we used to have to. That change allowed me to pass the object by reference, by const reference, by pointer or by const pointer. With module objects, what is really stored is a pointer to the object, so from a T* we can get T&, const T&, T* and const T* My guess is that you have something more sophisticated going on here and there is no memory problem, however I'd love to understand things a bit better to be 100% sure there isn't something to drill into further. What we used to do before is to trim out the const and reference out of the parameters, so if we had a function like this: void foo( const arma::mat& x){ // do stuff } we had an implicit as call, but it was not a call to as< const arma::mat&>, it was a call to as< arma::mat >, which was creating a copy. So we were implementing pass by reference by using pass by value. Not good. -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
> > Is it a big deal that we would cheat on chat reference passing means ? > If you want to implement these sort of semantics I think at a _minimum_ the type should be const & (otherwise it looks like you are going to actually modify the matrix in place which would appear to bypass the implicit memory barrier of SEXP). Realize that you won't actually bypass the memory barrier but it sure looks like you intend to for a reader of the code. > Rcpp::RNGScope __rngScope; > arma::mat& m = Rcpp::as(mSEXP); > test_ref(m); > It looks like this behavior changed as of rev 4400 when the full_name() method was introduced. I may not understand the mechanism you established 100% but to me this generated code looks potentially problematic if you are taking a reference to a stack variable establish within the as<> method. My guess is that you have something more sophisticated going on here and there is no memory problem, however I'd love to understand things a bit better to be 100% sure there isn't something to drill into further. ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Forcing a shallow versus deep copy
Hello, (sorry this had to wait a few months, but now I have an idea about this) We all love to be able to do that: #include using namespace Rcpp ; // [[Rcpp::depends("RcppArmadillo")]] // [[Rcpp::export]] void do_stuff( arma::mat m ){ // do stuff with m } This is really nice. But R does not know about arma::mat, so Rcpp creates an arma::mat object out of the numeric matrix it is given. This means data copy and therefore it will be inefficient. check this: arma::mat 6490.124 6546.9065 6589.4080 6651.9825 8279.696 100 NumericMatrix3.9414.21955.28907.3570 22.686 100 Those numbers are microseconds that are taken by these functions: // [[Rcpp::export]] void test_copy( arma::mat m ){} // [[Rcpp::export]] void test_NumericMatrix( NumericMatrix m ){} So now, you believe me that passing an arma::mat by copy as a parameter of a function that is exposed through attributes is expensive, at least compared to doing so with a numeric matrix where there is no data copy because of Rcpp's design choices. But we know how to make this more efficient right, we just use the advanced constructor in armadillo, the one that uses auxiliary memory and we end up with something like this: // [[Rcpp::export]] void test_arma_adv( NumericMatrix m_ ){ arma::mat m(m_.begin(), m_.nrow(), m_.ncol(), false) ; // do stuff with m } That's a lot more typing and some of us are frsutrated by this. I've had a look at this this morning, and I have a partial solution. We could pretend that when we use pass by reference with armadillo objects, what happens is that behind the scenes the advanced constructor using the existing memory is used. This means implementing versions of as for references and const references to arma matrices. This actually means implementing specializations of the Exporter class for these. Here is one right here: template class Exporter< arma::Mat& > { public: typedef typename Rcpp::Matrix< Rcpp::traits::r_sexptype_traits::rtype > MATRIX ; Exporter(SEXP x) : mat(x) {} arma::Mat& get(){ arma::Mat* m = new arma::Mat( mat.begin(), mat.nrow(), mat.ncol(), false ) ; pointers.push_back( m ) ; return *m ; } static std::vector< arma::Mat* > pointers ; private: MATRIX mat ; }; With this, we can now have functions like this: // [[Rcpp::export]] void test_ref( arma::mat& m ){ m(0,0) = 4 ; } call them from R, and let armadillo work in place: > x <- matrix(rep(1, 16), 4) > test_ref(x) > x [,1] [,2] [,3] [,4] [1,]4111 [2,]1111 [3,]1111 [4,]1111 Now the catch is that the static std::vector< arma::Mat* > pointers grows by 1 each time one such reference is used and currently I don't know of a mechanism to remove the pointers when I no longer need them. I have a start of a plan for that, see below. But the main question I'm asking here: would it make sense to be able to have functions like this ? : // [[Rcpp::export]] void foo( arma::mat& m ){ // do stuff with m, given m shares memory with the // R matrix it was constructed for } Is it a big deal that we would cheat on chat reference passing means ? JJ, Maybe the attributes parser can help. At the moment this function // [[Rcpp::export]] void test_ref( arma::mat& m ){ m(0,0) = 4 ; } gets wrapped to this: RcppExport SEXP sourceCpp_21667_test_ref(SEXP mSEXP) { BEGIN_RCPP { Rcpp::RNGScope __rngScope; arma::mat& m = Rcpp::as(mSEXP); test_ref(m); } return R_NilValue; END_RCPP } Maybe we can add something extra after the call to test_ref that would "let go" of the reference, something like this: RcppExport SEXP sourceCpp_21667_test_ref(SEXP mSEXP) { BEGIN_RCPP { Rcpp::RNGScope __rngScope; arma::mat& m = Rcpp::as(mSEXP); test_ref(m); Rcpp::release< arma::mat& >( m ) ; } return R_NilValue; END_RCPP } Where release would be a dummy most of the time, i.e template void release( const T& x){} but have specific implementations when needed, e.g. for T = arma::Mat& and T = const arma::Mat& Romain -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 Le 11/07/13 15:22, rom...@r-enthusiasts.com a écrit : Hello, This comes up every now and then, I think we can find a syntax to initiate an arma::mat that would allow what you want. It is not likely it will come via attributes. The idea is to keep them simple. The solutions I see below would eventually lead to clutter, and we are heading in the less clutter direction. I'll think about it and propose something. Romain Le 2013-07-11 14:32, Changi Han a écrit : Hello, I think I (superficially) understand the difference betwee