Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Dirk Eddelbuettel

On 13 September 2013 at 17:56, Romain Francois wrote:
| Here is where I am now. To wrap up this function:
[...]
| This is simple and elegant. And now we can pass down references and 
| const references of armadillo matrices from R without performance penalty.
| 
| This makes using RcppArmadillo even more compelling.

Love it. Thanks a bunch for making that change. Really nice.

Dirk

PS Baptiste is AFAIK on vacation

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

Le 13/09/13 14:00, JJ Allaire a écrit :

Is it a big deal that we would cheat on chat reference passing means ?


If you want to implement these sort of semantics I think at a _minimum_
the type should be const & (otherwise it looks like you are going to
actually modify the matrix in place which would appear to bypass the
implicit memory barrier of SEXP). Realize that you won't actually bypass
the memory barrier but it sure looks like you intend to for a reader of
the code.

 Rcpp::RNGScope __rngScope;
 arma::mat& m = Rcpp::as(mSEXP);
 test_ref(m);


It looks like this behavior changed as of rev 4400 when the full_name()
method was introduced. I may not understand the mechanism you
established 100% but to me this generated code looks potentially
problematic if you are taking a reference to a stack variable establish
within the as<> method. My guess is that you have something more
sophisticated going on here and there is no memory problem, however I'd
love to understand things a bit better to be 100% sure there isn't
something to drill into further.


Here is where I am now. To wrap up this function:

// [[Rcpp::export]]
void test_const_ref( const arma::mat& m ){}

This code gets created by the attributes parser:

RcppExport SEXP sourceCpp_71975_test_const_ref(SEXP mSEXP) {
BEGIN_RCPP
{
Rcpp::RNGScope __rngScope;
Rcpp::InputParameter< const arma::mat&> m(mSEXP );
test_const_ref(m);
}
return R_NilValue;
END_RCPP
}

The difference is this line:

Rcpp::InputParameter< const arma::mat&> m(mSEXP );

instead of this line:

const arma::mat& m = Rcpp::as< const arma::mat& >( mSEXP ) ;



The InputParameter template class need to be able to take a SEXP asinput 
and have a conversion operator to the requested type. So the default 
implementation obvisouly used Rcpp::as, this is how the default class is 
implemented:


template 
class InputParameter {
public:
InputParameter(SEXP x_) : x(x_){}

inline operator T() { return as(x) ; }

private:
SEXP x ;
} ;

So we get exactly the same as before. What we gain however is that we 
can redefine InputParameter for other types and we can take advantage of 
its destructor to do something when the InputParameter object goes out 
of scope. Here is how I implemented a custom version for const reference 
to arma::Mat :


template 
class InputParameter< const arma::Mat& > {
public:
typedef const typename arma::Mat& const_reference ;

			InputParameter( SEXP x_ ) : m(x_), mat( m.begin(), m.nrow(), 
m.ncol(), false ){}


inline operator const_reference(){
return mat ;
}

private:
Rcpp::Matrix< Rcpp::traits::r_sexptype_traits::rtype 
> m ;
arma::Mat mat ;
} ;

The arma::mat is a member of InputParameter, constructed via the 
advanced constructor, so using the same memory as the R object, and we 
retrieve a reference to this object with the operator const_reference



This is simple and elegant. And now we can pass down references and 
const references of armadillo matrices from R without performance penalty.


This makes using RcppArmadillo even more compelling.

It leaves the issue of what happens when we return an armadillo matrix. 
At the moment, this still makes a copy of the data. I don't see a way 
around that just yet. If we want to avoid making a copy, we need to 
construct the arma::mat out of R memory and return that R object.


I also have to deal with references and const references of other arma 
types (arma::rowvec, etc ...).


I'm happy to discuss the changes I've made in Rcpp and RcppArmadillo for 
this. For now I've included the version for non const references too, 
but maybe I should not, although it does work perfectly. This is much 
better ythan what we used to have where we would allow passing 
references but still make lots of data copies which sort of goes against 
using references. When I see a function that passes an object by 
reference, I tend to think that calling the function is cheap. Now it is.



I'd specifically would like to hear from Gabor and Baptiste about the 
simplification of being able to just use (const) references as inputs 
and have RcppArmadillo simply borrow memory from the R object :


// [[Rcpp::export]]
arma::mat plus( const arma::mat& m1, const arma::mat& m2){
return m1 + m2 ;
}

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

Le 13/09/13 14:15, Romain Francois a écrit :


But I realize this might be a strech and we can definitely only have
const references. Which is easier to implement anyway and we would not
need the reference counting stuff I was talking about before.


spoke too soon. We would need it otherwise we run into the passing 
reference to a temporary problem.


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

Le 13/09/13 14:00, JJ Allaire a écrit :

Is it a big deal that we would cheat on chat reference passing means ?


If you want to implement these sort of semantics I think at a _minimum_
the type should be const & (otherwise it looks like you are going to
actually modify the matrix in place which would appear to bypass the
implicit memory barrier of SEXP). Realize that you won't actually bypass
the memory barrier but it sure looks like you intend to for a reader of
the code.


arma::mat has the ability to use auxiliary memory. We might want 
something that modifies the underlying memory of the object, e.g.


void double_me( arma::mat& x){
   x += x ;
}

and changes to x be brought back to the R object we pass in.


But I realize this might be a strech and we can definitely only have 
const references. Which is easier to implement anyway and we would not 
need the reference counting stuff I was talking about before.





The arma::mat ctor I'd use enforces memory to be bound to what we pass 
in for the lifetime of the matrix. From the docs;


mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true)

Create a matrix using data from writeable auxiliary memory. By 
default the matrix allocates its own memory and copies data from the 
auxiliary memory (for safety). However, if copy_aux_mem is set to false, 
the matrix will instead directly use the auxiliary memory (ie. no 
copying). This is faster, but can be dangerous unless you know what 
you're doing!


The strict variable comes into effect only if copy_aux_mem is set 
to false (ie. the matrix is directly using auxiliary memory). If strict 
is set to true, the matrix will be bound to the auxiliary memory for its 
lifetime; the number of elements in the matrix can't be changed 
(directly or indirectly). If strict is set to false, the matrix will not 
be bound to the auxiliary memory for its lifetime, ie., the size of the 
matrix can be changed. If the requested number of elements is different 
to the size of the auxiliary memory, new memory will be allocated and 
the auxiliary memory will no longer be used.



 Rcpp::RNGScope __rngScope;
 arma::mat& m = Rcpp::as(mSEXP);
 test_ref(m);


It looks like this behavior changed as of rev 4400 when the full_name()
method was introduced. I may not understand the mechanism you
established 100% but to me this generated code looks potentially
problematic if you are taking a reference to a stack variable establish
within the as<> method.


This was to support additional calling capabilities for classes handled 
by modules. If we have a module exposed class, we don't want to have to 
pass it by value as we used to have to.


That change allowed me to pass the object by reference, by const 
reference, by pointer or by const pointer.


With module objects, what is really stored is a pointer to the object, 
so from a T* we can get T&, const T&, T* and const T*



My guess is that you have something more
sophisticated going on here and there is no memory problem, however I'd
love to understand things a bit better to be 100% sure there isn't
something to drill into further.


What we used to do before is to trim out the const and reference out of 
the parameters, so if we had a function like this:



void foo( const arma::mat& x){
   // do stuff
}

we had an implicit as call, but it was not a call to as< const 
arma::mat&>, it was a call to as< arma::mat >, which was creating a 
copy. So we were implementing pass by reference by using pass by value. 
Not good.


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread JJ Allaire
>
> Is it a big deal that we would cheat on chat reference passing means ?
>

If you want to implement these sort of semantics I think at a _minimum_ the
type should be const & (otherwise it looks like you are going to actually
modify the matrix in place which would appear to bypass the implicit memory
barrier of SEXP). Realize that you won't actually bypass the memory barrier
but it sure looks like you intend to for a reader of the code.



> Rcpp::RNGScope __rngScope;
> arma::mat& m = Rcpp::as(mSEXP);
> test_ref(m);
>

It looks like this behavior changed as of rev 4400 when the full_name()
method was introduced. I may not understand the mechanism you established
100% but to me this generated code looks potentially problematic if you are
taking a reference to a stack variable establish within the as<> method. My
guess is that you have something more sophisticated going on here and there
is no memory problem, however I'd love to understand things a bit better to
be 100% sure there isn't something to drill into further.
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

Hello,

(sorry this had to wait a few months, but now I have an idea about this)

We all love to be able to do that:

#include 
using namespace Rcpp ;

// [[Rcpp::depends("RcppArmadillo")]]

// [[Rcpp::export]]
void do_stuff( arma::mat m ){
// do stuff with m
}

This is really nice. But R does not know about arma::mat, so Rcpp 
creates an arma::mat object out of the numeric matrix it is given. This 
means data copy and therefore it will be inefficient. check this:


   arma::mat 6490.124 6546.9065 6589.4080 6651.9825 8279.696   100
   NumericMatrix3.9414.21955.28907.3570   22.686   100

Those numbers are microseconds that are taken by these functions:

// [[Rcpp::export]]
void test_copy( arma::mat m ){}

// [[Rcpp::export]]
void test_NumericMatrix( NumericMatrix m ){}


So now, you believe me that passing an arma::mat by copy as a parameter 
of a function that is exposed through attributes is expensive, at least 
compared to doing so with a numeric matrix where there is no data copy 
because of Rcpp's design choices.


But we know how to make this more efficient right, we just use the 
advanced constructor in armadillo, the one that uses auxiliary memory 
and we end up with something like this:


// [[Rcpp::export]]
void test_arma_adv( NumericMatrix m_ ){
arma::mat m(m_.begin(), m_.nrow(), m_.ncol(), false) ;
// do stuff with m
}

That's a lot more typing and some of us are frsutrated by this.

I've had a look at this this morning, and I have a partial solution. We 
could pretend that when we use pass by reference with armadillo objects, 
what happens is that behind the scenes the advanced constructor using 
the existing memory is used.


This means implementing versions of as for references and const 
references to arma matrices. This actually means implementing 
specializations of the Exporter class for these.


Here is one right here:

template 
class Exporter< arma::Mat& > {
public:
		typedef typename Rcpp::Matrix< 
Rcpp::traits::r_sexptype_traits::rtype > MATRIX ;


Exporter(SEXP x) : mat(x) {}

arma::Mat& get(){
			arma::Mat* m = new arma::Mat( mat.begin(), mat.nrow(), 
mat.ncol(), false ) ;

pointers.push_back( m ) ;
return *m ;
}

static std::vector< arma::Mat* > pointers ;

private:
MATRIX mat ;
};

With this, we can now have functions like this:

// [[Rcpp::export]]
void test_ref( arma::mat& m ){
m(0,0) = 4 ;
}

call them from R, and let armadillo work in place:

> x <- matrix(rep(1, 16), 4)

> test_ref(x)

> x
 [,1] [,2] [,3] [,4]
[1,]4111
[2,]1111
[3,]1111
[4,]1111


Now the catch is that the static std::vector< arma::Mat* > pointers 
grows by 1 each time one such reference is used and currently I don't 
know of a mechanism to remove the pointers when I no longer need them. I 
have a start of a plan for that, see below.


But the main question I'm asking here:


would it make sense to be able to have functions like this ? :

// [[Rcpp::export]]
void foo( arma::mat& m ){
// do stuff with m, given m shares memory with the
// R matrix it was constructed for
}

Is it a big deal that we would cheat on chat reference passing means ?







JJ, Maybe the attributes parser can help. At the moment this function

// [[Rcpp::export]]
void test_ref( arma::mat& m ){
m(0,0) = 4 ;
}

gets wrapped to this:

RcppExport SEXP sourceCpp_21667_test_ref(SEXP mSEXP) {
BEGIN_RCPP
{
Rcpp::RNGScope __rngScope;
arma::mat& m = Rcpp::as(mSEXP);
test_ref(m);
}
return R_NilValue;
END_RCPP
}

Maybe we can add something extra after the call to test_ref that would 
"let go" of the reference, something like this:


RcppExport SEXP sourceCpp_21667_test_ref(SEXP mSEXP) {
BEGIN_RCPP
{
Rcpp::RNGScope __rngScope;
arma::mat& m = Rcpp::as(mSEXP);
test_ref(m);
Rcpp::release< arma::mat& >( m ) ;
}
return R_NilValue;
END_RCPP
}

Where release would be a dummy most of the time, i.e

template 
void release( const T& x){}

but have specific implementations when needed, e.g. for T = 
arma::Mat& and T = const arma::Mat&


Romain


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

Le 11/07/13 15:22, rom...@r-enthusiasts.com a écrit :


Hello,

This comes up every now and then, I think we can find a syntax to
initiate an arma::mat that would allow what you want.

It is not likely it will come via attributes. The idea is to keep them
simple. The solutions I see below would eventually lead to clutter, and
we are heading in the less clutter direction.

I'll think about it and propose something.

Romain

Le 2013-07-11 14:32, Changi Han a écrit :

Hello,

I think I (superficially) understand the difference betwee