Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread JJ Allaire

> Is it a big deal that we would cheat on what reference passing means?


If you want to implement this sort of semantics I think at a _minimum_ the
type should be const (otherwise it looks like you are going to actually
modify the matrix in place, which would appear to bypass the implicit memory
barrier of SEXP). Realize that you won't actually bypass the memory barrier,
but to a reader of the code it sure looks like you intend to.



>     Rcpp::RNGScope __rngScope;
>     arma::mat& m = Rcpp::as<arma::mat& >(mSEXP);
>     test_ref(m);


It looks like this behavior changed as of rev 4400 when the full_name()
method was introduced. I may not understand the mechanism you established
100% but to me this generated code looks potentially problematic if you are
taking a reference to a stack variable established within the as method. My
guess is that you have something more sophisticated going on here and there
is no memory problem, however I'd love to understand things a bit better to
be 100% sure there isn't something to drill into further.
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

On 13/09/13 14:00, JJ Allaire wrote:

>> Is it a big deal that we would cheat on what reference passing means?
>
> If you want to implement this sort of semantics I think at a _minimum_
> the type should be const (otherwise it looks like you are going to
> actually modify the matrix in place which would appear to bypass the
> implicit memory barrier of SEXP). Realize that you won't actually bypass
> the memory barrier but it sure looks like you intend to for a reader of
> the code.


arma::mat has the ability to use auxiliary memory. We might want 
something that modifies the underlying memory of the object, e.g.


void double_me( arma::mat& x){
   x += x ;
}

and have the changes to x be brought back to the R object we pass in.


But I realize this might be a stretch, and we can definitely restrict 
ourselves to const references. That is easier to implement anyway, and we 
would not need the reference counting stuff I was talking about before.





The arma::mat ctor I'd use enforces memory to be bound to what we pass 
in for the lifetime of the matrix. From the docs:


mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true)

Create a matrix using data from writeable auxiliary memory. By 
default the matrix allocates its own memory and copies data from the 
auxiliary memory (for safety). However, if copy_aux_mem is set to false, 
the matrix will instead directly use the auxiliary memory (ie. no 
copying). This is faster, but can be dangerous unless you know what 
you're doing!


The strict variable comes into effect only if copy_aux_mem is set 
to false (ie. the matrix is directly using auxiliary memory). If strict 
is set to true, the matrix will be bound to the auxiliary memory for its 
lifetime; the number of elements in the matrix can't be changed 
(directly or indirectly). If strict is set to false, the matrix will not 
be bound to the auxiliary memory for its lifetime, ie., the size of the 
matrix can be changed. If the requested number of elements is different 
to the size of the auxiliary memory, new memory will be allocated and 
the auxiliary memory will no longer be used.
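
To make the copy_aux_mem distinction concrete, here is a minimal sketch in plain C++. ToyMat is a hypothetical stand-in written for this illustration, not Armadillo's class; it only mimics the copy-versus-borrow choice and ignores the strict flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ToyMat is a hypothetical stand-in for illustration only; it is not arma::mat.
// It stores a column-major n_rows x n_cols block of doubles.
class ToyMat {
public:
    ToyMat(double* aux_mem, std::size_t n_rows, std::size_t n_cols,
           bool copy_aux_mem = true)
        : n_rows_(n_rows), n_cols_(n_cols), data_(aux_mem) {
        assert(aux_mem != nullptr);
        if (copy_aux_mem) {
            // Safe default: allocate private storage and copy the caller's data.
            owned_.assign(aux_mem, aux_mem + n_rows * n_cols);
            data_ = owned_.data();
        }
        // Otherwise data_ keeps pointing at the caller's buffer (no copy).
    }

    // Copying would leave data_ pointing into another object's storage,
    // so this sketch simply forbids it.
    ToyMat(const ToyMat&) = delete;
    ToyMat& operator=(const ToyMat&) = delete;

    // Element access, column-major like Armadillo and R.
    double& operator()(std::size_t r, std::size_t c) {
        assert(r < n_rows_ && c < n_cols_);
        return data_[r + c * n_rows_];
    }

private:
    std::size_t n_rows_, n_cols_;
    double* data_;
    std::vector<double> owned_;  // stays empty when the memory is borrowed
};
```

With copy_aux_mem set to false, a write through the matrix changes the caller's buffer directly, which is what would be needed for changes to reach the R object; with the default, the caller's buffer is left untouched.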



>     Rcpp::RNGScope __rngScope;
>     arma::mat& m = Rcpp::as<arma::mat& >(mSEXP);
>     test_ref(m);
>
> It looks like this behavior changed as of rev 4400 when the full_name()
> method was introduced. I may not understand the mechanism you
> established 100% but to me this generated code looks potentially
> problematic if you are taking a reference to a stack variable established
> within the as method.


This was to support additional calling capabilities for classes handled 
by modules. If we have a module exposed class, we don't want to have to 
pass it by value as we used to have to.


That change allowed me to pass the object by reference, by const 
reference, by pointer or by const pointer.


With module objects, what is really stored is a pointer to the object, 
so from a T* we can get T&, const T&, T* and const T*.



> My guess is that you have something more
> sophisticated going on here and there is no memory problem, however I'd
> love to understand things a bit better to be 100% sure there isn't
> something to drill into further.


What we used to do before was to strip the const and the reference from 
the parameters, so if we had a function like this:



void foo( const arma::mat& x){
   // do stuff
}

we had an implicit as call, but it was not a call to as<const 
arma::mat&>, it was a call to as<arma::mat>, which was creating a 
copy. So we were implementing pass by reference by using pass by value. 
Not good.


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

On 13/09/13 14:15, Romain Francois wrote:


> But I realize this might be a stretch and we can definitely only have
> const references. Which is easier to implement anyway and we would not
> need the reference counting stuff I was talking about before.


Spoke too soon. We would need it; otherwise we run into the problem of 
passing a reference to a temporary.


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Romain Francois

On 13/09/13 14:00, JJ Allaire wrote:

>> Is it a big deal that we would cheat on what reference passing means?
>
> If you want to implement this sort of semantics I think at a _minimum_
> the type should be const (otherwise it looks like you are going to
> actually modify the matrix in place which would appear to bypass the
> implicit memory barrier of SEXP). Realize that you won't actually bypass
> the memory barrier but it sure looks like you intend to for a reader of
> the code.
>
>>     Rcpp::RNGScope __rngScope;
>>     arma::mat& m = Rcpp::as<arma::mat& >(mSEXP);
>>     test_ref(m);
>
> It looks like this behavior changed as of rev 4400 when the full_name()
> method was introduced. I may not understand the mechanism you
> established 100% but to me this generated code looks potentially
> problematic if you are taking a reference to a stack variable established
> within the as method. My guess is that you have something more
> sophisticated going on here and there is no memory problem, however I'd
> love to understand things a bit better to be 100% sure there isn't
> something to drill into further.


Here is where I am now. To wrap up this function:

// [[Rcpp::export]]
void test_const_ref( const arma::mat& m ){}

This code gets created by the attributes parser:

RcppExport SEXP sourceCpp_71975_test_const_ref(SEXP mSEXP) {
BEGIN_RCPP
    {
        Rcpp::RNGScope __rngScope;
        Rcpp::InputParameter< const arma::mat& > m(mSEXP );
        test_const_ref(m);
    }
    return R_NilValue;
END_RCPP
}

The difference is this line:

Rcpp::InputParameter< const arma::mat& > m(mSEXP );

instead of this line:

const arma::mat& m = Rcpp::as< const arma::mat& >( mSEXP ) ;



The InputParameter template class needs to be able to take a SEXP as input 
and have a conversion operator to the requested type. So the default 
implementation obviously uses Rcpp::as. This is how the default class is 
implemented:


template <typename T>
class InputParameter {
public:
    InputParameter(SEXP x_) : x(x_){}

    inline operator T() { return as<T>(x) ; }

private:
    SEXP x ;
} ;

So we get exactly the same as before. What we gain however is that we 
can specialize InputParameter for other types and we can take advantage of 
its destructor to do something when the InputParameter object goes out 
of scope. Here is how I implemented a custom version for const reference 
to arma::Mat<T>:


template <typename T>
class InputParameter< const arma::Mat<T>& > {
public:
    typedef const typename arma::Mat<T>& const_reference ;

    InputParameter( SEXP x_ ) : m(x_), mat( m.begin(), m.nrow(), m.ncol(), false ){}

    inline operator const_reference(){
        return mat ;
    }

private:
    Rcpp::Matrix< Rcpp::traits::r_sexptype_traits<T>::rtype > m ;
    arma::Mat<T> mat ;
} ;

The arma::mat is a member of InputParameter, constructed via the 
advanced constructor, so it uses the same memory as the R object, and we 
retrieve a reference to this object with the operator const_reference.



This is simple and elegant. And now we can pass down references and 
const references of armadillo matrices from R without performance penalty.
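
The lifetime argument can be sketched without Rcpp or Armadillo. In the plain C++ below, RBuffer, BorrowedMat, total and call_total are hypothetical names invented for the illustration; only the shape of InputParameter follows the class shown above. The point is that the borrowed matrix is a member of the wrapper object, so the const reference handed to the user function never refers to a temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for this sketch:
//   RBuffer     plays the role of the memory owned by the R object (the SEXP)
//   BorrowedMat plays the role of an arma::mat built with copy_aux_mem = false
struct RBuffer {
    std::vector<double> values;
    std::size_t n_rows;
    std::size_t n_cols;
};

class BorrowedMat {
public:
    BorrowedMat(const double* mem, std::size_t n_rows, std::size_t n_cols)
        : mem_(mem), n_elem_(n_rows * n_cols) {
        assert(mem != nullptr || n_elem_ == 0);
    }
    double sum() const {
        double acc = 0.0;
        for (std::size_t i = 0; i < n_elem_; ++i) acc += mem_[i];
        return acc;
    }
private:
    const double* mem_;   // borrowed, never freed here
    std::size_t n_elem_;
};

// Primary template intentionally left undefined in this sketch.
template <typename T> class InputParameter;

// Specialization for const references: the matrix is a member, so the
// reference handed out stays valid for as long as the wrapper object lives.
template <>
class InputParameter<const BorrowedMat&> {
public:
    explicit InputParameter(const RBuffer& x)
        : mat_(x.values.data(), x.n_rows, x.n_cols) {}
    operator const BorrowedMat&() const { return mat_; }
private:
    BorrowedMat mat_;
};

// A user function taking its input by const reference.
double total(const BorrowedMat& m) { return m.sum(); }

// What a generated wrapper might look like: the InputParameter object
// outlives the call, so no reference to a temporary is ever passed.
double call_total(const RBuffer& arg) {
    InputParameter<const BorrowedMat&> m(arg);
    return total(m);  // implicit conversion to const BorrowedMat&
}
```

Because nothing is copied, a later change to the underlying buffer is visible through the reference, which is also why the non-const variant deserves more caution.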


This makes using RcppArmadillo even more compelling.

It leaves the issue of what happens when we return an armadillo matrix. 
At the moment, this still makes a copy of the data. I don't see a way 
around that just yet. If we want to avoid making a copy, we need to 
construct the arma::mat out of R memory and return that R object.


I also have to deal with references and const references of other arma 
types (arma::rowvec, etc ...).


I'm happy to discuss the changes I've made in Rcpp and RcppArmadillo for 
this. For now I've included the version for non const references too, 
but maybe I should not, although it does work perfectly. This is much 
better than what we used to have, where we would allow passing 
references but still make lots of data copies, which sort of goes against 
using references. When I see a function that passes an object by 
reference, I tend to think that calling the function is cheap. Now it is.



I'd specifically like to hear from Gabor and Baptiste about the 
simplification of being able to just use (const) references as inputs 
and have RcppArmadillo simply borrow memory from the R object:


// [[Rcpp::export]]
arma::mat plus( const arma::mat& m1, const arma::mat& m2){
    return m1 + m2 ;
}

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-09-13 Thread Dirk Eddelbuettel

On 13 September 2013 at 17:56, Romain Francois wrote:
| Here is where I am now. To wrap up this function:
[...]
| This is simple and elegant. And now we can pass down references and 
| const references of armadillo matrices from R without performance penalty.
| 
| This makes using RcppArmadillo even more compelling.

Love it. Thanks a bunch for making that change. Really nice.

Dirk

PS Baptiste is AFAIK on vacation

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-12 Thread Gabor Grothendieck
On Fri, Jul 12, 2013 at 1:42 AM, Dirk Eddelbuettel e...@debian.org wrote:

> On 11 July 2013 at 19:21, Gabor Grothendieck wrote:
> | 1. Just to be clear what we have been discussing here is not just how to
> | avoid copying but how to avoid copying while using as and wrap
> | or approaches that automatically generate as and wrap.  I was already
> | aware of how to avoid copying using Armadillo and how to use Armadillo types
> | as arguments and return values to autogen as and wrap.  The problem is
> | not that but that these two things cannot be done at once - it's either/or.
>
> I must still be misunderstanding as this still reads to me as if you are
> suspecting that we somehow keep layers making extra copies.
>
> We're not. And I've known you long enough to know that you are not likely to
> suspect this either.  So what is it then?
>
> As Romain said, some of the choices have to do with the representation on both
> the R and C++ side -- for Rcpp itself we can be lightweight and efficient via
> proxy classes, but this does not mean we can do this for _any arbitrary C++
> class_ coming from another project. As eg Armadillo.  RcppArmadillo already
> does pretty well, and code review may make it better.  We do not know of any
> fat to cut, or we'd cut it ourselves.  We care about a few things, but
> performance is clearly among them.

I think Romain's proposal will clarify this.


> | 2. Regarding the question of performance impact there are two situations
> | which should be distinguished:
> |
> | i. We call C++ from R and it does some processing and then returns and
> | we don't call it again. In that case it's likely that copying or not won't
> | make a big difference or at least it won't if the actual C++ computation
> | time is large compared to the time spent in copying.
> |
> | ii. We factor out the inner loop of the code and only recode that in C++
> | and repeatedly call it many times.  In that case the copying is multiplied
> | by the number of iterations and might very well have a significant impact.
>
> In case ii) I'd try to use a different design and make it more like i): You
> generally do not want to call down from R to object code a bazillion times as
> there is always some overhead, and multiplying even something rather
> efficient by a veryBigNumber can make small times large in the aggregate.

Sure, and sugar, RcppArmadillo and other facilities do make it easier to move
more functionality into C++; nevertheless, it can be the case that a relatively
small amount of R code, repeatedly invoked, is responsible for the performance
hit in a program, and from the viewpoint of reducing complexity and increasing
maintainability it can be desirable to move just that minimum portion to the
C++ side, minimizing the dual-language aspect of the code.  Making the call
overhead as small as one can while retaining the automatic Rcpp features
facilitates this.  If it's not possible in general, then it would still be
nice if it were possible just for Armadillo objects and selected other
situations.


> Dirk


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-12 Thread Changi Han
I apologize if my emails were badly phrased, or disrespectful. No intention
of saying anything was broken, suspicious or wrong.

I second Gabor. His described use case matches mine. The outer loop is an
optimization routine coming from other libraries. Rcpp is used to speed up
the objective, gradient and hessian computations and hence the data is
constantly passed along to all of these functions. Another use case to
consider is recursion with data passed along. A toy example is gib(0) =
values(0); gib(1) = values(1); gib(x) = gib(x-1) + gib(x-2) + values(x).
Values = vector of non negative integers. A naive implementation with aux
memory allocation may cause the number of copies in memory to grow with
exponential order in x.
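
The toy recursion can be written out to show the effect. The sketch below is plain C++ with no Rcpp involved; Values and its copy counter are hypothetical helpers added only to make the copies visible. It counts the total number of copies made over the whole recursion, not the number alive at any one moment.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Values counts how often it is copied; the counter exists only for this demo.
struct Values {
    std::vector<long> data;
    static long copies;
    explicit Values(std::vector<long> d) : data(std::move(d)) {}
    Values(const Values& other) : data(other.data) { ++copies; }
};
long Values::copies = 0;

// Pass by value: every call, including each recursive one, copies the data.
long gib_by_value(Values v, std::size_t x) {
    assert(x < v.data.size());
    if (x < 2) return v.data[x];
    return gib_by_value(v, x - 1) + gib_by_value(v, x - 2) + v.data[x];
}

// Pass by const reference: the same recursion without a single copy.
long gib_by_ref(const Values& v, std::size_t x) {
    assert(x < v.data.size());
    if (x < 2) return v.data[x];
    return gib_by_ref(v, x - 1) + gib_by_ref(v, x - 2) + v.data[x];
}
```

The number of calls, and hence of copies in the by-value version, follows C(x) = 1 + C(x-1) + C(x-2) with C(0) = C(1) = 1, which grows exponentially in x, while the by-reference version never copies.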



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread baptiste auguie
Hi,

That's great, thanks for considering this!

Following this discussion, I went to browse through my code looking for
wrap() and as() statements that could benefit from a speed-up from memory
reuse. Of course I didn't find any.
I switched to using Modules when they were introduced, the code being much
nicer to read, and these conversions only happen behind the scenes.
My C++ functions thus only deal with native Armadillo / C++ objects, and I
leave it up to the modules to magically do the required conversions in and
out. It's a brilliant interface, very readable.

From what I understand, however, the resulting code can often lose a factor
of 2-3 in speed, compared to the now much more verbose alternative of
explicitly converting and sharing the memory with this type of code:

arma::mat A(M.begin(), M.rows(), M.cols(), false);

From this perspective, the possibility of setting copy_aux_mem to false in
as(), as used by modules, would be very welcome.

Best regards,

baptiste


On 11 July 2013 10:22, rom...@r-enthusiasts.com wrote:

> Hello,
>
> This comes up every now and then, I think we can find a syntax to initiate
> an arma::mat that would allow what you want.
>
> It is not likely it will come via attributes. The idea is to keep them
> simple. The solutions I see below would eventually lead to clutter, and we
> are heading in the less clutter direction.
>
> I'll think about it and propose something.
>
> Romain
>
> On 2013-07-11 14:32, Changi Han wrote:
>
>> Hello,
>>
>> I think I (superficially) understand the difference between:
>>
>> // [[Rcpp::export]]
>> double sum1(Rcpp::NumericMatrix M) {
>>     arma::mat A(M.begin(), M.rows(), M.cols(), false);
>>     return sum(sum(A));
>> }
>>
>> // [[Rcpp::export]]
>> double sum2(arma::mat A) {
>>     return sum(sum(A));
>> }
>>
>> Partly out of laziness, partly because sum2 is more elegant, and
>> partly to avoid namespace pollution, I was wondering if there is a way
>> to force a shallow copy in sum2.
>>
>> If not, then may I submit a low priority feature request. An
>> attribute? Something like:
>>
>> // [[Rcpp::export]]
>> double sum2(arma::mat A) {
>>     // [[ Rcpp::shallow ( A ) ]]
>>     return sum(sum(A));
>> }
>>
>> Or (akin to C++11 generalized attributes)
>>
>> // [[Rcpp::export]] { [[ Rcpp::shallow ( A ) ]] }
>> double sum2(arma::mat A) {
>>     return sum(sum(A));
>> }
>>
>> An alternative is to have an argument in sourceCpp that takes a
>> list/vector of objects that are to be shallow or deep copied.
>>
>> For example in sum1, if M is changed within the function before
>> casting to the arma::mat, then it might be cleaner to add M to a
>> list/vector of objects to be deep copied rather than cloning M within
>> sum1: leads to one fewer variable name.
>>
>> Just a thought. I can certainly live with the additional step. As
>> always, thanks for all the Rcpp goodness.
>>
>> Cheers,
>> Changi Han



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread Changi Han
I am sure there are better ways to achieve the goal. I would suggest that
these two be similar if possible. I think the naive expectation is for them
to be consistent.

// [[Rcpp::export]]
stuff function(Rcpp::stuff) {
}

// [[Rcpp::export]]
stuff function(arma::stuff) {
}

Thank you again. Cheers.



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread romain
These __are__ similar. The difference is in the classes themselves. 
Rcpp classes are proxy classes, so the C++ copy mechanism does not apply to 
them. arma classes are proper C++ classes, so C++ semantics apply.
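
A rough analogy in plain C++ shows the difference. ProxyVec and ValueVec are hypothetical classes written for this illustration; Rcpp objects wrap a SEXP rather than a std::shared_ptr, so this mirrors only the copy behaviour, not the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// ProxyVec is a hypothetical proxy: it holds a handle to shared storage,
// loosely analogous to an Rcpp object wrapping a SEXP.
class ProxyVec {
public:
    explicit ProxyVec(std::size_t n)
        : storage_(std::make_shared<std::vector<double>>(n, 0.0)) {}
    double& operator[](std::size_t i) {
        assert(i < storage_->size());
        return (*storage_)[i];
    }
private:
    std::shared_ptr<std::vector<double>> storage_;
};

// ValueVec owns its elements, like an ordinary C++ container.
class ValueVec {
public:
    explicit ValueVec(std::size_t n) : storage_(n, 0.0) {}
    double& operator[](std::size_t i) {
        assert(i < storage_.size());
        return storage_[i];
    }
private:
    std::vector<double> storage_;
};

// Both functions take their argument by value.
void set_first(ProxyVec v, double x) { v[0] = x; }  // caller sees the change
void set_first(ValueVec v, double x) { v[0] = x; }  // only the copy changes
```

Copying the proxy copies a handle, so both copies refer to the same data; copying the value class duplicates the data, which is the ordinary C++ behaviour that applies to arma classes.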


I'm at useR right now, so I can't really work on this. I'll submit at 
least ideas later.


Romain



Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread Dirk Eddelbuettel

On 11 July 2013 at 10:33, baptiste auguie wrote:
| Hi,
| 
| That's great, thanks for considering this!
| 
| Following this discussion, I went to browse through my code looking for wrap()
| and as() statements that could benefit from a speed-up of memory reuse. Of
| course I didn't find any. 
| I switched to using Modules when they were introduced, the code being much
| nicer to read, and these conversions only happen behind the scene.
| My c++ functions thus only deal with native Armadillo / C++ objects, and I
| leave it up to the modules to magically do the required conversions in and
| out. It's a brilliant interface, very readable.
| 
| From what I understand, however, the resulting code can often lose a factor
| 2-3 in speed, compared to the now much more verbose alternative of explicitly
| converting and sharing the memory with this type of code:

No way.

I have seen 2 to 3 __per cent__ which is very different from a factor 2 or 3.

This whole discussion is mostly a non-issue, really, as best as I can tell,
because the cost is really not that large.

Dirk
 
| arma::mat A(M.begin(), M.rows(), M.cols(), false);
| 
| From this perspective, the possibility of setting copy_aux_mem to false in
| as(), as used by modules, would be very welcome.
| 
| Best regards,
| 
| baptiste
| 
| 
-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com


Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread Dirk Eddelbuettel

Everybody has this existing example in their copy of Armadillo. 

I am running it here from SVN rather than the installed directory, but this
should not make a difference. Machine is my not-overly-powerful thinkpad used
for traveling:

edd@don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$ r fastLm.r 
Loading required package: methods

Attaching package: ‘Rcpp’

The following object is masked from ‘package:inline’:

registerPlugin

                       test replications relative elapsed user.self sys.self
2         fLmTwoCasts(X, y)         5000    1.000   0.184     0.204    0.164
1          fLmOneCast(X, y)         5000    1.011   0.186     0.200    0.172
4   fastLmPureDotCall(X, y)         5000    1.141   0.210     0.236    0.184
3          fastLmPure(X, y)         5000    2.027   0.373     0.412    0.332
6              lm.fit(X, y)         5000    2.685   0.494     0.528    0.456
5 fastLm(frm, data = trees)         5000   36.380   6.694     7.332    6.028
7     lm(frm, data = trees)         5000   42.734   7.863     8.628    7.068
edd@don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$ 

What we are talking about here is the difference between 'fLmTwoCasts' and
'fLmOneCast'.  If you use larger objects, the difference will be larger.  But
the relative differences are tiny.

It would be nice to make this more elegant, and I look forward to Romain's
proposals, but methinks that we may well have bigger fish to fry.

Dirk, still in Sydney

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

Re: [Rcpp-devel] Forcing a shallow versus deep copy

2013-07-11 Thread Dirk Eddelbuettel

On 11 July 2013 at 19:21, Gabor Grothendieck wrote:
| 1. Just to be clear, what we have been discussing here is not just how to
| avoid copying but how to avoid copying while using as and wrap
| or approaches that automatically generate as and wrap.  I was already
| aware of how to avoid copying using Armadillo and how to use Armadillo types
| as arguments and return values to autogen as and wrap.  The problem is
| not that, but that these two things cannot be done at once: it's either/or.

I must still be misunderstanding as this still reads to me as if you are
suspecting that we somehow keep layers making extra copies. 

We're not. And I've known you long enough to know that you are not likely to
suspect this either.  So what is it then?

As Romain said, some of the choices have to do with the representation on both
the R and C++ side -- for Rcpp itself we can be lightweight and efficient via
proxy classes, but this does not mean we can do this for _any arbitrary C++
class_ coming from another project, such as Armadillo.  RcppArmadillo already
does pretty well, and code review may make it better.  We do not know of any
fat to cut, or we'd cut it ourselves.  We care about a few things, and
performance is clearly among them.
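
To make the proxy-versus-arbitrary-class point concrete, here is a rough
sketch in plain C++ (no Rcpp or Armadillo needed).  Everything in it is made
up for illustration: VectorProxy stands in for a lightweight proxy over memory
that R owns, OwningVector stands in for a class from another project that
insists on owning its storage, and double_in_place is just a helper.  It is
not how Rcpp or RcppArmadillo are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical proxy: stores only a pointer and a length, so wrapping
// existing memory costs no copy and writes go to the original buffer.
class VectorProxy {
public:
    VectorProxy(double* data, std::size_t n) : data_(data), n_(n) {}
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return n_; }
private:
    double* data_;    // borrowed, not owned
    std::size_t n_;
};

// Hypothetical third-party style class: it owns its storage, so the only
// way to build one from foreign memory is a deep copy.
class OwningVector {
public:
    OwningVector(const double* data, std::size_t n) : values_(data, data + n) {}
    double& operator[](std::size_t i) { return values_[i]; }
    std::size_t size() const { return values_.size(); }
private:
    std::vector<double> values_;   // private copy of the input
};

// Doubles every element through whichever wrapper is passed in.
template <typename V>
void double_in_place(V& v) {
    for (std::size_t i = 0; i < v.size(); ++i) v[i] *= 2.0;
}
```

Calling double_in_place on a VectorProxy changes the caller's buffer, whereas
calling it on an OwningVector only changes the private copy.  Whether a given
external class can be used the first way depends on whether it offers a
constructor that borrows memory, which is up to that class and not to us.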
 
| 2. Regarding the question of performance impact there are two situations
| which should be distinguished:
| 
| i. We call C++ from R and it does some processing and then returns and
| we don't call it again. In that case it's likely that copying or not won't
| make a big difference, or at least it won't if the actual C++ computation
| time is large compared to the time spent in copying.
| 
| ii. We factor out the inner loop of the code and only recode that in C++
| and repeatedly call it many times.  In that case the copying is multiplied
| by the number of iterations and might very well have a significant impact.

In case ii) I'd try to use a different design and make it more like i): You
generally do not want to call down from R to object code a bazillion times as
there is always some overhead, and multiplying even something rather
efficient by a veryBigNumber can make small times large in the aggregate.
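
As a rough illustration of that redesign, here is a sketch in plain C++ under
stated assumptions: passing a std::vector by value stands in for whatever
per-call cost the R-to-C++ boundary carries, and g_boundary_copies,
column_sum, loop_outside and loop_inside are all made-up names.  It models the
counting argument only and says nothing about actual timings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often the data crosses the (hypothetical) boundary by copy.
static std::size_t g_boundary_copies = 0;

// Stand-in for one call from R: the by-value argument is copied per call.
double column_sum(std::vector<double> x) {
    ++g_boundary_copies;
    double s = 0.0;
    for (double v : x) s += v;
    return s;
}

// Case ii): the loop lives outside, so the boundary is crossed (and the
// data copied) once per iteration.
double loop_outside(const std::vector<double>& x, int iterations) {
    double total = 0.0;
    for (int i = 0; i < iterations; ++i) total += column_sum(x);
    return total;
}

// Case i): one crossing, with the whole loop on the C++ side.
double loop_inside(std::vector<double> x, int iterations) {
    ++g_boundary_copies;
    double total = 0.0;
    for (int i = 0; i < iterations; ++i) {
        double s = 0.0;
        for (double v : x) s += v;
        total += s;
    }
    return total;
}
```

Both functions compute the same total; loop_outside pays the copy once per
iteration while loop_inside pays it once overall.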

Dirk


-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com