hi.  a little more on this same subject.

let's say one uses RCPP_EXPOSED_CLASS to expose two classes, a and b,
and let's say that b contains some number of a's (as std::vector<a>, say).

then, the following appear to be true:

1.  if you pass one of the a's from a b object to a method of a, the
a object is *copied* before the method is invoked.  so, any changes
the method makes to the a object are not visible in the b object's
copy of the a object.

2.  if you *return* a pointer to an a object from a C++ function, R will
decide that it owns the object, and, if the R code subsequently drops
its reference to the object, the object becomes eligible for garbage
collection.  so, if a method of the b class returns a pointer to one of
the a objects sitting inside it, R will likely garbage collect (free)
memory that it doesn't own.  this will likely result in a crash.

the (too verbosely commented) test case below demonstrates this,
focusing on number 2, but with a slight aside in the direction of number
1.
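
to make number 2 concrete without any R involved, here is a minimal
plain C++ sketch (no Rcpp; the names a, b, get_interior, get_copy,
points_into and consume are all made up for illustration).  it assumes,
as described above, that whoever receives the returned pointer will
eventually delete it.  a pointer into the vector's storage must never be
deleted that way, while a heap-allocated copy can be.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hypothetical stand-ins for the two exposed classes
struct a {
    int value;
    explicit a(int v = 0) : value(v) {}
};

struct b {
    std::vector<a> items;

    // returns a pointer INTO the vector's storage; the vector still owns
    // the element, so the caller must never delete this pointer
    a* get_interior(int rindex) { return &items.at(rindex - 1); }

    // hypothetical alternative: returns a fresh heap copy that the
    // caller owns and may delete
    a* get_copy(int rindex) const { return new a(items.at(rindex - 1)); }
};

// true when p is the address of one of the elements stored in v
bool points_into(const std::vector<a>& v, const a* p) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] == p) return true;
    }
    return false;
}

// plays the part of an owner that frees what it was handed;
// returns the value seen just before freeing
int consume(a* owned) {
    assert(owned != 0);
    int seen = owned->value;
    delete owned;   // only valid for pointers that came from new
    return seen;
}

// runs the scenario and returns true when everything behaves as described
bool demo_ownership() {
    b holder;
    holder.items.push_back(a(1));
    holder.items.push_back(a(2));

    a* interior = holder.get_interior(2);
    bool interior_is_inside = points_into(holder.items, interior);

    a* copy = holder.get_copy(2);
    bool copy_is_outside = !points_into(holder.items, copy);

    int seen = consume(copy);   // fine: copy came from new
    // consume(interior) would be undefined behaviour, so it is not called

    return interior_is_inside && copy_is_outside && seen == 2
        && holder.items.at(1).value == 2;
}
```

the price of the copy is that changes made through the returned object
no longer reach the element inside b, which is number 1 all over again.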

as i mention in the test case, i suspect the R reference class
philosophy is that the unit of sharing is what is returned by a new(),
and even if that has things inside it that look exactly like things
outside it which themselves *are* shared (since they were returned by
their own, "more specific", new) the things inside are *not* shared when
used as parameters.
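
the difference between the two behaviours can be mimicked in plain C++
(again no Rcpp; bundle, add_shared, add_to_vector and via_copy are
invented names).  the sketch assumes the binding layer hands the
function either the real object or a temporary copy of it, which is
only what the behaviour above looks like from the C++ side, not a claim
about how it is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> int_v;

// hypothetical "bundle" holding a vector, like the outer object above
struct bundle {
    int_v list;
};

// receives the real bundle, so the push_back is visible to the caller
void add_shared(bundle* bu, int x) {
    bu->list.push_back(x);
}

// also receives a pointer, but when called through via_copy below the
// pointer is to a temporary copy of the inner vector
void add_to_vector(int_v* v, int x) {
    v->push_back(x);
}

// simulates a binding layer that copies the inner object before the call
void via_copy(const bundle& bu, int x) {
    int_v temp(bu.list);      // the copy
    add_to_vector(&temp, x);  // callee gets a valid pointer, but to the copy
}                             // temp, and the change, are discarded here

// size of the inner list after one add through a copy and one shared add
std::size_t demo_sizes() {
    bundle bu;
    via_copy(bu, 100);        // no visible effect on bu
    assert(bu.list.empty());
    add_shared(&bu, 1);       // visible
    return bu.list.size();
}
```

note that add_to_vector cannot tell the two situations apart: in both
cases it is handed a perfectly good pointer.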

(i suppose a possible rule -- i suppose this with *no* clue as to
implementation -- would be like with C/C++, i.e., that a pointer, and/or
maybe a reference, indicates sharing, though only for cases where
sharing is legal, i.e., reference objects; maybe other objects with a *
or an & would result in a compile-time error?)
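
purely to illustrate what such a rule could look like at compile time,
here is a small sketch in plain C++11.  nothing in it is Rcpp API: the
is_reference_object trait, the share() helper and the two example
structs are hypothetical.  passing a pointer to a type that has not
been marked as a reference object is rejected by static_assert.

```cpp
#include <cassert>
#include <type_traits>

// hypothetical marker: specialise to true for classes that are allowed
// to be shared (i.e. "reference objects")
template <typename T>
struct is_reference_object : std::false_type {};

struct shared_thing { int n; };
struct plain_thing  { int n; };

template <>
struct is_reference_object<shared_thing> : std::true_type {};

// hypothetical entry point for "a pointer means sharing": refuses to
// compile for types not marked as reference objects
template <typename T>
T* share(T* p) {
    static_assert(is_reference_object<T>::value,
                  "only reference objects may be passed by pointer");
    return p;
}

// bumps the shared object through the checked pointer, returns the new n
int bump(shared_thing* s) {
    shared_thing* same = share(s);
    assert(same == s);   // same object, no copy was made
    same->n += 1;
    return s->n;
}

// plain_thing p; share(&p);   // would fail to compile, by design
```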

cheers, Greg
----
// $Id: test7.cc,v 1.6 2013/03/10 01:19:32 minshall Exp minshall $

// so, it seems that returning a reference class object pointer does
// not result in R thinking the object belongs to C++, so R eventually
// garbage collects it.

// in fact, if you *pass* a reference object from inside some other
// reference object in C++ land, that will also be pass-by-value (even
// if you are passing (or, receiving) the object as a pointer -- the
// pointer will be to a copy of the object).  see below clvset()
// (which ends up being pass-by-value) versus ceset() (which ends up
// being pass-by-reference).

// i suspect the R reference class philosophy is that the unit of
// sharing is what is returned by a new(), and even if that has things
// inside it that look exactly like things outside it which *are*
// shared (since they were returned by their own, "more specific",
// new) the thing inside is *not* shared when used as a parameter.

// Martin Morgan (Fred Hutchinson Cancer Research Center) in his 17-18
// November, 2010 slides "Reference Classes" remarks that among the
// challenges remaining for reference classes is "[r]etaining
// copy-on-change illusion when appropriate."

// by the way, if you are running R under gdb, and want to see if your
// parameter is "pass-by-reference" or "pass-by-value", you can set a
// gdb breakpoint at your function; query R about the C++ object:

// > ce
// C++ object <0x1008cea00> of class "cluseval" <0x10400bb70>

// then call your function via the object, and, at the breakpoint, see
// what address is passed for your object to your function:

// ce$ceset(new(cluster,3,4));

// Breakpoint 1, ceset (ce=0x1008cea00, cl={id = 3, type = 4}) at ...

// in this case, we were pass-by-reference, as 0x1008cea00 is the
// same for the R representation of the object, and the address of the
// object passed to ce$ceset.  in this following case, we are
// pass-by-value (the clv parameter of rset is a copy of ce$list)

// > ce$list
// C++ object <0x104023920> of class "cluster_v" <0x10400bd90>

// > ce$list$rset(new(cluster,3,4));
// Breakpoint 1, rset (clv=0x104006f60, cl={id = 3, ncells = 0, type = 4,
// parent = 0}) ...

// (this all assumes your function is receiving pointers, rather than
// objects.  also, note that i changed single- to double-quotes above,
// for ease of cxxfunction copying and pasting.)

// to run, first put the source in inc
// (inc <- <SINGLEQUOTE><SOURCE><SINGLEQUOTE>)
// then do something like:

#if 0

require(inline);
require(Rcpp);

cld <- cxxfunction(,plugin="Rcpp", includes=inc);

clm <- Module("test7_module", getDynLib(cld));
cluster <- clm$cluster;
cluster_v <- clm$cluster_v;
cluseval <- clm$cluseval;

ce <- new(cluseval);

cat("\nnotice the size does not change after clvset-ing a new cluster\n");
cat("this is because clvset is getting a pointer to a *copy* of the object\n");
cat("rather than a pointer to the object sitting inside ce\n");
ce$list$size();
ce$list$clvset(new(cluster,100,101));
ce$list$size();


ce$ceset(new(cluster,1,2));
ce$ceset(new(cluster,3,4));
ce$ceset(new(cluster,5,6));
cat("notice, here the size *has* changed; pass-by-reference has happened\n");
ce$list$size();

cat("on my system, the following never manages to print out \"3\"\n")
gctorture(TRUE);

print(1);
c12 <- ce$list$rget(1);
c34 <- ce$list$rget(2);
c56 <- ce$list$rget(3); 

print(2);
c12 <- ce$list$rget(1);
c34 <- ce$list$rget(2);
c56 <- ce$list$rget(3); 

print(3);
c12 <- ce$list$rget(1);
c34 <- ce$list$rget(2);
c56 <- ce$list$rget(3); 

print(4);
c12 <- ce$list$rget(1);
c34 <- ce$list$rget(2);
c56 <- ce$list$rget(3); 

gctorture(FALSE);

#endif /* 0 */

#include <vector>

#include <R.h>
#include <RcppCommon.h>
#include <Rcpp/S4.h>

#include <Rcpp.h>

        
class cluster {
public:
    int id;
    int type;
public:
    cluster(int _id, int _type) {
        id = _id;
        type = _type;
    }
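    // initialize the members directly; calling cluster(0, 0) in the body
    // would only build (and discard) a temporary
    cluster() : id(0), type(0) {
    }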
};

typedef std::vector<cluster> cluster_v;

// okay, in order to not copy too much data, we create a "bundle"
// class holding big stuff that floats around.
class cluseval {
public:
    cluster_v list;
};

// deal with index mismatch between R and ...
cluster* rget(cluster_v* clv, int rindex) {
    return &clv->at(rindex-1);
}

// this actually uses the reference class, and so is, effectively,
// pass-by-reference
void ceset(cluseval* ce, cluster cl) {
    ce->list.push_back(cl);
}

// this does *not* use the reference class, and so is, effectively,
// pass-by-value.
void clvset(cluster_v* clv, cluster cl) {
    clv->push_back(cl);
}    


RCPP_EXPOSED_CLASS(cluster)
RCPP_EXPOSED_CLASS_NODECL(cluster_v)
RCPP_EXPOSED_CLASS(cluseval)

using namespace Rcpp;
RCPP_MODULE(test7_module) {
    class_<cluseval>("cluseval")
        .constructor()
        .field_readonly("list", &cluseval::list)
        .method("ceset", &ceset);
    class_<cluster>("cluster")
        .constructor()
        .constructor<int,int>()
        .field_readonly("id", &cluster::id)
        .field_readonly("type", &cluster::type);
    class_<cluster_v>("cluster_v")
        .method("size", &cluster_v::size)
        .method("rget", &rget)
        .method("clvset", &clvset);
};