On 01/03/2010 01:34 AM, Simon Urbanek wrote:
On Jan 2, 2010, at 4:08 PM, Laurent Gautier wrote:
On 1/2/10 8:28 PM, Simon Urbanek wrote:
On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
On 1/2/10 5:56 PM, Duncan Murdoch wrote:
On 02/01/2010 11:36 AM, Laurent Gautier wrote:
[Disclaimer: what is below reflects my understanding from
reading the R source, others will correct where deemed
necessary]
On 1/2/10 12:00 PM, r-devel-requ...@r-project.org wrote:
(...)
Another possibility is to maintain your own list or environment
of objects, and just protect/preserve the list as a whole.
Interesting idea, this would let one perform his/her own
bookkeeping on the list/environment. How does R's garbage collector
check contained objects? (I am thinking of performance here, and
maybe of cyclic references.)
You don't really want to care because the GC is global for all
objects (and cycles are supported by the GC in R) - so whether you
keep it yourself or Preserve/Release is practically irrelevant (the
protection stack is handled separately).
I guess that I'll have to know in order to understand that I don't really want
to care. ;-)
The garbage collector must somehow know if an object is available for
collection (and will have to check whether an object is PROTECTed or not, or
Preserved or not).
I suppose that upon being called the garbage collector will first look into the
PROTECTed and Preserved objects, mark them as unavailable for collection, then
recursively mark objects contained in them.
The GC marks recursively from all known roots: the Preserved list is one of
many, and every element of the protection stack is treated as a root as well
(FWIW the Preserved list and the protection stack are, in that order, the last
two). Since this involves (by definition) all live objects, it doesn't matter
to which other object you assign the node. The only detail is that the
protection stack does not change the generation (since there is no real node
to assign to).
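To make the "mark recursively from all roots" point concrete, here is a minimal toy sketch in C++. It is not R's collector: `Node`, `mark`, `mark_from_roots` and `live_count_when_rooted_at` are hypothetical names invented for illustration, and generations are ignored entirely. It only shows why cycles are harmless and why it does not matter which root an object hangs from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap node: NOT R's SEXPREC, just enough structure to show marking.
struct Node {
    std::vector<Node*> children;  // objects this node references
    bool marked = false;
};

// Mark everything reachable from n. The `marked` check stops the recursion
// on nodes that were already visited, which is what makes cycles harmless.
void mark(Node* n) {
    if (n == nullptr || n->marked) return;
    n->marked = true;
    for (Node* child : n->children) mark(child);
}

// Run one mark phase and return the number of live nodes. `roots` stands
// for every root set at once (Preserved list, protection stack, ...).
std::size_t mark_from_roots(const std::vector<Node*>& roots,
                            std::vector<Node>& heap) {
    for (Node& n : heap) n.marked = false;
    for (Node* r : roots) mark(r);
    std::size_t live = 0;
    for (const Node& n : heap) live += n.marked ? 1 : 0;
    return live;
}

// Hypothetical demo heap: 0 <-> 1 form a cycle, 1 -> 2, node 3 is isolated.
std::size_t live_count_when_rooted_at(std::size_t root_index) {
    std::vector<Node> heap(4);
    heap[0].children = {&heap[1]};
    heap[1].children = {&heap[0], &heap[2]};
    return mark_from_roots({&heap[root_index]}, heap);
}
```

Rooting node 0 or node 1 keeps the same three nodes alive, so in this toy model the choice of root (own list versus Preserve) changes nothing about what survives.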
As for keeping your own list -- if you really care about performance
that much (to be honest in vast majority of cases it doesn't matter)
you have to be careful how you implement it. Technically the fastest
way is a preallocated generic vector, but it really depends on how you
deal with the access, so you can easily perform worse than
Preserve/Release if you're not careful.
Release being of linear complexity, a few thousand Preserved objects
not being an extraordinary situation, and Preserve/Release
cycles being quite frequent, I am starting to mind the performance a bit.
Keeping my own list would let me experiment with various strategies (and
eventually offer
Sure - what I meant is that you have to optimize for one thing or the other so
you have to be careful what you do.
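One way to picture the trade-off is a preallocated table with a free-slot stack, which makes release constant-time at the cost of the caller having to remember a slot handle. The sketch below is plain C++, not R API code: `SlotTable` and `Object` are hypothetical names, and `Object` merely stands in for a SEXP. In R the table would presumably be a single generic vector that is itself preserved once, and growing it would mean allocating and copying a new vector, which this toy glosses over.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Object = const void*;  // stand-in for SEXP (assumption, not the R type)

class SlotTable {
public:
    explicit SlotTable(std::size_t capacity) : slots_(capacity, nullptr) {
        // Every slot starts free; push high indices first so low ones pop first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Store a non-null obj and return its slot handle (amortized O(1)).
    std::size_t preserve(Object obj) {
        if (free_.empty()) grow();
        std::size_t slot = free_.back();
        free_.pop_back();
        slots_[slot] = obj;
        ++used_;
        return slot;
    }

    // Free a slot by handle: O(1), no scan over the other entries.
    void release(std::size_t slot) {
        assert(slot < slots_.size() && slots_[slot] != nullptr);
        slots_[slot] = nullptr;
        free_.push_back(slot);
        --used_;
    }

    std::size_t used() const { return used_; }
    std::size_t capacity() const { return slots_.size(); }

private:
    // Double the table and register the new slots as free.
    void grow() {
        std::size_t old_size = slots_.size();
        std::size_t new_size = old_size == 0 ? 1 : 2 * old_size;
        slots_.resize(new_size, nullptr);
        for (std::size_t i = new_size; i > old_size; --i) free_.push_back(i - 1);
    }

    std::vector<Object> slots_;
    std::vector<std::size_t> free_;
    std::size_t used_ = 0;
};
```

The design optimizes release at the expense of bookkeeping: each wrapper has to carry its handle around, and a stale handle is a bug the table can only catch with an assertion.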
As a side note - the best way (IMHO) to deal with all those issues is
to use external pointers because a) you get very efficient C
finalizers, b) you can directly (and very efficiently) tie the lifespan of
other objects to the same SEXP and c) as guardians they can nicely
track other objects that hold them.
Thanks. I am not sure I follow everything. Are you suggesting that rather
than Preserve-ing/Release-ing a list/environment that would act as a guardian
for several objects, one should use an external pointer (to an arbitrary C
pointer)? In that case, how does one indicate that an external pointer acts as
a container?
Or are you suggesting that rather than Preserve-ing/Release-ing R objects one should use
an external pointer acting as a proxy for a SEXP (argument "prot" in
R_MakeExternalPtr(void *p, SEXP tag, SEXP prot))?
(but in that case the external pointer will itself have to go through
Preserve/Release cycles...)
I was guessing that you use this in conjunction with some C++ magic, not just
plain R objects, and thus you have to deal with two life spans. From the other
messages I think you are dealing with the simple situation of wrapping an R
object as a reference in the other system with explicit memory management (i.e.
in C++ you have an explicit new/delete life cycle), in which case you really don't
need anything more than Preserve. It is more interesting when you want to track
the life of R objects without imposing the life span - i.e. when you want to
know when an object in R is collected such that you can delete it from the
other system (i.e. you don't explicitly retain it by the reference).
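The "track without imposing the life span" case can be illustrated with a toy collector in C++. Nothing here is R API: `Node`, `ToyHeap`, `prot` and `finalizer` are hypothetical names that only mimic the roles of an external pointer's "prot" slot and a C finalizer. In real code one would presumably use R_MakeExternalPtr together with a finalizer registration call such as R_RegisterCFinalizerEx; that mapping is my assumption, not something stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy node; an "external pointer" is a node with a finalizer and a prot slot.
struct Node {
    std::vector<Node*> refs;          // ordinary references to other nodes
    Node* prot = nullptr;             // mimics the "prot" slot: kept alive with us
    std::function<void()> finalizer;  // mimics a C finalizer; empty for plain nodes
    bool marked = false;
};

class ToyHeap {
public:
    Node* alloc() {
        nodes_.push_back(std::make_unique<Node>());
        return nodes_.back().get();
    }

    std::size_t size() const { return nodes_.size(); }

    // One full collection: mark from roots, finalize and drop the rest.
    void collect(const std::vector<Node*>& roots) {
        for (auto& n : nodes_) n->marked = false;
        for (Node* r : roots) mark(r);
        std::vector<std::unique_ptr<Node>> survivors;
        for (auto& n : nodes_) {
            if (n->marked) {
                survivors.push_back(std::move(n));
            } else if (n->finalizer) {
                n->finalizer();  // tell the other system the object is gone
            }
        }
        nodes_ = std::move(survivors);  // unmarked nodes are destroyed here
    }

private:
    static void mark(Node* n) {
        if (n == nullptr || n->marked) return;
        n->marked = true;
        mark(n->prot);  // whatever sits in prot lives as long as n does
        for (Node* r : n->refs) mark(r);
    }

    std::vector<std::unique_ptr<Node>> nodes_;
};
```

While the pointer node is reachable, the node in its `prot` slot survives without any separate preserve call. Once the pointer node becomes unreachable, its finalizer fires and both nodes are reclaimed, which is the notification the other system needs.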
Cheers,
Simon
Many thanks for this clarification. Rcpp is using a jri wannabe
approach. Essentially we have:
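#include <Rinternals.h>

class RObject {
public:
    // Keep x alive for as long as this wrapper exists.
    RObject(SEXP x) : x(x) { R_PreserveObject(x); }
    ~RObject() { R_ReleaseObject(x); }
private:
    SEXP x;
};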
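One thing the class above leaves open is copying: with the compiler-generated copy constructor, two wrappers would share one preserve and release it twice. Below is a minimal sketch of a copy-safe variant. To stay compilable without R it uses a mock registry: `Handle`, `registry`, `mock_preserve` and `mock_release` are hypothetical stand-ins for SEXP and for the R_PreserveObject / R_ReleaseObject calls, and the sketch assumes every preserve is balanced by exactly one release.

```cpp
#include <cassert>
#include <map>
#include <utility>

using Handle = const void*;  // stand-in for SEXP (assumption, not the R type)

// Mock registry counting outstanding preserves per object.
// Real code would call R_PreserveObject / R_ReleaseObject instead.
inline std::map<Handle, int>& registry() {
    static std::map<Handle, int> counts;
    return counts;
}
inline void mock_preserve(Handle h) { ++registry()[h]; }
inline void mock_release(Handle h) {
    if (--registry()[h] == 0) registry().erase(h);
}

class RObject {
public:
    explicit RObject(Handle x) : x_(x) { mock_preserve(x_); }

    // Each copy takes its own preserve, so each destructor releases once.
    RObject(const RObject& other) : x_(other.x_) { mock_preserve(x_); }

    // Copy-and-swap: `other` is a preserved copy; after the swap its
    // destructor releases the handle this object held before.
    RObject& operator=(RObject other) {
        std::swap(x_, other.x_);
        return *this;
    }

    ~RObject() { mock_release(x_); }

    Handle get() const { return x_; }

private:
    Handle x_;
};
```

An alternative with less traffic on the preserve list would be to delete the copy operations and only allow moves; which one fits better depends on how the wrappers are passed around.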
For the record, I'm also going the other way (rJava wannabe) in the CPP
package
(http://romainfrancois.blog.free.fr/index.php?post/2009/12/22/CPP-package-%3A-exposing-C-objects)
that wraps up arbitrary C++ objects (currently stl containers) as
external pointers. You might recognize some patterns here.
> # create the ob