On Sat, Feb 5, 2011 at 9:52 AM, Douglas Bates <ba...@stat.wisc.edu> wrote:
> As Dirk and Romain know I have been struggling to debug a memory
> protection problem that I encounter in code based on Rcpp Modules.  As
> with all memory protection problems, it is very difficult to track
> down and I have kind-of run out of options right now.
>
> I plan, for the time being, to use code that acts like the
> module-based code without going through the modules.  I plan to create
> the object in C++ and pass back an external pointer that will be used
> to locate the object for later method calls.  Perhaps it is lack of
> imagination but I haven't thought of a way to construct an object and
> make it persistent until I decide to release the pointer.  The best
> way I have been able to derive is to put the C++ object on a global
> stack but that approach screams "error-prone".
>
> My question may be somewhat vaguely stated.  What I am trying to avoid
> is creating an object in C++ and returning an Rcpp::XPtr then, on
> return to R, having the C++ object go out of scope so the external
> pointer's target is gone.  Somehow I need to hold on to the C++ object
> until the XPtr object is garbage collected.
>
> Suggestions welcome.
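
The lifetime question can be made concrete with a small sketch in plain
C++. It uses no R or Rcpp headers: Model, Handle, make_on_stack,
make_on_heap and collect are hypothetical names, and Handle is only a
stand-in for an R external pointer with a registered finalizer. The
point is that a heap object survives the return to the caller and is
destroyed only when whoever owns the handle runs the finalizer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Log of lifetime events, so the order of construction and destruction
// can be inspected afterwards.
static std::vector<std::string> events;

struct Model {
    explicit Model(const std::string& n) : name(n) {
        events.push_back("ctor " + name);
    }
    ~Model() { events.push_back("dtor " + name); }
    std::string name;
};

// Hypothetical stand-in for an R external pointer: an address plus a
// finalizer that the owner (R's collector, in the real case) runs once.
struct Handle {
    void* addr;
    void (*finalizer)(void*);
};

static void delete_model(void* p) { delete static_cast<Model*>(p); }

// Stack object: destroyed when the function returns, so its address
// must never be handed out.
void make_on_stack() {
    Model m("stack");
}

// Heap object: survives the return. Responsibility for deleting it
// moves to whoever holds the handle.
Handle make_on_heap() {
    Model* m = new Model("heap");
    return Handle{m, delete_model};
}

// Stand-in for the moment the collector reclaims the handle.
void collect(Handle& h) {
    assert(h.finalizer != nullptr);
    if (h.addr != nullptr) {
        h.finalizer(h.addr);
        h.addr = nullptr;  // guard against a second finalization
    }
}
```

In real Rcpp code the same handoff would, as far as I know, be written
as Rcpp::XPtr<Model>(new Model(...), true), where the second argument
asks for a delete finalizer. The sketch above does not depend on that.
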
It might be helpful to sketch what I think are the basic assumptions
behind XPtr-based implementations of C++ persistence in the R
environment. Input from experts on R internals and garbage collection
algorithms would be welcome here.

I assume that R uses C's malloc to grab chunks of memory as needed and
manages this memory with its garbage collector. C++, on the other hand,
uses new to allocate objects on the heap, and a pointer to this memory
is typically stored in an R external pointer so that R has a handle to
it. R can pass this handle as one of the parameters in a function call
to C++ so that the called code can manipulate the C++ object pointed
to. This quickly gets tedious, and an important contribution of Rcpp
modules is to automate some of the steps by using the C++ compiler (and
templates) to capture type information that would otherwise have to be
specified by hand.

As you mention, all stack-based objects created during the C++ function
call are destroyed when it returns to R. Heap-based objects are not,
which is a common source of memory leaks. In this case it is not a
leak: we have simply passed the responsibility for managing the
lifetime of the objects pointed to by external pointers to the R side.

An Rcpp application has the following dependencies:

    R's gc'ed memory <-- Rcpp --> C++'s heap memory (and transient stack memory)

A fundamental assumption is that R's managed memory will not collide
with C++'s heap memory. Since R ultimately gets its memory from the OS
using C/malloc, and C++ gets its heap memory from the OS using new, the
allocators and the OS should keep these memory areas separate. Thus the
right-pointing arrow above is unlikely to lead to problems. The
left-pointing arrow, on the other hand, arises because C++ applications
can have many pointers to R's managed memory, all of which must be
protected from the garbage collection process.
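
The left-pointing arrow can be illustrated with a toy model. This is a
sketch under loud assumptions: ToyHeap is NOT R's collector, ToyObject
and Holder are hypothetical names, and preserve/release only mimic the
roles of R_PreserveObject/R_ReleaseObject (the toy uses a set, so
repeated preserves of one object are not counted). It shows a C++ heap
object keeping its target alive across collections, even though no C++
function is running in between.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Toy managed object; stands in for a value living in managed memory.
struct ToyObject { int value; };

// Toy managed heap: an object survives collect() only if it has been
// preserved. This is a deliberately crude model, not R's algorithm.
class ToyHeap {
public:
    ToyObject* allocate(int v) {
        objects_.push_back(std::unique_ptr<ToyObject>(new ToyObject{v}));
        return objects_.back().get();
    }
    void preserve(ToyObject* p) { preserved_.insert(p); }  // cf. R_PreserveObject
    void release(ToyObject* p) { preserved_.erase(p); }    // cf. R_ReleaseObject
    std::size_t size() const { return objects_.size(); }

    // Free every object that is not preserved; return the number freed.
    std::size_t collect() {
        std::size_t freed = 0;
        std::vector<std::unique_ptr<ToyObject>> kept;
        for (auto& o : objects_) {
            if (preserved_.count(o.get()) != 0) kept.push_back(std::move(o));
            else ++freed;
        }
        objects_.swap(kept);  // unpreserved objects are deleted with 'kept'
        return freed;
    }

private:
    std::vector<std::unique_ptr<ToyObject>> objects_;
    std::set<ToyObject*> preserved_;
};

// A C++ object that holds a pointer into the managed heap. It preserves
// the target in its constructor and releases it in its destructor, so
// the protection lasts exactly as long as the Holder does.
class Holder {
public:
    Holder(ToyHeap& heap, ToyObject* target) : heap_(heap), target_(target) {
        assert(target_ != nullptr);
        heap_.preserve(target_);
    }
    ~Holder() { heap_.release(target_); }
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    int value() const { return target_->value; }

private:
    ToyHeap& heap_;
    ToyObject* target_;
};
```

A Holder created with new and left on the C++ heap keeps its target
through any number of collect() calls; once the Holder is deleted, the
next collect() is free to reclaim the target.
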
In particular, objects allocated on the C++ heap that are not
necessarily part of any running C++ code may hold pointers into R's
managed memory.

Here are some basic assumptions underlying this strategy:

1. R's memory management does not use any non-standard features that
   would break the compatibility between C/malloc and C++/new.

2. Mixed use of PROTECT/UNPROTECT and R_PreserveObject/R_ReleaseObject
   is safe. (The latter pair are currently undocumented.)

3. Once a pointer to R's memory is protected, this memory is not moved
   by the garbage collector while a C++ function holds a pointer to it.

It is interesting to observe that the Windows .NET Framework includes
the kind of C++ interface that we are talking about here. In
particular, C++ code can point to objects in the CLR (Common Language
Runtime; think Java, or R) through what are called tracking handles.
These handles are automatically updated when the object pointed to is
moved by the garbage collector, which addresses a problem like the one
described in item 3 above.

If I recall correctly, Brian Ripley has warned that R's memory
management may not play well with the C++ memory allocator (or even
with C/malloc). It would be helpful to know precisely what his concerns
were.

Dominick

> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel@lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel