Terry,

On Oct 3, 2011, at 10:32 AM, Terry Therneau wrote:

> I'm looking at memory efficiency for some of the survival code.  The
> following fragment appears in coxph.fit
>    coxfit <- .C("coxfit2", iter=as.integer(maxiter),
>                   as.integer(n),
>                   as.integer(nvar), stime,
>                   sstat,
>                   x= x[sorted,] ,
>             ...
> 
> Does this make a second copy of x to pass to the routine (my
> expectation) or will I end up with 3: x and x[sorted,] in the local
> frame of reference, and another due to dup=TRUE?
> 

I'm not sure I'm counting your copies the same way, but I'd say the latter 
(although the reordering x[sorted,] cannot technically be called a copy ;)).
There are 4 distinct, separate objects:
x -> x[sorted,] -> double-array to pass to C -> result vector
If you care about speed, you should definitely use .Call().
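
A minimal sketch, not taken from coxph.fit or coxfit2: with .Call() the C code 
reads R's own memory instead of a converted copy, so one way to also avoid the 
x[sorted,] object would be to pass x and the ordering separately and index in 
place. The function name cumsum_in_order and the running-sum computation are 
made up for illustration, and the ordering is assumed to be 0-based (i.e. 
sorted - 1L on the R side). The function is kept free of R headers so it 
compiles on its own; in a real .Call() routine the pointers would come from 
REAL(x) and INTEGER(sorted).

```c
#include <stddef.h>

/*
 * Running sums down each column of x, visiting the rows in the order given
 * by `sorted`, without ever materializing x[sorted, ].
 *
 *   x      : n x p matrix of doubles in column-major order (R's layout)
 *   sorted : row order, length n, ASSUMED 0-based (R indices minus 1)
 *   out    : n x p matrix, column-major, allocated by the caller
 *
 * Hypothetical example routine; it only illustrates the indexing pattern.
 */
void cumsum_in_order(const double *x, const int *sorted,
                     int n, int p, double *out)
{
    for (int j = 0; j < p; j++) {
        const double *col = x + (size_t)j * n;    /* start of column j in x   */
        double *ocol = out + (size_t)j * n;       /* start of column j in out */
        double running = 0.0;

        for (int i = 0; i < n; i++) {
            running += col[sorted[i]];            /* row sorted[i], column j  */
            ocol[i] = running;
        }
    }
}
```

Done this way, the only large objects are x itself and the result; whether the 
extra indirection is worth it depends on how often the routine walks the data.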

Note for debugging: tracemem is actually smart and flags the intermediate 
memory object created inside .C for passing as a proper duplication even though 
it is not a real one (no duplicate() involved) since the object is not an R 
object at all. It then also flags the allocation of the result object as a 
duplication from the intermediate object, so in summary tracemem gives you the 
true number of copies.

As far as I remember, .C is a legacy leftover from the ancient Fortran 
interface in the original S. It is not really a C interface at all: it is a 
Fortran interface that happens not to care about the source language, and C 
can be used to create Fortran-looking object code. So unless one needs 
Fortran, one should not be using .C ;). It can be used, but IMHO it should not 
be used for anything but maybe didactic purposes.

Cheers,
Simon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
