Terry,

On Oct 3, 2011, at 10:32 AM, Terry Therneau wrote:
> I'm looking at memory efficiency for some of the survival code.  The
> following fragment appears in coxph.fit
>     coxfit <- .C("coxfit2", iter=as.integer(maxiter),
>                    as.integer(n),
>                    as.integer(nvar), stime,
>                    sstat,
>                    x= x[sorted,] ,
>                    ...
>
> Does this make a second copy of x to pass to the routine (my
> expectation) or will I end up with 3: x and x[sorted,] in the local
> frame of reference, and another due to dup=TRUE?
>

I'm not sure I'm counting your copies right, but I'd say the latter (although the sorting cannot be technically called a copy ;)). There are 4 distinct, separate objects:

x -> x[sorted,] -> double-array to pass to C -> result vector

If you care about speed, you should definitely use .Call().

Note for debugging: tracemem is actually smart and flags the intermediate memory object created inside .C for passing as a proper duplication even though it is not a real one (no duplicate() involved) since the object is not an R object at all. It then also flags the allocation of the result object as a duplication from the intermediate object, so in summary tracemem gives you the true number of copies.

As far as I remember .C is a legacy left-over from the ancient Fortran interface in original S (it's not really a C interface at all - it is a Fortran interface that happens to not care about source language and C can be used to create Fortran-looking object code) so unless one needs Fortran, one should not be using .C ;). It can be used, but should not be used for anything but maybe didactic purposes IMHO.

Cheers,
Simon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
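
For anyone who wants to see the R-level part of that chain, here is a minimal sketch in plain R. It is not the coxph.fit code: the toy data and the names xs, xs2, size_x and size_xs are assumptions made up for illustration, and it stops before the .C call itself (which would need the compiled coxfit2 routine). It only shows that x[sorted,] is a second full-size matrix next to x, and what a tracemem report looks like on an ordinary duplication.

```r
# Toy stand-ins (assumptions) for the objects used in coxph.fit.
set.seed(42)
n <- 6L
nvar <- 2L
x <- matrix(rnorm(n * nvar), nrow = n, ncol = nvar)
stime <- rexp(n)
sorted <- order(stime)

# The argument x[sorted, ] is evaluated before .C receives it, so a second
# n-by-nvar matrix exists alongside x. Here it is bound to a name so that
# it can be inspected.
xs <- x[sorted, ]

# Both matrices take the same amount of memory.
size_x <- as.numeric(object.size(x))
size_xs <- as.numeric(object.size(xs))

# tracemem() only works if R was built with memory profiling, so check first.
if (capabilities("profmem")) {
  tracemem(xs)      # mark xs; later duplications are reported on the console
  xs2 <- xs         # no copy yet: both names point at the same object
  xs2[1, 1] <- 0    # modifying xs2 forces a duplicate, which tracemem reports
  untracemem(xs)
}
```

Running the same tracemem call on the object passed to .C is how one would check the intermediate and result allocations described above.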