On 3/28/07, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> On 3/28/2007 5:25 PM, Henrik Bengtsson wrote:
> > Hi,
> >
> > when doing as.double() on an object that is already a double, the
> > object seems to be copied internally, doubling the memory requirement.
> > See example below.  Same for as.character() etc.  Is this intended?
> >
> > Example:
> >
> > % R --vanilla
> >> x <- double(1e7)
> >> gc()
> >            used (Mb) gc trigger  (Mb) max used (Mb)
> > Ncells   234019  6.3     467875  12.5   350000  9.4
> > Vcells 10103774 77.1   11476770  87.6 10104223 77.1
> >> x <- as.double(x)
> >> gc()
> >            used (Mb) gc trigger  (Mb) max used  (Mb)
> > Ncells   234113  6.3     467875  12.5   350000   9.4
> > Vcells 10103790 77.1   21354156 163.0 20103818 153.4
> >
> > However, couldn't this easily be avoided by letting as.double() return
> > the object as is if already a double?
>
> as.double calls the internal as.vector, which also strips off
> attributes.  But in the case where the output is identical to the input,
> this does seem like an easy optimization.  I don't know if it would help
> most people, but it might help in the kinds of cases you mention.
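
A minimal sketch of that optimization, written as a standalone helper
rather than as a change to as.double() itself. The name as_double_nocopy
is hypothetical and not part of base R; the only assumption is that a
double vector without attributes is exactly what as.double() would
return anyway, so it can be handed back untouched.

```r
# Hypothetical helper (not part of base R): skip the coercion when the
# input is already a bare double vector, so no coercion copy is needed.
as_double_nocopy <- function(x) {
  if (is.double(x) && is.null(attributes(x))) {
    x              # already a plain double: return it as is
  } else {
    as.double(x)   # otherwise coerce, which also strips attributes
  }
}

x <- double(10)
y <- as_double_nocopy(x)                 # plain double, returned unchanged
z <- as_double_nocopy(1:3)               # integer input is coerced to double
m <- as_double_nocopy(matrix(0, 2, 2))   # dim attribute is stripped
```

Wrapping arguments this way before .Fortran() or .Call() would skip the
coercion step for inputs that are already plain doubles. Whether
as.double() itself makes a copy in that case may depend on the R version.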

What about,

as.double.double <- function(x, ...) {
  if (is.null(attributes(x))) x else NextMethod("as.double", x, ...)
}

and same for as.integer(), as.logical(), as.complex(), as.raw(), and
as.character()?

/Henrik

>
> Duncan Murdoch
>
> >
> > Example:
> >
> > % R --vanilla
> >> as.double.double <- function(x, ...) x
> >> x <- double(1e7)
> >> gc()
> >            used (Mb) gc trigger (Mb) max used (Mb)
> > Ncells   234019  6.3     467875 12.5   350000  9.4
> > Vcells 10103774 77.1   11476770 87.6 10104223 77.1
> >> x <- as.double(x)
> >> gc()
> >            used (Mb) gc trigger (Mb) max used (Mb)
> > Ncells   234028  6.3     467875 12.5   350000  9.4
> > Vcells 10103779 77.1   12130608 92.6 10104223 77.1
> >
> > What's the catch?
> >
> >
> > The reason why I bring it up, is because many (most?) methods are
> > using as.double() etc "just in case" when passing arguments to
> > .Call(), .Fortran() etc, e.g. stats::smooth.spline():
> >
> >     fit <- .Fortran(R_qsbart, as.double(penalty), as.double(dofoff),
> >         x = as.double(xbar), y = as.double(ybar), w = as.double(wbar),
> >         <etc>)
> >
> > Your memory usage is peaking in the actual call and the garbage
> > collector cannot clean it up until after the call.  This seems to be
> > waste of memory, especially when the objects are large (100-1000MBs).
> >
> > Cheers
> >
> > Henrik
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel