I am using environments to avoid making copies (by keeping references), but a hidden copy seems to be made somewhere. For example, in the code fragment below I create a reference to "y" (about 560MB) and store the reference in the object "data". When I save "data" and then restore it in another R session, gc() reports twice that amount of memory in use. Where and how is this happening?


Thanks for any help in working around this. My datasets do not fit into memory on my 4GB, 32-bit Linux machine, even though the actual data size is only around 800MB.

Nawaaz

> new.ref <- function(value = NULL) {
+     ref <- list(env = new.env())
+     class(ref) <- "refObject"
+     assign("value", value, envir = ref$env)
+     ref
+ }
> object.size(y)
[1] 587941404
> y.ref = new.ref(y)
> object.size(y.ref)
[1] 328
> data = list()
> data$y.ref = y.ref
> object.size(data)
[1] 492
> save(data, file = "data.RData")

...

run R again
===========

> load("data.RData")
> gc()
            used   (Mb) gc trigger   (Mb)
Ncells    141051    3.8     350000    9.4
Vcells 147037925 1121.9  147390241 1124.5
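
One possible lead, offered as a sketch rather than a confirmed diagnosis: new.env() defaults to parent = parent.frame(), so the environment created inside new.ref is enclosed by the call frame of new.ref, and that frame still holds "value" and "ref". Serialization follows enclosures, so the frame may be written out along with the environment. The snippet below uses a small stand-in for "y" and compares serialized sizes; new.ref.global is a hypothetical variant (not part of the code above) that sets the enclosure to the global environment.

```r
# Same constructor as in the post: new.env() defaults to
# parent = parent.frame(), i.e. the call frame of new.ref.
new.ref <- function(value = NULL) {
    ref <- list(env = new.env())
    class(ref) <- "refObject"
    assign("value", value, envir = ref$env)
    ref
}

# Hypothetical variant: the enclosure is set explicitly to the global
# environment, so no function call frame is attached to the reference.
new.ref.global <- function(value = NULL) {
    ref <- list(env = new.env(parent = globalenv()))
    class(ref) <- "refObject"
    assign("value", value, envir = ref$env)
    ref
}

y <- numeric(1e5)  # small stand-in for the large object

# Names bound in the enclosure of the stored environment
y.ref <- new.ref(y)
encl.names <- ls(parent.env(y.ref$env))
print(encl.names)

# Serialized size in bytes for each constructor
size.default <- length(serialize(y.ref, NULL))
size.global <- length(serialize(new.ref.global(y), NULL))
print(c(default = size.default, global = size.global))
```

If the default version serializes to roughly twice the size of the variant, the extra copy is probably the "value" binding in the function frame, and passing an explicit parent to new.env() may be enough to avoid it.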

