I'm working with a 350MB CSV file on a server that has 3GB of RAM, yet I hit
a memory error when I try to store the data frame in a survey design object,
the R object that holds the data for a complex sample survey.
I launch R from Windows with the following command:
"C:\Program Files\R\R-2.9.1\bin\Rgui.exe" --max-mem-size=2047M
Anything higher, and I get an error message saying the maximum has been set
to 2047M.
Here are the commands:
> library(survey)
#this step takes more than five minutes
> data08<-read.csv("data08.csv",header=TRUE,nrows=210437)
> object.size(data08)
#329877112 bytes
#Looking at Windows Task Manager, Mem Usage for Rgui.exe is already 659,632K
> brr.dsgn <- svrepdesign(data = data08,
+     repweights = data08[, grep("^repwgt", colnames(data08))],
+     type = "BRR", combined.weights = TRUE,
+     weights = data08$mainwgt)
#Error: cannot allocate vector of size 254.5 Mb
#The survey design object does not get created.
#This also causes Windows Task Manager, Mem Usage to spike to 1,748,136K
#And here are some memory diagnostics
> memory.limit()
[1] 2047
> memory.size()
[1] 1449.06
> gc()
           used  (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   131148   3.6     593642   15.9  15680924  418.8
Vcells 45479988 347.0  173526492 1324.0 220358611 1681.3
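As a rough sanity check on the figures above, here is a back-of-the-envelope sketch in base R. The inputs are copied from the session; the variable names are only illustrative, and the 8-bytes-per-element figure assumes the failed allocation was a numeric (double) vector, which the error message does not confirm.

```r
# Figures copied from the session above
obj_bytes <- 329877112   # object.size(data08)
alloc_mb  <- 254.5       # size of the allocation that failed, in Mb
n_rows    <- 210437      # nrows passed to read.csv
limit_mb  <- 2047        # memory.limit()
in_use_mb <- 1449.06     # memory.size(), read after the failed call

# R reports Mb as 1024^2 bytes
obj_mb <- obj_bytes / 1024^2

# Assumption: the failed vector holds doubles (8 bytes each)
alloc_doubles <- alloc_mb * 1024^2 / 8

# How many full-length numeric columns that allocation would correspond to
cols_equiv <- alloc_doubles / n_rows

# Nominal headroom under the limit at the time memory.size() was called
headroom_mb <- limit_mb - in_use_mb

cat(sprintf("data frame:        %.1f Mb\n", obj_mb))
cat(sprintf("failed allocation: %.0f doubles (about %.1f columns of %d rows)\n",
            alloc_doubles, cols_equiv, n_rows))
cat(sprintf("nominal headroom:  %.1f Mb\n", headroom_mb))
```

Under these assumptions the failed request is close to the size of the data frame itself, and it is smaller than the nominal headroom. That may mean the problem is a lack of one contiguous block of address space rather than the total limit alone, though memory.size() was read after the failure, so usage at the moment of the error could have been higher.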
A description of the survey package can be found here:
http://faculty.washington.edu/tlumley/survey/
I tried a workaround using the database-backed survey objects (DB SO) that
the survey package includes to conserve memory on larger datasets like this
one. Unfortunately, I don't think the survey package supports database
connections for replicate-weight designs yet: I have only been able to get a
database connection working when creating a svydesign object, not a
svrepdesign object, and neither the DB SO website nor the svrepdesign help
page mentions database connection parameters for replicate-weight designs.
The DB SOs are described in detail here:
http://faculty.washington.edu/tlumley/survey/svy-dbi.html
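For reference, the kind of database-backed svydesign call that does work looks roughly like the sketch below. It is only an illustration under several assumptions: the DBI and RSQLite packages are installed, the toy data frame and the `income` column are hypothetical stand-ins for data08, the SQLite file is a temporary one, and `ids = ~1` (no clustering) is a placeholder because the sampling design is not described here. Only `mainwgt` and the table name `data08` come from the session above.

```r
library(survey)
library(DBI)       # assumed installed
library(RSQLite)   # assumed installed; attached so the "SQLite" driver can be found

# Hypothetical toy data standing in for data08
set.seed(1)
toy <- data.frame(mainwgt = runif(100, 1, 5),
                  income  = rnorm(100, 50, 10))

# Write the data to an SQLite file so the design object
# does not have to hold every column in memory
db.file <- tempfile(fileext = ".db")   # hypothetical database file
con <- dbConnect(SQLite(), db.file)
dbWriteTable(con, "data08", toy)
dbDisconnect(con)

# data= is the table name; dbtype= and dbname= say where to find it.
# ids = ~1 is an assumption, not the real sampling design.
db.dsgn <- svydesign(ids = ~1, weights = ~mainwgt, data = "data08",
                     dbtype = "SQLite", dbname = db.file)

# Analysis variables are read from the database when they are needed
svymean(~income, db.dsgn)
```

This covers only the svydesign case; it does not show a database-backed replicate-weight design.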
Any advice would be truly appreciated.
Thanks,
Anthony Damico