Doran, Harold wrote:
Thanks. But, I think I am doing that. I use rm() and gc() as the code moves along. The datasets are stored as a list. Is there a way that I can save the list object and call each dataset within a list one at a time, or must the entire list be in memory at once?
The list is in memory - and must be to access its elements.
Either save the list elements to separate files, or even better make use of a database.
Uwe Ligges
Harold
-----Original Message-----
From: Uwe Ligges [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 19, 2005 5:51 AM
To: Doran, Harold
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Data Simulation in R
Doran, Harold wrote:
Dear List:
A few weeks ago I posted some questions regarding data simulation and received some very helpful comments, thank you. I have modified my code accordingly and have made some progress.
However, I now am facing a new challenge along similar lines. I am attempting to simulate 250 datasets and then run the data through a linear model. I use rm() and gc() as I move along to clean up the workspace and preserve memory. However, my aim is to use sample sizes of 5,000 and 10,000. By any measure this is a huge task.
My machine has 2GB RAM and a Pentium 4 2.8 GHz machine with Windows
XP.
I have the following in the "target" section of the Windows shortcut --max-mem-size=1812M
With such large samples, R is unable to perform the analysis, at least
with the code I have developed. It halts when it runs out of memory. A
collegue subsequently constructed the simulation using another software program with a similar computer and, while it took over night
(and then some), the program produced the results desired.
I am curious if it is the case that such large simulations are out of the grasp of R or if my code is not adequately organized to perform the simulation.
I would appreciate any thoughts or advice.
Don't hold all datasets (and results, if they are big) in the memory at the same time!!!
Either generate them when you use them and delete them afterwards, or save them to disc an only load one by one for further analyses. Also, you might want to call gc() after you removed large objects...
Uwe Ligges
Harold
library(MASS)
library(nlme)
mu<-c(100,150,200,250)
Sigma<-matrix(c(400,80,80,80,80,400,80,80,80,80,400,80,80,80,80,400),4
,4
)
mu2<-c(0,0,0)
Sigma2<-diag(64,3)
sample.size<-5000
N<-250 #Number of datasets
#Take a single draw from VL distribution vl.error<-mvrnorm(n=N, mu2, Sigma2)
#Step 1 Create Data Data <- lapply(seq(N), function(x) as.data.frame(cbind(1:10,mvrnorm(n=sample.size, mu, Sigma))))
#Step 2 Add Vertical Linking Error for(i in seq(along=Data)){ Data[[i]]$V6 <- Data[[i]]$V2 Data[[i]]$V7 <- Data[[i]]$V3 + vl.error[i,1] Data[[i]]$V8 <- Data[[i]]$V4 + vl.error[i,2] Data[[i]]$V9 <- Data[[i]]$V5 + vl.error[i,3] }
#Step 3 Restructure for Longitudinal Analysis long <- lapply(Data, function(x) reshape(x, idvar="Data[[i]]$V1", varying=list(c(names(Data[[i]])[2:5]),c(names(Data[[i]])[6:9])),
v.names=c("score.1","score.2"), direction="long"))
##################### #Clean up Workspace rm(Data,vl.error) gc() #####################
# Step 4 Run GLS
glsrun1 <- lapply(long, function(x) gls(score.1~I(time-1), data=x, correlation=corAR1(form=~1|V1), method='ML'))
# Extract intercepts and slopes int1 <- sapply(glsrun1, function(x) x$coefficient[1]) slo1 <- sapply(glsrun1, function(x) x$coefficient[2])
################ #Clean up workspace rm(glsrun1) gc()
glsrun2 <- lapply(long, function(x) gls(score.2~I(time-1), data=x, correlation=corAR1(form=~1|V1), method='ML'))
# Extract intercepts and slopes int2 <- sapply(glsrun2, function(x) x$coefficient[1]) slo2 <- sapply(glsrun2, function(x) x$coefficient[2])
#Clean up workspace rm(glsrun2) gc()
# Print Results
cat("Original Standard Errors","\n", "Intercept","\t", sd(int1),"\n","Slope","\t","\t", sd(slo1),"\n")
cat("Modified Standard Errors","\n", "Intercept","\t", sd(int2),"\n","Slope","\t","\t", sd(slo2),"\n")
rm(list=ls()) gc()
[[alternative HTML version deleted]]
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html