Hi,

I am not an expert in such issues (I have never really run into problems with memory size). From what I have read in previous posts on this topic (and there are numerous), the simplest way would be to move to a 64-bit system (Linux, Windows Vista, Windows 7), where the size of objects is limited only by the amount of available memory.
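Just to be concrete, you can check what you are currently working with from within R itself. A minimal sketch; note that memory.limit() and memory.size() are Windows-only, and the numbers will of course differ on your machine:

## is this a 32-bit or a 64-bit build of R?
.Machine$sizeof.pointer   # 4 = 32-bit, 8 = 64-bit
R.version$arch            # e.g. "i386" (32-bit) or "x86_64" (64-bit)

## Windows only: current allocation limit and current usage, in MB
memory.limit()
memory.size()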
There are also some packages for dealing with big data (biglm, ...) or a database approach (sqldf); see the rough sketches at the end of this message. Your version is a bit obsolete, so upgrading could help, though not with your final operation. Sometimes it also helps to rethink why you need such a huge amount of data in memory at once and whether you could use only sampled data for further study.

Regards
Petr

Ralf B <ralf.bie...@gmail.com> wrote on 05.08.2010 11:13:40:

> Thank you for such a careful and thorough analysis of the problem and
> your comparison with your configuration. I very much appreciate it.
> For completeness and (perhaps) further comparison, I have executed
> 'version' and sessionInfo() as well:
>
> > version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status         RC
> major          2
> minor          10.0
> year           2009
> month          10
> day            25
> svn rev        50206
> language       R
> version.string R version 2.10.0 RC (2009-10-25 r50206)
>
> > sessionInfo()
> R version 2.10.0 RC (2009-10-25 r50206)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] splines   stats4    grid      stats     graphics  grDevices utils
> [8] datasets  methods   base
>
> other attached packages:
> [1] flexmix_2.2-7     multcomp_1.1-7    survival_2.35-8   mvtnorm_0.9-9
> [5] modeltools_0.2-16 lattice_0.18-3    car_1.2-16        psych_1.0-88
> [9] nortest_1.0       gplots_2.8.0      caTools_1.10      bitops_1.0-4.1
> [13] gdata_2.8.0       gtools_2.6.2      ggplot2_0.8.7     digest_0.4.2
> [17] reshape_0.8.3     plyr_0.1.9        proto_0.3-8       RJDBC_0.1-5
> [21] rJava_0.8-2       DBI_0.2-5
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.0
>
> > memory.limit()
> [1] 2047
>
> Also, the example I presented was a simplified reproduction of the
> real data structure. My real data structure does not have reused
> vectors. I merely wanted to show the error occurring when processing
> large vectors into data frames and then binding these data frames
> together. I hope this additional information helps. I might add that I
> am running this in StatET under Eclipse with 512 MB of RAM allocated
> to the environment.
>
> Besides adding more memory, can you spot simple ways in which memory
> use could be improved? I know that I am carrying quite a bit of
> baggage. Unfortunately my script is rather comprehensive, and my
> example is really just a simplified part that I created to reproduce
> the problem.
>
> Thanks,
> Ralf
>
> On Thu, Aug 5, 2010 at 4:44 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:
> > Hi
> >
> > r-help-boun...@r-project.org wrote on 05.08.2010 09:53:21:
> >
> >> I am dealing with very large data frames, artificially created with
> >> the following code, that are combined using rbind.
> >>
> >> a <- rnorm(5000000)
> >> b <- rnorm(5000000)
> >> c <- rnorm(5000000)
> >> d <- rnorm(5000000)
> >> first <- data.frame(one=a, two=b, three=c, four=d)
> >> second <- data.frame(one=d, two=c, three=b, four=a)
> >
> > Up to this point there is no error on my system.
> >
> >> version
> >                _
> > platform       i386-pc-mingw32
> > arch           i386
> > os             mingw32
> > system         i386, mingw32
> > status         Under development (unstable)
> > major          2
> > minor          12.0
> > year           2010
> > month          05
> > day            31
> > svn rev        52164
> > language       R
> > version.string R version 2.12.0 Under development (unstable) (2010-05-31
> > r52164)
> >
> >> sessionInfo()
> > R version 2.12.0 Under development (unstable) (2010-05-31 r52164)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> >
> > attached base packages:
> > [1] stats     grDevices datasets  utils     graphics  methods   base
> >
> > other attached packages:
> > [1] lattice_0.18-8 fun_1.0
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.12.0  tools_2.12.0
> >
> >> rbind(first, second)
> >
> > Although first and second are each only roughly 160 MB, their
> > concatenation probably consumes all the remaining memory, since you
> > already have a-d, first and second in memory.
> >
> > Regards
> > Petr
> >
> >> which results in the following error for each of the statements:
> >>
> >> > a <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > b <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > c <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > d <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > first <- data.frame(one=a, two=b, three=c, four=d)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > second <- data.frame(one=d, two=c, three=b, four=a)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > rbind(first, second)
> >>
> >> When running memory.limit() I am getting this:
> >>
> >> memory.limit()
> >> [1] 2047
> >>
> >> Which shows me that I have 2 GB of memory available. What is wrong?
> >> Shouldn't 38 MB be very feasible?
> >>
> >> Best,
> >> Ralf
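P.S. Here are the rough sketches I referred to above. They are untested here, and the file and column names (big.csv, y, x1, x2, chunk1, chunk2) are only placeholders, so take them as an illustration of the approach rather than code to paste in.

## biglm: fit a linear model in pieces, so the whole data set never
## has to sit in memory at once
library(biglm)

fit <- biglm(y ~ x1 + x2, data = chunk1)   # chunk1 = first piece of the data
fit <- update(fit, chunk2)                 # feed in further chunks one by one
summary(fit)

## sqldf: let SQLite read the file and hand back only the rows you need,
## so the full file is never loaded into R
library(sqldf)

sub <- read.csv.sql("big.csv",
                    sql = "select * from file where x1 > 0",
                    header = TRUE, sep = ",")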
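And the bookkeeping point about rbind() as code: drop the intermediate vectors before binding and check object sizes as you go. This is only a sketch of the idea; on a 32-bit build it may still fail, because rbind() needs first, second and the result in memory at the same time, which is why sampling or a 64-bit build remains the robust fix.

a <- rnorm(5000000)
b <- rnorm(5000000)
c <- rnorm(5000000)
d <- rnorm(5000000)
first  <- data.frame(one = a, two = b, three = c, four = d)
second <- data.frame(one = d, two = c, three = b, four = a)

## the vectors are copied into the data frames, so the originals
## can be removed before binding
rm(a, b, c, d)
gc()                                       # give the freed memory back

print(object.size(first), units = "Mb")    # should be on the order of 150 Mb
both <- rbind(first, second)
rm(first, second)
gc()                                       # keep only the combined frame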