"Gabriel Diaz" <[EMAIL PROTECTED]> writes: > I'm taking an overview to the project documentation, and seems the > database is the way to go to handle log files of GB order (normally > between 2 and 4 GB each 15 day dump).
> In this document, http://cran.r-project.org/doc/manuals/R-data.html, it
> says R will load all data into memory to process it when using
> read.table and such. Will using a database do the same? Well, currently
> I have no machine with more than 2 GB of memory.

Remember that there's swap, too. Running past physical RAM means you're
spending more time, not hitting a hard limit. If you're concerned about
gross size, then preprocessing could be useful; but consider: RAM is
cheap. Calibrate RAM purchases against the hours of your coding time
-before- you start the project. Then you can at least mutter to yourself
when you waste more than the cost of core trying to make the problem
small. :)

It's entirely reasonable to do all your development work on a smaller
set, and then dump the real data into it and go home. Unless you've got
something O(N^2) or so in there, you should be fine.

- Allen S. Rout
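P.S. A rough sketch of both ideas in R, purely as a starting point: the
file name "logfile.txt", the three whitespace-separated columns, and the
choice of the DBI/RSQLite packages for the database side are all
assumptions, so adjust them to your actual log format. The idea is to
prototype on a slice pulled in with read.table(nrows = ...), then load
the full file into SQLite in chunks and let queries hand back only the
summaries you need.

library(DBI)        # assumes the DBI and RSQLite packages are installed
library(RSQLite)

col_types <- c("character", "character", "numeric")  # a guess; match your log

## 1. Prototype on a slice: nrows limits what read.table pulls into memory,
##    and explicit colClasses keeps it from guessing types.
dev_set <- read.table("logfile.txt", header = FALSE, sep = "",
                      nrows = 100000, colClasses = col_types,
                      quote = "", comment.char = "")

## 2. For the real run, push the file into SQLite in chunks ...
con <- dbConnect(RSQLite::SQLite(), "logs.db")
chunk_size <- 500000
skip <- 0
repeat {
  chunk <- tryCatch(
    read.table("logfile.txt", header = FALSE, sep = "",
               skip = skip, nrows = chunk_size,
               colClasses = col_types, quote = "", comment.char = ""),
    error = function(e) NULL)   # read.table errors once it is past end of file
  if (is.null(chunk) || nrow(chunk) == 0) break
  dbWriteTable(con, "logs", chunk, append = TRUE)
  skip <- skip + nrow(chunk)
}

## 3. ... and pull only summaries back into R; the aggregation happens in
##    SQLite, so R never holds the full 2-4 GB at once.
top_hosts <- dbGetQuery(con,
  "SELECT V1 AS host, COUNT(*) AS hits
     FROM logs GROUP BY V1 ORDER BY hits DESC LIMIT 20")
dbDisconnect(con)

Even if you skip the database, specifying colClasses (and nrows for a
first pass) is worth it: it stops read.table from guessing column types,
which is where much of the time and memory goes on files this size.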
