"Gabriel Diaz" <[EMAIL PROTECTED]> writes:

> I'm taking an overview of the project documentation, and it seems a
> database is the way to go for handling log files on the order of
> gigabytes (normally between 2 and 4 GB per 15-day dump).

> This document, http://cran.r-project.org/doc/manuals/R-data.html,
> says R will load all the data into memory to process it when using
> read.table and the like. Will using a database do the same? Currently
> I have no machine with more than 2 GB of memory.
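On the database question: a database interface lets you pull rows back
in pieces rather than all at once.  A minimal sketch, assuming the DBI
and RSQLite packages and that the dump has already been loaded into an
SQLite table (the file and table names here are placeholders):

  library(DBI)
  library(RSQLite)

  con <- dbConnect(RSQLite::SQLite(), dbname = "logs.sqlite")

  ## Stream the table back in chunks; only one chunk is ever in R's memory.
  res <- dbSendQuery(con, "SELECT * FROM log")
  while (!dbHasCompleted(res)) {
      chunk <- dbFetch(res, n = 100000)   # 100k rows at a time
      ## ... summarise / aggregate the chunk here ...
  }
  dbClearResult(res)
  dbDisconnect(con)

Each pass through the loop holds only one chunk, so the memory ceiling
applies per chunk rather than to the whole dump.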

As for the 2 GB of memory: remember, there's swap too.  Running past
physical RAM costs you time, not a hard limit.

If you're concerned about gross size, then preprocessing could be
useful; but consider: RAM is cheap.  Calibrate RAM purchases
w.r.t. hours of your coding time, -before- you start the project.
Then you can at least mutter to yourself when you waste more than the
cost of core trying to make the problem small. :)
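
For concreteness, a hedged sketch of that sort of preprocessing (the
field positions, separator and column types below are invented for
illustration): trim the dump with standard tools before read.table ever
sees it, and declare the column types so R doesn't have to guess them.

  dat <- read.table(
      pipe("cut -f1,3,7 dump.txt"),       # keep only the fields you need
      header       = FALSE,
      sep          = "\t",
      colClasses   = c("character", "integer", "numeric"),
      comment.char = "",                  # these two also speed up parsing
      quote        = ""
  )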

It's entirely reasonable to do all your development work on a smaller
set, and then dump the real data into it and go home.  Unless you've
got something O(N^2) or so, you should be fine.
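
One way to follow that advice (a sketch; "dump.txt" and the separator
are placeholders for the real log format):

  ## Develop against the first chunk of the dump only.
  dev <- read.table("dump.txt", sep = "\t", nrows = 50000)

  ## ... work out the analysis on 'dev' ...

  ## For the real run, drop nrows (and go home while it churns):
  ## full <- read.table("dump.txt", sep = "\t")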


- Allen S. Rout

