I have been searching all day and most of last night, but can't find any
benchmarks or recommendations on R system requirements for very
large (2-5GB) data sets to help guide our hardware configuration. If
anybody has experience with this that they're willing to share, or can
point me in a direction that might be productive to research, it
would be much appreciated. Specifically: will R simply use as much
memory as the OS makes available to it, without limit? Is there a
multi-threaded version of R, or are there packages that add
multi-threading? Does the core R distribution support 64-bit, and should
I expect to see any difference in how memory is handled under that
version? Is 3GB of memory per 1GB of data a reasonable ballpark?
Our testing thus far has been on a 32-bit Windows box w/1GB of RAM & 1
CPU; it appears to indicate something like 3GB of RAM for every 1GB of
SQL table (excluding indexes, with byte-sized factors). At this point,
we're planning on setting up a dual-core 64-bit Linux box w/16GB of RAM
for starters, since our summed-down SQL tables are generally approx
2-5GB.
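To make the ratio above concrete, here is a rough back-of-the-envelope sketch of how a table's in-memory size could be estimated. It relies on R storing each numeric value in 8 bytes and each integer or factor code in 4 bytes, so a factor held in 1 byte in the SQL table takes 4 bytes in a data frame. The function name estimate_df_bytes and the row and column counts are made up for illustration and are not taken from our tables.

```r
# Rough estimate of a data frame's size in bytes, assuming 8 bytes per
# numeric cell and 4 bytes per integer or factor cell (no overhead counted).
estimate_df_bytes <- function(n_rows, n_numeric, n_integer) {
  n_rows * (8 * n_numeric + 4 * n_integer)
}

# Hypothetical table: 10 million rows, 20 numeric and 30 factor columns.
est <- estimate_df_bytes(1e7, n_numeric = 20, n_integer = 30)
est_gib <- est / 2^30  # estimated size in GiB

# Cross-check the per-row cost on a small synthetic sample.
set.seed(1)
sample_df <- data.frame(
  x = rnorm(1e4),
  f = factor(sample(letters, 1e4, replace = TRUE))
)
bytes_per_row <- as.numeric(object.size(sample_df)) / nrow(sample_df)
```

Measuring a small sample with object.size() and scaling by the row count gives only the size of the finished object; any temporary copies made while loading or modelling come on top of that.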
Here are the details, just for context, in case I'm misinterpreting the
results, or in case there's some more memory-efficient way to get data
into R's binary format than going w/the data.frame.
R session:
> library(RODBC)
> channel <- odbcConnect("psmrd")
> FivePer <- data.frame(sqlQuery(channel, "select * from AUTCombinedWA_BILossCost_5per"))
Error: cannot allocate vector of size 2000 Kb
In addition: Warning messages:
1: Reached total allocation of 1023Mb: see help(memory.size)
2: Reached total allocation of 1023Mb: see help(memory.size)
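One possible way around the allocation error, sketched here as an assumption rather than something we have tested, is to pull the result set in pieces. RODBC's sqlQuery() passes a max argument through to sqlGetResults(), and further calls to sqlGetResults() on the same channel continue reading the same result set. The names fetch_in_chunks and process_chunk and the default chunk size are hypothetical.

```r
# Sketch only: read a query result in chunks rather than all at once.
# fetch_in_chunks and process_chunk are hypothetical names.
fetch_in_chunks <- function(channel, query, process_chunk,
                            chunk_rows = 100000) {
  # The first call runs the query and returns at most chunk_rows rows.
  chunk <- RODBC::sqlQuery(channel, query, max = chunk_rows)
  if (is.character(chunk)) {
    # With the default errors = TRUE, failures come back as messages.
    stop(paste(chunk, collapse = "\n"))
  }
  n_total <- 0
  while (is.data.frame(chunk) && nrow(chunk) > 0) {
    process_chunk(chunk)            # e.g. aggregate, or append to a file
    n_total <- n_total + nrow(chunk)
    # Later calls keep reading from the same open result set.
    chunk <- RODBC::sqlGetResults(channel, max = chunk_rows)
  }
  invisible(n_total)                # number of rows processed
}
```

With the session above, this would be called as fetch_in_chunks(channel, "select * from AUTCombinedWA_BILossCost_5per", function(d) print(dim(d))). Since sqlQuery() already returns a data frame, the extra data.frame() wrapper in the original call is probably not needed either.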
ODBC connection:
Microsoft SQL Server ODBC Driver Version 03.86.1830
Data Source Name: psmrd
Data Source Description:
Server: psmrdcdw01\modeling
Database: OpenSeas_Work1
Language: (Default)
Translate Character Data: Yes
Log Long Running Queries: No
Log Driver Statistics: No
Use Integrated Security: Yes
Use Regional Settings: No
Prepared Statements Option: Drop temporary procedures on disconnect
Use Failover Server: No
Use ANSI Quoted Identifiers: Yes
Use ANSI Null, Paddings and Warnings: Yes
Data Encryption: No
Please be patient, I'm a new R user (or at least I'm trying to be...at
this point I'm mostly a new R-help reader); I'd appreciate being
pointed in the right direction if this isn't the right help list for
this question, or if the question is poorly worded (I did read the
posting guide).
Jill Willie
Open Seas
Safeco Insurance
[EMAIL PROTECTED]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.