It depends on what you mean by 'handle', but probably not. You'll likely have to split the data into multiple files unless you have some rather high-end hardware. However, in my limited experience, there's almost always a meaningful way to split the data (geographically, or by other categories).

A few things I've learned recently working with large datasets:

1. Store files in .rda format using save() -- the load times are much faster and loading takes up less memory than reading a text file
2. If your data are integers, store them as integers! (a double takes twice the memory of an integer)
3. Don't store character variables in data frames -- use factors
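The three tips above can be sketched in a few lines of R. The column names and values here are purely illustrative, not from the PUMS data:

```r
# A small data frame following the tips above (hypothetical columns):
df <- data.frame(
  age   = c(25L, 40L, 33L),            # tip 2: the L suffix keeps these as integers
  state = factor(c("CT", "NY", "CT"))  # tip 3: a factor instead of a character vector
)

# Tip 1: save in .rda format; load() later restores the object by name.
f <- tempfile(fileext = ".rda")
save(df, file = f)
rm(df)
load(f)  # 'df' is back in the workspace

stopifnot(is.integer(df$age), is.factor(df$state))

# Why tip 2 matters: an integer vector uses about half the memory of a double.
x_int <- sample(100L, 1e6, replace = TRUE)  # ~4 bytes per value
x_dbl <- as.numeric(x_int)                  # ~8 bytes per value
```

With 7 million rows, halving the storage of every numeric column adds up quickly, and factors store each distinct string only once.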


-roger

Thomas W Volscho wrote:
Dear List,
I have some projects where I use enormous datasets.  For instance, the 5% PUMS 
microdata from the Census Bureau.  After deleting cases I may have a dataset 
with 7 million+ rows and 50+ columns.  Will R handle a datafile of this size?  
If so, how?

Thank you in advance,
Tom Volscho

************************************
Thomas W. Volscho
Graduate Student
Dept. of Sociology U-2068
University of Connecticut
Storrs, CT 06269
Phone: (860) 486-3882
http://vm.uconn.edu/~twv00001


______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

