A few things I've learned recently working with large datasets:
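1. Store datasets in .rda format using save() and read them back with load() -- in my experience this is much faster and takes less memory than re-reading the raw text file each time
2. If your data are integers, store them as integers! An integer takes 4 bytes in R, a double takes 8
3. Don't store character variables in data frames -- use factors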
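
Here is a rough sketch of the three tips on a small made-up data frame. The column names, the sizes and the temporary file are assumptions for illustration only, not the PUMS data from the question. How much each step helps depends on your R version and your data, so it is worth checking with object.size() and system.time() on your own files.

```r
# Hypothetical toy data standing in for a large survey extract
n <- 100000L
d <- data.frame(
  age   = sample(18:90, n, replace = TRUE),               # integer column
  state = sample(c("CT", "NY", "MA"), n, replace = TRUE), # character column
  stringsAsFactors = FALSE
)

# Tip 2: compare the same values stored as integer and as double
age_dbl <- as.numeric(d$age)
print(object.size(d$age))
print(object.size(age_dbl))

# Tip 3: convert the character column to a factor
d$state <- factor(d$state)

# Tip 1: save in R's binary format, then load it back
f <- tempfile(fileext = ".rda")   # temporary path, for illustration only
save(d, file = f)
rm(d)
load(f)                           # restores the object under its old name, 'd'
str(d)
```

Note that load() puts the object back under the name it was saved with, so it will overwrite anything else called 'd' in your workspace.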
-roger
Thomas W Volscho wrote:
Dear List, I have some projects where I use enormous datasets. For instance, the 5% PUMS microdata from the Census Bureau. After deleting cases I may have a dataset with 7 million+ rows and 50+ columns. Will R handle a datafile of this size? If so, how?
Thank you in advance, Tom Volscho
************************************
Thomas W. Volscho
Graduate Student
Dept. of Sociology U-2068
University of Connecticut
Storrs, CT 06269
Phone: (860) 486-3882
http://vm.uconn.edu/~twv00001
______________________________________________ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
-- Roger D. Peng http://www.biostat.jhsph.edu/~rpeng/
