In addition to Dirk's advice about the biglm package, you may also want to look at the RSQLite and SQLiteDF packages, which may make dealing with the large dataset faster and easier, especially for passing the chunks to bigglm.
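A minimal sketch of the chunked approach, assuming a hypothetical "myfile.csv" with 25 columns whose header includes a response `y` and predictors `x1` and `x2` (the column names, the formula, the binomial family, and which columns are irrelevant are all illustrative assumptions, not from the original post):

```r
## Sketch: fit a GLM on a large CSV in chunks with biglm::bigglm.
## Irrelevant columns are dropped at read time via colClasses = "NULL".
library(biglm)

## Hypothetical layout: keep the first 3 columns, skip the other 22.
classes <- c("numeric", "numeric", "numeric", rep("NULL", 22))

## bigglm() accepts a data *function*: called with reset = TRUE it rewinds,
## with reset = FALSE it returns the next chunk, or NULL when exhausted.
make_chunk_reader <- function(file, classes, chunk_rows = 100000) {
  con <- NULL
  header <- NULL
  function(reset = FALSE) {
    if (reset) {
      if (!is.null(con)) close(con)
      con <<- file(file, open = "r")
      ## Read the header line once so each chunk can be named.
      header <<- strsplit(readLines(con, n = 1), ",")[[1]]
      return(NULL)
    }
    ## read.csv() errors on an exhausted connection; treat that as end-of-data.
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_rows,
               colClasses = classes, col.names = header),
      error = function(e) NULL)
    if (is.null(chunk) || nrow(chunk) == 0) NULL else chunk
  }
}

fit <- bigglm(y ~ x1 + x2,
              data = make_chunk_reader("myfile.csv", classes),
              family = binomial())
summary(fit)
```

The same `colClasses = "NULL"` trick works with a plain `read.csv()` call if the relevant columns alone fit in memory; note that `scan()` does not take a `colClasses` argument (it uses `what` instead), which is likely why the original attempt failed.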
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of André de Boer
> Sent: Wednesday, September 08, 2010 5:27 AM
> To: r-help@r-project.org
> Subject: [R] big data
>
> Hello,
>
> I searched the internet but I didn't find the answer to the following
> problem: I want to do a glm on a csv file consisting of 25 columns and
> 4 million rows. Not all the columns are relevant. My problem is to read
> the data into R, manipulate the data, and then do a glm.
>
> I've tried:
>
> dd <- scan("myfile.csv", colClasses = classes)
> dat <- as.data.frame(dd)
>
> My question is: what is the right way to do this?
> Can someone give me a hint?
>
> Thanks,
> Arend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.