In addition to Dirk's advice about the biglm package, you may also want to look at the RSQLite and SQLiteDF packages, which may make dealing with the large dataset faster and easier, especially for passing the chunks to bigglm.
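A minimal sketch of the chunked approach, assuming a hypothetical "myfile.csv" with 25 columns whose header includes a response `y` and predictors `x1` and `x2` (the column names, the formula, the binomial family, and which columns are irrelevant are all illustrative assumptions, not from the original post):

```r
## Sketch: fit a GLM on a large CSV in chunks with biglm::bigglm.
## Irrelevant columns are dropped at read time via colClasses = "NULL".
library(biglm)

## Hypothetical layout: keep the first 3 columns, skip the other 22.
classes <- c("numeric", "numeric", "numeric", rep("NULL", 22))

## bigglm() accepts a data *function*: called with reset = TRUE it rewinds,
## with reset = FALSE it returns the next chunk, or NULL when exhausted.
make_chunk_reader <- function(file, classes, chunk_rows = 100000) {
  con <- NULL
  header <- NULL
  function(reset = FALSE) {
    if (reset) {
      if (!is.null(con)) close(con)
      con <<- file(file, open = "r")
      ## Read the header line once so each chunk can be named.
      header <<- strsplit(readLines(con, n = 1), ",")[[1]]
      return(NULL)
    }
    ## read.csv() errors on an exhausted connection; treat that as end-of-data.
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_rows,
               colClasses = classes, col.names = header),
      error = function(e) NULL)
    if (is.null(chunk) || nrow(chunk) == 0) NULL else chunk
  }
}

fit <- bigglm(y ~ x1 + x2,
              data = make_chunk_reader("myfile.csv", classes),
              family = binomial())
summary(fit)
```

The same `colClasses = "NULL"` trick works with a plain `read.csv()` call if the relevant columns alone fit in memory; note that `scan()` does not take a `colClasses` argument (it uses `what` instead), which is likely why the original attempt failed.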
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of André de Boer
> Sent: Wednesday, September 08, 2010 5:27 AM
> To: r-help@r-project.org
> Subject: [R] big data
>
> Hello,
>
> I searched the internet but I didn't find the answer to the following
> problem: I want to do a glm on a csv file consisting of 25 columns and
> 4 million rows. Not all the columns are relevant. My problem is to read
> the data into R, manipulate the data, and then do a glm.
>
> I've tried:
>
> dd <- scan("myfile.csv", colClasses = classes)
> dat <- as.data.frame(dd)
>
> My question is: what is the right way to do this?
> Can someone give me a hint?
>
> Thanks,
> Arend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.