Ben:

On Thu, Mar 29, 2012 at 5:41 AM, Ben Bolker <bbol...@gmail.com> wrote:
> <abigailclifton <at> me.com> writes:
>
>> I am trying to fit a generalised linear model to some loan
>> application and default data. The purpose of this is to eventually
>> work out the probability an applicant will default.
>
>> However, R seems to crash or die when I run "glm" on anything
>> greater than a 5-way saturated model for my data.
>
> What does "crash or die" mean? Are you getting error messages?
> What are they? Is the R application actually quitting?
>
>> My first question: is the best way to fit a generalised linear model
>> in R to fit the saturated model and extract the significant terms
>> only, or to start at the null model and to work up to the optimum
>> one?
>
> This is more of a statistical practice question than an R question.
> Opinions differ

Well, to clarify: I do not think opinions differ on the first
proposal -- reduce the model to only the significant terms. This should
**not** be done.

I also would say (more tentatively) that modern practice rejects the
notion of an "optimum" model to begin with, preferring shrinkage or
other methodology.

Cheers,
Bert

> but in general I would say if it is computationally
> feasible that you should start (and maybe finish) with the
> full model.
>
>> I am importing a csv file with 3500 rows and 27 columns (3500x27 matrix).
>
>> My second question: is there any way to increase the memory
>> I have so R can cope with more analysis?
>
> help("Memory-limits")
>>
>> I can send my code if it would help to answer the question.
>
> Please read the posting guide (link at the bottom of every R-help
> posting) and follow its advice. We don't know enough about your
> situation to help. You could also try reading
> http://tinyurl.com/reproducible-000 ...
>
> This works for me:
>
> z <- matrix(rnorm(3500*27),ncol=27)
> y <- sample(0:1,replace=TRUE,size=3500)
> colnames(z) <- c(letters,"A")
> d <- data.frame(y=y,z)
> gg <- glm(y~.,data=d,family="binomial")
> gg <- glm(y~a*b*c*d*e*f*g*h,data=d,family="binomial")
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
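
For what it is worth, a rough sketch of why models beyond a few interaction orders can become heavy. It only counts coefficients; it does not reproduce the original poster's problem, which we have not seen. The helper name n_coef is made up for this illustration, and the figures assume that all 27 columns enter as single-column numeric predictors (factors with more than two levels would add more columns) and that the model matrix is stored densely in double precision.

```r
# Number of coefficients (intercept included) in a model containing all
# interactions up to order `order` among p single-column predictors.
# n_coef is a hypothetical helper written for this sketch.
n_coef <- function(p, order) sum(choose(p, 0:order))

# The quoted y ~ a*b*c*d*e*f*g*h fit: all interactions among 8 predictors.
n_coef(8, 8)   # 2^8 = 256

# Assumed scenario: 3500 rows, 27 numeric predictors, orders 1 to 6.
n <- 3500
p <- 27
orders <- 1:6
coefs <- sapply(orders, function(k) n_coef(p, k))

# Approximate size of a dense double-precision model matrix, in GiB.
approx_gib <- n * coefs * 8 / 2^30
data.frame(order = orders, coefficients = coefs,
           approx_GiB = round(approx_gib, 2))
```

Under those assumptions the 5-way model already has 101584 coefficients, far more than the 3500 observations, so memory is only part of the difficulty: most of those terms could not be estimated anyway.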