Dear John,

Thanks for the pointers. I will read this.

Pedro

At 14:41 10/11/2005, you wrote:
>Dear Pedro,
>
>The basic point, which relates to the principle of marginality in
>formulating linear models, applies whether the predictors are factors,
>covariates, or both. I think that this is a common topic in books on linear
>models; I certainly discuss it in my Applied Regression, Linear Models, and
>Related Methods.
>
>Regards,
> John
>
>--------------------------------
>John Fox
>Department of Sociology
>McMaster University
>Hamilton, Ontario
>Canada L8S 4M4
>905-525-9140x23604
>http://socserv.mcmaster.ca/jfox
>--------------------------------
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
> > Sent: Wednesday, November 09, 2005 10:45 AM
> > To: [email protected]
> > Subject: Re: [R] Interpretation of output from glm
> > Importance: High
> >
> > Dear John,
> >
> > Thanks for the quick reply. I did indeed have these ideas,
> > but somehow "floating", and all I could find about this
> > mentioned categorical predictors. Can you suggest a good book
> > where I could try to learn more about this?
> >
> > Thanks again,
> >
> > Pedro
> >
> > At 01:49 09/11/2005, you wrote:
> > >Dear Pedro,
> > >
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
> > > > Sent: Tuesday, November 08, 2005 9:47 AM
> > > > To: [email protected]
> > > > Subject: [R] Interpretation of output from glm
> > > > Importance: High
> > > >
> > > > I am fitting a logistic model to binary data. The response variable
> > > > is a factor (0 or 1) and all predictors are continuous variables.
> > > > The main predictor is LT (I expect a logistic relation between LT
> > > > and the probability of being mature) and the other are variables
> > > > I expect to modify this relation.
> > > >
> > > > I want to test whether all predictors contribute significantly to
> > > > the fit or not. I fit the full model, and get these results:
> > > >
> > > > > summary(HMMaturation.glmfit.Full)
> > > >
> > > > Call:
> > > > glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
> > > >     family = binomial(link = "logit"), data = HMIndSamples)
> > > >
> > > > Deviance Residuals:
> > > >     Min       1Q   Median       3Q      Max
> > > > -3.0983  -0.7620   0.2540   0.7202   2.0292
> > > >
> > > > Coefficients:
> > > >               Estimate Std. Error z value Pr(>|z|)
> > > > (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
> > > > LT           5.372e-02  1.798e-02   2.987  0.00281 **
> > > > CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
> > > > Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
> > > > LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
> > > > LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
> > > > ---
> > > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > > >
> > > > (Dispersion parameter for binomial family taken to be 1)
> > > >
> > > >     Null deviance: 10272.4  on 8224  degrees of freedom
> > > > Residual deviance:  7185.8  on 8219  degrees of freedom
> > > > AIC: 7197.8
> > > >
> > > > Number of Fisher Scoring iterations: 8
> > > >
> > > > However, when I run anova on the fit, I get
> > > >
> > > > > anova(HMMaturation.glmfit.Full, test='Chisq')
> > > > Analysis of Deviance Table
> > > >
> > > > Model: binomial, link: logit
> > > >
> > > > Response: Mature
> > > >
> > > > Terms added sequentially (first to last)
> > > >
> > > >          Df Deviance Resid. Df Resid. Dev  P(>|Chi|)
> > > > NULL                      8224    10272.4
> > > > LT        1   2873.8      8223     7398.7        0.0
> > > > CondF     1      0.1      8222     7398.5        0.7
> > > > Biom      1      0.2      8221     7398.3        0.7
> > > > LT:CondF  1    142.1      8220     7256.3  9.413e-33
> > > > LT:Biom   1     70.4      8219     7185.8  4.763e-17
> > > > Warning message:
> > > > fitted probabilities numerically 0 or 1 occurred in: method(x = x[,
> > > > varseq <= i, drop = FALSE], y = object$y, weights =
> > > > object$prior.weights,
> > > >
> > > > I am having a little difficulty interpreting these results.
> > > > The result from the fit tells me that all predictors are
> > > > significant, while the anova indicates that besides LT (the main
> > > > variable), only the interactions of the other terms are significant,
> > > > but the main effects are not.
> > > > I believe that in the first output (on the glm object), the
> > > > significance of all terms is calculated considering each of them
> > > > alone in the model (i.e. removing all other terms), while the anova
> > > > output is (as it says) considering the sequential addition of the
> > > > terms.
> > > >
> > > > So, there are 2 questions:
> > > > a) Can I tell that the interactions are significant, but not the
> > > > main effects?
> > >
> > >In a model with this structure, the "main effects" represent slopes
> > >over the origin (i.e., where the other variables in the product terms
> > >are 0), and aren't meaningfully interpreted as main effects. (Is there
> > >even any data near the origin?)
> > >
> > > > b) Is it legitimate to consider a model where the interactions are
> > > > considered, but not the main effects CondF and Biom?
> > >
> > >Generally, no: That is, such a model is interpretable, but it places
> > >strange constraints on the regression surface -- that the CondF and
> > >Biom slopes are 0 over the origin.
> > >
> > >None of this is specific to logistic regression -- it applies generally
> > >to generalized linear models, including linear models.
> > >
> > >I hope this helps,
> > > John
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
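
A minimal sketch of John's point about "slopes over the origin", in R. It uses simulated data, not the HMIndSamples data from the thread; the sample size, variable ranges and coefficients are all made up for illustration, and only LT and CondF are included to keep it short. The same logistic model is fitted twice, once with the raw predictors and once with mean-centred predictors.

```r
# Illustration only: simulated data with hypothetical ranges and effects.
set.seed(1)
n <- 2000
d <- data.frame(LT    = runif(n, 20, 60),    # no observations near LT = 0
                CondF = rnorm(n, 100, 10))   # no observations near CondF = 0
eta <- -12 + 0.3 * d$LT + 0.03 * (d$CondF - 100) +
       0.002 * (d$LT - 40) * (d$CondF - 100)
d$Mature <- rbinom(n, 1, plogis(eta))

# Raw predictors: the CondF coefficient is the CondF slope at LT = 0,
# far outside the range of the data.
fit.raw <- glm(Mature ~ LT + CondF + LT:CondF, family = binomial, data = d)

# Mean-centred predictors: the CondFc coefficient is the CondF slope
# at the mean of LT.
d$LTc    <- d$LT - mean(d$LT)
d$CondFc <- d$CondF - mean(d$CondF)
fit.ctr <- glm(Mature ~ LTc + CondFc + LTc:CondFc, family = binomial, data = d)

b.raw <- coef(fit.raw)
b.ctr <- coef(fit.ctr)

# Same fitted surface, so the deviances should agree.
all.equal(deviance(fit.raw), deviance(fit.ctr), tolerance = 1e-6)

# The interaction coefficient is unaffected by centring.
c(raw = unname(b.raw["LT:CondF"]), centred = unname(b.ctr["LTc:CondFc"]))

# The "main effect" of CondF depends on where LT = 0 is placed:
# slope of CondF at a given LT is b_CondF + b_LT:CondF * LT.
c(at.LT.0    = unname(b.raw["CondF"]),
  at.mean.LT = unname(b.raw["CondF"] + b.raw["LT:CondF"] * mean(d$LT)),
  centred    = unname(b.ctr["CondFc"]))
```

The two fits describe the same regression surface, so the deviance and the interaction coefficient should agree, while the CondF coefficient (and its z test in summary()) changes with the choice of origin. That is why the coefficient test for CondF in a model containing LT:CondF is not a test of a "main effect", and why dropping CondF while keeping LT:CondF forces the CondF slope to be 0 at LT = 0.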
