Dear Pedro,

The basic point, which relates to the principle of marginality in formulating linear models, applies whether the predictors are factors, covariates, or both. I think that this is a common topic in books on linear models; I certainly discuss it in my Applied Regression, Linear Models, and Related Methods.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
> Sent: Wednesday, November 09, 2005 10:45 AM
> To: [email protected]
> Subject: Re: [R] Interpretation of output from glm
> Importance: High
>
> Dear John,
>
> Thanks for the quick reply. I did indeed have these ideas,
> but somehow "floating", and all I could find about this
> mentioned categorical predictors. Can you suggest a good book
> where I could try to learn more about this?
>
> Thanks again,
>
> Pedro
>
> At 01:49 09/11/2005, you wrote:
> >Dear Pedro,
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
> > > Sent: Tuesday, November 08, 2005 9:47 AM
> > > To: [email protected]
> > > Subject: [R] Interpretation of output from glm
> > > Importance: High
> > >
> > > I am fitting a logistic model to binary data. The response variable
> > > is a factor (0 or 1) and all predictors are continuous variables.
> > > The main predictor is LT (I expect a logistic relation between LT
> > > and the probability of being mature) and the other are variables I
> > > expect to modify this relation.
> > >
> > > I want to test if all predictors contribute significantly for the
> > > fit or not
> > > I fit the full model, and get these results
> > >
> > > > summary(HMMaturation.glmfit.Full)
> > >
> > > Call:
> > > glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
> > >     family = binomial(link = "logit"), data = HMIndSamples)
> > >
> > > Deviance Residuals:
> > >     Min       1Q   Median       3Q      Max
> > > -3.0983  -0.7620   0.2540   0.7202   2.0292
> > >
> > > Coefficients:
> > >               Estimate Std. Error z value Pr(>|z|)
> > > (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
> > > LT           5.372e-02  1.798e-02   2.987  0.00281 **
> > > CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
> > > Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
> > > LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
> > > LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
> > > ---
> > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > (Dispersion parameter for binomial family taken to be 1)
> > >
> > >     Null deviance: 10272.4  on 8224  degrees of freedom
> > > Residual deviance:  7185.8  on 8219  degrees of freedom
> > > AIC: 7197.8
> > >
> > > Number of Fisher Scoring iterations: 8
> > >
> > > However, when I run anova on the fit, I get
> > >
> > > > anova(HMMaturation.glmfit.Full, test='Chisq')
> > > Analysis of Deviance Table
> > >
> > > Model: binomial, link: logit
> > >
> > > Response: Mature
> > >
> > > Terms added sequentially (first to last)
> > >
> > >          Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> > > NULL                      8224    10272.4
> > > LT        1   2873.8      8223     7398.7       0.0
> > > CondF     1      0.1      8222     7398.5       0.7
> > > Biom      1      0.2      8221     7398.3       0.7
> > > LT:CondF  1    142.1      8220     7256.3 9.413e-33
> > > LT:Biom   1     70.4      8219     7185.8 4.763e-17
> > > Warning message:
> > > fitted probabilities numerically 0 or 1 occurred in:
> > > method(x = x[, varseq <= i, drop = FALSE], y = object$y,
> > > weights = object$prior.weights,
> > >
> > > I am having a little difficulty interpreting these results.
> > > The result from the fit tells me that all predictors are
> > > significant, while the anova indicates that besides LT (the main
> > > variable), only the interaction of the other terms is significant,
> > > but the main effects are not.
> > > I believe that in the first output (on the glm object), the
> > > significance of all terms is calculated considering each of them
> > > alone in the model (i.e. removing all other terms), while the
> > > anova output is (as it says) considering the sequential addition
> > > of the terms.
> > >
> > > So, there are 2 questions:
> > > a) Can I tell that the interactions are significant, but not the
> > > main effects?
> >
> >In a model with this structure, the "main effects" represent slopes
> >over the origin (i.e., where the other variables in the product terms
> >are 0), and aren't meaningfully interpreted as main effects. (Is there
> >even any data near the origin?)
> >
> > > b) Is it legitimate to consider a model where the interactions are
> > > considered, but not the main effects CondF and Biom?
> >
> >Generally, no: That is, such a model is interpretable, but it places
> >strange constraints on the regression surface -- that the CondF and
> >Biom slopes are 0 over the origin.
> >
> >None of this is specific to logistic regression -- it applies generally
> >to generalized linear models, including linear models.
> >
> >I hope this helps,
> > John
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
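
The point about "slopes over the origin" can be sketched with a small simulation in R. This is only an illustration under stated assumptions: the data are simulated, the coefficient values are made up, only one modifying variable (CondF) is used for brevity, and the centered variables LTc and CondFc are hypothetical names that do not appear in the thread. It is not the original HMIndSamples analysis. The sketch fits the same model with raw and with mean-centered predictors, so that the "main effect" of CondF is the CondF slope at LT = 0 in one fit and at the mean of LT in the other.

```r
# Simulated illustration; data and coefficient values are assumptions,
# not the HMIndSamples data from the thread.
set.seed(1)
n <- 2000
d <- data.frame(LT = runif(n, 20, 60), CondF = rnorm(n, 100, 10))

# True model: CondF has no effect at LT = 40, but it modifies the LT slope.
eta <- 0.15 * (d$LT - 40) + 0.004 * (d$LT - 40) * (d$CondF - 100)
d$Mature <- rbinom(n, 1, plogis(eta))

# Raw predictors: the CondF coefficient is the CondF slope at LT = 0,
# far outside the simulated range of LT.
fit_raw <- glm(Mature ~ LT + CondF + LT:CondF, family = binomial, data = d)

# Centered predictors: the CondFc coefficient is the CondF slope at mean LT.
d$LTc    <- d$LT - mean(d$LT)
d$CondFc <- d$CondF - mean(d$CondF)
fit_ctr  <- glm(Mature ~ LTc + CondFc + LTc:CondFc, family = binomial, data = d)

# Both fits describe the same regression surface: the deviance and the
# interaction coefficient agree; only the lower-order coefficients and
# their Wald tests depend on where the origin is placed.
print(c(raw = deviance(fit_raw), centered = deviance(fit_ctr)))
print(coef(summary(fit_raw)))
print(coef(summary(fit_ctr)))

# drop1() respects marginality: with LT:CondF in the model, it offers only
# the interaction for deletion, not the LT or CondF terms.
print(drop1(fit_raw, test = "Chisq"))
```

Comparing the two coefficient tables shows why a test of CondF in the presence of LT:CondF says little about a "main effect": the raw CondF coefficient equals the centered one minus the interaction coefficient times mean(LT). Note also that the z tests in summary() are for each coefficient with all other terms kept in the model, whereas anova() adds terms sequentially, which is why the two outputs can disagree.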
