On Thu, 16 Dec 1999, Burke Johnson wrote:
> A student of mine is getting ready to develop a GLM prediction model
> that will include a mixture of categorical and quantitative predictor
> variables. We will probably not include interaction terms in the model
> (i.e., it will be a main effects only model).
Why would one NOT include interaction terms in the model, at least in the
exploratory stages of analysis? As Joe Ward pointed out, somewhat
obliquely, you can miss a great deal that's going on in the data if you
rule out of order all the interesting stuff beforehand.
[If you want an example, there are several rather neat ones around.]
> Here's my question: Do you suggest using dummy coding (0,1) or effects
> coding (1,0,-1) for the categorical variables included in the model?
I was puzzled about your assertion in the next paragraph until I read
Rich Ulrich's reply. What you call "effects coding", if limited to that
(1,0,-1), is what I'd call a "linear effect" for a 3-category variable if
the categories are at least ordered. You didn't say how many categories
there are in any of the categorical variables in question; apparently
not binary variables, else (1,0,-1) could not apply. If you use
indicator variables of the form (0,1), you'd need more than 1 such
variable if you have more than 2 categories to be represented, and you
did not indicate whether you had in mind as many indicator variables as
there are degrees of freedom in the categorical variable(s) of interest,
or (for some unexplained and to me unimaginable reason) were planning to
use only one such variable for each of the categorical variables.
And of course if you use (1,0,-1) for a ternary variable, you ought
also to use the complementary (-1,2,-1) to represent the remaining degree
of freedom. If you're thinking of variables with more than 3 categories,
how did you plan to code the 4th, 5th, ... category?
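
To make the point about the remaining degree of freedom concrete, here is a minimal sketch in Python. The three group means are made up for illustration, equal group sizes are assumed, and the helper names (dot, rebuild) are hypothetical. It checks that (1,0,-1) and (-1,2,-1) are orthogonal, and that the linear code alone cannot reproduce three arbitrary group means while the pair can.

```python
from fractions import Fraction

LINEAR = (1, 0, -1)      # the (1,0,-1) code from the question
QUADRATIC = (-1, 2, -1)  # the complementary code for the second d.f.

def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def rebuild(means, contrasts):
    """Fit group means from the grand mean plus the given contrasts.

    Assumes equal group sizes and mutually orthogonal contrasts, so each
    coefficient is simply the projection dot(c, means) / dot(c, c).
    """
    grand = Fraction(sum(means), len(means))
    fitted = [grand] * len(means)
    for c in contrasts:
        coef = Fraction(dot(c, means), dot(c, c))
        fitted = [f + coef * ck for f, ck in zip(fitted, c)]
    return fitted

means = [2, 7, 3]  # hypothetical group means, not from any real data

print(dot(LINEAR, QUADRATIC))               # 0: the two codes are orthogonal
print(rebuild(means, [LINEAR]))             # one d.f. only: means not recovered
print(rebuild(means, [LINEAR, QUADRATIC]))  # both d.f.: means recovered exactly
```

With only the linear code, the middle group is forced onto the line joining the outer two; the second code supplies the departure from that line.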
> The reason I'm asking is because dummy coding does not always give the
> same result for a factorial design as does ANOVA and effects coding,
Either you are in error in this assertion, or you mean something
different from what _I_ have in mind by "dummy coding" and "effects
coding". As another respondent has pointed out, the results are
equivalent whatever the coding, so long as all the degrees of freedom
implied by the several categories are represented in the codes.
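
The equivalence can be checked numerically. What follows is a rough sketch under stated assumptions, not anyone's recommended procedure: the data are invented, the model is one 3-category factor plus one covariate (main effects only), and the helpers solve and ols_fit are hypothetical names for a small exact least-squares routine built on the normal equations. Both codings use 2 columns for the 3 categories, i.e. all the d.f.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination (exact)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [v / lead for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

def ols_fit(X, y):
    """Return (coefficients, fitted values) via the normal equations."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return beta, fitted

# Invented data: a 3-category factor, a covariate x, and a response y.
groups = ["a", "a", "b", "b", "b", "c", "c"]
x = [1, 3, 2, 4, 5, 1, 6]
y = [3, 5, 4, 9, 8, 2, 7]

DUMMY = {"a": (1, 0), "b": (0, 1), "c": (0, 0)}     # (0,1) indicators, c = reference
EFFECT = {"a": (1, 0), "b": (0, 1), "c": (-1, -1)}  # effects coding, c = -1 row

# Design matrices: intercept, two coded columns, covariate.
X_dummy = [[1, *DUMMY[g], xi] for g, xi in zip(groups, x)]
X_effect = [[1, *EFFECT[g], xi] for g, xi in zip(groups, x)]

beta_dummy, fitted_dummy = ols_fit(X_dummy, y)
beta_effect, fitted_effect = ols_fit(X_effect, y)

print(fitted_dummy == fitted_effect)  # same predictions under either coding
print(beta_dummy, beta_effect)        # coefficients differ in meaning and value
```

The two design matrices span the same column space, so the fitted values and residual sum of squares agree; only the interpretation of the individual coefficients changes (contrast with a reference category versus deviation from the unweighted mean of the categories).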
> and, hence, Pedhazur recommends using effects coding rather than dummy
> coding in the factorial case.
As another respondent has remarked, this seems to me most unlikely.
Where, precisely, does Pedhazur recommend any such thing?
> Do you know if the choice of dummy or effects coding matters for a main
> effects only model with multiple categorical and quantitatively scaled
> predictor variables?
As implied above, it ought not to matter so long as all the d.f. are
properly accounted for. IF they are, what you describe is equivalent to
an analysis of covariance constrained to additive effects only. (Some of
our colleagues consider "analysis of covariance" an old-fashioned term,
possibly even a misleading one; but in the old-fashioned sense that may
still make sense to some of us, that's what it is. ANCOVA is, of course,
a subset of the general linear model, which is what I suppose you mean by
GLM.)
One would still have to question the reason, if any, for the constraint.
One is tempted to suspect that your student would really rather not be
bothered with interactions, because they're less easy to think about than
a model containing main effects only; but perhaps that is a base canard.
Whatever the case, the _best_ way of dealing with interactions one would
like not to exist is to model them and show that they are in fact not
detectable in the data at hand. If they _are_ detectable, well, sorry,
folks, that's the way the cookie crumbles sometimes, and the universe (as
represented, however imperfectly, in the data) may be trying to tell you
something interesting. Maybe even useful. If so, 'twould be rude to
ignore it; and being rude to the universe is a loser's game.
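
As an illustration of modelling the interaction rather than assuming it away, here is a small sketch for a balanced 2x2 layout. The numbers are invented so that both main effects are nearly nil while the interaction is large; the variable names are hypothetical. It computes the usual sums of squares and the F ratios against within-cell error, and leaves the p-values to your favourite table or package.

```python
def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

# Invented balanced 2x2 data, three observations per cell.
cells = {
    ("a1", "b1"): [10, 12, 11],
    ("a1", "b2"): [20, 19, 21],
    ("a2", "b1"): [20, 22, 21],
    ("a2", "b2"): [11, 10, 12],
}
levels_a = sorted({a for a, _ in cells})
levels_b = sorted({b for _, b in cells})
n = 3  # observations per cell (balanced design assumed)

cell_mean = {key: mean(ys) for key, ys in cells.items()}
grand = mean(cell_mean.values())
row_mean = {a: mean(cell_mean[(a, b)] for b in levels_b) for a in levels_a}
col_mean = {b: mean(cell_mean[(a, b)] for a in levels_a) for b in levels_b}

# Sums of squares for the balanced two-way layout.
ss_a = n * len(levels_b) * sum((row_mean[a] - grand) ** 2 for a in levels_a)
ss_b = n * len(levels_a) * sum((col_mean[b] - grand) ** 2 for b in levels_b)
ss_ab = n * sum(
    (cell_mean[(a, b)] - row_mean[a] - col_mean[b] + grand) ** 2
    for a in levels_a
    for b in levels_b
)
ss_err = sum((yi - cell_mean[key]) ** 2 for key, ys in cells.items() for yi in ys)

df_ab = (len(levels_a) - 1) * (len(levels_b) - 1)
df_err = sum(len(ys) - 1 for ys in cells.values())
ms_err = ss_err / df_err

f_a = (ss_a / (len(levels_a) - 1)) / ms_err
f_ab = (ss_ab / df_ab) / ms_err

print("SS A, B, AxB, error:", ss_a, ss_b, ss_ab, ss_err)
print("F for A:", f_a, " F for AxB:", f_ab)
```

In data like these a main-effects-only model would report essentially nothing, while the interaction term carries nearly all the systematic variation.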
-- DFB.
Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264 603-535-2597
184 Nashua Road, Bedford, NH 03110 603-471-7128