On Thu, 16 Dec 1999, Burke Johnson wrote:

> A student of mine is getting ready to develop a GLM prediction model
> that will include a mixture of categorical and quantitative predictor
> variables.  We will probably not include interaction terms in the model 
> (i.e., it will be a main effects only model). 

Why would one NOT include interaction terms in the model, at least in the 
exploratory stages of analysis?  As Joe Ward pointed out, somewhat 
obliquely, you can miss a great deal that's going on in the data if you 
rule out of order all the interesting stuff beforehand.
   [If you want an example, there are several rather neat ones around.] 
 
> Here's my question:  Do you suggest using dummy coding (0,1) or effects 
> coding (1,0,-1) for the categorical variables included in the model? 

I was puzzled about your assertion in the next paragraph until I read 
Rich Ulrich's reply.  What you call "effects coding", if limited to that 
(1,0,-1), is what I'd call a "linear effect" for a 3-category variable if 
the categories are at least ordered.  You didn't say how many categories 
there are in any of the categorical variables in question;  apparently 
not binary variables, else (1,0,-1) could not apply.  If you use 
indicator variables of the form (0,1), you'd need more than 1 such 
variable if you have more than 2 categories to be represented, and you 
did not indicate whether you had in mind as many indicator variables as 
there are degrees of freedom in the categorical variable(s) of interest, 
or (for some unexplained and to me unimaginable reason) were planning to 
use only one such variable for each of the categorical variables.
  And of course if you use (1,0,-1) for a ternary variable, you ought 
also to use the complementary (-1,2,-1) to represent the remaining degree 
of freedom.  If you're thinking of variables with more than 3 categories, 
how did you plan to code the 4th, 5th, ... category?
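
To make the degrees-of-freedom point concrete, here is a minimal sketch in
Python with NumPy (the choice of software is an assumption; none was named
in the question). It checks that (1,0,-1) and (-1,2,-1) are orthogonal
zero-sum contrasts, and that (1,0,-1) by itself leaves one of the two
degrees of freedom of a ternary variable unrepresented.

```python
import numpy as np

# Contrast codes for a 3-category variable (one entry per category).
linear = np.array([1, 0, -1])      # the (1,0,-1) code from the question
quadratic = np.array([-1, 2, -1])  # the complementary code for the 2nd d.f.

# Each code sums to zero, and the two are mutually orthogonal.
sums = (int(linear.sum()), int(quadratic.sum()))
dot = int(linear @ quadratic)

# With the constant, the two codes can reproduce any 3 category means.
full = np.column_stack([np.ones(3), linear, quadratic])
rank_full = np.linalg.matrix_rank(full)

# Using (1,0,-1) alone leaves one degree of freedom unrepresented.
partial = np.column_stack([np.ones(3), linear])
rank_partial = np.linalg.matrix_rank(partial)

print(sums, dot, rank_full, rank_partial)
```

The ranks come out as 3 and 2: the constant plus both codes carries the
full information in three categories, the constant plus (1,0,-1) does not.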

> The reason I'm asking is because dummy coding does not always give the
> same result for a factorial design as does ANOVA and effects coding, 

Either you are in error in this assertion, or you mean something 
different from what _I_ have in mind by "dummy coding" and "effects 
coding".  As another respondent has pointed out, the results are 
equivalent whatever the coding, so long as all the degrees of freedom 
implied by the several categories are represented in the codes.
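
A minimal sketch of that equivalence, assuming Python with NumPy and
entirely made-up data, and assuming that "effects coding" means the usual
sum-to-zero scheme with k-1 columns (not a single (1,0,-1) column). The
names g, x, y and fit are illustrative only. Both codings are fitted to a
main-effects model with one 3-category predictor and one quantitative
predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
g = np.arange(n) % 3                    # made-up 3-category predictor
x = rng.normal(size=n)                  # made-up quantitative predictor
y = 1.0 + np.array([0.0, 0.5, -1.0])[g] + 2.0 * x + rng.normal(size=n)

# Code tables: one row per category, k - 1 = 2 columns each.
dummy_codes = np.array([[1, 0], [0, 1], [0, 0]])     # (0,1) indicators
effect_codes = np.array([[1, 0], [0, 1], [-1, -1]])  # sum-to-zero codes

def fit(codes):
    """Least-squares fit of y on intercept, coded g, and x."""
    X = np.column_stack([np.ones(n), codes[g], x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    return beta, fitted, float(resid @ resid)

beta_d, fitted_d, sse_d = fit(dummy_codes)
beta_e, fitted_e, sse_e = fit(effect_codes)

same_fit = np.allclose(fitted_d, fitted_e)       # identical predictions
same_sse = np.isclose(sse_d, sse_e)              # identical residual SS
same_slope = np.isclose(beta_d[-1], beta_e[-1])  # identical slope on x
same_coefs = np.allclose(beta_d, beta_e)         # intercept and g terms differ
print(same_fit, same_sse, same_slope, same_coefs)
```

The fitted values, the residual sum of squares and the slope on x agree
between the two codings; only the intercept and the coefficients attached
to the category codes change, because they answer different questions
(differences from a reference category vs. deviations from the unweighted
mean of the categories).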

> and, hence, Pedhazur recommends using effects coding rather than dummy
> coding in the factorial case. 

As another respondent has remarked, this seems to me most unlikely.  
Where, precisely, does Pedhazur recommend any such thing?

> Do you know if the choice of dummy or effects coding matters for a main
> effects only model with multiple categorical and quantitatively scaled
> predictor variables? 

As implied above, it ought not to matter so long as all the d.f. are 
properly accounted for.  IF they are, what you describe is equivalent to 
an analysis of covariance constrained to additive effects only.  (Some of 
our colleagues consider "analysis of covariance" an old-fashioned term, & 
possibly even a misleading one;  but in the old-fashioned sense that may 
still make sense to some of us, that's what it is.  ANCOVA is, of course, 
a subset of the general linear model, which is what I suppose you mean by 
GLM.)

One would still have to question the reason, if any, for the constraint. 
One is tempted to suspect that your student would really rather not be 
bothered with interactions, because they're less easy to think about than 
a model containing main effects only;  but perhaps that is a base canard. 
Whatever the case, the _best_ way of dealing with interactions one would 
like not to exist is to model them and show that they are in fact not 
detectable in the data at hand.  If they _are_ detectable, well, sorry, 
folks, that's the way the cookie crumbles sometimes, and the universe (as 
represented, however imperfectly, in the data) may be trying to tell you 
something interesting.  Maybe even useful.  If so, 'twould be rude to 
ignore it;  and being rude to the universe is a loser's game.
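
One way to "model them and show that they are not detectable" is a
nested-model F test: fit the additive model, fit it again with the
category-by-covariate products added, and compare residual sums of
squares. The sketch below assumes Python with NumPy and made-up data; the
function names sse and interaction_f are hypothetical, and the data are
built so that an interaction really is present.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def interaction_f(y, g, x, k):
    """F statistic for the category-by-covariate interaction (k categories)."""
    n = len(y)
    dummies = (g[:, None] == np.arange(1, k)).astype(float)  # k - 1 indicators
    main = np.column_stack([np.ones(n), dummies, x])          # additive model
    full = np.column_stack([main, dummies * x[:, None]])      # plus interaction
    df_num = k - 1
    df_den = n - full.shape[1]
    sse_main, sse_full = sse(main, y), sse(full, y)
    f = ((sse_main - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den

# Made-up data in which the slope on x differs by category.
rng = np.random.default_rng(1)
n, k = 120, 3
g = np.arange(n) % k
x = rng.normal(size=n)
intercepts = np.array([0.0, 0.5, -1.0])
slopes = np.array([2.0, 3.5, 0.5])
y = 1.0 + intercepts[g] + slopes[g] * x + rng.normal(size=n)

f_stat, df_num, df_den = interaction_f(y, g, x, k)
print(f_stat, df_num, df_den)
```

The statistic is referred to an F distribution on (df_num, df_den) degrees
of freedom. A small value is the evidence one wants before settling on a
main-effects-only model; a large one, as in these contrived data, says the
interaction should stay.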
                                                                -- DFB.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128  
