[ECOLOG-L] Statistical question regarding interaction terms

Cortney Watt Wed, 06 May 2009 06:38:47 -0700

Hi all,
These are the emails I got regarding my question on not getting
significance in an interaction term, even when simple effects show there
is significance.
Thanks to everyone who responded!


1)  Your question might be better suited to the r-sig list:
R-sig-ecology mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

2)  I am not quite the stats authority that some others on this list are,
but I'll put in my two cents. My first response is that I don't think you
have any actual problem. It's just that the results are a little
ambiguous, as sometimes happens in statistics.

In the three separate within-elevation analyses, the canopy treatment had
a significant effect at two elevations but not at the third. That doesn't
necessarily mean that the differences between canopy treatments were
significantly different among the three elevations. In other words, it
doesn't necessarily mean that you have a significant interaction. In fact
the interaction term in the full analysis (all three elevations included)
was not significant, perhaps due to type II error from low power, or
perhaps because the effect of canopy really does not depend on elevation.
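
To make the point in reply 2 concrete, here is a minimal Python sketch
using only the standard library. The effect sizes and standard errors are
made-up numbers, not the poster's data, and the normal approximation is an
assumption chosen for simplicity. It shows one elevation with a
"significant" canopy effect, another with a "nonsignificant" one, and a
difference between the two effects (the interaction contrast) that is
itself not significant.

```python
from math import sqrt
from statistics import NormalDist


def z_test(estimate, se):
    """Return (z, two-sided p) for estimate/se under a normal approximation."""
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical canopy effects (mean richness with canopy minus without)
# and their standard errors at two elevations. Illustrative values only.
effect_low, se_low = 2.0, 0.8
effect_high, se_high = 1.0, 0.8

z_low, p_low = z_test(effect_low, se_low)
z_high, p_high = z_test(effect_high, se_high)

# Interaction contrast: does the canopy effect differ between elevations?
# For independent estimates the variances add.
diff = effect_low - effect_high
se_diff = sqrt(se_low ** 2 + se_high ** 2)
z_diff, p_diff = z_test(diff, se_diff)

print(f"low elevation:  z = {z_low:.2f}, p = {p_low:.3f}")
print(f"high elevation: z = {z_high:.2f}, p = {p_high:.3f}")
print(f"difference:     z = {z_diff:.2f}, p = {p_diff:.3f}")
```

With these numbers the low-elevation test falls below 0.05 and the
high-elevation test does not, yet the test of their difference is well
above 0.05, which is the pattern described in the question.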

I have run into this kind of situation a few times in my own research. I
think you should just report that the effect of canopy was significant at
two elevations but not at the third, and that there is nonetheless
insufficient statistical evidence to conclude that the effect of canopy
depended on elevation.
Good luck

3) Although you've described your model structure, it is difficult to
answer this question without visualizing the data.  That caveat aside, you
would expect stronger between-groups (factor) main effects simply because
the models without interactions have more residual degrees of freedom
(i.e. a greater number of observations relative to the number of
parameters estimated) than the full models with the interactions.

I'm not familiar with Underwood, but the usual process of model selection
is to begin with the "full" or maximal model including all interaction
terms and to remove those terms with highly non-significant effects.
However, if you are building a predictive model based on the knowledge
that species richness is dependent on levels of tidal elevation and
seaweed cover, then it is moot to undertake model selection (you have an a
priori model and the interest lies in estimating the parameters).
Alternatively, if you don't have a predictive model in mind a priori, then
you might consider model simplification procedures that account for the
number of parameters when assessing the fit of the model (e.g. AIC).
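
As a rough illustration of the AIC comparison suggested in reply 3, the
Python sketch below scores an additive model against a model with the
interaction. Everything numeric here is hypothetical: the sample size, the
residual sums of squares, and the parameter counts are invented for a
3-elevation by 2-canopy design, and the formula is the usual least-squares
form of AIC for Gaussian errors, which drops a constant shared by all
models fitted to the same data.

```python
from math import log


def gaussian_aic(n, rss, k):
    """AIC for a least-squares fit with Gaussian errors.

    n   : number of observations
    rss : residual sum of squares of the fitted model
    k   : number of estimated parameters, counting the error variance
    An additive constant common to all models on the same data is omitted.
    """
    return n * log(rss / n) + 2 * k


n_obs = 36  # hypothetical: 3 elevations x 2 canopy levels x 6 plots

# Additive model: intercept + 2 elevation terms + 1 canopy term + variance.
aic_additive = gaussian_aic(n_obs, rss=120.0, k=5)

# Full model: 6 cell means + variance. The RSS is slightly lower, as it
# must be when terms are added, but two more parameters are penalised.
aic_full = gaussian_aic(n_obs, rss=115.0, k=7)

print(f"additive:         AIC = {aic_additive:.2f}")
print(f"with interaction: AIC = {aic_full:.2f}")
print("preferred:", "additive" if aic_additive < aic_full else "full")
```

In this made-up case the small drop in residual sum of squares does not
offset the penalty for the two extra interaction parameters, so the
additive model has the lower AIC.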

4) I am not an expert on statistics or ecology, so take this for what it's
worth. But I am curious why you are analyzing elevation and cover as
factors?

Is there a good reason to do this? I am not familiar with this domain, but
I think that regression analysis may be more powerful in this case.

See Frank Harrell's site on this
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous

I am guessing that technically your cover measures are ordinal, but it
seems that many ecologists treat them as continuous anyway.

5) I can't tell you why, but you might consider contacting John K.
Kruschke ([email protected]) in the Indiana University Psychology
department for [R] code that will allow a Bayesian approach to this exact
problem. It won't yield wildly different results, but you might find the
results more informative. A portion of the book he is writing deals with
this problem by reassessing Qian and Shen's (2007) look at limpets and
fish size.

6) Without looking at your data (means, variances, etc.), I would say that
it is probably a mistake to interpret a lack of significance as the lack
of a pattern.  The nonsignificant interaction test in the full model tells
you that there is no detectable variation in the canopy effect across
elevations.  The proper procedure is then to pool all of the elevation
treatments in order to obtain the best estimate of the canopy treatment
effect (i.e., an ANOVA with no interaction term).  When you pull out each
elevation treatment individually and test for a canopy effect, you may not
always find a significant result because you have a lower sample size
within each treatment than you did in the full model, and that elevation
treatment might also have a higher sample variance, further reducing your
power.  Remember that significance is not an immutable thing: it depends
on the signal (how different the means are) as well as the variance around
that signal and the sample size.  Instead of looking at the significance
result, look at the mean effect: does there actually seem to be a
difference among the elevation treatments in the size of the canopy
effect?  Only if the answer is yes would I worry that the ANOVA is not
detecting an interaction.  You may also want to check for heterogeneity of
variances: if there is a big difference in sample variance among
treatments, you may need to transform your data.
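
The advice in reply 6 (look at the size of the canopy effect at each
elevation, then pool) can be sketched in a few lines of Python. The counts
below are invented for a balanced design with four plots per cell; the
level names "canopy" and "removed" are placeholders, and the hand-rolled
sums of squares assume equal replication in every cell. The sketch reports
the per-elevation effects, the pooled effect, and the interaction F
statistic, which would still need to be compared against an F distribution
with the stated degrees of freedom.

```python
from statistics import mean

# Hypothetical species-richness counts: data[elevation][canopy] -> plots.
data = {
    "low":  {"canopy": [12, 14, 13, 15], "removed": [9, 10, 11, 10]},
    "mid":  {"canopy": [11, 13, 12, 12], "removed": [9, 11, 10, 10]},
    "high": {"canopy": [10, 12, 9, 13],  "removed": [9, 11, 8, 12]},
}
elevations = list(data)
canopy_levels = ["canopy", "removed"]
n = 4  # replicates per cell (balanced design assumed)

cell_mean = {(e, c): mean(data[e][c]) for e in elevations for c in canopy_levels}

# Canopy effect within each elevation, and the pooled (main) effect.
effects = {e: cell_mean[(e, "canopy")] - cell_mean[(e, "removed")]
           for e in elevations}
pooled_effect = mean(list(effects.values()))

# Interaction sum of squares for a balanced two-way layout.
grand = mean(list(cell_mean.values()))
row = {e: mean([cell_mean[(e, c)] for c in canopy_levels]) for e in elevations}
col = {c: mean([cell_mean[(e, c)] for e in elevations]) for c in canopy_levels}
ss_int = n * sum((cell_mean[(e, c)] - row[e] - col[c] + grand) ** 2
                 for e in elevations for c in canopy_levels)

# Residual (within-cell) sum of squares.
ss_res = sum((y - cell_mean[(e, c)]) ** 2
             for e in elevations for c in canopy_levels for y in data[e][c])

df_int = (len(elevations) - 1) * (len(canopy_levels) - 1)
df_res = len(elevations) * len(canopy_levels) * (n - 1)
f_int = (ss_int / df_int) / (ss_res / df_res)

print("canopy effect by elevation:", effects)
print(f"pooled canopy effect: {pooled_effect:.2f}")
print(f"interaction F({df_int}, {df_res}) = {f_int:.2f}")
```

In these invented data the canopy effect shrinks with elevation, yet the
interaction F is small relative to the usual 5% critical value for 2 and
18 degrees of freedom (about 3.55 in standard tables), so the pooled
effect would be the estimate to report.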
