The zero problem is common in other fields, for example in the analysis of 
water-quality data, where concentrations of chemicals are often near or below 
detection limits. Dennis Helsel's book Nondetects and Data Analysis provides 
some thoughtful discussion, including how a failure to properly account for 
such values contributed to the Challenger space shuttle explosion, and it 
covers some nonparametric and censored-data models.
 
As to planning for zeros in the experimental design, in my limited experience 
sometimes it will be apparent in advance that the data will have many zeros and 
other times it won't. Many ecological experiments are at field sites where the 
responses aren't known ahead of time, so the latter is often the case. I tend 
to plan both ways now, but I'd have to say that, despite having more than an 
average amount of statistical training, I find the necessary expertise for the 
statistical designs to be a bit beyond me, and good statistical advice in this 
particular area to be rare.

To me it is also not always clear how to interpret the zeros, since I often 
don't know their cause. If someone could tell me how to do the equivalent of a 
three-way randomized complete block design, with blocks as random effects and 
the treatment interactions included, I'd be really happy, and I could publish 
some data that have been sitting for 10 years waiting for the appropriate 
statistical techniques to be developed.
 
John Gerlach





----- Original Message ----
From: Michele Scardi <[EMAIL PROTECTED]>
To: [email protected]
Sent: Monday, January 14, 2008 4:58:18 AM
Subject: Re: [ECOLOG-L] Data set with many many zeros..... Help?

Monday, January 14, 2008, 4:11:24 AM, Warren Aney wrote:
WWA> Bill, are we the Luddites in this arena?  I agree with you, and my
WWA> statistics professor would have taken it one important step further:
WWA> Choose your statistical analysis methods before you start collecting
WWA> your data --
WWA> ...

"To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of."

Sir Ronald Aylmer Fisher


As for the "Data set with many zeros" thread, I'm sure that ZIP,
ZINB, ZAP, ZANB, etc. are absolutely appropriate (and effective) in
some cases, but the real problem is that in some cases zeros cannot be
regarded as plain numbers. They are just a completely different beast.
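
For readers who have not met these models, here is a minimal sketch of the
zero-inflated Poisson (ZIP) idea, using only the Python standard library. It
shows that a ZIP model treats zeros as coming from two sources: a structural
"always zero" process and an ordinary Poisson count that happens to be zero.
The function name zip_pmf and the parameter values (pi = 0.4, lam = 2.0) are
hypothetical and chosen only for illustration; they are not taken from any
data set in this thread.

```python
import math


def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson model.

    pi  : probability that an observation is a structural zero
    lam : mean of the Poisson count process for the remaining observations
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can be structural, or a Poisson count that came out as 0.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Hypothetical values: 40% structural zeros, mean count of 2 otherwise.
pi, lam = 0.4, 2.0
p_zero = zip_pmf(0, pi, lam)

# Share of all expected zeros that the model attributes to the structural part.
structural_share = pi / p_zero
print(f"P(0) = {p_zero:.3f}, structural share of zeros = {structural_share:.3f}")
```

Note that the model only splits zeros into two kinds. It says nothing about
why a structural zero occurred, which is the point made in the rest of this
message.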

Imagine you are relating the abundance of a given species to an
environmental variable (e.g. temperature). Usually you'll find an
optimum value (=max abundance) and a range of values that are
associated with decreasing abundances.

But what about zeros? Zeros can mean: species present, but not found;
species absent because of other reasons (i.e. temperature is ok,
something else is not ok: e.g. competition); species absent because
temperature is too low; species absent because temperature is too
high; species absent because temperature is way too low; species
absent because temperature is way too high; etc.

Basically, all these responses produce a zero abundance (i.e. they are
quantitatively identical), but they are completely different from the
qualitative point of view. So, the bottom line is that data
transformations as well as ZIP, ZINB, etc. models can solve some
problems, but in some other cases what is really relevant is the
meaning of the zero vs. non-zero values, and in those cases a
qualitative approach (contingency tables, chi-square stats, etc., but
keep it as simple as possible!) can be a better option.
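
As an illustration of the qualitative approach, here is a minimal sketch that
reduces abundances to presence/absence, cross-tabulates them against
temperature classes, and computes the Pearson chi-square statistic by hand.
The site counts and the three temperature classes (low, intermediate, high)
are made up for the example, and chi_square is a hypothetical helper, not a
library function.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if presence were independent of temperature class.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical numbers of sites (20 per temperature class).
# Columns: low, intermediate, high temperature.
table = [
    [2, 18, 3],    # species present
    [18, 2, 17],   # species absent (abundance = 0)
]

stat, dof = chi_square(table)
# 5.991 is the 0.05 critical value of the chi-square distribution with 2 d.f.
print(f"chi-square = {stat:.2f} on {dof} d.f.; exceeds 5.991: {stat > 5.991}")
```

In this made-up table the absences pile up at both ends of the gradient. A
contingency table picks that up without treating "too cold" and "too hot"
zeros as the same point on a numeric scale.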

Best,

Michele


--------------------------------
Michele Scardi
Associate Professor of Ecology

Department of Biology
University of Rome "Tor Vergata"
Via della Ricerca Scientifica
00133 Roma
Italy

http://www.mare-net.com/mscardi
--------------------------------
