Jim R and Michele hit on very important and related points. I almost wrote in my original reply that:
"If the critters being sampled in the quadrats are mobile, you'll need to verify that your ability to detect (and count) them is 100% and does not vary by quadrat (perhaps as a function of the factors you're "testing"). If detectability does vary, you've got a new set of problems."

If detectability varies (it almost always does) but is not measured, the count (abundance) information is suspect and confounded. One "solution" is to fall back on presence/absence data. This connects to Michele's point about what the zeros mean. The critical possibility in the context of detectability is that if you didn't find the critter in a given quadrat, it could have been present and you just missed it. This is part of the reason the counts are pretty useless: they're confounded with your ability to find the critters. The problem is worst when the covariates you're interested in affect detectability itself, as Michele noted (e.g., temperature).

Jim noted a distinction between the questions that can be answered with presence/absence data and those that can be answered with abundance data. Abundance is the holy grail, and perhaps this investigator is working with easy-to-count critters that don't move around. Then the counts are solid and you can trust the zeros to mean the critter didn't occur there. In those cases, the zero-inflated count models are an obvious choice. Even if you just have presence/absence data, as long as there is no confusion about the zeros you can proceed with logistic regression techniques and ask questions about the covariates you're interested in, in much the same way you would with abundance in the zero-inflated count models.

If you have detectability issues and you didn't measure detectability or use a design that allows post hoc inferences about it, you're probably down to qualitative approaches like those Michele mentioned. But your "inferences" will be very weak. The best thing might be to use that as a lesson in future study planning.
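
To make the confounding concrete, here is a minimal simulation sketch (Python, standard library only). Everything in it is an assumption for illustration: the function names, the Poisson mean of 3, the 5 to 25 degree temperature range, and the logistic detection curve are all made up and do not come from anyone's data. True abundance is generated independently of temperature, but detection improves with temperature, so the observed counts pick up a temperature "effect" that isn't real, and extra zeros pile up in the cold quadrats.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def detection_probability(temp):
    """Hypothetical per-individual detection probability, rising with temperature."""
    return 1.0 / (1.0 + math.exp(-(temp - 15.0) / 3.0))


def simulate_quadrats(n_quadrats=2000, mean_abundance=3.0, seed=42):
    """Return a list of (temperature, true_count, observed_count) per quadrat."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_quadrats):
        temp = rng.uniform(5.0, 25.0)
        # True abundance does NOT depend on temperature in this sketch.
        true_count = poisson_draw(rng, mean_abundance)
        # Each individual present is found independently with probability p.
        p = detection_probability(temp)
        observed = sum(1 for _ in range(true_count) if rng.random() < p)
        rows.append((temp, true_count, observed))
    return rows


def summarize(rows):
    """Compare true vs. observed zeros, and observed means in cold vs. warm quadrats."""
    n = len(rows)
    cold = [r for r in rows if r[0] < 15.0]
    warm = [r for r in rows if r[0] >= 15.0]

    def mean(values):
        return sum(values) / max(1, len(values))

    return {
        "true_zero_frac": sum(1 for r in rows if r[1] == 0) / n,
        "observed_zero_frac": sum(1 for r in rows if r[2] == 0) / n,
        "mean_true_cold": mean([r[1] for r in cold]),
        "mean_true_warm": mean([r[1] for r in warm]),
        "mean_observed_cold": mean([r[2] for r in cold]),
        "mean_observed_warm": mean([r[2] for r in warm]),
    }


if __name__ == "__main__":
    for name, value in summarize(simulate_quadrats()).items():
        print(f"{name}: {value:.3f}")
```

In this setup the true means in cold and warm quadrats should come out about the same, while the observed means differ and the observed fraction of zeros exceeds the true fraction. An analysis of the observed counts alone cannot tell a detection effect from an abundance effect.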

If you lucked into a situation where you can go back and use repeated samples to check on detectability, you're in the world of occupancy models. For an intro, see:
http://www.uvm.edu/envnr/vtcfwru/spreadsheets/occupancy/occupancy.htm

Dave Hewitt
VIMS, Gloucester Point, VA

At 01:58 PM 1/14/2008 +0100, you wrote:
>As for the "Data set with many zeros" thread, I'm sure that ZIP,
>ZINB, ZAP, ZANB, etc. are absolutely appropriate (and effective) in
>some cases, but the real problem is that in some cases zeros cannot be
>regarded as plain numbers. They are just a completely different beast.
>
>Imagine you are relating the abundance of a given species to an
>environmental variable (e.g. temperature). Usually you'll find an
>optimum value (=max abundance) and a range of values that are
>associated with decreasing abundances.
>But what about zeros? Zeros mean: species present, but not found;
>species absent because of other reasons (i.e. temperature is ok,
>something else is not ok: e.g. competition); species absent because
>temperature is too low; species absent because temperature is too
>high; species absent because temperature is way too low; species
>absent because temperature is way too high; etc.
>
>Basically, all these responses produce a zero abundance (i.e. they are
>quantitatively identical), but they are completely different from the
>qualitative point of view. So, the bottom line is that data
>transformations as well as ZIP, ZINB, etc. models can solve some
>problems, but in some other cases what is really relevant is the
>meaning of the zero vs. non-zero values, and in those cases a
>qualitative approach (contingency tables, chi-square stats, etc., but
>keep it as simple as possible!) can be a better option.
