Re: Data set with many many zeros..... Help?

Dave Hewitt Mon, 14 Jan 2008 09:16:11 -0800

Jim R and Michele hit on very important and related points.

I almost wrote in my original reply that:

"If the critters being sampled in the quadrats are mobile, you'll need to 
verify that your ability to detect (and count) them is 100% and does not 
vary by quadrat (perhaps as a function of the factors you're "testing"). If 
detectability does vary, you've got a new set of problems."

If detectability varies (it almost always does) but is not measured, the 
count (abundance) information is suspect and confounded. One "solution" is 
to fall back on presence/absence data. This connects to Michele's point 
about what the zeros mean. The critical possibility in the context of 
detectability is that if you didn't find the critter in a given quadrat, it 
could have been present and you just missed it. This is part of the reason 
the counts are pretty useless - they're confounded with your ability to 
find the critters. This is worst when covariates you're interested in 
affect detectability itself, as Michele noted (e.g., temp).
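
To make the confounding concrete, here is a small simulation sketch. It assumes a single visit per quadrat, a true occupancy probability (psi) and a constant detection probability (p); the function name and the numbers are hypothetical and chosen only for illustration.

```python
import random

def simulate_survey(n_quadrats, psi, p, seed=0):
    """Simulate one visit per quadrat.

    psi: assumed probability that the critter is truly present
    p:   assumed probability of finding it, given that it is present
    Returns (number truly occupied, number where it was actually found).
    """
    rng = random.Random(seed)
    true_occ = 0
    obs_occ = 0
    for _ in range(n_quadrats):
        present = rng.random() < psi
        detected = present and (rng.random() < p)
        true_occ += present
        obs_occ += detected
    return true_occ, obs_occ

n = 10000
true_occ, obs_occ = simulate_survey(n, psi=0.6, p=0.5)
# The observed fraction tracks psi * p, not psi, so a naive zero
# mixes "absent" with "present but missed".
print("true occupancy fraction:    ", true_occ / n)
print("observed occupancy fraction:", obs_occ / n)
```

With one visit and no independent handle on p, any pair of psi and p with the same product gives the same expected data, which is why the zeros alone can't be interpreted.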

Jim noted a distinction between what questions could be answered with 
presence/absence data vs. those that could be answered with abundance data. 
Abundance is the holy grail, and perhaps this investigator is working with 
easy-to-count critters that don't move around. Then the counts are solid 
and you can trust the zeros to mean the critter didn't occur there. In 
those cases, the zero-inflated count models are an obvious choice. Even if 
you just have presence/absence data, without confusion about the zeros you 
can proceed with logistic regression techniques and ask questions about the 
covariates you're interested in, in much the same way you would with 
abundance in the zero-inflated count models.
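
For readers who haven't met them, a zero-inflated Poisson (ZIP) model can be sketched as a mixture: with probability pi the count is a "structural" zero, otherwise it is an ordinary Poisson count with mean lam. The function and parameter names below are my own, and this is only the probability model, not a fitting routine with covariates.

```python
import math

def zip_pmf(k, lam, pi):
    """P(count = k) under a zero-inflated Poisson.

    lam: Poisson mean for the non-structural part (assumed > 0)
    pi:  probability of a structural zero (assumed in [0, 1))
    """
    pois = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero can come from either component of the mixture.
        return pi + (1 - pi) * pois
    return (1 - pi) * pois

def zip_loglik(counts, lam, pi):
    """Log-likelihood of a list of counts for given lam and pi."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

# Hypothetical counts with many zeros, just to show the call.
example_counts = [0, 0, 0, 3, 0, 1, 0, 2, 0, 0]
print(zip_loglik(example_counts, lam=2.0, pi=0.3))
```

Note that this treats every recorded zero as a real zero; it does nothing about critters that were present but missed.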

If you have detectability issues and you didn't measure detectability or 
use a design that allows post hoc inferences about it, you're probably down 
to qualitative approaches like those Michele mentioned. But your 
"inferences" will be very weak. The best thing might be to use that as a 
lesson in future study planning.

If you lucked into a situation where you can go back and use repeated 
samples to check on detectability, you're in the world of occupancy models. 
For an intro, see:
http://www.uvm.edu/envnr/vtcfwru/spreadsheets/occupancy/occupancy.htm
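
As a rough sketch of the idea (not taken from the page above), one common single-season formulation assumes each site is occupied with probability psi and, if occupied, the critter is detected on each of K repeat visits with constant probability p. A site with no detections then contributes psi*(1-p)^K + (1-psi) to the likelihood. The simulated data, the function names and the crude grid search below are all my own assumptions; real analyses use dedicated software and let psi and p depend on covariates.

```python
import math
import random
from collections import Counter

def simulate_detections(n_sites, n_visits, psi, p, seed=1):
    """Return the number of detections at each site over n_visits visits."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        d = sum(1 for _ in range(n_visits) if occupied and rng.random() < p)
        counts.append(d)
    return counts

def occupancy_loglik(tally, n_visits, psi, p):
    """Log-likelihood; tally maps detections per site -> number of sites."""
    ll = 0.0
    for d, n in tally.items():
        if d > 0:
            # Detected at least once: the site must be occupied.
            site = psi * p**d * (1 - p)**(n_visits - d)
        else:
            # Never detected: occupied but always missed, or truly absent.
            site = psi * (1 - p)**n_visits + (1 - psi)
        ll += n * math.log(site)
    return ll

def fit_grid(tally, n_visits, step=0.01):
    """Crude maximum-likelihood fit by searching a grid of (psi, p)."""
    grid = [i * step for i in range(1, int(round(1 / step)))]
    best = max((occupancy_loglik(tally, n_visits, psi, p), psi, p)
               for psi in grid for p in grid)
    return best[1], best[2]

counts = simulate_detections(n_sites=2000, n_visits=4, psi=0.6, p=0.4)
tally = Counter(counts)
naive = sum(1 for d in counts if d > 0) / len(counts)
psi_hat, p_hat = fit_grid(tally, n_visits=4)
print("naive occupancy:", naive)
print("estimated psi:", psi_hat, "estimated p:", p_hat)
```

The repeat visits are what separate psi from p: sites with some detections and some misses carry the information about detectability that a single visit cannot provide.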

Dave Hewitt
VIMS, Gloucester Point, VA

At 01:58 PM 1/14/2008 +0100, you wrote:
>As for the "Data set with many zeros" thread, I'm sure that ZIP,
>ZINB, ZAP, ZANB, etc. are absolutely appropriate (and effective) in
>some cases, but the real problem is that in some cases zeros cannot be
>regarded as plain numbers. They are just a completely different beast.
>
>Imagine you are relating the abundance of a given species to an
>environmental variable (e.g. temperature). Usually you'll find an
>optimum value (=max abundance) and a range of values that are
>associated with decreasing abundances.
>
>But what about zeros? Zeros mean: species present, but not found;
>species absent because of other reasons (i.e. temperature is ok,
>something else is not ok: e.g. competition); species absent because
>temperature is too low; species absent because temperature is too
>high; species absent because temperature is way too low; species
>absent because temperature is way too high; etc.
>
>Basically, all these responses produce a zero abundance (i.e. they are
>quantitatively identical), but they are completely different from the
>qualitative point of view. So, the bottom line is that data
>transformations as well as ZIP, ZINB, etc. models can solve some
>problems, but in some other cases what is really relevant is the
>meaning of the zero vs. non-zero values, and in those cases a
>qualitative approach (contingency tables, chi-square stats, etc., but
>keep it as simple as possible!) can be a better option.
