On Sat, 12 Jan 2008 15:38:33 -0400, Stephen Cole <[EMAIL PROTECTED]> wrote:
>Hello Ecolog - I was wondering if anyone had any advice on the following
>problem.
>
>I have a data set that is infested by a plague of zeros that is causing me
>to violate all assumptions of classic parametric testing. These are true
>zeros in that the organisms in question did not occur in my randomly sampled
>quadrats. They are not "missing data".
>
>I have a fully nested hierarchical design.
>My response variable is density obtained from quadrat counts.
>My explanatory variables are as follows:
>
>Region (3 levels - fixed)
>Location(Region) (4 levels - random)
>Site(Location(Region)) (4 levels - random)
>
>My plan was to analyze the data with a nested ANOVA and then proceed to
>calculate variance components to allow me to parse out the variance that
>could be attributed to each spatial scale in my design. Since it is known
>that violations of assumptions severely distort variance components in
>random factors, I would really like to clean up my data set to meet the
>assumptions, but as of yet I have found no acceptable remedial measure.
>

Stephen,

The good news for you is that this is a common problem; it is called zero inflation. The usual solutions are zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-altered Poisson (ZAP), or zero-altered negative binomial (ZANB) GLMs. The zero-inflated models are mixture models; the zero-altered models are two-part models, also known as hurdle models. Just Google ZIP, ZINB, ZAP, ZANB (or hurdle models). There is a nice online pdf from Zeileis, Kleiber and Jackman, showing you how to do these analyses in R. The book from Cameron and Trivedi gives the maths. Our next book has a 40-page chapter on this stuff (in R), but that won't help you now.

The difference between ZI and ZA is the nature of the zeros (false zeros or true zeros), and the difference between Poisson and NB is whether you have extra overdispersion due to the counts, or only due to the zeros.

Software in R for this stuff is reasonably new. Packages pscl and VGAM are good starting points.
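To make the ZI/ZA distinction concrete, here is a minimal sketch of the two Poisson-based probability functions. It is written in plain Python purely to stay dependency-free; it is not how pscl or VGAM implement these models. The function names `zip_pmf` and `zap_pmf` and the parameter names `lam` (Poisson mean) and `pi` (zero probability) are illustrative assumptions, not taken from any package.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (a mixture).

    With probability pi the observation is an extra zero; otherwise it
    is drawn from Poisson(lam), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def zap_pmf(k, lam, pi):
    """Zero-altered (hurdle) Poisson (a two-part model).

    A zero occurs with probability pi; every positive count comes from
    a Poisson(lam) truncated at zero, so all zeros have a single source.
    """
    if k == 0:
        return pi
    return (1.0 - pi) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

if __name__ == "__main__":
    lam, pi = 2.0, 0.4  # made-up illustrative values
    for k in range(4):
        print(k, round(zip_pmf(k, lam, pi), 4), round(zap_pmf(k, lam, pi), 4))
```

In the ZIP version the zeros come from two sources (the extra-zero process and the count process), whereas in the ZAP version they come from one. For real data with covariates, a fitted model from a package such as pscl would replace these fixed `lam` and `pi` values with regression-based ones.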
The bad news is that I am not sure what is available in terms of software for ZIPs + random effects. Both Cameron and Trivedi and Hilbe (2007) discuss these methods in the context of random effects. There was a paper in Environmetrics (end of 2007) applying ZIP with spatial/temporal correlation on seal data...in R. There are more, all very recent, papers with ZIP/ZAP + random effects. You may have to write the software code for doing this...I don't know.

Having said that...you say that your random effects have 4 levels. I doubt that this is enough to estimate their variances reliably! Perhaps you should consider them as fixed? See Pinheiro and Bates.

ZIP/ZAP is very interesting stuff!

Alain

Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
   Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
   URL: www.springer.com/0-387-45967-7

2. Analysing Ecological data using GLMM and GAMM in R. (2008).
   Zuur, AF, Ieno, EN, Walker, N and Smith, GM. Springer.

3. An introduction to R for life scientists:
   - With a paper submission guide - (2008).
   Zuur, AF, Ieno, EN and Meesters, EHGW. Springer

Other books: http://www.brodgar.com/books.htm

Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: [EMAIL PROTECTED]
URL: www.highstat.com
URL: www.brodgar.com

>Has anyone else run into this problem when analysing abundance data? I am
>aware of conditional models, but I have no practical experience with them
>and I am not even sure how to proceed with analysis in that case. I have
>been using the R program to tackle this problem and I have also found no
>advice on the r-help mailing list.
>
>Thanks for any help that can be provided
>
>Stephen Cole
>Marine Ecology Lab
>Saint Francis Xavier University
