Z is correct, of course. I was being a little too simplistic in my explanation, which was only meant to emphasize the reversal of signs of the coefficients in the logistic regression part of the zero-inflated model.
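To make the distinction concrete, here is a minimal base-R sketch that plugs the coefficients from the zeroinfl() output quoted in this thread into the mixture formula. The discharge values are hypothetical and chosen only for illustration; the sketch assumes the usual negative binomial parameterization with size = theta and mean mu.

```r
# Coefficients copied from the zeroinfl() summary quoted in this thread.
count_coef <- c(intercept = 2.557066, discharge = 0.064698)  # log link
zero_coef  <- c(intercept = 13.01011, discharge = -1.64293)  # logit link
theta      <- 0.4604

# Hypothetical discharge values (an assumption, not taken from the data).
discharge <- c(5, 8, 12)

# Probability of belonging to the zero mass component ("excess zero").
p_excess <- plogis(zero_coef[["intercept"]] + zero_coef[["discharge"]] * discharge)

# Mean of the negative binomial count component.
mu <- exp(count_coef[["intercept"]] + count_coef[["discharge"]] * discharge)

# Probability of a "random" zero drawn from the count component.
p_nb_zero <- dnbinom(0, size = theta, mu = mu)

# Probability of observing a zero: both sources combined.
p_zero <- p_excess + (1 - p_excess) * p_nb_zero

result <- data.frame(discharge, p_excess, p_nb_zero, p_zero)
print(round(result, 4))
```

The p_zero column is never smaller than p_excess, which is the point made in the reply quoted below: negating the zero-inflation coefficients gives the probability of not being an excess zero, not the probability of a nonzero count. With a fitted model object at hand, predict() on a zeroinfl fit with type = "zero" or type = "prob" should give the corresponding quantities directly.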

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO 80526-8818

email: [email protected]
tel: 970 226-9326


On Wed, Aug 14, 2013 at 4:07 AM, Achim Zeileis <[email protected]> wrote:

> On Tue, 13 Aug 2013, Cade, Brian wrote:
>
>> Lauria: For historical reasons the logistic regression (binomial with
>> logit link) model portion of a zero-inflated count model is usually
>> structured to predict the probability of the 0 counts rather than the
>> nonzero (>=1) counts so the coefficients will be the negative of what you
>> expect based on the count model portion (as in your output). It is simple
>> to interpret the probability of the logistic regression portion as the
>> probability of the nonzero counts by just taking the negative of the
>> coefficient estimates provided for the probability of the zero counts.
>
> This is a common misinterpretation but not quite correct.
>
> The zero-inflation model is a mixture model of two components: (1) a count
> component (Poisson, NB, ...), and (2) a zero mass component (i.e., zero
> with probability 1). Hence, the observed zeros in the data can come from
> both sources: either they are "random" zeros from component (1) or "excess"
> zeros from component (2).
>
> The binomial zero-inflation part of the model predicts the probability
> that a given observation belongs to component (2), that is, the probability
> of an "excess zero". But this is _not_ the probability of observing a zero
> in the data (which is larger than the excess zero probability).
>
> If you want a model that first models zero vs. non-zero and second the
> non-zero counts, use the hurdle model. This has exactly the interpretation
> you describe above.
>
> Best,
> Z
>
>> Brian
>>
>> On Tue, Aug 13, 2013 at 9:06 AM, Lauria, Valentina <[email protected]> wrote:
>>
>>> Dear All,
>>>
>>> I am running a negative binomial model in R using the package pscl in
>>> order to estimate bed sediment movements versus river discharge.
>>> Currently we have deployed 4 different plates to test if a combination
>>> of more than one plate would better describe the sediment movements
>>> when the river discharge changes over time.
>>>
>>> My data are positively skewed and zero-inflated. I did run both
>>> zero-inflated Poisson and zero-inflated negative binomial regression and
>>> compared them using the Vuong test, which showed that the negative
>>> binomial works better than a simple zero-inflated Poisson.
>>>
>>> My models look like:
>>>
>>> 1) plate1 ~ river discharge
>>> 2) (plate 1 + plate 2) ~ river discharge
>>> 3) (plate 1 + plate 2 + plate 3) ~ river discharge
>>> 4) (plate 1 + plate 2 + plate 3 + plate 4) ~ river discharge
>>>
>>> My main problem, as I am new to this type of model, is that I get a
>>> different sign for the coefficient of discharge in the output of the
>>> zero-inflated negative binomial model (please see below). What does this
>>> mean? Also how could I compare the different models (1-4), i.e. what
>>> tells me which is performing best? Thank you very much in advance for
>>> any comments and suggestions!!
>>>
>>> Kind Regards,
>>> Valentina
>>>
>>>
>>> Call:
>>> zeroinfl(formula = plate1 ~ discharge, data = datafit_plates, dist = "negbin", EM = TRUE)
>>>
>>> Pearson residuals:
>>>     Min      1Q  Median      3Q     Max
>>> -0.6770 -0.3564 -0.2101 -0.0814 12.3421
>>>
>>> Count model coefficients (negbin with log link):
>>>              Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)  2.557066   0.036593   69.88   <2e-16 ***
>>> discharge    0.064698   0.001983   32.63   <2e-16 ***
>>> Log(theta)  -0.775736   0.012451  -62.30   <2e-16 ***
>>>
>>> Zero-inflation model coefficients (binomial with logit link):
>>>             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) 13.01011    0.22602   57.56   <2e-16 ***
>>> discharge   -1.64293    0.03092  -53.14   <2e-16 ***
>>>
>>> Theta = 0.4604
>>> Number of iterations in BFGS optimization: 1
>>> Log-likelihood: -6.933e+04 on 5 Df

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

