On Sat, 5 May 2012, Christopher Desjardins wrote:

Hi,
I am a little confused at the output from predict() for a zeroinfl object.

Here's my confusion:

## From the pscl package (zeroinfl() and the bioChemists data)
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")


## The raw zero-inflated overdispersed data
> table(bioChemists$art)

  0   1   2   3   4   5   6   7   8   9  10  11  12  16  19
275 246 178  84  67  27  17  12   1   2   1   1   2   1   1

## The default output from predict. It looks like it is doing a horrible
## job. Does it really predict 7 zeros?

No. See also this R-help post on "Zero-inflated regression models: predicting no 0s":
https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html

The predicted _mean_ of a negative binomial distribution is generally not its most likely outcome (i.e., the _mode_). By default, predict() returns that mean for each observation, so tabulating the rounded means does not show how many zeros the model expects. The post above presents some hands-on examples.
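To make the mean-versus-mode point concrete, here is a minimal base R sketch for a single observation. The parameter values (pi0, mu, theta) are made up for illustration and are not estimates from fm_zinb2; the zero-inflated density is built by hand from dnbinom().

```r
# Hypothetical parameters for one observation (NOT estimates from fm_zinb2)
pi0   <- 0.2   # probability of a structural zero
mu    <- 2     # mean of the negative binomial count component
theta <- 1.5   # negative binomial dispersion (size)

# Zero-inflated negative binomial probabilities P(Y = k), k = 0, ..., 200
k <- 0:200
p <- (1 - pi0) * dnbinom(k, size = theta, mu = mu)
p[1] <- p[1] + pi0          # structural zeros add mass at k = 0

zinb_mean <- (1 - pi0) * mu # mean of the zero-inflated distribution
zinb_mode <- k[which.max(p)] # most likely outcome

round(zinb_mean)  # 2
zinb_mode         # 0
p[1]              # P(Y = 0), about 0.42
```

Under these assumed values the rounded mean is 2, although 0 is the single most likely outcome with probability of about 0.42. Rounding the mean therefore hides the zeros that the model does assign substantial probability to.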


> table(round(predict(fm_zinb2)) )

  0   1   2   3   4   5   6  10
  7 354 487  45  12   6   3   1

##  The output from predict using "count"
> table(round(predict(fm_zinb2,type="count")))

  1   2   3   4   5   6  10
312 536  45  12   6   3   1

## The output from predict using "zero", but here it predicts 24
## "structural" zeros?
> table(round(predict(fm_zinb2,type="zero")))

  0   1
891  24


So my question is how do I interpret these different outputs from the
zeroinfl object? What are the differences? The help page just left me
confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y)
and would most accurately track table(bioChemists$art) but I am wrong. How
can I find the E(Y) that would most closely track the raw data?
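As a rough illustration of how the pieces relate, the base R sketch below uses four made-up observations (the values of mu, pi0 and theta are assumptions, not output from fm_zinb2). It mirrors what the predict() types represent: "count" is the mean of the count component, "zero" is the probability of a structural zero (so rounding it only counts observations where that probability exceeds 0.5), and "response" is their combination (1 - zero) * count. Frequencies comparable to table(bioChemists$art) come from summing the predicted probabilities over observations rather than from rounding means. As far as I know, predict(fm_zinb2, type = "prob") returns such a probability matrix for a fitted model, so colSums() of it would be the analogue of expected_freq here.

```r
# Hypothetical per-observation parameters (NOT taken from the fitted model)
mu    <- c(1.2, 1.8, 2.5, 3.0)     # count-component means, cf. type = "count"
pi0   <- c(0.25, 0.15, 0.10, 0.05) # structural-zero probabilities, cf. type = "zero"
theta <- 2                         # negative binomial dispersion (size)
k     <- 0:50

# Row i holds P(Y_i = k) for k = 0, ..., 50, cf. type = "prob"
prob <- t(mapply(function(m, p) {
  d <- (1 - p) * dnbinom(k, size = theta, mu = m)
  d[1] <- d[1] + p                 # add the structural-zero mass at k = 0
  d
}, mu, pi0))

# Overall means, cf. type = "response"
resp <- (1 - pi0) * mu

# Tabulating rounded means versus summing probabilities
rounded_means <- table(factor(round(resp), levels = 0:5))
expected_freq <- colSums(prob)

rounded_means["0"]  # no observation has a rounded mean of 0
expected_freq[1]    # yet more than one zero is expected among the four
```

The rounded means contain no zeros at all, while the summed probabilities say that more than one of the four observations is expected to be zero. The second quantity is the one to compare against the observed frequency table.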

Thanks,
Chris

