I am analysing parasite egg count data and am having trouble with glm with a
negative binomial family.
In my first data set, 55% of the 3000 cases have a zero count, and the
non-zero counts range from 94 to 145,781.
Eventually, I want to run bic.glm, so I need to be able to use
glm(family=neg.bin(theta)). But first I ran glm.nb to get an estimate of theta:
> hook.nb<- glm.nb(fh, data=hook)
This works fine, with no errors, and summary(hook.nb) produces the
following:
Call:
glm.nb(formula = fh, data = hook, init.theta = 0.0938126159640384,
    link = log)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.45830  -1.16385  -1.01820   0.00535   2.86513

<snip>

              Theta:  0.09381
          Std. Err.:  0.00299

 2 x log-likelihood:  -23750.45300
Then I tried to use this estimate of theta to specify a glm of the negative
binomial family but got the following error:
> hook.fam<-glm(fh, data=hook, family=neg.bin(0.09381))
Error: NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
step size truncated due to divergence
When I change theta to be 1 or more, the glm converges (as does bic.glm), so
I thought maybe the estimate of theta was wrong. But when I ran a negative
binomial regression in Stata, I got the same theta. (theta = 1/alpha =
1/10.65954 = .09381268)
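
As a quick check of that conversion, here is the arithmetic in R. The variable names alpha.stata, theta.stata and theta.r are just labels I made up for this illustration; the numbers are the ones quoted above.

```r
# Stata reports the dispersion as alpha; MASS uses theta = 1/alpha
alpha.stata <- 10.65954          # alpha from the Stata fit
theta.stata <- 1 / alpha.stata   # convert to the MASS parameterisation
theta.r     <- 0.09381           # theta as printed by summary(hook.nb)

# Difference between the two estimates (should be tiny)
abs(theta.stata - theta.r)
```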
In my second data set, 75% of the cases have zero counts, and the
non-zero counts range from 94 to 16,688. In this case, I get warnings when I
run glm.nb:
> asc.nb<-glm.nb(fa, data=asc)
There were 26 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: algorithm did not converge in: glm.fitter(x = X, y = Y, w = w, etastart = eta, offset = offset, ...
<exactly the same message in 2-24>
25: algorithm did not converge in: glm.fitter(x = X, y = Y, w = w, etastart = eta, offset = offset, ...
26: alternation limit reached in: glm.nb(fa, data = asc)
Despite these warnings, I do get output:
Call:
glm.nb(formula = fa, data = asc, init.theta = 0.030379484707051,
    link = log)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-0.949  -0.787  -0.745  -0.645   2.576

<snip>

              Theta:  0.03038
          Std. Err.:  0.00125
Warning while fitting theta: alternation limit reached

 2 x log-likelihood:  -15743.70600
Again, this estimate of theta agrees pretty well with Stata's estimate
(theta = 1/alpha = 1/32.45225 = .03081451). But the glm with the negative
binomial family specification gave the same error as above:
> asc.fam<-glm(fa, data=asc, family=neg.bin(0.0304))
Error: NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
step size truncated due to divergence
Everything runs smoothly when theta is 1 or more, so I don't think anything
is wrong with my data (there are no missing values, and all values are real
numbers). I think the problem must be with theta or with how I specify it. I
have the Venables and Ripley book, and I am able to run
glm(family=negative.binomial(0.03)) and
bic.glm(glm.family=negative.binomial(0.03)) on the quine data that comes
with MASS. I have searched the R-help archives and Google, but have not
found much besides a few old bug reports (since fixed) to help me figure out
why one of the glm.nb fits did not converge and why both of the
glm(family=neg.bin()) calls throw errors. Any ideas?
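
For concreteness, here is a minimal sketch of the two-step pattern I am using, shown on the quine data where it works. The main-effects formula and the object names quine.nb, theta.hat and quine.fam are only for illustration; this is not my actual model or data.

```r
library(MASS)   # provides glm.nb, negative.binomial and the quine data

# Step 1: let glm.nb estimate theta by maximum likelihood
quine.nb  <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
theta.hat <- quine.nb$theta   # full-precision estimate, not the rounded printout

# Step 2: refit with glm, holding theta fixed at that estimate
quine.fam <- glm(Days ~ Eth + Sex + Age + Lrn, data = quine,
                 family = negative.binomial(theta.hat))

# The coefficients from the two fits should agree closely
all.equal(coef(quine.nb), coef(quine.fam), tolerance = 1e-3)
```

On my own data the same two steps are glm.nb(fh, data=hook) followed by glm(fh, data=hook, family=neg.bin(theta)), and it is the second step that fails.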
Thanks,
Elizabeth