Dear all, I have a puzzle regarding the estimation of Neg. Binomial event count model in R. I would greatly appreciate if anyone could shed some light on my puzzle.
Using the glm.nb command, or the zelig command developed by Gary King et. al., I obtain the same point estimates in R as well as in Stata. However, if I write my own likelihood function to estimate a neg. binomial event count model, my coefficient estimates are a little bit different (more than 5%), and my log-likelihood score is actually better. When I run Monte Carlo simulation, I found the coefficient estimates of my likelihood function have smaller variance and are closer to the true values than the glm.nb or Stata results do. I don't quite understand why my neg. binomial likelihood function produce different result from glm.nb or Stata, because I have written log-likelihood function for OLS, Logit, Probit, Poisson event count model, and my point estimates are all very close to the results produced by glm.nb or Stata, and I think the tiny difference might be due to different optimization algorithm used. In the negative binomial case, the difference is too big to be credit to optimization algorithm. Here is my code for data generating process as well as estimation. ****************************************************************************** # Data Generation obs=500 beta = c(0,1) # Generate the independent variable z= runif(obs) # Generate X matrix x = cbind(rep(1,obs),z) # Generate the dependent variable from a neg. binomial distribution phi = exp(x%*%beta) sigma_2 = 10 y = rnbinom(obs,size=phi/(sigma_2-1),mu=phi) # Neg. Binomial log-likelihood function like.nb=function(par,x,y) { phi_e=exp(x%*%par[1:2]) sig2_e=exp(par[3])+1 sum(lgamma(phi_e/(sig2_e-1)+y)-lgamma(phi_e/(sig2_e-1))+y*log((sig2_e-1)/sig2_e)-(phi_e/( sig2_e-1))*log(sig2_e)) } # Ng. Binomial regression b = solve(t(x)%*%x,(t(x)%*%y)) b = matrix(c(b,var(y)/mean(y)),nrow=3) # set initial value for means and variance nb.res = optim(b,like.nb, y=y, x=x, method="BFGS", control=list(fnscale=-1), hessian=T) nbcoef=nb.res$par nbvcovm=solve(-nb.res$hessian) # Alternative glm function library(MASS) summary(glm.nb(y~z)) ****************************************************************************** Thanks a lot in advance for your help. Best, Xiaobo ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.