Dear Lister, I am facing a strange problem fitting a GLM of the negative binomial family. Actually, I tried to estimate theta (the scale parameter) through glm.nb from MASS and could get convergence only relaxing the convergence tolerance to 1e-3. With warning messages:
glm1<-glm.nb(nbcas~.,data=zonesdb4,control=glm.control(epsilon = 1e-3)) There were 25 warnings (use warnings() to see them) > warnings() Warning messages: 1: iteration limit reached in: theta.ml(Y, mu, n, w, limit = control$maxit, trace = control$trace > ... 2: NaNs produced in: sqrt(1/i) etc.... The estimate of theta was: 0.0939. When trying to compute confidence interval then, I got this message: > confint(glm1a) Waiting for profiling to be done... Error in profile.glm(object, which = parm, alpha = (1 - level)/4, trace = trace) : profiling has found a better solution, so original fit had not converged Moreover, trying some other solutions "by hand" (without warnings produced, here) with glm(.... family=negative.binomial(1)....), I found that theta = 0.7 lead to a much lower AIC (AIC= 1073) than theta = 1 (AIC=1211). Facing such unstable results my first reaction has been to forget about fitting a negative binomial model on this data set. The reader will find the dataset in a dumped format below for trials. A friend of mine tried the same with STATA and got the following result without any warning from STATA : . glm nbcas pop area v_100khab gares ports axe_routier lacs, family(nbinomial) link(log) eform Iteration 0: log likelihood = -616.84031 Iteration 1: log likelihood = -599.77767 Iteration 2: log likelihood = -597.22486 Iteration 3: log likelihood = -597.14784 Iteration 4: log likelihood = -597.14778 Iteration 5: log likelihood = -597.14778 Generalized linear models No. of obs = 92 Optimization : ML Residual df = 84 Scale parameter = 1 Deviance = 597.0007978 (1/df) Deviance = 7.107152 Pearson = 335.6135273 (1/df) Pearson = 3.995399 Variance function: V(u) = u+(1)u^2 [Neg. Binomial] Link function : g(u) = ln(u) [Log] AIC = 13.15539 Log likelihood = -597.1477759 BIC = 217.1706 ------------------------------------------------------------------------------ | OIM nbcas | IRR Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- pop | 1.000011 1.82e-06 6.02 0.000 1.000007 1.000014 area | 1.000014 .0000244 0.57 0.569 .9999661 1.000062 v_100khab | 2.485394 .7924087 2.86 0.004 1.330485 4.642806 gares | 2.185483 .7648255 2.23 0.025 1.100686 4.339418 ports | .1805793 .100423 -3.08 0.002 .0607158 .5370744 axe_routier | .828243 .2258397 -0.69 0.489 .4853532 1.413376 lacs | 20.50183 12.17126 5.09 0.000 6.404161 65.63311 Has somebody an idea about (1) why the AIC values given are so different between softwares (R = 1211, STATA= 13.15) for the same model and (2) what can explain so different behaviour between R and STATA ? Here below the data.frame: zonesdb4 <- structure(list(nbcas = as.integer(c(318, 0, 42, 3011, 6, 911, 45, 273, 0, 0, 89, 122, 407, 83, 0, 1844, 58, 0, 0, 0, 0, 8926, 0, 0, 0, 0, 108, 0, 13, 1884, 0, 0, 0, 0, 963, 0, 199, 735, 0, 2182, 971, 0, 65, 0, 7927, 30, 0, 186, 0, 1363, 808, 0, 0, 0, 0, 135, 0, 1338, 0, 0, 488, 0, 344, 0, 0, 454, 4808, 0, 692, 0, 0, 24, 1301, 0, 0, 474, 228, 0, 0, 98, 44, 0, 0, 0, 1562, 375, 327, 0, 0, 0, 0, 0)), pop = as.integer(c(247215, 55709, 63625, 253153, 51789, 142806, 129839, 95799, 129996, 66668, 76043, 267232, 153200, 136333, 264888, 198244, 233600, 89152, 128085, 71803, 81911, 122523, 149806, 122470, 50979, 160773, 80700, 56146, 226965, 245322, 165768, 215129, 46843, 108471, 108690, 188724, 161794, 226965, 95850, 156326, 145291, 51789, 218808, 53189, 245854, 152047, 146509, 243765, 65012, 226830, 66742, 144762, 93858, 73793, 123107, 189793, 91013, 135212, 67487, 105050, 194903, 206606, 62169, 96832, 145570, 167062, 1598576, 146509, 103928, 118334, 91509, 295644, 139650, 106980, 66529, 126126, 257341, 56973, 33793, 164072, 145225, 155638, 131100, 100880, 245482, 166213, 127365, 113713, 57540, 78571, 62499, 81916)), Area = c(10027.1, 9525.3, 638.646, 861.486, 4966.32, 11973, 1823.89, 1327.45, 789.595, 4892.38, 638.908, 15959.8, 2036.56, 7397.62, 4626.03, 10237.5, 9823.24, 4253.59, 2448.78, 12280.2, 910.972, 16365, 2041.92, 4343.46, 1081.42, 1601.11, 4664.47, 335.865, 2818.68, 7348.1, 1148.41, 265.158, 14883.6, 3698.58, 12444.6, 1711.45, 15462, 10419.5, 13119.2, 1276.14, 872.91, 19291.4, 6719.82, 8505.53, 13219.8, 13069, 5212.03, 3924.42, 791.219, 881.281, 10038.5, 9101.94, 7925.71, 943.062, 9888, 20585.3, 4600.35, 3258.27, 11813.4, 130.184, 10644.3, 1925.89, 1892.88, 3833.6, 350.3, 7154.79, 2800.63, 559.986, 3152, 7095.39, 6058.3, 113.225, 5067.66, 1293.05, 15109.8, 4111.54, 94.5213, 4012.91, 1468.02, 10651.3, 8541.69, 1806.28, 20166.3, 1110.75, 2026.98, 21114.4, 2041.51, 17740.9, 16627.5, 15266.1, 525.467, 371.132), V_100kHab = structure(as.integer(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2)), .Label = c("0", "1"), class = "factor"), gares = structure(as.integer(c(2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Label = c("0", "1"), class = "factor"), ports = structure(as.integer(c(2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1)), .Label = c("0", "1"), class = "factor"), axe_routier = structure(as.integer(c(2, 3, 3, 3, 2, 2, 2, 3, 2, 3, 2, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 3, 2, 2, 3, 2, 3, 2, 3, 3, 3, 2, 2, 3, 2, 3, 2, 3, 2, 3, 3, 2, 3, 2, 3, 2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, 3, 3, 2, 3, 3, 2, 3, 3, 2, 3, 2, 3, 3, 3, 2, 3, 3, 3, 2, 2, 3, 3)), .Label = c("0", "1", "2"), class = "factor"), lacs = structure(as.integer(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1)), .Label = c("0", "1"), class = "factor")), .Names = c("nbcas", "pop", "Area", "V_100kHab", "gares", "ports", "axe_routier", "lacs"), row.names = c("Ankoro", "Dilolo", "Tshitenge", "Tshitshimbi", "Tshofa", "Tshudi Loto", "Vanga Kete", "Wikong", "Djalo Djeka", "Fungurume", "Gandajika", "Kabalo", "Kabeya Kamuanga", "Kabinda", "Kabondo Dianda", "Kabongo", "Kafakumba", "Bena Dibele", "Kafubu", "Kalamba", "Kalambayi Kabanga", "Kalemie", "Kalenda", "Kalonda Est", "Kamalondo", "Kamana", "Kambove", "Kamiji", "Bibanga", "Kamina", "Kampemba", "Kanda Kanda", "Kansimba", "Kanzenze", "Kapanga", "Kapolowe", "Kasaji", "Kasenga", "Katako Kombe", "Katuba", "Kenya", "Kiambi", "Kikula", "Kilela Balanda", "Kilwa", "Kinda", "Bukenya", "Kinkondja", "Kipushi", "Kisanga", "Kitenge", "Kole", "Kongolo", "Likasi", "Lodja", "Lomela", "Lualaba", "Butumba", "Lubao", "Lubilanji", "Lubudi", "Lubumbashi", "Ludimbi Lukula", "Lukafu", "Lukelenge", "Lusambo", "Cilundu", "Makota", "Malemba Nkulu", "Manika", "Manono", "Miabi", "Minga", "Muene Ditu", "Mufunga Sampwe", "Mukanga", "Mukumbi", "Mulongo", "Mulumba", "Mutshatsha", "Nyemba", "Dilala", "Nyunzu", "Panda", "Pania Mutombo", "Pweto", "Rwashi", "Sakania", "Sandoa", "Songa", "Tshamilemba", "Tshilenge"), class = "data.frame", na.action = structure(as.integer(c(8, 34, 40, 41, 45, 71, 73, 79, 80, 83, 84, 85, 86, 93)), .Names = c("Wembo Nyama", "Kaniama", "Kasansa", "Bukama", "Kayamba", "Luputa", "Lwamba", "Mbuji mayi", "Mbulula", "Mitwaba", "Moba", "Dikungu Tshumbe", "Mpokolo", "Mumbunda"), class = "omit")) [[alternative HTML version deleted]] ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.