Hi,

Until now, I assumed that the results of glm() and bigglm() would coincide. Was that a naive assumption?

Anyway, I've been using bigglm() on some datasets I have available. One of them has more than 15M observations, with three continuous predictors (A, B, C) and a binary outcome (Y). I tried the following:

  m1 <- bigglm(Y~A+B+C, family=binomial(), data=dataset1, chunksize=10e6)
  m2 <- bigglm(Y~A*B+C, family=binomial(), data=dataset1, chunksize=10e6)
  imp <- m1$deviance-m2$deviance

To my surprise, "imp" was negative, even though m1 is nested in m2. I then fitted the same models with glm() instead and, as I expected, "imp" was positive.

I also noticed differences between the coefficients estimated by glm() and bigglm(). The differences are small, though, and the CIs for a given coefficient overlap across the two methods.

Are such discrepancies expected? What can I use to check for convergence with bigglm(), since a lack of convergence is one plausible cause of a negative difference in the deviances?

Thank you very much,

-benilton

> sessionInfo()
R version 2.5.0 (2007-04-23)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.iso885915;LC_NUMERIC=C;LC_TIME=en_US.iso885915;LC_COLLATE=en_US.iso885915;LC_MONETARY=en_US.iso885915;LC_MESSAGES=en_US.iso885915;LC_PAPER=en_US.iso885915;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.iso885915;LC_IDENTIFICATION=C

attached base packages:
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
[7] "base"

other attached packages:
biglm
"0.4"