Have you considered "lmer" in library(lme4)? See, for example, sec. 4 on "Two-level models for binary data" in vignette("MlmSoftRev") with library(mlmRev), in addition to www.r-project.org -> "Documentation: Newsletter" -> "R News Volume 5/1" -> "Fitting Linear Mixed Models in R" by Doug Bates, pp. 27-30.
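To give a rough idea of what such a two-level binary model looks like in code, here is a minimal sketch on simulated data (not the mortgage data from the question). It assumes a current lme4 release, where binomial mixed models are fit with glmer() rather than lmer(). The variable names group, x and y and all simulation settings are made up for illustration. The random intercept (1 | group) stands in for unobserved group-level heterogeneity, and its estimated variance is one simple summary of how much of that heterogeneity the fixed effects leave unexplained.

```r
# Sketch only: simulated two-level binary data, hypothetical names throughout.
library(lme4)

set.seed(42)
n_groups <- 100                      # hypothetical number of groups
n_per    <- 20                       # hypothetical observations per group

group <- factor(rep(seq_len(n_groups), each = n_per))
x     <- rnorm(n_groups * n_per)     # one observed covariate
u     <- rnorm(n_groups, sd = 1)     # unobserved group-level effect

# Linear predictor on the logit scale, then a 0/1 response
eta <- -1 + 0.5 * x + u[as.integer(group)]
y   <- rbinom(length(eta), size = 1, prob = plogis(eta))
dat <- data.frame(y, x, group)

# Logit model with a random intercept for each group
fit <- glmer(y ~ x + (1 | group), data = dat, family = binomial)
print(summary(fit))

# Estimated variance of the random intercept
re_var <- as.data.frame(VarCorr(fit))$vcov[1]
print(re_var)
```

One possible use, under the same assumptions, is to refit the model with additional covariates and compare re_var across fits to see whether the remaining group-level variance shrinks.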
If you have more questions after reviewing this material, please submit another question, preferably following the posting guide (http://www.R-project.org/posting-guide.html). The posting guide is not just another symbol of bureaucracy. It was written to help questioners improve the chances that they will get the information they want quickly. I believe it is quite effective when it is used. Many people get answers to their questions in minutes, but that requires a question that a potential respondent can understand, and answer sensibly, in seconds.

spencer graves

Kyle G. Lundstedt wrote:

> Hello,
>
> I'm interested in correcting for and measuring unobserved
> heterogeneity ("missing variables") using R. In particular, I'm
> searching for a simple way to measure the amount of unobserved
> heterogeneity remaining in a series of increasingly complex models
> (adding additional variables to each new model) on the same data.
>
> I have a static database of 400,000 or so individual mortgage
> loans, each of which is observed monthly from origination (t=0) until
> termination (a binary yes/no variable). In my update database, there
> are up to 60 months of observed data for each loan in the static
> database, and an individual loan has an "average life" of roughly 36
> months.
>
> Each loan has static covariates observed at origination, such as
> original loan amount and credit score, as well as time-varying
> covariates (TVC) such as age, interest rates, and house prices.
> Because these TVC change each month, I've constructed a modeling
> database that merges the static database with the update database.
>
> The resulting "loan-month" modeling database has one observation
> for every loan-month, and the static covariates remain the same for
> all loan-months for a given loan. Thus, the modeling database has
> roughly 14.4 million loan-month records. A loan is considered
> "active" as long as it has not yet terminated or been censored; my
> interest is in predicting termination.
>
> This type of data is often referred to as "event history" or
> "discrete hazard" data. The standard R package to apply to such data
> is "survival", with which I could estimate a Cox proportional hazard
> model using coxph. The advantage of such an approach is that
> unobserved heterogeneity is easily addressed using the "frailty" term.
>
> The disadvantages, at least for my purposes, are two-fold.
> First, my audience is unfamiliar with hazard models. Second, my
> monthly data has many "ties" (many terminations in the same month),
> so I've been told that coxph won't work well on a large dataset with
> many ties.
>
> On the other hand, because the data is measured discretely each
> month, many references suggest applying generalized linear models
> (GLM, "logit"-type models) or even generalized additive models
> (GAM, "logit"-type models that incorporate nonlinearity in individual
> covariates). The advantage to this approach is that GLM and GAM are
> readily available in R, and my audience is very familiar with
> logit-type models.
>
> The disadvantage, however, is that I am totally unfamiliar with
> ways to correct for and measure unobserved heterogeneity using
> GLM/GAM-type models. I've been told that unobserved heterogeneity in
> the hazard framework is analogous to random effects in the GLM/GAM
> framework, but there seem to be a number of R packages that address
> this issue in different ways.
>
> So, I'd greatly appreciate suggestions on a simple way to
> incorporate unobserved heterogeneity into a GLM/GAM-type model. I'm
> not much of a statistician, so simple examples are always helpful.
> I'm also happy to track down specific article/book references, if
> folks think those might be of help.
>
> Many thanks,
> Kyle
> ---
> kyle at hotmail . com
> (email altered in obvious ways)
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915