Dear Saurav,

I get the feeling that you are looking for mixed models. Try something like:
    library(lme4)
    glmer(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m + (1 | yearctry),
          family = binomial(link = "probit"), data = adpopdata)

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of saurav pathak
Sent: Thursday 16 July 2009 12:18
To: r-help@r-project.org
Subject: [R] PROBIT REGRESSION FOR GROUPED/CLUSTERED DATA

Hello all,

I have been working to fix this for weeks now. It should be simple to fix. Please help.

Let me explain what I am doing. I have a data set for 65 countries over a period of 9 years (2000-2008). Each country has, on average, about 2000 interviews per year, so the total set has roughly 65*9*2000 data points/observations (of course there are missing values as well).

Now let me explain how the data are clustered or grouped. I use the variable "yearctry", which is computed as year*10000 + international phone code of the country. For example, the USA, with calling code 001, will have a yearctry value of 2000001 for the year 2000.
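As a small illustration of the grouping key described above, the following R sketch builds "yearctry" from a year and a calling code. The column names `year` and `phone_code` and the toy rows are assumptions made for this example only. Note that the example values quoted in the post (2000001, 2000044, ...) correspond to a multiplier of 1000 rather than 10000, so the sketch uses 1000; which multiplier the real data use cannot be determined from the post.

```r
# Hypothetical toy rows: `year` and `phone_code` are assumed column names,
# not taken from the original data set.
toy <- data.frame(
  year       = c(2000, 2000, 2001, 2001),
  phone_code = c(1, 44, 1, 44)          # USA = 1, UK = 44
)

# A multiplier of 1000 reproduces the example values quoted in the post
# (assumption: the stated "year*10000" is not what generated those values).
toy$yearctry <- toy$year * 1000 + toy$phone_code

# The key can be split back into its parts with integer division and modulo.
toy$year_back <- toy$yearctry %/% 1000
toy$code_back <- toy$yearctry %% 1000

# Treat the key as a grouping label rather than a number when modelling.
toy$yearctry_f <- factor(toy$yearctry)

print(toy)
```

Converting the key to a factor makes it explicit that its numeric magnitude carries no meaning; it only labels the country-year cluster.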
Under this particular yearctry value of 2000001 there are roughly 2000 observations. For the same year, the yearctry value for, say, the UK would be 2000044 (again with roughly 2000 observations), and so on for the remaining 63 countries in 2000 and in all other years up to 2008. For the year 2001, the yearctry values for the USA and the UK would be 2001001 and 2001044 respectively (again roughly 2000 observations for each country), and so on for the other 63 countries. So the data set is *grouped/clustered using "yearctry"*.

I am trying to look into selection bias, if any, within each "yearctry" value (i.e. roughly 2000 observations per country per year, across 9 years and 65 countries). Essentially, therefore, I wish to check 65*9 values of "yearctry", each with roughly 2000 observations. Hence I use glm with a probit link to look into the selection bias, where my dependent variable "s" is either 0 or 1. The formula

    myProbit <- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, family = binomial(link = "probit"), data = adpopdata)

is the Heckman selection equation based on all observations, without taking into account the fact that each "yearctry" is unique. I want the selection equation to recognise the uniqueness of each "yearctry" value: take one "yearctry" at a time, estimate the probit, go to the next "yearctry", repeat the probit regression, and then give me the result. At the moment I do not accomplish that with the above formula. It runs the regression on a bulk basis, but I want it to distinguish one yearctry from another, perform the regression for all yearctry values, and finally give me the result.

Is there any other model recommended that should do the job, other than glm? If yes, please explain how.
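The request above can be read in two ways: (a) a separate probit fitted within each "yearctry" value, or (b) a single probit that accounts for the clustering through a random intercept per "yearctry", which is what the suggested glmer call does. The sketch below contrasts the two on simulated data. The data frame `sim`, its size, the effect sizes, and the use of a single predictor are all assumptions for illustration; only the names `s`, `age`, and `yearctry` come from the post. Part (b) runs only if the lme4 package is installed.

```r
# Simulated stand-in data (hypothetical): 6 yearctry groups, 200 rows each.
set.seed(1)
groups <- c(2000001, 2000044, 2001001, 2001044, 2002001, 2002044)
n_per  <- 200
sim <- data.frame(
  yearctry = rep(groups, each = n_per),
  age      = round(runif(length(groups) * n_per, 18, 64))
)

# Group-specific intercept shift plus a common age effect on the probit scale.
shift <- rnorm(length(groups), sd = 0.5)[as.integer(factor(sim$yearctry))]
sim$s <- rbinom(nrow(sim), 1, pnorm(-0.5 + 0.01 * sim$age + shift))

# (a) One separate probit per yearctry value, using base R only.
fits <- lapply(split(sim, sim$yearctry), function(d) {
  glm(s ~ age, family = binomial(link = "probit"), data = d)
})
per_group <- t(sapply(fits, coef))  # one row of coefficients per yearctry
print(per_group)

# (b) One model for all rows with a random intercept per yearctry.
if (requireNamespace("lme4", quietly = TRUE)) {
  m <- lme4::glmer(s ~ age + (1 | yearctry),
                   family = binomial(link = "probit"), data = sim)
  print(lme4::fixef(m))             # pooled fixed-effect estimates
}
```

Approach (a) yields one full set of coefficients per group and shares no information between groups. Approach (b) yields one set of slopes plus an estimated between-group variance of the intercept. With about 65*9 groups, (a) produces several hundred coefficient vectors, so (b) is usually the more practical summary when the slopes are assumed to be common across groups.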
Let me give you the exact command that Stata uses, so that things become very clear:

    xtprobit s age gender gemeduc gemhinc es_gdppc imf_pop estbbo_m, i(yearctry)

This does exactly what I wish to accomplish in R, i.e. it estimates the Heckman selection equation for the selection variables (seven in my case) based on the uniqueness of "yearctry".

I have worked on this for weeks, so kindly help me. I think it is a small issue to fix in the equation, but since I am new to R, I do not know exactly what will fix my problem. Any help will be highly appreciated.

Thanks

--
Dr. Saurav Pathak
PhD, Univ. of Florida
Mechanical Engineering
Doctoral Student
Innovation and Entrepreneurship
Imperial College Business School
s.patha...@imperial.ac.uk
0044-7795321121

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.