Thanks for your reply, Charles. I do indeed have other variables. I apologize for being vague; here is my study in more detail:
I have a cohort of births. My outcome is a dichotomous variable for presence/absence of a birth defect. For each cohort member I estimate the date of conception, and assign a pollution level during the relevant period of gestation. All cohort members conceived on the same day are assigned the same pollution level. These cohort members also have a covariate, t, which indicates the day of follow-up. For example, if the first day of my study is Jan 1, 1987, the data would look like:

Date          t    Conceptions  Cases  Pollution  Stratum
Jan 1, 1987   1    100          1      10         1
Jan 2, 1987   2    105          0      8          2
Jan 3, 1987   3    101          1      11         3
.
.
Jan 1, 1988   366  109          1      13         1
Jan 2, 1988   367  111          2      19         2
Jan 3, 1988   368  103          0      14         3
.
.
.

I make matched pairs of days (Strata) to control for the influence of season. I also want to account for long-term trends, eg increasing birth defects ascertainment and decreasing pollution levels over time, so I want to fit a cubic spline using the variable t. I have already analyzed this data as a time series (I don't use the Stratum variable in the time-series analyses), but now I am exploring some alternatives.

My full dataset has 3,115 strata. So my final model would look like:

clogit(Cases/Conceptions ~ Pollution + f(t) + strata(Stratum))

So, just to reiterate, my goal is to make this model without having to bring in the individual-level data. I would be just as happy to do a conditional Poisson as I would be to do a conditional logistic regression - either would seem to be appropriate here - if that opens up some other options.

Thanks very much for your time and interest,

Matt Strickland
Epidemiologist
Birth Defects Branch
U.S. Centers for Disease Control and Prevention

-----Original Message-----
From: Charles C. Berry [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 31, 2007 1:12 PM
To: Strickland, Matthew (CDC/CCHP/NCBDDD) (CTR)
Cc: [email protected]; [EMAIL PROTECTED]
Subject: Re: [R] Conditional logistic regression for "events/trials" format

On Thu, 31 May 2007, Strickland, Matthew (CDC/CCHP/NCBDDD) (CTR) wrote:

> Dear R users,
>
> I have a large individual-level dataset (~700,000 records) which I am
> performing a conditional logistic regression on. Key variables include
> the dichotomous outcome, dichotomous exposure, and the stratum to
> which each person belongs.
>
> Using this individual-level dataset I can successfully use clogit to
> create the model I want. However reading this large .csv file into R
> and running the models takes a fair amount of time.
>
> Alternatively, I could choose to "collapse" the dataset so that each
> row has the number of events, number of individuals, and the exposure
> and stratum. In SAS they call this the "events/trials" format. This
> would make my dataset much smaller and presumably speed things up.
>

I think you have described the data for forming a 2 by 2 by K table of counts. In which case, loglin(), loglm(), mantelhaen.test(), and - if K is not too large - glm(... , family=poisson) would be suitable.

But you say 'models' above suggesting that there are some other variables. If so, you need to be a bit more specific in describing your setup.

> So my question is: can I use clogit (or possibly another function) to
> perform a conditional logistic regression when the data is in this
> "events/trials" format? I am using R version 2.5.0.
>
> Thank you very much,
> Matt Strickland
> Birth Defects Branch
> U.S. Centers for Disease Control
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                         (858) 534-2098
                                         Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]               UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
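
For readers following the thread, here is a minimal sketch of one way the aggregated model could be fit without the individual-level records. It is not a solution posted by either correspondent. It takes the glm(... , family=poisson) route mentioned above and assumes a Poisson model with one indicator per stratum and log(Conceptions) as an offset; for a Poisson model, the estimates for the non-stratum terms then coincide with those of the conditional Poisson likelihood. The data frame `days` is simulated purely for illustration, and ns(t, df = 3) is an assumed stand-in for the f(t) spline in the model formula.

```r
# Toy data in the "events/trials" layout from the message above.
# All values are simulated; nothing here comes from the real study.
library(splines)  # ships with R; ns() builds a natural cubic spline basis

set.seed(1)
n_strata <- 50  # the real data have 3,115 strata; kept small here

# Each stratum pairs one day with the same calendar day a year later
days <- data.frame(
  t       = c(seq_len(n_strata), 365 + seq_len(n_strata)),
  Stratum = rep(seq_len(n_strata), times = 2)
)
days$Conceptions <- rpois(nrow(days), lambda = 100) + 1
days$Pollution   <- round(runif(nrow(days), min = 5, max = 20))
# Case probability set high enough that no toy stratum is empty of cases
days$Cases <- rbinom(nrow(days), size = days$Conceptions, prob = 0.05)

# Poisson model with stratum indicators and an offset for the
# number of conceptions (the "trials") on each day
fit <- glm(
  Cases ~ Pollution + ns(t, df = 3) + factor(Stratum) +
    offset(log(Conceptions)),
  family = poisson,
  data   = days
)

# Log rate ratio per unit of pollution, with its standard error
coef(summary(fit))["Pollution", ]
```

With thousands of strata the factor(Stratum) term produces a very wide design matrix, so this is only practical "if K is not too large", as noted above. Strata with no cases at all contribute nothing to the Pollution estimate and could be dropped beforehand.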
