Dan, this is correct: it will be a fair number, but certainly not everybody. For example, in our set at Northwestern we have a total population closer to 2 million, but this cuts it down to a few hundred thousand. This denominator is important because we are trying to determine how diabetic outcomes differ from those of a similar at-risk population.

Your date cutoff is fine. And yes, the clinical variables and geocoded data are for the whole denominator. For geocoded we will need this to determine who would have likely received Medicaid if that state had expanded Medicaid.

Thanks,
Abel

________________________________
From: Dan Connolly <[email protected]>
Sent: Thursday, December 1, 2016 3:59 PM
To: Furmanchuk, Al'ona (NU); Goel, Satyender (NU)
Cc: <[email protected]>; Kho, Abel
Subject: denominator is everybody with two visits? RE: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS, geocoding

We talked about 10s of thousands; now I read:

  1. Denominator or study population
  Minimum two encounters (inpatient, outpatient, and emergency; see Part A1 in Appendix A) in EDW at any time on different days.
  AND age between 18 and 89 years old at time of encounter. Age is defined as a difference between the date of birth and date of encounter

Am I reading this right? If so, that makes the denominator more like 100s of thousands per site... or millions. More or less everybody.

Well... not quite everybody... The SQL code refers to CDM tables. We follow SCILHS in cutting off our CDM population at Jan 1 2010. Is that as expected?

And we'll be collecting Clinical Variables in the EHR and Geocoded Data from the whole denominator population?

ref NEXT-D Data Request Detail<https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/539/NEXT-D_Request%20for%20Data_Detailed_12.1.16.docx>

--
Dan

________________________________
From: Dan Connolly
Sent: Monday, November 07, 2016 2:43 PM
To: Al'ona Furmanchuk; Satyender Goel
Cc: <[email protected]>; Abel Kho
Subject: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS, geocoding

There's a lot of alphabet soup, here. In preparation for the Nov 15 call, I'd like to get the discussion started in email. (Note the gpc-dev public archive<http://listserv.kumc.edu/pipermail/gpc-dev/>).

I would prefer to work backward from a mocked up spreadsheet.

My questions of 19 Sep<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:53> were:

  * Does the desired form of the data have one row per patient?
  * or per visit?
  * Is patient-day a good enough definition of visit?
  * what columns / observations / variables are expected for each row?
  * Nominal, Ordinal, Interval or Ratio?
  * codes for nominals?
  * units?

Mei's response<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:54>, after talking with Bernie Black and Abel Kho, said to organize as row-per-visit; yes, patient-day is close enough. She was reluctant to give specifics on columns, but she said the following are the categories of variables listed in the proposal:

  * Clinical Variables in EMR:
     . Demographics: gender, race
     . Treatment: standard diabetes medications
     . Response to treatment: HbA1c levels, systolic and diastolic blood pressure, HDL and LDL cholesterol, triglycerides
     . Medication adherence: pharmacy fill data or refill rates
     . Treatment adherence: weights, checks at least twice a year
     . Physician adherence: orders for HbA1c, urine microalbumin, pneumonia and flu vaccine, and documented annual foot and eye exams
     . Health outcomes: renal disease, peripheral artery disease/amputation, retinopathy, cardiovascular disease (coronary events and ischemic stroke)
  * Supplemental Demographic Variables in Geocoded Data:
     * Income, education, likelihood of employment, poverty status, owner-occupied house value, health insurance coverage, etc.

It would help if there were a shared copy of the proposal that we can all refer to, by the way.

I just put what I know in a next-D mock-up<https://docs.google.com/spreadsheets/d/12h3fwK_AZYPCU28XVfu8n45bn6DUQ4qwY9zvgFWozow/edit#gid=1012432412> in google sheets. Feel free to comment and suggest changes. It includes details such as that we would use 05 to represent race=White and 03 for Black (following the PCORnet data model). The first sheet has mocked up data and the 2nd sheet is a REDCap data dictionary.

If we are to collect "Treatment: standard diabetes medications" then we need a similar level of detail. OMOP seems to have very mature methods for handling drug exposures, but we don't have much experience with that. In a recent data collection for breast cancer, we used a REDCap drop-down list of relevant RxNorm codes drawn from the GPC terminology. This is where i2b2 and babel<https://babel.gpcnetwork.org/> come in. With a babel account, you can browse and get details on the terminology as well as a rough sense of what data is available from each GPC site. (It's possible to assemble and save a query that can actually be run at all sites, though that's a bit labor-intensive at this point.)

For HbA1c, there may be an issue of which LOINC code to use, but I expect we can set that aside since we had to address it for the PCORnet CDM LAB_RESULT_CDM table. But there may be multiple such results in a single visit. In one recent study, I used the median to aggregate them. Would that approach be appropriate here? And so on for the other clinical EMR data.

For income, I have been working with UHD001 Median household income in the past 12 months (in 2013 inflation-adjusted dollars) from ACS. The ACS has 4000+ variables including 15 "median household income" variables (see ticket:140#comment:17<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:17>). Which of those 4000+ variables would you like to use for education, employment, poverty, house value, health insurance coverage, etc?

--
Dan
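
The question in the Nov 7 message about several HbA1c results landing in one visit can be made concrete with a small sketch. It assumes a visit is a patient-day, as agreed in the thread, and takes the median per patient-day. The function name, the tuple layout, and the sample values are hypothetical and are not drawn from any GPC dataset or from the NEXT-D request.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_per_patient_day(results):
    """Collapse lab results to one value per (patient, day) using the median.

    `results` is an iterable of (patient_id, result_date, value) tuples.
    This layout is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for patient_id, result_date, value in results:
        grouped[(patient_id, result_date)].append(value)
    # One aggregated value per patient-day.
    return {key: median(values) for key, values in grouped.items()}


# Made-up HbA1c values (percent), several on the same patient-day.
hba1c = [
    ("p1", date(2015, 3, 2), 7.1),
    ("p1", date(2015, 3, 2), 7.5),
    ("p1", date(2015, 3, 2), 9.0),
    ("p1", date(2015, 9, 14), 6.8),
    ("p2", date(2015, 3, 2), 8.0),
    ("p2", date(2015, 3, 2), 9.0),
]
per_visit = median_per_patient_day(hba1c)
```

The median is less sensitive than the mean to a single outlying result on a given day; whether that is the right choice for this study is the open question asked in the message.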
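
The denominator rule quoted in the Dec 1 message (at least two encounters on different days, age 18 to 89 at the time of encounter) can be sketched roughly as follows. This is one reading of the rule, not the SQL from the data request. It assumes that encounter dates have already been limited to the qualifying encounter types, that the age limits must hold for each encounter that is counted, and that the Jan 1 2010 cutoff is applied to the encounter date. The function names and the `cutoff` parameter are hypothetical.

```python
from datetime import date


def age_in_years(birth_date, on_date):
    """Whole years elapsed between birth_date and on_date."""
    years = on_date.year - birth_date.year
    # Subtract a year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_denominator(birth_date, encounter_dates, cutoff=date(2010, 1, 1)):
    """True if the patient has qualifying encounters on two or more distinct days.

    Assumptions (not confirmed by the request): encounter_dates holds only
    inpatient, outpatient, and emergency encounters; an encounter counts only
    if the patient is 18 to 89 on that day and the day is on or after cutoff.
    """
    qualifying_days = {
        d
        for d in encounter_dates
        if d >= cutoff and 18 <= age_in_years(birth_date, d) <= 89
    }
    return len(qualifying_days) >= 2


# Two encounters on the same day count as one day, so this patient is excluded.
same_day = in_denominator(date(1990, 6, 15), [date(2012, 1, 10), date(2012, 1, 10)])
# Two encounters on different days, both as an adult, so this patient is included.
two_days = in_denominator(date(1990, 6, 15), [date(2012, 1, 10), date(2013, 2, 1)])
```

Because the rule only requires two distinct days at any time, nearly every returning adult patient qualifies, which is consistent with the estimate of hundreds of thousands per site discussed in the thread.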
