In the "People with at least one ordered medications specific to Diabetes 
Mellitus" query, I see:

where (
UPPER(a.RAW_RX_MED_NAME) like UPPER('%Acetohexamide%') or 
UPPER(a.RAW_RX_MED_NAME) like UPPER('%D[i,y]melor%') or

I don't think we (KUMC) populate RAW_RX_MED_NAME. I'm not sure about other 
GPC sites.
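
One way to check this at a given site is to count how many PRESCRIBING rows actually carry each field. Here is a minimal sketch, assuming a PCORnet CDM PRESCRIBING table with RAW_RX_MED_NAME and RXNORM_CUI columns. The in-memory SQLite database, the sample rows, and the function name column_fill_counts are made up for illustration; a real check would run against the site's own CDM database.

```python
import sqlite3

def column_fill_counts(conn, table, columns):
    """Return {column: (non_empty_rows, total_rows)} for the given table."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    counts = {}
    for col in columns:
        # COUNT(expr) skips NULLs; NULLIF turns empty strings into NULL as well.
        n = conn.execute(
            f"SELECT COUNT(NULLIF(TRIM({col}), '')) FROM {table}"
        ).fetchone()[0]
        counts[col] = (n, total)
    return counts

# Hypothetical sample data: RXNORM_CUI mostly filled, RAW_RX_MED_NAME never filled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRESCRIBING (PATID TEXT, RXNORM_CUI TEXT, RAW_RX_MED_NAME TEXT)"
)
conn.executemany(
    "INSERT INTO PRESCRIBING VALUES (?, ?, ?)",
    [("p1", "1156200", None), ("p2", "1156200", None), ("p3", None, "")],
)
fill = column_fill_counts(conn, "PRESCRIBING", ["RAW_RX_MED_NAME", "RXNORM_CUI"])
```

If RAW_RX_MED_NAME comes back with zero non-empty rows, the name-matching branch of the query cannot match anything at that site, and only the RXNORM_CUI branch does any work.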


I notice a long list of rxnorm_cuis in the Word document, e.g. 1156200. Are 
those in the .sql? Oh, yes, they are. Never mind.


From: Dan Connolly
Sent: Monday, November 07, 2016 2:43 PM
To: Al'ona Furmanchuk; Satyender Goel
Cc: <>; Abel Kho
Subject: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS, 

There's a lot of alphabet soup, here. In preparation for the Nov 15 call, I'd 
like to get the discussion started in email. (Note the gpc-dev public 

I would prefer to work backward from a mocked up spreadsheet. My questions of 
19 Sep:

  *   Does the desired form of the data have one row per patient?
     *   or per visit?
        *   Is patient-day a good enough definition of visit?
  *   what columns / observations / variables are expected for each row?
     *   Nominal, Ordinal, Interval or Ratio?
     *   codes for nominals?
     *   units?

 after talking with Bernie Black and Abel Kho said organize as row-per-visit; 
yes, patient-day is close enough. She was reluctant to give specifics on 
columns, but she said the following are categories of variables listed in the 

  *   Clinical Variables in EMR:
     *   Demographics: gender, race
     *   Treatment: standard diabetes medications
     *   Response to treatment: HbA1c levels, systolic and diastolic blood 
pressure, HDL and LDL cholesterol, triglycerides
     *   Medication adherence: pharmacy fill data or refill rates
     *   Treatment adherence: weights, checks at least twice a year
     *   Physician adherence: orders for HbA1c, urine microalbumin, pneumonia 
and flu vaccine, and documented annual foot and eye exams
     *   Health outcomes: renal disease, peripheral artery disease/amputation, 
retinopathy, cardiovascular disease (coronary events and ischemic stroke)

  *   Supplemental Demographic Variables in Geocoded Data:
     *   Income, education, likelihood of employment, poverty status, 
owner-occupied house value, health insurance coverage, etc.

It would help if there were a shared copy of the proposal that we can all refer 
to, by the way.

I just put what I know in a next-D 
 in Google Sheets. Feel free to comment and suggest changes. It includes 
details such as using 05 to represent race=White and 03 for Black (following 
the PCORnet data model). The first sheet has mocked up data and the 2nd sheet 
is a REDCap data dictionary.
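
To make the coding concrete, here is a rough sketch of how the race codes could be turned into the "code, label | code, label" string that a REDCap data dictionary expects in its choices column. The helper name redcap_choices is hypothetical, and only the two codes mentioned above are included (with 03 spelled out using the PCORnet label), so this is not the full PCORnet value set.

```python
# Only the two race codes named in this message; the PCORnet CDM defines more.
RACE_CODES = {"05": "White", "03": "Black or African American"}

def redcap_choices(codes):
    """Format a code -> label mapping as a REDCap data dictionary choices cell."""
    return " | ".join(
        f"{code}, {label}" for code, label in sorted(codes.items())
    )

choices = redcap_choices(RACE_CODES)
```

Keeping the codes in one mapping means the mocked up data and the data dictionary can be generated from the same source and cannot drift apart.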

If we are to collect "Treatment: standard diabetes medications" then we need a 
similar level of detail. OMOP seems to have very mature methods for handling 
drug exposures, but we don't have much experience with that. In a recent data 
collection for breast cancer, we used a REDCap drop-down list of relevant 
RxNorm codes drawn from the GPC terminology. This is where i2b2 and 
babel come in. With a babel account, you can 
browse and get details on the terminology as well as a rough sense of what data 
is available from each GPC site. (It's possible to assemble and save a query 
that can be actually run at all sites, though that's a bit labor-intensive at 
this point.)

For HbA1c, there may be an issue of which LOINC code to use, but I expect we 
can set that aside since we had to address it for the PCORnet CDM  
LAB_RESULT_CDM table. But there may be multiple such results in a single visit. 
In one recent study, I used the median to aggregate them. Would that approach 
be appropriate here?
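
For discussion, here is a minimal sketch of that aggregation, treating a visit as a patient-day as agreed above. The function name, the tuple layout, and the sample HbA1c values are all made up for illustration; real input would come from LAB_RESULT_CDM.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def median_per_patient_day(results):
    """Collapse (patid, timestamp, value) results to one median per patient-day."""
    by_visit = defaultdict(list)
    for patid, when, value in results:
        # A "visit" here is simply the patient plus the calendar date.
        by_visit[(patid, when.date())].append(value)
    return {visit: median(values) for visit, values in by_visit.items()}

# Hypothetical results: three for p1 on the same day, one for p2.
results = [
    ("p1", datetime(2016, 3, 1, 8, 30), 7.1),
    ("p1", datetime(2016, 3, 1, 15, 0), 7.5),
    ("p1", datetime(2016, 3, 1, 19, 45), 9.0),
    ("p2", datetime(2016, 3, 2, 10, 0), 6.4),
]
visits = median_per_patient_day(results)
```

The median is less sensitive than the mean to a single outlying result, which is the main reason I used it; whether that is the right choice for this study is the open question.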

And so on for the other clinical EMR data.

For income, I have been working with UHD001 Median household income in the past 
12 months (in 2013 inflation-adjusted dollars) from ACS. The ACS has 4000+ 
variables including 15 "median household income" variables (see 
 Which of those 4000+ variables would you like to use for education, 
employment, poverty, house value, health insurance coverage, etc?

