RAW_RX_MED_NAME? RE: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS, geocoding

Dan Connolly Thu, 01 Dec 2016 12:54:13 -0800

In the "People with at least one ordered medications specific to Diabetes 
Mellitus" query, I see:


where (
UPPER(a.RAW_RX_MED_NAME) like UPPER('%Acetohexamide%') or 
UPPER(a.RAW_RX_MED_NAME) like UPPER('%D[i,y]melor%') or
...

I don't think we (KUMC) populate RAW_RX_MED_NAME. I'm not sure about other 
GPC sites.

ref
https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/539/NextDextractionCode_12.1.16.sql#L207
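One way a site could check this before running the extraction is to count how many rows actually carry a value in the column. The following is only a sketch: it assumes the column lives in a PCORnet CDM PRESCRIBING table, uses an in-memory SQLite database with made-up rows in place of a real site database, and column_fill_counts is a hypothetical helper, not part of the NEXT-D code.

```python
import sqlite3


def column_fill_counts(conn, table, column):
    """Return (total_rows, populated_rows) for one column.

    Hypothetical helper. Identifiers cannot be bound as SQL parameters,
    so only pass trusted table and column names.
    """
    sql = (
        f"SELECT COUNT(*), "
        f"SUM(CASE WHEN {column} IS NOT NULL AND TRIM({column}) <> '' "
        f"THEN 1 ELSE 0 END) "
        f"FROM {table}"
    )
    total, populated = conn.execute(sql).fetchone()
    # SUM() yields NULL on an empty table; report that as 0.
    return total, populated or 0


# Stand-in for a site database; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRESCRIBING "
    "(PRESCRIBINGID TEXT, RXNORM_CUI TEXT, RAW_RX_MED_NAME TEXT)"
)
conn.executemany(
    "INSERT INTO PRESCRIBING VALUES (?, ?, ?)",
    [
        ("1", "1156200", None),
        ("2", "1156200", ""),
        ("3", None, "Acetohexamide 250 MG"),
    ],
)

raw_counts = column_fill_counts(conn, "PRESCRIBING", "RAW_RX_MED_NAME")
cui_counts = column_fill_counts(conn, "PRESCRIBING", "RXNORM_CUI")
print("RAW_RX_MED_NAME (total, populated):", raw_counts)
print("RXNORM_CUI      (total, populated):", cui_counts)
```

If the populated count for RAW_RX_MED_NAME is at or near zero at a site, the name-matching clauses would contribute nothing there and only the rxnorm_cui clauses would find patients.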

I notice a long list of rxnorm_cuis in the Word 
doc<https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/539/NEXT-D_Request%20for%20Data_Detailed_12.1.16.docx>,
 e.g. 1156200. Are those in the .sql? Oh, yes, they are. Never mind.


--
Dan

________________________________
From: Dan Connolly
Sent: Monday, November 07, 2016 2:43 PM
To: Al'ona Furmanchuk; Satyender Goel
Cc: <[email protected]>; Abel Kho
Subject: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS, 
geocoding

There's a lot of alphabet soup here. In preparation for the Nov 15 call, I'd 
like to get the discussion started in email. (Note the gpc-dev public 
archive<http://listserv.kumc.edu/pipermail/gpc-dev/>).

I would prefer to work backward from a mocked up spreadsheet. My questions of 
19 Sep<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:53> 
were:

  *   Does the desired form of the data have one row per patient?
     *   or per visit?
        *   Is patient-day a good enough definition of visit?
  *   what columns / observations / variables are expected for each row?
     *   Nominal, Ordinal, Interval or Ratio?
     *   codes for nominals?
     *   units?

Mei's 
response<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:54>,
 after talking with Bernie Black and Abel Kho, said to organize as row-per-visit; 
yes, patient-day is close enough. She was reluctant to give specifics on 
columns, but she said the following are the categories of variables listed in the 
proposal:

  *   Clinical Variables in EMR:

     *   Demographics: gender, race
     *   Treatment: standard diabetes medications
     *   Response to treatment: HbA1c levels, systolic and diastolic blood pressure, 
HDL and LDL cholesterol, triglycerides
     *   Medication adherence: pharmacy fill data or refill rates
     *   Treatment adherence: weights, checks at least twice a year
     *   Physician adherence: orders for HbA1c, urine microalbumin, pneumonia and flu 
vaccine, and documented annual foot and eye exams
     *   Health outcomes: renal disease, peripheral artery disease/amputation, 
retinopathy, cardiovascular disease (coronary events and ischemic stroke)

  *   Supplemental Demographic Variables in Geocoded Data:
     *   Income, education, likelihood of employment, poverty status, 
owner-occupied house value, health insurance coverage, etc.

It would help if there were a shared copy of the proposal that we can all refer 
to, by the way.

I just put what I know in a next-D 
mock-up<https://docs.google.com/spreadsheets/d/12h3fwK_AZYPCU28XVfu8n45bn6DUQ4qwY9zvgFWozow/edit#gid=1012432412>
 in Google Sheets. Feel free to comment and suggest changes. It includes 
details such as using 05 to represent race=White and 03 for Black 
(following the PCORnet data model). The first sheet has mocked-up data and the 
2nd sheet is a REDCap data dictionary.
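
To make the row-per-visit shape concrete, here is a rough sketch of one row as a Python record. Only the two race codes come from the mock-up; the field names (patient_id, visit_date, hba1c_pct), the VisitRow class and the sample values are placeholders I made up, not columns anyone has agreed on.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Partial code list: only the two codes mentioned above
# (following the PCORnet data model).
RACE_CODES = {"03": "Black", "05": "White"}


@dataclass
class VisitRow:
    """One row per patient-day; all field names are hypothetical."""

    patient_id: str
    visit_date: date
    race: str  # coded nominal, e.g. "05"
    hba1c_pct: Optional[float] = None  # ratio variable, unit: percent

    def __post_init__(self):
        # Reject codes outside the (partial) code list.
        if self.race not in RACE_CODES:
            raise ValueError(f"unknown race code: {self.race!r}")


row = VisitRow("P001", date(2016, 11, 7), "05", 7.2)
print(row.patient_id, row.visit_date.isoformat(), RACE_CODES[row.race])
```

Writing each variable down this way forces the questions above to be answered: its level of measurement, its codes if it is nominal, and its units.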

If we are to collect "Treatment: standard diabetes medications" then we need a 
similar level of detail. OMOP seems to have very mature methods for handling 
drug exposures, but we don't have much experience with that. In a recent data 
collection for breast cancer, we used a REDCap drop-down list of relevant 
RXNorm codes drawn from the GPC terminology. This is where i2b2 and 
babel<https://babel.gpcnetwork.org/> come in. With a babel account, you can 
browse and get details on the terminology as well as a rough sense of what data 
is available from each GPC site. (It's possible to assemble and save a query 
that can be actually run at all sites, though that's a bit labor-intensive at 
this point.)

For HbA1c, there may be an issue of which LOINC code to use, but I expect we 
can set that aside since we had to address it for the PCORnet CDM 
LAB_RESULT_CDM table. But there may be multiple such results in a single visit. 
In one recent study, I used the median to aggregate them. Would that approach 
be appropriate here?
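
As a sketch of what I mean by aggregating with the median, assuming patient-day is the visit definition: group the results by (patient, date) and take the median within each group. The tuple layout, the median_per_patient_day name and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_per_patient_day(results):
    """Collapse multiple lab results into one value per patient-day.

    `results` is an iterable of (patient_id, result_date, value) tuples;
    this layout is an assumption, not the LAB_RESULT_CDM schema.
    """
    grouped = defaultdict(list)
    for patient_id, result_date, value in results:
        grouped[(patient_id, result_date)].append(value)
    return {key: median(values) for key, values in grouped.items()}


# Invented HbA1c values (percent) for two patients.
results = [
    ("P001", date(2016, 11, 7), 7.0),
    ("P001", date(2016, 11, 7), 7.4),
    ("P001", date(2016, 11, 7), 9.0),
    ("P002", date(2016, 11, 7), 6.1),
    ("P002", date(2016, 11, 8), 6.5),
]

daily = median_per_patient_day(results)
for (patient_id, day), value in sorted(daily.items()):
    print(patient_id, day.isoformat(), value)
```

The median is less sensitive than the mean to a single outlying result on the same day, which is why I chose it in that study; whether it suits next-D is the open question.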

And so on for the other clinical EMR data.

For income, I have been working with UHD001 Median household income in the past 
12 months (in 2013 inflation-adjusted dollars) from ACS. The ACS has 4000+ 
variables including 15 "median household income" variables (see 
ticket:140#comment:17<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:17>).
 Which of those 4000+ variables would you like to use for education, 
employment, poverty, house value, health insurance coverage, etc?

--
Dan

_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
