In the "People with at least one ordered medications specific to Diabetes
Mellitus" query, I see:
where (
UPPER(a.RAW_RX_MED_NAME) like UPPER('%Acetohexamide%') or
UPPER(a.RAW_RX_MED_NAME) like UPPER('%D[i,y]melor%') or
...
I don't think we (KUMC) populate RAW_RX_MED_NAME. I'm not sure about other
GPC sites.
ref
https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/539/NextDextractionCode_12.1.16.sql#L207
I notice a long list of rxnorm_cuis in the Word
doc<https://informatics.gpcnetwork.org/trac/Project/attachment/ticket/539/NEXT-D_Request%20for%20Data_Detailed_12.1.16.docx>,
e.g. 1156200. Are those in the .sql? Oh, yes, they are. Never mind.
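For sites that leave RAW_RX_MED_NAME empty, here is a minimal Python sketch of matching on either the raw name or the RxNorm CUI. The row layout (a dict with RAW_RX_MED_NAME and RXNORM_CUI keys), the function name, and the abbreviated lists are assumptions for illustration only; the real lists are the ones in the .sql and the Word doc. The D[i,y]melor pattern is written here as the regex d[iy]melor, which is my reading of its intent.

```python
import re

# Assumed, abbreviated lists; the full ones live in the NEXT-D .sql / Word doc.
NAME_PATTERNS = [re.compile(p, re.IGNORECASE)
                 for p in (r"acetohexamide", r"d[iy]melor")]
RXNORM_CUIS = {"1156200"}


def is_diabetes_rx(row):
    """True if a prescribing row matches by raw name or by RxNorm CUI."""
    name = row.get("RAW_RX_MED_NAME") or ""  # may be unpopulated (None)
    cui = row.get("RXNORM_CUI")
    by_name = any(p.search(name) for p in NAME_PATTERNS)
    by_cui = cui is not None and str(cui) in RXNORM_CUIS
    return by_name or by_cui
```

With this shape, a row whose name is missing is still picked up as long as its CUI is on the list.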
--
Dan
________________________________
From: Dan Connolly
Sent: Monday, November 07, 2016 2:43 PM
To: Al'ona Furmanchuk; Satyender Goel
Cc: <[email protected]>; Abel Kho
Subject: data collection for next-D: i2b2, babel, OMOP, PCORnet CDM, ACS,
geocoding
There's a lot of alphabet soup here. In preparation for the Nov 15 call, I'd
like to get the discussion started in email. (Note the gpc-dev public
archive<http://listserv.kumc.edu/pipermail/gpc-dev/>.)
I would prefer to work backward from a mocked up spreadsheet. My questions of
19 Sep<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:53>
were:
* Does the desired form of the data have one row per patient?
  * or per visit?
  * Is patient-day a good enough definition of visit?
* What columns / observations / variables are expected for each row?
  * Nominal, Ordinal, Interval or Ratio?
  * codes for nominals?
  * units?
Mei's
response<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:54>,
after talking with Bernie Black and Abel Kho, said to organize as row-per-visit;
yes, patient-day is close enough. She was reluctant to give specifics on
columns, but she said the following are the categories of variables listed in
the proposal:
* Clinical Variables in EMR:
. Demographics: gender, race
. Treatment: standard diabetes medications
. Response to treatment: HbA1c levels, systolic and diastolic blood pressure,
HDL and LDL cholesterol, triglycerides
. Medication adherence: pharmacy fill data or refill rates
. Treatment adherence: weights, checks at least twice a year
. Physician adherence: orders for HbA1c, urine microalbumin, pneumonia and flu
vaccine, and documented annual foot and eye exams
. Health outcomes: renal disease, peripheral artery disease/amputation,
retinopathy, cardiovascular disease (coronary events and ischemic stroke)
* Supplemental Demographic Variables in Geocoded Data:
. Income, education, likelihood of employment, poverty status,
owner-occupied house value, health insurance coverage, etc.
It would help if there were a shared copy of the proposal that we can all refer
to, by the way.
I just put what I know in a next-D
mock-up<https://docs.google.com/spreadsheets/d/12h3fwK_AZYPCU28XVfu8n45bn6DUQ4qwY9zvgFWozow/edit#gid=1012432412>
in Google Sheets. Feel free to comment and suggest changes. It includes
details such as using 05 to represent race=White and 03 for Black
(following the PCORnet data model). The first sheet has mocked-up data and the
2nd sheet is a REDCap data dictionary.
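To make the row-per-visit shape concrete, here is a rough Python sketch that writes mocked-up rows as CSV with race coded as above. The column names (patient_id, visit_date, race) and the function names are hypothetical, not taken from the mock-up, and only the two race codes mentioned here are included.

```python
import csv
import io

# Only the two codes mentioned above; the PCORnet value set has more.
RACE_CODES = {"White": "05", "Black": "03"}
COLUMNS = ["patient_id", "visit_date", "race"]  # hypothetical column names


def mock_rows(visits):
    """Yield one dict per (patient, day) visit, with race as a PCORnet-style code."""
    for patient_id, visit_date, race in visits:
        yield {"patient_id": patient_id,
               "visit_date": visit_date,
               "race": RACE_CODES[race]}


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping the codes as strings preserves the leading zero, which a spreadsheet import might otherwise drop.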
If we are to collect "Treatment: standard diabetes medications" then we need a
similar level of detail. OMOP seems to have very mature methods for handling
drug exposures, but we don't have much experience with that. In a recent data
collection for breast cancer, we used a REDCap drop-down list of relevant
RxNorm codes drawn from the GPC terminology. This is where i2b2 and
babel<https://babel.gpcnetwork.org/> come in. With a babel account, you can
browse and get details on the terminology as well as a rough sense of what data
is available from each GPC site. (It's possible to assemble and save a query
that can actually be run at all sites, though that's a bit labor-intensive at
this point.)
For HbA1c, there may be an issue of which LOINC code to use, but I expect we
can set that aside since we had to address it for the PCORnet CDM
LAB_RESULT_CDM table. But there may be multiple such results in a single visit.
In one recent study, I used the median to aggregate them. Would that approach
be appropriate here?
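For reference, a minimal sketch of that kind of aggregation in Python, treating patient-day as the visit. This is not the code from the study mentioned above; the input layout (patient id, date string, numeric value) and the function name are assumptions.

```python
from collections import defaultdict
from statistics import median


def median_per_patient_day(results):
    """Collapse multiple lab results per (patient, day) into their median.

    results: iterable of (patient_id, day, value) tuples.
    Returns a dict mapping (patient_id, day) to the median value.
    """
    by_visit = defaultdict(list)
    for patient_id, day, value in results:
        by_visit[(patient_id, day)].append(value)
    return {visit: median(values) for visit, values in by_visit.items()}
```

The median is less sensitive than the mean to a single outlying result on a given day, though with only two results it is the same as their average.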
And so on for the other clinical EMR data.
For income, I have been working with UHD001 Median household income in the past
12 months (in 2013 inflation-adjusted dollars) from ACS. The ACS has 4000+
variables including 15 "median household income" variables (see
ticket:140#comment:17<https://informatics.gpcnetwork.org/trac/Project/ticket/140#comment:17>).
Which of those 4000+ variables would you like to use for education,
employment, poverty, house value, health insurance coverage, etc.?
--
Dan