#145: transform (ETL) GPC i2b2 data to PCORNet CDM
--------------------------+----------------------------
Reporter: dconnolly | Owner: ngraham
Type: enhancement | Status: assigned
Priority: major | Milestone: data-domains2
Component: data-sharing | Resolution:
Keywords: | Blocked By: 109
Blocking: 160 |
--------------------------+----------------------------
Comment (by ngraham):
Using the `c_dimcode` in comment:14 doesn't work well for cases where
multiple concepts in the GPC/HERON hierarchy map to a single concept in
the PCORI hierarchy. One example is Hispanic:
||PCORI path||HERON path||
||\PCORI\DEMOGRAPHIC\HISPANIC\Y\||\i2b2\Demographics\Ethnicity\Hispanic\||
||\PCORI\DEMOGRAPHIC\HISPANIC\Y\||\i2b2\Demographics\Ethnicity\Hispanic, Latino or Spanish Origin\||
So, instead of using the `c_dimcode`, I plan to insert rows into the
concept dimension that map the PCORI paths to the appropriate concept
codes. I plan to do this by building a mapping between the PCORI paths
and the GPC paths, much like we did in
[https://bitbucket.org/njgraham/pcori-annotated-data-dictionary/src/1c1c0f980377bbd1adbe839edf84d8de92f5b1de/heron_to_pcori.csv?at=default heron_to_pcori.csv]
for the Annotated Data Dictionary.
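A rough sketch of that plan, using Python's `sqlite3` with a heavily simplified stand-in for the i2b2 tables. The `path_map` table, the concept codes (`DEMO:HISP_A`, etc.), and the facts are all made up for illustration; the real `concept_dimension` has more columns, and if it requires unique `concept_path` values the inserted paths would need a distinguishing suffix.

```python
import sqlite3

# (PCORI path, HERON path) pairs from the Hispanic example above.
PATH_MAP = [
    ("\\PCORI\\DEMOGRAPHIC\\HISPANIC\\Y\\",
     "\\i2b2\\Demographics\\Ethnicity\\Hispanic\\"),
    ("\\PCORI\\DEMOGRAPHIC\\HISPANIC\\Y\\",
     "\\i2b2\\Demographics\\Ethnicity\\Hispanic, Latino or Spanish Origin\\"),
]


def build_demo_db():
    """Create a tiny in-memory stand-in for the star schema (simplified)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT);
        CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT);
    """)
    # Hypothetical concept codes and facts, for illustration only.
    conn.executemany("INSERT INTO concept_dimension VALUES (?, ?)", [
        (PATH_MAP[0][1], "DEMO:HISP_A"),
        (PATH_MAP[1][1], "DEMO:HISP_B"),
    ])
    conn.executemany("INSERT INTO observation_fact VALUES (?, ?)", [
        (1, "DEMO:HISP_A"), (2, "DEMO:HISP_B"),
        (3, "DEMO:HISP_B"), (4, "DEMO:OTHER"),
    ])
    return conn


def add_pcori_concepts(conn, path_map):
    """Insert one concept_dimension row per (PCORI path, mapped concept code)."""
    conn.execute("CREATE TEMP TABLE path_map (pcori_path TEXT, heron_path TEXT)")
    conn.executemany("INSERT INTO path_map VALUES (?, ?)", path_map)
    conn.execute("""
        INSERT INTO concept_dimension (concept_path, concept_cd)
        SELECT DISTINCT pm.pcori_path, cd.concept_cd
        FROM path_map pm
        JOIN concept_dimension cd ON cd.concept_path = pm.heron_path
    """)


def patient_count(conn, path):
    """Count distinct patients with a fact whose concept path starts with `path`."""
    row = conn.execute("""
        SELECT COUNT(DISTINCT f.patient_num)
        FROM observation_fact f
        JOIN concept_dimension cd ON cd.concept_cd = f.concept_cd
        WHERE substr(cd.concept_path, 1, length(?)) = ?
    """, (path, path)).fetchone()
    return row[0]


conn = build_demo_db()
add_pcori_concepts(conn, PATH_MAP)
print(patient_count(conn, "\\PCORI\\DEMOGRAPHIC\\HISPANIC\\Y\\"))
```

With this toy data, both HERON concept codes end up under the single PCORI path, so a query on that path picks up patients recorded under either HERON concept.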
I tried this out for diagnosis and was able to run a query for `ICD-9-CM`
in the PCORI hierarchy (item key
`<item_key>\\PCORI_\PCORI\DIAGNOSIS\DX_TYPE\09\</item_key>`) against the
small KU test data set and return a patient count:
{{{
Finished Query: "ICD-9-CM@08:59:10"
[2.4 secs]
Compute Time: 1 secs
Number of patients for "ICD-9-CM@08:59:10"
patient_count: 141
}}}
The same approach also worked for demographics (specifically, the Hispanic
flag noted above):
{{{
Finished Query: "HISPANIC@09:01:26"
[1.7 secs]
Compute Time: 1 secs
Number of patients for "HISPANIC@09:01:26"
patient_count: 721
}}}
--
Ticket URL:
<http://informatics.gpcnetwork.org/trac/Project/ticket/145#comment:15>
gpc-informatics <http://informatics.gpcnetwork.org/>
Greater Plains Network - Informatics