I have aggregated data from 8 of 10 sites. Although we agreed on our
diagnosis ontology in March 2014
(#63<https://informatics.gpcnetwork.org/trac/Project/ticket/63>) and on labs
in Nov 2015 (#158<https://informatics.gpcnetwork.org/trac/Project/ticket/158>),
terms differed enough across sites that I had to scrape codes from various
parts of paths and labels. We are not in a position to use SHRINE for queries
with these terms.
UTSW is the only site whose IgG lab term path matches the GPC standard:
...
IgG Levels
In [54]:
igg_term = summary[summary.variable.str.contains(r'\bIgG\b|IMMUNOGLOBULIN G',
case=False)].copy()
igg_path_72 = r'\GPC\Laboratory Tests\LP31388-9\LP15838-3\LP31769-0\LP14672-7\ '[4:-1]
igg_path_65 = r'\GPC\Laboratory Tests\LP31388-9\LP15838-3\LP31769-0\LP14672-7\LP43122-8\2465-3\ '[4:-1]
igg_term['aligned'] = igg_term.concept_path.apply(lambda p:
p.endswith(igg_path_72) or p.endswith(igg_path_65))
per_site(igg_term)[['variable', 'concept_path', 'aligned']]
Out[54]:
         variable                 concept_path                                       aligned
site
IU       0                        0                                                  0
KUMC     IgG (LP14672-7)          \i2b2\Laboratory Tests\Chemistry\Protein\Immun...  False
MCRF     2465-3: Igg              \i2b2\LP29693-6\LP29697-7\LP7834-7\55121-8\246...  False
MCW      0                        0                                                  0
MU       IgG (LP14672-7)          \i2b2\Laboratory Tests\Chemistry\Protein\Immun...  False
UIOWA    IgG SerPl-mCnc           \i2b2\Labtests\LP31388-9\LP15838-3\LP31769-0\L...  False
UNMC     IgG                      \UNMC\Laboratory Results\Blood and body fluids...  False
UTHSCSA  IgG SerPl-mCnc (2465-3)  \i2b2\Laboratory Tests\Chemistry\Protein\Immun...  False
UTSW     IgG (LP14672-7)          \i2b2\Laboratory Tests\LP31388-9\LP15838-3\LP3...  True
WISC     IMMUNOGLOBULIN G         \lab_information\lab_standard\igg\immunoglobul...  False
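A note on the path constants in In [54]: a Python raw string literal cannot end in a single backslash, which is why each one is padded with a trailing space that the `[4:-1]` slice then drops along with the leading `\GPC` root. Here is a minimal sketch of the same suffix test. The `\ROOT\Labs\X1\` path and the `is_aligned` helper are made up for illustration and are not GPC terms or part of the notebook.

```python
# Hypothetical standard path. r'...\' would be a syntax error,
# so the literal is padded with one trailing space.
std = r'\ROOT\Labs\X1\ '

# Drop the 5-character '\ROOT' root and the padding space.
suffix = std[5:-1]

def is_aligned(concept_path, suffix=suffix):
    """True when a site's path ends with the standard path below the root."""
    return concept_path.endswith(suffix)

print(suffix)
print(is_aligned('\\i2b2\\Labs\\X1\\'))   # same tail under a different root
print(is_aligned('\\i2b2\\Chem\\X1\\'))   # same leaf, different parent folder
```

Because only the tail is compared, a site can keep its own root (`\i2b2`, `\UNMC`, ...) and still count as aligned, but any difference in the intermediate folders does not.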
...
We're doing better on diagnoses, though 4 of 10 sites is still a minority:
...
Normalizing diagnosis codes
In [27]:
import re
def icd9_from_label(label):
    # Every run of digits (with an optional decimal part) is a candidate code.
    return [g1 for (g1, g2) in re.findall(r'(\d+(\.\d+)?)', label)]
icd9_from_label(u'283 Acquired hemolytic anemias')
Out[27]:
[u'283']
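To show why a regex on the label is enough for most sites, here is a small sketch that runs the same pattern over the pneumonia label styles that appear in Out[29]. The `labels` list is assembled by hand from that output for illustration. It is not part of the notebook.

```python
import re

def icd9_from_label(label):
    # Same pattern as In [27]: digits with an optional decimal part.
    return [g1 for (g1, g2) in re.findall(r'(\d+(\.\d+)?)', label)]

# Label styles copied from the per-site pneumonia output.
labels = [
    '486 Pneumonia, organism unspecified',
    'Pneumonia, organism unspecified (486)',
    '(486)Pneumonia, organism unspecified',
    '486: Pneumonia, organism unspecified',
    'Pneumonia, organism unspecified',      # no code in the label at all
]
codes = [icd9_from_label(label) for label in labels]
for label, code in zip(labels, codes):
    print(repr(label), '->', code)
```

The first four styles each yield a single match and the last yields an empty list, which is the case the path fallback has to cover. One caveat: a label containing more than one number would yield several matches, so this approach assumes the code is the only number in the label.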
In [28]:
dx_term = summary[~ summary.variable.isin(igg_term.variable)].copy()
def find_icd9(x):
    # Prefer a code found in the label; otherwise use the last path segment.
    name_code = icd9_from_label(x.variable)
    return ''.join(name_code if name_code
                   else x.concept_path.split('\\')[-2:])
dx_term['code'] = [find_icd9(row) for (ix, row) in dx_term.iterrows()]
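The path fallback can be checked without the full `summary` frame. In this sketch, `Row` is a hypothetical stand-in for a DataFrame row. The first row uses the UIOWA label and path from Out[29], and the second row's path is invented.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for one row of `summary`.
Row = namedtuple('Row', ['variable', 'concept_path'])

def icd9_from_label(label):
    return [g1 for (g1, g2) in re.findall(r'(\d+(\.\d+)?)', label)]

def find_icd9(x):
    name_code = icd9_from_label(x.variable)
    return ''.join(name_code if name_code
                   else x.concept_path.split('\\')[-2:])

# UIOWA-style term: no code in the label, so the path supplies it.
uiowa = Row('Pneumonia, organism unspecified',
            '\\i2b2\\Diagnoses\\460-519\\480-488\\486\\')

# Label carries the code, so the (invented) path is never consulted.
labeled = Row('486 Pneumonia, organism unspecified',
              '\\some\\other\\path\\')

print(find_icd9(uiowa))
print(find_icd9(labeled))
```

Splitting a path that ends in a backslash leaves an empty final element, so `[-2:]` picks up the last real segment plus an empty string, and the join collapses them to the bare code.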
In [29]:
pneumonia = dx_term[dx_term.code == '486'].copy()
gpc_486 = r'\i2b2\Diagnoses\ICD9\A18090800\A8359006\A8354357\A12966666\486\ '[20:-1]
pneumonia['aligned'] = pneumonia.concept_path.apply(lambda p:
p.endswith(gpc_486))
per_site(pneumonia)[['variable', 'concept_path', 'code', 'aligned']]
Out[29]:
         variable                               concept_path                                       code  aligned
site
IU       0                                      0                                                  0     0
KUMC     486 Pneumonia, organism unspecified    \i2b2\Diagnoses\ICD9\A18090800\A8359006\A83543...  486   True
MCRF     486 Pneumonia, organism unspecified    \i2b2\Diagnoses\A18090800\A8359006\A8354357\A1...  486   True
MCW      0                                      0                                                  0     0
MU       Pneumonia, organism unspecified (486)  \i2b2\Diagnoses\Diseases of the Respiratory Sy...  486   False
UIOWA    Pneumonia, organism unspecified        \i2b2\Diagnoses\460-519\480-488\486\               486   False
UNMC     486 Pneumonia, organism unspecified    \i2b2\Diagnoses\A18090800\A8359006\A8354357\A1...  486   True
UTHSCSA  486 Pneumonia, organism unspecified    \i2b2\Diagnoses\A18090800\A8359006\A8354357\A1...  486   True
UTSW     (486)Pneumonia, organism unspecified   \i2b2\Diagnoses(ICD9CM)\DISEASES AND INJURIES\...  486   False
WISC     486: Pneumonia, organism unspecified   \diagnosis_information\icd\icd09\001-999.99\46...  486   False
...
--
Dan