Re: optional columns in i2b2 dimension tables RE: Minutes of GPV-DEV call 20140128

Dan Connolly Wed, 05 Feb 2014 08:28:02 -0800

You're not needlessly worried. This is indeed a challenge of making data useful 
in i2b2.


The lesson I continually re-learn is:

"Start by defining use cases, not ontologies" -- Building Ontologies Best 
practices, pitfalls and positives by Bodenreider 
2009<http://mor.nlm.nih.gov/pubs/pres/20090504-CBO.pdf>.

"By way of User Centred Design Knowledge Representation / ontologies was a 
solution, not a goal" -- Developing Biomedical Ontologies in OWL by Alan 
Rector<http://ontology.buffalo.edu/07/os3/Rector_OWL.pdf>

The more you can anticipate how people will want to use the data (preferably 
based on experience), the more usable you can make it.

It occurs to me that while flowsheet terminology is at the other end of the 
spectrum of standardization from SNOMED, the structure we found is also 
essentially a huge polyhierarchy: pulse (flow measure #8) shows up in many 
flowsheets.

We had little to go on as far as what people would want, so we sort of gave 
them everything. In our i2b2 representation, it shows up under each of the 
flowsheets in which it occurs, and each of those flowsheets shows up under each 
department where it's used, and so on.

The usability of the result is... well... you can imagine.
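
To make the fan-out concrete, here is a minimal sketch of how one measure 
multiplies into many browse paths. The department and flowsheet names are 
made up for illustration (only "pulse" comes from our data), and the 
function name paths_to_root is hypothetical, not part of i2b2.

```python
from typing import Dict, List

# Hypothetical toy polyhierarchy: child -> list of parents.
# "Pulse" sits in two flowsheets; one flowsheet is used by two departments.
PARENTS: Dict[str, List[str]] = {
    "Pulse": ["Vitals", "ED Triage"],
    "Vitals": ["ICU", "Oncology"],
    "ED Triage": ["Emergency"],
    "ICU": ["Flowsheets"],
    "Oncology": ["Flowsheets"],
    "Emergency": ["Flowsheets"],
}


def paths_to_root(node: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-node path, one per combination of ancestors."""
    ups = parents.get(node, [])
    if not ups:
        # A node with no parents is a root; its only path is itself.
        return [[node]]
    return [path + [node] for up in ups for path in paths_to_root(up, parents)]


pulse_paths = paths_to_root("Pulse", PARENTS)
for path in pulse_paths:
    print(" / ".join(path))
```

Even in this tiny example the single measure appears three times in the 
tree; with hundreds of templates the duplication grows quickly.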

We did some research on automated clustering to remove redundancy.

  *    Expressing Observations from Electronic Medical Record Flowsheets in an 
i2b2-based Clinical Data Repository to Support Research and Quality 
Improvement<http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3243191/>
L. Waitman, J. Warren, E. Manos, D. Connolly

Abstract: While nursing documentation in electronic medical record (EMR) 
flowsheets may represent the largest investment of clinician time with 
information systems, organizations lack tools to visualize and repurpose this 
data for research and quality improvement. Incorporating flowsheet 
documentation into a clinical data repository and methods to reduce the 
flowsheet ontology's redundancy are described. 411 million flowsheet 
observations, derived from an EMR predominantly used in inpatient, outpatient 
oncology, and emergency room settings, were incorporated into a repository 
using the i2b2 framework. The local flowsheet ontology contained 720 
"templates" employing 5,379 groups (2,678 distinct), 37,836 measures (13,659 
distinct) containing 226,666 choices for a total size of 270,641. Aggressive 
pruning and clustering resulted in 150 templates, 743 groups (615 distinct), 
6,950 measures (4,066 distinct) with 22,497 choices, and size of 30,371. Making 
nursing data accessible within i2b2 provides a new perspective for contributing 
clinical organizations and heightens collaboration between the academic and 
clinical activities.

Ah... good... it looks like the complementary work, using an expert curation 
approach, was published too:

  *   Ambient Findability: Developing a Flowsheet Ontology for 
i2B2<http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3799091/>
Judith J. Warren, PhD, RN, BC, FAAN, FACMI; E. LaVerne Manos, MS, RN, BC; 
Daniel W. Connolly; and Lemuel R. Waitman, PhD

Abstract: The process of moving from the locally defined flowsheet ontology 
containing redundancy and jargon to one understandable by researchers is 
described. Over 250 million nursing flowsheet observations were imported into a 
data repository that uses the i2b2 framework. Focus groups were used to derive 
a new ontology model--18 templates were identified. One hundred measures, 50% 
of all patient observations over 36 months, were encoded in SNOMED CT©. 78% of 
the concepts were mapped.

related blog item:

  *   AMIA 2011: Nursing Flowsheets data and the wild west of 
terminology<https://informatics.kumc.edu/work/blog/2011/10/slug>

We're also struggling through "which end is up? or how many ends are up?" with 
microbiology.

--
Dan


________________________________
From: Greater Plains Collaborative Software Development 
[[email protected]] on behalf of Campbell, James R [[email protected]]
Sent: Tuesday, February 04, 2014 6:04 PM
To: [email protected]
Subject: Re: optional columns in i2b2 dimension tables RE: Minutes of GPV-DEV 
call 20140128

Researching the concept of 233607000 |Pneumococcal pneumonia (disorder)|, I 
find 8 distinct paths to the root due to the combinatorial possibilities of 
the polyhierarchy. I suspect that not all of the paths are useful for data 
browsing or aggregation, and that a more parsimonious set would improve 
usability. Or am I needlessly worried?
Jim

From: Greater Plains Collaborative Software Development 
[mailto:[email protected]] On Behalf Of Wanta Keith M
Sent: Tuesday, February 04, 2014 1:42 PM
To: [email protected]
Subject: Re: optional columns in i2b2 dimension tables RE: Minutes of GPV-DEV 
call 20140128

Jim, that is correct about building multiple paths per concept, because of the 
multiple inheritance/multiple generalization.  In a sense, you end up with 
multiple concepts (based on the number of parents) in the CONCEPT_DIMENSION.
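
A rough sketch of that effect, under stated assumptions: the toy codes A-D 
are made up, the helper names are hypothetical, and the backslash-delimited 
path layout is only meant to resemble the CONCEPT_PATH / CONCEPT_CD columns, 
not to reproduce any site's actual load script.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy: code -> parent codes. "D" has two parents.
TOY_PARENTS: Dict[str, List[str]] = {"D": ["B", "C"], "B": ["A"], "C": ["A"]}


def root_paths(code: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Enumerate every root-to-code path in the polyhierarchy."""
    ups = parents.get(code, [])
    if not ups:
        return [[code]]
    return [p + [code] for up in ups for p in root_paths(up, parents)]


def dimension_rows(code: str, parents: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Build one (concept_path, concept_cd) pair per path; the code repeats."""
    return [("\\" + "\\".join(p) + "\\", code) for p in root_paths(code, parents)]


rows = dimension_rows("D", TOY_PARENTS)
for concept_path, concept_cd in rows:
    print(concept_path, concept_cd)
```

The same code shows up once per path, so the row count tracks the number of 
distinct paths to the root rather than the number of distinct concepts.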

UW is also on 1.6.

-Keith
