Re: Minutes of GPV-DEV call 20140213 - Keith's Action Item

Russ Waitman Thu, 13 Feb 2014 22:43:18 -0800

Hi All,

Nathan and I are sitting at Love Field on our way back from San Antonio and 
Dallas (flights delayed and don't leave here till after 10) and the darn Dallas 
wifi is only free for 30 minutes so I need to resend this when I get home.


We've a bunch of work that sits before us this Spring (and the fun stuff starts 
past the i2b2 work) and will be very pragmatically focused (most of the closure 
we will be thinking about will involve sutures).

I wouldn't worry too much about different i2b2 versions in the next couple 
months but focus on getting everyone up on i2b2 and looking at what we have.

Once we get our arms around mapping concepts to standards we will likely then 
evaluate modifiers for attributing the source of the data (like we do for our 
diagnosis tree) as I think that has relevance for this project.

Practically, everybody posting on babel is really going to help us engage our 
clinical researcher colleagues.  It's really amazing to see the reaction when I 
explain what we're doing to CIOs at various places.  People are really 
impressed to see we are already doing this in February.

A simple example of what we might need is an obesity use case from Dan Hale 
(leading our obesity characterization):
Event: kid has lower limb fracture
- according to diagnosis or procedure code?  UW has both... could be interesting 
comparison of the value of the different signals of the fracture event;
- closed or open fracture?
Outcome: does the kid gain weight (become obese)?
- do you have their BMI from visit vitals and inpatient flowsheets 
before/during fracture care and then afterwards?  I see BMI in babel at UW!!!  
Yeah!

How does that vary by
- gender, UW has that
- race/ethnicity,  UW has that and they aren't lumped into one category.  Yeah!
- obscured GPC site,
- proxies for economic status
-- payor class for example (Medicare vs commercial payor) - don't see that at 
UW... but...
-- or, census tract/zip code abstracted from address on our identified data 
sources   - see UW has zip so I bet you have address. We could then probably 
work with people to implement a census tract mapping algorithm!

We could develop an abstract and show people the GPC can answer questions.

Russ
________________________________
From: Greater Plains Collaborative Software Development 
[[email protected]] on behalf of Dan Connolly [[email protected]]
Sent: Thursday, February 13, 2014 6:24 PM
To: [email protected]
Subject: Re: Minutes of GPV-DEV call 20140213 - Keith's Action Item

I very much like diagrams, but I don't have a lot of experience with UML and I 
don't quite see how this UML diagram connects to our work... Could you do me 
a favor and replace generic labels such as "Class1" and "Class2" with something 
relevant to cancer/obesity/ALS?

And with respect to i2b2... I suppose these UML classes correspond to i2b2 
concepts*, but I'd appreciate confirmation.

I'm really struggling to read the soft-coded generalization UML model. Is it a 
junction table<http://en.wikipedia.org/wiki/Junction_table>? Or two of them?

And the arrows in the hard-coded model... can I read those as owl subclass 
relationships? i.e. subset?

About keeping in sync with upstream i2b2, my understanding is that the 
transitive closure table is used to build the normal concept paths that i2b2 
uses; so it's just like any of the other techniques that we'd have to use to 
put things into i2b2's star schema. But on the other hand, I suppose Henderson 
did speak of query-time performance impact, so maybe I'm off base. I have yet 
to study his code.
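
As a rough illustration of what "building the normal concept paths" from 
parent/child data could look like, here is a minimal Python sketch. It is not 
Henderson's code and not i2b2's loader; the function name `concept_paths`, the 
sample terms, and the backslash-delimited path format are assumptions made 
for the example. Note that a term with two parents ends up with two paths.

```python
from collections import defaultdict


def concept_paths(edges, root):
    """Enumerate every root-to-term path implied by (parent, child) edges.

    Hypothetical helper for illustration only. Assumes the edges form an
    acyclic hierarchy; a term with several parents yields several paths.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    paths = []

    def walk(term, prefix):
        path = prefix + term + "\\"
        paths.append(path)
        for child in sorted(children[term]):
            walk(child, path)

    walk(root, "\\")
    return paths


# Made-up terms: "Fracture" has two parents (multiple inheritance).
edges = [
    ("Finding", "Injury"),
    ("Finding", "Bone disorder"),
    ("Injury", "Fracture"),
    ("Bone disorder", "Fracture"),
]

for p in concept_paths(edges, "Finding"):
    print(p)
```

In this sketch the hierarchy is flattened once, at load time, so the query 
side would still see ordinary path strings.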

* I much prefer to call them 
terms<http://en.wikipedia.org/wiki/First-order_logic#Terms> and I wish i2b2 had 
as well. I'm convinced by Barry Smith's realist ontology 
writing<http://ontology.buffalo.edu/medo/reasoningBT.pdf> that "concepts" is 
muddy thinking, i.e. "International Standard Bad Philosophy."

________________________________
From: Greater Plains Collaborative Software Development 
[[email protected]] on behalf of Wanta Keith M [[email protected]]
Sent: Thursday, February 13, 2014 1:07 PM
To: [email protected]
Subject: Re: Minutes of GPV-DEV call 20140213 - Keith's Action Item

All,

Attached, you will find an image I just threw together with some UML 
screen shots I took as examples.  It shows single and multiple inheritance 
(also referred to as Generalization) in its two most common UML design 
patterns.  I was the technical reviewer of a book published earlier this year 
that discusses these patterns.  Hard coded generalization and soft coded 
generalization (also referred to as a meta model) are two implementation 
strategies for generalization.  Most operational systems implement hard coded 
generalization because, with large data volumes, this pattern performs best 
when entities have many attributes.  In an ontology, the best approach is to 
use soft coded generalization simply because it allows you to model anything 
and everything.

Others pointed out the term transitive closure table.  I don’t know the 
original reference of this term, but it’s identical to the soft coded 
generalization for multiple inheritance and is the way i2b2 should have been 
designed.  Also, if i2b2 moved to this design pattern, the LIKE operator 
wouldn’t be necessary anymore in i2b2.  If you do not know how to tweak 
performance, the LIKE operator performs better in a relational database, which 
is probably why they chose that pattern.
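
To make the LIKE comparison concrete, here is a small sketch using Python's 
built-in sqlite3 module. The table names (`term_path`, `closure`), the sample 
terms, and the paths are all made up for the example and are not i2b2's 
actual schema; the point is only that a prefix match on a path string and an 
equality lookup on an ancestor/descendant table can return the same set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term_path (term TEXT PRIMARY KEY, path TEXT NOT NULL);
CREATE TABLE closure (
    ancestor   TEXT NOT NULL,
    descendant TEXT NOT NULL,
    PRIMARY KEY (ancestor, descendant)
);
""")

# Hypothetical sample data: one row per term, with a delimited path.
conn.executemany("INSERT INTO term_path VALUES (?, ?)", [
    ("Fracture", "\\Injury\\Fracture\\"),
    ("Lower limb fracture", "\\Injury\\Fracture\\Lower limb\\"),
    ("Open tibia fracture", "\\Injury\\Fracture\\Lower limb\\Open tibia\\"),
    ("Burn", "\\Injury\\Burn\\"),
])

# The same hierarchy as (ancestor, descendant) pairs, each term included
# as its own descendant.
conn.executemany("INSERT INTO closure VALUES (?, ?)", [
    ("Fracture", "Fracture"),
    ("Fracture", "Lower limb fracture"),
    ("Fracture", "Open tibia fracture"),
    ("Lower limb fracture", "Lower limb fracture"),
    ("Lower limb fracture", "Open tibia fracture"),
    ("Open tibia fracture", "Open tibia fracture"),
    ("Burn", "Burn"),
])


def by_like(conn, prefix):
    """Path-prefix lookup: everything whose path starts with `prefix`."""
    rows = conn.execute(
        "SELECT term FROM term_path WHERE path LIKE ? ORDER BY term",
        (prefix + "%",),
    )
    return [r[0] for r in rows]


def by_closure(conn, ancestor):
    """Closure-table lookup: plain equality on the ancestor column."""
    rows = conn.execute(
        "SELECT descendant FROM closure WHERE ancestor = ? ORDER BY descendant",
        (ancestor,),
    )
    return [r[0] for r in rows]


print(by_like(conn, "\\Injury\\Fracture\\"))
print(by_closure(conn, "Fracture"))
```

Which of the two performs better would depend on the database and its 
indexes; this sketch says nothing about that.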

One caveat to our conversation earlier during the GPC DEV meeting.  UW-PCORI or 
WISC (UW Health / University of Wisconsin-Madison) has not used this approach 
for i2b2 because it deviates from standard i2b2 functionality.  Rather than 
changing standard i2b2 source code (which is one possibility), I would much 
rather propose a new design to Partners Healthcare; otherwise it creates 
upgrade nightmares for everyone.  The rule of thumb for software is that by 
introducing more frameworks (which i2b2 has many of), the upgrade defect risks 
increase exponentially.  i2b2 uses 9+ frameworks (depending on how you 
implement things), so having 11 PCORI schools standardizing code when they 
haven’t chosen a standard i2b2 version greatly concerns me.  We don’t have 
multiple inheritance ontologies loaded in 
i2b2 because of the issues with synonyms and concept management.  We have not 
chosen to implement soft coded generalization (aka transitive closure table) in 
i2b2 because it is not standard.

If someone needs assistance with moving a file to this soft generalization 
design model (before they move the data into the i2b2 METADATA tables or 
CONCEPT_DIMENSION), let me know.  By playing with the indexes, you can make it 
perform better.  The columns you absolutely need are parent and child, and in 
order to query the data, a recursive query is needed.  Depth can be calculated. 
The discriminator gives the context of the generalization, and from my 
software experience, having the generalization as exhaustive is my preference.  
If you have others you need to add in the future, use a miscellaneous or other 
discriminator.
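
For anyone who wants to try this out, here is a minimal sketch of the 
parent/child layout with a recursive query that calculates depth, using 
Python's built-in sqlite3 module. The table name `generalization`, the index, 
the sample terms, and the discriminator values are assumptions made for the 
example, not part of i2b2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE generalization (
    parent        TEXT NOT NULL,
    child         TEXT NOT NULL,
    discriminator TEXT NOT NULL,
    PRIMARY KEY (parent, child)
);
-- Extra index so lookups by child do not scan the whole table.
CREATE INDEX generalization_child ON generalization (child);
""")

# Hypothetical sample hierarchy.
conn.executemany("INSERT INTO generalization VALUES (?, ?, ?)", [
    ("Injury", "Fracture", "diagnosis"),
    ("Fracture", "Lower limb fracture", "diagnosis"),
    ("Lower limb fracture", "Closed tibia fracture", "diagnosis"),
    ("Lower limb fracture", "Open tibia fracture", "diagnosis"),
])


def descendants(conn, root):
    """Return (term, depth) for every descendant of `root`.

    Depth is computed by the recursive query rather than stored. With
    multiple inheritance a term can be reached more than once, so the
    shallowest depth is kept. Assumes the hierarchy has no cycles.
    """
    rows = conn.execute("""
        WITH RECURSIVE walk(term, depth) AS (
            SELECT child, 1 FROM generalization WHERE parent = ?
            UNION ALL
            SELECT g.child, w.depth + 1
            FROM generalization AS g
            JOIN walk AS w ON g.parent = w.term
        )
        SELECT term, MIN(depth) AS min_depth
        FROM walk
        GROUP BY term
        ORDER BY min_depth, term
    """, (root,))
    return rows.fetchall()


for term, depth in descendants(conn, "Fracture"):
    print(depth, term)
```

Only parent and child are needed for the walk; the discriminator column is 
carried along here but not used by the query.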

Best Regards,
Keith
