Re: i2b2 discussion

Russ Waitman Thu, 04 Sep 2014 21:14:46 -0700

We’re already doing the first issue.

The question is when, or whether, we tackle the second issue.


I guess what I’d do is wait and see based upon customer demand,
- how much stuff people wanted us to do for computable phenotype work that was 
solvable via the PCORI CDM for PPRNs and other investigators
- how much we’d benefit from executing that computable phenotype query at a 
non-i2b2 site like Vanderbilt that has the full CDM.

If the answers were
a) there were very few questions from investigators that could be answered with 
the CDM because it lacks meds and labs and registries
b) none of the non-i2b2 CDRNs were interested in supporting an inbound cohort 
request based on an i2b2 query,

then I’d wait on the second issue work till after the CDM evolves.

Likely need to wait a while anyway because we need to get our data sharing 
agreements solidified which is a higher focus for me.

If a) and b) were false, after we’ve got data sharing signed I’d probably see 
if Nathan (plus Phillip Reeder and others in the GPC who are working to get the 
standardized i2b2->CDM ETL code on Bitbucket working so that we can use it 
repeatably across our network) could get a consult with Shawn’s team about 
their thoughts and then estimate effort.

If this is a tractable problem (1~2 months coding work estimate), I’d probably 
see if we could do it either out of current effort or spend down my commitment 
here.  If it was really critical, my guess is the constraint may be getting 
chunks of coding time from someone who’s up to speed, not funding.  Would 
probably know after a week or so if it’s solvable.  There’s no GUI work… it’s 
all just “transformation” of SQL and mapping ;)

But, I don’t think we need to think too much about it over the next 3-6 months; 
need to focus on getting our standardized approach to filling the CDM tables 
deployed:
raw data -> local i2b2 terms -> GPC compliant i2b2 ontologies -> GPC/PCORI CDM 
i2b2 ontology -> run standardized extract code into -> CDM
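As a rough illustration of the staged mapping above, here is a minimal Python sketch. The three dictionaries, the path layouts, and the helper names are all made up for illustration; the real mappings live in i2b2 ontologies and the GPC ETL code, not in hard-coded tables.

```python
# Hypothetical term mappings for each stage of the pipeline.
# Keys and values are invented examples, not real GPC or PCORI paths.
LOCAL_TERMS = {"raw:dx:250.00": "LOCAL|DX|250.00"}
GPC_TERMS = {"LOCAL|DX|250.00": "\\GPC\\Diagnoses\\ICD9\\250.00\\"}
CDM_TERMS = {"\\GPC\\Diagnoses\\ICD9\\250.00\\": "\\PCORI\\DIAGNOSIS\\09\\250.00\\"}


def map_stage(code, table, stage):
    """Look up one stage of the pipeline and fail loudly on a gap."""
    if code not in table:
        raise KeyError(f"no {stage} mapping for {code!r}")
    return table[code]


def to_cdm_row(patient_id, raw_code):
    """raw data -> local i2b2 term -> GPC ontology -> CDM ontology -> CDM row."""
    local = map_stage(raw_code, LOCAL_TERMS, "local i2b2")
    gpc = map_stage(local, GPC_TERMS, "GPC ontology")
    cdm_path = map_stage(gpc, CDM_TERMS, "CDM ontology")
    # In this invented path layout the last two segments are type and code.
    parts = cdm_path.strip("\\").split("\\")
    return {"PATID": patient_id, "DX_TYPE": parts[-2], "DX": parts[-1]}


row = to_cdm_row("p1", "raw:dx:250.00")
```

Failing on the first missing mapping, rather than silently dropping the fact, makes gaps in any one stage visible before the extract runs.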

Still have work to do to finish off that pipeline for all the data the CDM 
requires and the data we need in GPC compliant ontologies for things like 
cancer registries, meds, labs, and vitals/flowsheets/observables.

Russ

On Sep 4, 2014, at 4:34 PM, Aaron A Sorensen 
<[email protected]<mailto:[email protected]>> wrote:

Russ,
I think we are actually on the same page. Let's say you had the answers to all 
your questions below and had formulated a clear plan of attack as to how you 
would go about accomplishing the popmednet-i2b2 you had envisioned.

The next question would be, "Who will do all this work?" Based on the conversation 
during the last call, Shawn and Jeff would say, "Not us, unless we get more 
funding."  At which point, Maryan would say, "I might be able to help you with 
that if you can give me a compelling abstract and concrete specific aims."

So I think what you wrote below gives the group a pretty good idea of the 
"flavor" of your preferred specific aims.

Aaron

Aaron Sorensen
Director of Informatics
Temple University School of Medicine
Office: 215-707-8079
Mobile: 215-341-5033
________________________________
From: Russ Waitman
Sent: 9/4/2014 17:02
To: Maryan Zirkle; Aaron A Sorensen
Cc: [email protected]; Morris, Michele; Shirey, Bill; Borromeo, Charles; 
[email protected]; Dan Connolly; Nathan Graham; Espino, Jeremy Umali; Mignogna, 
Linda Kathryn; Murphy, Shawn N.; Jeff Brown; Thompson, Helga; [email protected]; 
Wehbe, Firas; Denny, Joshua Charles; Harris, Paul; Basford, Melissa
Subject: RE: i2b2 discussion

Hi All,
I was on vacation last week, but unless I am missing something, there’s a 
difference between data representation and end-user intuitive query interfaces. 
I’m not sure what the gain is with the approach outlined by Aaron; my thoughts 
are below.
Translating the first is straightforward, though attention-to-detail, work, 
while the second can be unbounded in effort or requires constraints that 
artificially limit its value to users (e.g., 3 Booleans).
I think the initial point of the call was to get people talking and get 
agreement on the question: can you translate data from i2b2 to the CDM?

-          Joe/Maryan/Jeff/CC: Yes, that is allowable.

-          Nathan’s GPC code to do it is here:

o   https://bitbucket.org/njgraham/pcori-annotated-data-dictionary

o   We now need to standardize across our network the mapping file work 
(heron_to_pcori.csv); as Shawn says… the devil is in the mapping with i2b2.  We 
want to change this from a csv to something managed all within i2b2 concept 
paths from local site to GPC standard to CDM standard
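To make the mapping-file problem concrete, here is a small Python sketch that loads a CSV of concept-path mappings and reports local paths with no CDM target. The column names (local_path, gpc_path, cdm_path) and the sample rows are assumptions for illustration only; they are not the actual layout of heron_to_pcori.csv.

```python
import csv
import io

# Invented sample data; the real heron_to_pcori.csv may look quite different.
SAMPLE_CSV = r"""local_path,gpc_path,cdm_path
\HERON\Dx\250.00\,\GPC\Dx\ICD9\250.00\,\PCORI\DIAGNOSIS\09\250.00\
\HERON\Dx\401.9\,\GPC\Dx\ICD9\401.9\,
"""


def load_mapping(text):
    """Return ({local_path: cdm_path}, [local paths lacking a CDM target])."""
    mapped, unmapped = {}, []
    for row in csv.DictReader(io.StringIO(text)):
        if row["cdm_path"]:
            mapped[row["local_path"]] = row["cdm_path"]
        else:
            unmapped.append(row["local_path"])
    return mapped, unmapped


mapped, unmapped = load_mapping(SAMPLE_CSV)
```

A check like the unmapped list is the kind of thing that would need an equivalent if the mapping moved out of a CSV and into i2b2 concept paths.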
But for the “query”, right now we’re basically giving Jeff data tables and he’s 
free to write arbitrarily complex queries in SQL and, I assume, SAS.  I think 
there might even be challenges with getting simple SQL statements running 
across different SQL databases unless it’s always wrapped by SAS (syntax of 
Oracle versus Microsoft versus MySQL versus PostgreSQL…).
If the query spec was bounded by ANSI SQL, someone might be able to code a 
“translator” that would convert code designed to work against the CDM to run 
directly against the same data in an i2b2 schema.  That might be interesting 
but not the same as a “query” that can be created by an end user via the i2b2 
interface.
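One small example of the dialect problem is row limiting, which is spelled differently on each platform. The Python sketch below renders the same request for four dialects. The function name is hypothetical, and the DEMOGRAPHIC table with a PATID column is assumed here as a stand-in for a CDM table.

```python
def first_n_patients(dialect, n):
    """Render 'first n patient ids' for one SQL dialect (illustrative only)."""
    if not isinstance(n, int) or n < 0:
        raise ValueError("n must be a non-negative integer")
    base = "SELECT PATID FROM DEMOGRAPHIC"
    if dialect in ("mysql", "postgresql"):
        return f"{base} ORDER BY PATID LIMIT {n}"
    if dialect == "sqlserver":
        return f"SELECT TOP {n} PATID FROM DEMOGRAPHIC ORDER BY PATID"
    if dialect == "oracle":
        # ROWNUM is assigned before ORDER BY, so sort in a subquery first.
        return f"SELECT PATID FROM ({base} ORDER BY PATID) WHERE ROWNUM <= {n}"
    raise ValueError(f"unsupported dialect: {dialect}")


statements = {d: first_n_patients(d, 5)
              for d in ("mysql", "postgresql", "sqlserver", "oracle")}
```

Even this trivial request needs a per-dialect branch, which is why a translator bounded by a common SQL subset is easier to imagine than one that accepts arbitrary code.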
So the first issue is: would the functional PCORNet Coordinating Center “basic” 
queries be constrained by, or a subset of, i2b2’s user functionality?  Which 
version of i2b2?

-          If yes, then it might be worth thinking about.  If the first query 
from the coordinating center wants to apply a MAX(), no.

-          I think the answer is likely no.  Jeff (and I, if I were a programmer) 
would want greater flexibility to write code that uses the full data 
representation of the CDM and not have artificial constraints imposed.

-          Now we may learn that the management overhead of creating the shadow 
CDM is really bad, but we can cross that bridge later.  We know we have 
to write this i2b2->CDM translator anyway.


The second issue would be: is there a place for i2b2 as a non-programmer means 
to build a query against a PCORNet-compliant ontology that could then be shared 
first amongst i2b2-using PCORI sites and then, perhaps equally important, as a 
means for users to build PCORNet-compliant queries?

-          I think the answer might be yes.  I think there’s also a PopMedNet 
GUI but my sense is more investigators are familiar with i2b2 and it has 
increasingly rich functionality that’s battle tested by many sites.

-          An i2b2 query is at its heart translated into a SQL statement that 
runs against tables.

-          If the i2b2 sites all have the same PCORNet CDM ontology 
implemented, it’s the foundation of cross site queries and we can copy our 
queries between one another and eventually use things like SHRINE

-          But, if someone could write code that translated the i2b2 query into 
something that hit the PCORNet CDM, you could have users build queries with 
i2b2 and then have that get executed against sites who hold data in the CDM but 
don’t have the i2b2 schema and application running.

o   Shawn and other i2b2 gurus would need to think if this is possible but my 
guess is it’s a tractable problem (meaning once written, it’s going to work 
across the CDM spec)

o   That would help people author things in i2b2 and share back with PEDSNet 
and Vandy (was talking today with Vanderbilt regarding computable phenotype).

o   That said, I don’t think this is top priority at this point since we’ve got 
to get the first leg nailed down
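To sketch the idea of one query definition running against either schema, the Python below renders the same panel-style query as SQL for an i2b2-like fact table and for a CDM-like diagnosis table, then runs both in SQLite. This is a simplification under stated assumptions: real i2b2 resolves concept paths through concept_dimension and generates more elaborate SQL, and the 'ICD9:<code>' convention, the two-column tables, and all function names here are invented for illustration.

```python
import sqlite3

# Panels are ANDed together; codes inside one panel are ORed.
QUERY = [["ICD9:250.00", "ICD9:250.01"], ["ICD9:401.9"]]


def panel_sql_i2b2(codes):
    """One panel against an i2b2-style observation_fact table."""
    marks = ",".join("?" * len(codes))
    sql = f"SELECT patient_num FROM observation_fact WHERE concept_cd IN ({marks})"
    return sql, list(codes)


def panel_sql_cdm(codes):
    """Same panel against an assumed CDM-style DIAGNOSIS(PATID, DX) table."""
    dx = [c.split(":", 1)[1] for c in codes]  # drop the 'ICD9:' prefix
    marks = ",".join("?" * len(dx))
    return f"SELECT PATID FROM DIAGNOSIS WHERE DX IN ({marks})", dx


def build(query, panel_sql):
    """Join the per-panel statements with INTERSECT (logical AND)."""
    parts, params = [], []
    for panel in query:
        sql, panel_params = panel_sql(panel)
        parts.append(sql)
        params.extend(panel_params)
    return " INTERSECT ".join(parts), params


def run_demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)")
    db.execute("CREATE TABLE DIAGNOSIS (PATID INTEGER, DX TEXT)")
    facts = [(1, "ICD9:250.00"), (1, "ICD9:401.9"),
             (2, "ICD9:250.01"), (3, "ICD9:401.9")]
    db.executemany("INSERT INTO observation_fact VALUES (?, ?)", facts)
    db.executemany("INSERT INTO DIAGNOSIS VALUES (?, ?)",
                   [(p, c.split(":", 1)[1]) for p, c in facts])
    results = []
    for panel_sql in (panel_sql_i2b2, panel_sql_cdm):
        sql, params = build(QUERY, panel_sql)
        results.append(sorted(r[0] for r in db.execute(sql, params)))
    db.close()
    return results


i2b2_patients, cdm_patients = run_demo()
```

Both renderings return the same patient set on the toy data, which is the property a real i2b2-to-CDM query translator would need to preserve.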


Russ

From: Maryan Zirkle [mailto:[email protected]]
Sent: Thursday, September 04, 2014 4:18 AM
To: Aaron A Sorensen
Cc: [email protected]<mailto:[email protected]>; Morris, Michele; Shirey, Bill; 
Borromeo, Charles; 
[email protected]<mailto:[email protected]>; 
Russ Waitman; Dan Connolly; Nathan Graham; Espino, Jeremy Umali; Mignogna, 
Linda Kathryn; Murphy, Shawn N.; Jeff Brown; Thompson, Helga; 
[email protected]<mailto:[email protected]>
Subject: Re: i2b2 discussion

Aaron,
This is perfect! Yes, specific aims such as these are exactly what is needed on 
my end; not to mention the direction they can help give the overall efforts of 
the group.

I agree with your interpretation of the consensus after our last discussion, 
but welcome the perspective of others along with their suggested aims as you 
mention.

Once we agree on the aims, I would like to determine the who, what, when of 
each so that I can create a timeline of our efforts and outputs for my 
leadership. This can also help to give shape to the manuscript you speak about 
in #5--as to who will own discussion of what.

Eventually, I will be finding a consistent time (weekly or bi-monthly) for us 
to meet, but the calendars aren't aligning well enough until November for that.

Best,
Maryan Zirkle MD, MS, MA
Program Officer, CER Methods and Infrastructure Program
Patient Centered Outcomes Research Institute (PCORI)

Sent from my mobile office

On Sep 4, 2014, at 4:52 AM, "Aaron A Sorensen" 
<[email protected]<mailto:[email protected]>> wrote:
Maryan,
Assuming that we are all in agreement that the end result of last call was 
general consensus that a basic level of i2b2{for PCORnetCDM}/PopMedNet{for 
PCORnetCDM} interoperability is desirable and completely feasible but just not 
funded, I would propose that on the next call, a faux grant abstract and 
specific aims would start to take form so that we could give you something 
concrete to take back to the PCORnet/CTSA leadership.

If people like this idea, maybe anyone who has an opinion could send in three 
to five sample specific aims e.g.,
1.       Border cases will be explored in order to come up with a crisp 
definition of “simple query”, e.g., no more than three Booleans and no more than 
two stratifying dimensions.
2.       Within the confines of the simple-query space, a proof-of-concept 
(PoC) will be executed first individually (and manually) in both systems, and 
subsequently using an automated algorithmic transform that takes 
PopMedNet-Query XML and translates it to the corresponding i2b2 query and then 
does the reverse translation with the query’s result set.
3.       Based on specific aims #1 and #2, identify border use cases to guide 
the efforts of making the PoC production ready.
4.       Ten development sprints of two weeks each (with end-user testing after 
each sprint) to harden the PoC code using the use cases identified in #3.
5.       Drafting of a manuscript describing the effort to be submitted to JAMIA 
or other peer-reviewed journal.
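A “simple query” definition along the lines of aim #1 could be checked mechanically. The Python sketch below is one possible reading, using the example limits above (three Booleans, two stratifying dimensions). The QuerySpec class and its fields are hypothetical and do not correspond to any PopMedNet or i2b2 structure.

```python
from dataclasses import dataclass, field


@dataclass
class QuerySpec:
    """Hypothetical, minimal description of a query for the simplicity check."""
    boolean_operators: list = field(default_factory=list)  # e.g. ["AND", "OR"]
    stratify_by: list = field(default_factory=list)        # e.g. ["sex", "age_group"]


def is_simple(query, max_booleans=3, max_strata=2):
    """True if the query stays within the example 'simple query' limits."""
    return (len(query.boolean_operators) <= max_booleans
            and len(query.stratify_by) <= max_strata)


within_limits = QuerySpec(["AND", "OR"], ["sex", "age_group"])
too_many_strata = QuerySpec(["AND"], ["sex", "age_group", "race"])
```

Keeping the limits as parameters leaves room for the group to settle on different thresholds once the border cases have been explored.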

Aaron

Aaron Sorensen
Director of Informatics
Temple University School of Medicine
 Office: +1.215.707.8079
Mobile: +1.215.341.5033


-----Original Appointment-----
From: Maryan Zirkle [mailto:[email protected]]
Sent: Wednesday, September 3, 2014 14:19
To: Maryan Zirkle; [email protected]<mailto:[email protected]>; Morris, Michele; 
Shirey, Bill; Aaron A Sorensen; Borromeo, Charles; 
[email protected]<mailto:[email protected]>; 
Russ Waitman; Dan Connolly; [email protected]<mailto:[email protected]>; Espino, 
Jeremy Umali; Mignogna, Linda Kathryn; Murphy, Shawn N.; Jeff Brown; Thompson, 
Helga;[email protected]<mailto:[email protected]>
Subject: i2b2 discussion
When: Monday, September 8, 2014 13:30-14:00 (UTC-05:00) Eastern Time (US & 
Canada).
Where: Toll-free dial-in number (U.S. and Canada): (866) 802-2104 Conference 
code: 6225772781


Hi all,
We will continue our discussion from last week. Please send me any specific 
topics for discussion via email and we will quickly go through them like we did 
on the last call. I know 30 minutes isn't ideal, but it is all that we have 
with the busy schedules!
Best,
MZ

Russ Waitman, PhD
Director of Medical Informatics
Assistant Vice Chancellor for Enterprise Analytics
Associate Professor, Department of Internal Medicine
University of Kansas Medical Center, Kansas City, Kansas
913-945-7087 (office)
[email protected]<mailto:[email protected]>
http://www.kumc.edu/ea-mi/
http://informatics.kumc.edu<http://informatics.kumc.edu/>
http://informatics.gpcnetwork.org – a PCORNet collaborative

_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
