Charles,

This comment about two DataMarts per site is brand new to me. There is certainly nothing explicit in the CDM specs that implies such a requirement. From our i2b2 native reference datasets, we have SQL queries that prepare the data in CDM table format, although the jump from CDM V2 to V3 is going to be troublesome, as I indicated to Russ.

Jim
James R. Campbell MD
[email protected]
Office 402-559-7505
Secretary 402-559-7299
Fax 402-559-8396
Pager 402-888-1230

-----Original Message-----
From: Borromeo, Charles [mailto:[email protected]]
Sent: Thursday, May 21, 2015 9:00 AM
To: Russ Waitman; Campbell, James R; [email protected]; McClay, James C
Cc: Mandl, Kenneth ([email protected]); Shawn N. Murphy ([email protected]); Rachel Hess ([email protected])
Subject: Re: PCORI CDM V3 vote

Hi Russ,

We met with DSSNI on Monday. The PaTH CDRN shares your concerns about the non-EAV structure of the data model. Dr. Chris Chute (recently joined JHU) also thinks the CDM is very brittle. However, PaTH never dedicated time to developing a viable alternative to the CDM. It seemed like too big of a change to discuss given the CDM 3.0 approval date of May 2015.

I did discuss a short-term flaw with Jeff, Leslie, Laura, and Rich Platt. In PaTH, I am developing some Python scripts to convert our i2b2 data into the CDM. According to DSSNI, the CDRNs should deploy 2 DataMarts: one with EMR data and one with Claims data. Deploying two DataMarts allows the CDRNs to avoid the issue of combining Claims Encounters with EMR Encounters. Leslie said some CDRNs are required to keep the Claims data separate from the EMR data thus necessitating 2 DataMarts.

During the development process I found a flaw with the 2 DataMart approach. Basically, the Claims only DataMart would be missing data in several tables (see attached image) including: VITAL, CONDITION, PRO_CM, LAB_RESULT_CM, and PRESCRIBING. The data for these tables comes from the EMR, not claims. Therefore, the Claims Only DataMart would only be able to answer a subset of research questions. During the discussion, it appeared that Jeff Brown did not have a technical solution allowing him to query across the 2 DataMarts in a single query. Therefore, storing the data in one DataMart would answer more research questions.
I suggested that DSSNI add some columns to the tables allowing the ETL process to describe the data provenance. The columns would include information about the type of encounter (inpatient vs outpatient) and datasource (claims vs EMR). Some of the tables (like PROCEDURES) already have these columns (ENC_TYPE and PX_TYPE). DSSNI would need to check the other tables to ensure this information. This approach effectively demotes the importance of the encounter and eliminates the need to combine encounters. There may be some other alternatives. Jeff said he would give this some consideration so we will see what happens.

Chuck Borromeo

From: Russ Waitman <[email protected]>
Date: Thursday, May 21, 2015 at 9:20 AM
To: "'Campbell, James R'" <[email protected]>, "[email protected]" <[email protected]>, "'McClay, James C ([email protected])'" <[email protected]>
Cc: Charles Borromeo <[email protected]>, "Mandl, Kenneth ([email protected])" <[email protected]>, "Shawn N. Murphy ([email protected])" <[email protected]>, Rachel Hess <[email protected]>
Subject: RE: PCORI CDM V3 vote

Dear Jim and GPC Dev,

Thanks for the good discussion Tuesday regarding the CDM 3 vote:
https://docs.google.com/document/d/1ih4XGJVrTjIH7xOHAnQOqfqKvLl9ZxovS5PXgFOT7qo/edit

Did we as the GPC or any other CDRNs ever propose alternatives or improved modifications to the CDM draft? If not, was it because
- No opportunity
- there was a sense it was futile
- No interest

Do we have written recommendations to improve CDM3 or specifically identify the flaws or most difficult to maintain sections?
At a high level I view adding prescribing/ordered medications as good.
I am still concerned this non-EAV model for each domain is very expensive to augment and maintain.
It would be preferable to share extensible enhancements with the group as an alternative,

Russ

From: Campbell, James R [mailto:[email protected]]
Sent: Monday, May 18, 2015 7:10 AM
To: Russ Waitman
Subject: RE: PCORI CDM V3 vote

On Saturday PCORI preemptively cancelled the DSSNI call scheduled for today. There has been no organized discussion of CDM for a month now. They have moved the decision making to the PIs, apparently to limit debate and need for their response. It seems they have been missing every deadline they set for themselves and I am not sure what to expect. Are you saying they have not been discussing this within the PI steering group either?

Jim

________________________________________
From: Russ Waitman [[email protected]]
Sent: Monday, May 18, 2015 6:25 AM
To: Campbell, James R
Cc: Dan Connolly; McClay, James C
Subject: Re: PCORI CDM V3 vote

That secure transmission is the fault of the KUMC email system. No idea why it did that. I think we are all somewhat unenthusiastic about the direction of the CDM. Do you have suggestions that would improve the next iteration? Any chance to bring those forward to Disney?

Russ

On May 17, 2015, at 10:23 AM, Campbell, James R <[email protected]> wrote:

Russ,

Thanks for sharing the CDM V3 document with me. Why the secure transmission? I thought this was public knowledge? Have they been discussing these changes in the CDM in the PI forum?

Looking through the copy that you sent me I count over 35 data attributes ADDED since our input was tendered on V3. Many of those additions do nothing to improve data quality (like all the temporary primary identifier fields we will have to generate... we need to be sure they are serious that we do not have to maintain IDs across refresh cycles) and will be a lot of work for GPC data managers.
I can understand that perhaps they will be useful for the central data warehouse managers and presume that is where the requirements originated. I assume that many networks are refusing to release non-obfuscated dates without full IRB and so I appreciate the rationale for the proliferation of HARVEST.Attributes but that table will have to be regenerated for each trial report assuming that we will have a mixture of IRB approved and non-approved trials.

They are giving lip service to compliance with meaningful use standardization but are adding duplicate data identification requirements (PRO_CM.PRO_ITEM; LAB_RESULT.LAB_NAME are examples) that create overhead for our data managers and require mapping tables in addition to what our sites are doing for ONC compliance.

I was surprised by the appearance of table 2.5, "Implementation Expectations", on page 6. Are a lot of CDRNs not able to produce LAB, CONDITION, DEATH and PRESCRIBING datasets? Will these be the factors that separate the men from the boys in trial participation? I don't see how they can do the ADAPTABLE trial from EHR data harvest without some of these data sets.

In short, this V3 document creates a lot of new requirements for our data managers, many with apparently arbitrary specs. If we can take table 2.5 literally, GPC should be able to meet CDM compliance in the next few months but I ask if the OPTIONAL tables will not be the mark of the truly successful CDRN and therefore required for our long term viability. Please provide your perspectives on this. What is the discussion among PIs? Is the snowball already halfway down the hill?

Jim

NEW or REVISED ELEMENTS IN CDM V3
DIAGNOSIS.DIAGNOSISID (Unique over time for all queries to site?? They say no and so I ask WHY?)
PROCEDURE.PROCEDURESID
VITAL.SMOKING
VITAL.TOBACCO (CHANGED FROM V2; IT APPEARS THAT THEY HAVE CREATED DUPLICATE ENTRIES FOR SMOKING BEHAVIOR AND HAVE CHANGED V2 DEFINITIONS ON TOBACCO TYPE. THIS FLIES IN THE FACE OF WHAT WE ARE BEING REQUIRED TO REPORT FOR MEANINGFUL USE)
VITAL.TOBACCO_TYPE
DISPENSING.DISPENSINGID
DISPENSING.PRESCRIBINGID (QUESTIONABLE ADDITION! THOSE OF OUR SITES THAT CAN REPORT THIS WILL BE ACCEPTING SURESCRIPTS DATA THAT THEY HAVE NOT ORIGINATED)
DISPENSING.NDC (SHOULD SPECIFICALLY DRAW FROM NLM RXNAV PUBLICATION)
[LAB_RESULT.LAB_NAME CREATES BURDEN FOR MAPPING ALL TEST NAMES IN ADDITION TO LOINC SHORT NAME WHICH SHOULD BE QUITE ADEQUATE FOR RESEARCH PURPOSES]
LAB_RESULT.NORM_MODIFIER_LO
LAB_RESULT.NORM_MODIFIER_HI
CONDITION.CONDITIONID
PRO_CM.PRO_CMID
PRO_CM.PRO_ITEM (REDUNDANT WITH LOINC CODE ?WHY?)
PRESCRIBING.ORDER_TIME
PRESCRIBING _FREQUENCY
PRESCRIBING.RX_BASIS (NEW; THIS IS INCONSISTENT WITH GUIDANCE ON NATURE OF THE DISPENSING RECORD AND MAKES NO SENSE!)
PCORNET_TRIAL.PARTICIPANTID
PCORNET_TRIAL.TRIALSITEID
HARVEST.BIRTH_DATE_MGMT
HARVEST.ENR_START_DATE_MGMT
HARVEST.ENR_END_DATE_MGMT
HARVEST.ADMIT_DATE_MGMT
HARVEST.DISCHARGE_DATE_MGMT
HARVEST.PX_DATE_MGMT
HARVEST.RX_ORDER_DATE_MGMT
HARVEST.RX_START_DATE_MGMT
HARVEST.RX_END_DATE_MGMT
HARVEST.RESULT_DATE
HARVEST.MEASURE_DATE
HARVEST.ONSET_DATE_MGMT
HARVEST.REPORT_DATE_MGMT
HARVEST.RESOLVE_DATE_MGMT
HARVEST.PRO_DATE_MGMT
HARVEST.REFRESH_DEMOGRAPHIC_DATE
HARVEST.REFRESH_PRESCRIBING_DATE
HARVEST.REFRESHPCORNET_TRIAL_DATE
HARVEST.REFRESH_DEATH_DATE
HARVEST.REFRESH_DEATH_CAUSE_DATE
<v3_VOTE.docx>

Russ Waitman, PhD
Director of Medical Informatics
Assistant Vice Chancellor for Enterprise Analytics
Associate Professor, Department of Internal Medicine
University of Kansas Medical Center, Kansas City, Kansas
913-945-7087 (office)
[email protected]
http://www.kumc.edu/ea-mi/
http://informatics.kumc.edu
http://informatics.gpcnetwork.org a PCORNet collaborative

_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
