Alex, The standard i2b2 query generates something that includes (simplified
some):
select patient_num from observation_fact obs
where obs.concept_cd in
(select cd.concept_cd from concept_dimension cd where
cd.concept_path LIKE '\i2b2\Encounters\Notes\%’)
...
As a result, if you have dupes in the CONCEPT_DIMENSION table, it will merely
result in some dupes in the select within parenthesis. The outer query will
merely look for the matching concept_cd terms within that list. It should not
generate a join that has additional permutations.
- Glenn
Medical Informatics Senior Analyst
CTSI – Clinical & Translational Science Institute
[email protected]<mailto:[email protected]>
(414) 805-7239
From: Alex Bokov <[email protected]<mailto:[email protected]>>
Organization: Department of Epidemiology and Biostatistics, UTHSCSA
Date: Monday, September 22, 2014 at 8:38 PM
To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Subject: I2B2's CONCEPT_DIMENSION and exploding joins
As I was trying to count and pivot data from I2B2 today, I noticed that
our CONCEPT_DIMENSION table (in the BLUEHERONDATA schema if you use the
HERON ETL code, in the I2B2DEMODATA schema by default presumably) has
duplicate CONCEPT_CD's.
Here is the resulting question to the I2B2 Install Help group:
https://groups.google.com/d/topic/i2b2-install-help/eTK7I1JJn3Y/discussion
Do any of you suffer from exploding joins when attempting to join
OBSERVATION_FACT on CONCEPT_DIMENSION using CONCEPT_CD? At the other
end, do any of you never even use CONCEPT_DIMENSION? I'm tempted to just
keep the first path and description from each distinct CONCEPT_CD in
concept dimension. Thoughts?
_______________________________________________
Gpc-dev mailing list
[email protected]<mailto:[email protected]>
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev