The bulk of the work is done in 
ssdmf_load.sql<https://github.com/kumc-bmi/heron/blob/master/heron_load/ssdmf_load.sql>.
 It matches on SSN and birth date and adds info to patient_mapping, 
encounter_mapping, patient_dimension, and observation_fact.


The table schema we use is in 
ssdmf_table.sql<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/ssdmf_table.sql>:

create table ssdmf_table ("ADD_CHANGE_DELETE" varchar2(1)
, "SOCIAL_SECURITY_NUMBER" varchar2(9)
, "LAST_NAME" varchar2(20)
, "NAME_SUFFIX" varchar2(4)
, "FIRST_NAME" varchar2(15)
, "MIDDLE_NAME" varchar2(15)
, "VERIFY_OR_PROOF_CODE" varchar2(1)
, "DATE_OF_DEATH" date
, "DATE_OF_BIRTH" date
, "STATE_COUNTRY_CODE_OF" varchar2(2)
, "ZIP_CODE_LAST_RESIDENCE" varchar2(5)
, "ZIP_CODE_LUMP_SUM_PAYMENT" varchar2(5))

NTIS publishes regular deltas rather than the whole database. We use 
get_updates.py<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/get_updates.py>
 to download the deltas and 
dmf_monthly_etl.sql<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/dmf_monthly_etl.sql>
 to apply them. (These are both wrapped in Jenkins jobs.)

--
Dan

_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev

Reply via email to