The bulk of the work is done in ssdmf_load.sql<https://github.com/kumc-bmi/heron/blob/master/heron_load/ssdmf_load.sql>. It matches on SSN and birth date and adds info to patient_mapping, encounter_mapping, patient_dimension, and observation_fact.
The table schema we use is in ssdmf_table.sql<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/ssdmf_table.sql>: create table ssdmf_table ("ADD_CHANGE_DELETE" varchar2(1) , "SOCIAL_SECURITY_NUMBER" varchar2(9) , "LAST_NAME" varchar2(20) , "NAME_SUFFIX" varchar2(4) , "FIRST_NAME" varchar2(15) , "MIDDLE_NAME" varchar2(15) , "VERIFY_OR_PROOF_CODE" varchar2(1) , "DATE_OF_DEATH" date , "DATE_OF_BIRTH" date , "STATE_COUNTRY_CODE_OF" varchar2(2) , "ZIP_CODE_LAST_RESIDENCE" varchar2(5) , "ZIP_CODE_LUMP_SUM_PAYMENT" varchar2(5)) NTIS publishes regular deltas rather than the whole database. We use get_updates.py<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/get_updates.py> to download the deltas and dmf_monthly_etl.sql<https://github.com/kumc-bmi/heron/blob/master/heron_staging/ssdmf/dmf_monthly_etl.sql> to apply them. (These are both wrapped in Jenkins jobs.) -- Dan
_______________________________________________ Gpc-dev mailing list [email protected] http://listserv.kumc.edu/mailman/listinfo/gpc-dev
