For UTSW, several of the terminologies I sent are not showing up in Babel. I'm guessing they just need to be added to table_access. Specifically, the following:
\Beacon\ - Our Epic Beacon Staging information
\CancerGeneConnect\ - A genetic database
\caTissue\ - caTissue Samples
\MINIMALDATASET\ - A Breast Cancer Dataset
\i2b2\naaccr\ - Our Tumor Registry
\i2b2\Reports\ - Notes Terminology
\PATH REPORT\ - Information from Pathology Reports

Phillip

From: Dan Connolly <[email protected]>
Date: Wednesday, January 29, 2014 1:18 PM
To: "[email protected]" <[email protected]>
Subject: loading I2B2 concepts: KUMC, MCW, UMN OK; struggles with MCRF, UTSW, WISC

Progress on loading stuff into babel <http://babel.gpcnetwork.org/> continues... 3 down, but...

Joe, I'm having trouble with weird quotes in the c_basecode field of the Marshfield CSV data:

/usr/bin/psql --host localhost --dbname i2b2 --user i2b2metadata --no-password -c copy mcrf_terms(C_HLEVEL, C_FULLNAME, C_NAME, C_SYNONYM_CD, C_VISUALATTRIBUTES, C_TOTALNUM, C_BASECODE, C_METADATAXML, C_FACTTABLECOLUMN, C_TABLENAME, C_COLUMNNAME, C_COLUMNDATATYPE, C_OPERATOR, C_DIMCODE, C_COMMENT, C_TOOLTIP, UPDATE_DATE, DOWNLOAD_DATE, IMPORT_DATE, SOURCESYSTEM_CD, VALUETYPE_CD) from stdin with (format csv, header true, null 'NULL')
ERROR: value too long for type character varying(50)
CONTEXT: COPY mcrf_terms, line 42163, column c_basecode: ") Miscell. (Med.Supl.;Non-Drugs),,concept_cd,concept_dimension,concept_path,T,=,\i2b2\Medications\*N..."

The offending line is:

5,\i2b2\Medications\*Not Available\DEVICES\Syringe with Needle (Disp)\Syringe with Needle (Disp) (Syringe 3cc/21Gx1-1/2\,Med:29443,N,LA ,0,) Miscell. (Med.Supl.;Non-Drugs)",,concept_cd,concept_dimension,concept_path,T,=,\i2b2\Medications\*Not Available\DEVICES\Syringe with Needle (Disp)\Syringe with Needle (Disp) (Syringe 3cc/21Gx1-1/2\,,Med:29443,2013-10-16 00:00:00.000,2013-10-16 00:00:00.000,2013-10-16 00:00:00.000,RDW, 

I'm working on encoding issues in the UTSW data. And it looks like the WISC data is tab-separated, not CSV...
-- Dan

________________________________

UT Southwestern Medical Center
The future of medicine, today.
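
A rough pre-load check along the lines of the COPY failure quoted above could be sketched in Python as follows. This is only an illustration, not the loader actually used for Babel: the function name find_suspect_rows and the constant names are made up here, the column order is copied from the COPY command in the message, and the 50-character limit comes from the "character varying(50)" error. Note that Python's csv module treats a quote inside an unquoted field as a literal character, which is not how PostgreSQL COPY handles it, so this flags suspicious rows rather than reproducing the exact error.

```python
import csv

# Column order copied from the COPY command quoted in the message.
COLUMNS = [
    "C_HLEVEL", "C_FULLNAME", "C_NAME", "C_SYNONYM_CD", "C_VISUALATTRIBUTES",
    "C_TOTALNUM", "C_BASECODE", "C_METADATAXML", "C_FACTTABLECOLUMN",
    "C_TABLENAME", "C_COLUMNNAME", "C_COLUMNDATATYPE", "C_OPERATOR",
    "C_DIMCODE", "C_COMMENT", "C_TOOLTIP", "UPDATE_DATE", "DOWNLOAD_DATE",
    "IMPORT_DATE", "SOURCESYSTEM_CD", "VALUETYPE_CD",
]

# Assumption: the limit reported as "character varying(50)" applies to C_BASECODE.
MAX_BASECODE_LEN = 50


def find_suspect_rows(lines, delimiter=","):
    """Yield (line_number, reason) for rows likely to break a CSV load.

    `lines` is any iterable of text lines (an open file works); the first
    line is assumed to be a header and is skipped.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header row
    basecode_idx = COLUMNS.index("C_BASECODE")
    for row in reader:
        line_no = reader.line_num
        if len(row) != len(COLUMNS):
            yield line_no, f"expected {len(COLUMNS)} fields, got {len(row)}"
            continue
        if any('"' in field for field in row):
            # A quote that survives parsing usually means unbalanced quoting.
            yield line_no, "stray double quote inside a field"
        if len(row[basecode_idx]) > MAX_BASECODE_LEN:
            yield line_no, f"C_BASECODE longer than {MAX_BASECODE_LEN} characters"
```

For a tab-separated export such as the WISC file, the same function can be called with delimiter="\t"; a file that was assumed to be comma-separated but is really tab-separated shows up as a field-count mismatch on nearly every row.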
