Johnsd11 commented on issue #55: URL: https://github.com/apache/ctakes/issues/55#issuecomment-2835940544
The first thing that I'll mention is that there are a lot of updates to the cTAKES dictionary builder in the unreleased version 5.0, so I am going to talk about its use.
https://github.com/apache/ctakes

> 1. Combining SNOMED and RxNorm in Dictionary Creation:
> I extracted data from umls-2023AA-full and RxNorm_full. After utilizing NLM Metamorphosys to install UMLS, the conversion of SNOMED from umls-2023AA-full into RRF files was successfully accomplished.

- For clarity, are you stating that you created RRF files for SNOMED from umls-2023AA-full and separate RRF files from the RxNorm_full source? If so, are you sure that umls-2023AA-full doesn't already contain all of the RxNorm information that you need?

> I can only select one "UMLS Installation" source, limiting me to either SNOMED or RxNorm.

- This is correct. Normally a dictionary is built from RRF files created using MetamorphoSys on a single source.
- There are two crude ways to combine dictionaries from disparate sources:
  1. Concatenate the source RRF files from both sources. You should only need to do this with the MRCONSO RRF files. Then select the directory containing the concatenated RRF (and the other RRF files) as the UMLS source in the dictionary creator GUI.
  2. Build two cTAKES dictionaries, one from each source. Then concatenate all "INSERT" lines into one dictionary file.
- A cleaner method for your situation is to create one cTAKES dictionary for SNOMED and a separate cTAKES dictionary for RxNorm, then create a dictionary descriptor file that lists both dictionaries. Tim Miller has a great example of one here: https://github.com/tmills/ctakes-docker/blob/master/ctakes-as-pipeline/MultipleDictionaryLookupSpecExample.xml
- The multiple dictionary approach is more flexible, but try not to use multiple dictionaries with a lot of overlap.
- I think that vocabularies containing a dash in the name, such as "MED-RT", were problematic in older versions of the dictionary creator. It should be OK with v5.
- The problem stemmed from SQL not allowing dash characters in table names without special treatment. cTAKES gets around it by converting the dash character to an underscore.
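
For the first option above (concatenating the MRCONSO RRF files), here is a minimal Python sketch. It is not part of cTAKES. The function name `concat_mrconso` and the duplicate-row filtering are my own assumptions. It only relies on MRCONSO.RRF being a line-oriented, pipe-delimited text file.

```python
from pathlib import Path


def concat_mrconso(source_dirs, combined_dir):
    """Concatenate MRCONSO.RRF from each source dir into combined_dir.

    Hypothetical helper, not a cTAKES API. Returns the number of rows written.
    """
    combined_dir = Path(combined_dir)
    combined_dir.mkdir(parents=True, exist_ok=True)
    out_path = combined_dir / "MRCONSO.RRF"
    inputs = [Path(src) / "MRCONSO.RRF" for src in source_dirs]

    # Refuse to overwrite one of the inputs while reading it.
    if out_path.resolve() in {p.resolve() for p in inputs}:
        raise ValueError("combined_dir must differ from every source dir")

    # Assumption: dropping byte-identical rows is harmless. The set is held
    # in memory, so this suits per-vocabulary subsets, not a full UMLS dump.
    seen = set()
    written = 0
    with out_path.open("w", encoding="utf-8", newline="\n") as out:
        for rrf_path in inputs:
            with rrf_path.open(encoding="utf-8") as rrf:
                for line in rrf:
                    row = line.rstrip("\r\n")
                    if not row or row in seen:
                        continue
                    seen.add(row)
                    out.write(row + "\n")
                    written += 1
    return written
```

A call such as `concat_mrconso(["snomed_rrf", "rxnorm_rrf"], "combined_rrf")` (directory names are placeholders) writes only the combined MRCONSO.RRF. The other RRF files still have to be copied into that directory before selecting it as the UMLS source in the dictionary creator GUI.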
