Johnsd11 commented on issue #55:
URL: https://github.com/apache/ctakes/issues/55#issuecomment-2835940544

   
   The first thing that I'll mention is that there are a lot of updates to the 
ctakes Dictionary builder in the unreleased version 5.0, so I am going to talk 
about its use.  https://github.com/apache/ctakes
   
   
   
       >  1. Combining SNOMED and RxNorm in Dictionary Creation:
       > I extracted data from umls-2023AA-full and RxNorm_full. After 
utilizing NLM Metamorphosys to install UMLS, the conversion of SNOMED from 
umls-2023AA-full into RRF files was successfully accomplished.
   
   - For clarity, are you saying that you created RRF files for SNOMED from 
umls-2023AA-full and separate RRF files from the RxNorm_full source?  If so, are 
you sure that umls-2023AA-full doesn't already contain all of the RxNorm 
information that you need?
   
   
   
       > I can only select one "UMLS Installation" source, limiting me to 
either SNOMED or RxNorm.
   
   - This is correct.  Normally a dictionary is built from RRF files created 
using metamorphosys on a single source.
   - There are two possible workarounds for combining dictionaries from 
disparate sources:
   
     1.  Concatenate the source RRF files from both sources.  You should only 
need to do this with the MRCONSO RRF files.  Then select the directory 
containing the concatenated RRF (and other RRF files) as the umls source for 
the dictionary creator gui.
     2.   Build 2 ctakes dictionaries, one from each source.  Then concatenate 
all "INSERT" lines into one dictionary file.
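
   The two options above can be sketched in a few lines of Python. This is 
only a rough illustration, not part of ctakes: the function names and file 
paths are made up, and it assumes that the generated dictionary is a plain-text 
script in which each term is stored on a line starting with "INSERT".

```python
def _write_line(out, line):
    # Make sure every copied line ends with a newline, so the last line of
    # one file is never glued to the first line of the next one.
    out.write(line if line.endswith("\n") else line + "\n")


def concat_mrconso(sources, dest):
    """Option 1: append several MRCONSO.RRF files into a single file."""
    with open(dest, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, encoding="utf-8") as rrf:
                for line in rrf:
                    _write_line(out, line)


def merge_insert_lines(base_script, extra_script, dest):
    """Option 2: keep base_script whole, then add the INSERT lines of extra_script."""
    with open(dest, "w", encoding="utf-8") as out:
        with open(base_script, encoding="utf-8") as base:
            for line in base:
                _write_line(out, line)
        with open(extra_script, encoding="utf-8") as extra:
            for line in extra:
                # Assumption: only the data rows are wanted from the second file.
                if line.startswith("INSERT"):
                    _write_line(out, line)
```

   For option 1 the output file would still need to be named MRCONSO.RRF and 
sit next to the other RRF files in the directory that you select in the gui.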
   
   - A cleaner method for your situation is to create one ctakes dictionary for 
snomed and a separate ctakes dictionary for rxnorm.  Then create a dictionary 
descriptor file for multiple dictionaries.  Tim Miller has a great example of 
one here:  
https://github.com/tmills/ctakes-docker/blob/master/ctakes-as-pipeline/MultipleDictionaryLookupSpecExample.xml
   - The multiple dictionary approach is more flexible, but try not to use 
multiple dictionaries with a lot of overlap.
   
   
   - I think that vocabularies containing a dash in the name, such as "MED-RT", 
were problematic in older versions of the dictionary creator.  It should be OK 
with v5.0.
   - The problem stemmed from SQL not allowing dash characters in table names 
without special treatment.  ctakes gets around it by converting the dash 
character to an underscore.
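
   As a rough illustration of that conversion (this is not the actual ctakes 
code, and the function name is hypothetical), the idea is simply:

```python
def sql_safe_name(vocabulary):
    # An unquoted dash in a SQL identifier is parsed as a minus sign,
    # so replace it with an underscore before using the name as a table name.
    return vocabulary.replace("-", "_")


print(sql_safe_name("MED-RT"))  # MED_RT
```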


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
