Hi Melvin,

To see how the sno_rx_16ab dictionary was created, please see:
https://cwiki.apache.org/confluence/display/CTAKES/Dictionary+Creator+GUI
That page also explains how to build a dictionary that contains terms like "teary eyes", which comes from the MDR vocabulary and not from SNOMED or RxNorm. It is possible that I didn't include MDR in my local UMLS RRF creation. The GUI uses code that is in the ctakes-gui project.

You can blacklist terms like "probably" and "yes" (which is "yes - presence finding" in the UMLS). See this recent thread:
http://mail-archives.apache.org/mod_mbox/ctakes-dev/201710.mbox/%[email protected]%3E

Comma-delimited semantic groups have been implemented.

Sean

-----Original Message-----
From: Melvin Ma [mailto:[email protected]]
Sent: Thursday, November 02, 2017 12:46 PM
To: [email protected]
Subject: fast dictionary - [EXTERNAL]

I have recently been studying the fast dictionary code and behavior. I wonder how sno_rx_16ab.script was originally constructed. I have not seen any code that converts UMLS data into the "sno_rx_16ab.script" file (obviously I could have missed something). Is it simply a copy of the UMLS tables?

Specifically, I am puzzled by the following:

1> "Probably" was recognized as a symptom by the default clinical pipeline. I can see the following line in the sno_rx_16ab.script file:
INSERT INTO CUI_TERMS VALUES(332148,0,1,'probably','probably')
I am guessing we should somehow eliminate it (not sure about that).
The same applies to "Yes":
INSERT INTO CUI_TERMS VALUES(1298907,0,1,'yes','yes')

2> "teary eyes" is not captured. If I search in the UMLS browser, I do get a result:
Teary eyes [A25737508/MDR/LLT/10043172]
Not sure why it is not included in the fast dictionary.

Thank you very much!
Melvin
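
As a rough illustration of what removing terms such as "probably" from the CUI_TERMS rows quoted above would amount to, here is a minimal Python sketch. It is not how the Dictionary Creator GUI or cTAKES applies a blacklist. The function name `drop_blacklisted` is hypothetical, and the row layout (three integers followed by two quoted strings, with the first string being the term text) is an assumption inferred only from the two rows in this thread.

```python
import re

# Assumed row shape, inferred from the two rows quoted in this thread:
# INSERT INTO CUI_TERMS VALUES(332148,0,1,'probably','probably')
ROW = re.compile(
    r"^INSERT INTO CUI_TERMS VALUES\("
    r"(\d+),(\d+),(\d+),"
    r"'((?:[^']|'')*)','((?:[^']|'')*)'\)\s*$"
)


def drop_blacklisted(lines, blacklist):
    """Split lines into (kept, dropped).

    A line is dropped only when it matches the assumed CUI_TERMS row shape
    and its first quoted field (assumed to be the term text) is blacklisted.
    Every other line is kept unchanged.
    """
    unwanted = {term.lower() for term in blacklist}
    kept, dropped = [], []
    for line in lines:
        match = ROW.match(line)
        # SQL escapes a single quote inside a string by doubling it.
        if match and match.group(4).replace("''", "'").lower() in unwanted:
            dropped.append(line)
        else:
            kept.append(line)
    return kept, dropped


# The two rows quoted in the original message.
sample = [
    "INSERT INTO CUI_TERMS VALUES(332148,0,1,'probably','probably')",
    "INSERT INTO CUI_TERMS VALUES(1298907,0,1,'yes','yes')",
]
kept, dropped = drop_blacklisted(sample, ["probably"])
```

With only "probably" in the blacklist, the first row lands in `dropped` and the "yes" row stays in `kept`. Editing the generated script by hand like this is only a way to see the effect; rebuilding the dictionary with the GUI is the route described in the reply.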
