Thanks, Guergana and Sean. We've been using the dictionary tool to build an updated UMLS dictionary, but have only been seeing SNOMED and RxNorm concepts in our output. After receiving your message, we reviewed the dictionary tool.
It seems to us that to add a dictionary other than the defaults (SNOMED, RxNorm, and ICD), we would need to make significant changes, including some hard coding in a Java class. Before we go that route, we thought that we'd ask for a sanity check.

It appears that we would need to:

- Include new vocabularies in the dictionarytool's ConversionSources.txt--making it look more like the "optional" version instead of the "default" one (i.e., https://svn.apache.org/repos/asf/ctakes/sandbox/dictionary-gui/data/default/ConversionSources.txt). Easy enough.
- Add custom property keys for the desired dictionaries to the cTakesHsql.xml file. The default file currently has keys for SNOMED, RxNorm, ICD-9, and ICD-10. Also straightforward.
- Update the code in the class org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory. This class seems to be hard-coded to look for the SNOMED, RxNorm, etc. tags in cTakesHsql.xml (e.g., <property key="snomedTable" value="snomedct"/>). Then recompile the class. This is something that we'd rather avoid, of course.

Is that all that we would need to do? Is there a simpler way?

Regards,
Alan

On Tue, Dec 27, 2016 at 7:31 PM, Savova, Guergana <[email protected]> wrote:

> Hi Alan,
>
> There is a module for building a dictionary off any vocabulary. It was Sean Finan who wrote the code. Sean is out until Jan 3, I am sure he will get back to you when he comes back from the holidays. From what I remember, the code is straightforward to use.
>
> Happy Holidays!
>
> --Guergana
>
> Guergana Savova, PhD, FACMI
> Associate Professor
> PI Natural Language Processing Lab
> Boston Children's Hospital and Harvard Medical School
> 300 Longwood Avenue
> Mailstop: BCH3092
> Enders 144.1
> Boston, MA 02115
> Tel: (617) 919-2972
> Fax: (617) 730-0817
> [email protected]
> Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
> ctakes.apache.org
> thyme.healthnlp.org
> cancer.healthnlp.org
> share.healthnlp.org
>
> *From:* Alan Simmons [mailto:[email protected]]
> *Sent:* Tuesday, December 27, 2016 5:06 PM
> *To:* [email protected]
> *Subject:* expanding cTAKES to use concepts from vocabularies other than SNOMED and RXNorm
>
> Hi. I've been working with cTAKES for a few weeks now. I'm running the standard CPE from the command line and generating CAS files that include SNOMED and RxNorm concepts.
>
> I'd like to expand my annotation to include concepts from vocabularies other than SNOMED and RxNORM--specifically, terms from the NCI Thesaurus for cancer-specific terms that are not in SNOMED--e.g., "Stage IB non-small cell lung cancer" (UMLS CUI C1336139). What's the best way to accomplish this?
>
> Regards,
>
> Alan Simmons
>
> --
> J. Alan Simmons
> Solution Architect
> (c) +1.773.220.5018
>
> This email and any attachments may contain privileged and confidential information and/or protected health information (PHI) that is protected by federal and state privacy laws. It is intended solely for the use of Tempus Labs and the recipient(s) named above. Nothing contained in this communication and any attachments thereto is intended to waive any privileges or rights of confidentiality.
> If you are not the recipient, or the employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any review, dissemination, distribution, printing or copying of this email message and/or any attachments is strictly prohibited. * If you have received this transmission in error, please notify us immediately at **(877)-654-5544** and permanently delete this email and any attachments*.

--
J. Alan Simmons
Solution Architect
(c) +1.773.220.5018
