Hi Masoud,

The page at https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+-+Fast+Dictionary+Lookup states: "A paper on rare word indexing is currently in progress."
Maybe Sean or Tim will be able to provide more information on this.

Regards,
Gandhi

-----Original Message-----
From: Masoud Rouhizadeh [mailto:m...@jhu.edu]
Sent: Thursday, February 22, 2018 9:57 PM
To: dev@ctakes.apache.org
Subject: Fast UMLS dictionary lookup description

Hello cTAKES development team,

We are using and comparing various NLP tools (including cTAKES) to process over 5 million clinical notes within Johns Hopkins Medical Institutes. As part of our comparisons, we are exploring the architecture of the NER and (UMLS) concept linking components of the tools. I was able to find the description of the cTAKES default/original dictionary lookup in the Savova et al. 2010 paper, but I have not yet been able to find a paper or tech report describing the fast UMLS dictionary lookup (Fast UMLS Processor). Any description of the fast dictionary lookup algorithm would be highly appreciated.

Thank you,
Masoud Rouhizadeh

----
Masoud Rouhizadeh, PhD
NLP Specialist / Software Engineer
Institute for Clinical and Translational Research
Center for Clinical Data Analysis
School of Medicine, Johns Hopkins University
http://pages.jh.edu/~mrouhiz1