Yes, you are correct. cTAKES does named entity recognition and normalization, i.e., mapping to an ontology (through the UMLS). The normalization part is what differs from what is usually done in the general domain, where mentions of several semantic types are discovered but not necessarily normalized to a concept within an ontology. In the general domain, there is a recent trend to normalize to Wikipedia (wikification).
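To make the distinction between recognition and normalization concrete, here is a minimal sketch in Python. It is not how cTAKES implements its lookup. The dictionary, the `TOY:` identifiers, and the function name `find_concepts` are all made up for illustration; in a real setup the terms and concept identifiers would come from the UMLS.

```python
import re

# Toy dictionary: surface form -> (concept ID, semantic type).
# The IDs are made-up placeholders, NOT real UMLS identifiers.
TOY_ONTOLOGY = {
    "heart attack": ("TOY:0001", "Disorder"),
    "myocardial infarction": ("TOY:0001", "Disorder"),
    "aspirin": ("TOY:0002", "Medication"),
}


def find_concepts(text):
    """Return sorted (start, end, mention, concept_id, semantic_type) tuples.

    Finding the span is the recognition step; attaching the concept ID
    is the normalization step.
    """
    hits = []
    for term, (concept_id, sem_type) in TOY_ONTOLOGY.items():
        # Whole-word, case-insensitive match of the dictionary term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((m.start(), m.end(), m.group(0), concept_id, sem_type))
    return sorted(hits)


if __name__ == "__main__":
    note = "Patient had a Heart Attack; given aspirin."
    for hit in find_concepts(note):
        print(hit)
```

Note that "heart attack" and "myocardial infarction" are different mentions that map to the same placeholder concept. That mapping is what normalization adds on top of plain NER.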
In short, to do the NER in cTAKES you do need a license for the UMLS. BTW, that license is free for level 0 vocabularies.

Hope this information helps.
--Guergana

-----Original Message-----
From: Damir Olejar [mailto:[email protected]]
Sent: Friday, May 22, 2015 7:51 AM
To: [email protected]
Subject: Re: Exploiting the power of cTakes, using OpenNLP only

To answer my own question, it all comes down to UMLS licensing and to which files are downloaded from the server. The downloaded files are compressed *.model files that can be integrated with cTAKES. However, there is (or might be in the near future) a restriction on which users can download which files, and there might also be a copyright issue if the UMLS procedure is not followed.

So, yes, there is no need for UIMA, but for any serious work the copyrights need to be respected.

On Thu, May 21, 2015 at 12:10 PM, Damir Olejar <[email protected]> wrote:
> To whom it may concern,
>
> First, I would like to apologize if my question is vague, since I am new and unaccustomed to the cTAKES terminology. To keep my question simple and to the point, let us assume that I am working only with Apache OpenNLP. I do not have any UIMA-specific JAR files included, and let us assume that I do not want to include any of them (or want to keep them to a minimum), thus keeping the project confined to OpenNLP as much as possible.
>
> As far as I know, UIMA is just a framework that does not provide any specific NLP tools (source:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__stackoverflow.com_questions_24186742_is-2Duima-2Dprovides-2Donly-2Da-2Dwrapper-2Dor-2Dis-2Dit-2Dlike-2Dstandfordcore-2Dnlp-2Dand-2Dgate&d=BQIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=SeLHlpmrGNnJ9mI2WCgf_wwQk9zL4aIrVmfBoSi-j0kfEcrO4yRGmRCJNAr-rCmP&m=umFvmAvfVN2FIHuugFp5H33UdNyy-mxG3U3yDPRMp9I&s=uM0wOUdg63NBJRXD3JRZeU0fx-jT8ide6bcZdx_-WY8&e=
> ).
> This means that there should be a way of integrating the cTAKES components with OpenNLP.
>
> What I would like to do is simply apply Named Entity Recognition (NER) to a text, so that I know which words in an input sentence are medical terms. The perfect option would be a *.bin file such as "en-ner-person.bin", but I think that cTAKES does not give us such an option, since there are no *.bin files.
>
> How would I accomplish such a task? Is there any code, examples, tutorials, documentation, pseudo-code, ideas, … to take a look at?
>
> Thank you kindly for your time, understanding, and patience.
>
> Damir
