Last time I looked at that code, adding a new property such as "genesTuis" required code changes. The expected property names are hard-coded in UmlsToSnomedConsumerImpl.java and are used in if-else-if statements. It would be great to have that generalized someday, but for now the options I know of are to modify that class or to repurpose one of the defined groups.
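
To make the "generalized someday" idea concrete, here is a minimal sketch of a table-driven lookup that could stand in for an if-else-if chain over property names. It is not the actual cTAKES code: the class TuiGroupSketch, its methods, and the convention that any key ending in "Tuis" names a group are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not part of cTAKES.
public class TuiGroupSketch {
    // Assumed convention, modeled on keys such as "procedureTuis".
    private static final String SUFFIX = "Tuis";

    // Collect every "<group>Tuis" property into a group -> TUI set map,
    // so a new group needs only a new property and no new branch.
    public static Map<String, Set<String>> loadGroups(Properties props) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        // Sort the keys so the iteration order is deterministic.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (!key.endsWith(SUFFIX)) {
                continue; // not a TUI list, ignore it
            }
            Set<String> tuis = new TreeSet<>();
            for (String tui : props.getProperty(key).split(",")) {
                String trimmed = tui.trim();
                if (!trimmed.isEmpty()) {
                    tuis.add(trimmed);
                }
            }
            String group = key.substring(0, key.length() - SUFFIX.length());
            groups.put(group, tuis);
        }
        return groups;
    }

    // Return the first group (in sorted key order) that lists the given TUI.
    public static Optional<String> groupFor(Map<String, Set<String>> groups, String tui) {
        return groups.entrySet().stream()
                .filter(e -> e.getValue().contains(tui))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("procedureTuis", "T059,T060,T061");
        props.setProperty("genesTuis", "T116"); // the new group from this thread
        Map<String, Set<String>> groups = loadGroups(props);
        System.out.println(groupFor(groups, "T116").orElse("unmapped"));
    }
}
```

With a lookup like this, mapping a group name to an annotation type would still need its own table; the sketch only covers the property-name side of the problem.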
- James

From: Miller, Timothy [mailto:[email protected]]
Sent: Friday, March 21, 2014 8:52 AM
To: [email protected]
Subject: RE: Hi..

cTAKES by default gets the semantic groups/types you mentioned (diseases/disorders, signs/symptoms, and a few others that are clinically most relevant). Proteins, genes, etc. can, I think, be tagged by editing the dictionary descriptor file, where you can specify the UMLS Type IDs (TUIs) that you want included. If you look in:

/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml

you will find lines like this:

    <property key="anatomicalSiteTuis" value="T021,T022,T023,T024,T025,T026,T029,T030"/>
    <property key="procedureTuis" value="T059,T060,T061"/>

which specify the TUIs cTAKES will find. If you add the TUI T116 for "Amino Acid, Peptide, or Protein", I think you should get immunoglobulins. You can use the UTS to find the other TUIs you need.

One issue remains, which hopefully someone else can address: if her TUIs don't fit into one of the standard cTAKES semantic types (anatomical sites/procedure/ss/dd/drug), how does she go about adding another and getting the type right? This is probably something we need to be aware of as genetic information becomes more prevalent in clinical notes.

Tim

________________________________
From: Prasanna Bala [[email protected]]
Sent: Friday, March 21, 2014 9:36 AM
To: [email protected]<mailto:[email protected]>
Subject: Hi..

Hi,

I need some clarifications. I am able to run the tagger using UMLS to identify diseases, drugs, and signs/symptoms, but I am afraid I am not able to tag a lot of good information. Let me explain with an example. I tried documents that contain clinical tests with a lot of medical entities in them, but I am not able to identify entities such as IMMUNOGLOBULIN and TRANSGLUTAMINASE, which should be tagged as a protein or at least as a chemical class. I tried the same document with other biomedical text mining libraries such as ABNER, and there I am able to tag them as proteins.

Am I missing something here? Can you please suggest some modules for finding genes, proteins, and organisms? What are the limitations of cTAKES? Looking forward to someone clarifying this.

Regards,
Prasanna.
