By default, cTAKES picks up the semantic groups/types you mentioned
(diseases/disorders, signs/symptoms) plus a few others that are most clinically
relevant. I think proteins, genes, etc. can be tagged by editing the dictionary
descriptor file, where you specify the UMLS semantic type identifiers (TUIs,
Type Unique Identifiers) that you want included.
If you look in:
/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml
you will find lines like this:
<property key="anatomicalSiteTuis"
value="T021,T022,T023,T024,T025,T026,T029,T030"/>
<property key="procedureTuis" value="T059,T060,T061"/>
which specify the TUIs cTAKES will find. If you add the TUI T116 for "Amino
Acid, Peptide, or Protein", I think you should get immunoglobulins. You can use
the UTS (UMLS Terminology Services) to find the other TUIs you need.
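As a rough illustration (not part of cTAKES), the sketch below uses Python's
standard library to list the "...Tuis" properties in a descriptor and to show
what one list would look like with T116 appended. It assumes the file keeps the
<property key="...Tuis" value="..."/> shape quoted above. The function names
and the inline sample document are hypothetical, and which property T116
actually belongs under is not settled here.

```python
import xml.etree.ElementTree as ET


def tui_properties(xml_text):
    """Map each property key ending in 'Tuis' to its list of TUIs."""
    root = ET.fromstring(xml_text)
    found = {}
    for el in root.iter():
        # Drop any XML namespace prefix so '{ns}property' also matches.
        if el.tag.split("}")[-1] != "property":
            continue
        key = el.get("key", "")
        if key.endswith("Tuis"):
            value = el.get("value", "")
            found[key] = [t.strip() for t in value.split(",") if t.strip()]
    return found


def with_tui(tuis, tui):
    """Return a copy of the list with `tui` appended if it is missing."""
    return list(tuis) if tui in tuis else list(tuis) + [tui]


# Hypothetical stand-in for the real descriptor: only the two lines quoted
# in this message, wrapped in a made-up root element.
SAMPLE = """<properties>
  <property key="anatomicalSiteTuis"
            value="T021,T022,T023,T024,T025,T026,T029,T030"/>
  <property key="procedureTuis" value="T059,T060,T061"/>
</properties>"""

props = tui_properties(SAMPLE)
for key, tuis in props.items():
    print(key, ",".join(tuis))

# Purely illustrative: this only builds the new value string; it does not
# imply that T116 should live under procedureTuis.
print(",".join(with_tui(props["procedureTuis"], "T116")))
```

To try this against the real file, read LookupDesc_Db.xml into a string and
pass it to tui_properties() instead of SAMPLE. The script only reads and
prints, so the descriptor itself is never modified.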
One issue remains, which hopefully someone else can address: if her TUIs
don't fit into one of the standard cTAKES semantic types (anatomical site,
procedure, sign/symptom, disease/disorder, drug), how does she go about adding
another type and getting the type right? This is probably something we need to
be aware of as genetic information becomes more prevalent in clinical notes.
Tim
________________________________
From: Prasanna Bala [[email protected]]
Sent: Friday, March 21, 2014 9:36 AM
To: [email protected]
Subject: Hi..
Hi,
I have a few questions. I am able to run the tagger using UMLS to identify
diseases, drugs, and signs/symptoms, but I am afraid I am missing a lot of
useful information. For example, I tried documents containing clinical tests
with many medical entities in them, but I could not identify entities such as
IMMUNOGLOBULIN and TRANSGLUTAMINASE, which should be tagged as proteins or at
least as a chemical class. I ran the same documents through other biomedical
text mining libraries such as ABNER and was able to tag them as proteins. Am I
missing something here? Can you please suggest some modules for finding genes,
proteins, and organisms? What are the limitations of cTAKES? I look forward to
any clarification.
Regards,
Prasanna.