Richard, org.apache.ctakes.assertion.medfacts.types.Concept is an internal type used by the assertion module. Could you see what is returned in *org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation*?
On Fri, Apr 4, 2014 at 3:56 PM, Lee, Richard A. [USA] <[email protected]> wrote:

> I ran several documents through cTAKES, using
> AggregatePlaintextUMLSProcessor, and examined the list of
> org.apache.ctakes.assertion.medfacts.types.Concept annotations produced for
> each. From those results, I made up a list of phrases I had hoped cTAKES
> would annotate but did not. I used MetaMap to look up each of those
> phrases, and found that approximately 150 of them resulted in a full-phrase
> match and a corresponding CUI.
>
> I used the MetamorphoSys scripts to load the UMLS RRF data set into a SQL
> DB, and queried the DB to confirm that those ~150 phrases were indeed
> present with the expected CUIs. So, the question becomes, why didn't cTAKES
> annotate them?
>
> Looking at the cTAKES logs, it appears the OrangeBookFilter "Filtered out"
> only 5 out of the 150.
>
> The other possible cause I could think of was the TUI filtering; there was
> no evidence of it in the logs, but I don't know whether the results of
> filtering in that step get logged by default or not. I looked up in the DB
> the TUIs for each of the phrases, compared them to the lists of "allowed"
> TUIs in LookupDesc_Db.xml, and concluded that the TUI filtering might
> account for 44 of the phrases. So the rest remain a mystery.
>
> I modified the TUI lists in LookupDesc_Db.xml to add TUIs, in the hopes
> that that would cause the corresponding phrases to be annotated.
> Specifically, I added T058 to one list, and added a second list with a
> handful of TUIs:
>
> <property key="procedureTuis" value="T058,T059,T060,T061"/>
>
> <property key="chemicalanddrugTuis" value="T109,T110,T116,T121,T123"/>
>
> T058 corresponded to 3 of the phrases on my list; T121 alone accounted for
> 24 of them. But, upon restarting cTAKES with that modified file, and
> running relevant documents, I found that the expected phrases were still
> not annotated. I even tried making the same change in LookupDesc.xml just
> in case, to no avail.
>
> So, the questions are:
>
> - Are there reasons beyond the OrangeBook and TUI filters why
> CUI-associated phrases in UMLS would not get annotated?
>
> - Do TUI-filter results get logged by default, and if not, is there a way
> (log4j settings?) to log them without making code changes?
>
> - Am I doing the TUI filter changes wrong?
>
> Thanks for any answers and advice.
>
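
For reference, the manual comparison described in the quoted message (each phrase's TUIs checked against the "allowed" lists) could be scripted roughly as below. This is only a sketch under stated assumptions, not cTAKES code: the class `TuiFilterCheck` and both of its methods are hypothetical helpers, the phrases are placeholders, and it assumes a phrase passes the filter when at least one of its TUIs is on an allowed list. The two TUI lists are the property values from the message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of cTAKES.
public class TuiFilterCheck {

    /** Splits a comma-separated TUI list such as "T058,T059" into a set. */
    static Set<String> parseTuis(String csv) {
        Set<String> tuis = new TreeSet<>();
        for (String t : csv.split(",")) {
            String trimmed = t.trim();
            if (!trimmed.isEmpty()) {
                tuis.add(trimmed);
            }
        }
        return tuis;
    }

    /**
     * Returns the phrases for which none of the TUIs is in the allowed set.
     * Assumption: one allowed TUI is enough for a phrase to pass.
     */
    static List<String> phrasesWithoutAllowedTui(
            Map<String, Set<String>> phraseTuis, Set<String> allowed) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : phraseTuis.entrySet()) {
            if (Collections.disjoint(e.getValue(), allowed)) {
                missing.add(e.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Property values copied from the quoted message.
        Set<String> allowed = new TreeSet<>();
        allowed.addAll(parseTuis("T058,T059,T060,T061"));
        allowed.addAll(parseTuis("T109,T110,T116,T121,T123"));

        // Placeholder phrases; real ones would come from the UMLS DB query.
        Map<String, Set<String>> phraseTuis = new LinkedHashMap<>();
        phraseTuis.put("placeholder phrase A", parseTuis("T121"));
        phraseTuis.put("placeholder phrase B", parseTuis("T033"));

        System.out.println(phrasesWithoutAllowedTui(phraseTuis, allowed));
    }
}
```

Any phrase this prints would be a candidate for TUI filtering. Phrases it does not print have an allowed TUI, so for those the cause would have to be something other than the TUI lists.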
