Hi, I am using the cTakes 3.2.0 code base and I have been trying to figure out what would be the proper way to get cTakes to recognize and annotate mentions of medical devices.
I am using AggregatePlaintextUMLSProcessor.xml because one of the main requirements is that each annotation needs to include subject, polarity, certainty, etc.

First, I added TUI T074 to the procedureTuis in LookupDesc_Db.xml. The output from this provides mostly what I want, except that it lumps the devices in as ProcedureMention, whereas I would like them to be distinguished by their own annotation type. An example annotation created this way was:

<org.apache.ctakes.typesystem.type.textsem.ProcedureMention _indexed="1" _id="35574" _ref_sofa="3" begin="472" end="481" id="32" _ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT" discoveryTechnique="1" confidence="1.0" polarity="1" uncertainty="0" conditional="false" generic="false" subject="patient" historyOf="0"/>

I also tried adding code to classify devices as an EntityMention, and that seemed to work too:

<org.apache.ctakes.typesystem.type.textsem.EntityMention _indexed="1" _id="35574" _ref_sofa="3" begin="472" end="481" id="32" _ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT" discoveryTechnique="1" confidence="1.0" polarity="1" uncertainty="0" conditional="false" generic="false" subject="patient" historyOf="0"/>

Again, that doesn't give devices their own unique annotation, so I looked further. Exploring the type system, I noticed the following types in TypeSystem.xml:

org.apache.ctakes.typesystem.type.refsem.ProcedureDevice
org.apache.ctakes.typesystem.type.textsem.ProcedureDeviceModifier

These seemed like the closest defined types to what I would expect, so I thought I would see if using them would generate what I wanted.
I modified the code to generate these annotations, and the result was as follows:

<org.apache.ctakes.typesystem.type.textsem.ProcedureDeviceModifier _indexed="1" _id="35587" _ref_sofa="3" begin="472" end="481" id="32" typeID="0" segmentID="SIMPLE_SEGMENT" discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0" conditional="false" generic="false" subject="patient" historyOf="0" _ref_normalizedForm="35574"/>

<org.apache.ctakes.typesystem.type.refsem.ProcedureDevice _id="35574" id="0" _ref_ontologyConcept="35542" discoveryTechnique="1" confidence="0.0" conditional="false" generic="false" polarity="0" uncertainty="0" historyOf="0"/>

The problem with this approach was that confidence, polarity, and uncertainty did not get filled in. I tried adding these to the inputs and outputs of the AssertionMiniPipelineAnalysisEngine, but that didn't seem to have any effect. Perhaps I didn't do it right? Or maybe it isn't even the right pipeline component to modify?

Since ProcedureDevice and ProcedureDeviceModifier have different supertypes than ProcedureMention, I also tried creating a new type in TypeSystem.xml:

<typeDescription>
  <name>org.apache.ctakes.typesystem.type.textsem.DeviceMention</name>
  <supertypeName>org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation</supertypeName>
  <features>
    <featureDescription>
      <name>entity</name>
      <description/>
      <rangeTypeName>org.apache.ctakes.typesystem.type.refsem.Entity</rangeTypeName>
    </featureDescription>
  </features>
</typeDescription>

After modifying the code to create this type, the result was:

<org.apache.ctakes.typesystem.type.textsem.DeviceMention _indexed="1" _id="35574" _ref_sofa="3" begin="472" end="481" id="32" _ref_ontologyConceptArr="35569" typeID="8" segmentID="SIMPLE_SEGMENT" discoveryTechnique="1" confidence="0.0" polarity="0" uncertainty="0" conditional="false" generic="false" subject="patient" historyOf="0"/>

Again, the problem here is that confidence, polarity, and uncertainty are not filled in.
So, I am left wondering:

1) Which of the methods I tried would be the best "cTakes way"?
2) What do I need to modify to get confidence, polarity, and uncertainty to be filled in?

Thanks,
Bruce