RE: Combining Drug and Concept annotations

Masanz, James J. Tue, 05 Feb 2013 18:15:07 -0800

I'm attaching something you can try. 

I'm assuming you are using cTAKES 2.5.
1) Put the attached file in cTAKESdesc\cdpdesc\analysis_engine
2) Update 
cTAKESdesc\lookupdesc\analysis_engine\DictionaryLookupAnnotatorUMLS.xml with 
your UMLS username and password
3) Update the other copy of DictionaryLookupAnnotatorUMLS.xml, in 
cTAKESdesc\drugnerdesc\analysis_engine, with your UMLS username and password 
as well
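
Since steps 2 and 3 touch two files with the same name, a small helper can list every copy under the descriptor root so neither is missed. This is only a sketch: the function name find_descriptor_copies is hypothetical (not part of cTAKES), and the root folder you pass in is whatever your own install uses.

```python
from pathlib import Path


def find_descriptor_copies(root, filename="DictionaryLookupAnnotatorUMLS.xml"):
    """Return the sorted paths of every file named `filename` under `root`.

    Hypothetical helper for illustration; it only walks the directory tree
    and does not read or modify the descriptors.
    """
    return sorted(p for p in Path(root).rglob(filename) if p.is_file())
```

Calling find_descriptor_copies("cTAKESdesc") (path assumed) and printing the result should show one entry under lookupdesc and one under drugnerdesc if the layout matches the steps above.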


-- James Masanz
________________________________________
From: ctakes-dev-return-1136-Masanz.James=mayo....@incubator.apache.org 
[ctakes-dev-return-1136-Masanz.James=mayo....@incubator.apache.org] on behalf 
of shady hussein [[email protected]]
Sent: Monday, February 04, 2013 1:52 PM
To: [email protected]
Subject: Re: Combining Drug and Concept annotations

Hi Pei,
   Thanks for your reply. Yes, I meant that DrugAggregatePlaintextUMLSProcessor 
returns more concepts, or conversely that AggregatePlaintextUMLSProcessor returns 
the usual concepts + the MedicationEventMentions. I don't think it is hard to 
implement, as I think the dictionary lookup code won't change.

I tried to merge the drug lookup into the lookupDB and add the drug annotator to 
the normal pipeline, but of course things are not that simple :) I don't fully 
understand how the dictionary lookup works, otherwise I could do it. Maybe if 
you have some time, you can guide me a little and I can go from there.

Thanks,
Shady

On Feb 4, 2013, at 6:58 PM, "Chen, Pei" <[email protected]> wrote:

> Hi Shady,
> Just wanted to confirm:
> Did you mean that the DrugAggregatePlaintextUMLSProcessor identifies the 
> same drugs, but just with more attributes (e.g. dosage, frequency)?
> Or did you mean that the DrugAggregatePlaintextUMLSProcessor actually 
> returned more UMLSConcepts (MedicationEventMentions) than the regular 
> AggregatePlaintextUMLSProcessor?
>
> For the former, there is an outstanding Jira item to combine the two (reusing 
> the existing lookup entries rather than doing a second lookup): 
> https://issues.apache.org/jira/browse/CTAKES-20
>
>> -----Original Message-----
>> From: Shady Hussein [mailto:[email protected]]
>> Sent: Monday, February 04, 2013 5:47 AM
>> To: [email protected]
>> Subject: Combining Drug and Concept annotations
>>
>> Dear All,
>>   I discovered that cTAKES doesn't recognize all the medical entities as
>> concepts. There is a difference between using the normal UMLS dictionary in
>> "/cdpdesc/analysis_engine/AggregatePlaintextUMLSProcessor.xml" and
>> "/drugnerdesc/analysis_engine/DrugAggregatePlaintextUMLSProcessor.xml".
>> The latter can detect all the drugs, while the former can't.
>>
>> My question now is how to combine both of those dictionaries so I
>> can detect all the drugs and concepts mentioned in the text. I would be
>> grateful if somebody can help me :)
>>
>> --
>> Thanks and best Regards,
>>
>> Shady AbdelAziz

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="DrugMentionAnnotator">
      <import location="../../drugnerdesc/analysis_engine/DrugMentionAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="Chunker">
      <import location="../../chunkerdesc/analysis_engine/Chunker.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="TokenizerAnnotator">
      <import location="../../coredesc/analysis_engine/TokenizerAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="ContextDependentTokenizerAnnotator">
      <import location="../../cdtdesc/analysis_engine/ContextDependentTokenizerAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="DictionaryLookupAnnotatorDB">
      <import location="../../lookupdesc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="StatusAnnotator">
      <import location="../../necontextdesc/analysis_engine/StatusAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="NegationAnnotator">
      <import location="../../necontextdesc/analysis_engine/NegationAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="ExtractionPrepAnnotator">
      <import location="ExtractionPrepAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="SentenceDetectorAnnotator">
      <import location="../../coredesc/analysis_engine/SentenceDetectorAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="LookupWindowAnnotator">
      <import location="LookupWindowAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="AdjustNounPhraseToIncludeFollowingNP">
      <import location="../../chunkerdesc/analysis_engine/AdjustNounPhraseToIncludeFollowingNP.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="DrugLookupWindowAnnotator">
      <import location="../../drugnerdesc/analysis_engine/DrugLookupWindowAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="AdjustNounPhraseToIncludeFollowingPPNP">
      <import location="../../chunkerdesc/analysis_engine/AdjustNounPhraseToIncludeFollowingPPNP.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="SimpleSegmentAnnotator">
      <import location="SimpleSegmentAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="POSTagger">
      <import location="../../posdesc/analysis_engine/POSTagger.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="LvgAnnotator">
      <import location="../../lvgdesc/analysis_engine/LvgAnnotator.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="AssertionAnnotator">
      <import location="../../assertiondesc/AssertionMiniPipelineAnalysisEngine.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="DependencyParser">
      <import location="../../dpdesc/analysis_engine/ClearParserDependencyParserAE.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>AggregatePlaintextUMLSProcessorPlusDrugNER</name>
    <description>Runs the complete pipeline for annotating clinical documents in plain text format 
using the built-in UMLS (SNOMEDCT and RxNORM) dictionaries, including the drug NER components that add attributes such as dosage, frequency, etc. 
This uses the DictionaryLookupAnnotatorUMLS.xml in both 
lookupdesc\analysis_engine\ 
and drugnerdesc\analysis_engine\ 
and requires a UMLS license. 
Please update *both* DictionaryLookupAnnotatorUMLS.xml files with your UMLS username and password.
</description>
    <version/>
    <vendor/>
    <configurationParameters searchStrategy="language_fallback">
      <configurationParameter>
        <name>SegmentID</name>
        <description/>
        <type>String</type>
        <multiValued>false</multiValued>
        <mandatory>false</mandatory>
        <overrides>
          <parameter>SimpleSegmentAnnotator/SegmentID</parameter>
        </overrides>
      </configurationParameter>
      <configurationParameter>
        <name>ChunkCreatorClass</name>
        <type>String</type>
        <multiValued>false</multiValued>
        <mandatory>true</mandatory>
        <overrides>
          <parameter>Chunker/ChunkCreatorClass</parameter>
        </overrides>
      </configurationParameter>
    </configurationParameters>
    <configurationParameterSettings>
      <nameValuePair>
        <name>ChunkCreatorClass</name>
        <value>
          <string>edu.mayo.bmi.uima.chunker.PhraseTypeChunkCreator</string>
        </value>
      </nameValuePair>
    </configurationParameterSettings>
    <flowConstraints>
      <fixedFlow>
        <node>SimpleSegmentAnnotator</node>
        <node>SentenceDetectorAnnotator</node>
        <node>TokenizerAnnotator</node>
        <node>LvgAnnotator</node>
        <node>ContextDependentTokenizerAnnotator</node>
        <node>POSTagger</node>
        <node>Chunker</node>
        <node>AdjustNounPhraseToIncludeFollowingNP</node>
        <node>AdjustNounPhraseToIncludeFollowingPPNP</node>
        <node>LookupWindowAnnotator</node>
        <node>DrugLookupWindowAnnotator</node>
        <node>DictionaryLookupAnnotatorDB</node>
        <node>DrugMentionAnnotator</node>
        <node>DependencyParser</node>
        <node>AssertionAnnotator</node>
        <!-- 
        	<node>StatusAnnotator</node>
        	<node>NegationAnnotator</node>
         -->
        <node>ExtractionPrepAnnotator</node>
      </fixedFlow>
    </flowConstraints>
    <typePriorities>
      <name>Ordering</name>
      <description>For subiterator</description>
      <version>1.0</version>
      <priorityList>
        <type>edu.mayo.bmi.uima.core.type.textspan.Segment</type>
        <type>edu.mayo.bmi.uima.core.type.textspan.Sentence</type>
        <type>edu.mayo.bmi.uima.core.type.syntax.BaseToken</type>
      </priorityList>
      <priorityList>
        <type>edu.mayo.bmi.uima.core.type.textspan.Sentence</type>
        <type>edu.mayo.bmi.uima.core.type.textsem.IdentifiedAnnotation</type>
      </priorityList>
    </typePriorities>
    <fsIndexCollection/>
    <capabilities>
      <capability>
        <inputs/>
        <outputs>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.NewlineToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.IdentifiedAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.WordToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.VP</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.refsem.UmlsConcept</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.UCP</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.TimeAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.SymbolToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textspan.Sentence</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textspan.Segment</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.SBAR</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.RomanNumeralAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.RangeAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.PunctuationToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.Property</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.Properties</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.PersonTitleAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.PRT</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.PP</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.OntologyConcept</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.NumToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.MeasurementAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.lookup.type.LookupWindowAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.Lemma</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.LST</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.INTJ</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.FractionAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.structured.DocumentID</type>
          <type allAnnotatorFeatures="true">uima.tcas.DocumentAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.DateAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.CopySrcAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.CopyDestAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.ContractionToken</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.textsem.ContextAnnotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.Chunk</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.CONJP</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.BaseToken</type>
          <type allAnnotatorFeatures="true">uima.cas.AnnotationBase</type>
          <type allAnnotatorFeatures="true">uima.tcas.Annotation</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.ADVP</type>
          <type allAnnotatorFeatures="true">edu.mayo.bmi.uima.core.type.syntax.ADJP</type>        
        </outputs>
        <languagesSupported/>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
  <resourceManagerConfiguration/>
</analysisEngineDescription>
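
When editing an aggregate descriptor like the one above by hand, it is easy to add a node to the flow without declaring its delegate, or the reverse. The following is a minimal sketch of a sanity check using only the Python standard library; check_flow is a hypothetical helper name, not part of cTAKES or UIMA, and it assumes the descriptor uses a fixedFlow and the UIMA resourceSpecifier namespace.

```python
import xml.etree.ElementTree as ET

# Namespace that UIMA descriptors declare on their root element.
NS = {"u": "http://uima.apache.org/resourceSpecifier"}


def check_flow(descriptor_xml):
    """Compare delegate keys with fixedFlow nodes in an aggregate descriptor.

    Returns (missing, unused): flow nodes that have no delegate declaration,
    and declared delegates that never appear in the flow. Nodes inside XML
    comments are ignored because the default parser drops comments.
    """
    root = ET.fromstring(descriptor_xml)
    keys = {d.get("key") for d in root.iterfind(".//u:delegateAnalysisEngine", NS)}
    nodes = [(n.text or "").strip() for n in root.iterfind(".//u:fixedFlow/u:node", NS)]
    missing = [n for n in nodes if n not in keys]
    unused = sorted(keys - set(nodes))
    return missing, unused
```

Run against a well-formed copy of the descriptor above, this should report no missing delegates and list StatusAnnotator and NegationAnnotator as unused, since both are declared but commented out of the flow.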
