Hi Sean,

Thanks for responding. Yes, I am using the xml descriptor in ctakes-dictionary-lookup-fast. I tried the fix you mentioned, but it changed nothing.

To answer your second question: I have built the pipeline in Java using xml descriptors, and I am testing it in Java. Initially I was using DefaultJCasTermAnnotator with the resource (dictionary) configuration file resources/.../dictionary/lookup/fast/sno_rx_16ab.xml, as follows:

    AggregateBuilder builder = new AggregateBuilder();
    ...
    // other pipeline components
    ...
    AnalysisEngineDescription dictionarylookup_desc =
        AnalysisEngineFactory.createEngineDescription(
            DefaultJCasTermAnnotator.class,
            AbstractJCasTermAnnotator.PARAM_WINDOW_ANNOT_KEY,
            "org.apache.ctakes.typesystem.type.textspan.Sentence",
            JCasTermAnnotator.DICTIONARY_DESCRIPTOR_KEY,
            "org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml");
    builder.add(dictionarylookup_desc);
    ...
    // other pipeline components
    ...
    builder.createAggregateDescription();

This was working fine for plaintext, but it caused the mentioned problem when I processed the CDA document after adding the CdaCasInitializer annotator to the pipeline. So I changed it to the UmlsOverlapLookupAnnotator descriptor:

    builder.add(
        AnalysisEngineFactory.createEngineDescription(
            "desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator"));

BTW, I experimented with AggregateCdaUMLSProcessor.xml (ctakes-clinical-pipeline) and changed DictionaryLookupAnnotatorUmls (the original dictionary-lookup) to UmlsLookupAnnotator. This also doesn't give the mentioned annotations, which gave me the hint that the problem may be with UmlsLookupAnnotator. What do you suggest?

> The java implementation pointed to in that descriptor, DefaultJCasTermAnnotator does provide the various semantically-distinct annotation types that you mention.

I am a little confused here. Do I have to enable the annotation types in the DefaultJCasTermAnnotator class, or something like that? I didn't have to do anything like that for plaintext. Can you please elaborate on this point too? I am still quite new to cTAKES, so I might not be getting it right.

PS: I am not getting any error or warning related to this in the compilation logs.

Warm Regards
Sana Riaz

On Thu, Jan 10, 2019 at 8:25 PM Finan, Sean <[email protected]> wrote:

> Hi Sana,
>
> When you say:
> >i want to use dictionary_lookup "UmlsLookupAnnotator"
> are you talking about the xml descriptor in ctakes-dictionary-lookup-fast? If so that is great.
>
> >The problem is that identifiedAnnotation given by UmlsLookupAnnotator does not include Sign/Symptoms, Disease/Disorder or Procedure Mentions etc.
>
> How are you testing this? The java implementation pointed to in that descriptor, DefaultJCasTermAnnotator does provide the various semantically-distinct annotation types that you mention. I use it every day without problem*. Are you seeing any errors at the top of the log?
>
> I just looked at the descriptor UmlsLookupAnnotator.xml and it may have a problem:
>
>   <annotatorImplementationName>org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator
>   </annotatorImplementationName>
>
> Notice that the end tag </annotatorImplementationName> is on a second line in the file. I have seen this cause problems in uima/ctakes. I think that the xml parser assumes that whitespace is part of the information - which in this case is not true.
>
> Try putting the end tag on the same line and running again.
>
> * I never use xml descriptors anymore. I use piper files. So, even though I use that implementation every day I do not load it in the same manner.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> Please try the fix I mention and let me know what happens.
>
> Sean
>
> ________________________________________
> From: Sana Riaz <[email protected]>
> Sent: Thursday, January 10, 2019 6:13 AM
> To: [email protected]
> Subject: UmlsLookupAnnotator.xml does not give sign/symptom, disease/disorder in identifiedAnnotation for CDA documents [EXTERNAL]
>
> Hi, I am doing NLP research on some clinical documents using cTAKES and I am a little stuck at a point.
> I have created a pipeline in java similar to "AggregateCdaUMLSProcessor" in which i want to use dictionary_lookup "UmlsLookupAnnotator" instead of "DictionaryLookupAnnotatorUmls".
> The problem is that identifiedAnnotation given by UmlsLookupAnnotator does not include Sign/Symptoms, Disease/Disorder or Procedure Mentions etc. I have done required sofa Mapping in java. The pipeline works fine for plaintext document but doesn't give the above mentioned annotations for CDA. I have tested the CDA document using AggregateCdaUMLSProcessor.xml descriptor and it gives the above mentioned annotations (except MedicationMention which I also need). Can you give me any suggestion about what can I try or what is wrong?
> Looking forward to hearing from you.
>
> Warm Regards,
>
> Sana Riaz
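
For reference, the whitespace concern Sean raises about the end tag can be checked outside cTAKES with the JDK's own XML parser. The following is only a minimal sketch under stated assumptions: the class ImplementationNameWhitespace, its constant and its method are hypothetical names made up for this illustration, and the XML fragment is a cut-down stand-in for the real descriptor. It shows what a standard DOM parser returns for such an element; it does not show how UIMA itself reads descriptors or whether UIMA trims the value before loading the class.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration only; not part of cTAKES or UIMA.
public class ImplementationNameWhitespace {

    // Cut-down stand-in for the descriptor element: the end tag is on its own line.
    public static final String SPLIT_TAG_XML =
        "<annotatorImplementationName>"
        + "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator\n"
        + "</annotatorImplementationName>";

    // Parses the fragment with the JDK DOM parser and returns the element text untrimmed.
    public static String readImplementationName(String xml) throws Exception {
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = parser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String raw = readImplementationName(SPLIT_TAG_XML);
        // The line break before the end tag is kept as part of the element text.
        System.out.println("ends with newline: " + raw.endsWith("\n"));
        // Trimming recovers a value that could be used as a class name.
        System.out.println("trimmed value: " + raw.trim());
    }
}
```

If the untrimmed value were handed directly to a class loader, the trailing line break would be part of the name, which is why moving the end tag onto the same line is a reasonable first thing to rule out.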
