Hi Sana,

>This was working fine for plaintext but caused the mentioned problem when I processed the CDA document after adding CdaCasInitializer annotator to the pipeline.

Ok, so there is a very good chance that the problem is not the dictionary lookup module.

>Do I have to enable the annotation types in the DefaultJCasTermAnnotator class or something like that?

You don't need to enable type production. By default the -fast lookup will create the following annotations based upon the Semantic TUI of the discovered concepts:

SignSymptomMention, ProcedureMention, DiseaseDisorderMention, MedicationMention, LabMention, AnatomicalSiteMention
EntityMention (for unknown semantic type)

See https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/src/main/java/org/apache/ctakes/dictionary/lookup2/consumer/DefaultTermConsumer.java

In summary, the various types will be produced for discovered concepts. If types are not produced, it means that concepts were not discovered. There are various reasons for a lack of discovery: empty dictionary, no matching synonyms in the dictionary, missing segment, sentence, parts of speech or basetokens ...

________________________________________
From: Sana Riaz <[email protected]>
Sent: Thursday, January 10, 2019 3:29 PM
To: [email protected]
Subject: Re: UmlsLookupAnnotator.xml does not give sign/symptom, disease/disorder in identifiedAnnotation for CDA documents [EXTERNAL]

Hi Sean

Thanks for responding. Yes, I am using the xml descriptor in ctakes-dictionary-lookup-fast. I tried the fix you mentioned but it changed nothing.

To answer your second question, I have built the pipeline using xml descriptors in java and am testing in java. Initially I was using DefaultJCasTermAnnotator with the resource (dictionary) configuration file resources/.../dictionary/lookup/fast/sno_rx_16ab.xml as follows:

AggregateBuilder builder = new AggregateBuilder();
> ...
> //other pipeline components//
> ...
> AnalysisEngineDescription dictionarylookup_desc =
> AnalysisEngineFactory.createEngineDescription(
> DefaultJCasTermAnnotator.class,
> AbstractJCasTermAnnotator.PARAM_WINDOW_ANNOT_KEY,
> "org.apache.ctakes.typesystem.type.textspan.Sentence",
> JCasTermAnnotator.DICTIONARY_DESCRIPTOR_KEY,
> "org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml");
> builder.add(dictionarylookup_desc);
> ...
> //other pipeline components//
> ...
> builder.createAggregateDescription();

This was working fine for plaintext but caused the mentioned problem when I processed the CDA document after adding the CdaCasInitializer annotator to the pipeline. So I changed it to the UmlsOverlapLookupAnnotator descriptor as

builder.add( AnalysisEngineFactory.createEngineDescription(
> "desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator")
> );

BTW, I experimented with AggregateCdaUMLSProcessor.xml (ctakes-clinical-pipeline) and changed the DictionaryLookupAnnotatorUmls (original dictionary-lookup) to UmlsLookupAnnotator. This also doesn't give the mentioned annotations. So this gave me the hint that the problem may be with UmlsLookupAnnotator. What do you suggest?

>The java implementation pointed to in that descriptor, DefaultJCasTermAnnotator does provide the various semantically-distinct annotation types that you mention.

I am a little confused there. Do I have to enable the annotation types in the DefaultJCasTermAnnotator class or something like that? I didn't have to do anything like that for the plaintext. Can you please elaborate on this point too? I am still really new to cTAKES, so I might not be getting it right.

PS I am not getting any error or warning related to this in the compiling logs.

Warm Regards
Sana Riaz

On Thu, Jan 10, 2019 at 8:25 PM Finan, Sean <[email protected]> wrote:

> Hi Sana,
>
> When you say:
> >i want to use dictionary_lookup "UmlsLookupAnnotator"
> are you talking about the xml descriptor in
> ctakes-dictionary-lookup-fast?
> If so that is great.
>
> >The problem is that identifiedAnnotation given by UmlsLookupAnnotator does
> not include Sign/Symptoms, Disease/Disorder or Procedure Mentions etc.
>
> How are you testing this? The java implementation pointed to in that
> descriptor, DefaultJCasTermAnnotator does provide the various
> semantically-distinct annotation types that you mention. I use it every
> day without problem*. Are you seeing any errors at the top of the log?
>
> I just looked at the descriptor UmlsLookupAnnotator.xml and it may have a
> problem:
>
> <annotatorImplementationName>org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator
> </annotatorImplementationName>
>
> Notice that the end tag </annotatorImplementationName> is on a second line
> in the file. I have seen this cause problems in uima/ctakes. I think that
> the xml parser assumes that whitespace is part of the information - which
> in this case is not true.
>
> Try putting the end tag on the same line and running again.
>
> * I never use xml descriptors anymore. I use piper files. So, even
> though I use that implementation every day I do not load it in the same
> manner.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> Please try the fix I mention and let me know what happens.
>
> Sean
>
>
>
> ________________________________________
> From: Sana Riaz <[email protected]>
> Sent: Thursday, January 10, 2019 6:13 AM
> To: [email protected]
> Subject: UmlsLookupAnnotator.xml does not give sign/symptom,
> disease/disorder in identifiedAnnotation for CDA documents [EXTERNAL]
>
> Hi, I am doing NLP research on some clinical documents using cTAKES and I
> am a little stuck at a point.
> I have created a pipeline in java similar to "AggregateCdaUMLSProcessor" in
> which I want to use dictionary_lookup "UmlsLookupAnnotator" instead of
> "DictionaryLookupAnnotatorUmls".
> The problem is that identifiedAnnotation given by UmlsLookupAnnotator does
> not include Sign/Symptoms, Disease/Disorder or Procedure Mentions etc. I
> have done the required sofa mapping in java. The pipeline works fine for
> plaintext documents but doesn't give the above mentioned annotations for
> CDA. I have tested the CDA document using the AggregateCdaUMLSProcessor.xml
> descriptor and it gives the above mentioned annotations (except
> MedicationMention, which I also need). Can you give me any suggestions about
> what I can try or what is wrong?
> Looking forward to hearing from you.
>
> Warm Regards,
>
> Sana Riaz
>
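
A note on the end-tag suggestion quoted above: the sketch below uses only the JDK's DOM parser to show why an end tag on its own line can matter. It illustrates general XML text handling and is not the UIMA descriptor parser itself. Whether UIMA trims this value is not established in this thread, and Sana reports that moving the end tag changed nothing in her case. The class name DescriptorWhitespaceDemo and the helper rawImplementationName are hypothetical names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: plain JDK XML parsing, no UIMA or cTAKES involved.
class DescriptorWhitespaceDemo {

    // Returns the untrimmed text of the first <annotatorImplementationName> element.
    static String rawImplementationName(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("annotatorImplementationName")
                  .item(0)
                  .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String className = "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator";

        // Same shape as the descriptor quoted in the thread: end tag on a second line.
        String xml = "<analysisEngineDescription>\n"
                + "  <annotatorImplementationName>" + className + "\n"
                + "  </annotatorImplementationName>\n"
                + "</analysisEngineDescription>";

        String raw = rawImplementationName(xml);

        // The newline and indentation before the end tag are part of the element text,
        // so the raw value is not equal to the class name until it is trimmed.
        System.out.println("raw equals class name:     " + raw.equals(className));
        System.out.println("trimmed equals class name: " + raw.trim().equals(className));
    }
}
```

With a generic parser the raw element text carries the trailing newline and indentation, so code that looks up a class by that exact string would need to trim it first. Keeping the end tag on the same line as the class name avoids the question entirely.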
