Hi Pei,
1- The attached "abstract" file is what I used as a sample.
2- The attached AggregatePlaintextUMLSProcessor file is the .xml configuration
(note: even when I removed the DependencyParser, SemanticRoleLabeler,
AssertionAnnotator and ExtractionPrepAnnotator, the performance did not
change).
Thank you very much for your help.
________________________________
From: "Chen, Pei" <[email protected]>
To: samir chabou <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Thursday, August 15, 2013 10:23:50 PM
Subject: RE: umls lookup issue
Hi Samir,
Do you have a sample sentence that causes the 3hr run?
Also, could you attach the AggregatePipeline.xml configuration used, in case
someone else on the dev list has already encountered this in the past?
I'll try and see if I can recreate it.
--Pei
________________________________
From: samir chabou [[email protected]]
Sent: Thursday, August 15, 2013 7:07 PM
To: Chen, Pei
Subject: Re: umls lookup issue
Hi Pei,
We did more debugging, and it is the lookup call below (highlighted in yellow)
that causes the delay.
performLookup is in DictionaryLookupAnnotator.java
private void performLookup(JCas jcas, LookupSpec ls, List lookupTokenList,
        Map ctxMap) throws Exception
{
    // sort the lookup tokens
    Collections.sort(lookupTokenList, LookupTokenComparator.getInstance());

    // perform lookup
    Collection lookupHitCol = null;
    LookupAlgorithm la = (LookupAlgorithm) ls.getLookupAlgorithm();
    // the lookup call reported as slow
    lookupHitCol = la.lookup(lookupTokenList, ctxMap);
    // (rest of the method not shown)
}
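To find which lookup window is slow, one option is to time each call and log only the slow ones. The following is a minimal sketch under stated assumptions: `LookupTimer` and `timeMillis` are hypothetical names, not part of cTAKES, and the busy loop in `main` merely stands in for `la.lookup(lookupTokenList, ctxMap)`. Because the helper takes a `Runnable`, a call that throws a checked exception would need to be wrapped, or timed inline with the same `System.nanoTime()` pattern.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of cTAKES: times one call and reports slow ones.
final class LookupTimer {
    private LookupTimer() {}

    /** Runs the task, prints a line if it exceeded thresholdMillis, returns elapsed ms. */
    static long timeMillis(String label, long thresholdMillis, Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        if (elapsed > thresholdMillis) {
            System.err.println("SLOW " + label + ": " + elapsed + " ms");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in for the real lookup call: spin for about 50 ms.
        Runnable fakeLookup = new Runnable() {
            public void run() {
                long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(50);
                while (System.nanoTime() < end) {
                    // busy wait
                }
            }
        };
        long ms = timeMillis("window-1", 10, fakeLookup);
        System.out.println("elapsed ms: " + ms);
    }
}
```

Using the covered text of each lookup window as the label should point directly at the sentence that triggers the delay.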
Samir
________________________________
From: "Chen, Pei" <[email protected]>
To: "[email protected]" <[email protected]>
Cc: samir chabou <[email protected]>
Sent: Thursday, August 15, 2013 9:00:37 AM
Subject: RE: umls lookup issue
Hi Samir,
[including the public dev list]
Thanks for opening up a new thread on this issue.
Would you be able to help narrow down the sentence that you believe is causing
the NP2LookupWindow to take 3h to process? I can’t seem to reproduce it on my
end.
I vaguely remember someone running into a case where it could go into a
loop, so hopefully they can also chime in…
--Pei
From: samir chabou [mailto:[email protected]]
Sent: Wednesday, August 14, 2013 7:30 PM
To: Chen, Pei
Subject: Re: umls lookup issue
Specifically, it is the NP2LookupWindow that causes the delay.
________________________________
From: samir chabou <[email protected]>
To: "Chen, Pei" <[email protected]>
Sent: Wednesday, August 14, 2013 7:21:18 PM
Subject: Re: umls lookup issue
Hi Pei
I removed the LookupWindowAnnotator and the run went very fast (less than 1 min),
but there were no annotations for EntityMention and EventMention. It looks like
there is something wrong with the LookupWindowAnnotator.
Samir
________________________________
From: samir chabou <[email protected]>
To: "Chen, Pei" <[email protected]>
Sent: Wednesday, August 14, 2013 7:11:57 PM
Subject: Re: umls lookup issue
Hi Pei
I removed the lookupwindowannotation and the run went very fast (less than 1 min),
but there were no annotations for EntityMention and EventMention. It looks like
there is something wrong with the lookupwindowannotation.
Samir
________________________________
From: "Chen, Pei" <[email protected]>
To: samir chabou <[email protected]>
Sent: Wednesday, August 14, 2013 3:40:46 PM
Subject: RE: umls lookup issue
That is strange- it shouldn’t take that long. I wonder if it’s going into an
infinite loop.
Have you tried debugging it? Perhaps removing some of the lines in the note or
removing the dictionary lookup component itself?
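One way to check for a suspected infinite loop is to capture thread stack traces twice, a few seconds apart, and compare them: if the same frames inside the lookup code appear both times, the pipeline is probably spinning there. The JDK's `jstack <pid>` tool does this from outside the process. The sketch below does it in-process; `StackDump` is a hypothetical helper, not part of cTAKES.

```java
import java.util.Map;

// Hypothetical diagnostic helper, not part of cTAKES.
final class StackDump {
    private StackDump() {}

    /** Returns the current stack trace of every live thread as text. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```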
--Pei
From: samir chabou [mailto:[email protected]]
Sent: Wednesday, August 14, 2013 1:14 PM
To: Chen, Pei
Subject: Re: umls lookup issue
Hi Pei,
Unfortunately, removing the DependencyParser and Assertion did not make a
difference (it had been running for 1h, so I stopped it). Pei, I think the
bottleneck was the LookupWindowAnnotator: yesterday when it was running, the
console showed the LookupWindowAnnotator annotations, and it took quite some time
to go from one LookupWindow to another. Also, the annotation of these lookup
windows was done twice.
Memory: Xms500M and Xmx1500
The jdk : JavaSE-1.6 (jre7)
The screen capture below shows where I got the memory and JDK info, plus the
structure of AggregatePlaintextUMLSProcessor.xml without the DependencyParser
and Assertion.
Thanks a lot
Samir
________________________________
From: "Chen, Pei" <[email protected]>
To: samir chabou <[email protected]>
Sent: Wednesday, August 14, 2013 10:08:00 AM
Subject: RE: umls lookup issue
Hi Samir,
It shouldn’t take 3h… it’s a bit strange. cTAKES is usually constrained much
more by memory than by CPU. Do you know which JDK and which Java memory
settings were used?
Could you also try removing the new annotators that were added in 3.0?
DependencyParser, Assertion Module. See attached as an example.
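To confirm which JDK and heap limit the pipeline is actually running with, a small check like the following could be run with the same launch settings. It is only a sketch: `JvmInfo` is a hypothetical helper, not part of cTAKES, and `Runtime.maxMemory()` reports roughly the `-Xmx` value rather than the exact flag.

```java
// Hypothetical helper, not part of cTAKES: reports the JVM version and heap limit.
final class JvmInfo {
    private JvmInfo() {}

    /** Maximum heap the JVM will attempt to use (roughly -Xmx), in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** One-line summary of the Java version, vendor, and heap limit. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
             + ", java.vendor=" + System.getProperty("java.vendor")
             + ", maxHeapMb=" + maxHeapMb();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```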
--Pei
From: samir chabou [mailto:[email protected]]
Sent: Tuesday, August 13, 2013 10:48 PM
To: Chen, Pei
Subject: Re: umls lookup issue
Hi Pei
I tried the clinical pipeline as is, with no modification except for the UMLS
username and password. It took more than 5h on my laptop to process the text
sample that I sent to you. Then I thought maybe my laptop was not powerful
enough, so I tried it on another laptop (i7, 16M, 2.4Mhz), but again it took
more than 3h. I was wondering, if you ran it within 5 minutes, what the
environment was.
As the next step, as you suggested, I will try to create a local db on mysql for
umls2011ab and process the text. But again, it is strange that in cTAKES
2.5 this same test took less than one minute.
Thanks a lot for your cooperation; your help was appreciated.
<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>false</primitive>
<delegateAnalysisEngineSpecifiers>
<delegateAnalysisEngine key="DependencyParser">
<import location="../../../ctakes-dependency-parser/desc/analysis_engine/ClearNLPDependencyParserAE.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="TokenizerAnnotator">
<import location="../../../ctakes-core/desc/analysis_engine/TokenizerAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="ContextDependentTokenizerAnnotator">
<import location="../../../ctakes-context-tokenizer/desc/analysis_engine/ContextDependentTokenizerAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="StatusAnnotator">
<import location="../../../ctakes-ne-contexts/desc/StatusAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="NegationAnnotator">
<import location="../../../ctakes-ne-contexts/desc/NegationAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="SentenceDetectorAnnotator">
<import location="../../../ctakes-core/desc/analysis_engine/SentenceDetectorAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="SimpleSegmentAnnotator">
<import location="SimpleSegmentAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="POSTagger">
<import location="../../../ctakes-pos-tagger/desc/POSTagger.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="AdjustNounPhraseToIncludeFollowingNP2">
<import location="../../../ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingNP.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="SemanticRoleLabeler">
<import location="../../../ctakes-dependency-parser/desc/analysis_engine/ClearNLPSemanticRoleLabelerAE.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="Chunker">
<import location="../../../ctakes-chunker/desc/Chunker.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="DictionaryLookupAnnotatorDB">
<import location="../../../ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="AdjustNounPhraseToIncludeFollowingPPNP2">
<import location="../../../ctakes-chunker/desc/AdjustNounPhraseToIncludeFollowingPPNP.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="ExtractionPrepAnnotator">
<import location="ExtractionPrepAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="AssertionAnnotator">
<import location="../../../ctakes-assertion/desc/AssertionMiniPipelineAnalysisEngine.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="LookupWindowAnnotator">
<import location="LookupWindowAnnotator.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="LvgAnnotator">
<import location="../../../ctakes-lvg/desc/analysis_engine/LvgAnnotator.xml"/>
</delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>
<analysisEngineMetaData>
<name>AggregatePlaintextUMLSProcessor</name>
<description>Runs the complete pipeline for annotating clinical documents in plain text format using the built in UMLS (SNOMEDCT and RxNORM) dictionaries. This uses the dictionary lookup/desc/DictionaryLookupAnnotatorUMLS.xml
and requires an UMLS license. Please update DictionaryLookupAnnotatorUMLS.xml file with your UMLS username and password.</description>
<version/>
<vendor/>
<configurationParameters searchStrategy="language_fallback">
<configurationParameter>
<name>SegmentID</name>
<description/>
<type>String</type>
<multiValued>false</multiValued>
<mandatory>false</mandatory>
<overrides>
<parameter>SimpleSegmentAnnotator/SegmentID</parameter>
</overrides>
</configurationParameter>
<configurationParameter>
<name>ChunkCreatorClass</name>
<type>String</type>
<multiValued>false</multiValued>
<mandatory>true</mandatory>
<overrides>
<parameter>Chunker/ChunkCreatorClass</parameter>
</overrides>
</configurationParameter>
</configurationParameters>
<configurationParameterSettings>
<nameValuePair>
<name>ChunkCreatorClass</name>
<value>
<string>org.apache.ctakes.chunker.ae.PhraseTypeChunkCreator</string>
</value>
</nameValuePair>
</configurationParameterSettings>
<flowConstraints>
<fixedFlow>
<node>SimpleSegmentAnnotator</node>
<node>SentenceDetectorAnnotator</node>
<node>TokenizerAnnotator</node>
<node>LvgAnnotator</node>
<node>ContextDependentTokenizerAnnotator</node>
<node>POSTagger</node>
<node>Chunker</node>
<node>AdjustNounPhraseToIncludeFollowingNP2</node>
<node>AdjustNounPhraseToIncludeFollowingPPNP2</node>
<node>LookupWindowAnnotator</node>
<node>DictionaryLookupAnnotatorDB</node>
<node>DependencyParser</node>
<node>SemanticRoleLabeler</node>
<node>AssertionAnnotator</node>
<node>ExtractionPrepAnnotator</node>
</fixedFlow>
</flowConstraints>
<typePriorities>
<name>Ordering</name>
<description>For subiterator</description>
<version>1.0</version>
<priorityList>
<type>org.apache.ctakes.typesystem.type.textspan.Segment</type>
<type>org.apache.ctakes.typesystem.type.textspan.Sentence</type>
<type>org.apache.ctakes.typesystem.type.syntax.BaseToken</type>
</priorityList>
<priorityList>
<type>org.apache.ctakes.typesystem.type.textspan.Sentence</type>
<type>org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation</type>
</priorityList>
</typePriorities>
<fsIndexCollection/>
<capabilities>
<capability>
<inputs/>
<outputs>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.NewlineToken</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.WordToken</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.VP</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.refsem.UmlsConcept</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.UCP</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.TimeAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.SymbolToken</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textspan.Sentence</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.SBAR</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.RomanNumeralAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.RangeAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.PunctuationToken</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.PersonTitleAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.PRT</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.PP</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.NumToken</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.MeasurementAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.Lemma</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.LST</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.INTJ</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.FractionAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.structured.DocumentID</type>
<type allAnnotatorFeatures="true">uima.tcas.DocumentAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.DateAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.CopySrcAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.CopyDestAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.textsem.ContextAnnotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.Chunk</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.CONJP</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.BaseToken</type>
<type allAnnotatorFeatures="true">uima.cas.AnnotationBase</type>
<type allAnnotatorFeatures="true">uima.tcas.Annotation</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.ADVP</type>
<type allAnnotatorFeatures="true">org.apache.ctakes.typesystem.type.syntax.ADJP</type>
</outputs>
<languagesSupported/>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>true</multipleDeploymentAllowed>
<outputsNewCASes>false</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
<resourceManagerConfiguration/>
</analysisEngineDescription>
To compare the incidences of symptom recurrence and permanent amenorrhea
following uterine artery embolization (UAE) for symptomatic fibroid tumors in
patients with type I and II utero-ovarian anastomoses (UOAs) with versus
without ovarian artery embolization (OAE).
MATERIALS AND METHODS:
A retrospective, institutional review board-approved study of 99 women who
underwent UAE for symptomatic fibroid tumors from April 2005 to October 2010
was conducted to identify patients who had type I or II UOAs at the time of
UAE. Based on the embolization technique, patients were categorized into
standard (ie, UAE only), combined (ie, UAE and OAE), and control (patients
without UOAs who underwent UAE) groups. Data collected included patient
characteristics, procedural technique and findings, symptom recurrence,
secondary interventions, and permanent amenorrhea. Statistical analysis was
performed with the Fisher exact test, with significance reached at P < .05.
RESULTS:
Twenty patients (20.2%; mean age, 46.9 y ± 6.3) had type I (n = 3) or II (n =
17) UOAs. Thirteen (65%) underwent UAE only (standard group) and seven (35%)
underwent UAE and OAE (combined group). There were no significant differences
between groups in demographics or in the incidence of permanent amenorrhea
after procedures (follow-up, 561 d ± 490). There was a significantly higher
incidence of symptom recurrence in the standard group compared with the control
group (P = .01), with no differences between combined and control groups (P =
1).
CONCLUSIONS:
There were no statistical differences in permanent amenorrhea rates in the
groups studied, with significantly higher symptom recurrence rates observed
when OAE was not performed in the setting of UOA.