Hi,

I am running multiple cTAKES pipelines in parallel on a single machine,
each in its own JVM. Looking across the logs of each JVM, it appears that
severe blocking occurs after the annotations are generated for a segment:
only one JVM seems to be processing at a time, while the others form a
queue. Nearly all of the processes halt for roughly 21 minutes during the
SimpleSegmentAnnotator, just prior to the initialization of the Sentence
Detector (marked with "**** HERE" in the log excerpt below).

18/12/20 22:44:33 INFO AbstractJCasTermAnnotator: Finished processing
18/12/20 22:44:36 WARN DocumentIDAnnotationUtil: Unable to find DocumentIDAnnotation
**** HERE
18/12/20 23:06:07 INFO ae.SentenceDetector: Starting processing.
18/12/20 23:06:07 INFO ae.TokenizerAnnotatorPTB: process(JCas) in org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
18/12/20 23:06:07 INFO ae.LvgAnnotator: process(JCas)
18/12/20 23:06:11 INFO ae.ContextDependentTokenizerAnnotator: process(JCas)
18/12/20 23:06:11 INFO postagger.POSTagger: process(JCas)
18/12/20 23:06:12 INFO AbstractJCasTermAnnotator: Starting processing
18/12/20 23:06:12 INFO AbstractJCasTermAnnotator: Finished processing
18/12/20 23:06:15 WARN DocumentIDAnnotationUtil: Unable to find DocumentIDAnnotation
**** HERE
18/12/20 23:27:22 INFO ae.SentenceDetector: Starting processing.
18/12/20 23:27:22 INFO ae.TokenizerAnnotatorPTB: process(JCas) in org.apache.ctakes.core.ae.TokenizerAnnotatorPTB


What could be causing this holdup? The JVMs share a single copy of the
cTAKES resources and UMLS dictionaries; these were not duplicated for each
instance.
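
In case it helps with reproducing the measurement, here is a minimal sketch
of how the stalls can be located in each JVM's log. It assumes every log
entry begins with a yy/MM/dd HH:mm:ss timestamp, as in the excerpt above.
The function name find_gaps and the one-minute threshold are my own
illustrative choices and are not part of cTAKES.

```python
import re
from datetime import datetime, timedelta

# Assumption: each log entry starts with a "yy/MM/dd HH:mm:ss " timestamp.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Return (start, end, last_line_before_gap) for every silent period
    between consecutive timestamped lines that is longer than threshold."""
    gaps = []
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            # Wrapped continuation lines and manual markers have no timestamp.
            continue
        current = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        if prev_time is not None and current - prev_time > threshold:
            gaps.append((prev_time, current, prev_line.rstrip()))
        prev_time, prev_line = current, line
    return gaps


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_gaps.py jvm1.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for start, end, last_line in find_gaps(handle):
            print(f"{end - start} stall after: {last_line}")
```

Running this over each JVM's log and comparing the reported intervals
should show whether a stall in one process lines up with activity in
another.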

Thanks,

Mike

-- 
Mike Trepanier| Senior Big Data Engineer | MetiStream, Inc. |
m...@metistream.com | 845 - 270 - 3129 (m) | www.metistream.com