Adam noted that https://issues.apache.org/jira/browse/UIMA-2078
suggests there are other problems around CAS Multipliers in base UIMA when using
shared UimaContexts.

This is because the getEmptyCas method in the (shared) UimaContext checks
whether the pool size has been exceeded.  If the pool size is 1 but you have 5
pipelines sharing the UimaContext, this test throws an exception when the 2nd
pipeline requests a CAS.
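
To make the failure mode concrete, here is a minimal sketch.  It does not use
the UIMA classes; SharedContextSketch and firstFailingPipeline are hypothetical
names, and the check only mimics the "pool size exceeded" test described above,
under the assumption that each pipeline holds one CAS at the same time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a context that enforces a CAS pool limit.
// This is NOT the UIMA implementation; it only mimics the check described above.
class SharedContextSketch {
    private final int casPoolSize;      // limit sized for a single pipeline
    private int outstandingCases = 0;   // counted per context, not per pipeline

    SharedContextSketch(int casPoolSize) {
        this.casPoolSize = casPoolSize;
    }

    // Mimics the "pool size exceeded" test.
    synchronized Object getEmptyCas() {
        if (outstandingCases >= casPoolSize) {
            throw new IllegalStateException(
                "requested more CASes than the pool size of " + casPoolSize);
        }
        outstandingCases++;
        return new Object(); // placeholder for a CAS
    }

    // Returns the 1-based index of the first pipeline whose request fails,
    // or 0 if every request succeeds.
    static int firstFailingPipeline(int poolSize, int pipelines, boolean shareContext) {
        SharedContextSketch shared = new SharedContextSketch(poolSize);
        List<Object> held = new ArrayList<>(); // CASes stay outstanding
        for (int i = 1; i <= pipelines; i++) {
            SharedContextSketch ctx =
                shareContext ? shared : new SharedContextSketch(poolSize);
            try {
                held.add(ctx.getEmptyCas());
            } catch (IllegalStateException e) {
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(firstFailingPipeline(1, 5, true));  // shared context
        System.out.println(firstFailingPipeline(1, 5, false)); // one context each
    }
}
```

With a shared context the second request fails; with one context per pipeline
none of the five requests fails.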

So, one approach to "fix" this problem would be to not share UimaContexts in
this case.  The consequence would be that the contexts are no longer shared
among the pipelines.  Sharing could be a good or bad thing, depending on the use
case(s).

Good thing: The contexts are very large, but read-only.

Bad thing: The contexts are used by the pipeline for things like storing data
via the external resource manager, in structures like HashMaps, which are not
thread safe.  The user could have designed a pipeline where some upstream
annotators write data into a map and some downstream annotators later read
that data, presuming that what they see is just what the upstream annotator
put there.  With shared UimaContexts, besides the thread-safety issue for
HashMaps, this presumption would not hold even if the user used a thread-safe
map, because another pipeline could write to the same map in between.
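
A small sketch of that last point, again without the UIMA classes.  The names
(SharedResourceSketch, KEY, upstream, downstream) are made up for illustration,
and the two pipelines are interleaved sequentially so the outcome is
deterministic; with real threads this is only one of several possible orderings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; the map stands in for a resource reached
// through the context.  None of these names come from UIMA.
class SharedResourceSketch {
    static final String KEY = "documentLanguage";

    // Upstream annotator: records a value for the document being processed.
    static void upstream(Map<String, String> resource, String value) {
        resource.put(KEY, value);
    }

    // Downstream annotator: presumes it reads what "its" upstream wrote.
    static String downstream(Map<String, String> resource) {
        return resource.get(KEY);
    }

    // Simulates one interleaving of pipelines A and B.
    static String whatPipelineASees(boolean shareResource) {
        Map<String, String> forA = new ConcurrentHashMap<>(); // thread-safe map
        Map<String, String> forB = shareResource ? forA : new ConcurrentHashMap<>();
        upstream(forA, "en");    // pipeline A, upstream annotator
        upstream(forB, "de");    // pipeline B runs before A's downstream annotator
        return downstream(forA); // pipeline A, downstream annotator
    }

    public static void main(String[] args) {
        System.out.println(whatPipelineASees(true));
        System.out.println(whatPipelineASees(false));
    }
}
```

When the map is shared, pipeline A's downstream annotator reads the value
written by pipeline B, even though the map itself is thread safe.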

There is a built-in framework method (produceAnalysisEngine with 2 int
arguments) that instantiates multiple analysis engines
(MultiprocessingAnalysisEngine_impl), used by (for example) the SOAP service
adapter.  In light of the above, that implementation seems faulty in several
ways.

1) MultiprocessingAnalysisEngine_impl shares the UimaContext among the pool of
resources, and has the above issues including the CasMultiplier / pool size 
issue.

2) When UIMA-AS was being debugged, one issue that came up was that some
annotators had been written with the presumption that the thread calling the
initialize method would be the same thread calling the process method
(these annotators made use of ThreadLocal variables, IIRC).
See https://issues.apache.org/jira/browse/UIMA-1223 .  UIMA-AS was updated to
ensure that its multi-pipeline setup meets this presumption.
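
The thread presumption in 2) can be sketched as follows.  This is not taken
from UIMA-AS or from any real annotator; ThreadLocalAnnotator and
initThenProcess are hypothetical, and single-thread executors are just one
simple way to pin both calls to one thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadAffinitySketch {
    // Hypothetical annotator that keeps its state in a ThreadLocal.
    static class ThreadLocalAnnotator {
        private final ThreadLocal<String> state = new ThreadLocal<>();

        void initialize() {
            state.set("initialized");
        }

        String process() {
            return state.get(); // null when called on a different thread
        }
    }

    // Runs initialize on one thread, then process on the same or another thread.
    static String initThenProcess(boolean sameThread) {
        ThreadLocalAnnotator annotator = new ThreadLocalAnnotator();
        ExecutorService initThread = Executors.newSingleThreadExecutor();
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        try {
            Runnable init = annotator::initialize;
            Callable<String> process = annotator::process;
            initThread.submit(init).get(); // wait until initialize has finished
            ExecutorService processThread = sameThread ? initThread : otherThread;
            return processThread.submit(process).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            initThread.shutdown();
            otherThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(initThenProcess(true));
        System.out.println(initThenProcess(false));
    }
}
```

When process runs on the initializing thread it sees the stored state; on any
other thread the ThreadLocal is empty and the annotator gets null.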

Shouldn't this same presumption be met with the base UIMA implementation?

-Marshall




