Hello everyone,
I have already been using UIMA AS for tagging text with a custom AAE,
though I did not scale the AAE because I ran into a few issues back then and
had no time to solve them.
Now I have tried again to scale the AAE and failed again. The AAE gets a
document id, which is sent to it via the uimaj-as-camel component. A CAS
multiplier then fetches the actual document out of a database, and that is
also the component which causes trouble.
Because the AAE is not thread safe, UIMA AS must scale it by creating
multiple instances of it.
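To illustrate what I mean by scaling through multiple instances, here is a generic sketch in plain Java. It is not UIMA AS code and makes no claim about how UIMA AS implements scale-out; `Tagger` and `InstancePerThread` are hypothetical names I made up for the example. The idea is simply that each worker thread owns its own instance of the non-thread-safe component, so no instance is ever shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a component that is not thread safe.
class Tagger {
    // Unsynchronized mutable state: unsafe if two threads share one Tagger.
    private final StringBuilder scratch = new StringBuilder();

    String tag(String documentId) {
        scratch.setLength(0);
        scratch.append("tagged:").append(documentId);
        return scratch.toString();
    }
}

// Hypothetical helper: one Tagger instance per worker thread.
class InstancePerThread {
    static List<String> tagAll(List<String> documentIds, int instances) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        // Each worker thread lazily creates its own Tagger and never shares it.
        ThreadLocal<Tagger> perThread = ThreadLocal.withInitial(Tagger::new);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String id : documentIds) {
                Callable<String> job = () -> perThread.get().tag(id);
                pending.add(pool.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get()); // collected in submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `InstancePerThread.tagAll(List.of("doc-1", "doc-2"), 8)` would run the ids on up to eight worker threads, each with a private `Tagger`.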
After reading through the UIMA AS documentation I came up with this
deployment descriptor:
...
<analysisEngine key="TextAnalysis" async="false">
  <scaleout numberOfInstances="8" />
  <delegates>
    <analysisEngine key="HBaseCasMultiplier">
      <casMultiplier poolSize="8"/>
    </analysisEngine>
  </delegates>
</analysisEngine>
...
I must admit the documentation confused me a bit about the meaning of
the async attribute.
Is it correct that async=false means that UIMA AS creates multiple
instances, each of which is called from one worker thread? And would
async=true then mean that one AE is called by multiple threads?
If numberOfInstances is larger than 1, I always get this exception:
Caused by: org.apache.uima.UIMARuntimeException: The method CasManager.defineCasPool() was called twice by the same Analysis Engine (/HBaseCasMultiplier/).
    at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:181)
    at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:161)
    at org.apache.uima.aae.EECasManager_impl.defineCasPool(EECasManager_impl.java:75)
    at org.apache.uima.impl.UimaContext_ImplBase.getEmptyCas(UimaContext_ImplBase.java:565)
    at org.apache.uima.analysis_component.CasMultiplier_ImplBase.getEmptyCAS(CasMultiplier_ImplBase.java:109)
    at dk.infopaq.nlp.repository.connector.HBaseReadCasMultiplier.hasNext(HBaseReadCasMultiplier.java:107)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl$AnalysisComponentCasIterator.hasNext(PrimitiveAnalysisEngine_impl.java:563)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:566)
    ... 20 more
A while back I had a problem which resulted in the same exception message,
but it was solved by updating UIMA to the current 2.3.0-SNAPSHOT:
http://www.mail-archive.com/[email protected]/msg02054.html
The version I am using is 2.3.0-SNAPSHOT from mid-May.
Thanks,
Jörn