Hello everyone,

I have been using UIMA AS for a while now to tag text with a custom AAE,
though I did not scale the AAE because I ran into a few issues back then and
had no time to solve them.

Now I tried again to scale the AAE and failed again. The AAE receives a
document id, which is sent to it via the uimaj-as-camel component. A CAS
multiplier then fetches the actual document from a database, and that is
also the component which causes the trouble.

Because the AAE is not thread safe, UIMA AS must scale it by creating multiple
instances of it.
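
To make clear what I mean by scaling through multiple instances, here is a
minimal plain-Java sketch. It is not UIMA code and says nothing about how
UIMA AS does this internally; Tagger, tagAll and InstancePerThreadSketch are
made-up names used only for illustration. Each worker thread gets its own
instance of a component that is not thread safe, so no instance is ever
called from two threads at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InstancePerThreadSketch {

    /** Hypothetical stand-in for an analysis component that is not thread safe. */
    static class Tagger {
        // Unsynchronized mutable state: sharing one Tagger between threads would be unsafe.
        private final StringBuilder scratch = new StringBuilder();

        String tag(String documentId) {
            scratch.setLength(0);
            scratch.append("tagged:").append(documentId);
            return scratch.toString();
        }
    }

    /** Processes all ids on numberOfInstances worker threads, one Tagger per thread. */
    static List<String> tagAll(List<String> documentIds, int numberOfInstances) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numberOfInstances);
        // Each worker thread lazily creates its own Tagger on first use.
        final ThreadLocal<Tagger> perThread = ThreadLocal.withInitial(Tagger::new);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final String id : documentIds) {
                Callable<String> task = () -> perThread.get().tag(id);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tagAll(Arrays.asList("doc-1", "doc-2", "doc-3"), 2));
    }
}
```

The other way to scale would be a single shared Tagger called from all worker
threads, which is only an option when the component itself is thread safe.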
After reading through the UIMA AS documentation I came up with this deployment descriptor:
           ...
           <analysisEngine key="TextAnalysis" async="false">
               <scaleout numberOfInstances="8" />

               <delegates>
                   <analysisEngine key="HBaseCasMultiplier">
                       <casMultiplier poolSize="8"/>
                   </analysisEngine>
               </delegates>
           </analysisEngine>
           ...

I must admit the documentation left me a bit confused about the meaning of the
async attribute. Is it correct that async="false" means that UIMA AS creates
multiple instances, each of which is called from a single worker thread? And
would async="true" then mean that one AE is called by multiple threads?

If numberOfInstances is larger than 1, I always get this exception:
Caused by: org.apache.uima.UIMARuntimeException: The method CasManager.defineCasPool() was called twice by the same Analysis Engine (/HBaseCasMultiplier/).
    at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:181)
    at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:161)
    at org.apache.uima.aae.EECasManager_impl.defineCasPool(EECasManager_impl.java:75)
    at org.apache.uima.impl.UimaContext_ImplBase.getEmptyCas(UimaContext_ImplBase.java:565)
    at org.apache.uima.analysis_component.CasMultiplier_ImplBase.getEmptyCAS(CasMultiplier_ImplBase.java:109)
    at dk.infopaq.nlp.repository.connector.HBaseReadCasMultiplier.hasNext(HBaseReadCasMultiplier.java:107)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl$AnalysisComponentCasIterator.hasNext(PrimitiveAnalysisEngine_impl.java:563)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:566)
    ... 20 more


A while back I had a problem which resulted in the same exception message,
but it was solved by updating UIMA to the current 2.3.0-SNAPSHOT:
http://www.mail-archive.com/[email protected]/msg02054.html

The version I am using is a 2.3.0-SNAPSHOT from mid-May.

Thanks,
Jörn
