Our docs say that AEs run under a single-threaded model (see http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.html#ugr.tug.aae.contract_for_annotator_methods). If multiple threads are wanted, the framework supports this by creating multiple instances of the AE's implementation class. This limits thread-safety issues to static (class-level) fields only.
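
As a rough illustration of the one-instance-per-thread idea (this is not UIMA code: `Annotator`, `CountingAnnotator` and `PerThreadInstances` are made-up stand-ins, and a plain `ThreadLocal` is just one way to get the effect), a sketch might look like this:

```java
import java.util.concurrent.atomic.AtomicInteger;

class PerThreadInstances {
    /** Hypothetical stand-in for an annotator; not a UIMA type. */
    interface Annotator {
        String process(String text);
    }

    /** Keeps state in an instance field, so it is only safe when one thread owns it. */
    static class CountingAnnotator implements Annotator {
        // Static field: shared by every instance, so it still has to be thread-safe.
        static final AtomicInteger INSTANCES = new AtomicInteger();

        // Instance field: needs no synchronization as long as each thread has its own instance.
        private int calls;

        CountingAnnotator() {
            INSTANCES.incrementAndGet();
        }

        @Override
        public String process(String text) {
            calls++;
            return text + "#" + calls;
        }
    }

    // Each thread lazily gets its own annotator instance on first use.
    private static final ThreadLocal<Annotator> LOCAL =
            ThreadLocal.withInitial(CountingAnnotator::new);

    static String annotate(String text) {
        return LOCAL.get().process(text);
    }
}
```

With this arrangement the only field that two threads can touch at once is the static counter, which is why it is the only one that uses an atomic type.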

The reason for this was the observation that people writing annotators, although skilled in their particular discipline and able to write code that extracts information from unstructured data, typically did not have the skills needed to write correct multi-threaded implementations in Java. So the framework "helped" here by ensuring that any parallelism it supported created a separate instance of the annotator class for each thread.

I believe, however, that it is currently possible to use the framework in ways where the application writer creates multiple threads and calls the same annotator instance from several of them at the same time. Perhaps a proper approach here would be for the framework to detect this and signal some kind of error.
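
To make the "detect this and signal an error" idea concrete, here is a minimal sketch of such a check. It is only an assumption about how detection could be done, not anything the framework does today: `SingleThreadGuard` is a hypothetical wrapper, and the wrapped `Function` stands in for an annotator's process call.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical wrapper that rejects overlapping calls on one instance. */
class SingleThreadGuard<I, O> {
    private final Function<I, O> delegate;

    // Holds the thread currently inside process(), or null when the instance is idle.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    SingleThreadGuard(Function<I, O> delegate) {
        this.delegate = delegate;
    }

    O process(I input) {
        Thread me = Thread.currentThread();
        // Claim the instance; fail fast if any call is already in progress.
        if (!owner.compareAndSet(null, me)) {
            throw new IllegalStateException(
                    "instance is already in use; called concurrently from " + me.getName());
        }
        try {
            return delegate.apply(input);
        } finally {
            // Release the claim even if the delegate throws.
            owner.set(null);
        }
    }
}
```

Note that this simple version also rejects a re-entrant call from the same thread, and it only reports the misuse rather than making the wrapped code thread-safe.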

-Marshall

On 2/17/2012 4:09 AM, Tommaso Teofili (Commented) (JIRA) wrote:
     [ https://issues.apache.org/jira/browse/UIMA-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210151#comment-13210151 ]

Tommaso Teofili commented on UIMA-2373:
---------------------------------------

bq. Possibly a concurrency issue?

Yes, I think so.
This came up when an AE was used by different clients executing in parallel, 
so I wonder whether that usage is wrong, or whether we should allow it and 
therefore fix this.

Possible bug in FixedFlowController
-----------------------------------

                 Key: UIMA-2373
                 URL: https://issues.apache.org/jira/browse/UIMA-2373
             Project: UIMA
          Issue Type: Bug
    Affects Versions: 2.4.0SDK
            Reporter: Tommaso Teofili

I am developing a series of Lucene tokenizers which can use UIMA for creating 
tokens via extracted annotations.
While doing a stress test with lots of different strings I experienced the 
following:
{noformat}
     [junit] Testsuite: org.apache.lucene.analysis.uima.UIMATypeAwareAnalyzerTest
     [junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 92,061 sec
     [junit]
     [junit] ------------- Standard Error -----------------
     [junit] The following exceptions were thrown by threads:
     [junit] *** Thread: Thread-9 ***
     [junit] java.lang.RuntimeException: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:289)
     [junit] Caused by: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:87)
     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:121)
     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:371)
     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:295)
     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:287)
     [junit] Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:701)
     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
     [junit]    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
     [junit]    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
     [junit]    at org.apache.lucene.analysis.uima.BaseUIMATokenizer.analyzeInput(BaseUIMATokenizer.java:57)
     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.analyzeText(UIMATypeAwareAnnotationsTokenizer.java:73)
     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:85)
     [junit]    ... 4 more
     [junit] Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 2
     [junit]    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
     [junit]    at java.util.ArrayList.get(ArrayList.java:322)
     [junit]    at org.apache.uima.flow.impl.FixedFlowController$FixedFlowObject.next(FixedFlowController.java:216)
     [junit]    at org.apache.uima.analysis_engine.asb.impl.FlowContainer.next(FlowContainer.java:98)
     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:667)
     [junit]    ... 11 more
{noformat}
I'm debugging it to see whether I can pin down the exact bug (and a fix) :)