Hi Marshall,

2012/2/17 Marshall Schor <[email protected]>
> Our docs say that AEs are run in a single-threaded model (see
> http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.html#ugr.tug.aae.contract_for_annotator_methods).
> If multiple threads are wanted, the framework supports this by
> making multiple instances of the AE's implementation class. This limits
> "thread-safety" issues to only "static" or class-level fields.
>
> The reason for this was an observation that the people writing annotators,
> although skilled in their particular discipline and able to write code that
> extracted information from unstructured data, did not typically have the
> skills needed to write correct multi-threaded implementations in Java. So
> the framework "helped" here, by ensuring that any parallelism the framework
> supported created multiple instances of the annotator class, one for each
> thread.
>
> I believe, however, that it is currently possible to use the framework in
> ways in which the application writer creates multiple threads and calls the
> same annotator instance on multiple threads at the same time. Perhaps a
> proper approach here would be to have the framework detect this, and signal
> some kind of error.

I agree with this last point: detecting such situations would be very
helpful, in my opinion.

Tommaso

> -Marshall
>
> On 2/17/2012 4:09 AM, Tommaso Teofili (Commented) (JIRA) wrote:
>
>> [ https://issues.apache.org/jira/browse/UIMA-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210151#comment-13210151 ]
>>
>> Tommaso Teofili commented on UIMA-2373:
>> ---------------------------------------
>>
>> bq. Possibly a concurrency issue?
>>
>> Yes, I think so.
>> That came up when an AE was used from different clients that execute in
>> parallel, so I wonder whether the usage is wrong or whether we should
>> allow it and make a fix for it.
>>
>> Possible bug in FixedFlowController
>>> -----------------------------------
>>>
>>> Key: UIMA-2373
>>> URL: https://issues.apache.org/jira/browse/UIMA-2373
>>> Project: UIMA
>>> Issue Type: Bug
>>> Affects Versions: 2.4.0SDK
>>> Reporter: Tommaso Teofili
>>>
>>> I am developing a series of Lucene tokenizers which can use UIMA for
>>> creating tokens via extracted annotations.
>>> While doing a stress test with lots of different strings I experienced
>>> the following:
>>> {noformat}
>>> [junit] Testsuite: org.apache.lucene.analysis.uima.UIMATypeAwareAnalyzerTest
>>> [junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 92,061 sec
>>> [junit]
>>> [junit] ------------- Standard Error -----------------
>>> [junit] The following exceptions were thrown by threads:
>>> [junit] *** Thread: Thread-9 ***
>>> [junit] java.lang.RuntimeException: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:289)
>>> [junit] Caused by: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>> [junit] at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:87)
>>> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:121)
>>> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:371)
>>> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:295)
>>> [junit] at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:287)
>>> [junit] Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>> [junit] at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:701)
>>> [junit] at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
>>> [junit] at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
>>> [junit] at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
>>> [junit] at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
>>> [junit] at org.apache.lucene.analysis.uima.BaseUIMATokenizer.analyzeInput(BaseUIMATokenizer.java:57)
>>> [junit] at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.analyzeText(UIMATypeAwareAnnotationsTokenizer.java:73)
>>> [junit] at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:85)
>>> [junit] ... 4 more
>>> [junit] Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 2
>>> [junit] at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>>> [junit] at java.util.ArrayList.get(ArrayList.java:322)
>>> [junit] at org.apache.uima.flow.impl.FixedFlowController$FixedFlowObject.next(FixedFlowController.java:216)
>>> [junit] at org.apache.uima.analysis_engine.asb.impl.FlowContainer.next(FlowContainer.java:98)
>>> [junit] at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:667)
>>> [junit] ... 11 more
>>> {noformat}
>>> I'm debugging it to see if I can come up with the exact bug (and fix) :)
>>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA
>> administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
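
To make the detection idea from the thread concrete (signalling an error when one annotator instance is entered from two threads at the same time), here is a minimal sketch in Java. It is only an illustration under stated assumptions: `SingleThreadGuard` is a hypothetical helper, not a UIMA class, and nothing in the thread says the framework does or would implement detection this way. Note that this simple flag also rejects re-entrant calls from the same thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical guard (not part of UIMA): wraps calls to a component that
 * must only be used by one caller at a time and fails fast on overlap.
 */
final class SingleThreadGuard {
    // true while some caller is inside run()
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /** Runs the work, or throws if another call is already in progress. */
    void run(Runnable work) {
        // Atomically claim the guard; a failed claim means an overlapping call.
        if (!busy.compareAndSet(false, true)) {
            throw new IllegalStateException(
                "overlapping call on a component that expects single-threaded use");
        }
        try {
            work.run();
        } finally {
            // Always release, even if the work throws.
            busy.set(false);
        }
    }
}
```

An application that shares one analysis engine across threads could route each call through such a guard, for example `guard.run(() -> doProcess(input))` where `doProcess` stands for whatever the application invokes. The alternative described in the quoted message is to create one instance per thread, which avoids the overlap instead of reporting it.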
