I tested the proposed fix, adding processParentLast="true" to the Cas Multiplier delegate's configuration, and the behavior is the same. The other possibility you mention is a bug somewhere in my code that allows a CAS to be accessed from two separate threads, but I have no idea where that could be: the CAS generated by the second Cas Multiplier is composed of all the views, and that CAS then goes to the final annotator, which has only one instance, and the flow ends there.
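A minimal sketch of how the two-thread hypothesis could be checked from inside the Java delegates. `CasAccessGuard` is a hypothetical helper written for illustration, not part of UIMA or UIMA-AS. It only sees code that calls it, so it would not observe the framework's own serialization thread or the C++ annotator, and which object is passed as the key (for example one agreed-upon view of the CAS) is left to the caller.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

/**
 * Hypothetical debugging aid, not part of UIMA or UIMA-AS.
 * Records which thread is currently working on a given object and
 * fails fast when a second entry arrives before the first has left.
 */
final class CasAccessGuard {

    // Identity-based map: distinct objects never collide, whatever equals() says.
    private static final Map<Object, Thread> ACTIVE =
            Collections.synchronizedMap(new IdentityHashMap<>());

    private CasAccessGuard() {
    }

    /** Call first thing in process(). The guard is deliberately not re-entrant. */
    static void enter(Object cas) {
        Thread me = Thread.currentThread();
        Thread holder = ACTIVE.putIfAbsent(cas, me);
        if (holder != null) {
            throw new IllegalStateException("CAS already in use by thread '"
                    + holder.getName() + "', entered again from '" + me.getName() + "'");
        }
    }

    /** Call in a finally block at the end of process(); otherwise the entry is never released. */
    static void exit(Object cas) {
        ACTIVE.remove(cas, Thread.currentThread());
    }
}
```

Wrapping the body of each Java delegate's process() in enter(...) / try / finally / exit(...) would turn an overlapping access into an IllegalStateException naming both threads, instead of a later NullPointerException during serialization.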

2017-02-15 16:31 GMT-05:00, Jaroslaw Cwiklik <[email protected]>:
> Nelson, change Cas Multiplier in your deployment descriptor as follows:
>
> <analysisEngine key="FileSystemMultiplerCas">
>     <casMultiplier poolSize="10" processParentLast="true"/>
> </analysisEngine>
>
> Note: processParentLast="true".
>
> In UIMA-AS async aggregate it's possible for a child CAS and its parent CAS
> to flow through the pipeline at the same time and the parent CAS may reach
> the end before its child(ren). The above setting will ensure the parent CAS
> does not flow ahead of its children. From UIMA-AS documentation:
>
> "The processParentLast attribute on the <casMultiplier> element is
> optional, and specifies processing order of an input CAS relative to its
> children. If true, a flow of an input CAS will be suspended after it is
> returned from a Cas Multiplier delegate until all its child CASes have
> finished processing. If false, an input CAS can be processed in parallel
> with its children."
>
> If the above change does not fix the NPE, I suspect you may have a bug
> somewhere in your code which allows a CAS to be accessed in two separate
> threads.
>
> -jerry
>
> On Wed, Feb 15, 2017 at 12:43 PM, Jaroslaw Cwiklik <[email protected]> wrote:
>
>> Nelson, I can try to set up a simple pipeline with one AE which will add 20
>> views and then test serialization. Not sure if I get to it today. If not
>> this will have to wait till Monday next week. I've already mentioned this
>> before, don't operate on a CAS once it leaves an AE. The contract is CAS-In
>> CAS-out. A CAS instance can only be operated on by one AE at a time.
>>
>> -jerry
>>
>> On Wed, Feb 15, 2017 at 11:06 AM, Marshall Schor <[email protected]> wrote:
>>
>>> On 2/15/2017 9:51 AM, Jaroslaw Cwiklik wrote:
>>> > Not exactly sure how to debug this.
>>>
>>> a small-ish test case we could run would enable debugging...
>>>
>>> > The UIMA-AS does not touch contents of
>>> > a CAS directly.
>>> > Are there any other errors in the log besides NPE? The
>>> > UIMA-AS uses uima-sdk to serialize CASes. Since you are getting null from
>>> > getView(N), this view must have been deleted somehow.
>>> >
>>> > -jerry
>>> >
>>> > On Mon, Feb 13, 2017 at 11:43 AM, nelson rivera <[email protected]> wrote:
>>> >
>>> >> I was able to check your email just today. The aggregate is async,
>>> >> but it only processes one input CAS at a time (the default numberOfCASes).
>>> >> I read your possible explanation, but I have no idea how another
>>> >> thread could modify the CAS, because the last annotator's execution
>>> >> is correct and the only step left is for the UIMA-AS framework to
>>> >> serialize the CAS.
>>> >>
>>> >> This is the deployment configuration of the aggregate:
>>> >>
>>> >> <?xml version="1.0" encoding="UTF-8"?>
>>> >> <analysisEngineDeploymentDescription
>>> >>     xmlns="http://uima.apache.org/resourceSpecifier">
>>> >>
>>> >>   <name>XClusterAnalyzerAE Deploy Descriptor</name>
>>> >>   <description>Deploys XClusterAnalyzerAE</description>
>>> >>
>>> >>   <deployment protocol="jms" provider="activemq">
>>> >>
>>> >>     <service>
>>> >>       <inputQueue endpoint="XClusterAnalyzerAggregate"
>>> >>                   brokerURL="${defaultBrokerURL}"/>
>>> >>       <topDescriptor>
>>> >>         <import location="./XClusterAnalyzerAggregate.xml"/>
>>> >>       </topDescriptor>
>>> >>       <!-- remoteReplyQueueScaleout for remote delegate-->
>>> >>       <analysisEngine inputQueueScaleout="2"
>>> >>                       internalReplyQueueScaleout="3">
>>> >>         <delegates>
>>> >>           <analysisEngine key="FileSystemMultiplerCas">
>>> >>             <casMultiplier poolSize="10"/>
>>> >>           </analysisEngine>
>>> >>           <analysisEngine key="XFileFormatDetector">
>>> >>             <scaleout numberOfInstances="2"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </analysisEngine>
>>> >>           <analysisEngine key="XDataFileExtractor">
>>> >>             <scaleout numberOfInstances="2"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </analysisEngine>
>>> >>           <remoteAnalysisEngine key="XLanguageDetector">
>>> >>             <inputQueue endpoint="XLanguageDetector"
>>> >>                         brokerURL="${defaultBrokerURL}"/>
>>> >>             <serializer method="xmi"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </remoteAnalysisEngine>
>>> >>           <analysisEngine key="XTokenizer">
>>> >>             <scaleout numberOfInstances="2"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </analysisEngine>
>>> >>           <analysisEngine key="XBoTModeler">
>>> >>             <scaleout numberOfInstances="3"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </analysisEngine>
>>> >>           <analysisEngine key="MergerInViewCasMultipler">
>>> >>             <casMultiplier poolSize="1"/>
>>> >>           </analysisEngine>
>>> >>           <analysisEngine key="XClusterAnalyzer">
>>> >>             <scaleout numberOfInstances="1"/>
>>> >>             <asyncAggregateErrorConfiguration>
>>> >>               <processCasErrors maxRetries="0"
>>> >>                                 continueOnRetryFailure="true"/>
>>> >>             </asyncAggregateErrorConfiguration>
>>> >>           </analysisEngine>
>>> >>         </delegates>
>>> >>       </analysisEngine>
>>> >>     </service>
>>> >>   </deployment>
>>> >>
>>> >> </analysisEngineDeploymentDescription>
>>> >>
>>> >> 2017-02-10 16:43 GMT-05:00, Jaroslaw Cwiklik <[email protected]>:
>>> >>> Just a bit more evidence.
>>> >>> The caller of getSofaAddr():
>>> >>>
>>> >>> public void writeViewsCommons() throws Exception {
>>> >>>   // Get indexes for each SofaFS in the CAS
>>> >>>   int numViews = cas.getBaseSofaCount();
>>> >>>
>>> >>>   for (int sofaNum = 1; sofaNum <= numViews; sofaNum++) {
>>> >>>     FSIndexRepositoryImpl loopIR = (FSIndexRepositoryImpl)
>>> >>>         cas.getBaseCAS().getSofaIndexRepository(sofaNum);
>>> >>>     final int sofaAddr = getSofaAddr(sofaNum);
>>> >>>
>>> >>> Not an expert of this code, but it smells like another thread is changing a
>>> >>> CAS which is being serialized.
>>> >>>
>>> >>> -jerry
>>> >>>
>>> >>> On Fri, Feb 10, 2017 at 4:31 PM, Jaroslaw Cwiklik <[email protected]> wrote:
>>> >>>> Is this a primitive (single-threaded) aggregate or async (multi-threaded)?
>>> >>>> If async, try to simplify and run primitive aggregate with scaleout=1.
>>> >>>>
>>> >>>> The CAS does not seem to be null in this case. The caller of the
>>> >>>> getSerializedCas() checks for null.
>>> >>>>
>>> >>>> The code dies here:
>>> >>>> Caused by: java.lang.NullPointerException
>>> >>>>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>> >>>>
>>> >>>> public int getSofaAddr(int sofaNum) {
>>> >>>>   if (sofaNum != 1 || cas.isInitialSofaCreated()) { // skip if initial view && no Sofa yet
>>> >>>>     // all non-initial-views must have a sofa
>>> >>>>     return ((CASImpl)cas.getView(sofaNum)).getSofaRef();
>>> >>>>   }
>>> >>>>   return 0;
>>> >>>> }
>>> >>>>
>>> >>>> Looks to me that getView(sofaNum) is returning null. Is it possible that
>>> >>>> two threads are operating on the same CAS maybe? One removing a view while
>>> >>>> another trying to serialize. Have no idea what else could it be.
>>> >>>>
>>> >>>> -jerry
>>> >>>>
>>> >>>> On Fri, Feb 10, 2017 at 8:45 AM, nelson rivera <[email protected]> wrote:
>>> >>>>
>>> >>>>> Hi, the first thing I did was run these tests. I made a simple test case
>>> >>>>> that creates a CAS with 17 views and then serializes it using
>>> >>>>> XmiCasSerializer.serialize(newJCas.getCas(), fis), and it serializes
>>> >>>>> correctly.
>>> >>>>> I also made another test: I initialized the same AE locally with the
>>> >>>>> UIMA API and processed the same input documents. The processing is
>>> >>>>> correct, and the CAS then serializes without problems.
>>> >>>>>
>>> >>>>> The error occurs with the AE deployed in UIMA-AS and consumed from there.
>>> >>>>>
>>> >>>>> 2017-02-09 17:30 GMT-05:00, Marshall Schor <[email protected]>:
>>> >>>>>> one thing that would help track this down is a small isolated test case.
>>> >>>>>>
>>> >>>>>> Do you think uima-as is needed? I'm wondering if a simple test case which
>>> >>>>>> generated 17 views and then tried to serialize would show the failure...
>>> >>>>>>
>>> >>>>>> If you could supply a small test case that showed the failure so we could
>>> >>>>>> reproduce it, that would enable a rapid resolution.
>>> >>>>>>
>>> >>>>>> -Marshall
>>> >>>>>>
>>> >>>>>> On 2/9/2017 3:58 PM, Marshall Schor wrote:
>>> >>>>>>> The line throwing the null pointer exception is :
>>> >>>>>>>
>>> >>>>>>> cas.getView(sofaNum).getSofaRef()
>>> >>>>>>>
>>> >>>>>>> So the NPE is either the cas is null, or the getView(sofaNum) is returning
>>> >>>>>>> null.
>>> >>>>>>>
>>> >>>>>>> I'm not sure what the best way is to debug this...
>>> >>>>>>>
>>> >>>>>>> -Marshall
>>> >>>>>>>
>>> >>>>>>> On 2/9/2017 12:42 PM, nelson rivera wrote:
>>> >>>>>>>> I have a UIMA-AS aggregate service. At the end of the aggregate, the CAS
>>> >>>>>>>> to return is composed of as many views as there are input files,
>>> >>>>>>>> each view with the annotations from processing.
>>> >>>>>>>> With fewer than 15 input documents the processing is always successful,
>>> >>>>>>>> but if the number of documents is greater than 15, I get a
>>> >>>>>>>> NullPointerException in the aggregate service when it tries to serialize
>>> >>>>>>>> the CAS, not during the processing in the aggregate AE.
>>> >>>>>>>> The logs of the aggregate service:
>>> >>>>>>>>
>>> >>>>>>>> 11:51:38.815 - 42: cu.datys.xinetica.uima.core.MergerInViewCasMultipler.hasNext(285): INFO: HasNext false
>>> >>>>>>>> 11:51:38.875 - 44: org.apache.uima.uimacpp.UimacppAnalysisComponent.log(396): INFO: : XClusterAnalyzer::process --- OK
>>> >>>>>>>> 11:51:39.145 - 45: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient: WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
>>> >>>>>>>> 11:51:39.145 - 45: org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient: WARNING:
>>> >>>>>>>> org.apache.uima.aae.error.AsynchAEException: org.apache.uima.UIMARuntimeException
>>> >>>>>>>>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1265)
>>> >>>>>>>>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:800)
>>> >>>>>>>>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2173)
>>> >>>>>>>>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2342)
>>> >>>>>>>>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1862)
>>> >>>>>>>>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2489)
>>> >>>>>>>>     at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1271)
>>> >>>>>>>>     at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
>>> >>>>>>>>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
>>> >>>>>>>>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
>>> >>>>>>>>     at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
>>> >>>>>>>>     at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
>>> >>>>>>>>     at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1113)
>>> >>>>>>>>     at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
>>> >>>>>>>>     at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
>>> >>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >>>>>>>>     at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
>>> >>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>> >>>>>>>> Caused by: org.apache.uima.UIMARuntimeException
>>> >>>>>>>>     at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:420)
>>> >>>>>>>>     at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
>>> >>>>>>>>     at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
>>> >>>>>>>>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCAS(JmsOutputChannel.java:251)
>>> >>>>>>>>     at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1250)
>>> >>>>>>>>     ... 18 more
>>> >>>>>>>> Caused by: java.lang.NullPointerException
>>> >>>>>>>>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>> >>>>>>>>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.writeViewsCommons(CasSerializerSupport.java:465)
>>> >>>>>>>>     at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeViews(XmiCasSerializer.java:572)
>>> >>>>>>>>     at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:441)
>>> >>>>>>>>     at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
>>> >>>>>>>>     ... 22 more
