Miguel,

A number of possibilities.

If the UIMA-AS process is scaling vertically across multiple threads, and the long document processing is not invalidating the computation internals, then periodically test document processing time in the application logic and stop processing the document by throwing an AnalysisEngineProcessException.

If there are multiple UIMA-AS services, consider running batch jobs with DUCC. On a processing timeout DUCC will automatically kill and restart the affected JobProcess (JP), which in general contains multiple pipeline instances, each running in a different thread, and resubmit the other documents that were co-resident in the JP.

If the processing timeout and/or OOM conditions are being caused by large documents, use a CasMultiplier to split up large documents and run them in pieces, and use another CasMultiplier to reassemble the pieces. Core UIMA has some sample code that does this.

In general a JVM does not survive an OOM unharmed.

Eddie

On Sun, Mar 6, 2016 at 7:22 PM, Miguel Alvarez <[email protected]> wrote:

> Hi Eddie,
>
> Thanks for the prompt reply, and the clarification. Now I know : )
>
> At the moment I am just exploring options, and I am not sure the best
> option for me is to terminate the UIMA-AS service. Sometimes the processing
> of a document takes very long (due to the logic the engines have), and I
> would like to skip the processing of those documents altogether but without
> having to terminate the service (basically just cancel the processing of
> that document but maintain the service running so it can process the next
> one). I am not sure if UIMA-AS supports anything like this.
>
> And on a similar note, in very rare occasions our engines will run out of
> memory. Is there a way for an UIMA-AS service to recover from an OOM error?
>
> Thanks again for your help.
>
> Sincerely,
> Miguel
>
> -----Original Message-----
> From: Eddie Epstein [mailto:[email protected]]
> Sent: March 6, 2016 13:57
> To: [email protected]
> Subject: Re: UIMA-AS: How to configure processing timeouts?
>
> Hi,
>
> I think UIMA-AS documentation is missing important information on the
> processCas timeout, which is that this timeout only works for remote
> delegates. Arbitrary in-process code cannot be interrupted without
> destroying the JVM. Since the input CAS sent to remote delegates is saved
> locally, the error handling allows that delegate to be skipped and
> processing on the CAS to continue.
>
> Would it be useful for you in this situation, timeout for an in-process
> delegate, to simply have the UIMA-AS service terminate?
>
> Eddie
>
>
> On Sun, Mar 6, 2016 at 4:36 PM, Miguel Alvarez <[email protected]>
> wrote:
>
> > Hi,
> >
> > When using UIMA-AS I would like to have a timeout that would stop the
> > processing of a CAS document in case it is taking too long. When I
> > read the documentation it seems like this should be possible, but for
> > some reason it doesn't seem to be working for me, probably because I
> > am configuring something wrong. I am using version 2.6.0 and Java 7
> >
> > The UIMA-AS deployment descriptor contains a top level engine that is
> > an aggregate engine descriptor which contains only one primitive AE
> > using a "fixed flow". Find below the descriptor I am using (in this
> > case the processing of a document should timeout after 1 second,
> > right?), and I have tried multiple combinations of the settings, but
> > for some reason I am not able to make the processing of a CAS timeout.
> >
> > I have also tried the timeout on the client side, and that one works fine.
> >
> > What am I doing wrong? Does the delegate need to be a remote service
> > for these timeouts to work? Does the delegate need to be pointing to
> > another aggregate engine that wraps the primitive engine?
> >
> > Thanks,
> >
> > Miguel
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <analysisEngineDeploymentDescription xmlns="http://uima.apache.org/resourceSpecifier">
> >   <name>UIMA-AS MyEngine</name>
> >   <description>Processes the document.</description>
> >   <version/>
> >   <vendor/>
> >   <deployment protocol="jms" provider="activemq">
> >     <casPool numberOfCASes="1" initialFsHeapSize="2000000"/>
> >     <service>
> >       <inputQueue endpoint="MyService" brokerURL="${defaultBrokerURL}" prefetch="1"/>
> >       <topDescriptor>
> >         <import location="MyAggregate.xml"/>
> >       </topDescriptor>
> >       <analysisEngine async="true">
> >         <delegates>
> >           <analysisEngine key="MyPrimitiveEngine" async="false">
> >             <scaleout numberOfInstances="2"/>
> >             <casMultiplier poolSize="1" initialFsHeapSize="2000000" processParentLast="false"/>
> >             <asyncAggregateErrorConfiguration>
> >               <getMetadataErrors maxRetries="0" timeout="0" errorAction="terminate"/>
> >               <processCasErrors maxRetries="5" timeout="1000" continueOnRetryFailure="false" thresholdCount="10" thresholdWindow="10000" thresholdAction="terminate"/>
> >               <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
> >             </asyncAggregateErrorConfiguration>
> >           </analysisEngine>
> >         </delegates>
> >         <asyncPrimitiveErrorConfiguration>
> >           <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
> >           <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
> >         </asyncPrimitiveErrorConfiguration>
> >       </analysisEngine>
> >     </service>
> >   </deployment>
> > </analysisEngineDeploymentDescription>
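
On the first suggestion in the reply at the top of this thread (periodically testing document processing time in the application logic), here is a minimal Java sketch of the pattern. It is deliberately UIMA-free so that it compiles on its own: `DocumentTimeoutException`, `ProcessingDeadline` and `SentenceCounter` are hypothetical names, and the per-sentence loop is only an assumed shape for the work. In a real annotator the check would sit inside the `process` method and throw AnalysisEngineProcessException instead of the stand-in exception.

```java
import java.util.List;

/** Stand-in for AnalysisEngineProcessException (assumption: kept local so the sketch needs no UIMA jars). */
class DocumentTimeoutException extends Exception {
    DocumentTimeoutException(String message) {
        super(message);
    }
}

/** Tracks a per-document time budget, started when the object is created. */
class ProcessingDeadline {
    private final long budgetMillis;
    private final long deadlineNanos;

    ProcessingDeadline(long budgetMillis) {
        this.budgetMillis = budgetMillis;
        this.deadlineNanos = System.nanoTime() + budgetMillis * 1_000_000L;
    }

    /** True once the budget is used up; the subtraction form is safe against nanoTime overflow. */
    boolean expired() {
        return System.nanoTime() - deadlineNanos >= 0;
    }

    /** Call this at regular points in the processing loop. */
    void check() throws DocumentTimeoutException {
        if (expired()) {
            throw new DocumentTimeoutException("document exceeded " + budgetMillis + " ms budget");
        }
    }
}

/** Hypothetical engine logic: one unit of work per sentence, with a deadline check before each. */
class SentenceCounter {
    static int process(List<String> sentences, long budgetMillis) throws DocumentTimeoutException {
        ProcessingDeadline deadline = new ProcessingDeadline(budgetMillis);
        int processed = 0;
        for (String sentence : sentences) {
            deadline.check();   // abandon this document, the service itself keeps running
            processed++;        // placeholder for the real per-sentence analysis
        }
        return processed;
    }
}
```

The check only helps if the loop body returns regularly; a single call that blocks for a long time cannot be cut short this way.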

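
On the suggestion of splitting large documents, the following is a rough sketch of only the split and reassemble bookkeeping, not of the CasMultiplier API itself. The names `Piece` and `DocumentSplitter` and the choice to cut on whitespace are assumptions made for illustration. Each piece keeps its offset into the original text so that results computed on a piece can later be mapped back to positions in the full document.

```java
import java.util.ArrayList;
import java.util.List;

/** One segment of a larger document, with its start position in the original text. */
class Piece {
    final int offset;
    final String text;

    Piece(int offset, String text) {
        this.offset = offset;
        this.text = text;
    }
}

/** Hypothetical helper: splits text into pieces of at most maxChars and joins them again. */
class DocumentSplitter {

    static List<Piece> split(String doc, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<Piece> pieces = new ArrayList<>();
        int start = 0;
        while (start < doc.length()) {
            int end = Math.min(start + maxChars, doc.length());
            if (end < doc.length()) {
                // Prefer to cut just after the last whitespace in the window so words stay whole.
                int ws = lastWhitespace(doc, start, end);
                if (ws > start) {
                    end = ws + 1;
                }
            }
            pieces.add(new Piece(start, doc.substring(start, end)));
            start = end;
        }
        return pieces;
    }

    /** Index of the last whitespace character in (from, to), or -1 if there is none. */
    private static int lastWhitespace(String doc, int from, int to) {
        for (int i = to - 1; i > from; i--) {
            if (Character.isWhitespace(doc.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Concatenates the pieces in order; assumes they are contiguous and sorted by offset. */
    static String reassemble(List<Piece> pieces) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : pieces) {
            sb.append(p.text);
        }
        return sb.toString();
    }
}
```

In a pipeline the reassembly step would also shift each result found in a piece by that piece's `offset`; the sketch leaves that out.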