Here are some ideas for additional methods that I think should be
added to the Flow Controller API.  Like the XMI work I recently did,
this lays the groundwork for supporting more complex flow and error
handling options.

1)  ParallelStep - a new subtype of Step that the Flow Controller can
return.  The ParallelStep constructor would take a List of keys,
indicating that multiple AEs could logically be run in parallel.  Note
that the runtime may or may not actually execute them in parallel - in
a collocated deployment it certainly would not.  In a remote
deployment we can (eventually) support parallel execution by making
use of the XMI merging support I put in.
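
A rough sketch of what #1 might look like.  This is only an
illustration: the Step class here is a stand-in so the example compiles
on its own, and the accessor name getAnalysisEngineKeys is an assumed
placeholder, not something already in the API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for the framework's Step base type, so this sketch compiles alone.
abstract class Step {}

// Hypothetical ParallelStep: names several AEs that could logically run in
// parallel.  Whether they really do is left entirely to the runtime.
final class ParallelStep extends Step {
    private final List<String> analysisEngineKeys;

    ParallelStep(List<String> analysisEngineKeys) {
        if (analysisEngineKeys == null || analysisEngineKeys.isEmpty()) {
            throw new IllegalArgumentException("at least one AE key is required");
        }
        // Defensive copy so later changes to the caller's list do not leak in.
        this.analysisEngineKeys =
            Collections.unmodifiableList(new ArrayList<>(analysisEngineKeys));
    }

    // Assumed accessor name; returns the keys in the order they were given.
    List<String> getAnalysisEngineKeys() {
        return analysisEngineKeys;
    }
}
```

A collocated runtime could simply iterate over the keys and run each AE
in turn, which keeps the step meaningful even without real parallelism.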

2) Dynamically adding/removing AEs from the aggregate.  This would be
a new set of Flow Controller APIs:
removeAnalysisEngines(List)
addAnalysisEngines(List)

each of which takes a List of keys.  The AnalysisEngine metadata map
available through the FlowControllerContext would also be updated.

This supports error handling such as the "disable" action in the
current CPM, by allowing a misbehaving AE to be removed from the flow.
Someday it could also allow new AEs to be added to an aggregate
dynamically.

The Flow Controller could throw an exception in response to
removeAnalysisEngines, indicating that the aggregate cannot continue
without the removed AEs (or that the flow controller simply can't
handle dynamic removal - maybe that should be the default in fact).
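
To make #2 concrete, here is a minimal sketch of a flow controller that
accepts removals unless a required AE is involved.  The class name, the
exception type, the notion of "required" keys and getActiveKeys are all
assumptions made for the example; the exception is unchecked only to
keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical exception signalling that the aggregate cannot continue.
class CannotRemoveAnalysisEngineException extends RuntimeException {
    CannotRemoveAnalysisEngineException(String message) {
        super(message);
    }
}

// Hypothetical flow controller that tracks which AE keys are currently active.
class DynamicFlowControllerSketch {
    private final Set<String> activeKeys = new LinkedHashSet<>();
    private final Set<String> requiredKeys;

    DynamicFlowControllerSketch(List<String> initialKeys, List<String> requiredKeys) {
        this.activeKeys.addAll(initialKeys);
        this.requiredKeys = new HashSet<>(requiredKeys);
    }

    void addAnalysisEngines(List<String> keys) {
        activeKeys.addAll(keys);
    }

    void removeAnalysisEngines(List<String> keys) {
        // Check every key first so a rejected call leaves the flow unchanged.
        for (String key : keys) {
            if (requiredKeys.contains(key)) {
                throw new CannotRemoveAnalysisEngineException(
                    "Aggregate cannot continue without AE: " + key);
            }
        }
        activeKeys.removeAll(keys);
    }

    // Snapshot of the active keys, in insertion order.
    List<String> getActiveKeys() {
        return new ArrayList<>(activeKeys);
    }
}
```

A controller that can't handle dynamic removal at all would just throw
unconditionally from removeAnalysisEngines.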

3) Notification of errors to allow continuing after a failure.  This
would support an action like the current CPM's "continue" action.
There would be a new API:
Flow.onFailure(String failedAnalysisEngineKey, Throwable failure)

If the runtime wanted to continue after a failure, it would call this
method on the Flow Controller, and then would go back to calling
hasNext/next.  Without this notification, a "continue" action wouldn't
make much sense, because a dynamic Flow Controller may assume that the
last step it issued completed successfully.
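
As an illustration of why the notification matters for #3, here is a
small sketch of a flow that skips any AE whose prerequisite failed.
Only the onFailure signature comes from the proposal; the class name,
the prerequisite map and the skipping policy are assumptions made up
for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical flow that adapts the remaining steps after a reported failure.
class FailureAwareFlow {
    private final Deque<String> remaining;
    // Maps an AE key to the key of the AE whose output it needs (if any).
    private final Map<String, String> prerequisites;
    // AEs that failed, or were skipped because their prerequisite was missing.
    private final Set<String> unavailable = new HashSet<>();

    FailureAwareFlow(List<String> keys, Map<String, String> prerequisites) {
        this.remaining = new ArrayDeque<>(keys);
        this.prerequisites = prerequisites;
    }

    // Called by the runtime when it wants to continue after a failure.
    void onFailure(String failedAnalysisEngineKey, Throwable failure) {
        unavailable.add(failedAnalysisEngineKey);
    }

    boolean hasNext() {
        while (!remaining.isEmpty()) {
            String prerequisite = prerequisites.get(remaining.peekFirst());
            if (prerequisite == null || !unavailable.contains(prerequisite)) {
                return true;
            }
            // Skip this AE and treat it as unavailable for its own dependents.
            unavailable.add(remaining.pollFirst());
        }
        return false;
    }

    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException("flow is finished");
        }
        return remaining.pollFirst();
    }
}
```

Without the onFailure call, this flow would happily route the CAS to an
AE that depends on results the failed AE never produced.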

4) A Flow.aborted() method to allow clean-up.  This is actually
already in JIRA as UIMA-53.  When an unrecoverable failure occurs, the
Flow Controller should be notified so it can release resources.
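
A minimal sketch of how a runtime might use aborted() from #4.  The
AbortableFlow interface, SimpleFlow, MiniRuntime and the isReleased
check are hypothetical names invented for the example, and the "AE" is
just a callback standing in for real processing.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical minimal view of a Flow that supports the proposed aborted().
interface AbortableFlow {
    boolean hasNext();
    String next();
    void aborted();
}

// Simple fixed flow that records whether it was told to clean up.
class SimpleFlow implements AbortableFlow {
    private final Iterator<String> keys;
    private boolean released = false;

    SimpleFlow(List<String> keys) {
        this.keys = keys.iterator();
    }

    public boolean hasNext() { return keys.hasNext(); }

    public String next() { return keys.next(); }

    // A real flow would release whatever resources it holds here.
    public void aborted() { released = true; }

    boolean isReleased() { return released; }
}

class MiniRuntime {
    // Runs each step in order; on the first failure it notifies the flow
    // and stops.  Returns true only if every step completed.
    static boolean run(AbortableFlow flow, Consumer<String> engine) {
        try {
            while (flow.hasNext()) {
                engine.accept(flow.next());
            }
            return true;
        } catch (RuntimeException e) {
            flow.aborted();
            return false;
        }
    }
}
```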

Note that for #2 and #3 I don't intend to have the existing framework
call these methods yet.  These Flow Controller extensions are a
prerequisite for more advanced flow features like parallel flows
and error recovery.

Questions/comments?

-Adam
