Activity Failure and Recovery
Activity Failure and Recovery has been edited by Alex Boisvert (May 09, 2007). Content:

There are several types of error conditions. In this document we introduce a class of error condition called failures, distinct from faults, and describe how failures are caught and handled by the process engine.

A service returns a fault in response to a request it cannot process. A process may also raise a fault internally when it encounters a terminal error condition, e.g. a faulty expression or a false join condition. In addition, processes may raise faults in order to terminate normal processing.

In contrast, failures are non-terminal error conditions that do not affect the normal flow of the process. We keep the process definition simple and straightforward by delegating failure handling to the process engine and the administrator.

For example, when the process is unable to perform DNS resolution to determine the service endpoint, it generates a failure. An administrator can fix the DNS server and tell the process engine to retry the activity. Had the DNS error been reported as a fault, the process would either terminate or require complex fault handling and recovery logic to proceed past this point of failure.

In short, failures shield the process from common, non-terminal error conditions, so process definitions stay simple and straightforward and do not need to account for these error conditions.

From Failure to Recovery

Currently, the Invoke activity is the only activity that supports failure handling and recovery. The mechanism will be identical for any other activities that support failure handling and recovery in the future.

In the case of the Invoke activity, a failure condition is triggered by the integration layer, in lieu of a response or fault message. The Invoke activity consults its failure handling policy (see Specifying Failure Behavior) and decides how to respond.

Set faultOnFailure to yes if you want the activity to throw a fault on failure. All other failure handling settings are then ignored. The activity will throw the activityFailure fault. The activityFailure fault is a standard fault, so you can use the exitOnStandardFault attribute to control whether the process exits immediately or throws a fault in the enclosing scope.

Set retryFor to a positive integer if you want the activity to attempt self-recovery and retry up to that number of times. Set retryDelay to a reasonable time delay (specified in seconds) between retries. For example, if you set retryFor=2, retryDelay=30, the activity will retry after 30 and 60 seconds, for a total of three attempts, before entering activity recovery mode. If the activity retries and succeeds, it will complete successfully as if no failure occurred. Of course, the activity may retry and fault, e.g. if the invoked service returns a fault.

If the activity has exhausted all retry attempts, it enters activity recovery mode. By default retryFor is zero, and the activity enters recovery mode after the first failure.

When in recovery mode, you can recover the activity in one of three ways: retry, fault or cancel.

Specifying Failure Behavior

Use the failureHandling extensibility element defined in the namespace http://ode.apache.org/activityRecovery:

<ext:failureHandling xmlns:ext="http://ode.apache.org/activityRecovery">
    <ext:faultOnFailure> boolean </ext:faultOnFailure>
    <ext:retryFor> integer </ext:retryFor>
    <ext:retryDelay> integer </ext:retryDelay>
</ext:failureHandling>

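
As an illustration, the retry example from the text (retryFor=2, retryDelay=30) might be written roughly as follows. This is a sketch under assumptions, not taken from the page: it assumes the failureHandling element is placed as a child of the Invoke activity and that the boolean is written as true/false. The partner link, operation and variable names are hypothetical.

```xml
<!-- Sketch only: element placement inside <bpel:invoke> is an assumption. -->
<bpel:invoke xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
             partnerLink="orderServicePL"
             operation="submitOrder"
             inputVariable="orderRequest"
             outputVariable="orderResponse">
    <!-- partnerLink, operation and variable names above are hypothetical -->
    <ext:failureHandling xmlns:ext="http://ode.apache.org/activityRecovery">
        <!-- Do not convert the failure into a fault; "false" assumes an XML boolean -->
        <ext:faultOnFailure>false</ext:faultOnFailure>
        <!-- Retry up to two times after the first failed attempt -->
        <ext:retryFor>2</ext:retryFor>
        <!-- Delay between retries, in seconds -->
        <ext:retryDelay>30</ext:retryDelay>
    </ext:failureHandling>
</bpel:invoke>
```

With these values the activity would make up to three attempts in total and, if all of them fail, enter activity recovery mode, as described above.
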
Use the recoverActivity operation to perform a recovery action on an activity in recovery mode. The operation requires the process instance ID, the activity instance ID and the recovery action to perform (one of retry, fault or cancel). You can also determine when failure or recovery occurred for a given activity instance from the execution log.