I should also clarify that there is a distinction between (a) getting
ERR_NO_RESOURCES as the direct result of an IMM API call (in the above case a
search or accessorGet used by SMF); and (b) determining that a CCB was aborted
due to a resource error and not a validation error (new API enhancement #1449).
In both cases the request was rejected or aborted for resource reasons, but the
handling of retry is different.

If the user (SMF) gets ERR_NO_RESOURCES returned directly on a call, then that
specific call can be retried.

But if the user (SMF) determines that a CCB has been aborted
(ERR_FAILED_OPERATION) due to a resource failure (return value false on the
argument 'isValidationAbort' of the new API saImmOmCcbGetAbortReason), then a
replay of the whole CCB can be attempted. It makes no sense here to retry the
last CCB-related downcall (ccbApply, ccbValidate, ccbObjectCreate, ...), since
the CCB has been aborted.

This distinction should be simple because in the resource-aborted CCB case you
don't actually get SA_AIS_ERR_NO_RESOURCES as a return code.
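
As a rough illustration of the two cases above, the decision could be sketched
in C as follows. The enum values and the function decide_action are
hypothetical stand-ins invented for this sketch (real code would use
SaAisErrorT from the AIS headers), and the is_validation_abort flag only models
the 'isValidationAbort' result of the proposed saImmOmCcbGetAbortReason from
#1449; no real IMM API is called here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SaAisErrorT codes discussed above. */
typedef enum {
    RC_OK,
    RC_ERR_TRY_AGAIN,
    RC_ERR_NO_RESOURCES,
    RC_ERR_FAILED_OPERATION,
    RC_ERR_OTHER
} rc_t;

typedef enum {
    ACTION_CONTINUE,   /* the call succeeded */
    ACTION_RETRY_CALL, /* repeat only the call that failed */
    ACTION_REPLAY_CCB, /* rebuild and resubmit the whole CCB */
    ACTION_FAIL        /* not recoverable by retrying */
} action_t;

/*
 * Map the result of an IMM downcall to a recovery action.
 * is_validation_abort models the 'isValidationAbort' result of the proposed
 * saImmOmCcbGetAbortReason (#1449); it is only consulted when rc is
 * RC_ERR_FAILED_OPERATION, i.e. when the CCB has been aborted.
 */
action_t decide_action(rc_t rc, bool is_validation_abort)
{
    switch (rc) {
    case RC_OK:
        return ACTION_CONTINUE;
    case RC_ERR_TRY_AGAIN:
    case RC_ERR_NO_RESOURCES:
        /* Case (a): the call itself was rejected, so retry that call. */
        return ACTION_RETRY_CALL;
    case RC_ERR_FAILED_OPERATION:
        /* Case (b): the CCB is gone. Retrying the last downcall is
         * pointless; replay the CCB only for a resource abort. */
        return is_validation_abort ? ACTION_FAIL : ACTION_REPLAY_CCB;
    default:
        return ACTION_FAIL;
    }
}
```

With this mapping, decide_action(RC_ERR_FAILED_OPERATION, false) asks for a
replay of the whole CCB, while decide_action(RC_ERR_NO_RESOURCES, false) asks
only for a retry of the failed call.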

SMF campaign robustness can be improved in both respects once #1449 has been
delivered.


---

**[tickets:#1448] smf: Make campaigns less fragile by retrying on ERR_NO_RESOURCES**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Aug 14, 2015 07:09 AM UTC by Anders Bjornerstedt
**Last Updated:** Tue Aug 25, 2015 10:56 AM UTC
**Owner:** nobody


The SMF service is a heavy user of the IMM service.
The IMM has an established client pattern for ERR_TRY_AGAIN which gives an
application realtime control over how long it is prepared to wait for a
transient inability of the IMM service to fulfill a request.
Each response of TRY_AGAIN should in itself be fast, so the application needs a
delay in its retry loop.

There is also the very similar error code ERR_NO_RESOURCES. Logically that
error code is identical to TRY_AGAIN in that the request could not be accepted
through no fault of the client, but due to some more or less temporary problem
in the IMM service. The difference is that NO_RESOURCES has no realtime
ambitions. Typically this error code is used by the IMM when it cannot fulfill
a request for reasons that are outside the control of the IMM service. Also,
the time from request to a response of ERR_NO_RESOURCES may be long.

The SMF service in general has no realtime requirements. The main goal for the
SMF service is to successfully complete correctly formulated campaigns. This
means that the SMF service should be programmed to avoid unnecessary fragility
related to temporary problems, even if the temporary problem could linger for
seconds or minutes.

The alternative of aborting the campaign discards the potentially large amount
of execution time already invested. It may sometimes even result in a system
restore.

This means that SMF campaigns should have a "retry loop" that handles not just
TRY_AGAIN, but also ERR_NO_RESOURCES where this return code is relevant (can be
returned according to the API spec). The error code ERR_BUSY also exists and is
for all practical purposes identical to ERR_NO_RESOURCES in semantics, both
logical and timing.
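
A minimal sketch of such a retry loop in C is shown below, assuming a POSIX
system for nanosleep. The enum, the callback type imm_call_fn, and the
functions call_with_retry and flaky_call are all hypothetical names made up for
this illustration; they are not part of the IMM API or of SMF, and real code
would use SaAisErrorT and wrap actual IMM calls.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for the SaAisErrorT codes discussed above. */
typedef enum {
    IMM_OK,
    IMM_ERR_TRY_AGAIN,
    IMM_ERR_NO_RESOURCES,
    IMM_ERR_BUSY,
    IMM_ERR_OTHER
} imm_rc_t;

/* Hypothetical callback type wrapping a single IMM call. */
typedef imm_rc_t (*imm_call_fn)(void *ctx);

/* Codes that signal a temporary problem that is not the client's fault. */
static bool is_transient(imm_rc_t rc)
{
    return rc == IMM_ERR_TRY_AGAIN || rc == IMM_ERR_NO_RESOURCES ||
           rc == IMM_ERR_BUSY;
}

/*
 * Invoke call(ctx) until it returns a non-transient code or max_attempts
 * calls have been made, sleeping delay_ms between attempts. Returns the last
 * code seen (IMM_ERR_OTHER if max_attempts is 0).
 */
imm_rc_t call_with_retry(imm_call_fn call, void *ctx,
                         unsigned max_attempts, long delay_ms)
{
    imm_rc_t rc = IMM_ERR_OTHER;
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        rc = call(ctx);
        if (!is_transient(rc))
            break;
        if (attempt + 1 < max_attempts && delay_ms > 0) {
            struct timespec ts = { .tv_sec = delay_ms / 1000,
                                   .tv_nsec = (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return rc;
}

/* Illustration only: fails with NO_RESOURCES until the counter that ctx
 * points to reaches zero, then succeeds. */
imm_rc_t flaky_call(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return IMM_ERR_NO_RESOURCES;
    }
    return IMM_OK;
}
```

For example, with an int counter set to 2, call_with_retry(flaky_call,
&counter, 5, 100) makes three calls and returns IMM_OK. Since SMF has no
realtime requirements, the attempt budget can be generous; a fixed delay is
used here only to keep the sketch short.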

