The end user of "recent releases", i.e. previous releases, has not seen this
problem.
At least, no report of the problem was created until a few weeks ago.
It only occurs in overload situations and has only been seen in recent testing 
with an overloaded system.
New ways of testing are always good.
But testing overload on a system with no load regulation will always find the 
next bottleneck symptom.
We can play that game indefinitely,
adding defect upon defect.
Or we can provide some form of load-regulation mechanism for OpenSAF.

It is also ironic that we need to fix this particular overload issue on old
releases at the same time as we are ripping up existing time release plans
and suddenly declaring we are going to one-track development.

Personally I am increasingly frustrated by the deterioration in following the 
rules of the ticket system.
Why not just drop the distinction between enhancement and defect?
No one seems to care (or bother) about this distinction any more.

The main reason for the distinction (I thought) was to provide an increased
degree of stability on older branches.

New features always mean new risk, at least in the short term, i.e. in the
first release where a new feature (enhancement) appears.
But no one seems to care about that.

/AndersBj






From: Mathi Naickan [mailto:[email protected]]
Sent: den 25 augusti 2015 16:37
To: [email protected]
Subject: [tickets] [opensaf:tickets] #1448 smf: Make campaigns less fragile by 
retrying on ERR_NO_RESOURCES


I think it is more unfair to the end users of recent releases not to pass on
the benefit of an optimization or fix for an issue just because it was
uncovered/hit late! Especially when the fix does no harm and only helps the
campaign succeed. Maybe in the case of this ticket there is more to help the
user and nothing to harm the code path! Also, the facts that this is not a
newly introduced error code, and that IMM API users have not met the
expectation set by IMM to handle this as TRY_AGAIN, call for this to be a
defect.

________________________________

[tickets:#1448]<http://sourceforge.net/p/opensaf/tickets/1448/> smf: Make 
campaigns less fragile by retrying on ERR_NO_RESOURCES

Status: unassigned
Milestone: future
Created: Fri Aug 14, 2015 07:09 AM UTC by Anders Bjornerstedt
Last Updated: Tue Aug 25, 2015 11:14 AM UTC
Owner: nobody

The SMF service is a heavy user of the IMM service.
The IMM has an established client pattern for ERR_TRY_AGAIN which gives an
application real-time control over how long it is prepared to wait for a
transient inability of the IMM service to fulfill a request.
Each TRY_AGAIN response should in itself be fast, so the application needs a
delay in its retry loop.

There is also the very similar error code ERR_NO_RESOURCES. Logically that
error code is identical to TRY_AGAIN in that the request could not be
accepted through no fault of the client but due to some more or less
temporary problem in the IMM service. The difference is that NO_RESOURCES
has no real-time ambitions. Typically this error code is used by the IMM
when the IMM cannot fulfill a request for reasons that are outside the
control of the IMM service. Also, the time from request to a response of
ERR_NO_RESOURCES may be long.

The SMF service in general has no real-time requirements. The main goal for
the SMF service is to successfully complete correctly formulated campaigns.
This means that the SMF service should be programmed to avoid unnecessary
fragility related to temporary problems, even if the temporary problem could
linger for seconds or minutes.

The alternative, aborting the campaign, itself discards potentially large
amounts of execution time already completed. It may sometimes even result in
a system restore.

This means that SMF campaigns should have a "retry loop" that handles not
just TRY_AGAIN, but also ERR_NO_RESOURCES where this return code is relevant
(can be returned according to the API spec). The error code ERR_BUSY also
exists and is for all practical purposes identical to ERR_NO_RESOURCES in
semantics, both logical and timing.
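
The retry loop described above could look roughly like the sketch below. It
is only an illustration, not the SMF implementation: the rc_t codes are
hypothetical stand-ins for the SaAisErrorT values named in this ticket, and
imm_op_fn, is_transient, retry_imm_op and flaky_op are made-up names. The
attempt limit and delay are assumed to be chosen by the caller; since SMF has
no real-time requirements, they could be generous.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the SaAisErrorT codes named in the ticket. */
typedef enum {
    RC_OK,
    RC_TRY_AGAIN,
    RC_NO_RESOURCES,
    RC_BUSY,
    RC_FAILED /* any other, non-transient error */
} rc_t;

/* Hypothetical signature for one attempt at an IMM request. */
typedef rc_t (*imm_op_fn)(void *ctx);

/* The three codes that the ticket proposes to treat as "retry later". */
static bool is_transient(rc_t rc)
{
    return rc == RC_TRY_AGAIN || rc == RC_NO_RESOURCES || rc == RC_BUSY;
}

/*
 * Call op until it returns a non-transient code or max_attempts calls
 * have been made, sleeping delay_ms between attempts. Returns the last
 * code seen (RC_FAILED if max_attempts is 0).
 */
static rc_t retry_imm_op(imm_op_fn op, void *ctx,
                         unsigned max_attempts, unsigned delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (long)(delay_ms % 1000) * 1000000L
    };
    rc_t rc = RC_FAILED;

    assert(op != NULL);
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        rc = op(ctx);
        if (!is_transient(rc))
            return rc; /* success or a real error: stop retrying */
        if (attempt + 1 < max_attempts)
            nanosleep(&delay, NULL); /* back off before the next try */
    }
    return rc; /* still transient: the caller decides whether to abort */
}

/*
 * Simulated request for demonstration only: reports NO_RESOURCES
 * *remaining times, then succeeds.
 */
static rc_t flaky_op(void *ctx)
{
    unsigned *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return RC_NO_RESOURCES;
    }
    return RC_OK;
}
```

With a counter set to 2, retry_imm_op(flaky_op, &counter, 5, 100) makes three
calls and returns RC_OK. If the retry budget runs out first, the last
transient code is returned, so the caller can still choose between waiting
longer and failing the campaign step.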
