Since the use of PRTOs by SMF is *only* meaningful for campaigns that
involve a cluster reboot step, I think it should be possible for SMF
to change its PRTOs to RTOs and instead handle the persistence needs for
the rare cluster reboot campaigns separately. 

The naive solution would be to have two classes for each logical imm class
needed by campaigns. One class defines a PRTO and the other "sibling"
class defines an RTO, with the same attributes in both.
But ideally there should be a more elegant and less naive/bulky solution.

The basis should be that SMF RTOs are non persistent RTOs and then
focus on how to persistify whatever needs to be persistified for
cluster reboot campaigns (which are the rare exception). 

If no simpler solution exists, then I would still like to see a solution
where SMF provides two versions of each of its campaign imm classes.

Note that for normal campaigns (not involving a cluster restart step) there
are still other reasons for disabling PBE in some phases. That would be
any phase where an UNPLANNED cluster reboot has to be escalated to a
restore. Having PBE enabled during such a phase is worse than pointless:
it slows things down and increases the risk of campaign failure, while
the incremental persistence it provides is immediately discarded in the
one use case it is there for.





---

**[tickets:#128] SMF: Remove SMFs extensive use of persistent runtime attributes.**

**Status:** unassigned
**Created:** Mon May 13, 2013 08:56 AM UTC by Ingvar Bergström
**Last Updated:** Mon May 13, 2013 08:56 AM UTC
**Owner:** nobody

http://devel.opensaf.org/ticket/2432

SMF makes extensive use of persistent runtime attributes in the imm.
The SMF standard apparently does not explicitly state that these
runtime attributes are persistent, which in my world means they
are not persistent according to the standard. The OpenSAF SMF
implementation nevertheless interprets the standard as saying that
they should be persistent.

As explained in ticket #2431, the way that SMF uses persistent
runtime attributes actually causes more problems than it solves.
The only problem it "solves" is that in the rare case when a
campaign involves a planned cluster restart, SMF can persist
its campaign state (using an explicit immdump since PBE must be
disabled) in such a way that, after the planned cluster restart
step, SMF can continue the campaign.

There must be a better way for the SMF to checkpoint the necessary
campaign state. 

An SMF campaign can be seen as a huge transaction. You want the
entire campaign to succeed atomically and persistently or 
rollback atomically and persistently. 

The problem with using persistent runtime attributes is that they
can not be included in any transaction (ccb in the imm). 
Every single persistent runtime attribute update is a separate and 
independent action. 

The SMF runtime attributes should remain as non persistent runtime
attributes, reflecting the campaign state *towards* the operator,
but not be used by the SMF as a store and retrieval utility for
fetching its own campaign state after a cluster restart step
(a non runtime attribute use of runtime attributes).

A much better solution is for the SMF to create private config
data where it stores any campaign state that needs to survive
a cluster restart and does so using CCBs (atomic write 
transactions). 
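
The difference between independent attribute updates and a CCB can be illustrated with a toy model. This is only a sketch: `ToyConfigStore`, `apply_ccb` and `update_one_by_one` are hypothetical names, the store is a plain in-memory dict, and nothing here is the IMM API.

```python
class ToyConfigStore:
    """In-memory stand-in for a config store with CCB-like writes (not the IMM API)."""

    def __init__(self):
        self.data = {}

    def apply_ccb(self, changes, validate=lambda staged: True):
        """Apply all changes or none of them."""
        staged = dict(self.data)
        staged.update(changes)
        if not validate(staged):
            return False      # aborted: self.data is untouched
        self.data = staged    # committed: all changes become visible together
        return True


def update_one_by_one(store, changes, fail_after):
    """Mimic independent attribute updates: an interruption leaves partial state."""
    for i, (key, value) in enumerate(changes.items()):
        if i == fail_after:
            raise RuntimeError("interrupted, e.g. by a cluster restart")
        store.data[key] = value
```

If `update_one_by_one` is interrupted after the first attribute, the store holds a half-written campaign state. With `apply_ccb` the store holds either the old state or the complete new one.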

For most campaigns (those that do not have a cluster restart step),
SMF would only store a marker saying that the campaign is not yet
finished, so that any unplanned cluster restart escalates to a
restore.

And for the rare and exceptional campaigns that do require a
cluster restart step, it would execute a ccb storing the needed
campaign state just before the cluster restart step, in the same 
place where it today invokes an immdump to checkpoint the imm
database.

With this enhancement in place, campaigns will be safer and the 
imm PBE need not be disabled during campaigns. 

Some campaigns that do encounter an unplanned cluster restart 
would then not have to be escalated to a *restore*. Instead SMF
could in some cases perform a rollback after the unplanned 
cluster restart. 
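
A minimal sketch of the marker and checkpoint logic described above, assuming the private config data can be modelled as a dict. The key names, the `Recovery` enum and the `rollback_possible` flag are all hypothetical. In the real proposal each write would be a CCB.

```python
from enum import Enum


class Recovery(Enum):
    NOTHING_TO_DO = "no campaign in progress"
    RESUME = "continue campaign after planned restart"
    ROLLBACK = "roll back the campaign"
    RESTORE = "escalate to a restore"


# Hypothetical key names; `config` stands in for SMF's private config data.
IN_PROGRESS = "campaignInProgress"
CHECKPOINT = "plannedRestartCheckpoint"


def campaign_started(config):
    config[IN_PROGRESS] = True


def before_planned_restart(config, campaign_state):
    # In the proposal this write would be a single CCB.
    config[CHECKPOINT] = dict(campaign_state)


def campaign_finished(config):
    config.pop(IN_PROGRESS, None)
    config.pop(CHECKPOINT, None)


def classify_restart(config, rollback_possible=False):
    """Decide what to do when SMF comes up after a cluster restart."""
    if not config.get(IN_PROGRESS):
        return Recovery.NOTHING_TO_DO, None
    if CHECKPOINT in config:
        # Consume the checkpoint so a later unplanned restart is not
        # mistaken for a planned one.
        return Recovery.RESUME, config.pop(CHECKPOINT)
    if rollback_possible:
        return Recovery.ROLLBACK, None
    return Recovery.RESTORE, None
```

When SMF would be able to set `rollback_possible` is left open here, as the text only says that a rollback could be performed "in some cases".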



Changed 16 months ago by anders

    description modified (diff)

Changed 16 months ago by anders

The best way for SMF to completely avoid the need for persistent runtime
attributes is to not allow any step that is a cluster reboot, except if it is
the last step in the campaign.

Any additional steps needed after the cluster reboot would have to be taken in
a new campaign.

The only problem with this is that it would require one manual step, namely for
the operator to invoke the second campaign. The fact that the operator has
managed to invoke the first campaign makes it likely but not certain that they
could manage to invoke the second campaign.

Seeing as some operators could forget to do this, an elaboration would be to
have some simple bridging mechanism. As it turns out, that bridging mechanism
has to be there already anyway. SMF has to have some way of finding out, after
a cluster restart, if it was a planned restart or not. If it was not planned,
it must invoke a restore. If it was planned, then there could also be just
enough information to point at a continuation campaign.
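
The bridging mechanism could be as small as the following sketch. `RestartBridge`, its fields and the action strings are hypothetical names chosen for illustration, and the function is assumed to be called only when a campaign was running at the time of the restart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestartBridge:
    """Hypothetical record that survives the cluster restart."""
    planned: bool
    continuation_campaign: Optional[str] = None


def action_after_restart(bridge):
    """Return (action, campaign_name) for a restart that hit a running campaign."""
    if bridge is None or not bridge.planned:
        return ("restore", None)            # unplanned restart
    if bridge.continuation_campaign:
        return ("start_campaign", bridge.continuation_campaign)
    return ("done", None)                   # planned reboot was the last step
```

With this shape, the operator no longer has to remember to start the second campaign: the record written before the planned reboot names it.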

