---
** [tickets:#686] SMF: Support fallback to pre-upgrade software checkpoint as
alternative to restore**
**Status:** unassigned
**Created:** Fri Dec 20, 2013 10:21 AM UTC by Anders Bjornerstedt
**Last Updated:** Fri Dec 20, 2013 10:21 AM UTC
**Owner:** nobody
This enhancement should possibly have 'osaf' as its component, but I am
placing it on smf so that the work has a "driver"; smf certainly needs to
be involved in it.
Today, if there is an unplanned cluster restart during an upgrade campaign,
the escalation is to immediately "request" a restore. In other words, the
problem is immediately punted outside of OpenSAF.
The problem with this is that OpenSAF does not provide one self-contained
solution framework for handling upgrade failures of OpenSAF or of the software
that is installed on OpenSAF. Some software that runs on OpenSAF may be
extremely complex and require extra handling. But OpenSAF should at least
solve the problem for the "vanilla" application, the same way it takes care
of a cluster restart under normal circumstances: automatically and without
punting any critical part outside of OpenSAF.
This enhancement would add support for:
A) Creating/bundling the current "software version" that OpenSAF is
now executing. This means making a copy of:
1) All RPMs that can be changed in an SMF campaign.
2) All configuration files that can be changed in an SMF
campaign for configuration data not residing in IMM.
3) An immdump of the current IMM state.
B) This bundle is stored on the file-system available to OpenSAF to be ready
to be installed.
C) Support in OpenSAF (AMF?) for catching the case of an unplanned
cluster restart that is triggered during an ongoing SMF campaign.
D) Installing the previous software version before the cluster restart
is executed.
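
Steps A) to D) could be sketched roughly as follows. This is only an
illustration in Python, not an OpenSAF API: the names `Checkpoint`,
`create_checkpoint`, `checkpoint_members` and `choose_recovery` are
hypothetical, the archive layout is an assumption, and the IMM dump file is
assumed to have been produced beforehand.

```python
import tarfile
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, List, Optional


@dataclass
class Checkpoint:
    """Hypothetical pre-upgrade bundle: RPMs, non-IMM config files, IMM dump."""
    archive: Path


def create_checkpoint(rpms: Iterable[Path], config_files: Iterable[Path],
                      imm_dump: Path, dest_dir: Path,
                      name: str = "pre-upgrade") -> Checkpoint:
    """Steps A and B: bundle the current software version and store it."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for rpm in rpms:                       # A1: RPMs a campaign may change
            tar.add(rpm, arcname=f"rpms/{rpm.name}")
        for cfg in config_files:               # A2: config data not held in IMM
            # Keep the original path so files with equal names do not collide.
            tar.add(cfg, arcname="config/" + cfg.as_posix().lstrip("/"))
        tar.add(imm_dump, arcname="imm/imm.xml")  # A3: dump of the IMM state
    return Checkpoint(archive)


def checkpoint_members(checkpoint: Checkpoint) -> List[str]:
    """List the archive contents, e.g. to verify the bundle before upgrading."""
    with tarfile.open(checkpoint.archive, "r:gz") as tar:
        return sorted(tar.getnames())


def choose_recovery(campaign_in_progress: bool,
                    checkpoint: Optional[Checkpoint]) -> str:
    """Steps C and D: decide what to do on an unplanned cluster restart."""
    if not campaign_in_progress:
        return "normal-restart"   # no campaign running: ordinary restart
    if checkpoint is not None and checkpoint.archive.exists():
        return "fallback"         # install the previous version, then restart
    return "restore"              # today's behaviour: escalate outside OpenSAF
```

The point of the sketch is the decision in `choose_recovery`: a restore is
requested only when no usable checkpoint exists, and user data is never part
of the bundle.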
Note that the fallback is only to the previous version of code and
configuration data. The fallback does not and should not rewind user-data
stored in a user-data-repository. This is in fact one way of defining
configuration data, i.e. the only data (besides non-persistent runtime data)
that should be stored in the IMM. Configuration data is data that must be
rewound to a previous state when the software version is rewound. Thus
config data always has a tight relation to the software.
User data, on the other hand, is payload. The reason that OpenSAF exists
is to minimize disturbances to user applications and user data.
A restore would typically also revert user data, which is a catastrophe
and should only need to be done in true catastrophes, hopefully then
externally caused ones. A restore that is not automated will
result in a huge outage of hours or days, and destroy all pretence of HA
for that system's remaining lifetime.
---
Sent from sourceforge.net because [email protected] is
subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at
https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a
mailing list, you can unsubscribe from the mailing list.
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets