On 08/19/2013 04:33 PM, Suryanarayana Garlapati wrote:
> In my perspective, this should be done at the Active only, and the standby
> should just drop it. At least we will have the standby, which gets promoted
> to active and continues to provide the service.
> We should not be performing a cluster reset.
>
> Thoughts?

The AMF code in question (event handling for the MDS callback QUIESCED) can
only be invoked in AMFD state QUIESCED AND SWITCH-OVER state. See
http://devel.opensaf.org/~hafe/AMF/ControllerSwitchover.png

It is not designed to be invoked in STANDBY state. Besides, STANDBY->QUIESCED
is not a valid transition.

/Hans

>
>
> On Friday 16 August 2013 07:03 PM, Hans Feldt wrote:
>>  osaf/services/saf/avsv/avd/avd_role.cc |  9 +++++++++
>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>
>>
>> MDS can force an active vdest into quiesced state (see docs). Reasons for
>> this happening are unclear. The logic in avd_mds_qsd_role_evh() can only
>> handle this event in the context of a controller switch-over. Otherwise it
>> could e.g. hang in using IMM, which eventually times out and calls abort(),
>> generating a core dump.
>>
>> Instead, exit the amfd process when this event happens in non controller
>> switch-over state. amfnd will failfast reboot the node when it detects this.
>>
>> diff --git a/osaf/services/saf/avsv/avd/avd_role.cc b/osaf/services/saf/avsv/avd/avd_role.cc
>> --- a/osaf/services/saf/avsv/avd/avd_role.cc
>> +++ b/osaf/services/saf/avsv/avd/avd_role.cc
>> @@ -569,6 +569,15 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb,
>>       TRACE_ENTER();
>> +     /* Only accept this event in controller switch-over state, in other
>> +      * states it is invalid and indicates severe cluster problems.
>> +      */
>> +     if (cb->swap_switch == SA_FALSE) {
>> +             LOG_NO("%s: MDS unexpectedly changed role to QUIESCED", __FUNCTION__);
>> +             LOG_CR("Controller split brain detected, exiting");
>> +             _exit(EXIT_FAILURE); // should never get here...
>> +     }
>> +
>>       /* Give up IMM OI implementer role */
>>       if ((rc = immutil_saImmOiImplementerClear(cb->immOiHandle)) != SA_AIS_OK) {
>>               LOG_ER("FAILOVER Active --> Quiesced FAILED, ImplementerClear failed %u", rc);
>>
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
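
For readers following the thread, the guard in the patch can be pictured as a small decision function. This is only a sketch under stated assumptions, not the amfd code: ControllerState, QuiescedAction and on_mds_quiesced() are hypothetical names made up for illustration, and the single bool field stands in for cb->swap_switch from the quoted diff.

```cpp
// Hypothetical stand-in for the part of AVD_CL_CB that the guard inspects.
// swap_switch mirrors cb->swap_switch in the patch: true only while a
// controller switch-over is in progress.
struct ControllerState {
    bool swap_switch;
};

// What the handler would do next (illustrative names, not OpenSAF API).
enum class QuiescedAction {
    ProceedWithSwitchover,  // continue with the existing switch-over logic
    ExitProcess             // log and exit; amfnd is expected to failfast
};

// Decide how to react when MDS reports that the vdest became QUIESCED.
// Outside a controller switch-over the event is treated as invalid.
QuiescedAction on_mds_quiesced(const ControllerState &state) {
    if (!state.swap_switch) {
        return QuiescedAction::ExitProcess;
    }
    return QuiescedAction::ProceedWithSwitchover;
}
```

Separating the decision from the side effects (logging, _exit) in this way is just a convenience for the illustration; the patch itself performs the check and the exit inline in avd_mds_qsd_role_evh().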
