Can't amfd just exit when receiving mds down for amfnd?

> -----Original Message-----
> From: Mathivanan Naickan Palanivelu [mailto:[email protected]]
> Sent: 13 January 2014 13:47
> To: Hans Feldt; Venkata Mahesh Alla; Anders Widell; Nagendra Kumar
> Cc: [email protected]
> Subject: RE: [devel] Possible Time delay between AMFD exit and DTM exit
> during opensaf stop
>
> Another issue can occur because of a slow exit of AMFD on the node going
> down, i.e. during the 'opensafd stop' flow. I think the local AMFD should
> mark the local node as "ABSENT" upon receiving the down event of the local
> AMFND, as below:
>
> diff --git a/osaf/services/saf/amf/amfd/ndfsm.cc b/osaf/services/saf/amf/amfd/ndfsm.cc
> --- a/osaf/services/saf/amf/amfd/ndfsm.cc
> +++ b/osaf/services/saf/amf/amfd/ndfsm.cc
> @@ -321,6 +321,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb
>      // Do nothing if the local node goes down. Most likely due to system shutdown.
>      // If node director goes down due to a bug, the AMF watchdog will restart the node.
>      if (node->node_info.nodeId == cb->node_id_avd) {
> +        node->node_state = AVD_AVND_STATE_ABSENT;
>          TRACE("Ignoring down event for local node director");
>          goto done;
>      }
>
> This is because, if for some reason there is a small delay before the AMFD
> exits (through amfd's stop script) as described below, then during this
> delay the other controller would have already become ACTIVE, and the local
> active AMFD would have received a CLM cluster track callback that marks the
> local node (going down) as exiting the cluster.
>
> Without the above protection (or similar), it can lead to other problems.
>
> Comments?
>
> Thanks,
> Mathi.
>
> > -----Original Message-----
> > From: Mathivanan Naickan Palanivelu
> > Sent: Friday, January 10, 2014 10:45 PM
> > To: Hans Feldt; anders.widell; Venkata Mahesh Alla
> > Cc: [email protected]
> > Subject: [devel] Possible Time delay between AMFD exit and DTM exit
> > during opensaf stop
> >
> > Hi,
> >
> > We might have discussed this before, but I think there is a small chance
> > for the following to happen during the opensafd stop scenario.
> >
> >      for cmd in `ls $pkgclcclidir/osaf-*`; do
> >          # skip dtm here to allow shutdown of other services (e.g. amfd)
> > ===>     if [ "$cmd" != "$pkgclcclidir/osaf-dtm" ] && [ "$cmd" != "$pkgclcclidir/osaf-transport-monitor" ]; then
> >              $cmd stop >/dev/null 2>&1
> >          fi
> >      done
> > [Mathi]
> > The AMFD clc-cli script would have been invoked because of the above
> > lines. (This does not necessarily mean that the script has finished
> > execution!)
> >
> >      if [ "$MDS_TRANSPORT" = "TIPC" ]; then
> >          unload_tipc
> >      else
> >          # stop dtm, now all dependent services should be stopped
> > ====>    $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
> > [Mathi]
> > By the time osaf-dtm is killed, there is a possibility that osaf-amfd has
> > still not exited.
> > Is it possible? If so, we might have to check and wait for the amfd pid
> > to disappear before doing the kill here?
> >
> >          rm -f $pkglocalstatedir/osaf_dtm_intra_server
> >      fi
> >
> > What do you say?
> >
> > Thanks,
> > Mathi.
> >
> > ------------------------------------------------------------------------------
> > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> > Learn Why More Businesses Are Choosing CenturyLink Cloud For Critical
> > Workloads, Development Environments & Everything In Between.
> > Get a Quote or Start a Free Trial Today.
> > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Opensaf-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
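
For reference, a minimal sketch of the "check and wait for the amfd pid to disappear" idea raised in the original message, assuming the stop script can obtain the amfd pid somehow. The function name wait_for_pid_exit, the 10 second default timeout, and the pid file path in the usage comment are hypothetical and not taken from the OpenSAF scripts. It also assumes the caller is allowed to signal the process (e.g. runs as root), since `kill -0` fails for permission errors as well.

```shell
#!/bin/sh
# Sketch only: block until a process has exited, with an upper bound on the wait.
# Usage: wait_for_pid_exit PID [TIMEOUT_SECONDS]
# Returns 0 once the process is gone, 1 if it is still alive after the timeout.
# Note: POSIX sh has no "local", so the wfe_* variables are global.
wait_for_pid_exit() {
    wfe_pid="$1"
    wfe_timeout="${2:-10}"
    wfe_elapsed=0
    # "kill -0" sends no signal; it only tests whether the pid can be signalled.
    while kill -0 "$wfe_pid" 2>/dev/null; do
        if [ "$wfe_elapsed" -ge "$wfe_timeout" ]; then
            return 1
        fi
        sleep 1
        wfe_elapsed=$((wfe_elapsed + 1))
    done
    return 0
}

# Hypothetical use just before the "osaf-dtm stop" line (pid file path is assumed):
#   amfd_pidfile="/var/run/opensaf/osaf-amfd.pid"
#   [ -f "$amfd_pidfile" ] && wait_for_pid_exit "$(cat "$amfd_pidfile")" 10
```

The timeout keeps the stop flow from hanging forever if amfd never exits; what to do in that case (proceed anyway, or force a kill) is a separate decision.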
