Well, that should be the way to go then! AMFD cannot (should not!) do anything after the local AMFND goes down!

- Mathi.
> -----Original Message-----
> From: Hans Feldt [mailto:[email protected]]
> Sent: Tuesday, January 14, 2014 2:04 PM
> To: Mathivanan Naickan Palanivelu; Venkata Mahesh Alla; Anders Widell;
> Nagendra Kumar
> Cc: [email protected]
> Subject: RE: [devel] Possible Time delay between AMFD exit and DTM exit
> during opensaf stop
>
> Can't amfd just exit when receiving mds down for amfnd?
>
> > -----Original Message-----
> > From: Mathivanan Naickan Palanivelu [mailto:[email protected]]
> > Sent: 13 January 2014 13:47
> > To: Hans Feldt; Venkata Mahesh Alla; Anders Widell; Nagendra Kumar
> > Cc: [email protected]
> > Subject: RE: [devel] Possible Time delay between AMFD exit and DTM
> > exit during opensaf stop
> >
> > Another issue can arise from a slow exit of AMFD on the node going
> > down, i.e. during the 'opensafd stop' flow. I think the local AMFD
> > should mark the local node as "ABSENT" upon receiving the down event
> > of the local AMFND, as below:
> >
> > diff --git a/osaf/services/saf/amf/amfd/ndfsm.cc b/osaf/services/saf/amf/amfd/ndfsm.cc
> > --- a/osaf/services/saf/amf/amfd/ndfsm.cc
> > +++ b/osaf/services/saf/amf/amfd/ndfsm.cc
> > @@ -321,6 +321,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb
> >         // Do nothing if the local node goes down. Most likely due to system shutdown.
> >         // If node director goes down due to a bug, the AMF watchdog will restart the node.
> >         if (node->node_info.nodeId == cb->node_id_avd) {
> > +               node->node_state = AVD_AVND_STATE_ABSENT;
> >                 TRACE("Ignoring down event for local node director");
> >                 goto done;
> >         }
> >
> > This is because if, for some reason, there is a small delay before the
> > AMFD exits (through amfd's stop script) as described below, then
> > during that delay the other controller would already have become
> > ACTIVE, and the local active AMFD would have received a CLM cluster
> > track callback marking the local node (going down) as exiting the
> > cluster.
> >
> > Without the above protection (or similar), it can lead to other problems.
> >
> > Comments?
> >
> > Thanks,
> > Mathi.
> >
> > > -----Original Message-----
> > > From: Mathivanan Naickan Palanivelu
> > > Sent: Friday, January 10, 2014 10:45 PM
> > > To: Hans Feldt; anders.widell; Venkata Mahesh Alla
> > > Cc: [email protected]
> > > Subject: [devel] Possible Time delay between AMFD exit and DTM exit
> > > during opensaf stop
> > >
> > > Hi,
> > >
> > > We might have discussed this before, but I think there is a small
> > > chance for the following to happen during the opensafd stop scenario.
> > >
> > >     for cmd in `ls $pkgclcclidir/osaf-*`; do
> > >         # skip dtm here to allow shutdown of other services (e.g. amfd)
> > > ===>    if [ "$cmd" != "$pkgclcclidir/osaf-dtm" ] && [ "$cmd" != "$pkgclcclidir/osaf-transport-monitor" ]; then
> > >             $cmd stop >/dev/null 2>&1
> > >         fi
> > >     done
> > > [Mathi]
> > > The AMFD clc-cli script would have been invoked because of the above
> > > lines. (This does not necessarily mean that the script has finished
> > > execution!)
> > >
> > >     if [ "$MDS_TRANSPORT" = "TIPC" ]; then
> > >         unload_tipc
> > >     else
> > >         # stop dtm, now all dependent services should be stopped
> > > ====>   $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
> > > [Mathi]
> > > By the time osaf-dtm is killed, there is a possibility that
> > > osaf-amfd has still not exited.
> > > Is that possible? If so, we might have to check and wait for
> > > the amfd pid to disappear before doing the kill here?
> > >
> > >         rm -f $pkglocalstatedir/osaf_dtm_intra_server
> > >     fi
> > >
> > > What do you say?
> > >
> > > Thanks,
> > > Mathi.
> > >
> > > ------------------------------------------------------------------------------
> > > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> > > Learn Why More Businesses Are Choosing CenturyLink Cloud For
> > > Critical Workloads, Development Environments & Everything In Between.
> > > Get a Quote or Start a Free Trial Today.
> > > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> > > _______________________________________________
> > > Opensaf-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
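
Regarding the suggestion in the original message to check and wait for the amfd pid to disappear before stopping osaf-dtm, a minimal sketch of such a wait in POSIX shell might look as follows. The function name `wait_for_exit`, its timeout argument, and the `amfd_pid` variable in the usage comment are hypothetical and not part of the OpenSAF scripts; how the stop script would actually obtain the amfd pid is left open here.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSAF): wait until a process is gone.
#   $1 = pid to watch
#   $2 = maximum number of seconds to wait (default 10)
# Returns 0 once the pid no longer exists, 1 if it is still alive at timeout.
wait_for_exit() {
    pid=$1
    timeout=${2:-10}
    elapsed=0
    # kill -0 sends no signal; it only checks whether the pid can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Assumed usage in the stop flow, just before stopping dtm
# (amfd_pid is a hypothetical variable holding the amfd pid):
#   wait_for_exit "$amfd_pid" 10 || echo "amfd still running" >&2
#   $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
```

The timeout is there so that a hung amfd cannot block 'opensafd stop' forever. Whether to go ahead with the dtm stop after a timeout is a policy choice the sketch does not settle.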
