Can't amfd just exit when receiving mds down for amfnd?

> -----Original Message-----
> From: Mathivanan Naickan Palanivelu [mailto:[email protected]]
> Sent: den 13 januari 2014 13:47
> To: Hans Feldt; Venkata Mahesh Alla; Anders Widell; Nagendra Kumar
> Cc: [email protected]
> Subject: RE: [devel] Possible Time delay between AMFD exit and DTM exit 
> during opensaf stop
> 
> Another issue can arise from a slow exit of AMFD on the node going down,
> i.e. during the 'opensafd stop' flow. I think the local AMFD should mark the
> local node as "ABSENT" upon receiving the down event of the local AMFND,
> as below:
> 
> diff --git a/osaf/services/saf/amf/amfd/ndfsm.cc b/osaf/services/saf/amf/amfd/ndfsm.cc
> --- a/osaf/services/saf/amf/amfd/ndfsm.cc
> +++ b/osaf/services/saf/amf/amfd/ndfsm.cc
> @@ -321,6 +321,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb
>                 // Do nothing if the local node goes down. Most likely due to system shutdown.
>                 // If node director goes down due to a bug, the AMF watchdog will restart the node.
>                 if (node->node_info.nodeId == cb->node_id_avd) {
> +                       node->node_state = AVD_AVND_STATE_ABSENT;
>                         TRACE("Ignoring down event for local node director");
>                         goto done;
>                 }
> 
> This is because, if for some reason AMFD is slightly delayed in exiting
> (through amfd's stop script) as described below, then during that delay the
> other controller would already have become ACTIVE, and the local active AMFD
> would have received a CLM cluster track callback marking the local node
> (the one going down) as exiting the cluster.
> 
> Without the above protection (or something similar), this can lead to other problems.
> 
> Comments?
> 
> Thanks,
> Mathi.
> 
> > -----Original Message-----
> > From: Mathivanan Naickan Palanivelu
> > Sent: Friday, January 10, 2014 10:45 PM
> > To: Hans Feldt; anders.widell; Venkata Mahesh Alla
> > Cc: [email protected]
> > Subject: [devel] Possible Time delay between AMFD exit and DTM exit
> > during opensaf stop
> >
> > Hi,
> >
> > We might have discussed this before, but I think there is a small chance of
> > the following happening during the opensafd stop scenario.
> >
> >     for cmd in `ls $pkgclcclidir/osaf-*`; do
> >             # skip dtm here to allow shutdown of other services (e.g. amfd)
> > ===>        if [ "$cmd" != "$pkgclcclidir/osaf-dtm" ] && [ "$cmd" != "$pkgclcclidir/osaf-transport-monitor" ]; then
> >                     $cmd stop >/dev/null 2>&1
> >             fi
> >     done
> > [Mathi]
> > The AMFD clc-cli script would have been invoked by the lines above. (This
> > does not necessarily mean that the script has finished executing!)
> >
> >     if [ "$MDS_TRANSPORT" = "TIPC" ]; then
> >             unload_tipc
> >     else
> >             # stop dtm, now all dependent services should be stopped
> > ====>               $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
> > [Mathi]
> > By the time osaf-dtm is killed, it is possible that osaf-amfd has still
> > not exited.
> > Is that possible? If so, we probably have to check and wait for the amfd
> > pid to disappear before doing the kill here.
> >
> >             rm -f $pkglocalstatedir/osaf_dtm_intra_server
> >     fi
> >
> > What do you say?
> >
> > Thanks,
> > Mathi.
> >
> > ------------------------------------------------------------------------------
> > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> > Learn Why More Businesses Are Choosing CenturyLink Cloud For Critical
> > Workloads, Development Environments & Everything In Between.
> > Get a Quote or Start a Free Trial Today.
> > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Opensaf-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
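
For illustration, the "check and wait for the amfd pid to disappear" idea
from the original message could look roughly like the sketch below. This is
not taken from the OpenSAF scripts: the function name wait_for_exit, the
default timeout of 10 seconds, and the pid file path in the usage comment
are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSAF): block until the process with
# the given pid has exited, or give up after a timeout in seconds.
# Returns 0 if the process is gone, 1 if it is still alive at the timeout.
wait_for_exit() {
    pid=$1
    timeout=${2:-10}    # assumed default; tune to the real shutdown time
    elapsed=0
    # "kill -0" delivers no signal; it only tests whether the pid exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use just before the "osaf-dtm stop" line. The pid file location
# is an assumption; the pid must be read BEFORE the stop loop runs, since
# the stop script may remove the file:
#
#   amfd_pid=$(cat /var/run/opensaf/osaf-amfd.pid 2>/dev/null)
#   ... stop loop ...
#   [ -n "$amfd_pid" ] && wait_for_exit "$amfd_pid" 10
#   $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
```

A timeout is used so that a hung amfd cannot block the shutdown forever;
what to do when the timeout expires (proceed anyway, or send SIGKILL) is
left open here.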
