Well,
That should be the way to go then! AMFD cannot (and should not!) do anything
after the local AMFND goes down!
- Mathi.
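
For reference, the "check and wait for the amfd pid to disappear" idea raised further down the thread could look roughly like the sketch below. This is only an illustration under assumptions: the helper name wait_for_pid_exit, its arguments, and the default timeout are hypothetical and are not taken from the OpenSAF stop scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OpenSAF scripts): wait until the
# process with the given pid is gone, or until timeout_s seconds elapse.
# Returns 0 if the process has exited, 1 if it is still there on timeout.
wait_for_pid_exit() {
    pid="$1"
    timeout_s="${2:-10}"   # assumed default; tune to the real stop flow
    waited=0
    # "kill -0" delivers no signal; it only checks that the pid exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the stop flow, such a helper would presumably be called with the amfd pid just before the osaf-dtm stop line, so that dtm is stopped only after amfd has gone away or the timeout has expired. The timeout keeps a stuck amfd from blocking the shutdown indefinitely.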

> -----Original Message-----
> From: Hans Feldt [mailto:[email protected]]
> Sent: Tuesday, January 14, 2014 2:04 PM
> To: Mathivanan Naickan Palanivelu; Venkata Mahesh Alla; Anders Widell;
> Nagendra Kumar
> Cc: [email protected]
> Subject: RE: [devel] Possible Time delay between AMFD exit and DTM exit
> during opensaf stop
> 
> Can't amfd just exit when receiving mds down for amfnd?
> 
> > -----Original Message-----
> > From: Mathivanan Naickan Palanivelu [mailto:[email protected]]
> > Sent: den 13 januari 2014 13:47
> > To: Hans Feldt; Venkata Mahesh Alla; Anders Widell; Nagendra Kumar
> > Cc: [email protected]
> > Subject: RE: [devel] Possible Time delay between AMFD exit and DTM
> > exit during opensaf stop
> >
> > Another issue can occur because of a slow exit of AMFD on the node
> > going down, i.e. during the 'opensafd stop' flow. I think the local
> > AMFD should mark the local node as "ABSENT" upon receiving the down
> > event of the local AMFND, as below:
> >
> > diff --git a/osaf/services/saf/amf/amfd/ndfsm.cc
> > b/osaf/services/saf/amf/amfd/ndfsm.cc
> > --- a/osaf/services/saf/amf/amfd/ndfsm.cc
> > +++ b/osaf/services/saf/amf/amfd/ndfsm.cc
> > @@ -321,6 +321,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb
> >                 // Do nothing if the local node goes down. Most likely due to system shutdown.
> >                 // If node director goes down due to a bug, the AMF watchdog will restart the node.
> >                 if (node->node_info.nodeId == cb->node_id_avd) {
> > +                       node->node_state = AVD_AVND_STATE_ABSENT;
> >                         TRACE("Ignoring down event for local node director");
> >                         goto done;
> >                 }
> >
> > This is because, if for some reason there is a small delay before
> > AMFD exits (through amfd's stop script) as described below, then
> > during that delay the other controller would already have become
> > ACTIVE, and the local active AMFD would have received a CLM cluster
> > track callback marking the local node (going down) as exiting the
> > cluster.
> >
> > Without the above protection (or similar), it can lead to other problems.
> >
> > Comments?
> >
> > Thanks,
> > Mathi.
> >
> > > -----Original Message-----
> > > From: Mathivanan Naickan Palanivelu
> > > Sent: Friday, January 10, 2014 10:45 PM
> > > To: Hans Feldt; anders.widell; Venkata Mahesh Alla
> > > Cc: [email protected]
> > > Subject: [devel] Possible Time delay between AMFD exit and DTM exit
> > > during opensaf stop
> > >
> > > Hi,
> > >
> > > We might have discussed this before, but I think there is a small
> > > chance of the following happening during the opensafd stop scenario.
> > >
> > >   for cmd in `ls $pkgclcclidir/osaf-*`; do
> > >           # skip dtm here to allow shutdown of other services (e.g. amfd)
> > > ===>              if [ "$cmd" != "$pkgclcclidir/osaf-dtm" ] && [ "$cmd" != "$pkgclcclidir/osaf-transport-monitor" ]; then
> > >                   $cmd stop >/dev/null 2>&1
> > >           fi
> > >   done
> > > [Mathi]
> > > AMFD clc-cli script would have got invoked because of the above
> > > lines. (This does not necessarily mean that the script has finished
> > > execution!)
> > >
> > >   if [ "$MDS_TRANSPORT" = "TIPC" ]; then
> > >           unload_tipc
> > >   else
> > >           # stop dtm, now all dependent services should be stopped
> > > ====>             $pkgclcclidir/osaf-dtm stop >/dev/null 2>&1
> > > [Mathi]
> > > By the time osaf-dtm is killed, there is a possibility that
> > > osaf-amfd has still not exited.
> > > Is that possible? If so, we might have to check and wait for the
> > > amfd pid to disappear before doing the kill here.
> > >
> > >           rm -f $pkglocalstatedir/osaf_dtm_intra_server
> > >   fi
> > >
> > > What do you say?
> > >
> > > Thanks,
> > > Mathi.
> > >
> > > _______________________________________________
> > > Opensaf-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
