Looks ok to me. Thanks -Nagu
> -----Original Message-----
> From: praveen malviya
> Sent: 18 February 2014 11:28
> To: Hans Feldt
> Cc: Nagendra Kumar; [email protected]; Hans Nordebäck
> Subject: Re: [devel] [PATCH 1 of 1] amfnd: report failover after comp cleanup [#598]
>
> Hi,
>
> I am testing this patch for PI SUs.
> The patch works fine for component failover for PI SUs.
> But when the recovery policy is configured as node switchover, the patch
> causes a reboot of the node before the assignments are removed. Thus repair
> is done before recovery.
> This happens because the cleanup script is also invoked during node
> switchover escalation. When cleanup completes successfully, AMFND informs
> AMFD of a component failover, assuming that this is a component failover.
> This should not be done in the node switchover case.
> Below is the modification to the if condition needed to fix it. Here the
> oper state of avnd is also checked to differentiate between node switchover
> and component failover:
>
>         /* determine if this is a case of component failover */
>         if (m_AVND_COMP_IS_FAILED(comp) && m_AVND_SU_IS_FAILED(su) &&
>                         m_AVND_SU_IS_PREINSTANTIABLE(su) &&
>                         (su->sufailover == false) &&
>                         (avnd_cb->oper_state != SA_AMF_OPERATIONAL_DISABLED)) {
>                 /* yes, request director to orchestrate component failover */
>                 rc = avnd_di_oper_send(cb, su, SA_AMF_COMPONENT_FAILOVER);
>         }
>
> Will test with NPI SUs.
> Thanks,
> Praveen
>
> On 14-Feb-14 6:43 PM, Hans Feldt wrote:
> >  osaf/services/saf/amf/amfnd/clc.cc            |  9 +++++++++
> >  osaf/services/saf/amf/amfnd/di.cc             |  2 +-
> >  osaf/services/saf/amf/amfnd/err.cc            |  9 +++++----
> >  osaf/services/saf/amf/amfnd/include/avnd_di.h |  2 +-
> >  4 files changed, 16 insertions(+), 6 deletions(-)
> >
> >
> > If a component error is detected and the recovery action is COMPONENT_FAILOVER,
> > it is possible that a standby component gets the active assignment before the
> > erroneous component has been terminated. This can cause a split brain on
> > application level.
> >
> > The reason for this is that when the error is detected amfnd starts two
> > parallel activities, component cleanup and inform director. When the director
> > receives the information it starts the process of failing over the workload
> > of the erroneous component.
> >
> > This patch informs the director after successful termination has been performed.
> >
> > diff --git a/osaf/services/saf/amf/amfnd/clc.cc b/osaf/services/saf/amf/amfnd/clc.cc
> > --- a/osaf/services/saf/amf/amfnd/clc.cc
> > +++ b/osaf/services/saf/amf/amfnd/clc.cc
> > @@ -2024,6 +2024,7 @@ uint32_t avnd_comp_clc_terming_termfail_
> >
> >  ******************************************************************************/
> >  uint32_t avnd_comp_clc_terming_cleansucc_hdler(AVND_CB *cb, AVND_COMP *comp)
> >  {
> > +    const AVND_SU *su = comp->su;
> >      uint32_t rc = NCSCC_RC_SUCCESS;
> >      TRACE_ENTER2("'%s': Cleanup success event in the terminating state", comp->name.value);
> >
> > @@ -2074,6 +2075,14 @@ uint32_t avnd_comp_clc_terming_cleansucc
> >          m_AVND_COMP_REG_PARAM_RESET(cb, comp);
> >          m_AVND_SEND_CKPT_UPDT_ASYNC_UPDT(cb, comp, AVND_CKPT_COMP_CONFIG);
> >      }
> > +
> > +    /* determine if this is a case of component failover */
> > +    if (m_AVND_COMP_IS_FAILED(comp) && m_AVND_SU_IS_FAILED(su) &&
> > +        m_AVND_SU_IS_PREINSTANTIABLE(su) && (su->sufailover == false)) {
> > +        /* yes, request director to orchestrate component failover */
> > +        rc = avnd_di_oper_send(cb, su, SA_AMF_COMPONENT_FAILOVER);
> > +    }
> > +
> >      TRACE_LEAVE();
> >      return rc;
> >  }
> > diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> > --- a/osaf/services/saf/amf/amfnd/di.cc
> > +++ b/osaf/services/saf/amf/amfnd/di.cc
> > @@ -476,7 +476,7 @@ uint32_t avnd_evt_mds_avd_dn_evh(AVND_CB
> >
> >    Notes         : None.
> >
> >  ******************************************************************************/
> > -uint32_t avnd_di_oper_send(AVND_CB *cb, AVND_SU *su, uint32_t rcvr)
> > +uint32_t avnd_di_oper_send(AVND_CB *cb, const AVND_SU *su, uint32_t rcvr)
> >  {
> >      AVND_MSG msg;
> >      uint32_t rc = NCSCC_RC_SUCCESS;
> > diff --git a/osaf/services/saf/amf/amfnd/err.cc b/osaf/services/saf/amf/amfnd/err.cc
> > --- a/osaf/services/saf/amf/amfnd/err.cc
> > +++ b/osaf/services/saf/amf/amfnd/err.cc
> > @@ -702,9 +702,6 @@ uint32_t avnd_err_rcvr_comp_failover(AVN
> >      m_AVND_SU_OPER_STATE_SET(su, SA_AMF_OPERATIONAL_DISABLED);
> >      m_AVND_SEND_CKPT_UPDT_ASYNC_UPDT(cb, su, AVND_CKPT_SU_OPER_STATE);
> >
> > -    /* inform AvD */
> > -    rc = avnd_di_oper_send(cb, su, SA_AMF_COMPONENT_FAILOVER);
> > -
> >      /*
> >       * su-sis may be in assigning/removing state. signal csi
> >       * assign/remove done so that su-si assignment/removal algo can proceed.
> > @@ -722,11 +719,15 @@ uint32_t avnd_err_rcvr_comp_failover(AVN
> >      if (NCSCC_RC_SUCCESS != rc)
> >          goto done;
> >
> > -    /* clean the failed comp */
> > +    // TODO: there should be no difference between PI/NPI comps
> >      if (m_AVND_SU_IS_PREINSTANTIABLE(su)) {
> > +        /* clean the failed comp */
> >          rc = avnd_comp_clc_fsm_run(cb, failed_comp, AVND_COMP_CLC_PRES_FSM_EV_CLEANUP);
> >          if (NCSCC_RC_SUCCESS != rc)
> >              goto done;
> > +    } else {
> > +        /* request director to orchestrate component failover */
> > +        rc = avnd_di_oper_send(cb, failed_comp->su, AVSV_ERR_RCVR_SU_FAILOVER);
> >      }
> >
> >  done:
> > diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h b/osaf/services/saf/amf/amfnd/include/avnd_di.h
> > --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h
> > +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h
> > @@ -68,7 +68,7 @@
> >
> >  struct avnd_cb_tag;
> >
> > -uint32_t avnd_di_oper_send(struct avnd_cb_tag *, AVND_SU *, uint32_t);
> > +uint32_t avnd_di_oper_send(struct avnd_cb_tag *, const AVND_SU *, uint32_t);
> >  uint32_t avnd_di_susi_resp_send(struct avnd_cb_tag *, AVND_SU *, AVND_SU_SI_REC *);
> >  uint32_t avnd_di_object_upd_send(struct avnd_cb_tag *, AVSV_PARAM_INFO *);
> >  uint32_t avnd_di_pg_act_send(struct avnd_cb_tag *, SaNameT *, AVSV_PG_TRACK_ACT, bool);
> >
> > _______________________________________________
> > Opensaf-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
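
For readers following the thread, here is a minimal, self-contained C++ sketch of the decision discussed above: whether a successful cleanup should be reported to the director as a component failover. It is not the amfnd code. The Node, Su and Comp structs, the OperState enum and the functions should_report_comp_failover and check_examples are hypothetical stand-ins for avnd_cb, AVND_SU, AVND_COMP and the macro checks in the patch. The sketch also assumes, as Praveen's mail suggests, that a disabled node oper state is what distinguishes node switchover from plain component failover.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the amfnd structures; not OpenSAF types.
enum class OperState { Enabled, Disabled };

struct Node {
    OperState oper_state;  // models avnd_cb->oper_state
};

struct Su {
    bool failed;           // models m_AVND_SU_IS_FAILED(su)
    bool preinstantiable;  // models m_AVND_SU_IS_PREINSTANTIABLE(su)
    bool sufailover;       // models su->sufailover
};

struct Comp {
    bool failed;           // models m_AVND_COMP_IS_FAILED(comp)
    const Su *su;
};

// Decide, after a successful cleanup, whether the director should be asked
// to orchestrate a component failover. The last clause is the extra check
// proposed in the review: skip the report when the node itself is disabled.
bool should_report_comp_failover(const Node &node, const Comp &comp) {
    const Su &su = *comp.su;
    return comp.failed && su.failed && su.preinstantiable &&
           !su.sufailover &&
           node.oper_state != OperState::Disabled;
}

// Small self-check covering the two cases discussed in the thread.
void check_examples() {
    const Su su{true, true, false};
    const Comp comp{true, &su};

    // Plain component failover: node still enabled, so report it.
    assert(should_report_comp_failover(Node{OperState::Enabled}, comp));

    // Node switchover escalation (assumed to disable the node): do not
    // report a component failover after cleanup.
    assert(!should_report_comp_failover(Node{OperState::Disabled}, comp));
}
```

Calling check_examples() from a test or from main exercises both cases. Keeping the decision in one pure function makes the difference between the original condition and the proposed one a single clause, which is easy to review.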
