Hi Minh,

Please see responses with [Praveen].

Thanks,
Praveen

On 23-Aug-16 7:18 PM, minh chau wrote:
> Hi Praveen,
>
> Please let me copy your questions and answer them here in email, so it's
> easier for us to add comments inline; please see [Minh].
>
> Thanks,
> Minh
>
> -----------------------------
>
> Hi Minh,
> I am going through the patches 1725_phase1.tgz. Some initial comments:
> 1) In patch 2, avnd_diq_rec_send_buffered_msg() checks for the presence
> of the SUSI and only then sends the buffered message to AMFD. If removal
> of assignments completes during headless, AMFND deletes the SUSIs in
> su_si_oper_done(). So AMFND will never send the assignment message and
> the admin operation will not continue.
>
> [Minh]: If AMFND deletes all SUSIs during headless,
> then there will not be any assignment to be sent in the state_info message
> to AMFD after headless. However, in all admin operations of 2N I have
> been testing,
> the removal assignment sequence is the last step of admin LOCK/SHUTDOWN.
> If AMFND deletes the SUSI while headless, that also means the prior steps
> of the admin sequence had been done before headless. In this case, that is
> equivalent to completion of the admin operation.
>
[Praveen] Yes, in this case it is not needed because by this time the standby SU has become active. But in some cases AMFD performs failover/switchover based on the status of assignment removal, particularly when a fault happens during an admin op. As of now I do not know how to reproduce this scenario without faults, but with faults it is possible. Since the patch is not for admin op + faults, it can be left for the future.
> 2) In patch1, I think after headless we will not get any invocation id
> for the admin operation that
> was going on before headless. Since AMF is continuing the admin
> operation we should somehow
> restrict other admin operations from starting, by setting some magic
> number for invocationid or in some other way.
>
> [Minh]: If AMF is continuing the admin operation after headless, the sg
> fsm state should not be STABLE. I think checking (sg_fsm_state ==
> AVD_SG_FSM_STABLE) should be enough to reject a new admin operation?
>
>
> 3) If suswitch is in TOGGLED state then I think we should crosscheck that
> there are at least two SUs
> having assignment. The reason is that if this flag remains TOGGLED and
> the admin op does not continue, then there is very little probability
> that it will get reset, as it is used only in the si-swap flow.
>
> [Minh]: Yes, I don't particularly like this osafAmfSUSwitch being written
> to IMM. Only test case 144 failed for me (test list attached to ticket).
> Test 144 is: Swap SI, delay csi STANDBY cbk in SU4, stop SCs, restart
> SCs, reboot PL5. And I ran into the code line which requires suswitch
>
> void SG_2N::node_fail_su_oper(AVD_SU *su) {
>
> ....
>     /* the SU has standby SI assignments. if the other SUs switch field
>      * is true, it is in service, having quiesced assigning state.
>      * Send D2N-INFO_SU_SI_ASSIGN modify active all to the other SU.
>      * Change switch field to false. Change state to SG_realign.
>      * Free all the SI assignments to this SU.
>      */
>     if ((su_oper_list_front()->su_switch == AVSV_SI_TOGGLE_SWITCH)
>         && (su_oper_list_front()->saAmfSuReadinessState ==
>         SA_AMF_READINESS_IN_SERVICE)) {
>
> I think the *crosscheck* is actually a deduction of @su_switch from
> whatever states AMFD receives after headless. If the *crosscheck* is
> possible, then su_switch does not need to be checkpointed at the
> standby AMFD either.
> In non-headless, we always need the standby AMFD to have all states
> up to date via checkpoint, so that if the active AMFD has gone, the
> standby AMFD can take over by using these checkpointed states.
> Now in headless, we also have to write these states somewhere (here it is
> IMM) so that the new active AMFD can use them.
> It would be best if su_switch could be deduced from a set of states, but
> it's not easy to prove that it can be deduced in all scenarios of 2N
> si-swap.
> If you think removing osafAmfSUSwitch is really needed, then this needs
> to be looked at more thoroughly later, I think?
>
> 4) Since assignments are in progress, this could be because of an admin
> operation or
> faults. AMFD should call one function here like log_admin_op(). This
> function will search for the entity
> that is under admin operation and log details like:
> - "After headless state admin op on '%s' is continuing" in syslog.
> - Also traces for susi states which are not assigned.
>
> [Minh]: Agree, some sort of logging like this is a good idea. I think it's
> best to introduce this logging in the patch: [PATCH 4 of 4] AMFD:
> Validate headless cached RTA read from IMM [#1725]
> And maybe I need more details of what you would like to log.
[Praveen] I think to start with, only the name of the entity and its admin state can be logged.
>
> Thanks,
> Praveen
>
> ---------------------
>
> On 23/08/16 21:03, minh chau wrote:
>> Hi Nagu,
>>
>> I see in the trace you provided, the SU2/SU3 become IN_SERVICE late.
>> If there's a delay in PL4 joining the cluster after headless in your test
>> then you could also see it in the latest patches (longDN rebased version).
>> I'm looking into this issue.
>>
>> Thanks.
>> Minh
>>
>> On 23/08/16 20:24, Nagendra Kumar wrote:
>>> Please ignore TC #2, my mistake.
>>>
>>> Thanks
>>> -Nagu
>>>
>>>> -----Original Message-----
>>>> From: Nagendra Kumar
>>>> Sent: 23 August 2016 15:49
>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>>> callback completes during headless [#1725 part 1] V1
>>>>
>>>> Please consider previous TC as TC #1
>>>>
>>>> TC #2: Same configuration as TC #1.
Logs attached in the ticket TC #2.
>>>>
>>>> Steps:
>>>> 1. Same as step #1 of TC #1.
>>>> 2. After locking SU1, keep delay in
>>>> avnd_evt_avd_info_su_si_assign_evh and
>>>> stop SC-1 and SC-2.
>>>> 3. Start SC-1 and SC-2. SU1 is still in quiesced state. Ideally, it
>>>> should have no
>>>> assignment and SU3 should have got assignment.
>>>>
>>>> safSISU=safSu=SU3\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>     saAmfSISUHAState=STANDBY(2)
>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>     saAmfSISUHAState=STANDBY(2)
>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>
>>>> After that, PL-3 rebooted, as shown by the following logs:
>>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: TIMEOUT receiving AMF health check request, generating core for amfnd
>>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: Last received healthcheck cnt=82 at Tue Aug 23 15:30:52 2016
>>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMFND unresponsive, AMFWDOG initiated system reboot, OwnNodeId = 131855, SupervisionTime = 60
>>>> Aug 23 15:31:52 PM_PL-3 opensaf_reboot: Rebooting local node; timeout=60
>>>>
>>>> Thanks
>>>> -Nagu
>>>>
>>>>> -----Original Message-----
>>>>> From: Nagendra Kumar
>>>>> Sent:
23 August 2016 15:19
>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>>>> callback completes during headless [#1725 part 1] V1
>>>>>
>>>>> Please note that it is on change set 7846:31417997c82f and I have
>>>>> applied patch of ticket #1894.
>>>>>
>>>>> Thanks
>>>>> -Nagu
>>>>>> -----Original Message-----
>>>>>> From: Nagendra Kumar
>>>>>> Sent: 23 August 2016 15:15
>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if
>>>>>> csi callback completes during headless [#1725 part 1] V1
>>>>>>
>>>>>> Hi Minh,
>>>>>> The following SU lock case is not working. This issue will exist
>>>>>> for all the flows, so please check.
>>>>>>
>>>>>> Configuration and traces attached in the ticket.
>>>>>>
>>>>>> Steps:
>>>>>> 1. Start SC-1, SC-2, PL-3 and PL-4.
Run the following commands:
>>>>>> immcfg -f /tmp/AppConfig-2N-1725.xml
>>>>>> amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> amf-adm unlock-in safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> amf-adm unlock-in safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> amf-adm unlock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> amf-adm unlock safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> amf-adm unlock safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>
>>>>>> Assignments are:
>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>
>>>>>> 2. Issue lock on SU1.
>>>>>> amf-adm lock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>> And keep gdb in csi_set callback. Stop SC-1 and SC-2.
>>>>>> Send Ok from csi_set callback.
>>>>>>
>>>>>> 3. Start SC-1 and SC-2.
>>>>>>
>>>>>> 4. Assignment to components of SU2 is not given and the assignment of
>>>>>> SU2 still shows Standby.
>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
>>>>>>> Sent: 05 August 2016 02:50
>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
>>>>>>> minh.c...@dektech.com.au
>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>> Subject: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>>>>>> callback completes during headless [#1725 part 1] V1
>>>>>>>
>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
>>>>>>>
>>>>>>>
>>>>>>> The patch buffers susi_resp_msg during the headless stage and resends
>>>>>>> it to AMFD after headless.
>>>>>>> >>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc >>>>>>> b/osaf/services/saf/amf/amfnd/di.cc >>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc >>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc >>>>>>> @@ -804,11 +804,6 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>> if (cb->term_state == >>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) >>>>>>> return rc; >>>>>>> >>>>>>> - if (cb->is_avd_down == true) { >>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>> - return rc; >>>>>>> - } >>>>>>> - >>>>>>> // should be in assignment pending state to be here >>>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); >>>>>>> >>>>>>> @@ -819,64 +814,76 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, >>>>>>> prv_state=%u", su->name.value, curr_si->name.value,curr_si- >>>>>>>> curr_state,curr_si->prv_state); >>>>>>> /* populate the susi resp msg */ >>>>>>> msg.info.avd = new AVSV_DND_MSG(); >>>>>>> - msg.type = AVND_MSG_AVD; >>>>>>> - msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>> snd_msg_id); >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>> node_info.nodeId; >>>>>>> - if (si) { >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>> - ((si->single_csi_add_rem_in_si == AVSV_SUSI_ACT_BASE) >>>> ? >>>>>>> false : true); >>>>>>> - } >>>>>>> - TRACE("curr_assign_state '%u'", >>>>>>> curr_si->curr_assign_state); >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>> || >>>>>>> - >>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) >>>>> ? >>>>>>> - ((!curr_si->prv_state) ? 
AVSV_SUSI_ACT_ASGN : >>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>> - if (si) { >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>>> name; >>>>>>> - if (AVSV_SUSI_ACT_ASGN == >>>>>>> si->single_csi_add_rem_in_si) { >>>>>>> - TRACE("si->curr_assign_state '%u'", >>>>>>> curr_si- >>>>>>>> curr_assign_state); >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>> - >>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>> - >>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>> - AVSV_SUSI_ACT_ASGN : >>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>> - } >>>>>>> - } >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>> || >>>>>>> - >>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) >>>>> ? >>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>> + msg.type = AVND_MSG_AVD; >>>>>>> + msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>> node_info.nodeId; >>>>>>> + if (si) { >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>> + ((si->single_csi_add_rem_in_si == >>>>>>> AVSV_SUSI_ACT_BASE) ? false : true); >>>>>>> + } >>>>>>> + TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>> + >>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>> + >>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>> + ((!curr_si->prv_state) ? 
>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>> + if (si) { >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>>>>>> name; >>>>>>> + if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) { >>>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>>> curr_assign_state); >>>>>>> + msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.msg_act = >>>>>>> + >>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>> + >>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>> + AVSV_SUSI_ACT_ASGN : >>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>> + } >>>>>>> + } >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>> + >>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>> + >>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ? >>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>> >>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>> - osafassert(si); >>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>> + osafassert(si); >>>>>>> >>>>>>> - /* send the msg to AvD */ >>>>>>> - TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', >>>>>>> su'%s', >>>>>> si'%s', >>>>>>> ha_state'%u', error'%u', single_csi'%u'", >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>> msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.node_id, >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>> msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, >>>> msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.single_csi); >>>>>>> + /* send the msg to AvD */ >>>>>>> + TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', su'%s', >>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); >>>>>>> >>>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>> - if >>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>> - LOG_NO("Removed 'all SIs' from '%s'", >>>>>>> su->name.value); >>>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- >>>>>>>> name.value); >>>>>>> - if >>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>> - ha_state[msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>>> - su->name.value); >>>>>>> - 
} >>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>> + ha_state[msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>>> + su->name.value); >>>>>>> + } >>>>>>> >>>>>>> - rc = avnd_di_msg_send(cb, &msg); >>>>>>> - if (NCSCC_RC_SUCCESS == rc) >>>>>>> - msg.info.avd = 0; >>>>>>> - >>>>>>> - /* we have completed the SU SI msg processing */ >>>>>>> - if (su_assign_state_is_stable(su)) >>>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>> + if (cb->is_avd_down == true) { >>>>>>> + // We are in headless, buffer this msg >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; >>>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { >>>>>>> + rc = NCSCC_RC_FAILURE; >>>>>>> + } >>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF >>>>>>> director is offline"); >>>>>>> + } else { >>>>>>> + // We are in normal cluster, send msg to director >>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>> snd_msg_id); >>>>>>> + /* send the msg to AvD */ >>>>>>> + rc = avnd_di_msg_send(cb, &msg); >>>>>>> + if (NCSCC_RC_SUCCESS == rc) >>>>>>> + msg.info.avd = 0; >>>>>>> + /* we have completed the SU SI msg processing */ >>>>>>> + if (su_assign_state_is_stable(su)) { >>>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>> + } >>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>> + } >>>>>>> >>>>>>> /* free the contents of avnd message */ >>>>>>> avnd_msg_content_free(cb, &msg); @@ -1255,14 +1262,7 @@ void >>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ >>>>>>> /* stop the AvD msg response timer */ >>>>>>> if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { >>>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, *rec); >>>>>>> - // Resend msgs from queue because amfd dropped during >>>>>>> sync >>>>>>> - if ((cb->dnd_list.head != nullptr)) { >>>>>>> - TRACE("retransmit message to amfd"); >>>>>>> - AVND_DND_MSG_LIST *pending_rec 
= 0; >>>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>> - avnd_diq_rec_send(cb, pending_rec); >>>>>>> - } >>>>>>> - } >>>>>>> + avnd_diq_rec_send_buffered_msg(cb); >>>>>>> /* resend pg start track */ >>>>>>> avnd_di_resend_pg_start_track(cb); >>>>>>> } >>>>>>> @@ -1275,6 +1275,73 @@ void avnd_diq_rec_del(AVND_CB *cb, >>>> AVND_ >>>>>>> TRACE_LEAVE(); >>>>>>> return; >>>>>>> } >>>>>>> >>>> +/************************************************************ >>>>>>> **************** >>>>>>> + Name : avnd_diq_rec_send_buffered_msg >>>>>>> + >>>>>>> + Description : Resend buffered msg >>>>>>> + >>>>>>> + Arguments : cb - ptr to the AvND control block >>>>>>> + >>>>>>> + Return Values : None. >>>>>>> + >>>>>>> + Notes : None. >>>>>>> >>>> +************************************************************* >>>>>>> ********** >>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB *cb) { >>>>>>> + TRACE_ENTER(); >>>>>>> + // Resend msgs from queue because amfnd dropped during headless >>>>>>> + // or headless-synchronization >>>>>>> + if ((cb->dnd_list.head != nullptr)) { >>>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; >>>>>>> + TRACE("Attach msg_id of buffered msg"); >>>>>>> + bool found = true; >>>>>>> + while (found) { >>>>>>> + found = false; >>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>> + if (pending_rec->msg.type == >>>>>>> AVND_MSG_AVD) { >>>>>>> + // At this moment, only oper_state >>>>>>> msg needs to report to director >>>>>>> + if (pending_rec->msg.info.avd- >>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && >>>>>>> + pending_rec->msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { >>>>>>> + m_AVND_DIQ_REC_POP(cb, >>>>>>> pending_rec); #if 0 >>>>>>> + // only resend if this SUSI >>>>>>> does exist >>>>>>> + AVND_SU *su = >>>>>>> m_AVND_SUDB_REC_GET(cb->sudb, >>>>>>> + pending_rec- >>>>>>>> 
msg.info.avd->msg_info.n2d_su_si_assign.su_name); >>>>>>> + if (su != nullptr && su- >>>>>>>> si_list.n_nodes > 0) { #endif >>>>>>> + pending_rec- >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = >>>>>>>> ++(cb->snd_msg_id); >>>>>>> + >>>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); >>>>>>> + LOG_NO("Found and >>>>>>> resend buffered su_si_assign msg for SU:'%s', " >>>>>>> + >>>>>>> "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', " >>>>>>> + >>>>>>> "error:'%u', msg_id:'%u'", >>>>>>> + >>>>>>> pending_rec->msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>> + >>>>>>> pending_rec->msg.info.avd- >>>>>>>> msg_info.n2d_su_si_assign.si_name.value, >>>>>>> + >>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>> + >>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>> + >>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.single_csi, >>>>>>> + >>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>> + >>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id); >>>>>>> + >>>>>>> +#if 0 >>>>>>> + } else { >>>>>>> + >>>>>>> avnd_msg_content_free(cb, &pending_rec->msg); >>>>>>> + delete pending_rec; >>>>>>> + pending_rec = cb- >>>>>>>> dnd_list.head; >>>>>>> + } >>>>>>> +#endif >>>>>>> + found = true; >>>>>>> + } >>>>>>> + } >>>>>>> + } >>>>>>> + } >>>>>>> + TRACE("retransmit message to amfd"); >>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>> nullptr; >>>>>>> pending_rec = pending_rec->next) { >>>>>>> + avnd_diq_rec_send(cb, pending_rec); >>>>>>> + } >>>>>>> + } >>>>>>> + TRACE_LEAVE(); >>>>>>> + return; >>>>>>> +} >>>>>>> >>>>>>> >>>>>>> >>>> /************************************************************* >>>>>>> *************** >>>>>>> Name : avnd_diq_rec_send >>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>> --- 
a/osaf/services/saf/amf/amfnd/include/avnd_di.h
>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h
>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd
>>>>>>>  void avnd_diq_del(struct avnd_cb_tag *);
>>>>>>>  AVND_DND_MSG_LIST *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg);
>>>>>>>  void avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb);
>>>>>>>  uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
>>>>>>>  uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, SaNameT *su_name, uint32_t ret_code);
>>>>>>>  uint32_t avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, uint32_t rcv_id, uint32_t view_num);
>>
>
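
For readers following the thread, the idea in the quoted patch description (buffer the SU-SI response while the director is unreachable, mark it with msg_id 0, then assign a real msg_id and resend once the director is back) can be illustrated with a small standalone sketch. This is not the OpenSAF code: SusiResp, NodeCb, send_susi_resp() and resend_buffered() are hypothetical names made up for the illustration, and the sketch drops buffered records right after resending, whereas the patch keeps them queued for the normal acknowledgement handling.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for an SU-SI assignment response (not the OpenSAF type).
struct SusiResp {
  std::string su;
  std::string si;
  uint32_t msg_id;  // 0 marks "buffered while headless, id not assigned yet"
};

// Hypothetical stand-in for the node director's control block.
struct NodeCb {
  bool director_down = false;    // true while the cluster is headless
  uint32_t snd_msg_id = 0;       // last message id handed out
  std::deque<SusiResp> pending;  // responses buffered during headless
  std::vector<SusiResp> sent;    // what reached the director (demo only)
};

// Send a response now, or buffer it with msg_id 0 if the director is down.
void send_susi_resp(NodeCb &cb, const std::string &su, const std::string &si) {
  if (cb.director_down) {
    cb.pending.push_back({su, si, 0});
    return;
  }
  cb.sent.push_back({su, si, ++cb.snd_msg_id});
}

// After headless: give each buffered response a fresh id and resend in order.
void resend_buffered(NodeCb &cb) {
  assert(!cb.director_down);  // only meaningful once the director is back
  for (auto &msg : cb.pending) {
    if (msg.msg_id == 0) msg.msg_id = ++cb.snd_msg_id;
    cb.sent.push_back(msg);
  }
  cb.pending.clear();  // simplification: no acknowledgement tracking here
}
```

With this sketch, a response produced while director_down is true lands in pending with msg_id 0, and a later resend_buffered() call delivers it with the next id in sequence, so the director sees contiguous message ids across the headless period.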