Hi Praveen,

I have copied your questions into this email and answered them inline, so that it is easier for us to add comments. Please see the replies marked [Minh].
Thanks,
Minh

-----------------------------

Hi Minh,

I am going through the patches in 1725_phase1.tgz. Some initial comments:

1) In patch 2, avnd_diq_rec_send_buffered_msg() sends a buffered message to AMFD only if the SUSI is still present. If removal of assignments completes during headless, AMFND deletes the SUSIs in su_si_oper_done(). So AMFND will never send the assignment message and the admin operation will not continue.

[Minh]: If AMFND deletes all SUSIs during headless, then there will be no assignment to send to AMFD in the state_info message after headless. However, in all the 2N admin operations I have been testing, removal of assignments is the last step of admin LOCK/SHUTDOWN. If AMFND deletes a SUSI while headless, that also means the earlier steps of the admin sequence were completed before headless. In that case, it is equivalent to completion of the admin operation.

2) In patch 1, I think that after headless we will not get any invocation id for the admin operation that was in progress before headless. Since AMF is continuing the admin operation, we should somehow prevent other admin operations from starting, by setting some magic number for the invocation id or in some other way.

[Minh]: If AMF is continuing the admin operation after headless, the SG FSM state should not be STABLE. I think the check (sg_fsm_state == AVD_SG_FSM_STABLE) should be enough to reject a new admin operation?

3) If suswitch is in the TOGGLED state, then I think we should crosscheck that there are at least two SUs having assignments. The reason is that if this flag remains TOGGLED and the admin op does not continue, there is very little chance that it will get reset, as it is used only in the si-swap flow.

[Minh]: Yes, I don't particularly like writing osafAmfSUSwitch to IMM. I had only one test case, 144, fail (test list attached to the ticket). Test 144 is: swap SI, delay the csi STANDBY cbk in SU4, stop SCs, restart SCs, reboot PL5. There I ran into this code, which requires su_switch:

    void SG_2N::node_fail_su_oper(AVD_SU *su) {
      ....
      /* the SU has standby SI assignments. if the other SUs switch field
       * is true, it is in service, having quiesced assigning state.
       * Send D2N-INFO_SU_SI_ASSIGN modify active all to the other SU.
       * Change switch field to false. Change state to SG_realign.
       * Free all the SI assignments to this SU.
       */
      if ((su_oper_list_front()->su_switch == AVSV_SI_TOGGLE_SWITCH) &&
          (su_oper_list_front()->saAmfSuReadinessState == SA_AMF_READINESS_IN_SERVICE)) {

I think the *crosscheck* is actually a deduction of @su_switch from whatever states AMFD receives after headless. If the *crosscheck* is possible, then su_switch would not need to be checkpointed to the standby AMFD either. In non-headless operation, we always need the standby AMFD to be kept up to date with all states via checkpointing, so that if the active AMFD goes away, the standby AMFD can take over using these checkpointed states. Now, in headless, we also have to write these states somewhere (here, IMM) so that the new active AMFD can use them. It would be best if su_switch were recoverable from a set of states, but it's not easy to prove that it is recoverable in all scenarios of 2N si-swap. If you think removing osafAmfSUSwitch is really needed, then I think this needs to be looked at more thoroughly later?

4) Assignments are in progress here; this could be because of an admin operation or faults. AMFD should call a function here, such as log_admin_op(). This function would search for the entity that is under admin operation and log details like:
- "After headless state admin op on '%s' is continuing" in syslog.
- Also traces for SUSI states which are not assigned.

[Minh]: Agreed, some sort of logging like this is a good idea. I think it's best to introduce this logging in the patch:
[PATCH 4 of 4] AMFD: Validate headless cached RTA read from IMM [#1725]
I may also need more details of what you would like to log.
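To make the log_admin_op() idea in point 4 a bit more concrete, here is a rough C++ sketch. Entity, Susi, SusiFsm and the admin_op_pending flag are hypothetical stand-ins, not the real AMFD types, and the function returns the lines instead of calling syslog so that it can be run standalone. Only the first message format comes from the suggestion in point 4; the SUSI trace format is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for AMFD entities (not the real
// AVD_SU / AVD_SU_SI_REL structures).
enum class SusiFsm { kAssigned, kAssigning, kModifying, kUnassigning };

struct Susi {
  std::string su;
  std::string si;
  SusiFsm fsm;
};

struct Entity {
  std::string dn;
  bool admin_op_pending;  // assumed flag: admin op was in progress before headless
};

// Collects the lines that would be logged after headless:
// one per entity still under admin operation, and one per SUSI
// whose state is anything other than "assigned".
std::vector<std::string> log_admin_op(const std::vector<Entity>& entities,
                                      const std::vector<Susi>& susis) {
  std::vector<std::string> lines;
  for (const auto& e : entities) {
    if (e.admin_op_pending) {
      lines.push_back("After headless state admin op on '" + e.dn +
                      "' is continuing");
    }
  }
  for (const auto& s : susis) {
    if (s.fsm != SusiFsm::kAssigned) {
      lines.push_back("SUSI not assigned: su='" + s.su + "', si='" + s.si + "'");
    }
  }
  return lines;
}
```

In AMFD the first kind of line would presumably go to syslog and the second to the trace file, but which entities and states to include is still open, as noted in the reply to point 4.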
Thanks,
Praveen

---------------------

On 23/08/16 21:03, minh chau wrote:
> Hi Nagu,
>
> I see in the trace you provided, the SU2/SU3 become IN_SERVICE late.
> If there's a delay in PL4 joining cluster after headless in your test
> then you could also see it in the latest patches (longDN rebased version)
> I'm looking into this issue.
>
> Thanks.
> Minh
>
> On 23/08/16 20:24, Nagendra Kumar wrote:
>> Please ignore TC #2, my mistake.
>>
>> Thanks
>> -Nagu
>>
>>> -----Original Message-----
>>> From: Nagendra Kumar
>>> Sent: 23 August 2016 15:49
>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>> Cc: opensaf-devel@lists.sourceforge.net
>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>> callback completes during headless [#1725 part 1] V1
>>>
>>> Please consider previous TC as TC #1
>>>
>>> TC #2: Same configuration as TC #1. Logs attached in the ticket TC #2.
>>>
>>> Steps:
>>> 1. Same as step #1 of TC #1.
>>> 2. After locking SU1, keep delay in avnd_evt_avd_info_su_si_assign_evh and
>>> stop SC-1 and SC-2.
>>> 3. Start SC-1 and SC-2. SU1 is still in quiesced state. Ideally, it should have no
>>> assignment and SU3 should have got assignment.
>>>
>>> safSISU=safSu=SU3\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>> saAmfSISUHAState=STANDBY(2)
>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>> saAmfSISUHAState=ACTIVE(1)
>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>> saAmfSISUHAState=ACTIVE(1)
>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>> saAmfSISUHAState=ACTIVE(1)
>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>> saAmfSISUHAState=ACTIVE(1)
>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>> saAmfSISUHAState=ACTIVE(1)
>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>> saAmfSISUHAState=STANDBY(2)
>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>> saAmfSISUHAState=ACTIVE(1)
>>>
>>> After that PL-3 rebooted by the following logs:
>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: TIMEOUT receiving AMF health check request, generating core for amfnd
>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: Last received healthcheck cnt=82 at Tue Aug 23 15:30:52 2016
>>> Aug 23 15:31:52 PM_PL-3 osafamfwd[18056]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMFND unresponsive, AMFWDOG initiated system reboot, OwnNodeId = 131855, SupervisionTime = 60
>>> Aug 23 15:31:52 PM_PL-3 opensaf_reboot: Rebooting local node; timeout=60
>>>
>>> Thanks
>>> -Nagu
>>>
>>>> -----Original Message-----
>>>> From: Nagendra Kumar
>>>> Sent: 23 August 2016 15:19
>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>>> callback completes during headless [#1725 part 1] V1
>>>>
>>>> Please note that it is on change set 7846:31417997c82f and I have
>>>> applied patch of ticket #1894.
>>>>
>>>> Thanks
>>>> -Nagu
>>>>> -----Original Message-----
>>>>> From: Nagendra Kumar
>>>>> Sent: 23 August 2016 15:15
>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: RE: [PATCH 2 of 2] AMFND: Admin operation continuation if
>>>>> csi callback completes during headless [#1725 part 1] V1
>>>>>
>>>>> Hi Minh,
>>>>> The following SU lock case is not working. This issue will exist
>>>>> for all the flows, so please check.
>>>>>
>>>>> Configuration and traces attached in the ticket.
>>>>>
>>>>> Steps:
>>>>> 1. Start SC-1, SC-2, PL-3 and PL-4. Run the following command:
>>>>> immcfg -f /tmp/AppConfig-2N-1725.xml
>>>>> amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> amf-adm unlock-in safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> amf-adm unlock-in safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> amf-adm unlock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> amf-adm unlock safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> amf-adm unlock safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>
>>>>> Assignments are:
>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>> saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>> saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>>
>>>>> 2. Issue lock on SU1.
>>>>> amf-adm lock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>> And keep gdb in csi_set callback. Stop SC-1 and SC-2.
>>>>> Send Ok from csi_set callback.
>>>>>
>>>>> 3. Start SC-1 and SC-2.
>>>>>
>>>>> 4. Assignment to components of SU2 is not given and assignments of
>>>>> SU2 still shows Standby.
>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>> saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>> saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>> saAmfSISUHAState=ACTIVE(1)
>>>>>
>>>>>
>>>>> Thanks
>>>>> -Nagu
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
>>>>>> Sent: 05 August 2016 02:50
>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
>>>>>> minh.c...@dektech.com.au
>>>>>> Cc:
opensaf-devel@lists.sourceforge.net >>>>>> Subject: [PATCH 2 of 2] AMFND: Admin operation continuation if csi >>>>>> callback completes during headless [#1725 part 1] V1 >>>>>> >>>>>> osaf/services/saf/amf/amfnd/di.cc | 199 >>>>>> +++++++++++++++++- >>> --- >>>> -- >>>>> -- >>>>>> osaf/services/saf/amf/amfnd/include/avnd_di.h | 1 + >>>>>> 2 files changed, 134 insertions(+), 66 deletions(-) >>>>>> >>>>>> >>>>>> The patch buffers susi_resp_msg during headless stage and resend >>>>>> it to AMFD after headless. >>>>>> >>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc >>>>>> b/osaf/services/saf/amf/amfnd/di.cc >>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc >>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc >>>>>> @@ -804,11 +804,6 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>> if (cb->term_state == >>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) >>>>>> return rc; >>>>>> >>>>>> - if (cb->is_avd_down == true) { >>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>> - return rc; >>>>>> - } >>>>>> - >>>>>> // should be in assignment pending state to be here >>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); >>>>>> >>>>>> @@ -819,64 +814,76 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, >>>>>> prv_state=%u", su->name.value, curr_si->name.value,curr_si- >>>>>>> curr_state,curr_si->prv_state); >>>>>> /* populate the susi resp msg */ >>>>>> msg.info.avd = new AVSV_DND_MSG(); >>>>>> - msg.type = AVND_MSG_AVD; >>>>>> - msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>> snd_msg_id); >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>> node_info.nodeId; >>>>>> - if (si) { >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>> - ((si->single_csi_add_rem_in_si == AVSV_SUSI_ACT_BASE) >>> ? 
>>>>>> false : true); >>>>>> - } >>>>>> - TRACE("curr_assign_state '%u'", >>>>>> curr_si->curr_assign_state); >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>> || >>>>>> - >>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) >>>> ? >>>>>> - ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN : >>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>> - if (si) { >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>> name; >>>>>> - if (AVSV_SUSI_ACT_ASGN == >>>>>> si->single_csi_add_rem_in_si) { >>>>>> - TRACE("si->curr_assign_state '%u'", >>>>>> curr_si- >>>>>>> curr_assign_state); >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>> - >>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>> - >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>> - AVSV_SUSI_ACT_ASGN : >>>>>> AVSV_SUSI_ACT_DEL; >>>>>> - } >>>>>> - } >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>> || >>>>>> - >>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) >>>> ? >>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>> + msg.type = AVND_MSG_AVD; >>>>>> + msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>> node_info.nodeId; >>>>>> + if (si) { >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>> + ((si->single_csi_add_rem_in_si == >>>>>> AVSV_SUSI_ACT_BASE) ? 
false : true); >>>>>> + } >>>>>> + TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>> + >>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>> + >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>> + ((!curr_si->prv_state) ? >>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>> + if (si) { >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>>>>> name; >>>>>> + if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) { >>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>> curr_assign_state); >>>>>> + msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.msg_act = >>>>>> + >>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>> + >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>> + AVSV_SUSI_ACT_ASGN : >>>>>> AVSV_SUSI_ACT_DEL; >>>>>> + } >>>>>> + } >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>> + >>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>> + >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ? >>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>> >>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_ASGN) >>>>>> - osafassert(si); >>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_ASGN) >>>>>> + osafassert(si); >>>>>> >>>>>> - /* send the msg to AvD */ >>>>>> - TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', >>>>>> su'%s', >>>>> si'%s', >>>>>> ha_state'%u', error'%u', single_csi'%u'", >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>> msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.node_id, >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>> msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, >>> msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.single_csi); >>>>>> + /* send the msg to AvD */ >>>>>> + TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', su'%s', >>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); >>>>>> >>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>> - if >>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_DEL) >>>>>> - LOG_NO("Removed 'all SIs' from '%s'", >>>>>> su->name.value); >>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_DEL) >>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- >>>>>>> name.value); >>>>>> - if >>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_MOD) >>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>> - ha_state[msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>> - su->name.value); >>>>>> - } >>>>>> + if 
(msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>> AVSV_SUSI_ACT_MOD) >>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>> + ha_state[msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>> + su->name.value); >>>>>> + } >>>>>> >>>>>> - rc = avnd_di_msg_send(cb, &msg); >>>>>> - if (NCSCC_RC_SUCCESS == rc) >>>>>> - msg.info.avd = 0; >>>>>> - >>>>>> - /* we have completed the SU SI msg processing */ >>>>>> - if (su_assign_state_is_stable(su)) >>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>> + if (cb->is_avd_down == true) { >>>>>> + // We are in headless, buffer this msg >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; >>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { >>>>>> + rc = NCSCC_RC_FAILURE; >>>>>> + } >>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF >>>>>> director is offline"); >>>>>> + } else { >>>>>> + // We are in normal cluster, send msg to director >>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>> snd_msg_id); >>>>>> + /* send the msg to AvD */ >>>>>> + rc = avnd_di_msg_send(cb, &msg); >>>>>> + if (NCSCC_RC_SUCCESS == rc) >>>>>> + msg.info.avd = 0; >>>>>> + /* we have completed the SU SI msg processing */ >>>>>> + if (su_assign_state_is_stable(su)) { >>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>> + } >>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>> + } >>>>>> >>>>>> /* free the contents of avnd message */ >>>>>> avnd_msg_content_free(cb, &msg); @@ -1255,14 +1262,7 @@ void >>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ >>>>>> /* stop the AvD msg response timer */ >>>>>> if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { >>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, *rec); >>>>>> - // Resend msgs from queue because amfd dropped during >>>>>> sync >>>>>> - if ((cb->dnd_list.head != nullptr)) { >>>>>> - TRACE("retransmit message to amfd"); >>>>>> - AVND_DND_MSG_LIST *pending_rec = 0; >>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec 
!= >>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>> - avnd_diq_rec_send(cb, pending_rec); >>>>>> - } >>>>>> - } >>>>>> + avnd_diq_rec_send_buffered_msg(cb); >>>>>> /* resend pg start track */ >>>>>> avnd_di_resend_pg_start_track(cb); >>>>>> } >>>>>> @@ -1275,6 +1275,73 @@ void avnd_diq_rec_del(AVND_CB *cb, >>> AVND_ >>>>>> TRACE_LEAVE(); >>>>>> return; >>>>>> } >>>>>> >>> +/************************************************************ >>>>>> **************** >>>>>> + Name : avnd_diq_rec_send_buffered_msg >>>>>> + >>>>>> + Description : Resend buffered msg >>>>>> + >>>>>> + Arguments : cb - ptr to the AvND control block >>>>>> + >>>>>> + Return Values : None. >>>>>> + >>>>>> + Notes : None. >>>>>> >>> +************************************************************* >>>>>> ********** >>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB *cb) { >>>>>> + TRACE_ENTER(); >>>>>> + // Resend msgs from queue because amfnd dropped during headless >>>>>> + // or headless-synchronization >>>>>> + if ((cb->dnd_list.head != nullptr)) { >>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; >>>>>> + TRACE("Attach msg_id of buffered msg"); >>>>>> + bool found = true; >>>>>> + while (found) { >>>>>> + found = false; >>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>> + if (pending_rec->msg.type == >>>>>> AVND_MSG_AVD) { >>>>>> + // At this moment, only oper_state >>>>>> msg needs to report to director >>>>>> + if (pending_rec->msg.info.avd- >>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && >>>>>> + pending_rec->msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { >>>>>> + m_AVND_DIQ_REC_POP(cb, >>>>>> pending_rec); #if 0 >>>>>> + // only resend if this SUSI >>>>>> does exist >>>>>> + AVND_SU *su = >>>>>> m_AVND_SUDB_REC_GET(cb->sudb, >>>>>> + pending_rec- >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name); >>>>>> + if (su != nullptr && su- >>>>>>> si_list.n_nodes > 0) { #endif >>>>>> + 
pending_rec- >>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = >>>>>>> ++(cb->snd_msg_id); >>>>>> + >>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); >>>>>> + LOG_NO("Found and >>>>>> resend buffered su_si_assign msg for SU:'%s', " >>>>>> + >>>>>> "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', " >>>>>> + >>>>>> "error:'%u', msg_id:'%u'", >>>>>> + >>>>>> pending_rec->msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>> + >>>>>> pending_rec->msg.info.avd- >>>>>>> msg_info.n2d_su_si_assign.si_name.value, >>>>>> + >>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>> + >>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>> + >>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.single_csi, >>>>>> + >>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>> + >>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id); >>>>>> + >>>>>> +#if 0 >>>>>> + } else { >>>>>> + >>>>>> avnd_msg_content_free(cb, &pending_rec->msg); >>>>>> + delete pending_rec; >>>>>> + pending_rec = cb- >>>>>>> dnd_list.head; >>>>>> + } >>>>>> +#endif >>>>>> + found = true; >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + } >>>>>> + TRACE("retransmit message to amfd"); >>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>> nullptr; >>>>>> pending_rec = pending_rec->next) { >>>>>> + avnd_diq_rec_send(cb, pending_rec); >>>>>> + } >>>>>> + } >>>>>> + TRACE_LEAVE(); >>>>>> + return; >>>>>> +} >>>>>> >>>>>> >>>>>> >>> /************************************************************* >>>>>> *************** >>>>>> Name : avnd_diq_rec_send >>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd void >>>>>> avnd_diq_del(struct avnd_cb_tag *); AVND_DND_MSG_LIST 
>>>>>> *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg); void >>>>>> avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST >>> *rec); >>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb); >>>>>> uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, >>>>> AVND_DND_MSG_LIST >>>>>> *rec); uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, >>>>>> SaNameT *su_name, uint32_t ret_code); uint32_t >>>>>> avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, uint32_t rcv_id, >>>>>> uint32_t view_num);
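For readers skimming the quoted patch, here is a rough C++ sketch of the behaviour described in its summary ("The patch buffers susi_resp_msg during headless stage and resend it to AMFD after headless"). NodeDirector, SusiResp and the sent vector are hypothetical simplifications; the real code works on AVND_CB and AVND_DND_MSG_LIST through the avnd_diq_rec_* helpers. One deliberate simplification: this sketch drops records from the queue as soon as they are resent, whereas the real queue keeps them until AMFD acknowledges them.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical, simplified model of a SU-SI assignment response.
struct SusiResp {
  std::string su;
  std::string si;
  uint32_t msg_id;  // 0 marks a response buffered while headless
};

// Hypothetical stand-in for the node director's control block.
struct NodeDirector {
  bool director_down = false;   // plays the role of cb->is_avd_down
  uint32_t snd_msg_id = 0;      // plays the role of cb->snd_msg_id
  std::deque<SusiResp> queue;   // responses buffered during headless
  std::vector<SusiResp> sent;   // stands in for the transport to AMFD

  // While headless, buffer the response with msg_id 0;
  // otherwise stamp the next msg_id and send immediately.
  void susi_resp_send(const std::string& su, const std::string& si) {
    if (director_down) {
      queue.push_back({su, si, 0});
    } else {
      sent.push_back({su, si, ++snd_msg_id});
    }
  }

  // After headless: give every buffered response a real msg_id,
  // then resend everything that is queued.
  void send_buffered() {
    for (auto& m : queue) {
      if (m.msg_id == 0) m.msg_id = ++snd_msg_id;
    }
    for (const auto& m : queue) sent.push_back(m);
    queue.clear();
  }
};
```

The point of stamping msg_id only at resend time is that the buffered responses get ids that continue the sequence in use once the director is reachable again, rather than ids chosen while nobody was there to receive them.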