Hi Minh,

Please see my comments inline, marked with [Praveen].

Thanks,
Praveen

On 29-Aug-16 5:57 AM, minh chau wrote:
> Hi Praveen,
>
> Thanks for looking through the patch.
> The potential problem with restoring the nodegroup is that a nodegroup
> can be created in LOCKED state while its SUs still have assignments,
> which could cause an ambiguity for AMFD after headless. For example:
> Suppose SU4 is hosted on PL4 and SU5 is hosted on PL5; SU4 has the
> active assignment and SU5 has the standby assignment.
> case 1: Create nodegroup (PL4 + PL5) with LOCKED, lock PL5, lock PL4,
> delay quiesced csi cbk, stop SC, restart SC.

[Praveen] In this case, after the headless state the SG FSM will not be
in SG_ADMIN state, because the payloads are being locked one by one. So
it is distinguishable that this is not an NG operation: the SG is not in
SG_ADMIN state even though the SG is fully assigned in the NG.

> case 2: Create nodegroup (PL4 + PL5) with LOCKED, lock nodegroup, delay
> quiesced csi cbk, stop SC, restart SC.

[Praveen] In this case we have the following information after the
headless state:
- SG is in SG_ADMIN state.
- NG is in SHUTTING_DOWN or LOCKED state.
- Nodes are in SHUTTING_DOWN or LOCKED state.
- The SG FSM remains in SG_ADMIN state only in case of an admin
  operation on the SG. But after headless, the SG is not found in
  UNLOCKED state, and one NG and its nodes are found in
  LOCKED/SHUTTING_DOWN state.
I think that with the above information, AMFD can set
@ng_using_saAmfSGAdminState and set the SG admin state to SHUTTING_DOWN
or LOCKED. Restoring admin_ng is not required because, in
su_si_assign(), there is an OR condition between
@ng_using_saAmfSGAdminState and @admin_ng for calling
process_su_si_response_for_ng(). Also, the checks on admin_ng are used
only for updating counters related to the completion of admin
operations, which is not required after headless.
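The deduction described for case 2 can be pictured with the sketch below. It is only an illustration under assumptions: the enums, the `SgView` and `NodeGroupView` structs and the function names are hypothetical stand-ins, not AMFD's real types, and the check covers just the first three facts listed above (SG FSM in SG_ADMIN, NG and all of its nodes in SHUTTING_DOWN or LOCKED).

```cpp
#include <vector>

// Hypothetical, simplified stand-ins for AMFD state (not the OpenSAF types).
enum class AdminState { kUnlocked, kLocked, kShuttingDown };
enum class SgFsmState { kStable, kSgRealign, kSuOper, kSgAdmin };

struct NodeGroupView {
  AdminState admin_state;                     // admin state of the NG
  std::vector<AdminState> node_admin_states;  // admin state of each node in the NG
};

struct SgView {
  SgFsmState fsm_state;
  bool ng_using_sg_admin_state;  // mirrors @ng_using_saAmfSGAdminState
};

bool IsLockedOrShuttingDown(AdminState s) {
  return s == AdminState::kLocked || s == AdminState::kShuttingDown;
}

// Returns true when the state seen after headless matches case 2:
// SG FSM in SG_ADMIN, and the NG plus all of its nodes LOCKED/SHUTTING_DOWN.
bool LooksLikeNgAdminOperation(const SgView& sg, const NodeGroupView& ng) {
  if (sg.fsm_state != SgFsmState::kSgAdmin) return false;
  if (!IsLockedOrShuttingDown(ng.admin_state)) return false;
  if (ng.node_admin_states.empty()) return false;
  for (AdminState s : ng.node_admin_states) {
    if (!IsLockedOrShuttingDown(s)) return false;
  }
  return true;
}

// Sets the flag from the deduction instead of reading it back from IMM.
void RestoreNgFlagAfterHeadless(SgView& sg, const NodeGroupView& ng) {
  sg.ng_using_sg_admin_state = LooksLikeNgAdminOperation(sg, ng);
}
```

In case 1 the SG FSM is not in SG_ADMIN, so the flag stays false even though the NG and both nodes are LOCKED; in case 2 it becomes true. admin_ng is deliberately left out of the sketch, following the argument that it need not be restored.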
>
> If case 2 actually happened before headless, then @admin_ng and
> @ng_using_saAmfSGAdminState need to be restored; otherwise
> process_su_si_response_for_ng() won't be called, saAmfSGAdminState
> remains LOCKED, and the SG is still not in STABLE state.
>
> But in both cases, after headless, AMFD sees that all PLs are LOCKED,
> the nodegroup is LOCKED, and SU4 has a pending quiesced csi cbk, so
> they run into the same code flow. In case 1, @admin_ng and
> @ng_using_saAmfSGAdminState should not be set, since case 1 was not a
> nodegroup operation before headless.
> I have run a test of both cases; they work with the patch attached in
> the ticket, but it still looks like a potential problem since not all
> cases are transparent to AMFD after headless; @admin_ng and
> @ng_using_saAmfSGAdminState may get hit at some points in case 1.

[Praveen] Besides the above cases, there remains only one case: when the
operation was initiated on the NG and the SG is partially mapped in the
NG. In this case, after the headless state we can get only two states of
the SG: either SU_OPER or SG_REALIGN. In both cases I think we do not
need to restore @ng_using_saAmfSGAdminState and @admin_ng, because we do
not need to enter process_su_si_response_for_ng(). In sg_2n_fsm, the
node is moved from SHUTTING_DOWN to LOCKED in susi_success_su_oper().
Have I missed any other case?

>
> If case 1 looks ok to you from the nodegroup point of view, then I will
> float the patch for review.
>
> Thanks,
> Minh
>
>
> On 26/08/16 16:08, praveen malviya wrote:
>> Hi,
>>
>> I have gone through the amfd traces. The patch for NG also seems to be
>> OK, but some minor improvements can be made.
>>
>> As pointed out by Minh, when the whole SG is mapped in the NG (say
>> case a), AMFD uses the SG_ADMIN flow and the SG admin state without
>> exposing it to the user through IMM for the 2N model. In the other
>> case, when only one SU is assigned in the NG (say case b), there
>> should not be any problem because the operation fully depends on the
>> NG admin state.
>> Since case b) does not use the SG admin state or
>> ng_using_saAmfSGAdminState, it should work fine.
>>
>> I think we can use the following facts and functions to improve the
>> patch, and with that, restoring ng_using_saAmfSGAdminState from IMM
>> may not be required:
>> 1) In a normal cluster, if a controller switchover/failover happens
>> while an NG operation is going on, the standby controller continues
>> the admin operation with the information that it gets through CKPT
>> updates in dec_sg_admin_state() and dec_ng_admin_state(). The active
>> controller never checkpoints ng_using_saAmfSGAdminState; it is deduced
>> in these functions. The situation after headless is almost like that.
>> I think that in case a), the shutdown operation (where the admin state
>> of the NG is still SHUTTING_DOWN when the system becomes headless)
>> requires more parameters, whereas the lock operation does not. In a
>> shutdown operation, AMFD has to ensure the transition of the NG and
>> its nodes to LOCKED state.
>>
>> 2) As with controller failover/switchover, after headless we are not
>> bound to reply to IMM for admin operation completion. So we need to
>> analyse whether we need to restore node->admin_ng. Half of the code in
>> process_su_si_response_for_ng() is for tracking the state of the admin
>> operation so that AMFD can reply to IMM for the admin operation, and
>> this is not required after the headless state.
>>
>> I think the problem is not that complex, as it is valid only for 2N
>> models and only in case a).
>>
>> Thanks,
>> Praveen
>>
>> On 25-Aug-16 6:38 PM, minh chau wrote:
>>> Hi,
>>>
>>> The test failed for two reasons:
>>> 1. There are two places where the nodegroup operation borrows the 2N
>>> SG FSM, but the AdminState of the SG is not stored to IMM:
>>>     saAmfSGAdminState = ng->saAmfNGAdminState;
>>>     ...
>>>     su->sg_of_su->saAmfSGAdminState = SA_AMF_ADMIN_UNLOCKED;
>>>
>>> This setting needs to be done through AVD_SG::set_admin_state()
>>>
>>> 2. After receiving the su_si assignment response after headless,
>>> @admin_ng and @ng_using_saAmfSGAdminState have not been restored.
>>> They need to be restored somehow. A nodegroup is allowed to be
>>> created in any adminState, so there can be a case where the
>>> nodegroup's AdminState is created as LOCKED but its SUs still have
>>> assignments, so the adminState of the nodegroup can't be used.
>>> It seems admin_ng and ng_using_saAmfSGAdminState need to be stored to
>>> IMM?
>>> @Praveen: any suggestions?
>>>
>>>   } else if ((su->sg_of_su->sg_ncs_spec == false) &&
>>>       ((su->su_on_node->admin_ng != nullptr) ||
>>>        (su->sg_of_su->ng_using_saAmfSGAdminState == true))) {
>>>     AVD_AMF_NG *ng = su->su_on_node->admin_ng;
>>>     // Got response from AMFND for assignments; decrement su_cnt_admin_oper.
>>>     if ((ng != nullptr) &&
>>>         (((((ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_SHUTDOWN) ||
>>>             (ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_LOCK)) &&
>>>            (su->saAmfSUNumCurrActiveSIs == 0) &&
>>>            (su->saAmfSUNumCurrStandbySIs == 0) &&
>>>            (AVSV_SUSI_ACT_DEL == n2d_msg->msg_info.n2d_su_si_assign.msg_act))) ||
>>>          (ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_UNLOCK))) {
>>>       su->su_on_node->su_cnt_admin_oper--;
>>>       TRACE("node:'%s', su_cnt_admin_oper:%u",
>>>             su->su_on_node->name.c_str(), su->su_on_node->su_cnt_admin_oper);
>>>     }
>>>     process_su_si_response_for_ng(su, SA_AIS_OK);
>>>
>>> On 25/08/16 21:36, Nagendra Kumar wrote:
>>>> Further testing results:
>>>> Node group lock has left the SG unstable. Logs and configuration
>>>> file attached.
>>>>
>>>> Configuration: SC-1, PL-3 and PL-4.
>>>>
>>>> Steps:
>>>>
>>>> 1. Unlock SU1 (on PL-3), SU2 and SU3 (both on PL-4).
>>>> 2. Create node group of PL-3 and PL-4:
>>>> 3. Lock the node group.
>>>>    amf-adm lock safAmfNodeGroup=nagu,safAmfCluster=myAmfCluster
>>>> 4. Keep gdb in the csi set callback, stop SC-1, respond OK from the
>>>>    csi set callback, and start SC-1.
>>>>
>>>> SG becomes unstable if you try to unlock the Node group:
>>>> Aug 25 16:57:06 PM_SC-1 osafamfd[2166]: NO 'safSg=AmfDemo_2N,safApp=AmfDemo1' is in unstable/transition state
>>>>
>>>>
>>>> Thanks
>>>> -Nagu
>>>>
>>>>> -----Original Message-----
>>>>> From: Nagendra Kumar
>>>>> Sent: 24 August 2016 16:58
>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation
>>>>> continuation if csi callback completes during headless [#1725 part 1] V1
>>>>>
>>>>> The below is the assignments after the test case (SU2 has standby
>>>>> assignment):
>>>>>
>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>
>>>>> Thanks
>>>>> -Nagu
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Nagendra Kumar
>>>>>> Sent: 24 August 2016 16:55
>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation
>>>>>> continuation if csi callback completes during headless [#1725 part 1] V1
>>>>>>
>>>>>> Hi Minh,
>>>>>> With 1725_phase_1_V2.tgz, the below email TC has failed. Please
>>>>>> find the traces attached along with the configuration in the ticket.
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Nagendra Kumar
>>>>>>> Sent: 23 August 2016 15:15
>>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation
>>>>>>> continuation if csi callback completes during headless [#1725 part 1] V1
>>>>>>>
>>>>>>> Hi Minh,
>>>>>>> The following SU lock case is not working. This issue will exist
>>>>>>> for all the flows, so please check.
>>>>>>>
>>>>>>> Configuration and traces attached in the ticket.
>>>>>>>
>>>>>>> Steps:
>>>>>>> 1. Start SC-1, SC-2, PL-3 and PL-4. Run the following command:
>>>>>>> immcfg -f /tmp/AppConfig-2N-1725.xml
>>>>>>> amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> amf-adm unlock-in safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> amf-adm unlock-in safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> amf-adm unlock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> amf-adm unlock safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> amf-adm unlock safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>
>>>>>>> Assignments are:
>>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>
>>>>>>> 2. Issue lock on SU1.
>>>>>>> amf-adm lock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>> And keep gdb in csi_set callback. Stop SC-1 and SC-2.
>>>>>>> Send Ok from csi_set callback.
>>>>>>>
>>>>>>> 3. Start SC-1 and SC-2.
>>>>>>>
>>>>>>> 4. Assignment to components of SU2 is not given and assignments of
>>>>>>> SU2 still shows Standby.
>>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Nagu
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
>>>>>>>> Sent: 05 August 2016 02:50
>>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
>>>>>>>> minh.c...@dektech.com.au
>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>> Subject: [PATCH 2 of 2] AMFND: Admin operation continuation if csi
>>>>>>>> callback completes during headless [#1725 part 1] V1
>>>>>>>>
>>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
>>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
>>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
>>>>>>>>
>>>>>>>>
>>>>>>>> The patch buffers susi_resp_msg during the headless stage and
>>>>>>>> resends it to AMFD after headless.
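The buffer-then-resend behaviour described above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the amfnd code: `SusiResp` and `DirectorLink` are hypothetical names, plain vectors stand in for `cb->dnd_list`, and the only detail taken from the patch is that a response produced while the director is down is stored with msg_id 0 and receives a real message id when it is resent.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced form of an su_si_assign response message.
struct SusiResp {
  std::uint32_t msg_id;
  std::string su_name;
};

// Hypothetical model of the node director's link to the AMF director.
class DirectorLink {
 public:
  void set_director_down(bool down) { director_down_ = down; }

  // Headless: keep the message with msg_id 0. Otherwise: number it and send.
  void Send(SusiResp msg) {
    if (director_down_) {
      msg.msg_id = 0;
      buffered_.push_back(std::move(msg));
    } else {
      msg.msg_id = ++snd_msg_id_;
      sent_.push_back(std::move(msg));
    }
  }

  // After headless: give each buffered message a fresh id, in the order it
  // was buffered, then send it.
  void FlushBuffered() {
    for (SusiResp& msg : buffered_) {
      msg.msg_id = ++snd_msg_id_;
      sent_.push_back(std::move(msg));
    }
    buffered_.clear();
  }

  const std::vector<SusiResp>& sent() const { return sent_; }
  std::size_t buffered_count() const { return buffered_.size(); }

 private:
  bool director_down_ = false;
  std::uint32_t snd_msg_id_ = 0;    // plays the role of cb->snd_msg_id
  std::vector<SusiResp> buffered_;  // messages waiting for the director
  std::vector<SusiResp> sent_;      // stand-in for messages handed to the director
};
```

The sketch keeps buffered messages in their own container for clarity; the patch instead leaves them in the existing queue and recognises them by msg_id == 0.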
>>>>>>>> >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc >>>>>>>> b/osaf/services/saf/amf/amfnd/di.cc >>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc >>>>>>>> @@ -804,11 +804,6 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>>> if (cb->term_state == >>>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) >>>>>>>> return rc; >>>>>>>> >>>>>>>> - if (cb->is_avd_down == true) { >>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>>> - return rc; >>>>>>>> - } >>>>>>>> - >>>>>>>> // should be in assignment pending state to be here >>>>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); >>>>>>>> >>>>>>>> @@ -819,64 +814,76 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, >>>>>>>> prv_state=%u", su->name.value, curr_si->name.value,curr_si- >>>>>>>>> curr_state,curr_si->prv_state); >>>>>>>> /* populate the susi resp msg */ >>>>>>>> msg.info.avd = new AVSV_DND_MSG(); >>>>>>>> - msg.type = AVND_MSG_AVD; >>>>>>>> - msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>>> snd_msg_id); >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>>> node_info.nodeId; >>>>>>>> - if (si) { >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>>> - ((si->single_csi_add_rem_in_si == >>>>>>>> AVSV_SUSI_ACT_BASE) >>>>> ? >>>>>>>> false : true); >>>>>>>> - } >>>>>>>> - TRACE("curr_assign_state '%u'", >>>>>>>> curr_si->curr_assign_state); >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>>> || >>>>>>>> - >>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) >>>>>> ? >>>>>>>> - ((!curr_si->prv_state) ? 
AVSV_SUSI_ACT_ASGN : >>>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>>> - if (si) { >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = >>>>>>>> si- >>>>>> name; >>>>>>>> - if (AVSV_SUSI_ACT_ASGN == >>>>>>>> si->single_csi_add_rem_in_si) { >>>>>>>> - TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>>>> curr_assign_state); >>>>>>>> - >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>> - >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>> - >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>> - AVSV_SUSI_ACT_ASGN : >>>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>>> - } >>>>>>>> - } >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>>> || >>>>>>>> - >>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) >>>>>> ? >>>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>>> + msg.type = AVND_MSG_AVD; >>>>>>>> + msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>>> node_info.nodeId; >>>>>>>> + if (si) { >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>>> + ((si->single_csi_add_rem_in_si == >>>>>>>> AVSV_SUSI_ACT_BASE) ? false : true); >>>>>>>> + } >>>>>>>> + TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>> + >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>> + >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>> + ((!curr_si->prv_state) ? 
>>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>>> + if (si) { >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>>>>>>> name; >>>>>>>> + if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) { >>>>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>>>> curr_assign_state); >>>>>>>> + msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.msg_act = >>>>>>>> + >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>> + >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>> + AVSV_SUSI_ACT_ASGN : >>>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>>> + } >>>>>>>> + } >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>>> + >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>> + >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ? >>>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>>> >>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>>> - osafassert(si); >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>>> + osafassert(si); >>>>>>>> >>>>>>>> - /* send the msg to AvD */ >>>>>>>> - TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', >>>>>>>> su'%s', >>>>>>> si'%s', >>>>>>>> ha_state'%u', error'%u', single_csi'%u'", >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>> msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.node_id, >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>> msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>> msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.single_csi); >>>>>>>> + /* send the msg to AvD */ >>>>>>>> + TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', su'%s', >>>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); >>>>>>>> >>>>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>>> - if >>>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>>> - LOG_NO("Removed 'all SIs' from '%s'", >>>>>>>> su->name.value); >>>>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- >>>>>>>>> name.value); >>>>>>>> - if >>>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>>> - ha_state[msg.info.avd- >>>>>>>>> 
msg_info.n2d_su_si_assign.ha_state], >>>>>>>> - su->name.value); >>>>>>>> - } >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>>> + ha_state[msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>>>> + su->name.value); >>>>>>>> + } >>>>>>>> >>>>>>>> - rc = avnd_di_msg_send(cb, &msg); >>>>>>>> - if (NCSCC_RC_SUCCESS == rc) >>>>>>>> - msg.info.avd = 0; >>>>>>>> - >>>>>>>> - /* we have completed the SU SI msg processing */ >>>>>>>> - if (su_assign_state_is_stable(su)) >>>>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>>> + if (cb->is_avd_down == true) { >>>>>>>> + // We are in headless, buffer this msg >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; >>>>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { >>>>>>>> + rc = NCSCC_RC_FAILURE; >>>>>>>> + } >>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF >>>>>>>> director is offline"); >>>>>>>> + } else { >>>>>>>> + // We are in normal cluster, send msg to director >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>>> snd_msg_id); >>>>>>>> + /* send the msg to AvD */ >>>>>>>> + rc = avnd_di_msg_send(cb, &msg); >>>>>>>> + if (NCSCC_RC_SUCCESS == rc) >>>>>>>> + msg.info.avd = 0; >>>>>>>> + /* we have completed the SU SI msg processing */ >>>>>>>> + if (su_assign_state_is_stable(su)) { >>>>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>>> + } >>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>>> + } >>>>>>>> >>>>>>>> /* free the contents of avnd message */ >>>>>>>> avnd_msg_content_free(cb, &msg); @@ -1255,14 +1262,7 @@ void >>>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ >>>>>>>> /* stop the AvD msg response timer */ >>>>>>>> if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { >>>>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, *rec); >>>>>>>> - // Resend msgs from queue because amfd dropped during >>>>>>>> sync >>>>>>>> - 
if ((cb->dnd_list.head != nullptr)) { >>>>>>>> - TRACE("retransmit message to amfd"); >>>>>>>> - AVND_DND_MSG_LIST *pending_rec = 0; >>>>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>>> - avnd_diq_rec_send(cb, pending_rec); >>>>>>>> - } >>>>>>>> - } >>>>>>>> + avnd_diq_rec_send_buffered_msg(cb); >>>>>>>> /* resend pg start track */ >>>>>>>> avnd_di_resend_pg_start_track(cb); >>>>>>>> } >>>>>>>> @@ -1275,6 +1275,73 @@ void avnd_diq_rec_del(AVND_CB *cb, >>>>> AVND_ >>>>>>>> TRACE_LEAVE(); >>>>>>>> return; >>>>>>>> } >>>>>>>> >>>>> +/************************************************************ >>>>>>>> **************** >>>>>>>> + Name : avnd_diq_rec_send_buffered_msg >>>>>>>> + >>>>>>>> + Description : Resend buffered msg >>>>>>>> + >>>>>>>> + Arguments : cb - ptr to the AvND control block >>>>>>>> + >>>>>>>> + Return Values : None. >>>>>>>> + >>>>>>>> + Notes : None. >>>>>>>> >>>>> +************************************************************* >>>>>>>> ********** >>>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB *cb) { >>>>>>>> + TRACE_ENTER(); >>>>>>>> + // Resend msgs from queue because amfnd dropped during >>>>>>>> headless >>>>>>>> + // or headless-synchronization >>>>>>>> + if ((cb->dnd_list.head != nullptr)) { >>>>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; >>>>>>>> + TRACE("Attach msg_id of buffered msg"); >>>>>>>> + bool found = true; >>>>>>>> + while (found) { >>>>>>>> + found = false; >>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>>> + if (pending_rec->msg.type == >>>>>>>> AVND_MSG_AVD) { >>>>>>>> + // At this moment, only oper_state >>>>>>>> msg needs to report to director >>>>>>>> + if (pending_rec->msg.info.avd- >>>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && >>>>>>>> + pending_rec->msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { >>>>>>>> + m_AVND_DIQ_REC_POP(cb, >>>>>>>> 
pending_rec); #if 0 >>>>>>>> + // only resend if this SUSI >>>>>>>> does exist >>>>>>>> + AVND_SU *su = >>>>>>>> m_AVND_SUDB_REC_GET(cb->sudb, >>>>>>>> + pending_rec- >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name); >>>>>>>> + if (su != nullptr && su- >>>>>>>>> si_list.n_nodes > 0) { #endif >>>>>>>> + pending_rec- >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = >>>>>>>>> ++(cb->snd_msg_id); >>>>>>>> + >>>>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); >>>>>>>> + LOG_NO("Found and >>>>>>>> resend buffered su_si_assign msg for SU:'%s', " >>>>>>>> + >>>>>>>> "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', " >>>>>>>> + >>>>>>>> "error:'%u', msg_id:'%u'", >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd- >>>>>>>>> msg_info.n2d_su_si_assign.si_name.value, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.single_csi, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>>> + >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id); >>>>>>>> + >>>>>>>> +#if 0 >>>>>>>> + } else { >>>>>>>> + >>>>>>>> avnd_msg_content_free(cb, &pending_rec->msg); >>>>>>>> + delete pending_rec; >>>>>>>> + pending_rec = cb- >>>>>>>>> dnd_list.head; >>>>>>>> + } >>>>>>>> +#endif >>>>>>>> + found = true; >>>>>>>> + } >>>>>>>> + } >>>>>>>> + } >>>>>>>> + } >>>>>>>> + TRACE("retransmit message to amfd"); >>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>> nullptr; >>>>>>>> pending_rec = pending_rec->next) { >>>>>>>> + avnd_diq_rec_send(cb, pending_rec); >>>>>>>> + } >>>>>>>> + } >>>>>>>> + TRACE_LEAVE(); >>>>>>>> + return; >>>>>>>> +} >>>>>>>> >>>>>>>> >>>>>>>> >>>>> /************************************************************* 
>>>>>>>> *************** >>>>>>>> Name : avnd_diq_rec_send >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd void >>>>>>>> avnd_diq_del(struct avnd_cb_tag *); AVND_DND_MSG_LIST >>>>>>>> *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg); void >>>>>>>> avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST >>>>> *rec); >>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb); >>>>>>>> uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, >>>>>>> AVND_DND_MSG_LIST >>>>>>>> *rec); uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, >>>>>>>> SaNameT *su_name, uint32_t ret_code); uint32_t >>>>>>> avnd_di_ack_nack_msg_send(struct >>>>>>>> avnd_cb_tag *cb, uint32_t rcv_id, uint32_t view_num); >>>>>>> -------------------------------------------------------------------- >>>>>>> -- >>>>>>> -------- _______________________________________________ >>>>>>> Opensaf-devel mailing list >>>>>>> Opensaf-devel@lists.sourceforge.net >>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>>>>> ---------------------------------------------------------------------- >>>>>> >>>>>> -------- _______________________________________________ >>>>>> Opensaf-devel mailing list >>>>>> Opensaf-devel@lists.sourceforge.net >>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>>>> ------------------------------------------------------------------------------ >>>>> >>>>> >>>>> _______________________________________________ >>>>> Opensaf-devel mailing list >>>>> Opensaf-devel@lists.sourceforge.net >>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>> >> > ------------------------------------------------------------------------------ 