Hi Praveen,

I think @admin_ng needs to be restored as well as @ng_using_saAmfSGAdminState, since I just found that the @admin_ng check is required in avd_node_down_appl_susi_failover(), which is hit if a node with a pending csi callback reboots. I will use SG_FSM_ADMIN to differentiate cases 1 and 2.
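A minimal sketch of that differentiation, under stated assumptions: the types below are hypothetical, simplified stand-ins and not the real AMFD classes, and decide_after_headless() is an illustrative name, not a function in the patch. It only encodes the rule discussed in this thread: treat the situation as an interrupted nodegroup operation (case 2) when the SG is fully mapped in the NG, the SG FSM is found in SG_ADMIN, and the NG and its nodes are LOCKED or SHUTTING_DOWN.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified stand-ins for AMFD state. These are NOT the real
// OpenSAF types (AVD_SG, AVD_AMF_NG, ...), only enough to show the decision.
enum class AdminState { kUnlocked, kLocked, kShuttingDown };
enum class SgFsmState { kStable, kSuOper, kSgRealign, kSgAdmin };

struct NodeGroupSnapshot {
  AdminState ng_admin_state;                  // admin state read back for the NG
  std::vector<AdminState> node_admin_states;  // one entry per node in the NG
  bool sg_fully_mapped_in_ng;                 // every SU of the SG is hosted in the NG
};

struct RestoreDecision {
  bool restore_admin_ng;
  bool restore_ng_using_saAmfSGAdminState;
};

inline bool locked_or_shutting_down(AdminState s) {
  return s == AdminState::kLocked || s == AdminState::kShuttingDown;
}

// Assumption: the SG FSM state is known after headless. Case 1 (nodes locked
// one by one) leaves the FSM outside SG_ADMIN; case 2 (lock on the NG itself)
// leaves it in SG_ADMIN, so only case 2 restores the two NG-related fields.
inline RestoreDecision decide_after_headless(SgFsmState sg_fsm,
                                             const NodeGroupSnapshot& ng) {
  const bool nodes_down =
      !ng.node_admin_states.empty() &&
      std::all_of(ng.node_admin_states.begin(), ng.node_admin_states.end(),
                  locked_or_shutting_down);
  const bool was_ng_operation = ng.sg_fully_mapped_in_ng &&
                                sg_fsm == SgFsmState::kSgAdmin &&
                                locked_or_shutting_down(ng.ng_admin_state) &&
                                nodes_down;
  return RestoreDecision{was_ng_operation, was_ng_operation};
}
```

With the same LOCKED nodegroup and LOCKED nodes, an FSM state other than SG_ADMIN (case 1) restores nothing, while SG_ADMIN (case 2) restores both fields. A partially mapped SG also restores nothing, which matches the SU_OPER / SG_REALIGN case raised in the quoted discussion.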
Thanks,
Minh

On 29/08/16 16:25, praveen malviya wrote:
> Hi Minh,
>
> Please see inline with [Praveen]
>
> Thanks,
> Praveen
>
> On 29-Aug-16 5:57 AM, minh chau wrote:
>> Hi Praveen,
>>
>> Thanks for looking through the patch.
>> The potential problem with restoring nodegroup is that a nodegroup is
>> allowed to be created in LOCKED while the SUs have assignments; this
>> could cause an ambiguity for AMFD after headless. For example:
>> Suppose SU4 is hosted on PL4 and SU5 on PL5, SU4 has the active
>> assignment and SU5 has the standby assignment.
>> case 1: Create nodegroup (PL4 + PL5) with LOCKED, lock PL5, lock PL4,
>> delay quiesced csi cbk, stop SC, restart SC.
> [Praveen] In this case, after headless state the SG FSM will not be in
> SG_ADMIN state because the payloads are being locked one by one. So in
> this case it is distinguishable that it is not an NG operation case,
> as SG is not in SG_ADMIN state even though SG is fully assigned in NG.
>> case 2: Create nodegroup (PL4 + PL5) with LOCKED, lock nodegroup, delay
>> quiesced csi cbk, stop SC, restart SC.
> [Praveen] In this case we have the following information after headless
> state:
> -SG is in SG_ADMIN state.
> -NG is in SHUTTING_DOWN or LOCKED state.
> -Nodes are in SHUTTING_DOWN or LOCKED state.
> -SG FSM remains in SG_ADMIN state only in case of admin operation
> on SG. But after headless SG is not found in UNLOCKED state and one NG
> is found in LOCKED/SHUTTING_DOWN state and its nodes.
> I think with the above information, AMFD can set
> @ng_using_saAmfSGAdminState and set SG admin state to SHUTTING_DOWN or
> LOCKED. Restoring admin_ng is not required, as in su_si_assign() there
> is an OR condition between @ng_using_saAmfSGAdminState and @admin_ng
> for calling process_su_si_response_for_ng(). Also, the checks on admin_ng
> are used only for updating counters related to completion of admin
> operations, which is not required after headless.
>
>>
>> If case 2 actually happened before headless, then @admin_ng and
>> @ng_using_saAmfSGAdminState need to be restored, otherwise
>> process_su_si_response_for_ng() won't be called, saAmfSGAdminState
>> remains LOCKED and SG is still not in STABLE state.
>>
>> But in both cases, after headless, AMFD sees all PLs are LOCKED,
>> nodegroup is LOCKED, SU4 has pending quiesced csi cbk, thus they are
>> running into the same code flow. In case 1, @admin_ng and
>> @ng_using_saAmfSGAdminState should not be set since case 1 was not a
>> nodegroup operation before headless.
>> I have run a test of both cases, they are working with the patch
>> attached in the ticket, but it still looks like a potential problem
>> since the cases are not all transparent to AMFD after headless; the
>> @admin_ng and @ng_using_saAmfSGAdminState may get hit at some points
>> in case 1.
> [Praveen] Besides the above cases, there remains only one case: when
> the operation was initiated on NG and SG is partially mapped in NG. In
> this case, after headless state we can get only two states of SG,
> either SU_OPER or SG_REALIGN. In both cases I think we do not
> need to restore @ng_using_saAmfSGAdminState and @admin_ng because
> we do not need to enter process_su_si_response_for_ng(). In
> sg_2n_fsm, it marks Node from SHUTTING_DOWN to LOCKED in
> susi_success_su_oper().
>
> Have I missed any other case?
>
>>
>> If case 1 looks ok to you from nodegroup point of view, then I will
>> float the patch for review.
>>
>> Thanks,
>> Minh
>>
>>
>> On 26/08/16 16:08, praveen malviya wrote:
>>> Hi,
>>>
>>> I have gone through the amfd traces. Also the patch for NG seems to be
>>> ok, but some minor improvements can be done.
>>>
>>> As pointed out by Minh, when the whole SG is mapped in NG (say case a),
>>> AMFD uses the SG_ADMIN flow and SG admin state without exposing it to
>>> the user through IMM for the 2N model.
>>> In the other case, when only one SU is
>>> assigned in NG (say case b), there should not be any problem because
>>> the operation fully depends on NG admin state. Since case b) does
>>> not use SG admin state and ng_using_saAmfSGAdminState, it should work
>>> fine.
>>>
>>> I think we can take the help of the following facts and functions to
>>> improve the patch, and with that restoring ng_using_saAmfSGAdminState
>>> from IMM may not be required:
>>> 1) In a normal cluster, if controller switchover/failover happens while
>>> an NG operation is going on, then the standby controller continues the
>>> admin operation with information that it gets through CKPT updates in
>>> dec_sg_admin_state() and dec_ng_admin_state(). The active controller
>>> never checkpoints ng_using_saAmfSGAdminState; the standby deduces it in
>>> these functions. The situation after headless is almost like that.
>>> I think, in case a, when shutdown operation is going on, admin
>>> state of NG is still SHUTTING_DOWN and system becomes headless,
>>> requires more params and not the lock operation. In shutdown
>>> operation, AMFD has to ensure transition of NG and Nodes to LOCKED
>>> state.
>>>
>>> 2) As with controller fail-over/switch-over, after headless we are also
>>> not bound to reply to IMM for admin operation completion. So we need
>>> to analyse whether we need to restore node->admin_ng. Half of the code
>>> in process_su_si_response_for_ng() is for tracking the state of the
>>> admin operation so that AMFD replies to IMM for the admin operation,
>>> and this is not required after headless state.
>>>
>>> I think the problem is not that complex, as it is valid only for 2N
>>> models and only in case a).
>>>
>>> Thanks,
>>> Praveen
>>>
>>> On 25-Aug-16 6:38 PM, minh chau wrote:
>>>> Hi,
>>>>
>>>> The test failed for two reasons:
>>>> 1. There are two places where the nodegroup operation borrows the 2N
>>>> SG FSM, but the AdminState of SG is not stored to IMM
>>>> saAmfSGAdminState = ng->saAmfNGAdminState;
>>>> ...
>>>> su->sg_of_su->saAmfSGAdminState = SA_AMF_ADMIN_UNLOCKED;
>>>>
>>>> This setting needs to be done through AVD_SG::set_admin_state()
>>>>
>>>> 2. After receiving the su_si assignment response after headless,
>>>> @admin_ng and @ng_using_saAmfSGAdminState have not been restored.
>>>> They need to be restored somehow. A nodegroup is allowed to be
>>>> created in any adminState, so there can be the case where the
>>>> nodegroup is created with AdminState LOCKED but the SUs belonging to
>>>> it still have assignments, so the adminState of the nodegroup can't
>>>> be used.
>>>> The admin_ng, ng_using_saAmfSGAdminState seem to need to be stored to
>>>> IMM?
>>>> @Praveen: any suggestions?
>>>>
>>>> } else if ((su->sg_of_su->sg_ncs_spec == false) &&
>>>>            ((su->su_on_node->admin_ng != nullptr) ||
>>>>             (su->sg_of_su->ng_using_saAmfSGAdminState == true))) {
>>>>     AVD_AMF_NG *ng = su->su_on_node->admin_ng;
>>>>     // Got response from AMFND for assignments, decrement su_cnt_admin_oper.
>>>>     if ((ng != nullptr) &&
>>>>         (((((ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_SHUTDOWN) ||
>>>>             (ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_LOCK)) &&
>>>>            (su->saAmfSUNumCurrActiveSIs == 0) &&
>>>>            (su->saAmfSUNumCurrStandbySIs == 0) &&
>>>>            (AVSV_SUSI_ACT_DEL == n2d_msg->msg_info.n2d_su_si_assign.msg_act))) ||
>>>>          (ng->admin_ng_pend_cbk.admin_oper == SA_AMF_ADMIN_UNLOCK))) {
>>>>         su->su_on_node->su_cnt_admin_oper--;
>>>>         TRACE("node:'%s', su_cnt_admin_oper:%u",
>>>>               su->su_on_node->name.c_str(), su->su_on_node->su_cnt_admin_oper);
>>>>     }
>>>>     process_su_si_response_for_ng(su, SA_AIS_OK);
>>>>
>>>> On 25/08/16 21:36, Nagendra Kumar wrote:
>>>>> Further testing results:
>>>>> Node group lock has resulted in SG unstable. Logs and configuration
>>>>> file attached.
>>>>>
>>>>> Configuration : SC-1, PL-3 and PL-4.
>>>>>
>>>>> Steps:
>>>>>
>>>>> 1. Unlock SU1 (on PL-3), SU2 and SU3 (both on PL-4).
>>>>> 2. Create node group of PL-3 and PL-4:
>>>>> 3. Lock the node group.
>>>>> amf-adm lock safAmfNodeGroup=nagu,safAmfCluster=myAmfCluster
>>>>> 4. Keep gdb in the csi set callback, stop SC-1, respond OK from the
>>>>> csi set callback, and start SC-1.
>>>>>
>>>>> SG becomes unstable if you try to unlock the Node group:
>>>>> Aug 25 16:57:06 PM_SC-1 osafamfd[2166]: NO 'safSg=AmfDemo_2N,safApp=AmfDemo1' is in unstable/transition state
>>>>>
>>>>>
>>>>> Thanks
>>>>> -Nagu
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Nagendra Kumar
>>>>>> Sent: 24 August 2016 16:58
>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation continuation if
>>>>>> csi callback completes during headless [#1725 part 1] V1
>>>>>>
>>>>>> Below are the assignments after the test case (SU2 has standby
>>>>>> assignment):
>>>>>>
>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From:
Nagendra Kumar
>>>>>>> Sent: 24 August 2016 16:55
>>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation
>>>>>>> continuation if csi callback completes during headless [#1725 part 1] V1
>>>>>>>
>>>>>>> Hi Minh,
>>>>>>> With 1725_phase_1_V2.tgz, the below email TC has failed. Please find
>>>>>>> the traces attached along with the configuration in the ticket.
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Nagu
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Nagendra Kumar
>>>>>>>> Sent: 23 August 2016 15:15
>>>>>>>> To: Minh Hon Chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Admin operation
>>>>>>>> continuation if csi callback completes during headless [#1725 part 1] V1
>>>>>>>>
>>>>>>>> Hi Minh,
>>>>>>>> The following SU lock case is not working. This issue will exist
>>>>>>>> for all the flows, so please check.
>>>>>>>>
>>>>>>>> Configuration and traces attached in the ticket.
>>>>>>>>
>>>>>>>> Steps:
>>>>>>>> 1. Start SC-1, SC-2, PL-3 and PL-4.
Run the following commands:
>>>>>>>> immcfg -f /tmp/AppConfig-2N-1725.xml
>>>>>>>> amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> amf-adm unlock-in safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> amf-adm unlock-in safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> amf-adm unlock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> amf-adm unlock safSu=SU2,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> amf-adm unlock safSu=SU3,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>>
>>>>>>>> Assignments are:
>>>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>>
>>>>>>>> 2. Issue lock on SU1.
>>>>>>>> amf-adm lock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
>>>>>>>> And keep gdb in csi_set callback. Stop SC-1 and SC-2.
>>>>>>>> Send Ok from csi_set callback.
>>>>>>>>
>>>>>>>> 3. Start SC-1 and SC-2.
>>>>>>>>
>>>>>>>> 4.
Assignment to components of SU2 is not given and the assignment of
>>>>>>>> SU2 still shows Standby.
>>>>>>>> PM_SC-1:/home/nagu/views/staging-1725 # /etc/init.d/opensafd status
>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>>> safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=STANDBY(2)
>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>     saAmfSISUHAState=ACTIVE(1)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> -Nagu
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
>>>>>>>>> Sent: 05 August 2016 02:50
>>>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
>>>>>>>>> minh.c...@dektech.com.au
>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>> Subject: [PATCH 2 of 2] AMFND: Admin operation continuation if csi callback
>>>>>>>>> completes during headless [#1725 part 1] V1
>>>>>>>>>
>>>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
>>>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
>>>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The patch buffers
susi_resp_msg during headless stage and resend >>>>>>>>> it to AMFD after headless. >>>>>>>>> >>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc >>>>>>>>> b/osaf/services/saf/amf/amfnd/di.cc >>>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc >>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc >>>>>>>>> @@ -804,11 +804,6 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>>>> if (cb->term_state == >>>>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) >>>>>>>>> return rc; >>>>>>>>> >>>>>>>>> - if (cb->is_avd_down == true) { >>>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>>>> - return rc; >>>>>>>>> - } >>>>>>>>> - >>>>>>>>> // should be in assignment pending state to be here >>>>>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); >>>>>>>>> >>>>>>>>> @@ -819,64 +814,76 @@ uint32_t avnd_di_susi_resp_send(AVND_CB >>>>>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, >>>>>>>>> prv_state=%u", su->name.value, curr_si->name.value,curr_si- >>>>>>>>>> curr_state,curr_si->prv_state); >>>>>>>>> /* populate the susi resp msg */ >>>>>>>>> msg.info.avd = new AVSV_DND_MSG(); >>>>>>>>> - msg.type = AVND_MSG_AVD; >>>>>>>>> - msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>>>> snd_msg_id); >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>>>> node_info.nodeId; >>>>>>>>> - if (si) { >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>>>> - ((si->single_csi_add_rem_in_si == >>>>>>>>> AVSV_SUSI_ACT_BASE) >>>>>> ? >>>>>>>>> false : true); >>>>>>>>> - } >>>>>>>>> - TRACE("curr_assign_state '%u'", >>>>>>>>> curr_si->curr_assign_state); >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>>>> || >>>>>>>>> - >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) >>>>>>> ? >>>>>>>>> - ((!curr_si->prv_state) ? 
AVSV_SUSI_ACT_ASGN : >>>>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>>>> - if (si) { >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = >>>>>>>>> si- >>>>>>> name; >>>>>>>>> - if (AVSV_SUSI_ACT_ASGN == >>>>>>>>> si->single_csi_add_rem_in_si) { >>>>>>>>> - TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>>>>> curr_assign_state); >>>>>>>>> - >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>>> - >>>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>>> - >>>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>>> - AVSV_SUSI_ACT_ASGN : >>>>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>>>> - } >>>>>>>>> - } >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) >>>>>>> || >>>>>>>>> - >>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) >>>>>>> ? >>>>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>>>> + msg.type = AVND_MSG_AVD; >>>>>>>>> + msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- >>>>>>>>>> node_info.nodeId; >>>>>>>>> + if (si) { >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = >>>>>>>>> + ((si->single_csi_add_rem_in_si == >>>>>>>>> AVSV_SUSI_ACT_BASE) ? false : true); >>>>>>>>> + } >>>>>>>>> + TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = >>>>>>>>> + >>>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>>> + >>>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>>> + ((!curr_si->prv_state) ? 
>>>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name; >>>>>>>>> + if (si) { >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- >>>>>>>>>> name; >>>>>>>>> + if (AVSV_SUSI_ACT_ASGN == >>>>>>>>> si->single_csi_add_rem_in_si) { >>>>>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- >>>>>>>>>> curr_assign_state); >>>>>>>>> + msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.msg_act = >>>>>>>>> + >>>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>>> + >>>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? >>>>>>>>> + AVSV_SUSI_ACT_ASGN : >>>>>>>>> AVSV_SUSI_ACT_DEL; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = >>>>>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? >>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = >>>>>>>>> + >>>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || >>>>>>>>> + >>>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ? >>>>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; >>>>>>>>> >>>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>>>> - osafassert(si); >>>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_ASGN) >>>>>>>>> + osafassert(si); >>>>>>>>> >>>>>>>>> - /* send the msg to AvD */ >>>>>>>>> - TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', >>>>>>>>> su'%s', >>>>>>>> si'%s', >>>>>>>>> ha_state'%u', error'%u', single_csi'%u'", >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>>> msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.node_id, >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>>> msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>> msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.single_csi); >>>>>>>>> + /* send the msg to AvD */ >>>>>>>>> + TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', >>>>>>>>> su'%s', >>>>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); >>>>>>>>> >>>>>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>>>> - if >>>>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>>>> - LOG_NO("Removed 'all SIs' from '%s'", >>>>>>>>> su->name.value); >>>>>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { >>>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_DEL) >>>>>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- >>>>>>>>>> name.value); >>>>>>>>> - if >>>>>>>>> (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>>>> - 
ha_state[msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>>>>> - su->name.value); >>>>>>>>> - } >>>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == >>>>>>>>> AVSV_SUSI_ACT_MOD) >>>>>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", >>>>>>>>> + ha_state[msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.ha_state], >>>>>>>>> + su->name.value); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> - rc = avnd_di_msg_send(cb, &msg); >>>>>>>>> - if (NCSCC_RC_SUCCESS == rc) >>>>>>>>> - msg.info.avd = 0; >>>>>>>>> - >>>>>>>>> - /* we have completed the SU SI msg processing */ >>>>>>>>> - if (su_assign_state_is_stable(su)) >>>>>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); >>>>>>>>> + if (cb->is_avd_down == true) { >>>>>>>>> + // We are in headless, buffer this msg >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; >>>>>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { >>>>>>>>> + rc = NCSCC_RC_FAILURE; >>>>>>>>> + } >>>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF >>>>>>>>> director is offline"); >>>>>>>>> + } else { >>>>>>>>> + // We are in normal cluster, send msg to director >>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- >>>>>>>>>> snd_msg_id); >>>>>>>>> + /* send the msg to AvD */ >>>>>>>>> + rc = avnd_di_msg_send(cb, &msg); >>>>>>>>> + if (NCSCC_RC_SUCCESS == rc) >>>>>>>>> + msg.info.avd = 0; >>>>>>>>> + /* we have completed the SU SI msg processing */ >>>>>>>>> + if (su_assign_state_is_stable(su)) { >>>>>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); >>>>>>>>> + } >>>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); >>>>>>>>> + } >>>>>>>>> >>>>>>>>> /* free the contents of avnd message */ >>>>>>>>> avnd_msg_content_free(cb, &msg); @@ -1255,14 +1262,7 @@ >>>>>>>>> void >>>>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ >>>>>>>>> /* stop the AvD msg response timer */ >>>>>>>>> if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { >>>>>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, 
*rec); >>>>>>>>> - // Resend msgs from queue because amfd dropped during >>>>>>>>> sync >>>>>>>>> - if ((cb->dnd_list.head != nullptr)) { >>>>>>>>> - TRACE("retransmit message to amfd"); >>>>>>>>> - AVND_DND_MSG_LIST *pending_rec = 0; >>>>>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>>>> - avnd_diq_rec_send(cb, pending_rec); >>>>>>>>> - } >>>>>>>>> - } >>>>>>>>> + avnd_diq_rec_send_buffered_msg(cb); >>>>>>>>> /* resend pg start track */ >>>>>>>>> avnd_di_resend_pg_start_track(cb); >>>>>>>>> } >>>>>>>>> @@ -1275,6 +1275,73 @@ void avnd_diq_rec_del(AVND_CB *cb, >>>>>> AVND_ >>>>>>>>> TRACE_LEAVE(); >>>>>>>>> return; >>>>>>>>> } >>>>>>>>> >>>>>> +/************************************************************ >>>>>>>>> **************** >>>>>>>>> + Name : avnd_diq_rec_send_buffered_msg >>>>>>>>> + >>>>>>>>> + Description : Resend buffered msg >>>>>>>>> + >>>>>>>>> + Arguments : cb - ptr to the AvND control block >>>>>>>>> + >>>>>>>>> + Return Values : None. >>>>>>>>> + >>>>>>>>> + Notes : None. 
>>>>>>>>> >>>>>> +************************************************************* >>>>>>>>> ********** >>>>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB *cb) { >>>>>>>>> + TRACE_ENTER(); >>>>>>>>> + // Resend msgs from queue because amfnd dropped during >>>>>>>>> headless >>>>>>>>> + // or headless-synchronization >>>>>>>>> + if ((cb->dnd_list.head != nullptr)) { >>>>>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; >>>>>>>>> + TRACE("Attach msg_id of buffered msg"); >>>>>>>>> + bool found = true; >>>>>>>>> + while (found) { >>>>>>>>> + found = false; >>>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>>> nullptr; pending_rec = pending_rec->next) { >>>>>>>>> + if (pending_rec->msg.type == >>>>>>>>> AVND_MSG_AVD) { >>>>>>>>> + // At this moment, only oper_state >>>>>>>>> msg needs to report to director >>>>>>>>> + if (pending_rec->msg.info.avd- >>>>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && >>>>>>>>> + pending_rec->msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { >>>>>>>>> + m_AVND_DIQ_REC_POP(cb, >>>>>>>>> pending_rec); #if 0 >>>>>>>>> + // only resend if this SUSI >>>>>>>>> does exist >>>>>>>>> + AVND_SU *su = >>>>>>>>> m_AVND_SUDB_REC_GET(cb->sudb, >>>>>>>>> + pending_rec- >>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name); >>>>>>>>> + if (su != nullptr && su- >>>>>>>>>> si_list.n_nodes > 0) { #endif >>>>>>>>> + pending_rec- >>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = >>>>>>>>>> ++(cb->snd_msg_id); >>>>>>>>> + >>>>>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); >>>>>>>>> + LOG_NO("Found and >>>>>>>>> resend buffered su_si_assign msg for SU:'%s', " >>>>>>>>> + >>>>>>>>> "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', " >>>>>>>>> + >>>>>>>>> "error:'%u', msg_id:'%u'", >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd- >>>>>>>>>> msg_info.n2d_su_si_assign.si_name.value, >>>>>>>>> + 
>>>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state, >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act, >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.single_csi, >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error, >>>>>>>>> + >>>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id); >>>>>>>>> + >>>>>>>>> +#if 0 >>>>>>>>> + } else { >>>>>>>>> + >>>>>>>>> avnd_msg_content_free(cb, &pending_rec->msg); >>>>>>>>> + delete pending_rec; >>>>>>>>> + pending_rec = cb- >>>>>>>>>> dnd_list.head; >>>>>>>>> + } >>>>>>>>> +#endif >>>>>>>>> + found = true; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + TRACE("retransmit message to amfd"); >>>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != >>>>>>>>> nullptr; >>>>>>>>> pending_rec = pending_rec->next) { >>>>>>>>> + avnd_diq_rec_send(cb, pending_rec); >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + TRACE_LEAVE(); >>>>>>>>> + return; >>>>>>>>> +} >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>> /************************************************************* >>>>>>>>> *************** >>>>>>>>> Name : avnd_diq_rec_send >>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h >>>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd void >>>>>>>>> avnd_diq_del(struct avnd_cb_tag *); AVND_DND_MSG_LIST >>>>>>>>> *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg); void >>>>>>>>> avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST >>>>>> *rec); >>>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb); >>>>>>>>> uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, >>>>>>>> AVND_DND_MSG_LIST >>>>>>>>> *rec); uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, >>>>>>>>> 
SaNameT *su_name, uint32_t ret_code); uint32_t
>>>>>>>>> avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, uint32_t rcv_id, uint32_t view_num);
>>>>>>>> --------------------------------------------------------------------
>>>>>>>> _______________________________________________
>>>>>>>> Opensaf-devel mailing list
>>>>>>>> Opensaf-devel@lists.sourceforge.net
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel