Update:
> Further testing will continue:
> 1. Faults during assignments.
This worked fine and no issues have been found.
> 2. Faults during admin operations.
Testing is ongoing. Some of the initial test cases have passed.

Thanks
-Nagu

> -----Original Message-----
> From: Nagendra Kumar
> Sent: 22 September 2016 16:52
> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if
> csi completes during headless [#1725 part 1] V1
>
> Hi Minh,
>
> I have tested the following scenarios so far and they work well:
> 1. Faults under standalone system. Act and Std SUs are there and the cluster
> went headless. Faults occurred in Act and Standby separately and together,
> with recovery as restart, su f/o, node f/o, node s/o. Faults are also mixed
> with two SGs.
> 2. During headless, the escalations from comp restart -> su restart -> su f/o ->
> node f/o. Two SGs were used for testing. When the cluster recovers, the system
> works as expected.
>
> SG and node auto repair were enabled and disabled in many test cases.
>
> Further testing will continue:
> 1. Faults during assignments.
> 2. Faults during admin operations.
>
> Thanks
> -Nagu
>
> > -----Original Message-----
> > From: minh chau [mailto:minh.c...@dektech.com.au]
> > Sent: 15 September 2016 12:27
> > To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > Cc: opensaf-devel@lists.sourceforge.net
> > Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> > continuation if csi completes during headless [#1725 part 1] V1
> >
> > Hi Nagu,
> >
> > Yes, you are right. The "component failover" is unstable; I think I
> > will post my analysis of the component failover problem to #1902 after you
> > have verified the other recoveries.
> >
> > Thanks,
> > Minh
> >
> > On 15/09/16 16:51, Nagendra Kumar wrote:
> > > Hi Minh,
> > > @2.a.) and @2.b.) are working except "Component Failover" as recovery.
> > > Other recoveries, such as SU Failover, are working fine
> > > with 1725_pending_review.tgz and 07_no_recovery_if_no_pending_susi.diff.
> > >
> > > Please confirm.
> > >
> > > Thanks
> > > -Nagu
> > >
> > >> -----Original Message-----
> > >> From: Nagendra Kumar
> > >> Sent: 15 September 2016 12:13
> > >> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > >> Cc: opensaf-devel@lists.sourceforge.net
> > >> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> > >> continuation if csi completes during headless [#1725 part 1] V1
> > >>
> > >> Hi Minh,
> > >>>> If there is no major problem, can we make SI Dep the last phase?
> > >> Yes, absolutely. There is no problem.
> > >>>> If I am right, I think you are testing @2.a) - and the *fault* has
> > >>>> just been a node reboot/power-off by the user during headless.
> > >> Yes, you are right.
> > >>
> > >> Thanks
> > >> -Nagu
> > >>
> > >>> -----Original Message-----
> > >>> From: minh chau [mailto:minh.c...@dektech.com.au]
> > >>> Sent: 14 September 2016 17:54
> > >>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > >>> Cc: opensaf-devel@lists.sourceforge.net
> > >>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> > >>> continuation if csi completes during headless [#1725 part 1] V1
> > >>>
> > >>> Hi Nagu,
> > >>>
> > >>> I proposed changing the order on 28 Jul:
> > >>>
> > >>> ==============
> > >>>
> > >>> I would like to change the above order of implementation:
> > >>> @0. We are here now: No admin op continuation, no recovery on
> > >>> faults during headless.
> > >>> Since componentRestart/suRestart has no impact on recovery after
> > >>> headless, faults during headless here mean: failover escalation,
> > >>> node reboot/power-off by the user during headless.
> > >>> Faults are different phenomena, but they all result in loss of SUSI.
> > >>> Having #1902 will remove the major impact of a node reboot due to
> > >>> immediate escalation, and AMF also has to deal with the loss of
> > >>> SUSI the same as without #1902, plus failover escalation.
> > >>>
> > >>> @1. Admin op continuation without required recovery on faults
> > >>> during headless
> > >>> @1.a) All CSI callbacks complete during headless, but SUSI
> > >>> states are still QUIESCED/QUIESCING
> > >>> @1.b) One of the CSI callbacks is still ongoing after headless (AMFD
> > >>> would have to wait for it?)
> > >>>
> > >>> @2. Recovery on faults. (Doing fault recovery needs to consider
> > >>> admin op continuation, which would have been implemented in step
> > >>> @1.) Needs #1902.
> > >>> @2.a.) Faults in normal flow: No admin op continuation is required
> > >>> after headless, but a fault did happen during headless
> > >>> @2.b.) Faults happen during an admin operation while headless; after
> > >>> headless, AMFD needs to consider a recovery on fault together with
> > >>> admin op continuation.
> > >>>
> > >>> @3. @1 + @2 + With SI Dep.
> > >>>
> > >>> ===============
> > >>> I thought we had followed the above order so far? Because part 1
> > >>> was acked, which is "@1. Admin op continuation without required
> > >>> recovery on faults during headless".
> > >>> If there is no major problem, can we make SI Dep the last phase?
> > >>> If I am right, I think you are testing @2.a) - and the *fault* has
> > >>> just been a node reboot/power-off by the user during headless.
> > >>>
> > >>> Thanks,
> > >>> Minh
> > >>>
> > >>> On 14/09/16 21:48, Nagendra Kumar wrote:
> > >>>> Hi Minh,
> > >>>> If it is not tested, then it is fine.
> > >>>> But we had added (#1) the following in ticket #1725 on 27 Jul:
> > >>>> ===========================================
> > >>>> Nagendra Kumar - 2016-07-27
> > >>>>
> > >>>> For the 2N red model, implementation can be done in the following
> > >>>> phased manner.
> > >>>> It has the advantage of being logically segregated and it continues
> > >>>> from where we left off in 5.0.
> > >>>> (Phases #1, #2 and #3 are more related to ticket #1725 and phases
> > >>>> #4 and #5 are related to #1902)
> > >>>>
> > >>>> 1. Node restart escalation (with and without SI Dep).
> > >>>> 2. Without SI Dep: Admin op (no faults/escalations).
> > >>>> 3. Without SI Dep: Admin op + node restart faults/escalations during
> > >>>>    headless.
> > >>>> 4. Without SI Dep:
> > >>>>    a.) All faults in normal flows.
> > >>>>    b.) All faults during admin operation (minus node reboot
> > >>>>        during headless, as covered in #3).
> > >>>> 5. With SI Dep: #2, #3 and #4.
> > >>>>
> > >>>> Since 5.0 already has the immediate escalation model (component and
> > >>>> node restart/reboot), #1, #2 and #3 complete the leftover portion of
> > >>>> the headless contribution in 5.0 with that model.
> > >>>> ======================================
> > >>>>
> > >>>> Thanks
> > >>>> -Nagu
> > >>>>
> > >>>>> -----Original Message-----
> > >>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> > >>>>> Sent: 14 September 2016 17:05
> > >>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > >>>>> Cc: opensaf-devel@lists.sourceforge.net
> > >>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> > >>>>> continuation if csi completes during headless [#1725 part 1] V1
> > >>>>>
> > >>>>> Hi Nagu,
> > >>>>>
> > >>>>> SI Dep is the last phase of implementation of headless recovery,
> > >>>>> so its support is not included in the patches attached to ticket #1725.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Minh
> > >>>>>
> > >>>>> On 14/09/16 21:21, Nagendra Kumar wrote:
> > >>>>>> Hi Minh,
> > >>>>>> Have you tested SI Dep (2N red model) for the "node restart test
> > >>>>>> cases"? I can't see it in the test case doc.
> > >>>>>> Thanks
> > >>>>>> -Nagu
> > >>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: Nagendra Kumar
> > >>>>>>> Sent: 13 September 2016 11:20
> > >>>>>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > >>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> > >>>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> > >>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> > >>>>>>>
> > >>>>>>> Hi Minh,
> > >>>>>>> I have tested these scenarios again and they work well.
> > >>>>>>>
> > >>>>>>> Thanks
> > >>>>>>> -Nagu
> > >>>>>>>
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> > >>>>>>>> Sent: 12 September 2016 11:53
> > >>>>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> > >>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> > >>>>>>>> Subject: Re: [PATCH 2 of 4] AMFND: Admin operation
> > >>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> > >>>>>>>>
> > >>>>>>>> Hi Nagu,
> > >>>>>>>>
> > >>>>>>>> Your configuration hits a bug where the absent SUSIs
> > >>>>>>>> are found after headless but no real SUSIs are available either.
> > >>>>>>>> In this case I think that AMFD can treat it like a fresh assignment.
> > >>>>>>>> I have attached the patch to ticket #1725; please help to test again.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Minh
> > >>>>>>>>
> > >>>>>>>> On 12/09/16 11:09, minh chau wrote:
> > >>>>>>>>> Hi Nagu,
> > >>>>>>>>>
> > >>>>>>>>> I'm running the tests with this configuration and will get
> > >>>>>>>>> back to you.
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Minh
> > >>>>>>>>>
> > >>>>>>>>> On 09/09/16 22:26, Nagendra Kumar wrote:
> > >>>>>>>>>> Hi Minh,
> > >>>>>>>>>> I am using 1725_pending_review.tgz
> > >>>>>>>>>> (1725_02_V2_bugfix_01_resend_buffer_in_set_leds.diff,
> > >>>>>>>>>> 1725_02_V2_bugfix_02_honor_clusterinit_nodesync_timer.diff,
> > >>>>>>>>>> 1725_02_V2_bugfix_03_restore_ng_admin.diff,
> > >>>>>>>>>> 1725_03_V4_failover_absent_susi_longDn.diff,
> > >>>>>>>>>> 1725_04_V2_headless_validation.diff,
> > >>>>>>>>>> 1725_05_V2_resend_oper_state.diff,
> > >>>>>>>>>> 1725_06a_fullscope_escalation_headless.diff).
> > >>>>>>>>>>
> > >>>>>>>>>> I am doing basic node reboot validation testing with no faults.
> > >>>>>>>>>>
> > >>>>>>>>>> Configuration: SU1 (active) and SU2 (standby), both on PL-3.
> > >>>>>>>>>>
> > >>>>>>>>>> TC #1: Start SC-1, PL-3 and PL-5. Unlock SU1 and SU2. Stop
> > >>>>>>>>>> SC-1 and stop PL-3, start PL-3 and start SC-1.
> > >>>>>>>>>> After SC-1 and PL-3 come back, ideally SU1 and SU2 should
> > >>>>>>>>>> get assignments as Act and Std, but no assignments are being
> > >>>>>>>>>> given to SUs on PL-3 and it shows the following in status:
> > >>>>>>>>>>
> > >>>>>>>>>> Only SU2 has the Std assignment.
> > >>>>>>>>>>
> > >>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> > >>>>>>>>>>         saAmfSISUHAState=STANDBY(2)
> > >>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>>
> > >>>>>>>>>> TC #2: Configuration same as TC #1. Stop PL-3 and don't start it.
> > >>>>>>>>>> The same issue:
> > >>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> > >>>>>>>>>>         saAmfSISUHAState=STANDBY(2)
> > >>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> > >>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> > >>>>>>>>>>
> > >>>>>>>>>> TC #3: Configured SU1(Act) on PL-3 and SU2(Std) on PL-4.
> > >>>>>>>>>> Stop SC-1, stop PL-3 and PL-4, but PL-5 is running. Start
> > >>>>>>>>>> SC-1, the same issue.
> > >>>>>>>>>>
> > >>>>>>>>>> TC #4: Same as TC #3, but SU3 configured on PL-5 as spare.
> > >>>>>>>>>> SU3 doesn't get any assignment and the SG is unstable.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks
> > >>>>>>>>>> -Nagu
> > >>>>>>>>>>
> > >>>>>>>>>>> -----Original Message-----
> > >>>>>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> > >>>>>>>>>>> Sent: 18 August 2016 05:46
> > >>>>>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> > >>>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
> > >>>>>>>>>>> minh.c...@dektech.com.au
> > >>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> > >>>>>>>>>>> Subject: [PATCH 2 of 4] AMFND: Admin operation
> > >>>>>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> > >>>>>>>>>>>
> > >>>>>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
> > >>>>>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
> > >>>>>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> There are basically two options by which AMFD can continue
> > >>>>>>>>>>> an admin operation with completed csi(s).
> > >>>>>>>>>>>
> > >>>>>>>>>>> First: AMFD can use the sync SUSI fsm state as the latest.
> > >>>>>>>>>>> AMFD then has to explore its SUSI assignments with the
> > >>>>>>>>>>> adminStates of relevant entities to determine on which SU
> > >>>>>>>>>>> susi_success() should be called.
> > >>>>>>>>>>> A deeper level of exploration is needed for csi addition. It also
> > >>>>>>>>>>> depends on the SG fsm state, which is used differently in
> > >>>>>>>>>>> different SG types.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Second: AMFD uses the SUSI fsm state read from IMM as the
> > >>>>>>>>>>> latest, and AMFND needs to resend the susi_resp messages which
> > >>>>>>>>>>> were deferred during headless so that AMFD can continue
> > >>>>>>>>>>> the admin operation sequence.
> > >>>>>>>>>>> Both cases of csi completion [during or after] headless
> > >>>>>>>>>>> can run in the same code flow.
> > >>>>>>>>>>> > > >>>>>>>>>>> The patch buffers susi_resp_msg during headless stage and > > >>>>>>>>>>> resend it to AMFD after headless. There could be a chance > > >>>>>>>>>>> that AMFND sent out susi response message but AMFD could > > not > > >>> receive > > >>>>>>>>>>> or process it. This case could be seen as a defect, which > > >>>>>>>>>>> can be fixed by securing the result of sending susi_resp > > >>>>>>>>>>> message from AMFND toward > > >>>>>>> AMFD. > > >>>>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc > > >>>>>>>>>>> b/osaf/services/saf/amf/amfnd/di.cc > > >>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc > > >>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc > > >>>>>>>>>>> @@ -805,11 +805,6 @@ uint32_t > > >>> avnd_di_susi_resp_send(AVND_CB > > >>>>>>>>>>> if (cb->term_state == > > >>>>>>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) > > >>>>>>>>>>> return rc; > > >>>>>>>>>>> > > >>>>>>>>>>> - if (cb->is_avd_down == true) { > > >>>>>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); > > >>>>>>>>>>> - return rc; > > >>>>>>>>>>> - } > > >>>>>>>>>>> - > > >>>>>>>>>>> // should be in assignment pending state to be here > > >>>>>>>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); > > >>>>>>>>>>> > > >>>>>>>>>>> @@ -820,64 +815,76 @@ uint32_t > > >>>>> avnd_di_susi_resp_send(AVND_CB > > >>>>>>>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, > > >>>>>>>>>>> curr_state=%u, prv_state=%u", su->name.value, > > >>>>>>>>>>> curr_si->name.value,curr_si- > > >>>>>>>>>>>> curr_state,curr_si->prv_state); > > >>>>>>>>>>> /* populate the susi resp msg */ > > >>>>>>>>>>> msg.info.avd = new AVSV_DND_MSG(); > > >>>>>>>>>>> - msg.type = AVND_MSG_AVD; > > >>>>>>>>>>> - msg.info.avd->msg_type = > > >>>>>>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = > > ++(cb- > > >>>>>>>>>>>> snd_msg_id); > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = > cb- > > >>>>>>>>>>>> node_info.nodeId; > > >>>>>>>>>>> - if (si) { 
> > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = > > >>>>>>>>>>> - ((si->single_csi_add_rem_in_si == > > >>>>>>>>>>> AVSV_SUSI_ACT_BASE) ? > > >>>>>>>>>>> false : true); > > >>>>>>>>>>> - } > > >>>>>>>>>>> - TRACE("curr_assign_state '%u'", curr_si- > > >>> curr_assign_state); > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > > >>>>>>>>>>> - > > (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > >> || > > >>>>>>>>>>> - > > m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) > > >> ? > > >>>>>>>>>>> - ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN : > > >>>>>>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = > su- > > >>>>>> name; > > >>>>>>>>>>> - if (si) { > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- > > >name; > > >>>>>>>>>>> - if (AVSV_SUSI_ACT_ASGN == > > >>>>>>>>>>> si->single_csi_add_rem_in_si) { > > >>>>>>>>>>> - TRACE("si->curr_assign_state '%u'", > > >>>>>>>>>>> curr_si- > > >>>>>>>>>>>> curr_assign_state); > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > > >>>>>>>>>>> - > > >>>>>>>>>>> > (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > || > > >>>>>>>>>>> - > > >>>>>>>>>>> > > m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? > > >>>>>>>>>>> - AVSV_SUSI_ACT_ASGN : > > >>>>>>>>>>> AVSV_SUSI_ACT_DEL; > > >>>>>>>>>>> - } > > >>>>>>>>>>> - } > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = > > >>>>>>>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? > > >>>>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = > > >>>>>>>>>>> - > > (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > >> || > > >>>>>>>>>>> - > > m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) > > >> ? 
> > >>>>>>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; > > >>>>>>>>>>> + msg.type = AVND_MSG_AVD; > > >>>>>>>>>>> + msg.info.avd->msg_type = > > >>>>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- > > >>>>>>>>>>>> node_info.nodeId; > > >>>>>>>>>>> + if (si) { > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = > > >>>>>>>>>>> + ((si->single_csi_add_rem_in_si == > > >>>>>>>>>>> AVSV_SUSI_ACT_BASE) ? false : true); > > >>>>>>>>>>> + } > > >>>>>>>>>>> + TRACE("curr_assign_state '%u'", curr_si- > > >curr_assign_state); > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > >>> || > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) > > >>> ? > > >>>>>>>>>>> + ((!curr_si->prv_state) ? > > >>>>>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : > > >>>>>>> AVSV_SUSI_ACT_DEL; > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su- > > >>>> name; > > >>>>>>>>>>> + if (si) { > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = > > >>>>>>>>>>> + si- > > >>>>>>>>>>>> name; > > >>>>>>>>>>> + if (AVSV_SUSI_ACT_ASGN == > > >>>>>>>>>>> + si->single_csi_add_rem_in_si) > > >> { > > >>>>>>>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- > > >>>>>>>>>>>> curr_assign_state); > > >>>>>>>>>>> + msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.msg_act = > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > >>> || > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) > > >>> ? > > >>>>>>>>>>> + AVSV_SUSI_ACT_ASGN : > > >>>>>>>>>>> AVSV_SUSI_ACT_DEL; > > >>>>>>>>>>> + } > > >>>>>>>>>>> + } > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = > > >>>>>>>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? 
> > >>>>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > > >>> || > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) > > >>> ? > > >>>>>>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; > > >>>>>>>>>>> > > >>>>>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > == > > >>>>>>>>>>> AVSV_SUSI_ACT_ASGN) > > >>>>>>>>>>> - osafassert(si); > > >>>>>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > > >>>>>>>>>>> + == > > >>>>>>>>>>> AVSV_SUSI_ACT_ASGN) > > >>>>>>>>>>> + osafassert(si); > > >>>>>>>>>>> > > >>>>>>>>>>> - /* send the msg to AvD */ > > >>>>>>>>>>> - TRACE("Sending. msg_id'%u', node_id'%u', > msg_act'%u', > > >>>>>>>>>>> su'%s', si'%s', > > >>>>>>>>>>> ha_state'%u', error'%u', single_csi'%u'", > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, > > >>>>>>>>>>> msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.node_id, > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, > > >>>>>>>>>>> msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, > > >>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, > > >>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, > > >>>>>>>>>>> msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.single_csi); > > >>>>>>>>>>> + /* send the msg to AvD */ > > >>>>>>>>>>> + TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', > > >>>>>>>>>>> + su'%s', > > >>>>>>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, > > >>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, > > >>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, > > >>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, > > >>>>>>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); > > >>>>>>>>>>> > > >>>>>>>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { > > >>>>>>>>>>> - if (msg.info.avd- > > >msg_info.n2d_su_si_assign.msg_act > > >> == > > >>>>>>>>>>> AVSV_SUSI_ACT_DEL) > > >>>>>>>>>>> - LOG_NO("Removed 'all SIs' from '%s'", > > >>>>>>>>>>> su->name.value); > > >>>>>>>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { > > >>>>>>>>>>> + if > > >>>>>>>>>>> + (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > > >>>>>>>>>>> + == > > >>>>>>>>>>> AVSV_SUSI_ACT_DEL) > > >>>>>>>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- > > >>>>>>>>>>>> name.value); > > >>>>>>>>>>> - if (msg.info.avd- > > >msg_info.n2d_su_si_assign.msg_act > > >> == > > >>>>>>>>>>> AVSV_SUSI_ACT_MOD) > > >>>>>>>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", > > >>>>>>>>>>> - ha_state[msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.ha_state], > > >>>>>>>>>>> - su->name.value); > > >>>>>>>>>>> - } > > >>>>>>>>>>> + if > > >>>>>>>>>>> + (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > > >>>>>>>>>>> + == > > >>>>>>>>>>> AVSV_SUSI_ACT_MOD) > > >>>>>>>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", > > >>>>>>>>>>> + ha_state[msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.ha_state], > > >>>>>>>>>>> + su->name.value); > > >>>>>>>>>>> + } > > >>>>>>>>>>> > > >>>>>>>>>>> - rc = 
avnd_di_msg_send(cb, &msg); > > >>>>>>>>>>> - if (NCSCC_RC_SUCCESS == rc) > > >>>>>>>>>>> - msg.info.avd = 0; > > >>>>>>>>>>> - > > >>>>>>>>>>> - /* we have completed the SU SI msg processing */ > > >>>>>>>>>>> - if (su_assign_state_is_stable(su)) > > >>>>>>>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); > > >>>>>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); > > >>>>>>>>>>> + if (cb->is_avd_down == true) { > > >>>>>>>>>>> + // We are in headless, buffer this msg > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; > > >>>>>>>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { > > >>>>>>>>>>> + rc = NCSCC_RC_FAILURE; > > >>>>>>>>>>> + } > > >>>>>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); > > >>>>>>>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF > > >>>>>>>>>>> director is offline"); > > >>>>>>>>>>> + } else { > > >>>>>>>>>>> + // We are in normal cluster, send msg to director > > >>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = > > >>>>>>>>>>> + ++(cb- > > >>>>>>>>>>>> snd_msg_id); > > >>>>>>>>>>> + /* send the msg to AvD */ > > >>>>>>>>>>> + rc = avnd_di_msg_send(cb, &msg); > > >>>>>>>>>>> + if (NCSCC_RC_SUCCESS == rc) > > >>>>>>>>>>> + msg.info.avd = 0; > > >>>>>>>>>>> + /* we have completed the SU SI msg processing */ > > >>>>>>>>>>> + if (su_assign_state_is_stable(su)) { > > >>>>>>>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); > > >>>>>>>>>>> + } > > >>>>>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); > > >>>>>>>>>>> + } > > >>>>>>>>>>> > > >>>>>>>>>>> /* free the contents of avnd message */ > > >>>>>>>>>>> avnd_msg_content_free(cb, &msg); @@ -1256,14 > > >>>>>>>>>>> +1263,7 > > >>> @@ > > >>>>>>> void > > >>>>>>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ > > >>>>>>>>>>> /* stop the AvD msg response timer */ > > >>>>>>>>>>> if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { > > >>>>>>>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, *rec); > > >>>>>>>>>>> - // Resend msgs from queue because amfd dropped > during > > >>>>>>>>>>> sync > > >>>>>>>>>>> - if ((cb->dnd_list.head != 
nullptr)) { > > >>>>>>>>>>> - TRACE("retransmit message to amfd"); > > >>>>>>>>>>> - AVND_DND_MSG_LIST *pending_rec = 0; > > >>>>>>>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec > > >>>>>>>>>>> != > > >>>>>>>>>>> nullptr; pending_rec = pending_rec->next) { > > >>>>>>>>>>> - avnd_diq_rec_send(cb, pending_rec); > > >>>>>>>>>>> - } > > >>>>>>>>>>> - } > > >>>>>>>>>>> + avnd_diq_rec_send_buffered_msg(cb); > > >>>>>>>>>>> /* resend pg start track */ > > >>>>>>>>>>> avnd_di_resend_pg_start_track(cb); > > >>>>>>>>>>> } > > >>>>>>>>>>> @@ -1276,6 +1276,73 @@ void avnd_diq_rec_del(AVND_CB > > >> *cb, > > >>>>>>>> AVND_ > > >>>>>>>>>>> TRACE_LEAVE(); > > >>>>>>>>>>> return; > > >>>>>>>>>>> } > > >>>>>>>>>>> > > >> > > > +/************************************************************ > > >>>>>>>>>>> **************** > > >>>>>>>>>>> + Name : avnd_diq_rec_send_buffered_msg > > >>>>>>>>>>> + > > >>>>>>>>>>> + Description : Resend buffered msg > > >>>>>>>>>>> + > > >>>>>>>>>>> + Arguments : cb - ptr to the AvND control block > > >>>>>>>>>>> + > > >>>>>>>>>>> + Return Values : None. > > >>>>>>>>>>> + > > >>>>>>>>>>> + Notes : None. 
> > >>>>>>>>>>> > > >> > > > +************************************************************* > > >>>>>>>>>>> ********** > > >>>>>>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB > > >> *cb) > > >>> { > > >>>>>>>>>>> + TRACE_ENTER(); > > >>>>>>>>>>> + // Resend msgs from queue because amfnd dropped > > >>>>>>>>>>> + during > > >>>>>>> headless > > >>>>>>>>>>> + // or headless-synchronization > > >>>>>>>>>>> + if ((cb->dnd_list.head != nullptr)) { > > >>>>>>>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; > > >>>>>>>>>>> + TRACE("Attach msg_id of buffered msg"); > > >>>>>>>>>>> + bool found = true; > > >>>>>>>>>>> + while (found) { > > >>>>>>>>>>> + found = false; > > >>>>>>>>>>> + for (pending_rec = cb->dnd_list.head; > > >>>>>>>>>>> + pending_rec != > > >>>>>>>>>>> nullptr; pending_rec = pending_rec->next) { > > >>>>>>>>>>> + if (pending_rec->msg.type == > > >>>>>>>>>>> AVND_MSG_AVD) { > > >>>>>>>>>>> + // At this moment, only oper_state > > >>>>>>>>>>> msg needs to report to director > > >>>>>>>>>>> + if (pending_rec->msg.info.avd- > > >>>>>>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && > > >>>>>>>>>>> + pending_rec->msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { > > >>>>>>>>>>> + m_AVND_DIQ_REC_POP(cb, > > >>>>>>>>>>> pending_rec); #if 0 > > >>>>>>>>>>> + // only resend if this SUSI > > >>>>>>>>>>> does exist > > >>>>>>>>>>> + AVND_SU *su = > > >>>>>>>>>>> m_AVND_SUDB_REC_GET(cb->sudb, > > >>>>>>>>>>> + pending_rec- > > >>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name); > > >>>>>>>>>>> + if (su != nullptr && su- > > >>>>>>>>>>>> si_list.n_nodes > 0) { #endif > > >>>>>>>>>>> + pending_rec- > > >>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = > > >>>>>>>>>>>> ++(cb->snd_msg_id); > > >>>>>>>>>>> + > > >>>>>>>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); > > >>>>>>>>>>> + LOG_NO("Found and > > >>>>>>>>>>> resend buffered su_si_assign msg for SU:'%s', " > > >>>>>>>>>>> + > > >>>>>>>>>>> "SI:'%s', 
ha_state:'%u', msg_act:'%u', single_csi:'%u', " > > >>>>>>>>>>> + > > >>>>>>>>>>> "error:'%u', msg_id:'%u'", > > >>>>>>>>>>> + > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, > > >>>>>>>>>>> + > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>>>>>>>>>>> msg_info.n2d_su_si_assign.si_name.value, > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>>> msg_info.n2d_su_si_assign.ha_state, > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>> msg_info.n2d_su_si_assign.msg_act, > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>>> msg_info.n2d_su_si_assign.single_csi > > >>>>>>>>>>> , > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >msg_info.n2d_su_si_assign.error, > > >>>>>>>>>>> + > > >>>>>>>>>>> > > >>>>>>>>>>> pending_rec->msg.info.avd- > > >>> msg_info.n2d_su_si_assign.msg_id); > > >>>>>>>>>>> + > > >>>>>>>>>>> +#if 0 > > >>>>>>>>>>> + } else { > > >>>>>>>>>>> + > > >>>>>>>>>>> avnd_msg_content_free(cb, &pending_rec->msg); > > >>>>>>>>>>> + delete pending_rec; > > >>>>>>>>>>> + pending_rec = cb- > > >>>>>>>>>>>> dnd_list.head; > > >>>>>>>>>>> + } #endif > > >>>>>>>>>>> + found = true; > > >>>>>>>>>>> + } > > >>>>>>>>>>> + } > > >>>>>>>>>>> + } > > >>>>>>>>>>> + } > > >>>>>>>>>>> + TRACE("retransmit message to amfd"); > > >>>>>>>>>>> + for (pending_rec = cb->dnd_list.head; > > >>>>>>>>>>> +pending_rec != nullptr; > > >>>>>>>>>>> pending_rec = pending_rec->next) { > > >>>>>>>>>>> + avnd_diq_rec_send(cb, pending_rec); > > >>>>>>>>>>> + } > > >>>>>>>>>>> + } > > >>>>>>>>>>> + TRACE_LEAVE(); > > >>>>>>>>>>> + return; > > >>>>>>>>>>> +} > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >> > > > /************************************************************* > > >>>>>>>>>>> *************** > > >>>>>>>>>>> Name : avnd_diq_rec_send > > >>>>>>>>>>> diff --git 
a/osaf/services/saf/amf/amfnd/include/avnd_di.h > > >>>>>>>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h > > >>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h > > >>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h > > >>>>>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct > > >> avnd > > >>>>> void > > >>>>>>>>>>> avnd_diq_del(struct avnd_cb_tag *); AVND_DND_MSG_LIST > > >>>>>>>>>>> *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG > > *msg); > > >>> void > > >>>>>>>>>>> avnd_diq_rec_del(struct avnd_cb_tag *cb, > > >> AVND_DND_MSG_LIST > > >>>>>>> *rec); > > >>>>>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag > > >> *cb); > > >>>>>>>>>>> uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, > > >>>>>>>>>>> AVND_DND_MSG_LIST *rec); uint32_t > > >>>>> avnd_di_reg_su_rsp_snd(struct > > >>>>>>>>>>> avnd_cb_tag *cb, SaNameT *su_name, uint32_t ret_code); > > >>>>>>>>>>> uint32_t avnd_di_ack_nack_msg_send(struct avnd_cb_tag > > *cb, > > >>>>>>>>>>> uint32_t rcv_id, uint32_t view_num); > > >>>>>>> -------------------------------------------------------------- > > >>>>>>> -- > > >>>>>>> - > > >>>>>>> -- > > >>>>>>> -- > > >>>>>>> --------- _______________________________________________ > > >>>>>>> Opensaf-devel mailing list > > >>>>>>> Opensaf-devel@lists.sourceforge.net > > >>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel > > >> ------------------------------------------------------------------- > > >> -- > > >> --------- _______________________________________________ > > >> Opensaf-devel mailing list > > >> Opensaf-devel@lists.sourceforge.net > > >> https://lists.sourceforge.net/lists/listinfo/opensaf-devel > > > > ------------------------------------------------------------------------------ > _______________________________________________ > Opensaf-devel mailing list > Opensaf-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/opensaf-devel 
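For readers following the patch description in this thread ("The patch buffers susi_resp_msg during headless stage and resend it to AMFD after headless"), the buffer-and-resend idea can be sketched in isolation as below. This is only a minimal illustration under stated assumptions, not the OpenSAF implementation: `SusiResp`, `NodeDirector`, `send_susi_resp`, `resend_buffered`, `pending` and `wire` are hypothetical names invented for this sketch. It also simplifies the queue handling; the actual patch keeps records in `cb->dnd_list` and pushes them back after stamping the id, whereas this sketch simply drains the queue.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for an SU-SI assignment response (not the AVSV type).
struct SusiResp {
  uint32_t msg_id;  // 0 marks "buffered while headless, id not assigned yet"
  std::string su;
  std::string si;
};

// Hypothetical stand-in for the node director control block.
struct NodeDirector {
  bool director_down = false;    // mirrors the idea of cb->is_avd_down
  uint32_t snd_msg_id = 0;       // last message id handed out
  std::deque<SusiResp> pending;  // responses deferred during headless
  std::vector<SusiResp> wire;    // messages "sent" to the director

  // Send a response now, or defer it with msg_id 0 while headless.
  void send_susi_resp(const std::string& su, const std::string& si) {
    if (director_down) {
      pending.push_back(SusiResp{0, su, si});
      return;
    }
    wire.push_back(SusiResp{++snd_msg_id, su, si});
  }

  // After headless: stamp deferred responses with fresh ids, in queue
  // order, then send them. Draining the queue here is a simplification.
  void resend_buffered() {
    assert(!director_down);
    for (SusiResp& msg : pending) {
      if (msg.msg_id == 0) msg.msg_id = ++snd_msg_id;
      wire.push_back(msg);
    }
    pending.clear();
  }
};
```

Assigning the message id only at resend time keeps the ids contiguous with whatever was sent before the director went away, which appears to be the reason the patch sets `msg_id = 0` when buffering.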