Hi Nagu,

Thanks for the update. I was hoping to float the patch so that everyone can
review the code, but let's wait for your testing first.

Thanks,
Minh



On 22/09/16 21:22, Nagendra Kumar wrote:
> Hi Minh,
>
> I have tested the following scenarios so far, and they work well:
> 1. Faults under a standalone system. Act and Std SUs are present and the
> cluster went headless. Faults occurred in Act and Std, separately and
> together, with recovery as restart, su f/o, node f/o, node s/o. Faults were
> also mixed across two SGs.
> 2. During headless, the escalations from comp restart -> su restart ->
> su f/o -> node f/o. Two SGs were used for testing. When the cluster
> recovers, the system works as expected.
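
The escalation order exercised in scenario 2 can be pictured as a simple ladder. The following is only a minimal C++ sketch of that ordering, not AMF code: the `Recovery` enum and the `escalate()` and `name()` helpers are hypothetical names introduced here for illustration, and the sketch leaves out the counters and timers that decide when an escalation actually happens.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the recovery levels named in the test report,
// ordered from least to most severe. Not the real AMF types.
enum class Recovery { CompRestart, SuRestart, SuFailover, NodeFailover };

// Return the next level on the ladder; the top level stays where it is.
inline Recovery escalate(Recovery r) {
  switch (r) {
    case Recovery::CompRestart:  return Recovery::SuRestart;
    case Recovery::SuRestart:    return Recovery::SuFailover;
    case Recovery::SuFailover:   return Recovery::NodeFailover;
    case Recovery::NodeFailover: return Recovery::NodeFailover;
  }
  return r;  // unreachable for valid enum values
}

// Human-readable label for a level, using the report's wording.
inline const char* name(Recovery r) {
  switch (r) {
    case Recovery::CompRestart:  return "comp restart";
    case Recovery::SuRestart:    return "su restart";
    case Recovery::SuFailover:   return "su f/o";
    case Recovery::NodeFailover: return "node f/o";
  }
  return "unknown";
}
```

Calling `escalate()` repeatedly, starting from `Recovery::CompRestart`, walks the same chain that was tested during headless and stops at node f/o.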
>
> SG and node auto-repair were enabled and disabled across many test cases.
>
> Further testing will continue:
> 1. Faults during assignments.
> 2. Faults during admin operations.
>
> Thanks
> -Nagu
>
>> -----Original Message-----
>> From: minh chau [mailto:minh.c...@dektech.com.au]
>> Sent: 15 September 2016 12:27
>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if
>> csi completes during headless [#1725 part 1] V1
>>
>> Hi Nagu,
>>
>> Yes, you are right. The "component failover" recovery is unstable. I think
>> I will post my analysis of the component failover problem to #1902 after
>> you have verified the other recoveries.
>>
>> Thanks,
>> Minh
>>
>> On 15/09/16 16:51, Nagendra Kumar wrote:
>>> Hi Minh,
>>> @2.a.) and @2.b.) are working, except with "Component Failover" as the
>>> recovery. Other recoveries, such as SU Failover, work fine with
>>> 1725_pending_review.tgz and 07_no_recovery_if_no_pending_susi.diff.
>>> Please confirm.
>>>
>>> Thanks
>>> -Nagu
>>>
>>>> -----Original Message-----
>>>> From: Nagendra Kumar
>>>> Sent: 15 September 2016 12:13
>>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
>>>> continuation if csi completes during headless [#1725 part 1] V1
>>>>
>>>> Hi Minh,
>>>>>> If there's no any major problem, can we make SI Dep as last phase?
>>>> Yes, absolutely. There is no problem.
>>>>>> If I am right, I think you are testing @2.a) - and *fault* has just
>>>>>> been as node reboot/powered-off by user during headless.
>>>> Yes, you are right.
>>>>
>>>> Thanks
>>>> -Nagu
>>>>
>>>>> -----Original Message-----
>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
>>>>> Sent: 14 September 2016 17:54
>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
>>>>> continuation if csi completes during headless [#1725 part 1] V1
>>>>>
>>>>> Hi Nagu,
>>>>>
>>>>> I have proposed to change the order on 28 Jul:
>>>>>
>>>>> ==============
>>>>>
>>>>> I would like to change the above order of implementation:
>>>>> @0. We are here now: no admin op continuation, no recovery on faults
>>>>> during headless.
>>>>> Since componentRestart/suRestart has no impact on recovery after
>>>>> headless, faults during headless here mean: failover escalation, or
>>>>> node reboot/power-off by the user during headless. These faults are
>>>>> different phenomena, but they all result in loss of SUSI. Having
>>>>> #1902 will remove the major impact of a node reboot due to immediate
>>>>> escalation, and AMF also has to deal with the loss of SUSI the same
>>>>> as without #1902, plus failover escalation.
>>>>>
>>>>> @1. Admin op continuation without required recovery on faults during
>>>>> headless
>>>>> @1.a) All CSI callbacks complete during headless, but SUSI states
>>>>> are still QUIESCED/QUIESCING
>>>>> @1.b) One of the CSI callbacks is still ongoing after headless (AMFD
>>>>> would have to wait for it?)
>>>>>
>>>>> @2. Recovery on faults. (Fault recovery needs to consider the admin op
>>>>> continuation that would have been implemented in step @1.) Needs #1902.
>>>>> @2.a.) Faults in normal flow: No admin op continuation is required
>>>>> after headless, but fault did happen during headless
>>>>> @2.b.) Faults happen during admin operation while headless, after
>>>>> headless AMFD needs to consider a recovery on fault together with
>>>>> admin op continuation.
>>>>>
>>>>> @3. @1 + @2 + With SI Dep.
>>>>>
>>>>> ===============
>>>>> I thought we had followed the above order so far, because part 1
>>>>> was acked, which is "@1. Admin op continuation without required
>>>>> recovery on faults during headless".
>>>>> If there is no major problem, can we make SI Dep the last phase?
>>>>> If I am right, I think you are testing @2.a), where the *fault* is just
>>>>> a node reboot/power-off by the user during headless.
>>>>>
>>>>> Thanks,
>>>>> Minh
>>>>>
>>>>> On 14/09/16 21:48, Nagendra Kumar wrote:
>>>>>> Hi Minh,
>>>>>> If it is not tested, then it is fine. But we had added (#1) the
>>>>>> following in ticket #1725 on 27 Jul:
>>>>>> ===========================================
>>>>>> Nagendra Kumar - 2016-07-27
>>>>>>
>>>>>> For the 2N red model, implementation can be done in the following
>>>>>> phased manner.
>>>>>> It has the advantage of being logically segregated, and it continues
>>>>>> from where we left off in 5.0.
>>>>>> (Phases #1, #2 and #3 are more related to ticket #1725, and phases #4
>>>>>> and #5 are related to #1902.)
>>>>>>
>>>>>> 1.    Node restart escalation (with and without SI Dep).
>>>>>> 2.    Without SI Dep: Admin op (no faults/escalations).
>>>>>> 3.    Without SI Dep: Admin op + node restart faults/escalations during
>>>>>>       headless.
>>>>>> 4.    Without SI Dep:
>>>>>>        a.) All faults in normal flows.
>>>>>>        b.) All faults during admin operation (minus node reboot
>>>>>>            during headless, as covered in #3).
>>>>>> 5.    With SI Dep: #2, #3 and #4.
>>>>>>
>>>>>> Since 5.0 already has the immediate escalation model (component and
>>>>>> node restart/reboot), #1, #2 and #3 complete the leftover portion of
>>>>>> the headless contribution in 5.0 with that model.
>>>>>> ======================================
>>>>>>
>>>>>> Thanks
>>>>>> -Nagu
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
>>>>>>> Sent: 14 September 2016 17:05
>>>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
>>>>>>>
>>>>>>> Hi Nagu,
>>>>>>>
>>>>>>> SI Dep is the last phase of implementation of headless recovery,
>>>>>>> its support is not included in all patches attached in ticket #1725.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Minh
>>>>>>>
>>>>>>> On 14/09/16 21:21, Nagendra Kumar wrote:
>>>>>>>> Hi Minh,
>>>>>>>> Have you tested SI Dep (2N red model) for the "node restart test
>>>>>>>> cases"? I can't see it in the test case doc.
>>>>>>>> Thanks
>>>>>>>> -Nagu
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Nagendra Kumar
>>>>>>>>> Sent: 13 September 2016 11:20
>>>>>>>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
>>>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
>>>>>>>>>
>>>>>>>>> Hi Minh,
>>>>>>>>> I have tested these scenarios again and they work well.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> -Nagu
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
>>>>>>>>>> Sent: 12 September 2016 11:53
>>>>>>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
>>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>>> Subject: Re: [PATCH 2 of 4] AMFND: Admin operation continuation
>>>>>>>>>> if csi completes during headless [#1725 part 1] V1
>>>>>>>>>>
>>>>>>>>>> Hi Nagu,
>>>>>>>>>>
>>>>>>>>>> Your configuration hits one bug: absent SUSIs are found after
>>>>>>>>>> headless, but no real SUSIs are available either.
>>>>>>>>>> In this case I think AMFD can treat it like a fresh assignment.
>>>>>>>>>> I have attached the patch to ticket #1725; please help to test again.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Minh
>>>>>>>>>>
>>>>>>>>>> On 12/09/16 11:09, minh chau wrote:
>>>>>>>>>>> Hi Nagu,
>>>>>>>>>>>
>>>>>>>>>>> I'm running the tests with this configuration and will get
>>>>>>>>>>> back to you.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Minh
>>>>>>>>>>>
>>>>>>>>>>> On 09/09/16 22:26, Nagendra Kumar wrote:
>>>>>>>>>>>> Hi Minh,
>>>>>>>>>>>> I am using 1725_pending_review.tgz
>>>>>>>>>>>> (1725_02_V2_bugfix_01_resend_buffer_in_set_leds.diff,
>>>>>>>>>>>> 1725_02_V2_bugfix_02_honor_clusterinit_nodesync_timer.diff,
>>>>>>>>>>>> 1725_02_V2_bugfix_03_restore_ng_admin.diff,
>>>>>>>>>>>> 1725_03_V4_failover_absent_susi_longDn.diff,
>>>>>>>>>>>> 1725_04_V2_headless_validation.diff,
>>>>>>>>>>>> 1725_05_V2_resend_oper_state.diff,
>>>>>>>>>>>> 1725_06a_fullscope_escalation_headless.diff).
>>>>>>>>>>>>
>>>>>>>>>>>> I am doing basic node reboot validation testing with no faults.
>>>>>>>>>>>>
>>>>>>>>>>>> Configuration: SU1(act) and SU2(standby) both on PL-3.
>>>>>>>>>>>>
>>>>>>>>>>>> TC #1: Start SC-1, PL-3 and PL-5. Unlock SU1 and SU2. Stop
>>>>>>>>>>>> SC-1 and stop PL-3, then start PL-3 and start SC-1.
>>>>>>>>>>>> After SC-1 and PL-3 come back, ideally SU1 and SU2 should
>>>>>>>>>>>> get assignments as Act and Std, but no assignments are being
>>>>>>>>>>>> given to SUs on PL-3, and the status shows the following:
>>>>>>>>>>>>
>>>>>>>>>>>> Only SU2 has a Std assignment.
>>>>>>>>>>>>
>>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>>>>>>              saAmfSISUHAState=STANDBY(2)
>>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>>
>>>>>>>>>>>> TC #2: Configuration same as TC#1. Stop PL-3 and don't start.
>>>>>>>>>>>> The same issue:
>>>>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
>>>>>>>>>>>>              saAmfSISUHAState=STANDBY(2)
>>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
>>>>>>>>>>>>              saAmfSISUHAState=ACTIVE(1)
>>>>>>>>>>>>
>>>>>>>>>>>> TC #3: Configured SU1(Act) on PL-3 and SU2(Std) on PL-4.
>>>>>>>>>>>> Stop SC-1, stop PL-3 and PL-4, but PL-5 is running. start
>>>>>>>>>>>> SC-1, the same issue.
>>>>>>>>>>>>
>>>>>>>>>>>> TC #4: Same as TC #3, but SU3 configured on PL-5 as spare.
>>>>>>>>>>>> SU3 doesn't get any assignment and Sg is unstable.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> -Nagu
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
>>>>>>>>>>>>> Sent: 18 August 2016 05:46
>>>>>>>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
>>>>>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
>>>>>>>>>>>>> minh.c...@dektech.com.au
>>>>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>>>>>>>>>> Subject: [PATCH 2 of 4] AMFND: Admin operation continuation
>>>>>>>>>>>>> if csi completes during headless [#1725 part 1] V1
>>>>>>>>>>>>>
>>>>>>>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
>>>>>>>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
>>>>>>>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are basically two options by which AMFD can continue an
>>>>>>>>>>>>> admin operation with completed CSI(s).
>>>>>>>>>>>>>
>>>>>>>>>>>>> First: AMFD can use the synced SUSI fsm state as the latest. AMFD
>>>>>>>>>>>>> then has to explore its SUSI assignments, together with the
>>>>>>>>>>>>> adminStates of relevant entities, to determine which SU
>>>>>>>>>>>>> susi_success() should be called on. CSI addition needs a deeper
>>>>>>>>>>>>> level of exploration. It also depends on the SG fsm state, which
>>>>>>>>>>>>> is used differently in different SG types.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Second: AMFD uses the SUSI fsm state read from IMM as the
>>>>>>>>>>>>> latest, and AMFND needs to resend the susi_resp messages that
>>>>>>>>>>>>> were deferred during headless so that AMFD can continue the
>>>>>>>>>>>>> admin operation sequence.
>>>>>>>>>>>>> Both cases of CSI completion [during or after] headless can
>>>>>>>>>>>>> run in the same code flow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The patch buffers susi_resp_msg during the headless stage and
>>>>>>>>>>>>> resends it to AMFD after headless. There is a chance
>>>>>>>>>>>>> that AMFND sent out a susi response message but AMFD could
>>>>>>>>>>>>> not receive or process it. This case could be seen as a defect,
>>>>>>>>>>>>> which can be fixed by securing the result of sending the
>>>>>>>>>>>>> susi_resp message from AMFND toward AMFD.
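
As a rough illustration of the buffer-then-resend idea described above, here is a minimal standalone C++ sketch. It is not the patch: `SusiResp` and `NodeDirector` are hypothetical stand-ins for the real AMFND message and control-block types, and the queue handling is simplified. The only detail taken from the patch is the convention that a buffered message carries msg_id 0 until the director is back, at which point it is stamped with the next send id.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a SUSI response; not the real AVSV_DND_MSG layout.
struct SusiResp {
  uint32_t msg_id;  // 0 means "buffered while headless, no id assigned yet"
  std::string su;
  std::string si;
};

// Hypothetical stand-in for the node director's send path.
class NodeDirector {
 public:
  bool director_down = false;

  // Send right away when the director is reachable; otherwise buffer the
  // response with msg_id 0 so it can be recognised later.
  void send_susi_resp(const std::string& su, const std::string& si) {
    if (director_down) {
      pending_.push_back({0, su, si});
    } else {
      sent_.push_back({++snd_msg_id_, su, si});
    }
  }

  // Called once the director is back: stamp fresh ids in FIFO order
  // and hand the buffered responses over.
  void resend_buffered() {
    assert(!director_down);
    while (!pending_.empty()) {
      SusiResp msg = pending_.front();
      pending_.pop_front();
      msg.msg_id = ++snd_msg_id_;
      sent_.push_back(msg);
    }
  }

  const std::vector<SusiResp>& sent() const { return sent_; }
  std::size_t pending_count() const { return pending_.size(); }

 private:
  uint32_t snd_msg_id_ = 0;
  std::deque<SusiResp> pending_;
  std::vector<SusiResp> sent_;
};
```

In this sketch, responses produced while `director_down` is true stay in the pending queue, and a later `resend_buffered()` call delivers them with consecutive message ids, so the receiving side sees the same sequence it would have seen without the headless period.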
>>>>>>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc
>>>>>>>>>>>>> b/osaf/services/saf/amf/amfnd/di.cc
>>>>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc
>>>>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc
>>>>>>>>>>>>> @@ -805,11 +805,6 @@ uint32_t
>>>>> avnd_di_susi_resp_send(AVND_CB
>>>>>>>>>>>>>           if (cb->term_state ==
>>>>>>>>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED)
>>>>>>>>>>>>>               return rc;
>>>>>>>>>>>>>
>>>>>>>>>>>>> -    if (cb->is_avd_down == true) {
>>>>>>>>>>>>> -        m_AVND_SU_ALL_SI_RESET(su);
>>>>>>>>>>>>> -        return rc;
>>>>>>>>>>>>> -    }
>>>>>>>>>>>>> -
>>>>>>>>>>>>>           // should be in assignment pending state to be here
>>>>>>>>>>>>>           osafassert(m_AVND_SU_IS_ASSIGN_PEND(su));
>>>>>>>>>>>>>
>>>>>>>>>>>>> @@ -820,64 +815,76 @@ uint32_t
>>>>>>> avnd_di_susi_resp_send(AVND_CB
>>>>>>>>>>>>>           TRACE_ENTER2("Sending Resp su=%s, si=%s,
>>>>>>>>>>>>> curr_state=%u, prv_state=%u", su->name.value,
>>>>>>>>>>>>> curr_si->name.value,curr_si-
>>>>>>>>>>>>>> curr_state,curr_si->prv_state);
>>>>>>>>>>>>>           /* populate the susi resp msg */
>>>>>>>>>>>>>           msg.info.avd = new AVSV_DND_MSG();
>>>>>>>>>>>>> -        msg.type = AVND_MSG_AVD;
>>>>>>>>>>>>> -        msg.info.avd->msg_type =
>>>>>>>>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG;
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.msg_id =
>> ++(cb-
>>>>>>>>>>>>>> snd_msg_id);
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb-
>>>>>>>>>>>>>> node_info.nodeId;
>>>>>>>>>>>>> -        if (si) {
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi =
>>>>>>>>>>>>> -                        ((si->single_csi_add_rem_in_si ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_BASE) ?
>>>>>>>>>>>>> false : true);
>>>>>>>>>>>>> -        }
>>>>>>>>>>>>> -        TRACE("curr_assign_state '%u'", curr_si-
>>>>> curr_assign_state);
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
>>>>>>>>>>>>> -
>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>>>> ||
>>>>>>>>>>>>> -
>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si))
>>>> ?
>>>>>>>>>>>>> -                ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN :
>>>>>>>>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL;
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.su_name = su-
>>>>>>>> name;
>>>>>>>>>>>>> -        if (si) {
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = si-
>>> name;
>>>>>>>>>>>>> -                if (AVSV_SUSI_ACT_ASGN ==
>>>>>>>>>>>>> si->single_csi_add_rem_in_si) {
>>>>>>>>>>>>> -                        TRACE("si->curr_assign_state '%u'", 
>>>>>>>>>>>>> curr_si-
>>>>>>>>>>>>>> curr_assign_state);
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
>>>>>>>>>>>>> -
>>>>>>>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>> ||
>>>>>>>>>>>>> -
>>>>>>>>>>>>>
>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ?
>>>>>>>>>>>>> -                                AVSV_SUSI_ACT_ASGN :
>>>>>>>>>>>>> AVSV_SUSI_ACT_DEL;
>>>>>>>>>>>>> -                }
>>>>>>>>>>>>> -        }
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.ha_state =
>>>>>>>>>>>>> -                (SA_AMF_HA_QUIESCING == curr_si->curr_state) ?
>>>>>>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state;
>>>>>>>>>>>>> -        msg.info.avd->msg_info.n2d_su_si_assign.error =
>>>>>>>>>>>>> -
>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>>>> ||
>>>>>>>>>>>>> -
>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si))
>>>> ?
>>>>>>>>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE;
>>>>>>>>>>>>> +    msg.type = AVND_MSG_AVD;
>>>>>>>>>>>>> +    msg.info.avd->msg_type =
>>>>>>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG;
>>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb-
>>>>>>>>>>>>>> node_info.nodeId;
>>>>>>>>>>>>> +    if (si) {
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.single_csi =
>>>>>>>>>>>>> +                ((si->single_csi_add_rem_in_si ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_BASE) ? false : true);
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>> +    TRACE("curr_assign_state '%u'", curr_si-
>>> curr_assign_state);
>>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>>>>> ||
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si))
>>>>> ?
>>>>>>>>>>>>> +                ((!curr_si->prv_state) ?
>>>>>>>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) :
>>>>>>>>> AVSV_SUSI_ACT_DEL;
>>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.su_name = su-
>>>>>> name;
>>>>>>>>>>>>> +    if (si) {
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.si_name =
>>>>>>>>>>>>> + si-
>>>>>>>>>>>>>> name;
>>>>>>>>>>>>> +        if (AVSV_SUSI_ACT_ASGN ==
>>>>>>>>>>>>> + si->single_csi_add_rem_in_si)
>>>> {
>>>>>>>>>>>>> +            TRACE("si->curr_assign_state '%u'", curr_si-
>>>>>>>>>>>>>> curr_assign_state);
>>>>>>>>>>>>> +                msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.msg_act =
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>>>>> ||
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si))
>>>>> ?
>>>>>>>>>>>>> +                    AVSV_SUSI_ACT_ASGN :
>>>>>>>>>>>>> AVSV_SUSI_ACT_DEL;
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.ha_state =
>>>>>>>>>>>>> +            (SA_AMF_HA_QUIESCING == curr_si->curr_state) ?
>>>>>>>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state;
>>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.error =
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si)
>>>>> ||
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si))
>>>>> ?
>>>>>>>>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE;
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_ASGN)
>>>>>>>>>>>>> -                osafassert(si);
>>>>>>>>>>>>> +    if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_ASGN)
>>>>>>>>>>>>> +        osafassert(si);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        /* send the msg to AvD */
>>>>>>>>>>>>> -        TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u',
>>>>>>>>>>>>> su'%s', si'%s',
>>>>>>>>>>>>> ha_state'%u', error'%u', single_csi'%u'",
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id,
>>>>>>>>>>>>> msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.node_id,
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
>>>>>>>>>>>>> msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value,
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value,
>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
>>>>>>>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error,
>>>>>>>>>>>>> msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.single_csi);
>>>>>>>>>>>>> +    /* send the msg to AvD */
>>>>>>>>>>>>> +    TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u',
>>>>>>>>>>>>> + su'%s',
>>>>>>>>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'",
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.msg_id,
>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id,
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value,
>>>>>>>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value,
>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.error,
>>>>>>>>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        if ((su->si_list.n_nodes > 1) && (si == nullptr)) {
>>>>>>>>>>>>> -                if (msg.info.avd-
>>> msg_info.n2d_su_si_assign.msg_act
>>>> ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_DEL)
>>>>>>>>>>>>> -                        LOG_NO("Removed 'all SIs' from '%s'",
>>>>>>>>>>>>> su->name.value);
>>>>>>>>>>>>> +    if ((su->si_list.n_nodes > 1) && (si == nullptr)) {
>>>>>>>>>>>>> +        if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act
>>>>>>>>>>>>> + ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_DEL)
>>>>>>>>>>>>> +            LOG_NO("Removed 'all SIs' from '%s'", su-
>>>>>>>>>>>>>> name.value);
>>>>>>>>>>>>> -                if (msg.info.avd-
>>> msg_info.n2d_su_si_assign.msg_act
>>>> ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_MOD)
>>>>>>>>>>>>> -                        LOG_NO("Assigned 'all SIs' %s of '%s'",
>>>>>>>>>>>>> -                               ha_state[msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.ha_state],
>>>>>>>>>>>>> -                               su->name.value);
>>>>>>>>>>>>> -        }
>>>>>>>>>>>>> +        if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act
>>>>>>>>>>>>> + ==
>>>>>>>>>>>>> AVSV_SUSI_ACT_MOD)
>>>>>>>>>>>>> +            LOG_NO("Assigned 'all SIs' %s of '%s'",
>>>>>>>>>>>>> +                    ha_state[msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.ha_state],
>>>>>>>>>>>>> +                    su->name.value);
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        rc = avnd_di_msg_send(cb, &msg);
>>>>>>>>>>>>> -        if (NCSCC_RC_SUCCESS == rc)
>>>>>>>>>>>>> -                msg.info.avd = 0;
>>>>>>>>>>>>> -
>>>>>>>>>>>>> -        /* we have completed the SU SI msg processing */
>>>>>>>>>>>>> -        if (su_assign_state_is_stable(su))
>>>>>>>>>>>>> -                m_AVND_SU_ASSIGN_PEND_RESET(su);
>>>>>>>>>>>>> -        m_AVND_SU_ALL_SI_RESET(su);
>>>>>>>>>>>>> +    if (cb->is_avd_down == true) {
>>>>>>>>>>>>> +        // We are in headless, buffer this msg
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0;
>>>>>>>>>>>>> +        if (avnd_diq_rec_add(cb, &msg) == nullptr) {
>>>>>>>>>>>>> +            rc = NCSCC_RC_FAILURE;
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>> +        m_AVND_SU_ALL_SI_RESET(su);
>>>>>>>>>>>>> +        LOG_NO("avnd_di_susi_resp_send() deferred as AMF
>>>>>>>>>>>>> director is offline");
>>>>>>>>>>>>> +    } else {
>>>>>>>>>>>>> +        // We are in normal cluster, send msg to director
>>>>>>>>>>>>> +        msg.info.avd->msg_info.n2d_su_si_assign.msg_id =
>>>>>>>>>>>>> + ++(cb-
>>>>>>>>>>>>>> snd_msg_id);
>>>>>>>>>>>>> +        /* send the msg to AvD */
>>>>>>>>>>>>> +        rc = avnd_di_msg_send(cb, &msg);
>>>>>>>>>>>>> +        if (NCSCC_RC_SUCCESS == rc)
>>>>>>>>>>>>> +            msg.info.avd = 0;
>>>>>>>>>>>>> +        /* we have completed the SU SI msg processing */
>>>>>>>>>>>>> +        if (su_assign_state_is_stable(su)) {
>>>>>>>>>>>>> +            m_AVND_SU_ASSIGN_PEND_RESET(su);
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>> +        m_AVND_SU_ALL_SI_RESET(su);
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>>
>>>>>>>>>>>>>           /* free the contents of avnd message */
>>>>>>>>>>>>>           avnd_msg_content_free(cb, &msg); @@ -1256,14
>>>>>>>>>>>>> +1263,7
>>>>> @@
>>>>>>>>> void
>>>>>>>>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_
>>>>>>>>>>>>>           /* stop the AvD msg response timer */
>>>>>>>>>>>>>           if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) {
>>>>>>>>>>>>>               m_AVND_TMR_MSG_RESP_STOP(cb, *rec);
>>>>>>>>>>>>> -        // Resend msgs from queue because amfd dropped during
>>>>>>>>>>>>> sync
>>>>>>>>>>>>> -        if ((cb->dnd_list.head != nullptr)) {
>>>>>>>>>>>>> -            TRACE("retransmit message to amfd");
>>>>>>>>>>>>> -            AVND_DND_MSG_LIST *pending_rec = 0;
>>>>>>>>>>>>> -            for (pending_rec = cb->dnd_list.head; pending_rec !=
>>>>>>>>>>>>> nullptr; pending_rec = pending_rec->next) {
>>>>>>>>>>>>> -                avnd_diq_rec_send(cb, pending_rec);
>>>>>>>>>>>>> -            }
>>>>>>>>>>>>> -        }
>>>>>>>>>>>>> +        avnd_diq_rec_send_buffered_msg(cb);
>>>>>>>>>>>>>               /* resend pg start track */
>>>>>>>>>>>>>               avnd_di_resend_pg_start_track(cb);
>>>>>>>>>>>>>           }
>>>>>>>>>>>>> @@ -1276,6 +1276,73 @@ void avnd_diq_rec_del(AVND_CB
>>>> *cb,
>>>>>>>>>> AVND_
>>>>>>>>>>>>>           TRACE_LEAVE();
>>>>>>>>>>>>>           return;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>> +/************************************************************
>>>>>>>>>>>>> ****************
>>>>>>>>>>>>> +  Name          : avnd_diq_rec_send_buffered_msg
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +  Description   : Resend buffered msg
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +  Arguments     : cb  - ptr to the AvND control block
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +  Return Values : None.
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +  Notes         : None.
>>>>>>>>>>>>>
>> +*************************************************************
>>>>>>>>>>>>> **********
>>>>>>>>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB
>>>> *cb)
>>>>> {
>>>>>>>>>>>>> +    TRACE_ENTER();
>>>>>>>>>>>>> +    // Resend msgs from queue because amfnd dropped during
>>>>>>>>> headless
>>>>>>>>>>>>> +    // or headless-synchronization
>>>>>>>>>>>>> +    if ((cb->dnd_list.head != nullptr)) {
>>>>>>>>>>>>> +        AVND_DND_MSG_LIST *pending_rec = 0;
>>>>>>>>>>>>> +        TRACE("Attach msg_id of buffered msg");
>>>>>>>>>>>>> +        bool found = true;
>>>>>>>>>>>>> +        while (found) {
>>>>>>>>>>>>> +            found = false;
>>>>>>>>>>>>> +            for (pending_rec = cb->dnd_list.head;
>>>>>>>>>>>>> + pending_rec !=
>>>>>>>>>>>>> nullptr; pending_rec = pending_rec->next) {
>>>>>>>>>>>>> +                if (pending_rec->msg.type ==
>>>>>>>>>>>>> AVND_MSG_AVD) {
>>>>>>>>>>>>> +                    // At this moment, only oper_state
>>>>>>>>>>>>> msg needs to report to director
>>>>>>>>>>>>> +                    if (pending_rec->msg.info.avd-
>>>>>>>>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG &&
>>>>>>>>>>>>> +                        pending_rec->msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) {
>>>>>>>>>>>>> +                        m_AVND_DIQ_REC_POP(cb,
>>>>>>>>>>>>> pending_rec); #if 0
>>>>>>>>>>>>> +                        // only resend if this SUSI
>>>>>>>>>>>>> does exist
>>>>>>>>>>>>> +                        AVND_SU *su =
>>>>>>>>>>>>> m_AVND_SUDB_REC_GET(cb->sudb,
>>>>>>>>>>>>> +                                pending_rec-
>>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name);
>>>>>>>>>>>>> +                        if (su != nullptr && su-
>>>>>>>>>>>>>> si_list.n_nodes > 0) { #endif
>>>>>>>>>>>>> +                            pending_rec-
>>>>>>>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id =
>>>>>>>>>>>>>> ++(cb->snd_msg_id);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         m_AVND_DIQ_REC_PUSH(cb, pending_rec);
>>>>>>>>>>>>> +                            LOG_NO("Found and
>>>>>>>>>>>>> resend buffered su_si_assign msg for SU:'%s', "
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', "
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         "error:'%u', msg_id:'%u'",
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         pending_rec->msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.su_name.value,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         pending_rec->msg.info.avd-
>>>>>>>>>>>>>> msg_info.n2d_su_si_assign.si_name.value,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> pending_rec->msg.info.avd-
>>>>>> msg_info.n2d_su_si_assign.ha_state,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> pending_rec->msg.info.avd-
>>>>> msg_info.n2d_su_si_assign.msg_act,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> pending_rec->msg.info.avd-
>>>>>> msg_info.n2d_su_si_assign.single_csi
>>>>>>>>>>>>> ,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> pending_rec->msg.info.avd-
>>> msg_info.n2d_su_si_assign.error,
>>>>>>>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> pending_rec->msg.info.avd-
>>>>> msg_info.n2d_su_si_assign.msg_id);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +#if 0
>>>>>>>>>>>>> +                        } else {
>>>>>>>>>>>>> +
>>>>>>>>>>>>>         avnd_msg_content_free(cb, &pending_rec->msg);
>>>>>>>>>>>>> +                            delete pending_rec;
>>>>>>>>>>>>> +                            pending_rec = cb-
>>>>>>>>>>>>>> dnd_list.head;
>>>>>>>>>>>>> +                        }
>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>> +                        found = true;
>>>>>>>>>>>>> +                    }
>>>>>>>>>>>>> +                }
>>>>>>>>>>>>> +            }
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>> +        TRACE("retransmit message to amfd");
>>>>>>>>>>>>> +         for (pending_rec = cb->dnd_list.head; pending_rec
>>>>>>>>>>>>> +!= nullptr;
>>>>>>>>>>>>> pending_rec = pending_rec->next) {
>>>>>>>>>>>>> +             avnd_diq_rec_send(cb, pending_rec);
>>>>>>>>>>>>> +         }
>>>>>>>>>>>>> +    }
>>>>>>>>>>>>> +    TRACE_LEAVE();
>>>>>>>>>>>>> +    return;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>> /*************************************************************
>>>>>>>>>>>>> ***************
>>>>>>>>>>>>>         Name          : avnd_diq_rec_send
>>>>>>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h b/osaf/services/saf/amf/amfnd/include/avnd_di.h
>>>>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h
>>>>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h
>>>>>>>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd
>>>>>>>>>>>>>  void avnd_diq_del(struct avnd_cb_tag *);
>>>>>>>>>>>>>  AVND_DND_MSG_LIST *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg);
>>>>>>>>>>>>>  void avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
>>>>>>>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb);
>>>>>>>>>>>>>  uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
>>>>>>>>>>>>>  uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, SaNameT *su_name, uint32_t ret_code);
>>>>>>>>>>>>>  uint32_t avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, uint32_t rcv_id, uint32_t view_num);
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> _______________________________________________
>>>>>>>>> Opensaf-devel mailing list
>>>>>>>>> Opensaf-devel@lists.sourceforge.net
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel

