Hi Minh,
        Please see my responses inline, marked with [Nagu].

Thanks
-Nagu
> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 13 January 2017 03:53
> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync
> before standby AMFD comes up [#2162]
> 
> Hi Nagu,
> 
> Thanks for reviewing, please see comments inline.
> 
> Thanks,
> Minh
> 
> On 12/01/17 21:48, Nagendra Kumar wrote:
> > Hi Minh,
> >      Though I am not able to simulate the problem, I tested as below:
> > 1. Start SC1, SC2, PL-3 and PL-4. Configure SU1 on PL-3 as Act and SU2 on
> >    PL-4 as Standby.
> > 2. Stop SC1 and SC2 and then stop PL-3.
> > 3. Start SC-1 and SC-2. When SC-2 prints Cold sync complete, stop SC1. SC2
> >    becomes Act.
> [M]: As SU1 is on PL-3 and SU2 is on PL-4, if PL-3 is stopped then only
> SU2 has an active assignment.
[Nagu]: PL-3 is stopped in step #2.
> >
> > In this case, SC-2 contains both SU1(Act) and SU2(Standby) assignments.
> > Ideally, the SU2 assignment should have been Act and there should be no SU1
> > assignment.
> [M]: This seems to be another test where SU1 and SU2 are hosted on SC2;
> in that case both SU1 and SU2 should get assignments.
[Nagu]: I mean that the command 'amf-state siass', run on SC-1, displays both
the SU1 and SU2 assignments. SU1 and SU2 are hosted on PL-3 and PL-4
respectively. Is this not similar to the test case mentioned in the ticket?
> >
> >
> > safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >          saAmfSISUHAState=ACTIVE(1)
> >          saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >
> > safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >          saAmfSISUHAState=STANDBY(2)
> >          saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >
> > Please check.
> >
> > Thanks
> > -Nagu
> >
> >> -----Original Message-----
> >> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> >> Sent: 08 November 2016 08:53
> >> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> >> gary....@dektech.com.au; minh.c...@dektech.com.au
> >> Cc: opensaf-devel@lists.sourceforge.net
> >> Subject: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync
> >> before standby AMFD comes up [#2162]
> >>
> >>   osaf/services/saf/amf/amfnd/di.cc   |  7 +++++--
> >>   osaf/services/saf/amf/amfnd/susm.cc |  6 ++++++
> >>   2 files changed, 11 insertions(+), 2 deletions(-)
> >>
> >>
> >> This SC failover case causes the new active AMFD to get stuck in
> >> node_up messages.
> >>
> >> Say the first active controller is SC1, which goes down during headless
> >> sync. The amfnd on SC2 then receives mds_down of AVD, and both
> >> is_avd_down and amfd_sync_required are set to true. When SC2 takes
> >> over the active role, amfnd on SC2 receives mds_up, but only is_avd_down
> >> is set to false; the variable amfd_sync_required remains true.
> >> When amfnd-SC2 finishes initiating the middleware SUs, it needs to send
> >> a su_oper message to AMFD, but the message cannot be sent out because
> >> amfd_sync_required is still true.
> >>
> >> In this SC failover scenario, amfd_sync_required needs to be set to
> >> false when amfnd on SC2 receives a su_pres message for middleware SUs.
> >> That means amfnd on the active controller does not need to wait for the
> >> set_leds message, which signals that cluster initiation is done, before
> >> it can send su_oper messages to AMFD. This logic also aligns with the
> >> normal headless scenario, where amfnd on the active controller has
> >> amfd_sync_required initially marked as false because no middleware
> >> SUs are initiated. When amfd_sync_required is true, it means that
> >> amfnd's middleware SUs were all initiated and assigned before headless,
> >> so amfnd needs to wait for cluster initiation after headless.
> >>
> >> diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> >> --- a/osaf/services/saf/amf/amfnd/di.cc
> >> +++ b/osaf/services/saf/amf/amfnd/di.cc
> >> @@ -748,7 +748,8 @@ uint32_t avnd_di_oper_send(AVND_CB *cb,
> >>            if (avnd_diq_rec_add(cb, &msg) == nullptr) {
> >>                    rc = NCSCC_RC_FAILURE;
> >>            }
> >> -          LOG_NO("avnd_di_oper_send() deferred as AMF director is offline");
> >> +          LOG_NO("avnd_di_oper_send() deferred as AMF director is offline(%d),"
> >> +                  " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >>    } else {
> >>            // We are in normal cluster, send msg to director
> >>            msg.info.avd->msg_info.n2d_opr_state.msg_id = ++(cb->snd_msg_id);
> >> @@ -881,7 +882,9 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> >>                    rc = NCSCC_RC_FAILURE;
> >>            }
> >>            m_AVND_SU_ALL_SI_RESET(su);
> >> -          LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
> >> +                LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline(%d),"
> >> +                        " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >> +
> >>           } else {
> >>            // We are in normal cluster, send msg to director
> >>            msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >> diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
> >> --- a/osaf/services/saf/amf/amfnd/susm.cc
> >> +++ b/osaf/services/saf/amf/amfnd/susm.cc
> >> @@ -1345,6 +1345,12 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
> >>                            goto done;
> >>            }
> >>    } else { /* => instantiate the su */
> >> +          // Do not need to wait for headless sync if there is no application SUs
> >> +          // initiated. This is known because here we are receiving su_pres message
> >> +          // for NCS SUs
> >> +          if (su->is_ncs == true)
> >> +                  cb->amfd_sync_required = false;
> >> +
> >>            AVND_EVT *evt_ir = 0;
> >>            TRACE("Sending to Imm thread.");
> >>            evt_ir = avnd_evt_create(cb, AVND_EVT_IR, 0, nullptr, &info->su_name, 0, 0);
> 
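For anyone following the thread, the flag handling described in the patch can be sketched as a small standalone model. This is only an illustration and not the amfnd code: the struct NodeDirectorModel, its method names, and the helper su_oper_sent_after_failover() are hypothetical, and the deferral condition (defer while either flag is set) is an assumption inferred from the new log message in di.cc.

```cpp
#include <cassert>
#include <queue>
#include <string>

// Hypothetical, simplified model of the two amfnd flags discussed in the
// patch description. It is NOT the OpenSAF implementation.
struct NodeDirectorModel {
  bool is_avd_down = false;
  bool amfd_sync_required = false;
  std::queue<std::string> deferred;  // messages held back for the director
  int sent = 0;                      // messages handed to the director

  // Director went away (mds_down): per the description, both flags are set.
  void on_mds_down() {
    is_avd_down = true;
    amfd_sync_required = true;
  }

  // New active director is reachable (mds_up): only is_avd_down is cleared.
  void on_mds_up() { is_avd_down = false; }

  // su_pres received for an SU. With the patch, an NCS (middleware) SU
  // clears amfd_sync_required; `patched` toggles that behaviour here.
  void on_su_pres(bool is_ncs, bool patched) {
    if (patched && is_ncs) amfd_sync_required = false;
  }

  // Assumed deferral rule: queue the message while either flag is set.
  // Returns true if the message was sent, false if it was deferred.
  bool send_su_oper(const std::string& su) {
    if (is_avd_down || amfd_sync_required) {
      deferred.push(su);
      return false;
    }
    ++sent;
    return true;
  }
};

// Replays the failover sequence from the description and reports whether
// the su_oper message reaches the director.
bool su_oper_sent_after_failover(bool patched) {
  NodeDirectorModel nd;
  nd.on_mds_down();                          // SC1 goes down during sync
  nd.on_mds_up();                            // SC2 takes over active role
  nd.on_su_pres(/*is_ncs=*/true, patched);   // su_pres for a middleware SU
  return nd.send_su_oper("middleware-su");
}
```

In this model, su_oper_sent_after_failover(false) leaves the message in the deferred queue, which corresponds to the stuck node_up situation, while su_oper_sent_after_failover(true) lets it through.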

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
