Yes, ack for both patches. I assume you have tested the upgrade scenarios.


Thanks
-Nagu

> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 15 February 2017 08:52
> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync
> before standby AMFD comes up [#2162]
> 
> Hi Nagu,
> 
> The #2162 has two patches. I think your ack is for [PATCH 2 of 2] AMFND:
> Fix SC failover during headless sync before standby AMFD comes up [#2162].
> Does the other one ([PATCH 1 of 2] AMFD: Fix SC failover during headless
> sync at INIT_DONE state [#2162]) look ok?
> 
> Thanks,
> Minh
> On 14/02/17 20:40, Nagendra Kumar wrote:
> > Ack.
> > Tested the scenarios.
> >
> > Thanks
> > -Nagu
> >
> >> -----Original Message-----
> >> From: minh chau [mailto:minh.c...@dektech.com.au]
> >> Sent: 23 January 2017 16:24
> >> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >> gary....@dektech.com.au
> >> Cc: opensaf-devel@lists.sourceforge.net
> >> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >> sync before standby AMFD comes up [#2162]
> >>
> >> Hi Nagu,
> >>
> >> I am checking the logs now.
> >>
> >> Thanks, Minh
> >>
> >> On 23/01/17 17:47, Nagendra Kumar wrote:
> >>> The logs (Logs-tc.rar) are attached to the ticket.
> >>>
> >>> Thanks
> >>> -Nagu
> >>>
> >>>> -----Original Message-----
> >>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>> Sent: 16 January 2017 05:47
> >>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>> gary....@dektech.com.au
> >>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >>>> sync before standby AMFD comes up [#2162]
> >>>>
> >>>> Hi Nagu,
> >>>>
> >>>> I misunderstood your point, and now I get it.
> >>>> In my test it works as expected: SU2 becomes Act and there is no
> >>>> assignment for SU1. I guess that in your test the cluster
> >>>> initiation timer somehow has not been started on SC2 (new active), so
> >>>> there could be a missing case in the patch.
> >>>> Could you please share the trace with me?
> >>>>
> >>>> Thanks,
> >>>> Minh
> >>>>
> >>>> On 13/01/17 21:48, Nagendra Kumar wrote:
> >>>>> Hi Minh,
> >>>>>         Please check my response inlined with [Nagu].
> >>>>>
> >>>>> Thanks
> >>>>> -Nagu
> >>>>>> -----Original Message-----
> >>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>>> Sent: 13 January 2017 03:53
> >>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>>> gary....@dektech.com.au
> >>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during
> >>>>>> headless sync before standby AMFD comes up [#2162]
> >>>>>>
> >>>>>> Hi Nagu,
> >>>>>>
> >>>>>> Thanks for reviewing, please see comments inline.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Minh
> >>>>>>
> >>>>>> On 12/01/17 21:48, Nagendra Kumar wrote:
> >>>>>>> Hi Minh,
> >>>>>>>        Though I am not able to simulate the problem, I tested as below:
> >>>>>>> 1. Start SC1, SC2, PL-3 and PL-4. Configure SU1 on PL-3 as Act and
> >>>>>>> SU2 on PL-4 as Standby.
> >>>>>>> 2. Stop SC1 and SC2 and then stop PL-3.
> >>>>>>> 3. Start SC-1 and SC-2. When SC-2 prints Cold sync complete, stop
> >>>>>>> SC1. SC2 becomes Act.
> >>>>>> [M]: As SU1 is on PL3 and SU2 is on PL4, if PL-3 is stopped then only
> >>>>>> SU2 has an active assignment.
> >>>>> [Nagu]: PL-3 is stopped in step #2.
> >>>>>>> In this case, SC-2 contains both SU1(Act) and SU2(Standby) assignments.
> >>>>>>> Ideally, SU2 assignments should have been Act and there shouldn't be
> >>>>>>> SU1 assignment.
> >>>>>> [M]: This seems to be another test where SU1 and SU2 are hosted
> >>>>>> on SC2, then both SU1 and SU2 should get assignment
> >>>>> [Nagu]: I mean to say that the command 'amf-state siass' run on SC-1
> >>>>> displays both SU1 and SU2 assignments.
> >>>>>                    SU1 and SU2 are hosted on PL-3 and PL-4 respectively.
> >>>>> This is a test case similar to the one mentioned in the ticket?
> >>>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>>             saAmfSISUHAState=ACTIVE(1)
> >>>>>>>             saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>>
> >>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>>             saAmfSISUHAState=STANDBY(2)
> >>>>>>>             saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>>
> >>>>>>> Please check.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> -Nagu
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> >>>>>>>> Sent: 08 November 2016 08:53
> >>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> >>>>>>>> gary....@dektech.com.au; minh.c...@dektech.com.au
> >>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>> Subject: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >>>>>>>> sync before standby AMFD comes up [#2162]
> >>>>>>>>
> >>>>>>>>      osaf/services/saf/amf/amfnd/di.cc   |  7 +++++--
> >>>>>>>>      osaf/services/saf/amf/amfnd/susm.cc |  6 ++++++
> >>>>>>>>      2 files changed, 11 insertions(+), 2 deletions(-)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> This case of SC failover causes the new active AMFD to get stuck
> >>>>>>>> in node_up messages.
> >>>>>>>>
> >>>>>>>> Say the first active controller is SC1, which goes down during
> >>>>>>>> headless sync. The amfnd on SC2 then receives mds_down of AVD, and
> >>>>>>>> both is_avd_down and amfd_sync_required are set to true. When SC2
> >>>>>>>> takes over the active role, amfnd on SC2 receives mds_up, but only
> >>>>>>>> is_avd_down is set to false and the variable amfd_sync_required
> >>>>>>>> remains true.
> >>>>>>>> When amfnd-SC2 finishes initiating the middleware SU, it needs to
> >>>>>>>> send an su_oper message to AMFD, but the message is not sent out
> >>>>>>>> because amfd_sync_required is still true.
> >>>>>>>> In this SC failover scenario, amfd_sync_required needs to be
> >>>>>>>> set to false when amfnd on SC2 receives an su_pres message for
> >>>>>>>> middleware SUs.
> >>>>>>>> That means amfnd on the active controller does not need to wait for
> >>>>>>>> the set_leds message to be informed that cluster initiation is
> >>>>>>>> done before it can send su_oper messages to AMFD. This
> >>>>>>>> logic also aligns with the normal headless scenario, where amfnd on
> >>>>>>>> the active controller has amfd_sync_required initially marked as
> >>>>>>>> false because no middleware SUs are initiated. When
> >>>>>>>> amfd_sync_required is true, all middleware SUs on that amfnd
> >>>>>>>> were initiated and assigned before headless, thus amfnd needs to
> >>>>>>>> wait for cluster initiation after headless.
> >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> @@ -748,7 +748,8 @@ uint32_t avnd_di_oper_send(AVND_CB *cb,
> >>>>>>>>                      if (avnd_diq_rec_add(cb, &msg) == nullptr) {
> >>>>>>>>                              rc = NCSCC_RC_FAILURE;
> >>>>>>>>                      }
> >>>>>>>> -            LOG_NO("avnd_di_oper_send() deferred as AMF director is offline");
> >>>>>>>> +            LOG_NO("avnd_di_oper_send() deferred as AMF director is offline(%d),"
> >>>>>>>> +                    " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >>>>>>>>              } else {
> >>>>>>>>                      // We are in normal cluster, send msg to director
> >>>>>>>>                      msg.info.avd->msg_info.n2d_opr_state.msg_id = ++(cb->snd_msg_id);
> >>>>>>>> @@ -881,7 +882,9 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> >>>>>>>>                              rc = NCSCC_RC_FAILURE;
> >>>>>>>>                      }
> >>>>>>>>                      m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>> -            LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
> >>>>>>>> +                LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline(%d),"
> >>>>>>>> +                        " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >>>>>>>> +
> >>>>>>>>              } else {
> >>>>>>>>                      // We are in normal cluster, send msg to director
> >>>>>>>>                      msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> --- a/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> @@ -1345,6 +1345,12 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
> >>>>>>>>                                      goto done;
> >>>>>>>>                      }
> >>>>>>>>              } else { /* => instantiate the su */
> >>>>>>>> +            // Do not need to wait for headless sync if there is no application SUs
> >>>>>>>> +            // initiated. This is known because here we are receiving su_pres message
> >>>>>>>> +            // for NCS SUs
> >>>>>>>> +            if (su->is_ncs == true)
> >>>>>>>> +                    cb->amfd_sync_required = false;
> >>>>>>>> +
> >>>>>>>>                      AVND_EVT *evt_ir = 0;
> >>>>>>>>                      TRACE("Sending to Imm thread.");
> >>>>>>>>                      evt_ir = avnd_evt_create(cb, AVND_EVT_IR, 0, nullptr, &info->su_name, 0, 0);
> 
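
For readers following the thread, the flag handling described in the commit message of [PATCH 2 of 2] can be modelled roughly as below. This is only an illustrative sketch, not OpenSAF code: the struct and every function name are hypothetical stand-ins, only the two flag names (is_avd_down, amfd_sync_required) come from the patch, and the gating condition in can_send_su_oper() is an assumption inferred from the new log text ("offline(%d), or sync is required(%d)").

```cpp
// Simplified, hypothetical stand-in for the two AVND_CB flags named in the
// commit message. This is not the real OpenSAF structure.
struct NodeDirectorFlags {
  bool is_avd_down = false;
  bool amfd_sync_required = false;
};

// amfnd loses the director: both flags are raised (per the commit message).
void on_mds_down(NodeDirectorFlags& cb) {
  cb.is_avd_down = true;
  cb.amfd_sync_required = true;
}

// A director becomes reachable again: only is_avd_down is cleared.
void on_mds_up(NodeDirectorFlags& cb) { cb.is_avd_down = false; }

// su_pres (instantiate) handling. With the patch applied, an su_pres for a
// middleware (NCS) SU clears amfd_sync_required; without it, nothing changes.
void on_su_pres_instantiate(NodeDirectorFlags& cb, bool su_is_ncs,
                            bool patched) {
  if (patched && su_is_ncs) cb.amfd_sync_required = false;
}

// Assumed gating condition: su_oper goes out only when the director is up
// and no headless sync is pending; otherwise the message is deferred.
bool can_send_su_oper(const NodeDirectorFlags& cb) {
  return !cb.is_avd_down && !cb.amfd_sync_required;
}

// Replays the failover sequence from the commit message as seen by amfnd on
// SC2: mds_down, then mds_up, then su_pres for a middleware SU.
bool su_oper_sent_after_failover(bool patched) {
  NodeDirectorFlags cb;
  on_mds_down(cb);
  on_mds_up(cb);
  on_su_pres_instantiate(cb, /*su_is_ncs=*/true, patched);
  return can_send_su_oper(cb);
}
```

Calling su_oper_sent_after_failover(false) models the behaviour before the patch, where amfd_sync_required stays true and su_oper is deferred; calling it with true models the patched behaviour, where the su_pres for an NCS SU clears the flag.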

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
