Yes, ack for both the patches. I assume you would have tested upgrade scenarios.
Thanks
-Nagu

> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 15 February 2017 08:52
> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync
> before standby AMFD comes up [#2162]
>
> Hi Nagu,
>
> The #2162 has two patches. I think your ack is for [PATCH 2 of 2] AMFND:
> Fix SC failover during headless sync before standby AMFD comes up [#2162].
> Does the other one ([PATCH 1 of 2] AMFD: Fix SC failover during headless
> sync at INIT_DONE state [#2162]) look ok?
>
> Thanks,
> Minh
> On 14/02/17 20:40, Nagendra Kumar wrote:
> > Ack.
> > Tested the scenarios.
> >
> > Thanks
> > -Nagu
> >
> >> -----Original Message-----
> >> From: minh chau [mailto:minh.c...@dektech.com.au]
> >> Sent: 23 January 2017 16:24
> >> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >> gary....@dektech.com.au
> >> Cc: opensaf-devel@lists.sourceforge.net
> >> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >> sync before standby AMFD comes up [#2162]
> >>
> >> Hi Nagu,
> >>
> >> I am checking the logs now.
> >>
> >> Thanks, Minh
> >>
> >> On 23/01/17 17:47, Nagendra Kumar wrote:
> >>> The logs (Logs-tc.rar) attached in the ticket.
> >>>
> >>> Thanks
> >>> -Nagu
> >>>
> >>>> -----Original Message-----
> >>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>> Sent: 16 January 2017 05:47
> >>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>> gary....@dektech.com.au
> >>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >>>> sync before standby AMFD comes up [#2162]
> >>>>
> >>>> Hi Nagu,
> >>>>
> >>>> I misunderstood your point, and now I get it.
> >>>> In my test I see it works as expected - SU2 becomes Act and there is
> >>>> no assignment for SU1. I guess in your test somehow the cluster
> >>>> initiation timer has not been started on SC2 (new active), so there
> >>>> could be a missing case in the patch.
> >>>> Could you please share the trace with me?
> >>>>
> >>>> Thanks,
> >>>> Minh
> >>>>
> >>>> On 13/01/17 21:48, Nagendra Kumar wrote:
> >>>>> Hi Minh,
> >>>>> Please check my response inlined with [Nagu].
> >>>>>
> >>>>> Thanks
> >>>>> -Nagu
> >>>>>> -----Original Message-----
> >>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>>> Sent: 13 January 2017 03:53
> >>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>>> gary....@dektech.com.au
> >>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during
> >>>>>> headless sync before standby AMFD comes up [#2162]
> >>>>>>
> >>>>>> Hi Nagu,
> >>>>>>
> >>>>>> Thanks for reviewing, please see comments inline.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Minh
> >>>>>>
> >>>>>> On 12/01/17 21:48, Nagendra Kumar wrote:
> >>>>>>> Hi Minh,
> >>>>>>> Though I am not able to simulate the problem, I tested as below:
> >>>>>>> 1. Start SC1, SC2, PL-3 and PL-4. Configure SU1 on PL-3 as Act
> >>>>>>> and SU2 on PL-4 as Standby.
> >>>>>>> 2. Stop SC1 and SC2 and then stop PL-3.
> >>>>>>> 3. Start SC-1 and SC-2. When SC-2 prints Cold sync complete,
> >>>>>>> stop SC1. SC2 becomes Act.
> >>>>>> [M]: As SU1 is on PL3, SU2 is on PL4, and if PL-3 is stopped,
> >>>>>> then only SU2 has active assignment
> >>>>> [Nagu]: PL-3 is stopped in step #2.
> >>>>>>> In this case, SC-2 contains both SU1(Act) and SU2(Standby)
> >>>>>>> assignments.
> >>>>>>> Ideally, SU2 assignments should have been Act and there
> >>>>>>> shouldn't be SU1 assignment.
> >>>>>> [M]: This seems to be another test where SU1 and SU2 are hosted
> >>>>>> on SC2, then both SU1 and SU2 should get assignment
> >>>>> [Nagu]: I mean to say command 'amf-state siass' run on SC-1
> >>>>> displays both SU1 and SU2 assignments.
> >>>>> SU1 and SU2 are hosted on PL-3 and PL-4 respectively.
> >>>>> This is similar test case, which is mentioned in the ticket?
> >>>>>>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>>
> >>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> >>>>>>> saAmfSISUHAState=STANDBY(2)
> >>>>>>> saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> >>>>>>>
> >>>>>>> Please check.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> -Nagu
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> >>>>>>>> Sent: 08 November 2016 08:53
> >>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> >>>>>>>> gary....@dektech.com.au; minh.c...@dektech.com.au
> >>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>> Subject: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> >>>>>>>> sync before standby AMFD comes up [#2162]
> >>>>>>>>
> >>>>>>>>  osaf/services/saf/amf/amfnd/di.cc   |  7 +++++--
> >>>>>>>>  osaf/services/saf/amf/amfnd/susm.cc |  6 ++++++
> >>>>>>>>  2 files changed, 11 insertions(+), 2 deletions(-)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> This case of SC failover causes new active AMFD getting stuck
> >>>>>>>> in node_up messages
> >>>>>>>>
> >>>>>>>> Say first active controller is SC1, which goes down during
> >>>>>>>> headless sync.
> >>>>>>>> Therefore, the amfnd on SC2 receives mds_down of AVD, then both
> >>>>>>>> is_avd_down and amfd_sync_required are set to true.
> >>>>>>>> When SC2 takes over active role, amfnd on SC2 receives mds_up, but
> >>>>>>>> only is_avd_down is set to false and the variable
> >>>>>>>> amfd_sync_required remains true.
> >>>>>>>> When amfnd-SC2 finishes initiating middleware SU, it needs to
> >>>>>>>> send su_oper message to AMFD, but it fails to send it out due to
> >>>>>>>> amfd_sync_required.
> >>>>>>>> In this scenario of SC failover, amfd_sync_required needs to be
> >>>>>>>> set to false when amfnd on SC2 receives su_pres message on
> >>>>>>>> middleware SUs.
> >>>>>>>> That means amfnd on active controller does not need to wait for
> >>>>>>>> set_leds message, to be informed that cluster initiation is
> >>>>>>>> done, so that amfnd can send su_oper messages to AMFD. This
> >>>>>>>> logic also aligns with normal headless scenario, where amfnd on
> >>>>>>>> active controller has amfd_sync_required initially marked as
> >>>>>>>> false because no middleware SUs are initiated. When
> >>>>>>>> amfd_sync_required is true, that means amfnd had all middleware
> >>>>>>>> SUs initiated and assigned before headless, thus amfnd needs to
> >>>>>>>> wait for cluster initiation after headless.
> >>>>>>>>
> >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>> @@ -748,7 +748,8 @@ uint32_t avnd_di_oper_send(AVND_CB *cb,
> >>>>>>>>      if (avnd_diq_rec_add(cb, &msg) == nullptr) {
> >>>>>>>>        rc = NCSCC_RC_FAILURE;
> >>>>>>>>      }
> >>>>>>>> -    LOG_NO("avnd_di_oper_send() deferred as AMF director is offline");
> >>>>>>>> +    LOG_NO("avnd_di_oper_send() deferred as AMF director is offline(%d),"
> >>>>>>>> +           " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >>>>>>>>    } else {
> >>>>>>>>      // We are in normal cluster, send msg to director
> >>>>>>>>      msg.info.avd->msg_info.n2d_opr_state.msg_id = ++(cb->snd_msg_id);
> >>>>>>>> @@ -881,7 +882,9 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> >>>>>>>>        rc = NCSCC_RC_FAILURE;
> >>>>>>>>      }
> >>>>>>>>      m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>> -    LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
> >>>>>>>> +    LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline(%d),"
> >>>>>>>> +           " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> >>>>>>>> +
> >>>>>>>>    } else {
> >>>>>>>>      // We are in normal cluster, send msg to director
> >>>>>>>>      msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> --- a/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/susm.cc
> >>>>>>>> @@ -1345,6 +1345,12 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
> >>>>>>>>        goto done;
> >>>>>>>>      }
> >>>>>>>>    } else { /* => instantiate the su */
> >>>>>>>> +    // Do not need to wait for headless sync if there is no application SUs
> >>>>>>>> +    // initiated. This is known because here we are receiving su_pres message
> >>>>>>>> +    // for NCS SUs
> >>>>>>>> +    if (su->is_ncs == true)
> >>>>>>>> +      cb->amfd_sync_required = false;
> >>>>>>>> +
> >>>>>>>>      AVND_EVT *evt_ir = 0;
> >>>>>>>>      TRACE("Sending to Imm thread.");
> >>>>>>>>      evt_ir = avnd_evt_create(cb, AVND_EVT_IR, 0, nullptr, &info->su_name, 0, 0);
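
For anyone following the thread, the flag handling described in the patch can be sketched as a small standalone model. This is only an illustration, not the amfnd code: NodeDirectorState and the four helper functions are hypothetical names, the two booleans stand in for cb->is_avd_down and cb->amfd_sync_required, and the "defer if either flag is set" condition is an assumption inferred from the new log message ("director is offline(%d), or sync is required(%d)").

```cpp
#include <cassert>

// Hypothetical stand-in for the two AVND_CB flags discussed in the patch.
struct NodeDirectorState {
  bool is_avd_down = false;
  bool amfd_sync_required = false;
};

// amfnd receives mds_down of the director: per the patch description,
// both flags are set to true.
void on_avd_mds_down(NodeDirectorState& cb) {
  cb.is_avd_down = true;
  cb.amfd_sync_required = true;
}

// amfnd receives mds_up of the director: only is_avd_down is cleared,
// amfd_sync_required keeps its value.
void on_avd_mds_up(NodeDirectorState& cb) {
  cb.is_avd_down = false;
}

// amfnd receives su_pres (instantiate) for an SU. With the patch applied,
// a middleware (NCS) SU clears the pending sync requirement.
void on_su_pres_instantiate(NodeDirectorState& cb, bool su_is_ncs) {
  if (su_is_ncs) {
    cb.amfd_sync_required = false;
  }
}

// Assumed send condition: a message to the director is deferred while the
// director is down or a headless sync is still required.
bool can_send_to_director(const NodeDirectorState& cb) {
  return !cb.is_avd_down && !cb.amfd_sync_required;
}
```

In this model, calling on_avd_mds_down() and then on_avd_mds_up() leaves can_send_to_director() false, which corresponds to the stuck su_oper message in the failover case. A following on_su_pres_instantiate(cb, true) makes it true again, while the same call with su_is_ncs set to false leaves the sync requirement in place.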