Any update?

Thanks
-Nagu

> -----Original Message-----
> From: Nagendra Kumar
> Sent: 23 January 2017 12:18
> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 2 of 2] AMFND: Fix SC failover during headless
> sync before standby AMFD comes up [#2162]
>
> The logs (Logs-tc.rar) are attached in the ticket.
>
> Thanks
> -Nagu
>
> > -----Original Message-----
> > From: minh chau [mailto:minh.c...@dektech.com.au]
> > Sent: 16 January 2017 05:47
> > To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > gary....@dektech.com.au
> > Cc: opensaf-devel@lists.sourceforge.net
> > Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> > sync before standby AMFD comes up [#2162]
> >
> > Hi Nagu,
> >
> > I misunderstood your point, and now I get it.
> > In my test I see it works as expected - SU2 becomes Act and there is no
> > assignment for SU1. I guess in your test somehow the cluster
> > initiation timer has not been started on SC2 (new active); there could be
> > a missing case in the patch.
> > Could you please share the trace with me?
> >
> > Thanks,
> > Minh
> >
> > On 13/01/17 21:48, Nagendra Kumar wrote:
> > > Hi Minh,
> > > Please check my response inlined with [Nagu].
> > >
> > > Thanks
> > > -Nagu
> > >> -----Original Message-----
> > >> From: minh chau [mailto:minh.c...@dektech.com.au]
> > >> Sent: 13 January 2017 03:53
> > >> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> > >> gary....@dektech.com.au
> > >> Cc: opensaf-devel@lists.sourceforge.net
> > >> Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> > >> sync before standby AMFD comes up [#2162]
> > >>
> > >> Hi Nagu,
> > >>
> > >> Thanks for reviewing, please see comments inline.
> > >>
> > >> Thanks,
> > >> Minh
> > >>
> > >> On 12/01/17 21:48, Nagendra Kumar wrote:
> > >>> Hi Minh,
> > >>> Though I am not able to simulate the problem, I tested as below:
> > >>> 1. Start SC1, SC2, PL-3 and PL-4. Configure SU1 on PL-3 as Act and
> > >>>    SU2 on PL-4 as Standby.
> > >>> 2. Stop SC1 and SC2 and then stop PL-3.
> > >>> 3. Start SC-1 and SC-2. When SC-2 prints Cold sync complete, stop
> > >>>    SC1. SC2 becomes Act.
> > >> [M]: As SU1 is on PL3, SU2 is on PL4, and if PL-3 is stopped, then
> > >> only SU2 has active assignment
> > > [Nagu]: PL-3 is stopped in step #2.
> > >>> In this case, SC-2 contains both SU1(Act) and SU2(Standby) assignments.
> > >>> Ideally, SU2 assignments should have been Act and there shouldn't
> > >>> be SU1 assignment.
> > >> [M]: This seems to be another test where SU1 and SU2 are hosted on
> > >> SC2, then both SU1 and SU2 should get assignment
> > > [Nagu]: I mean to say command 'amf-state siass' run on SC-1 displays
> > > both SU1 and SU2 assignments.
> > > SU1 and SU2 are hosted on PL-3 and PL-4 respectively.
> > > This is a similar test case to the one mentioned in the ticket?
> > >>>
> > >>> safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> > >>>         saAmfSISUHAState=ACTIVE(1)
> > >>>         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > >>>
> > >>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
> > >>>         saAmfSISUHAState=STANDBY(2)
> > >>>         saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
> > >>>
> > >>> Please check.
> > >>>
> > >>> Thanks
> > >>> -Nagu
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> > >>>> Sent: 08 November 2016 08:53
> > >>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> > >>>> gary....@dektech.com.au; minh.c...@dektech.com.au
> > >>>> Cc: opensaf-devel@lists.sourceforge.net
> > >>>> Subject: [PATCH 2 of 2] AMFND: Fix SC failover during headless
> > >>>> sync before standby AMFD comes up [#2162]
> > >>>>
> > >>>>  osaf/services/saf/amf/amfnd/di.cc   |  7 +++++--
> > >>>>  osaf/services/saf/amf/amfnd/susm.cc |  6 ++++++
> > >>>>  2 files changed, 11 insertions(+), 2 deletions(-)
> > >>>>
> > >>>>
> > >>>> This case of SC failover causes the new active AMFD to get stuck in
> > >>>> node_up messages.
> > >>>>
> > >>>> Say the first active controller is SC1, which goes down during
> > >>>> headless sync. Therefore, the amfnd on SC2 receives mds_down of AVD,
> > >>>> and both is_avd_down and amfd_sync_required are set to true. When SC2
> > >>>> takes over the active role, amfnd on SC2 receives mds_up, but only
> > >>>> is_avd_down is set to false and the variable amfd_sync_required
> > >>>> remains true.
> > >>>> When amfnd-SC2 finishes initiating the middleware SUs, it needs to
> > >>>> send the su_oper message to AMFD, but it fails to send it out due to
> > >>>> amfd_sync_required.
> > >>>> In this scenario of SC failover, amfd_sync_required needs to be set
> > >>>> to false when amfnd on SC2 receives the su_pres message on middleware
> > >>>> SUs. That means amfnd on the active controller does not need to wait
> > >>>> for the set_leds message, to be informed that cluster initiation is
> > >>>> done, so that amfnd can send su_oper messages to AMFD. This logic also
> > >>>> aligns with the normal headless scenario, where amfnd on the active
> > >>>> controller has amfd_sync_required initially marked as false
> > >>>> because no middleware SUs are initiated. When amfd_sync_required
> > >>>> is true, that means all middleware SUs were initiated and
> > >>>> assigned before headless, thus amfnd needs to wait for cluster
> > >>>> initiation after headless.
> > >>>>
> > >>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> > >>>> --- a/osaf/services/saf/amf/amfnd/di.cc
> > >>>> +++ b/osaf/services/saf/amf/amfnd/di.cc
> > >>>> @@ -748,7 +748,8 @@ uint32_t avnd_di_oper_send(AVND_CB *cb,
> > >>>>              if (avnd_diq_rec_add(cb, &msg) == nullptr) {
> > >>>>                  rc = NCSCC_RC_FAILURE;
> > >>>>              }
> > >>>> -            LOG_NO("avnd_di_oper_send() deferred as AMF director is offline");
> > >>>> +            LOG_NO("avnd_di_oper_send() deferred as AMF director is offline(%d),"
> > >>>> +                    " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> > >>>>          } else {
> > >>>>              // We are in normal cluster, send msg to director
> > >>>>              msg.info.avd->msg_info.n2d_opr_state.msg_id = ++(cb->snd_msg_id);
> > >>>> @@ -881,7 +882,9 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> > >>>>              rc = NCSCC_RC_FAILURE;
> > >>>>          }
> > >>>>          m_AVND_SU_ALL_SI_RESET(su);
> > >>>> -        LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
> > >>>> +        LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline(%d),"
> > >>>> +                " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
> > >>>> +
> > >>>>      } else {
> > >>>>          // We are in normal cluster, send msg to director
> > >>>>          msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> > >>>> diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
> > >>>> --- a/osaf/services/saf/amf/amfnd/susm.cc
> > >>>> +++ b/osaf/services/saf/amf/amfnd/susm.cc
> > >>>> @@ -1345,6 +1345,12 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
> > >>>>                  goto done;
> > >>>>              }
> > >>>>          } else { /* => instantiate the su */
> > >>>> +            // Do not need to wait for headless sync if there is no application SUs
> > >>>> +            // initiated. This is known because here we are receiving su_pres message
> > >>>> +            // for NCS SUs
> > >>>> +            if (su->is_ncs == true)
> > >>>> +                cb->amfd_sync_required = false;
> > >>>> +
> > >>>>              AVND_EVT *evt_ir = 0;
> > >>>>              TRACE("Sending to Imm thread.");
> > >>>>              evt_ir = avnd_evt_create(cb, AVND_EVT_IR, 0, nullptr, &info->su_name, 0, 0);
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most engaging
> tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> Opensaf-devel mailing list
> Opensaf-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
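
For readers following the thread, the flag handling described in the quoted patch summary can be sketched as a small standalone model. This is only an illustration, not the amfnd code: the struct and function names (NodeDirectorFlags, on_mds_down, on_mds_up, on_su_pres, must_defer_send) are hypothetical, and the rule that a send is deferred when either flag is set is an assumption inferred from the new log text ("director is offline(%d), or sync is required(%d)").

```cpp
#include <cassert>

// Hypothetical, simplified model of the two amfnd flags named in the patch
// description. None of these names come from the OpenSAF sources.
struct NodeDirectorFlags {
  bool is_avd_down = false;         // the director (AMFD) is unreachable
  bool amfd_sync_required = false;  // amfnd is waiting for headless sync
};

// The active director goes down during headless sync: per the patch
// description, both flags end up true in this scenario.
inline void on_mds_down(NodeDirectorFlags &f) {
  f.is_avd_down = true;
  f.amfd_sync_required = true;
}

// A new active director becomes reachable: only is_avd_down is cleared,
// amfd_sync_required is deliberately left untouched.
inline void on_mds_up(NodeDirectorFlags &f) {
  f.is_avd_down = false;
}

// su_pres handling with the patch applied: a su_pres message for a
// middleware (NCS) SU clears the sync flag; application SUs do not.
inline void on_su_pres(NodeDirectorFlags &f, bool su_is_ncs) {
  if (su_is_ncs) {
    f.amfd_sync_required = false;
  }
}

// Assumed deferral rule: a message to the director is queued instead of
// sent while either flag is set.
inline bool must_defer_send(const NodeDirectorFlags &f) {
  return f.is_avd_down || f.amfd_sync_required;
}
```

Walking the failover sequence through this model: after on_mds_down followed by on_mds_up, must_defer_send still returns true, which corresponds to the stuck su_oper message described in the patch summary. Under the stated assumptions it only returns false once on_su_pres is called with su_is_ncs set to true.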