Hi Minh,

Good catch! Yes, please push. Note, though, that we have already documented
in the Compliance Table that "Before the timer expiry, failover and
switchover are not supported."

 

Thanks

-Nagu

 

From: minh chau [mailto:minh.c...@dektech.com.au] 
Sent: 16 February 2017 07:36
To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya; 
gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync before 
standby AMFD comes up [#2162]

 

Hi Nagu,

Thanks for the reminder. There is one change in the patch that could also
affect upgrade:

+                // The cb->init_state must be AVD_INIT_DONE or AVD_APP_STATE
+                // If AVD_INIT_DONE, there was a SC failover during cluster
+                // instantiation phase in cluster (after all NCS SU is assigned)
+                // If AVD_APP_STATE, this should be come from 2N-MW SI swap
+                if (cb->init_state >= AVD_INIT_DONE) {
+                    if (cluster_su_instantiation_done(cb, nullptr) == true) {
+                        cluster_startup_expiry_event_generate(cb);
+                    } else {
+                        m_AVD_CLINIT_TMR_START(cb);
+                    }
+                }

So I would like to restrict it to AVD_INIT_DONE only, so that it looks like this:

+                // The cb->init_state must be AVD_INIT_DONE or AVD_APP_STATE
+                // If AVD_INIT_DONE, there was a SC failover during cluster
+                // instantiation phase in cluster (after all NCS SU is assigned)
+                if (cb->init_state == AVD_INIT_DONE) {
+                    if (cluster_su_instantiation_done(cb, nullptr) == true) {
+                        cluster_startup_expiry_event_generate(cb);
+                    } else {
+                        m_AVD_CLINIT_TMR_START(cb);
+                    }
+                }
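
To make the difference between the two checks concrete, here is a minimal standalone sketch. It is not the AMFD code: the enum values, the Action type and the decide_* function names are hypothetical, and only the ordering AVD_INIT_DONE < AVD_APP_STATE is taken from the comment in the patch. The two outcomes stand in for cluster_startup_expiry_event_generate() and m_AVD_CLINIT_TMR_START().

```cpp
#include <cassert>

// Hypothetical stand-in for the AMFD init states. Only the relative order
// AVD_INIT_DONE < AVD_APP_STATE comes from the patch comment; the numeric
// values and the earlier states are assumptions.
enum InitState {
  AVD_INIT_BGN = 1,
  AVD_CFG_READY,
  AVD_CFG_DONE,
  AVD_INIT_DONE,
  AVD_APP_STATE
};

// Hypothetical result type: what the new active director would do.
enum class Action { kNone, kGenerateExpiryEvent, kStartClusterInitTimer };

// First version of the patch: acts for AVD_INIT_DONE and AVD_APP_STATE.
Action decide_original(InitState state, bool all_su_instantiated) {
  if (state >= AVD_INIT_DONE) {
    return all_su_instantiated ? Action::kGenerateExpiryEvent
                               : Action::kStartClusterInitTimer;
  }
  return Action::kNone;
}

// Proposed version: acts for AVD_INIT_DONE only.
Action decide_proposed(InitState state, bool all_su_instantiated) {
  if (state == AVD_INIT_DONE) {
    return all_su_instantiated ? Action::kGenerateExpiryEvent
                               : Action::kStartClusterInitTimer;
  }
  return Action::kNone;
}

// Small self-check showing where the two versions agree and differ.
void check_examples() {
  // Same behaviour while the cluster is still in AVD_INIT_DONE.
  assert(decide_original(AVD_INIT_DONE, true) ==
         decide_proposed(AVD_INIT_DONE, true));
  assert(decide_proposed(AVD_INIT_DONE, false) ==
         Action::kStartClusterInitTimer);
  // Different behaviour in AVD_APP_STATE (for example a 2N-MW SI swap).
  assert(decide_original(AVD_APP_STATE, true) ==
         Action::kGenerateExpiryEvent);
  assert(decide_proposed(AVD_APP_STATE, true) == Action::kNone);
}
```

In this sketch the only state where the two versions disagree is AVD_APP_STATE, which is the case the proposed change is meant to leave untouched.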

If you agree, I can push the patches with new change.

Thanks,
Minh

On 15/02/17 15:13, Nagendra Kumar wrote:

Yes, ack for both patches. I assume you have tested the upgrade scenarios.
 
 
Thanks
-Nagu
 

-----Original Message-----
From: minh chau [mailto:minh.c...@dektech.com.au]
Sent: 15 February 2017 08:52
To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless sync
before standby AMFD comes up [#2162]
 
Hi Nagu,
 
The #2162 has two patches. I think your ack is for [PATCH 2 of 2] AMFND:
Fix SC failover during headless sync before standby AMFD comes up [#2162].
Does the other one ([PATCH 1 of 2] AMFD: Fix SC failover during headless
sync at INIT_DONE state [#2162]) look ok?
 
Thanks,
Minh
On 14/02/17 20:40, Nagendra Kumar wrote:

Ack.
Tested the scenarios.
 
Thanks
-Nagu
 

-----Original Message-----
From: minh chau [mailto:minh.c...@dektech.com.au]
Sent: 23 January 2017 16:24
To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
sync before standby AMFD comes up [#2162]
 
Hi Nagu,
 
I am checking the logs now.
 
Thanks, Minh
 
On 23/01/17 17:47, Nagendra Kumar wrote:

The logs (Logs-tc.rar) are attached to the ticket.
 
Thanks
-Nagu
 

-----Original Message-----
From: minh chau [mailto:minh.c...@dektech.com.au]
Sent: 16 January 2017 05:47
To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during headless
sync before standby AMFD comes up [#2162]
 
Hi Nagu,
 
I misunderstood your point, and now I get it.
In my test it works as expected: SU2 becomes Act and there is no
assignment for SU1. I guess that in your test the cluster
initiation timer was somehow not started on SC2 (the new active), so
there could be a missing case in the patch.

Could you please share the trace with me?
 
Thanks,
Minh
 
On 13/01/17 21:48, Nagendra Kumar wrote:

Hi Minh,
   Please check my response inlined with [Nagu].
 
Thanks
-Nagu

-----Original Message-----
From: minh chau [mailto:minh.c...@dektech.com.au]
Sent: 13 January 2017 03:53
To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
gary....@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 2 of 2] AMFND: Fix SC failover during
headless sync before standby AMFD comes up [#2162]
 
Hi Nagu,
 
Thanks for reviewing, please see comments inline.
 
Thanks,
Minh
 
On 12/01/17 21:48, Nagendra Kumar wrote:

Hi Minh,
      Though I am not able to simulate the problem, I tested as below:

1. Start SC1, SC2, PL-3 and PL-4. Configure SU1 on PL-3 as Act and SU2 on
PL-4 as Standby.
2. Stop SC1 and SC2 and then stop PL-3.
3. Start SC-1 and SC-2. When SC-2 prints Cold sync complete, stop SC1. SC2
becomes Act.
[M]: As SU1 is on PL3 and SU2 is on PL4, if PL-3 is stopped, then only
SU2 has an active assignment.

[Nagu]: PL-3 is stopped in step #2.

In this case, SC-2 contains both SU1(Act) and SU2(Standby) assignments.
Ideally, the SU2 assignment should have been Act and there shouldn't be an
SU1 assignment.
[M]: This seems to be another test where SU1 and SU2 are hosted
on SC2; in that case both SU1 and SU2 should get an assignment.

[Nagu]: I mean to say that the command 'amf-state siass' run on SC-1
displays both SU1 and SU2 assignments.
SU1 and SU2 are hosted on PL-3 and PL-4 respectively.
Is this similar to the test case mentioned in the ticket?

 

safSISU=safSu=SU1\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
            saAmfSISUHAState=ACTIVE(1)
            saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)

safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo,safApp=AmfDemo1
            saAmfSISUHAState=STANDBY(2)
            saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1)
 
Please check.
 
Thanks
-Nagu
 

-----Original Message-----
From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
Sent: 08 November 2016 08:53
To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
gary....@dektech.com.au; minh.c...@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: [PATCH 2 of 2] AMFND: Fix SC failover during headless
sync before standby AMFD comes up [#2162]
 
     osaf/services/saf/amf/amfnd/di.cc   |  7 +++++--
     osaf/services/saf/amf/amfnd/susm.cc |  6 ++++++
     2 files changed, 11 insertions(+), 2 deletions(-)
 
 
This case of SC failover causes the new active AMFD to get stuck
in node_up messages.

Say the first active controller is SC1, which goes down during headless
sync. The amfnd on SC2 therefore receives mds_down of AVD, and both
is_avd_down and amfd_sync_required are set to true. When SC2
takes over the active role, amfnd on SC2 receives mds_up, but only
is_avd_down is set to false; the variable amfd_sync_required remains
true.

When amfnd-SC2 finishes initiating the middleware SU, it needs to
send an su_oper message to AMFD, but the message is not sent out
because of amfd_sync_required.

In this scenario of SC failover, amfd_sync_required needs to be
set to false when amfnd on SC2 receives an su_pres message for the
middleware SUs.

That means amfnd on the active controller does not need to wait for the
set_leds message (which tells it that cluster initiation is done)
before it can send su_oper messages to AMFD. This
logic also aligns with the normal headless scenario, where amfnd on the
active controller has amfd_sync_required initially marked as
false because no middleware SUs are initiated. When
amfd_sync_required is true, all middleware SUs
were initiated and assigned before headless, so amfnd needs to
wait for cluster initiation after headless.
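
The flag handling described above can be sketched as a small standalone model. This is an illustration only, not the amfnd code: NodeDirectorSketch, its methods and the with_fix switch are hypothetical names, and the deferral condition (director offline or sync required) is inferred from the new log text in the patch.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical, simplified stand-in for the amfnd control block.
struct NodeDirectorSketch {
  bool is_avd_down = false;
  bool amfd_sync_required = false;
  std::deque<std::string> deferred;  // messages queued instead of sent
  std::deque<std::string> sent;      // messages delivered to AMFD

  // The active AMFD went away: both flags are raised.
  void on_mds_down() {
    is_avd_down = true;
    amfd_sync_required = true;
  }

  // A new active AMFD is reachable: only is_avd_down is cleared.
  void on_mds_up() { is_avd_down = false; }

  // su_pres received from AMFD. With the fix, an su_pres for an NCS
  // (middleware) SU also clears amfd_sync_required.
  void on_su_pres(bool su_is_ncs, bool with_fix) {
    if (with_fix && su_is_ncs) amfd_sync_required = false;
  }

  // Returns true if the message was sent, false if it was deferred.
  bool send_su_oper(const std::string& msg) {
    if (is_avd_down || amfd_sync_required) {
      deferred.push_back(msg);
      return false;
    }
    sent.push_back(msg);
    return true;
  }
};

// Replays the failover sequence from the commit message and reports
// whether the su_oper message reaches AMFD.
bool failover_scenario(bool with_fix) {
  NodeDirectorSketch nd;
  nd.on_mds_down();  // SC1 dies during headless sync
  assert(nd.is_avd_down && nd.amfd_sync_required);
  nd.on_mds_up();    // SC2 takes over the active role
  nd.on_su_pres(/*su_is_ncs=*/true, with_fix);
  return nd.send_su_oper("su_oper");
}
```

In this model failover_scenario(false) leaves the su_oper message in the deferred queue, which corresponds to the stuck node_up case, while failover_scenario(true) sends it.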

diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
--- a/osaf/services/saf/amf/amfnd/di.cc
+++ b/osaf/services/saf/amf/amfnd/di.cc
@@ -748,7 +748,8 @@ uint32_t avnd_di_oper_send(AVND_CB *cb,
               if (avnd_diq_rec_add(cb, &msg) == nullptr) {
                    rc = NCSCC_RC_FAILURE;
               }
-          LOG_NO("avnd_di_oper_send() deferred as AMF director is offline");
+          LOG_NO("avnd_di_oper_send() deferred as AMF director is offline(%d),"
+               " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
          } else {
               // We are in normal cluster, send msg to director
               msg.info.avd->msg_info.n2d_opr_state.msg_id = ++(cb->snd_msg_id);
@@ -881,7 +882,9 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
                       rc = NCSCC_RC_FAILURE;
                  }
                  m_AVND_SU_ALL_SI_RESET(su);
-             LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
+                LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline(%d),"
+                        " or sync is required(%d)", cb->is_avd_down, cb->amfd_sync_required);
+
             } else {
                  // We are in normal cluster, send msg to director
                  msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
diff --git a/osaf/services/saf/amf/amfnd/susm.cc b/osaf/services/saf/amf/amfnd/susm.cc
--- a/osaf/services/saf/amf/amfnd/susm.cc
+++ b/osaf/services/saf/amf/amfnd/susm.cc
@@ -1345,6 +1345,12 @@ uint32_t avnd_evt_avd_su_pres_evh(AVND_C
                         goto done;
               }
          } else { /* => instantiate the su */
+          // Do not need to wait for headless sync if there is no application SUs
+          // initiated. This is known because here we are receiving su_pres message
+          // for NCS SUs
+          if (su->is_ncs == true)
+               cb->amfd_sync_required = false;
+
               AVND_EVT *evt_ir = 0;
               TRACE("Sending to Imm thread.");
               evt_ir = avnd_evt_create(cb, AVND_EVT_IR, 0, nullptr, &info->su_name, 0, 0);

 

 

 
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
