Hi Hans

I'm trying to preserve the existing behaviour when split-brain prevention is not enabled. If it is enabled, then we can skip the reboot.

I'm not sure about removing the if statement entirely, as it seems to be useful, i.e. it picks up that we have an election tie.
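
To make the intent concrete, here is a minimal sketch of the decision, not the actual fmd code. FmState, Role, IsElectionTie and ShouldRebootNewActive are hypothetical names used only for illustration; the only parts taken from the patch are the role / csi_assigned condition and the check on whether split-brain prevention is enabled.

```cpp
// Hypothetical stand-ins for the relevant fields of FM_CB.
enum class Role { kActive, kStandby };

struct FmState {
  Role role;
  bool csi_assigned;
};

// Mirrors the existing if statement: we consider ourselves active but
// have no CSI assigned when the peer's svc-up arrives, i.e. a tied election.
constexpr bool IsElectionTie(const FmState& s) {
  return s.role == Role::kActive && !s.csi_assigned;
}

// Reboot the new active only when a tie is detected AND split-brain
// prevention is disabled. When it is enabled, the old active is assumed
// to be fenced or already self-rebooting, so the new active stays up.
constexpr bool ShouldRebootNewActive(const FmState& s,
                                     bool split_brain_prevention_enabled) {
  return IsElectionTie(s) && !split_brain_prevention_enabled;
}

// Compile-time checks of the two cases discussed in this thread.
static_assert(ShouldRebootNewActive(FmState{Role::kActive, false}, false),
              "tie without split-brain prevention: keep the existing reboot");
static_assert(!ShouldRebootNewActive(FmState{Role::kActive, false}, true),
              "tie with split-brain prevention: skip the reboot");
```

Keeping the tie detection separate from the reboot decision is why I would rather not drop the if statement: the check still tells us a tie happened, even when we choose not to act on it.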

I agree that we should look at removing the etcd lock on a graceful shutdown. Currently, the lock is removed only by the new active or during an SI swap. Can I create a new ticket for that?

Thanks

Gary


On 13/03/18 20:46, Hans Nordebäck wrote:
Hi Gary,

a question: this patch builds on logic introduced in ticket #2151 to handle the case when a controller has not been completely stopped and the other controller is started. The if statement

if ((fm_cb->role == PCS_RDA_ACTIVE) && (fm_cb->csi_assigned == false))

could perhaps be removed (and more code related to #2151?), and the etcd lock instead released as the last operation in opensafd stop?

/HansN

On 03/09/2018 06:57 AM, Gary Lee wrote:
If we have a 'tied election' and split-brain prevention is enabled,
then the 'old active' is fenced, or the 'old active' will self-reboot
when it is notified a new node is active.

We need to disable this redundant check in fmd. Otherwise, the 'new active'
will also reboot, along with the 'old active'.
---
  src/fm/fmd/fm_main.cc | 19 ++++++++++++-------
  1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/fm/fmd/fm_main.cc b/src/fm/fmd/fm_main.cc
index 1244c2347..73c9b9ccd 100644
--- a/src/fm/fmd/fm_main.cc
+++ b/src/fm/fmd/fm_main.cc
@@ -600,13 +600,18 @@ static void fm_mbx_msg_handler(FM_CB *fm_cb, FM_EVT *fm_mbx_evt) {
         * progress of shutdown (i.e., amfd/immd is still alive).
         */
        if ((fm_cb->role == PCS_RDA_ACTIVE) && (fm_cb->csi_assigned == false)) {
-        LOG_WA(
-            "Two active controllers observed in a cluster, newActive: %x and "
-            "old-Active: %x",
-            unsigned(fm_cb->node_id), unsigned(fm_cb->peer_node_id));
-        opensaf_reboot(0, NULL,
-                       "Received svc up from peer node (old-active is not "
-                       "fully DOWN), hence rebooting the new Active");
+        Consensus consensus_service;
+        if (consensus_service.IsEnabled() == false) {
+          // If split-brain prevention is enabled, then the 'old active' has
+          // already initiated a self-reboot, or it is fenced.
+          LOG_WA(
+              "Two active controllers observed in a cluster, newActive: %x and "
+              "old-Active: %x",
+              unsigned(fm_cb->node_id), unsigned(fm_cb->peer_node_id));
+          opensaf_reboot(0, NULL,
+                         "Received svc up from peer node (old-active is not "
+                         "fully DOWN), hence rebooting the new Active");
+        }
        }
          /* Peer fm came up so sending ee_id of this node */


