AMFD reboots the standby node when a STANDBY->ACTIVE failover is unsuccessful, 
but for some reason the "out of sync" case is handled differently: in this 
case AMFD returns without ordering a reboot.


~~~~
        uint32_t status = NCSCC_RC_FAILURE;
~~~~

[...]

~~~~
        if (AVD_STBY_OUT_OF_SYNC == cb->stby_sync_state) {
                LOG_ER("FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC");
                return NCSCC_RC_FAILURE;
        }

        if (nullptr == (my_node = avd_node_find_nodeid(cb->node_id_avd))) {
                LOG_ER("FAILOVER StandBy --> Active FAILED, node %x not found",
                       cb->node_id_avd);
                goto done;
        }

        if (nullptr == (failed_node = avd_node_find_nodeid(cb->node_id_avd_other))) {
                LOG_ER("FAILOVER StandBy --> Active FAILED, node %x not found",
                       cb->node_id_avd_other);
                goto done;
        }

        /* check the node state */
        if (my_node->node_state != AVD_AVND_STATE_PRESENT) {
                LOG_ER("FAILOVER StandBy --> Active FAILED, stdby not in good state");
                goto done;
        }
~~~~

[...]

~~~~
done:
        if (NCSCC_RC_SUCCESS != status)
                opensaf_reboot(my_node != nullptr ? my_node->node_info.nodeId : 0,
                                my_node != nullptr ? (char *)my_node->node_info.executionEnvironment.value : nullptr,
                                "FAILOVER failed");
~~~~
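
To illustrate the difference, here is a minimal, self-contained sketch of the control flow with the out-of-sync check routed through the same failure path as the other checks. This is not the actual AMFD code or a proposed patch: `Cb`, `RebootLog`, `failover_sketch` and the constant values are hypothetical stand-ins, and the reboot is only recorded, not performed.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the real AMFD types and constants (values are placeholders).
constexpr uint32_t NCSCC_RC_SUCCESS = 1;
constexpr uint32_t NCSCC_RC_FAILURE = 2;
enum SyncState { AVD_STBY_IN_SYNC, AVD_STBY_OUT_OF_SYNC };

// Reduced control block: only the two conditions this sketch looks at.
struct Cb {
  SyncState stby_sync_state;
  bool node_present;  // stands in for node_state == AVD_AVND_STATE_PRESENT
};

// Records reboot requests instead of rebooting (stub for opensaf_reboot()).
struct RebootLog {
  int count = 0;
  std::string reason;
};

uint32_t failover_sketch(const Cb& cb, RebootLog& log) {
  uint32_t status = NCSCC_RC_FAILURE;
  const char* reason = nullptr;

  if (cb.stby_sync_state == AVD_STBY_OUT_OF_SYNC) {
    // Unlike the early return in the excerpt above, fall through to the
    // common failure handling so that a reboot is ordered here as well.
    reason = "standby out of sync";
  } else if (!cb.node_present) {
    reason = "standby not in good state";
  } else {
    status = NCSCC_RC_SUCCESS;
  }

  // Equivalent of the "done:" label: every failure ends in a reboot request.
  if (status != NCSCC_RC_SUCCESS) {
    log.count++;
    log.reason = reason;
  }
  return status;
}
```

With this shape, calling `failover_sketch` with `stby_sync_state` set to `AVD_STBY_OUT_OF_SYNC` returns the failure code and records exactly one reboot request, the same as any other failed check.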


---

**[tickets:#1732] OUT_OF_SYNC (failed over) new active controller should go for immediate reboot**

**Status:** unassigned
**Milestone:** 5.0.RC1
**Created:** Wed Apr 06, 2016 11:02 AM UTC by Srikanth R
**Last Updated:** Wed Apr 06, 2016 11:02 AM UTC
**Owner:** nobody


Changeset : 7436 
Version : 5.0 FC
Setup : Two controllers


Issue :
  An out-of-sync (failed over) new active controller should go for immediate reboot.
  
  During failover, if the standby controller is OUT OF SYNC and cannot be promoted to active, the node should be rebooted immediately.
 
Apr  6 16:03:53 CONTROLLER-2 osafamfd[431]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
Apr  6 16:03:53 CONTROLLER-2 osafamfd[431]: ER avd_role_change role change failure
Apr  6 16:03:53 CONTROLLER-2 osafimmd[380]: WA IMMND DOWN on active controller 1 detected at standby immd!! 2. Possible failover
..
Apr  6 16:06:53 CONTROLLER-2 osafamfnd[441]: WA AMF director unexpectedly crashed
Apr  6 16:06:53 CONTROLLER-2 osafamfnd[441]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131599, SupervisionTime = 60
Apr  6 16:06:53 CONTROLLER-2 opensaf_reboot: Rebooting local node; timeout=60

This issue is fixed as part of #1334, but might still be observed because of #79.

