AMFD reboots the standby node when a STANDBY->ACTIVE failover is unsuccessful,
but for some reason the "out of sync" case is handled differently: it returns
early instead of jumping to `done`, so AMFD does not order a reboot.
~~~~
uint32_t status = NCSCC_RC_FAILURE;
~~~~
[...]
~~~~
if (AVD_STBY_OUT_OF_SYNC == cb->stby_sync_state) {
  LOG_ER("FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC");
  return NCSCC_RC_FAILURE;
}
if (nullptr == (my_node = avd_node_find_nodeid(cb->node_id_avd))) {
  LOG_ER("FAILOVER StandBy --> Active FAILED, node %x not found",
         cb->node_id_avd);
  goto done;
}
if (nullptr == (failed_node =
                    avd_node_find_nodeid(cb->node_id_avd_other))) {
  LOG_ER("FAILOVER StandBy --> Active FAILED, node %x not found",
         cb->node_id_avd_other);
  goto done;
}
/* check the node state */
if (my_node->node_state != AVD_AVND_STATE_PRESENT) {
  LOG_ER("FAILOVER StandBy --> Active FAILED, stdby not in good state");
  goto done;
}
~~~~
[...]
~~~~
done:
  if (NCSCC_RC_SUCCESS != status)
    opensaf_reboot(
        my_node != nullptr ? my_node->node_info.nodeId : 0,
        my_node != nullptr
            ? (char *)my_node->node_info.executionEnvironment.value
            : nullptr,
        "FAILOVER failed");
~~~~
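
The difference in control flow can be reproduced in isolation. The following is a minimal sketch, not AMFD code: `SyncState`, `RC_SUCCESS`, `RC_FAILURE`, `failover_early_return`, `failover_uniform`, and the `reboots` counter are hypothetical stand-ins, and the second variant only illustrates what routing the out-of-sync case through `done` would look like. It is not necessarily how #1334 addresses the problem.

~~~~cpp
#include <cstdint>

// Hypothetical stand-ins for the AMFD return codes and sync state.
constexpr uint32_t RC_SUCCESS = 1;
constexpr uint32_t RC_FAILURE = 2;
enum class SyncState { InSync, OutOfSync };

// Mirrors the quoted excerpt: the out-of-sync check returns directly,
// so the reboot at `done` is never reached for that case.
// `reboots` counts how many times a reboot would have been ordered.
uint32_t failover_early_return(SyncState sync, bool node_found, int &reboots) {
  uint32_t status = RC_FAILURE;
  if (sync == SyncState::OutOfSync) {
    return RC_FAILURE;  // bypasses `done`, no reboot is ordered
  }
  if (!node_found) {
    goto done;
  }
  status = RC_SUCCESS;
done:
  if (status != RC_SUCCESS) {
    ++reboots;  // stands in for opensaf_reboot(...)
  }
  return status;
}

// Illustrative variant (an assumption, not the actual fix): every failure,
// including out of sync, goes through `done` and orders a reboot.
uint32_t failover_uniform(SyncState sync, bool node_found, int &reboots) {
  uint32_t status = RC_FAILURE;
  if (sync == SyncState::OutOfSync) {
    goto done;
  }
  if (!node_found) {
    goto done;
  }
  status = RC_SUCCESS;
done:
  if (status != RC_SUCCESS) {
    ++reboots;  // stands in for opensaf_reboot(...)
  }
  return status;
}
~~~~

With an out-of-sync standby, the first variant returns failure and leaves the counter untouched, while the second returns the same failure and increments it once. For the "node not found" case both variants behave the same.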
---
**[tickets:#1732] OUT_OF_SYNC (failed over) new active controller should go for immediate reboot**
**Status:** unassigned
**Milestone:** 5.0.RC1
**Created:** Wed Apr 06, 2016 11:02 AM UTC by Srikanth R
**Last Updated:** Wed Apr 06, 2016 11:02 AM UTC
**Owner:** nobody
Changeset : 7436
Version : 5.0 FC
Setup : Two controllers
Issue :
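A new active controller that is out of sync after a failover should reboot
immediately. During failover, if the standby controller is OUT OF SYNC and
cannot be promoted to active, the node should be rebooted immediately. In the
log below, the reboot happens only three minutes later (16:06:53), when
osafamfnd reports that the AMF director is down.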
Apr 6 16:03:53 CONTROLLER-2 osafamfd[431]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
Apr 6 16:03:53 CONTROLLER-2 osafamfd[431]: ER avd_role_change role change failure
Apr 6 16:03:53 CONTROLLER-2 osafimmd[380]: WA IMMND DOWN on active controller 1 detected at standby immd!! 2. Possible failover
..
Apr 6 16:06:53 CONTROLLER-2 osafamfnd[441]: WA AMF director unexpectedly crashed
Apr 6 16:06:53 CONTROLLER-2 osafamfnd[441]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, OwnNodeId = 131599, SupervisionTime = 60
Apr 6 16:06:53 CONTROLLER-2 opensaf_reboot: Rebooting local node; timeout=60
This issue is fixed as part of #1334, but it might still be observed because of #79.