- **status**: review --> fixed
- **Comment**:

commit 35d44ff686df8c4f15b372581a00d7c1a7c734a6
Author: Gary Lee <[email protected]>
Date:   Sat Nov 3 07:43:55 2018 +0000

    amfd: reset snd_msg_id in LostFound state [#2952]
    
    If a PL rejoins the main network partition before the node failover timer expires,
    it is told to reboot by AMFD. AMFND thinks it has become headless and
    resets rcv_msg_id to 0, and shows this when it receives the reboot msg from AMFD:
    
    Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: Message ID mismatch, rec xx, expected 1, OwnNodeId = xx, SupervisionTime = 60
    
    We can avoid this by resetting snd_msg_id for this PL in AMFD in state LostFound,
    before the reboot msg is sent.




---

**[tickets:#2952] amfd: reset message ID in LostFound state**

**Status:** fixed
**Milestone:** 5.18.12
**Created:** Thu Nov 01, 2018 09:06 PM UTC by Gary Lee
**Last Updated:** Fri Nov 02, 2018 05:35 AM UTC
**Owner:** Gary Lee


[#2918] adds support for delaying node failover due to network disturbances.

If a PL rejoins the main network partition before the node failover timer
expires, AMFD orders it to reboot. By then AMFND believes it has become
headless and has reset rcv_msg_id to 0, so it logs the following when it
receives the reboot msg from AMFD:

`Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: Message ID mismatch, rec xx, expected 1, OwnNodeId = xx, SupervisionTime = 60`

We can avoid this by resetting snd_msg_id for this PL in AMFD in state 
LostFound, before the reboot msg is sent.

~~~
diff --git a/src/amf/amfd/node_state.cc b/src/amf/amfd/node_state.cc
index a8659dc..9077c07 100644
--- a/src/amf/amfd/node_state.cc
+++ b/src/amf/amfd/node_state.cc
@@ -125,6 +125,8 @@ void LostFound::TimerExpired() {
   LOG_WA("Lost node '%s' has reappeared after network separation",
           node->node_name.c_str());
 
+  node->snd_msg_id = 0;
+
   if (fsm_->Active() == true) {
     LOG_WA("Sending node reboot order");
     avd_d2n_reboot_snd(node);
~~~

Then the proper message will be seen:

`Received reboot order, ordering reboot now!`
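
The ID mismatch can be reproduced with a small standalone model. This is only a sketch, not OpenSAF code: `DirectorView`, `NodeView`, `send_msg`, `receive_msg` and `reboot_order_accepted` are hypothetical names, and the send rule (increment, then stamp) and receive rule (accept only `rcv_msg_id + 1`) are assumptions inferred from the "expected 1" in the log line above.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the per-node counters kept by AMFD and AMFND.
// These are NOT the real OpenSAF types.
struct DirectorView {
  uint32_t snd_msg_id = 0;  // ID of the last message sent to the node
};

struct NodeView {
  uint32_t rcv_msg_id = 0;  // ID of the last message accepted from the director
};

// Assumed send rule: increment the counter, then stamp the message with it.
uint32_t send_msg(DirectorView& d) { return ++d.snd_msg_id; }

// Assumed receive rule: accept only the next ID in sequence.
bool receive_msg(NodeView& n, uint32_t msg_id) {
  if (msg_id != n.rcv_msg_id + 1) return false;  // "Message ID mismatch"
  n.rcv_msg_id = msg_id;
  return true;
}

// Replays the scenario from this ticket and reports whether the reboot
// order is accepted by the node.
bool reboot_order_accepted(bool reset_in_lost_found) {
  DirectorView amfd;
  NodeView amfnd;

  // Normal traffic before the network split: both counters stay in step.
  for (int i = 0; i < 5; ++i) {
    bool ok = receive_msg(amfnd, send_msg(amfd));
    assert(ok);
    (void)ok;
  }

  // The node believes it is headless and resets its receive counter.
  amfnd.rcv_msg_id = 0;

  // The fix: reset the send counter before the reboot order goes out.
  if (reset_in_lost_found) amfd.snd_msg_id = 0;

  // Reboot order: ID 6 without the reset, ID 1 with it; the node expects 1.
  return receive_msg(amfnd, send_msg(amfd));
}
```

In this model `reboot_order_accepted(false)` fails the sequence check, which corresponds to the mismatch reboot reason, while `reboot_order_accepted(true)` passes it, which corresponds to the node acting on the reboot order itself.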



