commit 4062588fae381ecf46b91ee7b7a5e4ab2e776210 (HEAD -> develop,
origin/develop, ticket-3308)
Author: thang.d.nguyen <[email protected]>
Date: Mon Feb 21 08:53:32 2022 +0700
amf: fix unexpected node reboot during failover [#3308]
During SC failover, a message sent by the ACTIVE AMFD may not be
checkpointed to the AMFD on the STANDBY SC, but the AMFND still
increments its receive/send msg id count. When the STANDBY SC then
takes over as ACTIVE, the msg ids of the AMFND and the new active
AMFD no longer match.
The solution is to align the msg id counts between AMFD and AMFND
in this case.
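The failure mode described above can be illustrated with a toy model. This is only a sketch, not OpenSAF code: the `Director` and `NodeDirector` classes, the `send_to_node`, `in_sync` and `align` helpers, and the choice to let the new active side adopt the larger count are all hypothetical, picked to show how one missed checkpoint leaves the new active director behind the node director.

```python
from dataclasses import dataclass


@dataclass
class Director:
    """Hypothetical stand-in for AMFD: counts messages sent to one node."""
    snd_msg_id: int = 0


@dataclass
class NodeDirector:
    """Hypothetical stand-in for AMFND: counts messages received."""
    rcv_msg_id: int = 0


def send_to_node(active: Director, standby: Director,
                 node: NodeDirector, checkpointed: bool) -> None:
    """Active director sends one message; the checkpoint may be lost."""
    active.snd_msg_id += 1
    node.rcv_msg_id = active.snd_msg_id
    if checkpointed:
        # Normal case: the standby learns the new count.
        standby.snd_msg_id = active.snd_msg_id


def in_sync(new_active: Director, node: NodeDirector) -> bool:
    """True when both sides agree on the message id count."""
    return new_active.snd_msg_id == node.rcv_msg_id


def align(new_active: Director, node: NodeDirector) -> None:
    """Assumption: the lagging side adopts the larger count."""
    new_active.snd_msg_id = max(new_active.snd_msg_id, node.rcv_msg_id)


active, standby, node = Director(), Director(), NodeDirector()
for _ in range(3):
    send_to_node(active, standby, node, checkpointed=True)
# Failover window: the message reaches the node, the checkpoint does not.
send_to_node(active, standby, node, checkpointed=False)

mismatch_after_failover = not in_sync(standby, node)
align(standby, node)
aligned_after_fix = in_sync(standby, node)
```

In this model the standby is one message behind when it takes over, which is the condition that leads to a reboot without alignment.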
---
**[tickets:#3308] amf: unexpected reboot due to mismatch msg id**
**Status:** assigned
**Milestone:** 5.22.04
**Created:** Mon Feb 21, 2022 01:11 AM UTC by Thang Duc Nguyen
**Last Updated:** Mon Feb 21, 2022 01:12 AM UTC
**Owner:** Thang Duc Nguyen
The issue was noticed in ticket #3040. In that ticket, the solution was to
reboot the node to recover.
Some error messages from syslog:
On the SC that changes from STANDBY to ACTIVE:
~~~
2022-02-19T14:06:28.439+01:00 SC-2 osafrded[26947]: NO Got peer info response
from node 0x2010f with role STANDBY
2022-02-19T14:06:32.366+01:00 SC-2 osafamfd[27233]: NO Switching StandBy -->
Active State
2022-02-19T14:06:32.489+01:00 SC-2 osafamfd[27233]: NO Active controller set to
SC-2
2022-02-19T14:06:32.491+01:00 SC-2 osafamfnd[27248]: EM AVND record not found,
after failover, snd_msg_id = 381, receive id = 380
2022-02-19T14:06:32.492+01:00 SC-2 osafamfnd[27248]: Rebooting OpenSAF NodeId =
2020f EE Name = , Reason: AVND record not found, after failover, OwnNodeId =
2020f, SupervisionTime = 60
2022-02-19T14:06:32.492+01:00 SC-2 osafamfnd[27248]: NO AVD NEW_ACTIVE, adest:1
2022-02-19T14:06:32.516+01:00 SC-2 osafrded[26947]: NO RDE role set to ACTIVE
~~~
On the SC that changes from ACTIVE to STANDBY:
~~~
2022-02-19T14:06:29.589+01:00 SC-1 osafamfd[3531]: NO ROLE SWITCH Active -->
Quiesced
2022-02-19T14:06:29.817+01:00 SC-1 osafrded[3297]: NO New active controller
notification from consensus service
2022-02-19T14:06:30.662+01:00 SC-1 osafsmfd[3735]: NO SA_AMF_ADMIN_SI_SWAP
[rc=1] successfully initiated
2022-02-19T14:06:30.664+01:00 SC-1 osafsmfd[3735]: NO Campaign thread
terminated after SA_AMF_ADMIN_SI_SWAP
2022-02-19T14:06:32.495+01:00 SC-1 osafamfnd[3565]: NO AVD NEW_ACTIVE, adest:1
2022-02-19T14:06:32.497+01:00 SC-1 osafamfnd[3565]: EM AVND record not found,
after failover, snd_msg_id = 596, receive id = 594
2022-02-19T14:06:32.499+01:00 SC-1 osafamfnd[3565]: Rebooting OpenSAF NodeId =
2010f EE Name = , Reason: AVND record not found, after failover, OwnNodeId =
2010f, SupervisionTime = 60
2022-02-19T14:06:32.523+01:00 SC-1 osafamfd[3531]: NO Switching Quiesced -->
StandBy
~~~
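To see how far the counters drifted on each controller, the two `EM` lines from the excerpts above can be parsed with a short script. This is a sketch under assumptions: the log lines are re-joined onto a single line each (they are wrapped in the mail), the regular expression and the `msg_id_gaps` helper are illustrative, and the gap is simply `snd_msg_id` minus `receive id` as printed, without any claim about how AMFND uses those values internally.

```python
import re

# The two "record not found" lines from the syslog excerpts, unwrapped.
LOG_LINES = [
    "2022-02-19T14:06:32.491+01:00 SC-2 osafamfnd[27248]: EM AVND record "
    "not found, after failover, snd_msg_id = 381, receive id = 380",
    "2022-02-19T14:06:32.497+01:00 SC-1 osafamfnd[3565]: EM AVND record "
    "not found, after failover, snd_msg_id = 596, receive id = 594",
]

# Hypothetical pattern matching only the message format shown above.
PATTERN = re.compile(
    r"(?P<host>\S+) osafamfnd\[\d+\]: EM AVND record not found, "
    r"after failover, snd_msg_id = (?P<snd>\d+), receive id = (?P<rcv>\d+)"
)


def msg_id_gaps(lines):
    """Return {host: snd_msg_id - receive id} for every matching line."""
    gaps = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            gaps[match["host"]] = int(match["snd"]) - int(match["rcv"])
    return gaps


gaps = msg_id_gaps(LOG_LINES)
```

For these excerpts the gap is 1 on SC-2 and 2 on SC-1, so both controllers were only a message or two out of step when the reboot was triggered.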
The solution to prevent the unexpected reboot is to align the msg id counts
with the new active AMFD.