---
** [tickets:#2443] amf: amf gets stuck after headless while processing node_up
messages**
**Status:** assigned
**Milestone:** 5.17.06
**Created:** Thu Apr 27, 2017 11:59 AM UTC by Long H Buu Nguyen
**Last Updated:** Thu Apr 27, 2017 11:59 AM UTC
**Owner:** Long H Buu Nguyen
Description:
After headless, the SCs come up. If the active SC is rebooted during that time,
while the other SC is still initialising, amfd on the other SC can get stuck
processing node_up messages. As a result, opensafd fails to start.
Observation:
The syslog shows node_up messages repeating indefinitely:
2017-04-18 14:17:36 SC-1 osafamfd[478]: NO Received node_up from 2040f: msg_id 1
2017-04-18 14:17:37 SC-1 osafamfd[478]: NO Received node_up from 2020f: msg_id 1
2017-04-18 14:17:37 SC-1 osafamfd[478]: NO Received node_up from 2030f: msg_id 1
...
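
One way to confirm this symptom from a captured syslog is to count how often the same node reports node_up with the same msg_id. The sketch below is only illustrative: the regular expression is inferred from the three sample lines above, and the function names and the `threshold` value are hypothetical, not part of OpenSAF.

```python
import re
from collections import Counter

# Pattern inferred from the amfd syslog lines quoted above (an assumption;
# other OpenSAF versions or syslog configurations may format this differently).
NODE_UP_RE = re.compile(
    r"osafamfd\[\d+\]: NO Received node_up from (\w+): msg_id (\d+)"
)


def count_node_up(lines):
    """Count how often each (node_id, msg_id) pair appears in the lines."""
    counts = Counter()
    for line in lines:
        match = NODE_UP_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def repeated_node_up(lines, threshold=3):
    """Return the (node_id, msg_id) pairs seen at least `threshold` times.

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    return {key: n for key, n in count_node_up(lines).items() if n >= threshold}
```

Feeding it the lines of the syslog file, for example `repeated_node_up(open(path))` with `path` pointing at the captured log, lists the nodes whose node_up with an unchanged msg_id keeps being received.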
Steps to reproduce:
1) Start a cluster.
2) Turn off SCs.
3) Turn on SCs.
4) After one SC becomes ACTIVE, restart it while amfnd on the other SC is
initialising the NCS SUs.
5) amfnd on the other SC receives NEW_ACTIVE and then gets stuck with node_up
messages.
Investigation:
Assume that after headless, SC-1 becomes ACTIVE. amfnd-SC-2 sends a node_up
message to amfd-SC-1.
amfnd-SC-2 starts instantiating the NCS SUs in SC-2 as soon as amfd-SC-1 has
received the node_up message.
If SC-1 is rebooted at the time the NCS SUs in SC-2 are INSTANTIATED,
amfnd-SC-2 receives NEW_ACTIVE because RDE sets amfd-SC-2 to ACTIVE.
amfnd-SC-2 then sends a node_up message to amfd-SC-2 and later tries again to
instantiate the NCS SUs in SC-2, although they are already INSTANTIATED.
Because the NCS SU presence states do not change, amfnd-SC-2 does not send an
oper_state message to amfd-SC-2:
Apr 18 14:35:36.869223 osafamfnd [486:486:src/amf/amfnd/susm.cc:1563] >> avnd_su_pres_fsm_run: 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Apr 18 14:35:36.869240 osafamfnd [486:486:src/amf/amfnd/susm.cc:1570] T1 Entering SU presence state FSM: current state: 3, event: 1, su name:safSu=SC-1,safSg=2N,safApp=OpenSAF
Apr 18 14:35:36.869257 osafamfnd [486:486:src/amf/amfnd/susm.cc:1581] T1 Exited SU presence state FSM: New State = 3
Apr 18 14:35:36.869273 osafamfnd [486:486:src/amf/amfnd/susm.cc:1614] << avnd_su_pres_fsm_run: 1
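
The trace can be read as a state machine that notifies the director only when the presence state changes. The sketch below is a simplified model for illustration, not the amfnd implementation: state value 3 is INSTANTIATED in the SA Forum AMF presence-state enumeration, while treating event 1 as an instantiate request, the transition table, the "oper_state" tuple, and all function names are assumptions made here.

```python
from enum import IntEnum


class Presence(IntEnum):
    # Values follow the SA Forum AMF presence-state numbering.
    UNINSTANTIATED = 1
    INSTANTIATING = 2
    INSTANTIATED = 3


class Event(IntEnum):
    # Assumption: event 1 in the trace is an "instantiate" request.
    INST = 1
    INST_SUCC = 2  # hypothetical event used only by this sketch


def run_su_pres_fsm(state, event):
    """Return the next presence state (simplified, hypothetical transitions)."""
    if state == Presence.UNINSTANTIATED and event == Event.INST:
        return Presence.INSTANTIATING
    if state == Presence.INSTANTIATING and event == Event.INST_SUCC:
        return Presence.INSTANTIATED
    # Any other combination, including INST while INSTANTIATED, is a no-op.
    return state


def handle_event(state, event, outbox):
    """Run the FSM and queue a message for amfd only if the state changed."""
    new_state = run_su_pres_fsm(state, event)
    if new_state != state:
        outbox.append(("oper_state", new_state))
    return new_state
```

Under these assumptions, calling `handle_event(Presence.INSTANTIATED, Event.INST, outbox)` leaves `outbox` empty, which mirrors the "current state: 3, event: 1" and "New State = 3" lines in the trace: the second instantiate request produces no state change, so no oper_state message goes out.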
---