---
**[tickets:#3241] amf: cluster stuck unhealthy when SCs brutal reboot**
**Status:** assigned
**Milestone:** 5.21.03
**Created:** Tue Dec 01, 2020 09:48 AM UTC by Thuan Tran
**Last Updated:** Tue Dec 01, 2020 09:48 AM UTC
**Owner:** Thuan Tran
The cluster gets stuck in an unhealthy state when the SCs are rebooted abruptly (brutal reboot).
~~~
2020-11-26 06:58:45.011 SC-2 osafamfd[247]: NO Received node_up from 2010f: msg_id 1
2020-11-26 06:58:45.012 SC-2 osafamfd[247]: NO Node 'SC-1' joined the cluster
2020-11-26 06:58:48.240 SC-2 systemd-sysctl[35]: Couldn't write '4 4 1 7' to 'kernel/printk', ignoring: Read-only file system
2020-11-26 06:58:48.252 SC-2 systemd-sysctl[35]: Couldn't write '1' to 'kernel/kptr_restrict', ignoring: Read-only file system
2020-11-26 06:58:45.512 SC-1 osafamfnd[260]: NO Assigning 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
2020-11-26 06:58:45.518 SC-1 osafamfnd[260]: NO Assigned 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
2020-11-26 06:58:46.425 SC-1 osafdtmd[126]: NO Lost contact with 'SC-2'
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: WA AMF director unexpectedly crashed
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: NO Checking 'safSu=SC-1,safSg=2N,safApp=OpenSAF' for pending messages
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: NO Checking 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF' for pending messages
2020-11-26 06:58:46.436 SC-1 osafamfnd[260]: NO 'safSu=SC-1,safSg=2N,safApp=OpenSAF' Presence State INSTANTIATING => INSTANTIATED
~~~
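The excerpt above interleaves lines from SC-1 and SC-2, so the SC-2 `systemd-sysctl` lines stamped 06:58:48 appear before SC-1 lines stamped 06:58:45. A minimal Python sketch for re-sorting such an excerpt by timestamp is shown below. The helper names `parse` and `timeline` are illustrative only and are not part of OpenSAF; the sample is a subset of the lines above with wrapped lines joined, and it assumes the clocks on both SCs are comparable.

```python
from datetime import datetime

# Subset of the excerpt above, with email-wrapped lines joined.
LOG = """\
2020-11-26 06:58:45.012 SC-2 osafamfd[247]: NO Node 'SC-1' joined the cluster
2020-11-26 06:58:48.240 SC-2 systemd-sysctl[35]: Couldn't write '4 4 1 7' to 'kernel/printk', ignoring: Read-only file system
2020-11-26 06:58:45.518 SC-1 osafamfnd[260]: NO Assigned 'safSi=NoRed1,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=NoRed,safApp=OpenSAF'
2020-11-26 06:58:46.425 SC-1 osafdtmd[126]: NO Lost contact with 'SC-2'
2020-11-26 06:58:46.428 SC-1 osafamfnd[260]: WA AMF director unexpectedly crashed
"""


def parse(line):
    """Split one log line into (timestamp, node, rest of the line)."""
    date, time, node, rest = line.split(" ", 3)
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return stamp, node, rest


def timeline(text):
    """Return parsed entries in chronological order.

    sorted() is stable, so lines with equal timestamps keep their input order.
    """
    entries = [parse(line) for line in text.splitlines() if line.strip()]
    return sorted(entries, key=lambda entry: entry[0])


for stamp, node, rest in timeline(LOG):
    # Print millisecond precision, matching the original log format.
    print(stamp.strftime("%H:%M:%S.%f")[:-3], node, rest)
```

Read in this order, the NoRed SI is assigned on SC-1 at 06:58:45.518, and contact with SC-2 is lost at 06:58:46.425, less than a second later.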
SC-2 was powered off when SC-1 had only just come up (it was not yet standby).
SC-1 then entered headless mode and promoted itself to Active (like a roaming SC).
AMFND then failed to add the SU-SI record because it already existed.
~~~
2020-11-26 06:58:49.365 SC-1 osafamfnd[260]: NO AVD NEW_ACTIVE, adest:1
2020-11-26 06:58:49.442 SC-1 osafamfnd[260]: NO saClmDispatch BAD_HANDLE
2020-11-26 06:58:49.442 SC-1 osafamfnd[260]: NO Sending node up due to NCSMDS_NEW_ACTIVE
2020-11-26 06:58:56.028 SC-1 osafamfnd[260]: CR SU-SI record addition failed, SU= safSu=SC-1,safSg=NoRed,safApp=OpenSAF : SI=safSi=NoRed1,safApp=OpenSAF
2020-11-26 06:58:56.038 SC-1 osafamfnd[260]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
2020-11-26 06:58:56.073 SC-1 osafamfnd[260]: NO Assigned 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
2020-11-26 06:58:56.700 SC-1 osafamfd[247]: NO Received node_up from 2020f: msg_id 1
2020-11-26 06:58:57.086 SC-1 osafamfd[247]: NO Received node_up from 2050f: msg_id 1
2020-11-26 06:58:57.087 SC-1 osafamfd[247]: NO Received node_up from 2030f: msg_id 1
2020-11-26 06:58:57.090 SC-1 osafamfd[247]: NO Received node_up from 2040f: msg_id 1
<143>1 2020-11-26T06:59:02.179518+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="18992"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2020f
<143>1 2020-11-26T06:59:02.579492+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19025"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2030f
<143>1 2020-11-26T06:59:02.579642+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19040"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2040f
<143>1 2020-11-26T06:59:02.579795+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19055"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2050f
~~~
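To cross-check which nodes are affected, the log above can be scanned for `node_up` senders and for nodes later traced with `invalid init state`. The following is a rough Python sketch, not an OpenSAF tool: the function name `summarize` and both regular expressions are assumptions derived only from the message text in this excerpt, and the sample lines are copied from above with wrapped lines joined.

```python
import re

# Lines copied from the excerpt above, with email-wrapped lines joined.
LOG = """\
2020-11-26 06:58:56.700 SC-1 osafamfd[247]: NO Received node_up from 2020f: msg_id 1
2020-11-26 06:58:57.086 SC-1 osafamfd[247]: NO Received node_up from 2050f: msg_id 1
2020-11-26 06:58:57.087 SC-1 osafamfd[247]: NO Received node_up from 2030f: msg_id 1
2020-11-26 06:58:57.090 SC-1 osafamfd[247]: NO Received node_up from 2040f: msg_id 1
<143>1 2020-11-26T06:59:02.179518+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="18992"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2020f
<143>1 2020-11-26T06:59:02.579492+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19025"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2030f
<143>1 2020-11-26T06:59:02.579642+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19040"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2040f
<143>1 2020-11-26T06:59:02.579795+01:00 SC-1 osafamfd 247 osafamfd [meta sequenceId="19055"] 247:amf/amfd/ndfsm.cc:373 TR invalid init state (2), node 2050f
"""

# Patterns assumed from the message text shown in this ticket only.
NODE_UP = re.compile(r"Received node_up from ([0-9a-f]+):")
INVALID_STATE = re.compile(r"invalid init state \((\d+)\), node ([0-9a-f]+)")


def summarize(text):
    """Return (set of node_up senders, {node id: reported init state})."""
    senders = set(NODE_UP.findall(text))
    invalid = {node: state for state, node in INVALID_STATE.findall(text)}
    return senders, invalid


senders, invalid = summarize(LOG)
for node in sorted(senders):
    # "-" marks a sender with no "invalid init state" trace in the excerpt.
    print(node, "invalid init state:", invalid.get(node, "-"))
```

In this excerpt, every node that sent `node_up` to the new active director on SC-1 (2020f, 2030f, 2040f, 2050f) is subsequently traced with `invalid init state (2)`.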
---