- **status**: unassigned --> fixed


---

**[tickets:#3271] amf: Issue of headless restoration with Roaming SC**

**Status:** fixed
**Milestone:** 5.21.12
**Created:** Fri Jul 02, 2021 05:07 AM UTC by Minh Hon Chau
**Last Updated:** Wed Sep 29, 2021 11:43 AM UTC
**Owner:** nobody


In a robustness test of roaming SC cluster recovery from split brain, the test 
performs a rolling split and rejoin of each active SC at 3-second intervals, 
with the promote active timer set to 0.
The following log shows where the issue starts.

SC-1 is promoted to active right after the previous active SC is split off. 
amfnd on SC-1 then starts sending headless state information to amfd on SC-1. 
(This does not happen without roaming SC, where the SC that becomes active 
after the headless period has no amfnd headless information of its own.)

Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=ABC-012 <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=OpenSAF <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=All-NWayActive,safApp=ABC-012,safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=ABC-012,safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=748f4402ae,safApp=ABC-456,safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=OpenSAF,safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <0, 1, 3>
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO 10 CSICOMP states sent
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO 33 COMP states sent
Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Sending node up due to NCSMDS_NEW_ACTIVE

Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21a0f, msg_id:0
Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21a0f, msg_id:0

amfd on SC-1 then restores the headless information:

Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Received node_up from 21a0f: msg_id 1
Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Enter restore headless cached RTAs from IMM
Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Leave reading headless cached RTAs from IMM: SUCCESS
Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Node '47740d42-79f8-a1c9-ea73-8cb599ef2deb' joined the cluster
Jun 30 19:19:47 SC-1 osafamfnd[8802]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=b769074fb6,safSg=2N,safApp=OpenSAF'


The other SCs rejoin after this point and miss the headless restoration on 
amfd-SC1, which leaves amfd-SC1 inconsistent with the amfnd(s) on the other SCs:

Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21c0f, msg_id:0
Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21c0f, msg_id:0

Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21b0f, msg_id:0
Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21b0f, msg_id:0


Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1
Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1


Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1
Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1

Jun 30 19:19:57 SC-1 osafamfd[8779]: NO Cluster startup is done
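
The ordering problem in the logs above can be checked mechanically. The following is a minimal Python sketch, not part of OpenSAF, that scans amfd syslog lines and lists the nodes whose state messages were logged after the headless restoration had already finished. It assumes each syslog entry sits on a single (unwrapped) line, and it assumes that event types 12 and 13 are the messages carrying amfnd's headless state information; the ticket suggests this but does not state it. The names `late_state_senders`, `HEADLESS_EVENT_TYPES` and `SAMPLE` are hypothetical.

```python
import re

# Patterns copied from the amfd log lines quoted in this ticket.
STATE_MSG = re.compile(
    r"osafamfd\[\d+\]: NO Receive message with event type:(?P<event>\d+), "
    r"msg_type:\d+, from node:(?P<node>[0-9a-f]+)"
)
RESTORE_DONE = re.compile(
    r"osafamfd\[\d+\]: NO Leave reading headless cached RTAs from IMM"
)

# Assumption: these event types carry amfnd's headless state information.
HEADLESS_EVENT_TYPES = {"12", "13"}


def late_state_senders(lines):
    """Return node ids whose state messages appear after restoration ended.

    Relies on the order of lines in the log, not on the timestamps.
    """
    restored = False
    late = []
    for line in lines:
        if RESTORE_DONE.search(line):
            restored = True
            continue
        match = STATE_MSG.search(line)
        if not match or match.group("event") not in HEADLESS_EVENT_TYPES:
            continue
        node = match.group("node")
        if restored and node not in late:
            late.append(node)
    return late


PREFIX = "SC-1 osafamfd[8779]: NO "
SAMPLE = [
    "Jun 30 19:19:45 " + PREFIX + "Receive message with event type:12, msg_type:31, from node:21a0f, msg_id:0",
    "Jun 30 19:19:45 " + PREFIX + "Receive message with event type:13, msg_type:32, from node:21a0f, msg_id:0",
    "Jun 30 19:19:47 " + PREFIX + "Leave reading headless cached RTAs from IMM: SUCCESS",
    "Jun 30 19:19:48 " + PREFIX + "Receive message with event type:12, msg_type:31, from node:21c0f, msg_id:0",
    "Jun 30 19:19:48 " + PREFIX + "Receive message with event type:13, msg_type:32, from node:21c0f, msg_id:0",
    "Jun 30 19:19:52 " + PREFIX + "Receive message with event type:12, msg_type:31, from node:21b0f, msg_id:0",
    "Jun 30 19:19:52 " + PREFIX + "Receive message with event type:13, msg_type:32, from node:21b0f, msg_id:0",
]

if __name__ == "__main__":
    print(late_state_senders(SAMPLE))  # prints ['21c0f', '21b0f']
```

Run against the excerpt in this ticket, the sketch reports 21c0f and 21b0f, the two nodes whose messages arrive at 19:19:48 and 19:19:52, after the restoration at 19:19:47.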

