The attached file is an amfd trace that shows how one node came to be mapped to two SUs.
Before headless, the node <-> SU map was as follows:
SU1: PL3, SU2: PL4, SU3: PL5, SU4: SC1, SU5: SC2

After headless, the order in which the nodes were read from IMM during the 
initialization phase was reversed, so the node <-> SU map changed:
SU5: PL3, SU4: PL4, SU3: PL5, SU2: SC1, SU1: SC2

During the headless sync phase, the node of SU2 was updated to PL4, because SU2 
was actually mapped to PL4 before headless. The end result is SU2: PL4 and 
SU4: PL4.
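
The sequence above can be replayed with a small sketch. This is not amfd code: `remap_in_order` and `duplicated_nodes` are hypothetical helpers, and pairing SUs with nodes by read order is only an assumption that happens to reproduce the maps listed in this comment.

```python
from collections import defaultdict

# Node <-> SU map before headless, as listed in this comment.
before = {"SU1": "PL3", "SU2": "PL4", "SU3": "PL5", "SU4": "SC1", "SU5": "SC2"}


def remap_in_order(su_names, nodes):
    """Hypothetical stand-in for initialization: pair SUs with nodes in read order."""
    return dict(zip(su_names, nodes))


def duplicated_nodes(su_to_node):
    """Return {node: [SUs]} for every node that hosts more than one SU."""
    by_node = defaultdict(list)
    for su, node in su_to_node.items():
        by_node[node].append(su)
    return {node: sorted(sus) for node, sus in by_node.items() if len(sus) > 1}


# After headless the SU read order is reversed, so each SU gets a different node.
after_init = remap_in_order(reversed(list(before)), list(before.values()))

# Headless sync then restores only SU2 to its real node (PL4), as in the trace.
after_sync = dict(after_init)
after_sync["SU2"] = before["SU2"]

print(after_init)
print(duplicated_nodes(after_sync))
```

A check like `duplicated_nodes` is only a way to spot the symptom; it does not address the cause.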

The problem occurs because, for some reason, the order in which the SUs were 
read from IMM was reversed and saAmfSUHostedByNode was empty, which caused amfd 
to randomly pick the node to assign to each SU.

A solution could be:
(1) Make both the active and standby amfd early implementers/appliers, so that 
saAmfSUHostedByNode is read properly after headless.
Or (2) just before reading the headless sync information, have amfd read 
saAmfSUHostedByNode again and update the SU. At that point saAmfSUHostedByNode 
should be read as a non-empty value; the saAmfSUHostedByNode mapping made in 
the initialization phase is not reliable.
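
A minimal sketch of option (2), assuming the IMM values and the headless sync information can each be viewed as a simple SU -> node map. The function name `resync_after_headless` and all of its parameters are hypothetical and do not correspond to real amfd functions.

```python
def resync_after_headless(init_map, imm_hosted_by, synced_from_nodes):
    """Sketch of option (2); every name here is hypothetical.

    init_map:          SU -> node map built in the initialization phase
                       (treated as unreliable)
    imm_hosted_by:     saAmfSUHostedByNode values re-read from IMM,
                       "" when the attribute is still empty
    synced_from_nodes: SU -> node information received during headless sync
    """
    result = dict(init_map)
    # Step 1: a non-empty value re-read from IMM overrides the init-phase guess.
    for su, node in imm_hosted_by.items():
        if node:
            result[su] = node
    # Step 2: only then apply the headless sync information on top.
    result.update(synced_from_nodes)
    return result


# Maps taken from this comment.
before = {"SU1": "PL3", "SU2": "PL4", "SU3": "PL5", "SU4": "SC1", "SU5": "SC2"}
after_init = {"SU5": "PL3", "SU4": "PL4", "SU3": "PL5", "SU2": "SC1", "SU1": "SC2"}

# Assumption: by this point IMM returns the pre-headless values.
fixed = resync_after_headless(after_init, before, {"SU2": "PL4"})
print(fixed)
```

With these inputs every SU ends up back on its pre-headless node, so no node hosts two SUs.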




Attachments:

- [osafamfd](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/76ef399e/9d50/attachment/osafamfd) (5.9 MB; application/octet-stream)


---

** [tickets:#2112] amfd: multiple SUs incorrectly assigned to single node**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Tue Oct 11, 2016 11:56 PM UTC by Gary Lee
**Last Updated:** Wed Oct 12, 2016 02:16 AM UTC
**Owner:** Minh Hon Chau


Multiple SUs are assigned to a single node after SC absence.

To reproduce:

0) load nwayactive demo
1) stop SCs
2) restart SCs

The following is observed:

root@SC-1:~# immlist safSu=SU4,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode                                SA_NAME_T    safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

root@SC-1:~# immlist safSu=SU2,safSg=AmfDemo,safApp=AmfDemo2
...
saAmfSUHostedByNode                                SA_NAME_T    safAmfNode=PL-4,safAmfCluster=myAmfCluster (42)

SU2 is indeed assigned to PL-4, but SU4 was assigned to one of the SCs and is 
not assigned to PL-4.

Operations on SU4 will lead to a crash of amfnd on PL-4.


