- **status**: review --> fixed
- **Comment**:

changeset:   8351:4eb1ebe62d35
branch:      opensaf-5.0.x
parent:      8348:c1e0f0bed4ff
user:        Praveen Malviya <[email protected]>
date:        Wed Nov 23 16:27:16 2016 +0530
summary:     amfnd: avoid sending multiple node-switchover recovery request for same fault[#1935]

changeset:   8352:15416dce3e2d
branch:      opensaf-5.1.x
parent:      8349:d67104d3bd8a
user:        Praveen Malviya <[email protected]>
date:        Wed Nov 23 16:27:38 2016 +0530
summary:     amfnd: avoid sending multiple node-switchover recovery request for same fault[#1935]

changeset:   8353:40605ae64694
tag:         tip
parent:      8350:49c6e28c410a
user:        Praveen Malviya <[email protected]>
date:        Wed Nov 23 16:27:54 2016 +0530
summary:     amfnd: avoid sending multiple node-switchover recovery request for same fault[#1935]


[staging:4eb1eb]
[staging:15416d]
[staging:40605a]




---

** [tickets:#1935] amfnd: amfnd is not resetting escalations params.**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Thu Aug 04, 2016 12:51 PM UTC by Praveen
**Last Updated:** Tue Sep 20, 2016 06:00 PM UTC
**Owner:** Praveen
**Attachments:**

- [add_su.xml](https://sourceforge.net/p/opensaf/tickets/1935/attachment/add_su.xml) (2.4 kB; text/xml)
- [messages](https://sourceforge.net/p/opensaf/tickets/1935/attachment/messages) (79.5 kB; application/octet-stream)
- [nodeswitch.xml](https://sourceforge.net/p/opensaf/tickets/1935/attachment/nodeswitch.xml) (9.5 kB; text/xml)
- [osafamfd](https://sourceforge.net/p/opensaf/tickets/1935/attachment/osafamfd) (11.0 MB; application/octet-stream)
- [osafamfnd](https://sourceforge.net/p/opensaf/tickets/1935/attachment/osafamfnd) (3.7 MB; application/octet-stream)


When AMFND sends a node-switchover recovery request to AMFD, AMFD performs 
failover/switchover of the SUs on the failed node. After the assignments of 
all application SUs on the failed node have been removed, AMFD reboots the node 
if NodeAutoRepair is enabled. 
Consider the case when NodeAutoRepair is not enabled. In this case, AMFD will 
not reboot the failed node. 
So the failed node remains present, in the repair pending state. Now a user 
performs the following operations:
1) Deletes the failed SU from the failed node, along with its comps, after 
lock and lock-in of the SU. 
2) Adds the same SU and comp again.
3) Performs the unlock-in operation.
Because of the unlock-in operation, AMFND on the failed node will instantiate 
the SU. This acts as a trigger, and AMFND in avnd_su_pres_fsm_run() will try to 
inform AMFD of node-switchover again, since AMFND has not cleared some global 
variables related to escalation. Here AMFND may crash. In one case a crash was 
observed, and in another case AMFND successfully sent the recovery request to 
AMFD again.
In the crashed case, AMFND accesses illegal memory. 
In the successful case, the SU had got the same address:
Aug  4 17:42:13.658513 osafamfnd [14409:susm.cc:1412] NO Informing director of Nodeswitchover
Aug  4 17:42:13.658519 osafamfnd [14409:di.cc:0724] >> avnd_di_oper_send: SU '0x2534350', recv '4'
Aug  4 17:42:13.658525 osafamfnd [14409:di.cc:0737] TR SU 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, su oper '2'

Aug  4 17:42:58.449067 osafamfnd [14409:susm.cc:1412] NO Informing director of Nodeswitchover
Aug  4 17:42:58.449080 osafamfnd [14409:di.cc:0724] >> avnd_di_oper_send: SU '0x2534350', recv '4'
Aug  4 17:42:58.449093 osafamfnd [14409:di.cc:0737] TR SU 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, su oper '1'
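
The two failure modes above (a second recovery request for the same fault, and a stale SU reference after the SU is deleted and re-added) can be illustrated with a minimal sketch. This is not the amfnd code or the committed patch. `Node`, `Recovery`, `report_node_switchover()`, `on_su_deleted()` and `on_node_repaired()` are hypothetical names, and the idea of remembering the failed SU by its DN rather than by pointer is an assumption made for the illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real amfnd types.
enum class Recovery { kNone, kNodeSwitchover };

struct Node {
  // Node-scope escalation state (the "global variables" from the report).
  Recovery pending_recovery = Recovery::kNone;
  // Remember the failed SU by DN, not by raw pointer, so that deleting
  // the SU cannot leave a dangling reference behind.
  std::string failed_su_name;
  // Requests "sent" to the director, recorded here so the sketch is testable.
  std::vector<std::string> sent;
};

// Sends a node-switchover request unless one is already pending for this
// node. Returns true if a request was sent.
bool report_node_switchover(Node& node, const std::string& su_name) {
  if (node.pending_recovery == Recovery::kNodeSwitchover) {
    return false;  // same fault, request already sent
  }
  node.pending_recovery = Recovery::kNodeSwitchover;
  node.failed_su_name = su_name;
  node.sent.push_back(su_name);
  return true;
}

// Called when an SU is deleted: drop any reference held to it. The node
// itself stays in the pending state until it is repaired.
void on_su_deleted(Node& node, const std::string& su_name) {
  if (node.failed_su_name == su_name) {
    node.failed_su_name.clear();
  }
}

// Called when the node is repaired: reset all escalation state.
void on_node_repaired(Node& node) {
  node.pending_recovery = Recovery::kNone;
  node.failed_su_name.clear();
}
```

With this shape, re-adding and instantiating the SU while the node is still awaiting repair does not trigger a second request, and a new request becomes possible only after the escalation state has been reset.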

The configuration and traces are attached. Steps to reproduce:
1) Bring up the configuration with the NodeAutoRepair flag disabled.
2) Kill a comp in SU1.
3) Lock and lock-in the SU, and then delete it.
4) Now add the SU again using the attached add_su.xml.
5) Unlock-in the SU.



