---

**[tickets:#2141] AMF: AMFD fails to stop clm track during role transition from active to quiesced**

**Status:** assigned
**Milestone:** 5.0.2
**Created:** Wed Oct 26, 2016 03:57 AM UTC by Minh Hon Chau
**Last Updated:** Wed Oct 26, 2016 03:57 AM UTC
**Owner:** Minh Hon Chau


During a swap of the 2N OpenSAF SI (switchover), when the active AMFD moves to 
quiesced, it fails to stop the CLM track callback because the stop request 
returns SA_AIS_ERR_TIMEOUT. Currently AMFD only logs the error, so the track 
is never actually stopped. As a result, the new standby AMFD (the one that was 
quiesced) still receives the CLM track callback when another node leaves the 
cluster. Eventually clm_node_exit_complete() is triggered on the standby AMFD, 
which should not happen.

As a consequence, the standby AMFD fails to apply a checkpoint update if 
another node reboots afterward.

In SC2 (standby AMFD)
Oct 26 11:59:26.061288 osafamfd [468:clm.cc:0216] >> clm_track_cb: '0' '4' '1'
Oct 26 11:59:26.061294 osafamfd [468:clm.cc:0281] TR  Node Left: 
rootCauseEntity safNode=PL-3,safCluster=myClmCluster for node 131855
Oct 26 11:59:26.061298 osafamfd [468:clm.cc:0185] >> clm_node_exit_complete: 
2030f
Oct 26 11:59:26.061301 osafamfd [468:ndproc.cc:1139] >> avd_node_failover: 
'safAmfNode=PL-3,safAmfCluster=myAmfCluster'
...
Oct 26 11:59:26.070895 osafamfd [468:sg_nored_fsm.cc:0770] >> node_fail: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF, TEST sg_fsm_state=0
...
Oct 26 11:59:26.071007 osafamfd [468:siass.cc:0496] : >> avd_susi_delete: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=NoRed4,safApp=OpenSAF

In SC1 (active AMFD)
Oct 26 11:59:26.057724 osafamfd [488:ndfsm.cc:0671] >> avd_mds_avnd_down_evh: 
2030f, 0x7da3d0
Oct 26 11:59:26.057732 osafamfd [488:ndproc.cc:1139] >> avd_node_failover: 
'safAmfNode=PL-3,safAmfCluster=myAmfCluster'
Oct 26 11:59:26.057739 osafamfd [488:ndfsm.cc:0999] >> avd_node_mark_absent 
...
Oct 26 11:59:26.066576 osafamfd [488:sg_nored_fsm.cc:0770] >> node_fail: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF, TEST sg_fsm_state=0
...
Oct 26 11:59:26.066783 osafamfd [488:siass.cc:0496] : >> avd_susi_delete: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=NoRed4,safApp=OpenSAF

When AMFD on SC1 deletes the SUSI, it checkpoints the deletion to the standby 
AMFD. The standby AMFD fails to apply this update because it has already 
deleted the SUSI itself:

Oct 26 11:59:26.073665 osafamfd [468:ckpt_dec.cc:0659] >> dec_siass: i_action 
'2'
...
Oct 26 11:59:26.073700 osafamfd [468:ckpt_updt.cc:0405] >> avd_ckpt_siass: 
'safSi=NoRed4,safApp=OpenSAF' 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF'
Oct 26 11:59:26.073704 osafamfd [468:si.cc:0395] >> avd_si_get: 
safSi=NoRed4,safApp=OpenSAF
Oct 26 11:59:26.073706 osafamfd [468:si.cc:0396] << avd_si_get 
...
Oct 26 11:59:26.073722 osafamfd [468:ckpt_updt.cc:0508] ER avd_ckpt_siass: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=NoRed4,safApp=OpenSAF does not exist
Oct 26 11:59:26.073725 osafamfd [468:ckpt_dec.cc:0690] << dec_siass 
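
The ordering problem in the traces above can be illustrated with a small 
self-contained model. This is only a sketch: AmfdModel, SusiKey, 
local_node_failover() and apply_ckpt_remove() are hypothetical names invented 
for illustration and are not taken from the AMFD sources. It shows that once 
the standby has run its own failover cleanup, the checkpointed removal from 
the active finds no record.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the SU-SI assignment database kept by each AMFD.
using SusiKey = std::pair<std::string, std::string>;  // (SU name, SI name)

struct AmfdModel {
  std::map<SusiKey, int> susi_db;  // value unused, only presence matters

  // Models the local cleanup a node failover performs (every SUSI of the
  // given SU is deleted). On a standby this should never run.
  void local_node_failover(const std::string& su) {
    for (auto it = susi_db.begin(); it != susi_db.end();) {
      if (it->first.first == su) {
        it = susi_db.erase(it);
      } else {
        ++it;
      }
    }
  }

  // Models applying a checkpointed "remove" update on the standby.
  // Returns false when the record no longer exists.
  bool apply_ckpt_remove(const SusiKey& key) {
    return susi_db.erase(key) == 1;
  }
};
```

In this model, a standby that wrongly runs local_node_failover() for the SU on 
PL-3 and then receives the removal from the active gets false back from 
apply_ckpt_remove(), which corresponds to the "does not exist" error in the 
trace. A standby that only applies checkpoint updates gets true.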

This error can be reproduced by commenting out the avd_clm_track_stop() call 
in amfd_switch_actv_qsd() to simulate the SA_AIS_ERR_TIMEOUT error code.

A simple solution could be: when the standby AMFD receives clm_track_cb(), it 
retries stopping the track and returns from the callback immediately.
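
The proposed approach could look roughly like the following sketch. It is a 
standalone model, not a patch: ClmTrackGuard, Role, Rc, stop_track and the 
method names are all hypothetical, and stop_track merely stands in for 
whatever call AMFD uses to stop the track. The point is only the control flow: 
remember that the stop failed, and on any callback received while not active, 
retry the stop and return without processing the node exit.

```cpp
#include <cassert>
#include <functional>

// Hypothetical model; none of these names come from the AMFD sources.
enum class Role { Active, Quiesced, Standby };
enum class Rc { Ok, Timeout };

struct ClmTrackGuard {
  Role role = Role::Active;
  bool tracking = true;            // true while the CLM track may still be running
  std::function<Rc()> stop_track;  // stand-in for the real track-stop call
  int node_exits_processed = 0;    // counts callbacks handled as node exits

  // Called when this AMFD leaves the active role.
  void on_leave_active(Role new_role) {
    role = new_role;
    try_stop();  // may fail with Timeout, in which case tracking stays true
  }

  // Stand-in for clm_track_cb().
  void on_track_callback() {
    if (role != Role::Active) {
      try_stop();  // retry the stop that failed during the role change
      return;      // leave quickly, no node-exit processing on non-active
    }
    ++node_exits_processed;
  }

 private:
  void try_stop() {
    if (tracking && stop_track() == Rc::Ok) {
      tracking = false;
    }
  }
};
```

With a stop_track that times out once and then succeeds, the first callback 
received on the standby clears the tracking flag and node_exits_processed 
stays at zero, so the standby never runs its own node failover.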

