- Attachments have changed:

Diff:

~~~~

--- old
+++ new
@@ -1 +0,0 @@
-osafamfnd (452.8 kB; application/octet-stream)

~~~~




---

**[tickets:#1839] AMF: No recovery if su failover timeout during cleanup**

**Status:** fixed
**Milestone:** 4.7.2
**Created:** Thu May 19, 2016 02:41 AM UTC by Minh Hon Chau
**Last Updated:** Wed Jul 03, 2019 05:10 AM UTC
**Owner:** Praveen


Configuration:
- 2N app, SU4 (to be active) on PL-4, SU5 (standby) on PL-5
- Set saAmfNodeSuFailOverProb = 5 secs on PL-4 and sleep 6 secs in the clc cleanup script. saAmfCtDefClcCliTimeout should be large enough that the cleanup does not time out
- saAmfNodeSuFailoverMax=100 on PL-4
- No_Recommended policy as the default recovery; this causes an immediate component failover
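
The timing condition implied by this configuration can be written down as a small check. This is only a sketch: the function name `reproduces_issue` is hypothetical, and the 10-second value for saAmfCtDefClcCliTimeout is an assumed placeholder, since the ticket only says it must be "large enough".

```python
def reproduces_issue(probation_s: float, cleanup_sleep_s: float,
                     clc_cli_timeout_s: float) -> bool:
    """Return True when the SU failover probation timer expires while the
    cleanup script is still running, and the cleanup itself does not time out."""
    return probation_s < cleanup_sleep_s < clc_cli_timeout_s


# Values from the ticket: 5 s probation, 6 s sleep in the cleanup script.
# The 10 s CLC CLI timeout is an assumption, not a value from the ticket.
scenario_hits_bug = reproduces_issue(5, 6, 10)

# A cleanup that finishes before the probation timer does not hit this window.
fast_cleanup_hits_bug = reproduces_issue(5, 1, 10)
```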

Steps:
- Bring up SU4 as active, SU5 as standby
- Kill amf_demo

Result:
- SU4 is uninstantiated
- amf-state siass shows active/standby, but no amf_demo component is actually running on PL-4 because the recovery was not completed

Some traces:
May 19 12:08:12.322825 osafamfnd [420:err.cc:0317] >> avnd_err_process: Comp:'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' esc_rcvr:'3'
May 19 12:08:12.322836 osafamfnd [420:err.cc:1419] >> avnd_err_esc_su_failover
May 19 12:08:12.322883 osafamfnd [420:tmr.cc:0235] NO SU failover probation timer started (timeout: 5000000000 ns)
May 19 12:08:12.322953 osafamfnd [420:tmr.cc:0088] TR node error escalation timer started
May 19 12:08:12.323009 osafamfnd [420:su.cc:0742] NO Performing failover of 'safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' (SU failover count: 1)
May 19 12:08:12.323020 osafamfnd [420:err.cc:1471] << avnd_err_esc_su_failover: retval=1
May 19 12:08:12.323052 osafamfnd [420:err.cc:0403] NO 'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon' faulted due to 'avaDown' : Recovery is 'componentFailover'
May 19 12:08:12.323063 osafamfnd [420:err.cc:0513] >> avnd_err_recover: SU:safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon Comp:safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon
May 19 12:08:12.323090 osafamfnd [420:err.cc:0747] >> avnd_err_rcvr_comp_failover: 'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon'

...

May 19 12:08:12.323690 osafamfnd [420:clc.cc:1979] TEST: >> avnd_comp_clc_inst_clean_hdler: 'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Cleanup event in the instantiated state
May 19 12:08:12.323743 osafamfnd [420:clc.cc:2827] >> avnd_comp_clc_cmd_execute: 'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon':CLC CLI command type:'AVND_COMP_CLC_CMD_TYPE_CLEANUP(3)'
May 19 12:08:12.323792 osafamfnd [420:clc.cc:2990] T1 CLC CLI script:'/srv/abcdtest/amf_demo/amf_demo_script'
May 19 12:08:12.323808 osafamfnd [420:clc.cc:2992] T1 CLC CLI command arguments[1] ='cleanup'

...

May 19 12:08:17.424586 osafamfnd [420:err.cc:1494] TEST: >> avnd_evt_tmr_node_err_esc_evh
May 19 12:08:17.424613 osafamfnd [420:err.cc:1496] NO TEST: SU failover probation timer expired

...

May 19 12:08:18.357814 osafamfnd [420:clc.cc:2226] TEST: >> avnd_comp_clc_terming_cleansucc_hdler: 'safComp=AmfDemo,safSu=SU4,safSg=AmfDemoTwon,safApp=AmfDemoTwon': Cleanup success event in the terminating state
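
The ordering of events in the traces above can be checked by subtracting the timestamps. The following is a minimal sketch that uses only the four timestamps quoted in this ticket; the helper name `ts` and the variable names are illustrative, and it assumes all entries are from the same day.

```python
from datetime import datetime


def ts(stamp: str) -> datetime:
    """Parse the time-of-day part of a trace timestamp such as
    'May 19 12:08:12.322883' (all entries here share the same date)."""
    return datetime.strptime(stamp.split()[-1], "%H:%M:%S.%f")


timer_started = ts("May 19 12:08:12.322883")      # SU failover probation timer started
cleanup_started = ts("May 19 12:08:12.323743")    # avnd_comp_clc_cmd_execute (cleanup)
timer_expired = ts("May 19 12:08:17.424613")      # SU failover probation timer expired
cleanup_succeeded = ts("May 19 12:08:18.357814")  # cleanup success in terminating state

probation_elapsed = (timer_expired - timer_started).total_seconds()
cleanup_elapsed = (cleanup_succeeded - cleanup_started).total_seconds()
expiry_to_cleanup_success = (cleanup_succeeded - timer_expired).total_seconds()

print(f"probation timer ran for      {probation_elapsed:.6f} s")
print(f"cleanup script ran for       {cleanup_elapsed:.6f} s")
print(f"timer expired before cleanup finished by {expiry_to_cleanup_success:.6f} s")
```

The probation timer expires roughly 5.1 s after it starts, while the cleanup takes roughly 6.0 s, so the expiry lands while the component is still in the terminating state.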

Inside the call to avnd_comp_clc_terming_cleansucc_hdler, there is no di_oper_state report to amfd, so the recovery sequence cannot continue.
Full trace is attached.

