- **status**: review --> fixed
- **Comment**:
https://sourceforge.net/p/opensaf/mailman/message/35157503/
changeset: 7742:9adaedd9fd5f
branch: opensaf-4.7.x
parent: 7730:00dce827427b
user: [email protected]
date: Wed Jun 22 11:23:16 2016 +0530
summary: amfnd: do not repair su without AMFD request in su-failover
escalation [#1863]
changeset: 7743:b3e6824c9282
branch: opensaf-5.0.x
parent: 7731:70f4af6d31fa
user: [email protected]
date: Wed Jun 22 11:23:42 2016 +0530
summary: amfnd: do not repair su without AMFD request in su-failover
escalation [#1863]
changeset: 7744:87002e625007
tag: tip
parent: 7741:492ebe554e61
user: [email protected]
date: Wed Jun 22 11:23:51 2016 +0530
summary: amfnd: do not repair su without AMFD request in su-failover
escalation [#1863]
---
** [tickets:#1863] amfnd: amfnd tries to repair su in su-failover recovery
without AMFD request.**
**Status:** fixed
**Milestone:** 4.7.2
**Created:** Mon Jun 06, 2016 11:16 AM UTC by Praveen
**Last Updated:** Tue Jun 14, 2016 09:00 AM UTC
**Owner:** Praveen
AMFND calls avnd_err_su_repair() to repair the SU while su-failover recovery is
in progress.
This happens during an SU lock operation when a component with a quiesced
assignment faults with su-failover recovery. AMFND launches cleanup of the
components because of su-failover. In the meantime, AMFND receives removal of
assignments and, as part of oper done, it deletes the SUSI and calls
avnd_err_su_repair(). Inside this function AMFND tries to instantiate
UNINSTANTIATED comps. In the reported case, however, no component is started
because it is in TERMINATING state. But the function resets the SU_FAILOVER
flag introduced in #1839. Since AMFND clears the flag, it loses the context of
the su-failover escalation. When the first comp is cleaned up, AMFND
instantiates it. AMFND also does not inform AMFD about the su-failover
escalation, and the lock operation times out.
Even before the fix for #1839, AMFND calls avnd_err_su_repair() to repair the
SU in the same case. If a component is found in UNINSTANTIATED state, this can
lead to its instantiation. This can happen when AMFND receives removal of
assignment after cleanup of at least one comp has completed.
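
The sequence described above can be sketched with a small stand-alone model.
This is only an illustration under stated assumptions, not the amfnd code or
the committed patch: the types Pres, Comp, Su and Outcome, the functions
repair_su(), on_assignments_removed(), on_comp_cleaned() and
run_lock_scenario(), and the "guard" parameter are all hypothetical names.
Only the SU_FAILOVER flag and the idea of skipping the repair while
su-failover is in progress come from the ticket.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model; none of these are OpenSAF's real types.
enum class Pres { Uninstantiated, Instantiated, Terminating };

struct Comp {
  Pres pres = Pres::Instantiated;
};

struct Su {
  std::vector<Comp> comps;
  bool su_failover = false;    // stands in for the SU_FAILOVER flag (#1839)
  bool amfd_informed = false;  // escalation reported to AMFD
};

// Models the repair step: instantiate every UNINSTANTIATED comp and clear
// the escalation context. Returns the number of comps instantiated.
int repair_su(Su& su) {
  int started = 0;
  for (Comp& c : su.comps) {
    if (c.pres == Pres::Uninstantiated) {
      c.pres = Pres::Instantiated;
      ++started;
    }
  }
  su.su_failover = false;
  return started;
}

// Models "oper done" after removal of assignments. With guard == false the
// repair always runs (the reported behaviour). With guard == true it is
// skipped while su-failover is in progress (an assumed form of the fix).
int on_assignments_removed(Su& su, bool guard) {
  if (guard && su.su_failover) return 0;
  return repair_su(su);
}

// Models completion of one component cleanup.
void on_comp_cleaned(Su& su, Comp& c) {
  assert(c.pres == Pres::Terminating);
  c.pres = Pres::Uninstantiated;
  if (!su.su_failover) {
    // Escalation context lost: the comp is simply instantiated again.
    c.pres = Pres::Instantiated;
    return;
  }
  // Context intact: report to AMFD once every comp is cleaned up.
  bool all_cleaned = true;
  for (const Comp& other : su.comps) {
    if (other.pres != Pres::Uninstantiated) all_cleaned = false;
  }
  if (all_cleaned) su.amfd_informed = true;
}

struct Outcome {
  bool comp_reinstantiated;
  bool amfd_informed;
};

// Replays the lock scenario: fault, cleanup start, removal, cleanup done.
Outcome run_lock_scenario(bool guard) {
  Su su;
  su.comps.resize(1);
  su.su_failover = true;                 // quiesced comp faults
  su.comps[0].pres = Pres::Terminating;  // cleanup launched
  on_assignments_removed(su, guard);     // removal arrives mid-cleanup
  on_comp_cleaned(su, su.comps[0]);      // cleanup completes
  return {su.comps[0].pres == Pres::Instantiated, su.amfd_informed};
}
```

In this model, run_lock_scenario(false) ends with the comp instantiated again
and AMFD never informed, which matches the reported symptom.
run_lock_scenario(true) leaves the comp uninstantiated and marks AMFD as
informed.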
Steps to reproduce:
1) Set the recovery policy to su-failover and bring up the amf demo. Do not
enable auto-repair.
2) Lock the active SU and make sure that the comp faults after responding to
the quiesced assignment.
3) The component gets instantiated without a repair admin op and the lock
operation times out.
AMFND traces after fix of #1839:
Jun 6 14:42:22.731996 osafamfnd [9327:sidb.cc:0737] T1 SU-SI record deleted,
SU= safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1 : SI=safSi=AmfDemo,safApp=AmfDemo1
Jun 6 14:42:22.732012 osafamfnd [9327:sidb.cc:0785] << avnd_su_si_del: 1
Jun 6 14:42:22.732028 osafamfnd [9327:err.cc:1071] >> avnd_err_su_repair
Jun 6 14:42:22.732042 osafamfnd [9327:susm.cc:1408] TR
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' not terminated,
pres.st=4
Jun 6 14:42:22.732056 osafamfnd [9327:clc.cc:0764] >> avnd_comp_clc_fsm_run:
Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', Ev '1'
Jun 6 14:42:22.732070 osafamfnd [9327:clc.cc:0854] T1
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':Entering CLC FSM:
presence state:'SA_AMF_PRESENCE_TERMINATING(4)',
Event:'AVND_COMP_CLC_PRES_FSM_EV_INST'
Jun 6 14:42:22.732084 osafamfnd [9327:clc.cc:0868] T1 Exited CLC FSM
Jun 6 14:42:22.732096 osafamfnd [9327:clc.cc:0870] T1
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':FSM Enter presence
state: 'SA_AMF_PRESENCE_TERMINATING(4)':FSM Exit presence
state:SA_AMF_PRESENCE_TERMINATING(4)
Jun 6 14:42:22.732109 osafamfnd [9327:clc.cc:0889] << avnd_comp_clc_fsm_run: 1
Jun 6 14:42:22.732120 osafamfnd [9327:err.cc:1129] << avnd_err_su_repair:
retval=1
Jun 6 14:42:22.732132 osafamfnd [9327:susm.cc:0255] >> avnd_su_siq_prc: SU
'safSu=SU1,safSg=AmfDem
AMFND traces before fix of #1839:
Jun 6 16:16:18.947878 osafamfnd [31308:err.cc:1064] >> avnd_err_su_repair
Jun 6 16:16:18.947890 osafamfnd [31308:susm.cc:1408] TR
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' not terminated,
pres.st=4
Jun 6 16:16:18.947903 osafamfnd [31308:clc.cc:0764] >> avnd_comp_clc_fsm_run:
Comp 'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1', Ev '1'
Jun 6 16:16:18.947916 osafamfnd [31308:clc.cc:0854] T1
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':Entering CLC FSM:
presence state:'SA_AMF_PRESENCE_TERMINATING(4)',
Event:'AVND_COMP_CLC_PRES_FSM_EV_INST'
Jun 6 16:16:18.947929 osafamfnd [31308:clc.cc:0868] T1 Exited CLC FSM
Jun 6 16:16:18.947940 osafamfnd [31308:clc.cc:0870] T1
'safComp=AmfDemo,safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1':FSM Enter presence
state: 'SA_AMF_PRESENCE_TERMINATING(4)':FSM Exit presence
state:SA_AMF_PRESENCE_TERMINATING(4)
Jun 6 16:16:18.947952 osafamfnd [31308:clc.cc:0889] << avnd_comp_clc_fsm_run: 1
Jun 6 16:16:18.947982 osafamfnd [31308:err.cc:1120] << avnd_err_su_repair:
retval=1
Jun 6 16:16:18.948015 osafamfnd [31308:susm.cc:0255] >> avnd_su_siq_prc: SU
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1'
Jun 6 16:16:18.948027 osafamfnd [31308:susm.cc:0260] << avnd_su_siq_prc
---
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets