From log analysis, an SI swap was issued:
Mar 20 15:07:50.825598 osafamfd [2353:si.cc:0821] >> si_admin_op_cb: 
safSi=SI3,safApp=test2nApp op=7
However, the component timed out while transitioning from Quiesced to Standby,
and an SU failover was triggered.
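
For reference, `op=7` in the amfd trace corresponds to `SA_AMF_ADMIN_SI_SWAP` in the `SaAmfAdminOperationIdT` enumeration of the SAF AMF specification. The sketch below decodes such a trace line. The regular expression and the helper name `decode_admin_op` are illustrative assumptions and are not part of OpenSAF.

```python
import re

# Values of SaAmfAdminOperationIdT as defined in the SAF AMF specification.
ADMIN_OPS = {
    1: "SA_AMF_ADMIN_UNLOCK",
    2: "SA_AMF_ADMIN_LOCK",
    3: "SA_AMF_ADMIN_LOCK_INSTANTIATION",
    4: "SA_AMF_ADMIN_UNLOCK_INSTANTIATION",
    5: "SA_AMF_ADMIN_SHUTDOWN",
    6: "SA_AMF_ADMIN_RESTART",
    7: "SA_AMF_ADMIN_SI_SWAP",
    8: "SA_AMF_ADMIN_SG_ADJUST",
    9: "SA_AMF_ADMIN_REPAIRED",
    10: "SA_AMF_ADMIN_EAM_START",
    11: "SA_AMF_ADMIN_EAM_STOP",
}

# Assumed trace layout: "... si_admin_op_cb: <SI DN> op=<id>".
# \s* also tolerates a line break between the function name and the DN.
TRACE_RE = re.compile(r"si_admin_op_cb:\s*(?P<dn>\S+)\s+op=(?P<op>\d+)")


def decode_admin_op(trace):
    """Return (si_dn, op_name) for a si_admin_op_cb trace, or None if no match."""
    match = TRACE_RE.search(trace)
    if match is None:
        return None
    op_id = int(match.group("op"))
    return match.group("dn"), ADMIN_OPS.get(op_id, "UNKNOWN(%d)" % op_id)


trace = (
    "Mar 20 15:07:50.825598 osafamfd [2353:si.cc:0821] >> si_admin_op_cb: \n"
    "safSi=SI3,safApp=test2nApp op=7"
)
print(decode_admin_op(trace))
```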

Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO Performing failover of 
'safSu=SU1,safSg=SG,safApp=test2nApp' (SU failover count: 5)
Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO 
'safComp=COMP1,safSu=SU1,safSg=SG,safApp=test2nApp' recovery action escalated 
from 'componentFailover' to 'suFailover'
Mar 20 15:08:10 SYSTEST-PLD-1 osafamfnd[6835]: NO 
'safComp=COMP1,safSu=SU1,safSg=SG,safApp=test2nApp' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'suFailover'

It is reproducible with the following steps:
1. Configure SU failover for the AMF demo app and perform an SI swap after
unlocking both SUs.
2. Use gdb to hold the component so that it times out while going from
quiesced to standby.
3. Run immlist.
saAmfSINumCurrStandbyAssignments is still 1, when it should be zero:

saAmfSINumCurrStandbyAssignments                   SA_UINT32_T  1 (0x1)



---

** [tickets:#1276] AMF : saAmfSINumCurrStandbyAssignments is holding  invalid 
value in 2N model**

**Status:** assigned
**Milestone:** 4.6.RC1
**Created:** Fri Mar 20, 2015 10:28 AM UTC by Srikanth R
**Last Updated:** Mon Mar 30, 2015 11:40 AM UTC
**Owner:** Nagendra Kumar

*Setup*
Version : 4.6 FC
model : 2n
configuration : 1App,1SG,2SUs with 4comps each, 4SIs with 1 CSI each
SU1 is mapped to pl-3 and SU2 to pl-4


*Initial state*
All AMF entities of the application are in the unlocked state. The SIs are
fully assigned. SU1 is the standby SU and SU2 is the active SU.

Steps Performed :

 -> Ran the command "/etc/init.d/opensafd stop" on the PL-3 node.

Mar 20 15:34:47 SYSTEST-PLD-1 opensafd: Stopping OpenSAF Services
Mar 20 15:34:47 SYSTEST-PLD-1 osafamfnd[6835]: NO Shutdown initiated

    Now SU2 on PL-4 has the active assignments.

 -> Started opensaf on PL-3 node.

Mar 20 15:36:43 SYSTEST-PLD-1 opensafd: Starting OpenSAF Services (Using TIPC)
Mar 20 15:36:45 SYSTEST-PLD-1 opensafd: OpenSAF(4.6.FC - ) services 
successfully started
   
   Now SU2 on PL-4 is active and SU1 on PL-3 is standby.


 -> Stopped opensaf on PL-4 node.

Mar 20 15:36:50 SYSTEST-PLD-1 osafamfnd[16251]: NO Assigned 'all SIs' ACTIVE of 
'safSu=SU1,safSg=SG,safApp=test2nApp'
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120045] TIPC: Resetting link 
<1.1.3:eth2-1.1.4:eth2>, peer not responding
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120051] TIPC: Lost link 
<1.1.3:eth2-1.1.4:eth2> on network plane A
Mar 20 15:36:53 SYSTEST-PLD-1 kernel: [14611.120056] TIPC: Lost contact with 
<1.1.4>
Mar 20 15:37:08 SYSTEST-PLD-1 kernel: [14626.188976] TIPC: Established link 
<1.1.3:eth2-1.1.4:eth2> on network plane A

   Now SU1 on PL-3 is active and SU2 is in the unassigned state.

 -> Started opensaf on PL-4 node.

Mar 20 15:37:08 SYSTEST-PLD-1 kernel: [14626.188976] TIPC: Established link 
<1.1.3:eth2-1.1.4:eth2> on network plane A


 Now the amf-state of all SIs shows partially assigned, because
saAmfSINumCurrStandbyAssignments is set to 2, which is invalid for the 2N
model.
The component callbacks are correct; only the IMM attribute is updated
incorrectly by AMF.

saAmfSIPrefStandbyAssignments                      SA_UINT32_T  1 (0x1)
saAmfSIPrefActiveAssignments                       SA_UINT32_T  1 (0x1)
saAmfSINumCurrStandbyAssignments                   SA_UINT32_T  2 (0x2)
saAmfSINumCurrActiveAssignments                    SA_UINT32_T  1 (0x1)
saAmfSIAssignmentState                             SA_UINT32_T  3 (0x3)
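
The inconsistency in this output can be checked mechanically. The sketch below parses immlist-style lines and flags any current assignment count that exceeds its preferred count. It is only an illustration: the line layout is inferred from the output quoted in this ticket, the helper names `parse_immlist` and `find_violations` are hypothetical, and the rule "current must not exceed preferred" is an assumed simplification of the 2N constraint. The state names follow `SaAmfAssignmentStateT` from the SAF AMF specification.

```python
import re

# SaAmfAssignmentStateT values from the SAF AMF specification.
ASSIGNMENT_STATES = {1: "UNASSIGNED", 2: "FULLY_ASSIGNED", 3: "PARTIALLY_ASSIGNED"}

# Assumed line layout: "<attribute> SA_UINT32_T <decimal> (<hex>)".
ATTR_RE = re.compile(r"^(\w+)\s+SA_UINT32_T\s+(\d+)\s+\(0x[0-9a-fA-F]+\)$")


def parse_immlist(text):
    """Collect SA_UINT32_T attributes from immlist-style output into a dict."""
    attrs = {}
    for line in text.splitlines():
        match = ATTR_RE.match(line.strip())
        if match:
            attrs[match.group(1)] = int(match.group(2))
    return attrs


def find_violations(attrs):
    """Report current assignment counts that exceed the preferred counts."""
    pairs = [
        ("saAmfSINumCurrActiveAssignments", "saAmfSIPrefActiveAssignments"),
        ("saAmfSINumCurrStandbyAssignments", "saAmfSIPrefStandbyAssignments"),
    ]
    problems = []
    for curr, pref in pairs:
        if curr in attrs and pref in attrs and attrs[curr] > attrs[pref]:
            problems.append("%s=%d exceeds %s=%d" % (curr, attrs[curr], pref, attrs[pref]))
    return problems


# Values copied from the immlist output reported in this ticket.
sample = """
saAmfSIPrefStandbyAssignments                      SA_UINT32_T  1 (0x1)
saAmfSIPrefActiveAssignments                       SA_UINT32_T  1 (0x1)
saAmfSINumCurrStandbyAssignments                   SA_UINT32_T  2 (0x2)
saAmfSINumCurrActiveAssignments                    SA_UINT32_T  1 (0x1)
saAmfSIAssignmentState                             SA_UINT32_T  3 (0x3)
"""

attrs = parse_immlist(sample)
print(ASSIGNMENT_STATES[attrs["saAmfSIAssignmentState"]])
for problem in find_violations(attrs):
    print(problem)
```

This check only catches an over-count. A stale count that stays within the preferred value, such as a standby count of 1 that should have dropped to zero, would need a comparison against the actual SUSI assignments.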



An AMF lock on the SI resulted in the following values:
saAmfSIPrefStandbyAssignments                      SA_UINT32_T  1 (0x1)
saAmfSIPrefActiveAssignments                       SA_UINT32_T  1 (0x1)
saAmfSINumCurrStandbyAssignments                   SA_UINT32_T  1 (0x1)
saAmfSINumCurrActiveAssignments                    SA_UINT32_T  0 (0x0)






