This issue is also reproducible on the 4.4 GA changeset (parent: 5044:73326ed911cb).
A ticket should exist for this.
 
Analysis:

1) Lock on node PL-4
Cluster=myAmfCluster
Sep  4 12:51:15.155928 osafamfd [2324:node.cc:0674] >> node_admin_state_set: 
safAmfNode=PL-4,safAmfCluster=myAmfCluster AdmState UNLOCKED => LOCKED
Sep  4 12:51:15.155939 osafamfd [2324:

Quiesced states were given on PL-4, honoring SI dependencies.
When AMFD sends the quiesced state for SI1 to PL-4, a component fault at PL-4 
leads to node failover recovery:
Sep  4 12:51:16.352377 osafamfd [2324:sgproc.cc:0468] >> avd_su_oper_state_evh: 
id:152, node:2040f, 'safSu=SU5,safSg=SGONE,safApp=TWONAPP' state:2
Sep  4 12:51:16.352395 osafamfd [23

Sep  4 12:51:16.355487 osafamfd [2324:sgproc.cc:1709] >> 
avd_node_down_appl_susi_failover: 'safAmfNode=PL-4,safAmfCluster=myAmfCluster'
Sep  4 12:51:16.355493 osafamfd [2324:su.cc:0769] >> set_oper_state: 
'safSu=SU4,safSg=SGONE,safApp=TWONAPP' ENABLED => DISABLED

2) AMFD performs failover of SU5
Sep  4 12:51:16.365446 osafamfd [2324:sg_2n_fsm.cc:0558] << avd_sg_2n_act_susi: 
act: 'safSu=SU5,safSg=SGONE,safApp=TWONAPP', stdby: 
'safSu=SU1,safSg=SGONE,safApp=TWONAPP'
Sep  4 12:51:16.365453 osafamfd [2324:si_dep.cc:2039] >> 
avd_sidep_si_dependency_exists_within_su
Sep  4 12:51:16.365459 osafamfd [2324:siass.cc:0669] >> avd_susi_role_failover: 
 'safSi=TWONSI1,safApp=TWONAPP' 'safSu=SU1,safSg=SGONE,safApp=TWONAPP'
Sep  4 12:51:16.365465 osafamfd [2324:si_dep.cc:1547] >> 
avd_sidep_is_si_failover_possible: SI: 'safSi=TWONSI1,safApp=TWONAPP'
Sep  4 12:51:16.365471 osafamfd [2324:si_dep.cc:1686] << 
avd_sidep_is_si_failover_possible: return value: 1
Sep  4 12:51:16.365477 osafamfd [2324:siass.cc:0517] >> avd_susi_mod_send: SI 
'safSi=TWONSI1,safApp=TWONAPP', SU 'safSu=SU1,safSg=SGONE,safApp=TWONAPP' 
ha_state:1

Sep  4 12:51:16.369024 osafamfd [2324:si_dep.cc:0202] TR 
'safSi=TWONSI2,safApp=TWONAPP' si_dep_state ASSIGNED => FAILOVER_UNDER_PROGRESS

Sep  4 12:51:16.369417 osafamfd [2324:siass.cc:0517] >> avd_susi_mod_send: SI 
'safSi=TWONSI5,safApp=TWONAPP', SU 'safSu=SU1,safSg=SGONE,safApp=TWONAPP' 
ha_state:1

and reboots PL-4
Sep  4 12:51:16.376988 osafamfd [2324:sgproc.cc:0386] NO Ordering reboot of 
'safAmfNode=PL-4,safAmfCluster=myAmfCluster' as node fail/switch-over repair 
action

3) AMFD gets response for SI1 from SU1 hosted on SC-1:

Sep  4 12:51:16.515366 osafamfd [2324:sgproc.cc:0751] >> avd_su_si_assign_evh: 
id:166, node:2010f, act:5, 'safSu=SU1,safSg=SGONE,safApp=TWONAPP', 
'safSi=TWONSI1,safApp=TWONAPP', ha:1, err:1, single:0

It sends active to dependent SI2:
Sep  4 12:51:16.516452 osafamfd [2324:siass.cc:0517] >> avd_susi_mod_send: SI 
'safSi=TWONSI2,safApp=TWONAPP', SU 'safSu=SU1,safSg=SGONE,safApp=TWONAPP' 
ha_state:1

4) AMFD gets response for SI5:
Sep  4 12:51:16.649538 osafamfd [2324:sgproc.cc:0751] >> avd_su_si_assign_evh: 
id:167, node:2010f, act:5, 'safSu=SU1,safSg=SGONE,safApp=TWONAPP', 
'safSi=TWONSI5,safApp=TWONAPP', ha:1, err:1, single:0

5) AMFD gets response for SI2 for active modification:

Sep  4 12:51:16.760213 osafamfd [2324:sgproc.cc:0751] >> avd_su_si_assign_evh: 
id:168, node:2010f, act:5, 'safSu=SU1,safSg=SGONE,safApp=TWONAPP', 
'safSi=TWONSI2,safApp=TWONAPP', ha:1, err:1, single:0

Here AMFD executed the chose-assign logic after updating the si_dep states of 
SI3 and SI4:
Sep  4 12:51:16.761263 osafamfd [2324:si_dep.cc:0202] TR 
'safSi=TWONSI3,safApp=TWONAPP' si_dep_state ASSIGNED => READY_TO_ASSIGN
Sep  4 12:51:16.761579 osafamfd [2324:si_dep.cc:0202] TR 
'safSi=TWONSI4,safApp=TWONAPP' si_dep_state ASSIGNED => SPONSOR_UNASSIGNED

New assignments are created in SU2 with standby state.
Since the active modifications for SI3 and SI4 were not done in SU1, there are 
two standby assignments for each of them.

6) SG becomes stable:
Sep  4 12:51:17.178265 osafamfd [2324:sg_2n_fsm.cc:0558] << avd_sg_2n_act_susi: 
act: 'safSu=SU1,safSg=SGONE,safApp=TWONAPP', stdby: 
'safSu=SU2,safSg=SGONE,safApp=TWONAPP'
Sep  4 12:51:17.178274 osafamfd [2324:sg_2n_fsm.cc:0717] << 
avd_sg_2n_su_chose_asgn: '(null)'
Sep  4 12:51:17.178282 osafamfd [2324:sg.cc:1611] TR safSg=SGONE,safApp=TWONAPP 
sg_fsm_state 1 => 0


While rescreening the SI dep states, AMFD posts an assignment event for SI3:
Sep  4 12:51:17.178805 osafamfd [2324:si_dep.cc:2320] >> sidep_si_take_action: 
si:'safSi=TWONSI3,safApp=TWONAPP', si_dep_state:'READY_TO_ASSIGN'
Sep  4 12:51:17.178811 osafamfd [2324:si_dep.cc:0510] >> 
sidep_si_dep_state_evt_send: si:'safSi=TWONSI3,safApp=TWONAPP' evt_type:22
Sep  4 12:51:17.178818 osafamfd [2324:si_dep.cc:0532] << 
sidep_si_dep_state_evt_send: rc:1

AMFD does not update the si_dep state of all the dependents when it performs 
the failover of sponsor SI1; it updates the state of SI2 only. Because of this, 
when the response for SI2 comes, AMFD does not send the active modification 
for SI3 and SI4.
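
The gap described above can be illustrated with a small model. This is only a sketch, not AMFD code: the dictionary layout and the function names are hypothetical. The dependency chain SI1<-SI2<-SI3<-SI4 and the state names ASSIGNED and FAILOVER_UNDER_PROGRESS are taken from the ticket and the trace. Walking the whole chain is shown as one possible direction, not as the actual fix.

```python
from collections import deque

# Direct dependents of each sponsor SI, from the ticket: SI1<-SI2<-SI3<-SI4.
# SI5 has no dependency configured.
DEPENDENTS = {"SI1": ["SI2"], "SI2": ["SI3"], "SI3": ["SI4"], "SI4": [], "SI5": []}


def initial_states():
    """Every SI starts in ASSIGNED, as in the trace before the failover."""
    return {si: "ASSIGNED" for si in DEPENDENTS}


def mark_direct_only(sponsor, states):
    """Model of the behaviour seen in the trace: only direct dependents change."""
    for dep in DEPENDENTS[sponsor]:
        states[dep] = "FAILOVER_UNDER_PROGRESS"
    return states


def mark_transitive(sponsor, states):
    """Hypothetical alternative: walk the whole dependency chain breadth-first."""
    queue = deque(DEPENDENTS[sponsor])
    seen = set()
    while queue:
        si = queue.popleft()
        if si in seen:
            continue
        seen.add(si)
        states[si] = "FAILOVER_UNDER_PROGRESS"
        queue.extend(DEPENDENTS[si])
    return states


direct = mark_direct_only("SI1", initial_states())
transitive = mark_transitive("SI1", initial_states())
```

In this model, `direct` leaves SI3 and SI4 in ASSIGNED, which matches the trace where only SI2 moves to FAILOVER_UNDER_PROGRESS, while `transitive` marks SI2, SI3 and SI4.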

AMFD crashes because, during screening, it does not find any active assignment 
for SI3 and SI4 and therefore posts an si_dep event for their assignment.



---

** [tickets:#1045] si's got two standby assignments in 2n model**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Thu Sep 04, 2014 07:35 AM UTC by surender khetavath
**Last Updated:** Thu Sep 04, 2014 08:54 AM UTC
**Owner:** nobody

changeset : 5697
model : 2n
configuration : 1App,1SG,5SUs with 3comps each, 5SIs with 3CSIs each
si-si deps configured as SI1<-SI2<-SI3<-SI4.
SU1 is active, SU2 is standby.
SU1 is mapped to SC-1, SU2 to SC-2, SU3 to PL-3, and SU4 and SU5 to PL-4
saAmfSGAutoRepair=1(True)
SuFailover=1(True)

Test:
-----
Perform a node lock on the node hosting the active SU. Here SU5 was active and 
PL-4 is the node locked.
Reject in the quiesced callback.
Unlock the node.

SI3 & SI4 have 2 standby assignments

safSi=TWONSI1,safApp=TWONAPP
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=TWONSI2,safApp=TWONAPP
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=TWONSI3,safApp=TWONAPP
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=TWONSI4,safApp=TWONAPP
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=TWONSI5,safApp=TWONAPP
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)


safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI2,safApp=TWONAPP
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI3,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI4,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI4,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI5,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI1,safApp=TWONAPP
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI1,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI3,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI2,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI5,safApp=TWONAPP
        saAmfSISUHAState=ACTIVE(1)
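
The listing above can be checked mechanically for SIs that have no active assignment. A minimal sketch, assuming the output has been saved as text in the two-line form shown (a `safSISU=` DN line followed by a `saAmfSISUHAState=` line). The function names are illustrative, and the embedded sample is a subset of the entries above.

```python
import re
from collections import defaultdict

# Subset of the SISU listing from this ticket (SI1, SI3 and SI4 only).
SAMPLE = r"""
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI3,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI4,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI4,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI1,safApp=TWONAPP
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU2\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI1,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI3,safApp=TWONAPP
        saAmfSISUHAState=STANDBY(2)
"""

# DN line: capture the SU name (before the first escaped comma) and the SI name.
DN_RE = re.compile(r"safSISU=safSu=([^\\,]+)\\,.*,safSi=([^,]+),")
# Attribute line: capture the symbolic HA state, ignoring the numeric value.
STATE_RE = re.compile(r"saAmfSISUHAState=(\w+)\(\d+\)")


def ha_states_by_si(text):
    """Return {si: {su: ha_state}} parsed from the two-line listing format."""
    result = defaultdict(dict)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        dn = DN_RE.match(line)
        if dn:
            current = (dn.group(2), dn.group(1))  # (si, su)
            continue
        state = STATE_RE.match(line)
        if state and current:
            si, su = current
            result[si][su] = state.group(1)
            current = None
    return result


def sis_without_active(text):
    """List SIs for which no SU holds the ACTIVE HA state."""
    parsed = ha_states_by_si(text)
    return sorted(si for si, by_su in parsed.items()
                  if "ACTIVE" not in by_su.values())
```

For the sample, `sis_without_active(SAMPLE)` returns `['TWONSI3', 'TWONSI4']`, the two SIs reported as PARTIALLY_ASSIGNED.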


