- **status**: review --> fixed
- **Milestone**: future --> 4.4.FC


---

** [tickets:#504] SU failover was not happening after fault on NPI component.**

**Status:** fixed
**Created:** Fri Jul 12, 2013 10:26 AM UTC by Sirisha Alla
**Last Updated:** Mon Jul 22, 2013 12:01 PM UTC
**Owner:** Praveen

Setup used:-
Oracle Linux Server release 6.4, TCP, PBE enabled, IPV6 address.
4.3 Latest with patch for #98.

Problem description:-
====
SU failover did not happen after a fault was reported on an NPI component of 
SU1 through the error report API with component restart as the recommended 
recovery.

Steps followed:-
1) Configured a 2N model at runtime with the following configuration.
* 3 SUs, each containing a mix of PI and NPI components, viz.
  -> Two PI components (COMP1 and COMP2) with saAmfCtCompCategory=1, 
saAmfCtDefRecoveryOnError=2, saAmfCtDefDisableRestart=0 
  -> Two NPI components (COMP3 and COMP4) with saAmfCtCompCategory=8, 
saAmfCtDefRecoveryOnError=2, saAmfCtDefDisableRestart=1
SU1 was spawned on PL-4; SU2 and SU3 were on PL-3.

* saAmfSgtDefAutoRepair=0, saAmfSGAutoRepair=1, saAmfSutDefSUFailover=1, 
saAmfSGNumPrefInserviceSUs=3, saAmfSUFailover=1 for each SU.
* Single SI containing 4 CSIs.

2) Performed admin unlock-in first and then unlock on each SU.
   Observed that two PI and two NPI processes were up and running on PL-4 and 
that two PI components were running for each of SU2 and SU3 on PL-3 (4 in total).

Initial assignments were as shown below:-
[root@OEL-64BIT-SLOT1 framework]# amf-state siass ha
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)

safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)


3) Reported an error on NPI COMP3 of SU1 using the error report API with 
component restart as the recommended recovery.

After step 3, SU failover was triggered and the cleanup scripts were invoked 
successfully for all the PI and NPI components of SU1, after which SU1 moved 
to UNINSTANTIATED. Because saAmfSGAutoRepair is 1, SU1 then moved back to the 
enabled/instantiated state. However, the SU failover did not actually happen: 
the amf-state siass ha command showed the same HA assignments as before the 
fault on COMP3 of SU1. 
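
One possible reading of why a component-restart recommendation was expected to 
end in an SU failover is the usual AMF escalation: a component with restart 
disabled is failed over instead of restarted, and with saAmfSUFailover set a 
component failover takes the whole SU with it. The sketch below is a simplified 
model of that rule only, not the OpenSAF implementation; the function name 
`effective_recovery` and the returned labels are hypothetical.

```python
from enum import IntEnum


class Recovery(IntEnum):
    # Numeric values as used for saAmfCtDefRecoveryOnError in this report
    # (2 = component restart); only the values needed here are listed.
    NO_RECOMMENDATION = 1
    COMPONENT_RESTART = 2
    COMPONENT_FAILOVER = 3


def effective_recovery(recommended, disable_restart, su_failover):
    """Return a label for the recovery this simplified model would perform.

    Hypothetical helper for illustration; it is not part of OpenSAF.
    """
    recovery = Recovery(recommended)
    # Restart is disabled for the component, so a restart recommendation
    # is escalated to component failover.
    if recovery == Recovery.COMPONENT_RESTART and disable_restart:
        recovery = Recovery.COMPONENT_FAILOVER
    # With saAmfSUFailover set, a component failover fails over the whole SU.
    if recovery == Recovery.COMPONENT_FAILOVER and su_failover:
        return "SU_FAILOVER"
    return recovery.name


# COMP3 in this report: saAmfCtDefRecoveryOnError=2,
# saAmfCtDefDisableRestart=1, saAmfSUFailover=1.
print(effective_recovery(2, disable_restart=True, su_failover=True))
# COMP1 in this report: same recovery, but saAmfCtDefDisableRestart=0.
print(effective_recovery(2, disable_restart=False, su_failover=True))
```

Under this model the fault on COMP3 maps to an SU failover, while the same 
report against a PI component would stay a component restart.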

Assignments after the fault on NPI COMP3 of SU1:-
===================================
[root@OEL-64BIT-SLOT1 framework]# amf-state siass ha
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
        saAmfSISUHAState=ACTIVE(1)
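
The before/after comparison of the `amf-state siass ha` listings can also be 
done mechanically. The following is a minimal sketch that assumes the output 
format shown in this report (a `safSISU=` line followed by an indented 
`saAmfSISUHAState=` line); the helper names `parse_siass` and 
`changed_assignments` are hypothetical and only the npiApp entries are 
reproduced as sample input.

```python
def parse_siass(text):
    """Map each safSISU DN to its HA state, e.g. 'ACTIVE(1)'."""
    states = {}
    dn = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("safSISU="):
            dn = line
        elif line.startswith("saAmfSISUHAState=") and dn is not None:
            states[dn] = line.split("=", 1)[1]
            dn = None
    return states


def changed_assignments(before, after):
    """Return {dn: (old_state, new_state)} for every DN whose state differs."""
    old, new = parse_siass(before), parse_siass(after)
    return {
        dn: (old.get(dn), new.get(dn))
        for dn in sorted(set(old) | set(new))
        if old.get(dn) != new.get(dn)
    }


# npiApp assignments copied from the listings in this report.
before = r"""
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
        saAmfSISUHAState=ACTIVE(1)
"""
after = before  # the listing after the fault is identical in the report

# An empty dict means no HA assignment moved, i.e. no failover took place.
print(changed_assignments(before, after))
```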


[root@OEL-64BIT-SLOT1 framework]# amf-state su
safSu=PL-3,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=PL-4,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=2N,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=2N,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=UNINSTANTIATED(1)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
[root@OEL-64BIT-SLOT1 framework]# 


[root@OEL-64BIT-SLOT1 framework]# amf-state si
safSi=NoRed1,safApp=OpenSAF
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed2,safApp=OpenSAF
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed3,safApp=OpenSAF
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed4,safApp=OpenSAF
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=SC-2N,safApp=OpenSAF
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=d_npi,safApp=npiApp
        saAmfSIAdminState=UNLOCKED(1)
        saAmfSIAssignmentState=FULLY_ASSIGNED(2)

A subsequent lock on SU1 returned a timeout error.
[root@OEL-64BIT-SLOT1 framework]# amf-adm lock safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
error - command timed out (alarm)
[root@OEL-64BIT-SLOT1 framework]# amf-state su
safSu=PL-3,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=PL-4,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=2N,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=2N,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=NoRed,safApp=OpenSAF
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=LOCKED(2)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
        saAmfSUAdminState=UNLOCKED(1)
        saAmfSUOperState=ENABLED(1)
        saAmfSUPresenceState=INSTANTIATED(3)
        saAmfSUReadinessState=IN-SERVICE(2)


