Attached bug_npi30.tgz contains empty files for AMFD and AMFND traces for all
the nodes.
Attached syslog messages are not sufficient to debug the issue. Please provide
AMFD and AMFND traces.
---
** [tickets:#504] SU failover was not happening after fault on NPI component.**
**Status:** unassigned
**Created:** Fri Jul 12, 2013 10:26 AM UTC by Sirisha Alla
**Last Updated:** Wed Jul 17, 2013 08:41 AM UTC
**Owner:** nobody
Setup used:-
Oracle Linux Server release 6.4, TCP, PBE enabled, IPV6 address.
4.3 Latest with patch for #98.
Problem description:-
====
SU failover did not happen after a fault was reported on an NPI component of
SU1 using the error report API with recommended recovery as component restart.
Steps followed:-
1) Runtime configure 2N model with following configurations.
* 3 SUs. Each SU containing mixed PI and NPI components viz
->Two PI components(COMP1 and COMP2) with saAmfCtCompCategory=1,
saAmfCtDefRecoveryOnError=2, saAmfCtDefDisableRestart=0
-> Two NPI components(COMP3 and COMP4) with saAmfCtCompCategory=8,
saAmfCtDefRecoveryOnError=2, saAmfCtDefDisableRestart=1
SU1 was spawned on PL-4; SU2 and SU3 were on PL-3.
* saAmfSgtDefAutoRepair=0, saAmfSGAutoRepair=1, saAmfSutDefSUFailover=1,
saAmfSGNumPrefInserviceSUs=3, saAmfSUFailover=1 for each SU.
* Single SI containing 4 CSIs.
2) Performed admin unlock-in first, then unlock, on each SU.
Observed that two PI and two NPI processes were up and running on PL-4, and
two PI components were running for each of SU2 and SU3 on PL-3 (4 in total).
Initial assignments were as shown below:-
[root@OEL-64BIT-SLOT1 framework]# amf-state siass ha
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
3) Reported an error on NPI COMP3 of SU1 using the error report API with
recommended recovery as component restart.
After step 3, SU failover was triggered and the cleanup scripts were invoked
successfully for all the PI and NPI components of SU1, and SU1 moved to
UNINSTANTIATED. Because saAmfSGAutoRepair is 1, SU1 then moved back to the
enabled/instantiated state. However, the SU failover did not really happen:
the amf-state siass ha command showed the same HA assignments as before the
fault on COMP3 of SU1.
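
For reference, the reason SU failover is the expected outcome here can be
sketched as below. This is only a simplified model of the escalation, not the
AMF implementation: the Component class and effective_recovery() are
hypothetical names, and the rule assumed is that a component restart on a
component with restart disabled is escalated to failover, which becomes an SU
failover when saAmfSUFailover=1.

```python
from dataclasses import dataclass

# Recommended recovery values; 2 is component restart, as used in this ticket.
COMPONENT_RESTART = 2
COMPONENT_FAILOVER = 3


@dataclass
class Component:
    """Hypothetical, minimal view of a component for this sketch."""
    name: str
    disable_restart: bool  # saAmfCtDefDisableRestart
    su_failover: bool      # saAmfSUFailover of the enclosing SU


def effective_recovery(comp: Component, recommended: int) -> str:
    """Return the recovery expected to be carried out (simplified model)."""
    if recommended not in (COMPONENT_RESTART, COMPONENT_FAILOVER):
        raise ValueError("only restart and failover are modelled here")
    if recommended == COMPONENT_RESTART and not comp.disable_restart:
        return "component restart"
    # Restart is disabled, or failover was recommended: fail over instead.
    if comp.su_failover:
        return "su failover"
    return "component failover"


# NPI COMP3 from the setup: DisableRestart=1, SUFailover=1.
comp3 = Component("COMP3", disable_restart=True, su_failover=True)
print(effective_recovery(comp3, COMPONENT_RESTART))
```

Under these assumptions the same report on PI COMP1 (DisableRestart=0) would
stay a plain component restart.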
Assignments after fault on NPI comp3 SU1:-
===================================
[root@OEL-64BIT-SLOT1 framework]# amf-state siass ha
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
[root@OEL-64BIT-SLOT1 framework]# amf-state su
safSu=PL-3,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=PL-4,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=2N,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=2N,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
[root@OEL-64BIT-SLOT1 framework]#
[root@OEL-64BIT-SLOT1 framework]# amf-state si
safSi=NoRed1,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed2,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed3,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=NoRed4,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=SC-2N,safApp=OpenSAF
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
safSi=d_npi,safApp=npiApp
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)
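
The before/after comparison of the amf-state siass ha output described above
can also be done mechanically. The following is a minimal sketch, assuming the
two-line format shown in this ticket (a safSISU= line followed by a
saAmfSISUHAState= line); parse_siass() and changed_assignments() are
hypothetical helper names, and the sample text is copied from the d_npi
entries above.

```python
from typing import Dict, Optional, Tuple


def parse_siass(text: str) -> Dict[str, str]:
    """Map each safSISU DN to its HA state from `amf-state siass ha` output."""
    assignments: Dict[str, str] = {}
    current_dn: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("safSISU="):
            current_dn = line
        elif line.startswith("saAmfSISUHAState=") and current_dn is not None:
            assignments[current_dn] = line.split("=", 1)[1]
            current_dn = None
    return assignments


def changed_assignments(before: str, after: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return only the assignments whose HA state differs between two outputs."""
    old, new = parse_siass(before), parse_siass(after)
    return {
        dn: (old.get(dn), new.get(dn))
        for dn in sorted(old.keys() | new.keys())
        if old.get(dn) != new.get(dn)
    }


# d_npi entries as captured before and after the fault (identical in this ticket).
BEFORE = r"""
safSISU=safSu=d_npi_2\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=STANDBY(2)
safSISU=safSu=d_npi_1\,safSg=SG_d_npi\,safApp=npiApp,safSi=d_npi,safApp=npiApp
saAmfSISUHAState=ACTIVE(1)
"""
AFTER = BEFORE

print(changed_assignments(BEFORE, AFTER))
```

An empty result means no assignment moved, which is what was observed here; a
successful failover would be expected to list the d_npi_1 and d_npi_2 entries
with changed states.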
A further lock on SU1 returned a timeout error.
[root@OEL-64BIT-SLOT1 framework]# amf-adm lock safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
error - command timed out (alarm)
[root@OEL-64BIT-SLOT1 framework]# amf-state su
safSu=PL-3,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=PL-4,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=2N,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-1,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=2N,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SC-2,safSg=NoRed,safApp=OpenSAF
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_3,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_npi_1,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=LOCKED(2)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=d_npi_2,safSg=SG_d_npi,safApp=npiApp
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)