- **Milestone**: 4.6.2 --> 4.7.2


---

**[tickets:#1529] Node rebooted as saImmOiInitialize_2 failed during middleware active assignment**

**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Oct 08, 2015 07:53 AM UTC by Chani Srivastava
**Last Updated:** Mon Nov 02, 2015 09:08 AM UTC
**Owner:** nobody
**Attachments:**

- [1529.tgz](https://sourceforge.net/p/opensaf/tickets/1529/attachment/1529.tgz) (586.3 kB; application/x-compressed-tar)
- [SC1_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC1_syslog.txt) (436.4 kB; text/plain)
- [SC2_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC2_syslog.txt) (425.6 kB; text/plain)


Setup:
Changeset-6901
Continuous failovers were invoked on a 4-node cluster with 2 controllers and 2 
payloads. All nodes are 64-bit.
2PBE is enabled with 25K objects.

Issue Observed:
A cluster reset occurred while invoking continuous failovers.

Attachments:
Syslogs for SC-1 and SC-2 are attached.
Traces for immnd and immd can be shared separately if required.

Steps:
* Initially SC-1 is active and SC-2 is standby
* A test script invoked a failover by killing osafclmd on SC-1
* SC-2 became active

Oct  7 18:23:32 OSAF-SC1 root: killing osafclmd from invoke_failover.sh
Oct  7 19:25:20 OSAF-SC2 osafamfd[2191]: NO FAILOVER StandBy --> Active

* On the new active controller, saImmOiInitialize_2 failed

Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init() Fail
Oct  7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 333 (safLckService) <299, 2020f>
Oct  7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 334 (safEvtService) <298, 2020f>
Oct  7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct  7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init() Fail
Oct  7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA MDS Send Failed
Oct  7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA Error code 2 returned for message type 4 - ignoring

* Other services also fail to initialize with IMM on the new active 
controller, i.e. SC-2

* Finally, the SMF component faulted with a CSI set callback timeout
* SC-2 rebooted, and because SC-2 was the only active controller at the 
time, the entire cluster reset

Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: NO 'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: ER safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:csiSetcallbackTimeout Recovery is:nodeFailfast
Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
Oct  7 19:25:51 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60
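
For anyone triaging the attached syslogs, the failure signature described in the steps above (repeated saImmOiInitialize_2 timeouts followed by a csiSetcallbackTimeout with nodeFailfast recovery) can be pulled out with a small script. This is only a rough sketch, not an OpenSAF tool: the function name `summarize`, the regular expression, and the `SAMPLE` list are assumptions made for illustration, and the sample lines are copied from the excerpts in this ticket. It assumes one log record per line, as in a raw syslog file.

```python
import re
from collections import Counter

# Assumed syslog layout, based on the excerpts in this ticket:
# "Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]:\s(?P<msg>.*)$"
)


def summarize(lines):
    """Return (IMM OI init timeouts per process, distinct failfast events)."""
    timeouts = Counter()
    failfast = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines without a "proc[pid]:" prefix
        msg = m.group("msg")
        if "saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT" in msg:
            timeouts[m.group("proc")] += 1
        elif "csiSetcallbackTimeout" in msg and "nodeFailfast" in msg:
            event = (m.group("ts"), m.group("host"), m.group("proc"))
            # amfnd logs the fault at both NO and ER level; record it once
            if event not in failfast:
                failfast.append(event)
    return timeouts, failfast


# Sample records copied from this ticket (unwrapped to one record per line).
SAMPLE = [
    "Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init "
    "saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)",
    "Oct  7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init() Fail",
    "Oct  7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init "
    "saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)",
    "Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: NO "
    "'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to "
    "'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'",
    "Oct  7 19:25:51 OSAF-SC2 osafamfnd[2205]: ER "
    "safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due "
    "to:csiSetcallbackTimeout Recovery is:nodeFailfast",
    "Oct  7 19:25:51 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60",
]

if __name__ == "__main__":
    timeouts, failfast = summarize(SAMPLE)
    print(dict(timeouts))
    print(failfast)
```

To run it against an attached log instead of the sample, pass the file object, for example `summarize(open("SC2_syslog.txt"))`.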



