- **Milestone**: 4.6.2 --> 4.7.2
---
**[tickets:#1529] Node rebooted as saImmOiInitialize_2 failed during middleware active assignment**
**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Oct 08, 2015 07:53 AM UTC by Chani Srivastava
**Last Updated:** Mon Nov 02, 2015 09:08 AM UTC
**Owner:** nobody
**Attachments:**
- [1529.tgz](https://sourceforge.net/p/opensaf/tickets/1529/attachment/1529.tgz) (586.3 kB; application/x-compressed-tar)
- [SC1_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC1_syslog.txt) (436.4 kB; text/plain)
- [SC2_syslog.txt](https://sourceforge.net/p/opensaf/tickets/1529/attachment/SC2_syslog.txt) (425.6 kB; text/plain)
Setup:
Changeset-6901
Invoked continuous failovers on a 4-node cluster with 2 controllers and 2
payloads. All nodes have a 64-bit architecture.
2PBE enabled with 25K objects
Issue Observed:
Cluster reset occurred on invoking continuous failovers
Attachments:
Syslogs for SC-1 and SC-2 are attached.
Traces for immnd and immd can be shared separately if required.
Steps:
* Initially SC-1 is active and SC-2 standby
* A test script invoked a failover by killing osafclmd on SC-1
* SC-2 became active
Oct 7 18:23:32 OSAF-SC1 root: killing osafclmd from invoke_failover.sh
Oct 7 19:25:20 OSAF-SC2 osafamfd[2191]: NO FAILOVER StandBy --> Active
* On the new active controller, saImmOiInitialize_2 failed
Oct 7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct 7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ntfimcn_imm_init() Fail
Oct 7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 333 (safLckService) <299, 2020f>
Oct 7 19:25:22 OSAF-SC2 osafimmnd[2131]: NO Implementer connected: 334 (safEvtService) <298, 2020f>
Oct 7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT (5)
Oct 7 19:25:23 OSAF-SC2 osafntfimcnd[2738]: ER ntfimcn_imm_init() Fail
Oct 7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA MDS Send Failed
Oct 7 19:25:23 OSAF-SC2 osafimmnd[2131]: WA Error code 2 returned for message type 4 - ignoring
* Other services also failed to initialize with IMM on the new active
controller, i.e. SC-2
* Finally, SMF hit a CSI set callback timeout
* SC-2 rebooted, and hence the entire cluster reset, because SC-2 was the only
active controller at the time
Oct 7 19:25:51 OSAF-SC2 osafamfnd[2205]: NO 'safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Oct 7 19:25:51 OSAF-SC2 osafamfnd[2205]: ER safComp=SMF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:csiSetcallbackTimeout Recovery is:nodeFailfast
Oct 7 19:25:51 OSAF-SC2 osafamfnd[2205]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
Oct 7 19:25:51 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60
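
For anyone triaging the attached syslogs, the sequence in the steps above (failover, saImmOiInitialize_2 timeouts, CSI set callback timeout, reboot) could be pulled out with a small script along these lines. This is only a rough sketch and not part of OpenSAF. The function names, the `MARKERS` table and the default year (2015, taken from the ticket creation date, since syslog timestamps carry no year) are assumptions made for illustration. It also assumes each log record sits on a single line, as in the attached files.

```python
import re
from datetime import datetime

# Syslog prefix as seen in the excerpts, e.g.
# "Oct 7 19:25:22 OSAF-SC2 osafntfimcnd[2735]: ER ..."
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: (?P<msg>.*)$"
)

# Substrings copied from the log excerpts in this ticket; the event
# names on the left are hypothetical labels chosen for this sketch.
MARKERS = {
    "failover": "FAILOVER StandBy --> Active",
    "imm_oi_init_timeout": "saImmOiInitialize_2 failed SA_AIS_ERR_TIMEOUT",
    "csi_set_timeout": "csiSetcallbackTimeout",
    "reboot": "Rebooting local node",
}


def parse_line(line, year=2015):
    """Split one syslog line into (timestamp, host, process, message).

    Returns None for lines that do not match the expected prefix.
    The year is an assumption because syslog timestamps omit it.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["host"], m["proc"], m["msg"]


def timeline(lines, year=2015):
    """Return (timestamp, host, process, event_name) for each marker hit."""
    events = []
    for line in lines:
        parsed = parse_line(line.strip(), year)
        if parsed is None:
            continue
        ts, host, proc, msg = parsed
        for name, needle in MARKERS.items():
            if needle in msg:
                events.append((ts, host, proc, name))
                break
    return events
```

Run over the SC-2 lines quoted in this ticket, the first and last events are the failover at 19:25:20 and the reboot at 19:25:51, i.e. 31 seconds apart. An open file can be passed directly, for example `timeline(open("SC2_syslog.txt"))`.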