- **status**: unassigned --> invalid
- **Comment**:

Ticket #1219 is closed as invalid.
Re-open the ticket if the problem is observed again.



---

** [tickets:#1835] Imm: Immd healthcheck callback timed out on active 
controller when starting opensaf on PL-4 and stopping opensaf on PL-3 
simultaneously.**

**Status:** invalid
**Milestone:** 5.0.2
**Created:** Tue May 17, 2016 10:27 AM UTC by Madhurika Koppula
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- [messages_SC-1](https://sourceforge.net/p/opensaf/tickets/1835/attachment/messages_SC-1) (794.5 kB; application/octet-stream)


Setup:
Changeset - 7613
Version - opensaf 5.0
4-node cluster with a single PBE.

Reproducible steps:

1) Bring up the active controller, the standby controller and one payload, PL-3.
2) Now bring up payload PL-4 and stop opensaf on payload PL-3 during the Immnd 
start-up sync of PL-4.
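
The two actions in step 2 can be issued at nearly the same moment from a test host. The following is a minimal sketch, not part of the original report. It assumes passwordless ssh to hosts named `PL-4` and `PL-3` and that OpenSAF is controlled through `/etc/init.d/opensafd`; the helper names `build_commands` and `run_simultaneously` are hypothetical. It does not guarantee that the stop lands inside the Immnd sync window, so several attempts may be needed.

```python
import threading

# Assumption: OpenSAF is started and stopped through this init script.
# Adjust the path if the installation uses a different service mechanism.
OPENSAF_INIT = "/etc/init.d/opensafd"


def build_commands(start_host, stop_host):
    """Return the ssh command line for each host, keyed by host name."""
    return {
        start_host: ["ssh", start_host, OPENSAF_INIT, "start"],
        stop_host: ["ssh", stop_host, OPENSAF_INIT, "stop"],
    }


def run_simultaneously(commands, runner):
    """Run each command in its own thread; a barrier releases them together.

    `runner` is any callable taking an argv list, e.g. subprocess.call.
    Returns a dict mapping host name to the runner's return value.
    """
    barrier = threading.Barrier(len(commands))
    results = {}

    def worker(host, argv):
        barrier.wait()  # all workers proceed at the same moment
        results[host] = runner(argv)

    threads = [
        threading.Thread(target=worker, args=(host, argv))
        for host, argv in commands.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Passing `subprocess.call` as `runner` to `run_simultaneously(build_commands("PL-4", "PL-3"), subprocess.call)` would start opensaf on PL-4 and stop it on PL-3 and return the two exit codes. Keeping the runner injectable lets the sketch be exercised without touching a real cluster.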

Below is the syslog snippet showing the Immd healthcheck callback time-out on the 
active controller SC-1.


May 17 15:00:25 REG-S1 osafmsgd[11279]: ER saImmOiImplementerSet failed with return value=6
May 17 15:01:35 REG-S1 osafimmloadd: ER Too many TRY_AGAIN on saImmOmSearchNext - aborting
May 17 15:01:35 REG-S1 osafimmnd[11165]: ER SYNC APPARENTLY FAILED status:1
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO -SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE (2761)
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO Epoch set to 8 in ImmModel
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO Coord broadcasting ABORT_SYNC, epoch:8

May 17 15:05:13 REG-S1 osafamfnd[11227]: NO SU failover probation timer started (timeout: 1200000000000 ns)
May 17 15:05:13 REG-S1 osafamfnd[11227]: NO Performing failover of 'safSu=SC-1,safSg=2N,safApp=OpenSAF' (SU failover count: 1)

**May 17 15:05:13 REG-S1 osafamfnd[11227]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated from 'componentFailover' to 'suFailover'
May 17 15:05:13 REG-S1 osafamfnd[11227]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'healthCheckcallbackTimeout' : Recovery is 'suFailover'
May 17 15:05:13 REG-S1 osafamfnd[11227]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:healthCheckcallbackTimeout Recovery is:suFailover**

May 17 15:05:13 REG-S1 osafamfnd[11227]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
May 17 15:05:13 REG-S1 opensaf_reboot: Rebooting local node; timeout=60
May 17 15:05:17 REG-S1 kernel: [21682.049674] md: stopping all md devices.

The syslog of the active controller is attached.
The Immnd traces are too large to attach.
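
Because the attached syslog is large, a small filter can help locate the component-fault entries quoted in the snippet. The following is a minimal sketch, assuming each syslog entry sits on a single line in the file (unlike the wrapped lines in this mail) and follows the `osafamfnd` format shown above; the names `FAULT_RE` and `find_faults` are hypothetical.

```python
import re

# Matches the osafamfnd "NO" fault line as it appears in the snippet, e.g.
#   ... osafamfnd[11227]: NO '<component DN>' faulted due to
#   '<cause>' : Recovery is '<recovery>'
FAULT_RE = re.compile(
    r"osafamfnd\[\d+\]: NO '(?P<comp>[^']+)' faulted due to "
    r"'(?P<cause>[^']+)' : Recovery is '(?P<recovery>[^']+)'"
)


def find_faults(lines):
    """Yield (component, cause, recovery) for every matching fault line."""
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            yield match.group("comp"), match.group("cause"), match.group("recovery")
```

Opening the attachment with `open("messages_SC-1")` and passing the file object to `find_faults` would list each faulted component together with its cause and recovery action.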

