- **summary**: Cluster reset happened during headless as CLMNA faulted due to
csiSetcallbackTimeout --> Cluster reset happened during headless as CLMNA
faulted due to healthCheckcallbackTimeout
- Description has changed:
Diff:
--- old
+++ new
@@ -4,13 +4,13 @@
Setup : 5 nodes ( 3 controllers and 2 payloads with headless feature enabled &
1PBE with 10K objects
#Summary :
-Cluster reset happend during headless as CLMNA faulted due to
csiSetcallbackTimeout
+Cluster reset happend during headless as CLMNA faulted due to
healthCheckcallbackTimeout
#Steps followed & Observed behaviour
1. Invoked headless by killing Active followed by Standby and Spare Controller,
maintaining gap of 6 sec between controller reboot
-2. After couple of failover, CLMNA faulted on PL-4 and PL-5 due to
csiSetcallbackTimeout, and cluster reset happened.
+2. After couple of failover, CLMNA faulted on PL-4 and PL-5 due to
healthCheckcallbackTimeout, and cluster reset happened.
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO SU failover probation timer
started (timeout: 12000 ns)
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO Performing failover of
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (SU failover count: 1)
---
**[tickets:#2025] Cluster reset happened during headless as CLMNA faulted due
to healthCheckcallbackTimeout**
**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Mon Sep 12, 2016 07:32 AM UTC by Ritu Raj
**Last Updated:** Mon Sep 12, 2016 07:32 AM UTC
**Owner:** nobody
**Attachments:**
- [PL-4.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2025/attachment/PL-4.tar.bz2) (38.4 kB; application/x-bzip)
- [PL-5.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2025/attachment/PL-5.tar.bz2) (59.0 kB; application/x-bzip)
- [SC-1.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2025/attachment/SC-1.tar.bz2) (160.7 kB; application/x-bzip)
- [SC-2.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2025/attachment/SC-2.tar.bz2) (107.4 kB; application/x-bzip)
- [SC-3.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2025/attachment/SC-3.tar.bz2) (109.8 kB; application/x-bzip)
#Environment details
OS : SUSE 64-bit
Changeset : 7997 (5.1.FC)
Setup : 5 nodes (3 controllers and 2 payloads) with headless feature enabled &
1PBE with 10K objects
#Summary :
Cluster reset happened during headless as CLMNA faulted due to
healthCheckcallbackTimeout
#Steps followed & Observed behaviour
1. Invoked headless by killing the Active controller, followed by the Standby
and Spare controllers, maintaining a gap of 6 sec between controller reboots.
2. After a couple of failovers, CLMNA faulted on PL-4 and PL-5 due to
healthCheckcallbackTimeout, and a cluster reset happened.
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO SU failover probation timer started (timeout: 12000 ns)
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO Performing failover of 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (SU failover count: 1)
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO 'safComp=CLMNA,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' recovery action escalated from 'componentFailover' to 'suFailover'
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO 'safComp=CLMNA,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'healthCheckcallbackTimeout' : Recovery is 'suFailover'
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: ER safComp=CLMNA,safSu=PL-4,safSg=NoRed,safApp=OpenSAF Faulted due to:healthCheckcallbackTimeout Recovery is:suFailover
Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: Rebooting OpenSAF NodeId = 132111 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 132111, SupervisionTime = 60
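
For anyone going through the attached syslogs, the fault entries can be pulled out with a small script. This is only a rough sketch: the regular expression is inferred from the osafamfnd line quoted above (it is not an official log format), and the names `FAULT_RE` and `find_faults` are made up for illustration. It assumes each syslog entry is on a single line, as in the raw log files.

```python
import re

# Pattern inferred from the quoted osafamfnd "faulted due to" line;
# this is an assumption, not a documented OpenSAF log format.
FAULT_RE = re.compile(
    r"'(?P<comp>safComp=[^']+)' faulted due to '(?P<reason>[^']+)' : "
    r"Recovery is '(?P<recovery>[^']+)'"
)


def find_faults(lines):
    """Return (component, reason, recovery) tuples for matching log lines."""
    faults = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            faults.append(
                (match.group("comp"), match.group("reason"), match.group("recovery"))
            )
    return faults


# One entry copied from the log excerpt in this ticket.
sample = [
    "Sep 10 17:52:46 SCALE_SLOT-74 osafamfnd[12421]: NO "
    "'safComp=CLMNA,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to "
    "'healthCheckcallbackTimeout' : Recovery is 'suFailover'",
]

for comp, reason, recovery in find_faults(sample):
    print(comp, reason, recovery)
```

Running the same function over the PL-5 syslog should show whether CLMNA faulted there with the same reason and recovery.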
Notes:
1. The system clocks are not synchronized across the nodes.
With respect to PL-4 (Sep 10 17:52:46 SCALE_SLOT-74), the corresponding times on
the other systems are:
Sep 27 18:46:53: SC-1
Oct 03 10:02:54: SC-2
Oct 03 10:26:44: SC-3
Sep 10 17:54:46: PL-5
No syslog entries were logged on the controllers during the above time.
2. Syslogs of SC-1, SC-2, SC-3, PL-4 and PL-5 are attached.
3. clmnd traces were not enabled.
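
Because the node clocks differ, timestamps in the attached syslogs have to be shifted before they can be compared. A minimal sketch of that arithmetic follows, using only the times listed in note 1. The year 2016 is an assumption taken from the ticket date (syslog timestamps omit it), the helper names `parse` and `to_reference` are hypothetical, and month parsing with `%b` assumes an English/C locale.

```python
from datetime import datetime

YEAR = 2016  # assumption: syslog timestamps omit the year; taken from the ticket date
FMT = "%Y %b %d %H:%M:%S"


def parse(ts):
    """Parse a syslog-style timestamp such as 'Sep 10 17:52:46'."""
    return datetime.strptime(f"{YEAR} {ts}", FMT)


# Reference clock: PL-4 at the moment CLMNA faulted.
reference = parse("Sep 10 17:52:46")

# Local time on each other node at that same moment (from note 1).
others = {
    "SC-1": "Sep 27 18:46:53",
    "SC-2": "Oct 03 10:02:54",
    "SC-3": "Oct 03 10:26:44",
    "PL-5": "Sep 10 17:54:46",
}

# Offset of each node's clock relative to PL-4.
offsets = {node: parse(ts) - reference for node, ts in others.items()}


def to_reference(node, ts):
    """Convert a timestamp from a node's local clock to PL-4 time."""
    return parse(ts) - offsets[node]


for node, delta in offsets.items():
    print(f"{node} is ahead of PL-4 by {delta}")
```

For example, `to_reference("SC-1", "Sep 27 18:46:53")` maps back to the PL-4 fault time, so any SC-1 syslog entry can be placed on the PL-4 timeline the same way.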