- **status**: unassigned --> assigned
- **assigned_to**: Srinivas Siva Mangipudy
- **Blocker**:  --> False



---

**[tickets:#682] LOG: New Active reboots when coordinator IMMND is killed in 
the middle of switchover**

**Status:** assigned
**Milestone:** future
**Created:** Fri Dec 20, 2013 05:27 AM UTC by Sirisha Alla
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Srinivas Siva Mangipudy
**Attachments:**

- [logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/682/attachment/logs.tar.bz2) (4.3 MB; application/x-bzip)
- [tic682.tgz](https://sourceforge.net/p/opensaf/tickets/682/attachment/tic682.tgz) (208.3 kB; application/x-compressed-tar)


The issue is observed on changeset 4733 plus the #220 patches corresponding to 
cs 4741 and cs 4742. The test setup consists of four SLES 64-bit VMs. The setup 
has single PBE enabled and is loaded with 25k objects.

SC-2 (SLES-64BIT-SLOT2) is Active and the IMMND coordinator is hosted on 
SC-1 (SLES-64BIT-SLOT1). A controller switchover is initiated and immnd is 
killed on SC-1 while the switchover is in progress. SC-1 then reboots because 
of the CSI set callback timeout of logd.

/var/log/messages of SC-1 and SC-2 for the steps above:

SC-2:

Dec 19 17:21:36 SLES-64BIT-SLOT2 osafamfd[3609]: NO safSi=SC-2N,safApp=OpenSAF 
Swap initiated
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafamfnd[3619]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' QUIESCED to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
18 <320, 2020f> (safMsgGrpService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'SaSmfCampaign' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
22 <319, 2020f> (safEvtService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
23 <3, 2020f> (safLogService)
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'OpenSafSmfConfig' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO implementer for class 
'SaSmfSwBundle' is released => class extent is UNSAFE
Dec 19 17:21:36 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
24 <298, 2020f> (safSmfService)
Dec 19 17:21:37 SLES-64BIT-SLOT2 osafimmnd[3554]: NO Implementer disconnected 
20 <303, 2020f> (safLckService)
......

SC-1:

Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
18 <0, 2020f> (safMsgGrpService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'SaSmfCampaign' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
22 <0, 2020f> (safEvtService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
23 <0, 2020f> (safLogService)
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'OpenSafSmfConfig' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO implementer for class 
'SaSmfSwBundle' is released => class extent is UNSAFE
Dec 19 17:21:38 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
24 <0, 2020f> (safSmfService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
20 <0, 2020f> (safLckService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
19 <0, 2020f> (safCheckPointService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmnd[3498]: NO Implementer disconnected 
21 <0, 2020f> (safClmService)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmpbed: WA PBE lost contact with parent 
IMMND - Exiting
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfnd[3578]: NO 
'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafntfimcnd[3829]: ER saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfd[3565]: NO Re-initializing with IMM
Dec 19 17:21:39 SLES-64BIT-SLOT1 osafimmd[3488]: NO IMMND coord at 2020f

Dec 19 17:21:49 SLES-64BIT-SLOT1 osafimmnd[3953]: NO Implementer connected: 40 
(OpenSafImmPBE) <0, 2020f>
Dec 19 17:21:49 SLES-64BIT-SLOT1 osafamfd[3565]: NO Finished re-initializing 
with IMM
Dec 19 17:21:50 SLES-64BIT-SLOT1 osafimmnd[3953]: NO PBE-OI established on 
other SC. Dumping incrementally to file imm.db
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: NO 
'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: ER 
safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackTimeout Recovery is:nodeFailfast
Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
Dec 19 17:23:40 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60
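
For anyone re-checking the attached syslogs, a minimal Python sketch like the one below may help pull the component fault events out of wrapped syslog text such as the SC-1 excerpt above. The helper names (`unwrap`, `faults`) and the regular expressions are illustrative assumptions, not part of OpenSAF; the sample lines are copied from the excerpt.

```python
import re

# Assumption: a syslog entry starts with a "Mon DD HH:MM:SS " timestamp and any
# other non-empty line is the wrapped tail of the previous entry.
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} ")
# Matches the osafamfnd "NO" fault message format seen in the excerpt.
FAULT_RE = re.compile(
    r"'(safComp=[^']+)' faulted due to\s+'(\w+)'\s*: Recovery is '(\w+)'"
)

def unwrap(lines):
    """Join wrapped continuation lines back onto their syslog entry."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries

def faults(entries):
    """Return (timestamp, component DN, fault cause, recovery) tuples."""
    found = []
    for entry in entries:
        m = FAULT_RE.search(entry)
        if m:
            found.append((entry[:15], m.group(1), m.group(2), m.group(3)))
    return found

sample = [
    "Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfnd[3578]: NO ",
    "'safComp=IMMND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' ",
    ": Recovery is 'componentRestart'",
    "Dec 19 17:21:39 SLES-64BIT-SLOT1 osafamfd[3565]: NO Re-initializing with IMM",
    "Dec 19 17:23:40 SLES-64BIT-SLOT1 osafamfnd[3578]: NO ",
    "'safComp=LOG,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to ",
    "'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'",
]

fault_events = faults(unwrap(sample))
for when, comp, cause, recovery in fault_events:
    print(when, comp, cause, recovery)
```

On the sample this lists the IMMND `avaDown` fault at 17:21:39 and the LOG `csiSetcallbackTimeout` fault at 17:23:40, the two events that bracket the failure on SC-1.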

When the LOGD trace is examined, it contains no information about the failure 
at that point in time.

Dec 19 17:21:48.406386 osaflogd [3518:imma_oi_api.c:0445] >> saImmOiDispatch
Dec 19 17:21:48.406468 osaflogd [3518:imma_oi_api.c:0572] << saImmOiDispatch
Dec 19 17:21:48.406494 osaflogd [3518:lgs_main.c:0373] << imm_reinit_thread
Dec 19 17:21:48.406619 osaflogd [3518:lgs_imm.c:2134] >> imm_impl_set
Dec 19 17:21:48.417979 osaflogd [3518:lgs_imm.c:2182] << imm_impl_set
Dec 19 17:24:31.724994 osaflogd [2427:lgs_main.c:0213] >> log_initialize
Dec 19 17:24:32.311734 osaflogd [2427:lgs_file.c:0262] >> lgs_file_init
Dec 19 17:24:32.311823 osaflogd [2427:lgs_imm.c:1579] >> read_logsv_config_obj: 
(logConfig=1,safApp=safLogService)
Dec 19 17:24:32.311872 osaflogd [2427:imma_om_api.c:0140] >> saImmOmInitialize
Dec 19 17:24:32.311914 osaflogd [2427:imma_om_api.c:0167] TR OM client version 
A.2.11 or higher
Dec 19 17:24:32.311936 osaflogd [2427:imma_om_api.c:0192] >> initialize_common
Dec 19 17:24:32.311957 osaflogd [2427:imma_init.c:0261] >> imma_startup: use 
count 0
Dec 19 17:24:32.311985 osaflogd [2427:ncs_main_pub.c:0223] TR
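
The silence in the trace can be located mechanically. The following is a rough Python sketch, assuming the trace line layout shown above; `parse_trace` and `largest_gap` are hypothetical helper names, and the year 2013 is an assumption taken from the ticket creation date because the trace lines carry no year.

```python
import re
from datetime import datetime

# Matches the timestamp and PID of an osaflogd trace line, for example
# "Dec 19 17:21:48.417979 osaflogd [3518:lgs_imm.c:2182] << imm_impl_set"
TRACE_RE = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{6}) osaflogd \[(\d+):"
)

def parse_trace(lines, year=2013):
    """Return (timestamp, pid) for each line that starts a trace entry."""
    entries = []
    for line in lines:
        m = TRACE_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S.%f")
            entries.append((ts, int(m.group(2))))
    return entries

def largest_gap(entries):
    """Return (seconds, before, after) for the widest gap between neighbours."""
    best = None
    for before, after in zip(entries, entries[1:]):
        gap = (after[0] - before[0]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best

sample = [
    "Dec 19 17:21:48.406619 osaflogd [3518:lgs_imm.c:2134] >> imm_impl_set",
    "Dec 19 17:21:48.417979 osaflogd [3518:lgs_imm.c:2182] << imm_impl_set",
    "Dec 19 17:24:31.724994 osaflogd [2427:lgs_main.c:0213] >> log_initialize",
]

gap, before, after = largest_gap(parse_trace(sample))
print(f"{gap:.1f}s silent between pid {before[1]} and pid {after[1]}")
```

On these three lines the widest gap is about 163 seconds, between the last entry of PID 3518 (`imm_impl_set` returning) and the first entry of PID 2427 (`log_initialize`), which matches the window in which the syslog reports the `csiSetcallbackTimeout` and the node reboot.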

The switchover operation timed out. This issue is reproducible. Attached are 
the syslogs and logd trace from SC-1 and the IMMND traces from both controllers.

