Neel,

I have seen another issue after a switchover. This time CLM failed with 
BAD_HANDLE on classImplSet(). Is this issue also resolved by #946 or 
should we open a new ticket for it?

Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: NO Implementer locally 
disconnected. Marking it as doomed 26 <5, 2020f> (safClmService)
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmpbed: IN Delaying class delete 
at slave PBE due to ongoing commit of ccb:15/21
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: NO Implementer 
disconnected 26 <5, 2020f> (safClmService)
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: NO Implementer 
connected: 30 (safClmService) <324, 2020f>
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafclmd[9606]: ER 
saImmOiClassImplementerSet failed for class SaClmNode rc:9, exiting
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafamfnd[9639]: NO 
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 
'avaDown' : Recovery is 'nodeFailfast'
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafamfnd[9639]: ER 
safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown 
Recovery is:nodeFailfast
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafamfnd[9639]: Rebooting OpenSAF 
NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node 
failfast, OwnNodeId = 131599, SupervisionTime = 60
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: NO Implementer locally 
disconnected. Marking it as doomed 30 <324, 2020f> (safClmService)
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: WA IMMND - Client went 
down so no response
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: WA IMMND - Client went 
down so no response
Sep 11 14:58:24 SLES-64BIT-SLOT2 osafimmnd[9565]: NO Implementer 
disconnected 30 <324, 2020f> (safClmService)
Sep 11 14:58:24 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
timeout=60

clmd traces on that node

Sep 11 14:58:24.715267 osafclmd [9606:clms_imm.c:0530] TR Implementer 
not yet set: Switching on the tryagain flag
Sep 11 14:58:24.715296 osafclmd [9606:clms_imm.c:0533] << 
clms_node_update_rattr
Sep 11 14:58:24.715319 osafclmd [9606:clms_imm.c:0449] >> 
clms_admin_state_update_rattr: Admin state 1 update for node 
safNode=SC-1,safCluster=myClmCluster
Sep 11 14:58:24.715343 osafclmd [9606:clms_util.c:0036] >> 
clms_node_get_by_name: name input safNode=SC-1,safCluster=myClmCluster 
length 36
Sep 11 14:58:24.715369 osafclmd [9606:clms_util.c:0044] TR nodename 
after patricia tree get safNode=SC-1,safCluster=myClmCluster
Sep 11 14:58:24.715393 osafclmd [9606:clms_util.c:0047] << 
clms_node_get_by_name
Sep 11 14:58:24.715421 osafclmd [9606:imma_oi_api.c:2275] T2 
ERR_BAD_HANDLE: No initialized handle exists!
Sep 11 14:58:24.715452 osafclmd [9606:clms_imm.c:0488] IN 
saImmOiRtObjectUpdate for safNode=SC-1,safCluster=myClmCluster failed 
with rc = 9. Reinit with IMM
Sep 11 14:58:24.715480 osafclmd [9606:imma_oi_api.c:0622] >> saImmOiFinalize
Sep 11 14:58:24.715505 osafclmd [9606:imma_oi_api.c:0626] T2 
ERR_BAD_HANDLE: No initialized handle exists!

....

Sep 11 14:58:24.719788 osafclmd [9606:clms_imm.c:0721] TR Implementer is 
not set. Switching on flag in clms_cluster_update_rattr
Sep 11 14:58:24.719801 osafclmd [9606:clms_imm.c:0724] << 
clms_cluster_update_rattr
Sep 11 14:58:24.719815 osafclmd [9606:clms_imm.c:0633] << 
clms_retry_pending_rtupdates
Sep 11 14:58:24.719828 osafclmd [9606:clms_main.c:0435] TR There is an 
IMM task to be tried again. setting poll time out to 500

.....
Sep 11 14:58:24.740335 osafclmd [9606:clms_imm.c:0620] << 
clms_retry_pending_rtupdates: Implementerset yet to happen, try later
Sep 11 14:58:24.740345 osafclmd [9606:clms_main.c:0435] TR There is an 
IMM task to be tried again. setting poll time out to 500
Sep 11 14:58:24.744013 osafclmd [9606:imma_oi_api.c:1636] T2 
ERR_BAD_HANDLE: No implementer is set for this handle
Sep 11 14:58:24.744055 osafclmd [9606:clms_imm.c:2291] ER 
saImmOiClassImplementerSet failed for class SaClmNode rc:9, exiting

Regards,
Sirisha


On Wednesday 10 September 2014 09:28 PM, Neelakanta Reddy wrote:
> A.  SLOT1 node went down:
>
> 1. CLM got BAD_HANDLE and finalizes the handle
>
> Sep 10 14:56:51.543332 osafclmd [7511:imma_oi_api.c:0622] >> saImmOiFinalize
> Sep 10 14:56:51.543370 osafclmd [7511:imma_oi_api.c:0626] T2 ERR_BAD_HANDLE: 
> No initialized handle exists!
>
> 2. Discard implementer is called
>
> Sep 10 14:56:51.538179 osafimmnd [7448:immsv_evt.c:5363] T8 Sending:  
> IMMD_EVT_ND2D_DISCARD_IMPL to 0
> Sep 10 14:56:51.539878 osafimmnd [7448:ImmModel.cc:11474] >> 
> discardImplementer
> Sep 10 14:56:51.539994 osafimmnd [7448:ImmModel.cc:11510] NO Implementer 
> locally disconnected. Marking it as doomed 190 <17, 2010f> (safClmService)
> Sep 10 14:56:51.540181 osafimmnd [7448:ImmModel.cc:11534] << 
> discardImplementer
>
> 3. But the implemnter actually got disconnected at
>
> Sep 10 14:56:51.580449 osafimmnd [7448:immnd_evt.c:8588] T2 Global discard 
> implementer for id:190
> Sep 10 14:56:51.580462 osafimmnd [7448:ImmModel.cc:11474] >> 
> discardImplementer
> Sep 10 14:56:51.580496 osafimmnd [7448:ImmModel.cc:11481] NO Implementer 
> disconnected 190 <17, 2010f> (safClmService)
> Sep 10 14:56:51.580518 osafimmnd [7448:ImmModel.cc:11534] << 
> discardImplementer
>
> 4. CLM tries to re-initializes and receives ERR_EXISTS
>
>
> Sep 10 14:56:51.551900 osafclmd [7511:imma_oi_api.c:0440] << 
> saImmOiSelectionObjectGet
> Sep 10 14:56:51.551942 osafclmd [7511:clms_imm.c:2286] ER 
> saImmOiImplementerSet failed rc:14, exiting
> Sep 10 14:59:51.245982 osafclmd [2538:clms_main.c:0267] >> clms_init
>
>
> Sep 10 14:56:51.548981 osafimmnd [7448:immsv_evt.c:5382] T8 Received: 
> IMMND_EVT_A2ND_OI_IMPL_SET (40) from 2010f
> Sep 10 14:56:51.549023 osafimmnd [7448:immnd_evt.c:2471] T2 SENDRSP FAIL 14
>
>
> #946 fixes the above problem in CLM
>
>
> B. Slot2 node went down(Quiesced --> Active)
>
> 1. Sep 10 14:56:57.681152 osafamfd [6896:role.cc:0375] NO FAILOVER Quiesced 
> --> Active
>
> 2. saImmOiRtObjectUpdate_2 got BAD_HANDLE so AMFD  tries to re-initialize 
> with IMM and calls avd_imm_reinit_bg
>
> Sep 10 14:56:57.701333 osafamfd [6896:imma_oi_api.c:2279] >> 
> saImmOiRtObjectUpdate_2
> Sep 10 14:56:57.701344 osafamfd [6896:imma_oi_api.c:2345] T2 ERR_BAD_HANDLE: 
> The SaImmOiHandleT is not associated with any implementer name
> Sep 10 14:56:57.701353 osafamfd [6896:imma_oi_api.c:2554] << 
> saImmOiRtObjectUpdate_2
> Sep 10 14:56:57.701362 osafamfd [6896:imm.cc:0164] TR BADHANDLE
> Sep 10 14:56:57.701370 osafamfd [6896:imm.cc:1660] >> avd_imm_reinit_bg
> Sep 10 14:56:57.701406 osafamfd [6896:imm.cc:1662] NO Re-initializing with IMM
> Sep 10 14:56:57.701420 osafamfd [6896:imma_oi_api.c:0622] >> saImmOiFinalize
>
> 3. Before the finalize is not completed in clearing the OI handle, impl_set 
> is called by AMFD in the function avd_role_failover_qsd_actv(calling 
> avd_imm_impl_set_task_create). Because of this amfd exited.
>
> Sep 10 14:56:57.701178 osafamfd [6896:role.cc:0498] << 
> avd_role_failover_qsd_actv
>
>
> Sep 10 14:56:57.702256 osafamfd [6896:imm.cc:1215] >> avd_imm_impl_set
> Sep 10 14:56:57.702273 osafamfd [6896:imma_oi_api.c:1281] T4 ERR_LIBRARY: 
> Overlapping use of IMM OI handle by multiple threads
> Sep 10 14:56:57.703683 osafamfd [6896:imm.cc:1218] ER saImmOiImplementerSet 
> failed 2
> Sep 10 14:56:57.703788 osafamfd [6896:imm.cc:1288] ER exiting since 
> avd_imm_impl_set failed
>
>
> Because of using shared Oihandle, across multiple threads in AMFD the 
> saImmOiImplementerSet failed with ERR_LIBRARY.
>
>
>
> ---
>
> ** [tickets:#1059] 2PBE: cluster reset observed during switchovers**
>
> **Status:** unassigned
> **Milestone:** 4.3.3
> **Created:** Wed Sep 10, 2014 09:57 AM UTC by Sirisha Alla
> **Last Updated:** Wed Sep 10, 2014 10:29 AM UTC
> **Owner:** nobody
>
> The issue is seen on SLES X86. OpenSAF is running with changeset 5697 with 
> 2PBE with 50k application objects.
>
> Switchovers with IMM application running is in progress when the issue is 
> observed.
>
> Syslog on SC-1:
>
> Sep 10 14:56:47 SLES-64BIT-SLOT1 osafamfnd[7540]: NO Assigned 
> 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
> Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
> 182 <0, 2020f> (@OpenSafImmReplicatorB)
> Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
> commit for PRTA update Ccb:100000063/4294967395
> Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on 
> attempt to start prepare of ccb:100000063/4294967395
> Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
> commit for Ccb:6/6
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer (applier) 
> connected: 193 (@OpenSafImmReplicatorB) <0, 2020f>
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:6/6
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on 
> attempt to start prepare of ccb:6/6
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
> commit for PRTA update Ccb:100000064/4294967396
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Ccb 6 COMMITTED 
> (SetUp_Ccb)
> Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
> replied with transient error on prepare for ccb:100000064/4294967396
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: WA Start prepare for ccb: 
> 100000064/4294967396 towards slave PBE returned: '6' from sttandby PBE
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA 
> update Ccb:100000064/4294967396 towards PBE-B
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (20) in PRTA 
> update (ccbId:100000064)
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: WA update of PERSISTENT 
> runtime attributes in object 'safNode=PL-4,safCluster=myClmCluster' REVERTED. 
> PBE rc:20
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer locally 
> disconnected. Marking it as doomed 190 <17, 2010f> (safClmService)
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafclmd[7511]: ER saImmOiImplementerSet 
> failed rc:14, exiting
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: NO 
> 'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
> Recovery is 'nodeFailfast'
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: ER 
> safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown 
> Recovery is:nodeFailfast
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: Rebooting OpenSAF NodeId = 
> 131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
> OwnNodeId = 131343, SupervisionTime = 60
> Sep 10 14:56:51 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
> timeout=60
> Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
> 190 <17, 2010f> (safClmService)
> Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: ER PBE PRTAttrs Update 
> continuation missing! invoc:100
> Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
> 16 <0, 2020f> (@OpenSafImmPBE)
> Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
> 17 <0, 2020f> (OsafImmPbeRt_B)
> Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: WA Timeout on syncronous 
> admin operation 108
> Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmpbed: WA Failed to delete class 
> towards slave PBE. Library or immsv replied Rc:5 - ignoring
>
> Syslog on SC-2
>
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER Failed to stop cluster 
> tracking 5
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER ClmTrack stop failed
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: NO Current role: ACTIVE
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: Rebooting OpenSAF NodeId = 
> 131343 EE Name = , Reason: Received Node Down for peer controller, OwnNodeId 
> = 131599, SupervisionTime = 60
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: WA IMMND DOWN on active 
> controller f1 detected at standby immd!! f2. Possible failover
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafrded[6807]: NO RDE role set to QUIESCED
> .....
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: NO Controller Failover: 
> Setting role to ACTIVE
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafrded[6807]: NO RDE role set to ACTIVE
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO ACTIVE request
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osaflogd[6850]: NO ACTIVE request
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfd[6863]: NO ACTIVE request
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafclmd[6877]: NO ACTIVE request
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO ellect_coord invoke from 
> rda_callback ACTIVE
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO New coord elected, 
> resides at 2020f
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO 2PBE configured, 
> IMMSV_PBE_FILE_SUFFIX:.2020f (sync)
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO This IMMND is now the 
> NEW Coord
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Coord broadcasting 
> PBE_PRTO_PURGE_MUTATIONS, epoch:18
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO Node 'SC-1' left the 
> cluster
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: NO Assigning 
> 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfimcnd[7701]: NO exiting on signal 15
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: WA Global PURGE PERSISTENT 
> RTO mutations received in epoch 18
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: WA PBE failed in 
> persistification of class delete fWiTRoVwpQDAfWNBVtqJ
> ......
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO FAILOVER Quiesced --> 
> Active
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER ncs_mbcsv_svc 
> NCS_MBCSV_OP_CHG_ROLE 1 failed
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer connected: 
> 200 (MsgQueueService131343) <410, 2020f>
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer locally 
> disconnected. Marking it as doomed 200 <410, 2020f> (MsgQueueService131343)
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer (applier) 
> connected: 201 (@OpenSafImmReplicatorA) <412, 2020f>
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfimcnd[7722]: NO Started
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer disconnected 
> 184 <10, 2020f> (safAmfService)
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO Re-initializing with IMM
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer disconnected 
> 200 <410, 2020f> (MsgQueueService131343)
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER saImmOiImplementerSet 
> failed 2
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER exiting since 
> avd_imm_impl_set failed
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: ER AMF director 
> unexpectedly crashed
> Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: Rebooting OpenSAF NodeId = 
> 131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) 
> received, OwnNodeId = 131599, SupervisionTime = 60
> Sep 10 14:56:57 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
> timeout=60
>
> Syslog and traces are attached.
>
>
> ---
>
> Sent from sourceforge.net because you indicated interest in 
> <https://sourceforge.net/p/opensaf/tickets/1059/>
>
> To unsubscribe from further messages, please visit 
> <https://sourceforge.net/auth/subscriptions/>




---

** [tickets:#1059] CLM: ERR_EXIST from implementer-set needs to be handled as 
limited TRY_AGAIN.**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Wed Sep 10, 2014 09:57 AM UTC by Sirisha Alla
**Last Updated:** Wed Sep 10, 2014 05:30 PM UTC
**Owner:** nobody

The issue is seen on SLES X86. OpenSAF is running with changeset 5697 with 2PBE 
with 50k application objects.

Switchovers with IMM application running is in progress when the issue is 
observed.

Syslog on SC-1:

Sep 10 14:56:47 SLES-64BIT-SLOT1 osafamfnd[7540]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
182 <0, 2020f> (@OpenSafImmReplicatorB)
Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:100000063/4294967395
Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on 
attempt to start prepare of ccb:100000063/4294967395
Sep 10 14:56:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for Ccb:6/6
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer (applier) 
connected: 193 (@OpenSafImmReplicatorB) <0, 2020f>
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:6/6
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on 
attempt to start prepare of ccb:6/6
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:100000064/4294967396
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Ccb 6 COMMITTED (SetUp_Ccb)
Sep 10 14:56:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 1 or Immsv (6) 
replied with transient error on prepare for ccb:100000064/4294967396
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: WA Start prepare for ccb: 
100000064/4294967396 towards slave PBE returned: '6' from sttandby PBE
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA 
update Ccb:100000064/4294967396 towards PBE-B
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (20) in PRTA update 
(ccbId:100000064)
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-4,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer locally 
disconnected. Marking it as doomed 190 <17, 2010f> (safClmService)
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafclmd[7511]: ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: ER 
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafamfnd[7540]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
Sep 10 14:56:51 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60
Sep 10 14:56:51 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
190 <17, 2010f> (safClmService)
Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: ER PBE PRTAttrs Update 
continuation missing! invoc:100
Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
16 <0, 2020f> (@OpenSafImmPBE)
Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: NO Implementer disconnected 
17 <0, 2020f> (OsafImmPbeRt_B)
Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmnd[7448]: WA Timeout on syncronous 
admin operation 108
Sep 10 14:56:53 SLES-64BIT-SLOT1 osafimmpbed: WA Failed to delete class towards 
slave PBE. Library or immsv replied Rc:5 - ignoring

Syslog on SC-2

Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER Failed to stop cluster 
tracking 5
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER ClmTrack stop failed
Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: NO Current role: ACTIVE
Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Received Node Down for peer controller, OwnNodeId = 
131599, SupervisionTime = 60
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: WA IMMND DOWN on active 
controller f1 detected at standby immd!! f2. Possible failover
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafrded[6807]: NO RDE role set to QUIESCED
.....
Sep 10 14:56:57 SLES-64BIT-SLOT2 osaffmd[6816]: NO Controller Failover: Setting 
role to ACTIVE
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafrded[6807]: NO RDE role set to ACTIVE
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO ACTIVE request
Sep 10 14:56:57 SLES-64BIT-SLOT2 osaflogd[6850]: NO ACTIVE request
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfd[6863]: NO ACTIVE request
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafclmd[6877]: NO ACTIVE request
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO ellect_coord invoke from 
rda_callback ACTIVE
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmd[6826]: NO New coord elected, resides 
at 2020f
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO 2PBE configured, 
IMMSV_PBE_FILE_SUFFIX:.2020f (sync)
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO This IMMND is now the NEW 
Coord
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Coord broadcasting 
PBE_PRTO_PURGE_MUTATIONS, epoch:18
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO Node 'SC-1' left the cluster
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfimcnd[7701]: NO exiting on signal 15
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: WA Global PURGE PERSISTENT 
RTO mutations received in epoch 18
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: WA PBE failed in 
persistification of class delete fWiTRoVwpQDAfWNBVtqJ
......
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO FAILOVER Quiesced --> Active
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER ncs_mbcsv_svc 
NCS_MBCSV_OP_CHG_ROLE 1 failed
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer connected: 200 
(MsgQueueService131343) <410, 2020f>
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer locally 
disconnected. Marking it as doomed 200 <410, 2020f> (MsgQueueService131343)
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer (applier) 
connected: 201 (@OpenSafImmReplicatorA) <412, 2020f>
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafntfimcnd[7722]: NO Started
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer disconnected 
184 <10, 2020f> (safAmfService)
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: NO Re-initializing with IMM
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafimmnd[6836]: NO Implementer disconnected 
200 <410, 2020f> (MsgQueueService131343)
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER saImmOiImplementerSet 
failed 2
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfd[6896]: ER exiting since 
avd_imm_impl_set failed
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: ER AMF director unexpectedly 
crashed
Sep 10 14:56:57 SLES-64BIT-SLOT2 osafamfnd[6906]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) 
received, OwnNodeId = 131599, SupervisionTime = 60
Sep 10 14:56:57 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
timeout=60

Syslog and traces are attached. 


---

Sent from sourceforge.net because [email protected] is 
subscribed to http://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
http://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
------------------------------------------------------------------------------
Want excitement?
Manually upgrade your production database.
When you want reliability, choose Perforce
Perforce version control. Predictably reliable.
http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets

Reply via email to