SC-1:
messages:
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class
RunClassNameForOITesting committing with ccbId:100000186
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE
commit for PRTA update Ccb:100000187/4294967687
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmnd[3145]: NO Create of class
RunClassNameForOITesting is PERSISTENT.
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Slave PBE replied with OK on
attempt to start prepare of ccb:100000187/4294967687
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE
commit for PRTA update Ccb:100000188/4294967688
Nov 13 11:38:25 SLES-64BIT-SLOT1 osafimmnd[3145]: ER PBE PRTAttrs Update
continuation missing! invoc:391
osafimmnd:
1. RT-obj update with ccb:100000187/4294967687
Nov 13 11:38:20.976279 osafimmnd [3145:immnd_evt.c:8610] T2 FEVS from myself,
still pending:0
Nov 13 11:38:20.976342 osafimmnd [3145:immsv_evt.c:5382] T8 Received:
IMMND_EVT_A2ND_OI_OBJ_MODIFY (38) from 0
Nov 13 11:38:20.976382 osafimmnd [3145:immnd_evt.c:6256] >>
immnd_evt_proc_rt_object_modify
Nov 13 11:38:20.976401 osafimmnd [3145:immnd_evt.c:6277] TR We expect there to
be a PBE
Nov 13 11:38:20.976420 osafimmnd [3145:ImmModel.cc:1917] >>
immModel_rtObjectUpdate
Nov 13 11:38:20.976437 osafimmnd [3145:ImmModel.cc:1919] T5 on enter isPl:0
----
----
Nov 13 11:38:20.977526 osafimmpbed [3174:imma_proc.c:1157] >>
imma_proc_obj_modify
Nov 13 11:38:20.977570 osafimmpbed [3174:imma_proc.c:1179] T3 PBE-OI received
runtime attributes update
Nov 13 11:38:20.977611 osafimmpbed [3174:imma_proc.c:1211] TR
IMMA_CALLBACK_OI_CCB_MODIFY Posted for ccb 0
Nov 13 11:38:20.977634 osafimmpbed [3174:imma_proc.c:1219] <<
imma_proc_obj_modify
Nov 13 11:38:20.977656 osafimmpbed [3174:imma_proc.c:1224] >>
imma_proc_free_pointers
Nov 13 11:38:20.977672 osafimmpbed [3174:imma_proc.c:1314] <<
imma_proc_free_pointers
Nov 13 11:38:20.977755 osafimmnd [3145:immnd_evt.c:6514] <<
immnd_evt_proc_rt_object_modify
2. Preparing the ccb towards PBE-B
Nov 13 11:38:25.568935 osafimmpbed [3174:immpbe_daemon.cc:1176] >>
saImmOiCcbObjectModifyCallback: Modify callback for CCB:4294967687
object:safNode=PL-3,safCluster=myClmCluster
Nov 13 11:38:25.569804 osafimmpbed [3174:immpbe_daemon.cc:1235] IN Starting
distributed PBE commit for PRTA update Ccb:100000187/4294967687
Nov 13 11:38:25.569882 osafimmpbed [3174:immpbe_daemon.cc:0219] >>
pbe2_start_prepare_ccb_A_to_B
Nov 13 11:38:25.569910 osafimmpbed [3174:imma_om_api.c:3461] >>
admin_op_invoke_common
Nov 13 11:38:25.569937 osafimmpbed [3174:imma_om_api.c:3593] TR
immInvocations:554
Nov 13 11:38:25.569962 osafimmpbed [3174:imma_om_api.c:3614] TR PARAM:ccbId
Nov 13 11:38:25.569990 osafimmpbed [3174:imma_om_api.c:3614] TR PARAM:numOps
Nov 13 11:38:25.570166 osafimmnd [3145:immsv_evt.c:5382] T8 Received:
IMMND_EVT_A2ND_IMM_FEVS (14) from 2010f
Nov 13 11:38:25.570242 osafimmnd [3145:immnd_evt.c:2785] T2 sender_count:
17179869739 size: 146
----
----
----
Nov 13 11:38:25.580212 osafimmpbed [3174:imma_om_api.c:3663] TR Fevs send
RETURNED:1
Nov 13 11:38:25.580238 osafimmpbed [3174:imma_om_api.c:3678] TR Normal return
Nov 13 11:38:25.580249 osafimmpbed [3174:imma_om_api.c:3812] <<
admin_op_invoke_common
Nov 13 11:38:25.580281 osafimmpbed [3174:immpbe_daemon.cc:0274] IN Slave PBE
replied with OK on attempt to start prepare of ccb:100000187/4294967687
Nov 13 11:38:25.580290 osafimmpbed [3174:immpbe_daemon.cc:0278] <<
pbe2_start_prepare_ccb_A_to_B
3. PBE-B replied with Ok
Nov 13 11:38:25.583243 osafimmpbed [3174:immpbe_dump.cc:1284] >>
objectModifyDiscardAllValuesOfAttrToPBE
Nov 13 11:38:25.583316 osafimmpbed [3174:immpbe_dump.cc:1318] T2 Successfully
accessed object 'safNode=PL-3,safCluster=myClmCluster'
Nov 13 11:38:25.583334 osafimmpbed [3174:immpbe_dump.cc:1319] T2
object_id:50226 class_id:17
---
---
Nov 13 11:38:25.586546 osafimmpbed [3174:immpbe_dump.cc:1242] >>
stampObjectWithCcbId
Nov 13 11:38:25.586579 osafimmpbed [3174:immpbe_dump.cc:1261] <<
stampObjectWithCcbId
Nov 13 11:38:25.586593 osafimmpbed [3174:immpbe_dump.cc:1912] <<
objectModifyAddValuesOfAttrToPBE
Nov 13 11:38:25.735291 osafimmpbed [3174:immpbe_daemon.cc:1271] TR Commit PBE
transaction 100000187 for rt attr update OK
Nov 13 11:38:25.735380 osafimmpbed [3174:immpbe_daemon.cc:1317] <<
saImmOiCcbObjectModifyCallback
Nov 13 11:38:25.735394 osafimmpbed [3174:imma_proc.c:2558] TR ccb-object-modify
callback returned RC:1
4. The RT update is committed in PBE-A but not in PBE-B or the IMM database
Nov 13 11:38:25.736693 osafimmpbed [3174:imma_proc.c:2542] TR
ccb-object-modify: make the callback
Nov 13 11:38:25.736712 osafimmpbed [3174:immpbe_daemon.cc:1176] >>
saImmOiCcbObjectModifyCallback: Modify callback for CCB:4294967688
object:safNode=PL-3,safCluster=myClmCluster
Nov 13 11:38:25.737552 osafimmpbed [3174:immpbe_daemon.cc:1235] IN Starting
distributed PBE commit for PRTA update Ccb:100000188/4294967688
Nov 13 11:38:25.738207 osafimmnd [3145:immnd_evt.c:8594] >>
immnd_evt_proc_fevs_rcv
Nov 13 11:38:25.738282 osafimmnd [3145:immnd_evt.c:8610] T2 FEVS from myself,
still pending:0
Nov 13 11:38:25.738301 osafimmnd [3145:immsv_evt.c:5382] T8 Received:
IMMND_EVT_A2ND_PBE_PRT_ATTR_UPDATE_RSP (75) from 0
Nov 13 11:38:25.738314 osafimmnd [3145:immnd_evt.c:4132] >>
immnd_evt_pbe_rt_attr_update_rsp
Nov 13 11:38:25.738324 osafimmnd [3145:ImmModel.cc:14518] >>
pbePrtAttrUpdateContinuation
Nov 13 11:38:25.738360 osafimmnd [3145:ImmModel.cc:14531] ER PBE PRTAttrs
Update continuation missing! invoc:391
SC-2:
messages:
1. PBE-B receives RT-update callback and times out
Nov 13 11:38:10 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:10 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:11 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:11 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:12 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:12 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:13 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:13 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting
on prepare for PRTA update ccb:100000187 dn:safNode=PL-3,safCluster=myClmCluster
2. When the prepare arrives after the update callback has timed out, PBE-B
still replies OK
Nov 13 11:38:25 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE
slave ccbId:100000187/4294967687 numOps:1
Nov 13 11:38:25 SLES-64BIT-SLOT2 osafimmnd[2491]: ER PBE PRTAttrs Update
continuation missing! invoc:391
In this scenario there are two problems:
1. PBE-B replies OK to the prepare even though the RT-update callback has
already timed out in PBE-B. This has to be avoided.
A possible solution is to keep the CCB ID of the last RT-update callback in
PBE-B; if a prepare arrives for a CCB ID lower than that RT CCB ID, PBE-B
should not reply OK to PBE-A.
2. Because PBE-B replies OK to the prepare, "sqlite_prepare_ccb" is called and
the information is stored in ccbutil. From then on, PBE-B gives continuous
errors for any further CCB/RT operation:
Nov 13 11:38:25 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare
ccb:100000188/4294967688 received at Pbe slave when Prior Ccb 4294967687 still
processing
----
---
Nov 13 12:21:47 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare
ccb:10000046b/4294968427 received at Pbe slave when Prior Ccb 4294967687 still
processing
A possible solution:
In PBE-B, for the admin operation with opId == OPENSAF_IMM_PBE_CCB_PREPARE, do
sqlite_prepare_ccb only if (rc == SA_AIS_OK && ccbId < 0x100000000),
and, when the slave PBE times out waiting on the prepare for a PRTA
create/delete/update, do sqlite_prepare_ccb
@@ -1253,6 +1253,7 @@ static SaAisErrorT saImmOiCcbObjectModif
goto done;
}
+ commit_prta_trans:
TRACE("Build PBE transaction for rt obj update");
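The two proposals above can be sketched roughly as follows. This is only an illustration, not the actual patch: every name in it (SlavePbeState, notePrtaTimeout, onPrepare, shouldStorePrepare) is hypothetical and does not exist in OpenSAF. It assumes that the slave remembers the highest PRTA ccb id whose callback timed out, and it rejects a late prepare for that id or any lower one (treating "equal" as late is my reading, since the prepare in the logs is for the timed-out ccb itself). It also assumes, following the proposed condition, that ids at or above 0x100000000 are PRTA pseudo-ccbs; in the logs, 0x100000187 is 4294967687.

```cpp
#include <cstdint>

// All names below are hypothetical illustrations, not OpenSAF identifiers.

// Assumption taken from the proposed condition (ccbId < 0x100000000):
// regular ccb ids are below this value, PRTA pseudo-ccb ids are at or above it.
constexpr std::uint64_t kPrtaCcbBase = 0x100000000ULL;

inline bool isPrtaPseudoCcb(std::uint64_t ccbId) {
  return ccbId >= kPrtaCcbBase;
}

// State the slave PBE would have to keep for proposal 1.
struct SlavePbeState {
  // Highest PRTA pseudo-ccb id whose RT-update callback timed out (0 = none).
  std::uint64_t lastTimedOutPrtaCcb = 0;
};

// To be called where the slave logs "time-out in waiting on prepare".
inline void notePrtaTimeout(SlavePbeState& state, std::uint64_t ccbId) {
  if (ccbId > state.lastTimedOutPrtaCcb) {
    state.lastTimedOutPrtaCcb = ccbId;
  }
}

enum class PrepareReply { Ok, Reject };

// Proposal 1: do not reply OK to a prepare that arrives after the matching
// (or a later) RT-update callback has already timed out.
inline PrepareReply onPrepare(const SlavePbeState& state, std::uint64_t ccbId) {
  if (isPrtaPseudoCcb(ccbId) && ccbId <= state.lastTimedOutPrtaCcb) {
    return PrepareReply::Reject;
  }
  return PrepareReply::Ok;
}

// Proposal 2 (first half): store the prepare only for regular ccbs, so a
// stale PRTA prepare cannot remain recorded as "prior ccb still processing".
inline bool shouldStorePrepare(bool rcIsOk, std::uint64_t ccbId) {
  return rcIsOk && ccbId < kPrtaCcbBase;
}
```

With this sketch, after notePrtaTimeout(state, 0x100000187) a late prepare for 0x100000187 is rejected, while a prepare for 0x100000188 is still accepted.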
---
** [tickets:#1211] ERR_NO_RESOURCES is continuously returned when PRTA object
is being updated**
**Status:** unassigned
**Milestone:** 5.0
**Created:** Thu Nov 13, 2014 11:41 AM UTC by Sirisha Alla
**Last Updated:** Thu Nov 13, 2014 11:42 AM UTC
**Owner:** nobody
This issue is seen with the OpenSAF 4.5 GA tag + patches for 1092, 1173, 1057,
1063, 1080. 2PBE is enabled and loaded with 50k objects.
The test application is started at
Nov 13 11:37:53 SLES-64BIT-SLOT1 osafimmnd[3145]: NO Implementer connected: 873
(RUNTIMEIMPL) <0, 2030f>
There are continuous switchovers going on in the background. After some time
the following error messages are seen in the syslogs of the controllers:
Syslog of SC-1:
Nov 13 11:38:14 SLES-64BIT-SLOT1 osafimmnd[3145]: WA update of PERSISTENT
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED.
PBE rc:18
Nov 13 11:38:14 SLES-64BIT-SLOT1 osafimmnd[3145]: NO Implementer locally
disconnected. Marking it as doomed 878 <1006, 2010f> (safClmService)
Syslog of SC-2:
Nov 13 11:38:13 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:13 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare
from primary on PRTA update ccb:100000187
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting
on prepare for PRTA update ccb:100000187 dn:safNode=PL-3,safCluster=myClmCluster
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmpbed: NO 2PBE Error (18) in PRTA update
(ccbId:100000187)
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafimmnd[2491]: WA update of PERSISTENT
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED.
PBE rc:18
Nov 13 11:38:14 SLES-64BIT-SLOT2 osafrded[2462]: NO RDE role set to QUIESCED
The application is stopped at
Nov 13 11:44:14 SLES-64BIT-SLOT1 osafimmnd[3145]: NO Implementer disconnected
873 <0, 2030f> (RUNTIMEIMPL)
But the above ERR_NO_RESOURCES messages continue to appear in the syslog. CCBs
are aborted once IMM enters this state. These messages stop only after a
cluster reset.
Syslogs of the controllers are attached. The traces are very large but are
available.