Actually, an even simpler option is to not reply at all and let the OM client time out.
The comment even implies that solution.
But that seems not to be what is happening.

/AndersBj

________________________________
From: Anders Bjornerstedt [mailto:[email protected]]
Sent: 24 October 2014 11:49
To: [email protected]
Subject: [tickets] [opensaf:tickets] Re: #1091 2PBE: class create timesout 
before default SYNCR_TIMEOUT


Yes, but the question is whether ERR_TIMEOUT is the best return code here.
This is a comment in ImmModel::pbePrtoPurgeMutations():

            /* Only reply the default reply of TRY_AGAIN on failed
               RTO ops, not failed class ops. The RTO ops are
               reverted on failure here, which is consistent with
               a reply of TRY_AGAIN. The class ops are NOT reverted,
               so a reply of TRY_AGAIN would be incorrect and a
               reply of OK would be premature. For class ops
               that have trouble with the PBE, ERR_TIMEOUT would
               be the only proper reply, so we let the client
               timeout by not replying.
            */
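
The policy described in that comment can be sketched roughly as follows. This is only an illustration of the decision, not the OpenSAF source: the names PbeOpKind, Reply and replyForFailedPbeOp are hypothetical, and "no reply" is modelled as an empty std::optional (C++17).

```cpp
#include <optional>

// Hypothetical types for illustration only; they do not exist in OpenSAF.
enum class PbeOpKind { RuntimeObjectOp, ClassOp };
enum class Reply { TryAgain, Timeout, Ok };

// Decide what to send to the client when an operation that was waiting
// on the PBE has failed. An empty optional means "send nothing".
std::optional<Reply> replyForFailedPbeOp(PbeOpKind kind) {
    switch (kind) {
    case PbeOpKind::RuntimeObjectOp:
        // The RTO op is reverted in memory, so asking the client to
        // retry is consistent with the state of the model.
        return Reply::TryAgain;
    case PbeOpKind::ClassOp:
        // The class op is NOT reverted: TRY_AGAIN would be incorrect
        // and OK would be premature. Sending nothing lets the client
        // run into its own timeout.
        return std::nullopt;
    }
    return std::nullopt;  // unreachable for valid enum values
}
```

With this shape, the caller only sends a message when the optional holds a value, so the class-op case never produces an explicit ERR_TIMEOUT from the server side.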


Since the class-create is actually not reverted in imm-ram, we could in 
principle return SA_AIS_OK.
The only problem is that there is apparently trouble persisting the 
class to the PBE here.
The PBE or PBEs should have (must have) restarted if they explicitly failed to 
create the class.

The best solution would be to increase the timeout for waiting on PBE to 
process class-create to 10 seconds.
That would avoid the strange "early" return of ERR_TIMEOUT here.

I will change the ticket to an enhancement to give special treatment for the 
timeout on class-create and class-delete.
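
A minimal sketch of what such special treatment could look like, assuming the wait time is chosen per operation kind. The names PbeOpKind, kDefaultPbeWait, kClassOpPbeWait and pbeWaitFor are hypothetical; the 6-second default is the figure quoted in the message below, and 10 seconds is the value proposed above.

```cpp
#include <chrono>

// Hypothetical operation kinds, for illustration only.
enum class PbeOpKind { RuntimeObjectOp, ClassCreate, ClassDelete };

// Default wait on the PBE, as quoted in this thread.
constexpr std::chrono::seconds kDefaultPbeWait{6};
// Proposed longer wait for class operations.
constexpr std::chrono::seconds kClassOpPbeWait{10};

// Pick how long to wait for the PBE before giving up on an operation.
constexpr std::chrono::seconds pbeWaitFor(PbeOpKind kind) {
    switch (kind) {
    case PbeOpKind::ClassCreate:
    case PbeOpKind::ClassDelete:
        return kClassOpPbeWait;
    case PbeOpKind::RuntimeObjectOp:
        return kDefaultPbeWait;
    }
    return kDefaultPbeWait;  // unreachable for valid enum values
}
```

Keeping the choice in one function means the default for runtime object operations stays untouched while class-create and class-delete get the longer wait.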

/AndersBj

________________________________

From: Neelakanta Reddy [mailto:[email protected]]
Sent: 24 October 2014 11:29
To: [opensaf:tickets]
Subject: [opensaf:tickets] #1091 2PBE: class create timesout before default 
SYNCR_TIMEOUT

This looks to be as per design.

Once the class is created and PBE is enabled, an entry is added to 
sPbeRtReqContinuationMap, and the default timeout for such an entry is 6 seconds.

IMMND times out after 6 seconds and replies with a timeout to the class create.
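
To illustrate the mechanism, here is a small sketch of a continuation map whose entries expire after a timeout. It is not the IMMND implementation: Continuation, ContinuationMap and purgeExpired are hypothetical names, and the real sPbeRtReqContinuationMap may be organised differently.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical record of a request that is waiting for the PBE.
struct Continuation {
    Clock::time_point created;     // when the request was registered
    std::chrono::seconds timeout;  // how long to wait for the PBE
};

// Pending requests keyed by continuation id.
using ContinuationMap = std::map<std::uint64_t, Continuation>;

// Remove every entry whose wait has elapsed at time `now` and return
// the ids of the removed entries so the caller can handle each one.
std::vector<std::uint64_t> purgeExpired(ContinuationMap& pending,
                                        Clock::time_point now) {
    std::vector<std::uint64_t> expired;
    for (auto it = pending.begin(); it != pending.end();) {
        if (now - it->second.created >= it->second.timeout) {
            expired.push_back(it->first);
            it = pending.erase(it);  // erase returns the next iterator
        } else {
            ++it;
        }
    }
    return expired;
}
```

A periodic job would call purgeExpired with the current time; in the trace below, the "Timeout on PbeRtReqContinuation" line appears just under 6 seconds after the class create was received.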

SLOT-3

imma:
Sep 15 18:16:32.567528 imma [8200:imma_om_api.c:4303] >> saImmOmClassCreate_2
Sep 15 18:16:32.567596 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjApplNoResponseModCallback_101 category:1
Sep 15 18:16:38.481486 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:38.481762 imma [8200:imma_om_api.c:4640] << saImmOmClassCreate_2

immnd:

Sep 15 18:16:32.572065 osafimmnd [7463:immsv_evt.c:5382] T8 Received: 
IMMND_EVT_A2ND_CLASS_CREATE (27) from 0
Sep 15 18:16:32.572102 osafimmnd [7463:immnd_evt.c:5033] TR We expect there to 
be a PBE
Sep 15 18:16:32.572137 osafimmnd [7463:ImmModel.cc:2832] >> classCreate: 
cont:0x7fffe2fbcdf8 connp:0x7fffe2fbcdf0 nodep:0x7fffe2fbcdf4
Sep 15 18:16:32.572241 osafimmnd [7463:ImmModel.cc:2861] T5 CREATE CLASS 
'testMA_verifyObjApplNoResponseModCallback_101' category:1
Sep 15 18:16:32.572283 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'SaImmAttrImplementerName'
Sep 15 18:16:32.572426 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'SaImmAttrAdminOwnerName'
Sep 15 18:16:32.572492 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'SaImmAttrClassName'
Sep 15 18:16:32.572536 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName6'
Sep 15 18:16:32.572580 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName5'
Sep 15 18:16:32.572622 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName4'
Sep 15 18:16:32.572657 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName3'
Sep 15 18:16:32.572690 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName2'
Sep 15 18:16:32.572758 osafimmnd [7463:ImmModel.cc:4068] T5 create attribute 
'attrName1'
Sep 15 18:16:32.572824 osafimmnd [7463:ImmModel.cc:10921] >> updateImmObject
Sep 15 18:16:32.572890 osafimmnd [7463:ImmModel.cc:10967] T5 Adding new class 
testMA_verifyObjApplNoResponseModCallback_101
Sep 15 18:16:32.573015 osafimmnd [7463:ImmModel.cc:10973] << updateImmObject
Sep 15 18:16:32.573107 osafimmnd [7463:ImmModel.cc:3284] << classCreate

Sep 15 18:16:38.478422 osafimmnd [7463:ImmModel.cc:12023] T5 Timeout on 
PbeRtReqContinuation 261993136911
Sep 15 18:16:38.478455 osafimmnd [7463:ImmModel.cc:12090] T5 
sPbeRegressPeriods:23
Sep 15 18:16:38.481100 osafimmnd [7463:immnd_proc.c:1233] WA Timeout on 
Persistent runtime Object Mutation, waiting on PBE

________________________________

[tickets:#1091]<http://sourceforge.net/p/opensaf/tickets/1091> 2PBE: class 
create timesout before default SYNCR_TIMEOUT

Status: accepted
Milestone: 4.4.2
Created: Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla
Last Updated: Fri Oct 24, 2014 09:09 AM UTC
Owner: Anders Bjornerstedt

The issue is seen on SLES X86 VMs running with opensaf changeset 5697+#946 
patch. The IMM DB is loaded with 50k objects.

Class Create very frequently returns TIMEOUT, and a further call then returns 
ERR_BUSY.

Agent traces:

Sep 15 18:16:23.961232 imma [8200:imma_om_api.c:4303] >> saImmOmClassCreate_2
Sep 15 18:16:23.961245 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:29.441017 imma [8200:imma_om_api.c:4640] << saImmOmClassCreate_2

The total time taken is less than IMMA_SYNCR_TIMEOUT, which defaults to 10 
seconds. The application does not customize SYNCR_TIMEOUT.

A further call to Class Create returns ERR_BUSY.

Sep 15 18:16:29.568306 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:29.568394 imma [8200:ntfa_mds.c:0369] TR NTFS down
Sep 15 18:16:29.942518 imma [8200:imma_om_api.c:4303] >> saImmOmClassCreate_2
Sep 15 18:16:29.942608 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:30.495659 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:30.495742 imma [8200:ntfa_mds.c:0382] T2 MSG from NTFS 
NCSMDS_NEW_ACTIVE/UP
Sep 15 18:16:30.524268 imma [8200:imma_om_api.c:4594] TR Return code:10
Sep 15 18:16:30.524315 imma [8200:imma_om_api.c:4640] << saImmOmClassCreate_2

Syslog on SC-2:

Sep 15 18:16:33 SLES-64BIT-SLOT2 osafamfnd[2467]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmd[2356]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2010f)
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafntfimcnd[2427]: NO exiting on signal 15
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO ERR_BUSY: Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 received while class with same 
name is already being mutated
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 124 
(safMsgGrpService) <312, 2020f>
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 125 
(safLogService) <3, 2020f>
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 126 
(safCheckPointService) <308, 2020f>
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafsmfd[2597]: NO Backup create cmd = 
/usr/lib64/opensaf/smf-backup-create

Syslog on SC-1:

Sep 15 18:16:46 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjApplRejModifyCallback_101 committing with ccbId:10000003a
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjApplRejModifyCallback_101 is PERSISTENT.
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: WA Primary PBE failed to create 
class towards slave PBE. Library or immsv replied Rc:6 - ignoring
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 committing with ccbId:10000003b
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:10000003c/4294967356
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 is PERSISTENT.
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:10000003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA Start prepare for ccb: 
10000003c/4294967356 towards slave PBE returned: '6' from Immsv
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA 
update Ccb:10000003c/4294967356 towards PBE-B
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (20) in PRTA update 
(ccbId:10000003c)
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmnd[7804]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA Primary PBE failed to create 
class towards slave PBE. Library or immsv replied Rc:6 - ignoring
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjApplNoResponseModCallback_101 committing with ccbId:10000003d
Sep 15 18:16:51 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjApplNoResponseModCallback_101 is PERSISTENT.

There is a difference of a few seconds between the two nodes.
Syslog and immnd traces are attached.

________________________________

Sent from sourceforge.net because you indicated interest in 
https://sourceforge.net/p/opensaf/tickets/1091/

To unsubscribe from further messages, please visit 
https://sourceforge.net/auth/subscriptions/
