[tickets] [opensaf:tickets] #1075 2PBE: pbed aborts at sqlite_prepare_ccb

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1075] 2PBE: pbed aborts at sqlite_prepare_ccb**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 06:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 06:45 AM UTC
**Owner:** nobody

The issue is seen on SLES X86 running 2PBE with 50k objects. OpenSAF is at 
changeset 5697 plus the #946 patch. IMM applications were running with 
switchovers in progress.

Syslog of SC-1:


Sep 12 18:01:34 SLES-64BIT-SLOT1 osafamfnd[2526]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
431 0, 2020f (@OpenSafImmReplicatorA)
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:100a5/4294967461 numOps:1
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 446 (@OpenSafImmReplicatorA) 0, 2020f
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 447 (@OpenSafImmReplicatorB) 333, 2010f
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafntfimcnd[4781]: NO Started
Sep 12 18:01:35 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:35 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:36 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:36 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:37 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:37 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:38 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:38 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:39 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:39 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100a5 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 18:01:40 SLES-64BIT-SLOT1 osafimmnd[2452]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 18:01:44 SLES-64BIT-SLOT1 osafimmnd[2452]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
422 0, 2020f (safAmfService)
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 448 (@safAmfService2020f) 0, 2020f
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafamfd[2516]: NO Switching StandBy --> 
Active State
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
435 12, 2010f (@safAmfService2010f)
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer connected: 449 
(safAmfService) 12, 2010f
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafrded[2423]: NO RDE role set to ACTIVE
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafamfd[2516]: NO Controller switch over done
Sep 12 18:01:46 SLES-64BIT-SLOT1 osafimmnd[2452]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100a5)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a6
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: ER sqlite_prepare_ccb invoked 
when no sqlite transaction has been started.
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: ER PBE PRTAttrs Update 
continuation missing! invoc:165
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer locally 
disconnected. Marking it as doomed 437 313, 2010f (@OpenSafImmPBE)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer locally 
disconnected. Marking it as doomed 438 314, 2010f (OsafImmPbeRt_B)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
437 313, 2010f (@OpenSafImmPBE)

(gdb) bt full
  #0  0x7f8be076ab55 in raise () from /lib64/libc.so.6
No symbol table info available.
  #1  0x7f8be076c131 in abort () from /lib64/libc.so.6
No symbol table info available.
  #2  0x00406cec in sqlite_prepare_ccb(unsigned long long, unsigned 
long long, CcbUtilOperationData*) () at immpbe_daemon.cc:92
category_mask = 0
max_waiting_time_ms = 5000
ccb_id_string = 0x41fc32 "ccbId"
rtCallbacks = {
  saImmOiAdminOperationCallback = 0x407a94 
saImmOiAdminOperationCallback(unsigned long long, 

[tickets] [opensaf:tickets] #1022 AMF: no response to SI lock when payload is rebooted

2014-09-15 Thread Nagendra Kumar
changeset:   5774:b1e9a09afa75
branch:      opensaf-4.5.x
tag:         tip
parent:      5772:bf5e8642ba6c
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 11:49:36 2014 +0530
summary:     amfd: send SI admin response after sg becomes stable [#1022]

[staging:b1e9a0]



---

** [tickets:#1022] AMF: no response to SI lock when payload is rebooted**

**Status:** fixed
**Milestone:** 4.3.3
**Created:** Wed Aug 27, 2014 10:31 AM UTC by Hans Feldt
**Last Updated:** Fri Sep 05, 2014 07:01 AM UTC
**Owner:** Nagendra Kumar

This test case used to work with 4.4:

Do SI lock
active test app does not respond
reboot the node where active test app is hosted
=> No admin op response (timeout)



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #1076 2PBE: pbed aborts at pbeClosePrepareTrans

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1076] 2PBE: pbed aborts at pbeClosePrepareTrans**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 06:52 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 06:52 AM UTC
**Owner:** nobody

The issue is seen on SLES X86 with 2PBE and 50k objects. OpenSAF is running on 
changeset 5697 plus the #946 patches.

Syslog on SC-2:

Sep 12 19:15:00 SLES-64BIT-SLOT2 osafamfnd[2409]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
618 0, 2010f (@OpenSafImmReplicatorA)
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 640 (@OpenSafImmReplicatorA) 0, 2010f
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100f0 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Got error on non local rt 
object update err: 6
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
610 0, 2010f (safAmfService)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 641 (@safAmfService2010f) 0, 2010f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Switching StandBy --> 
Active State
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
623 14, 2020f (@safAmfService2020f)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer connected: 642 
(safAmfService) 14, 2020f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafrded[2303]: NO RDE role set to ACTIVE
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafclmd[2377]: NO ACTIVE request
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Controller switch over done
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA s_info->to_svc == 0 
reply context destroyed before this reply could be made
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: ER Failed to send response to 
agent/client over MDS rc:2
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100f0)
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmnd[2332]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:21
Sep 12 19:15:15 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Create of class 
testMA_verifyPrimNoResponseDelCallback_101 is PERSISTENT.
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: IN Create of class 
testMA_verifyPrimNoResponseDelCallback_101 committing with ccbId:100ee
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: ER pbePrepareTrans was called 
when sqliteTransLock(0)!=1
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:17 SLES-64BIT-SLOT2 osafimmnd[2332]: WA SLAVE PBE process has 
apparently died at non coord

Program terminated with signal 6, Aborted.
  #0  0x7fd4af31fb55 in raise () from /lib64/libc.so.6
(gdb) bt
  #0  0x7fd4af31fb55 in raise () from /lib64/libc.so.6
  #1  0x7fd4af321131 in abort () from /lib64/libc.so.6
  #2  0x004104f7 in pbeClosePrepareTrans() () at 

[tickets] [opensaf:tickets] #1054 su operational state shows empty

2014-09-15 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   5775:beb05ae5f068
branch:      opensaf-4.5.x
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 12:00:56 2014 +0530
summary:     amfd: update su oper state to imm during su addition [#1054]

changeset:   5776:dbf909b68ddc
tag:         tip
parent:      5773:ecada8cb88a3
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 12:01:34 2014 +0530
summary:     amfd: update su oper state to imm during su addition [#1054]

[staging:beb05a]
[staging:dbf909]




---

** [tickets:#1054] su operational state shows empty**

**Status:** fixed
**Milestone:** 4.3.3
**Created:** Tue Sep 09, 2014 09:03 AM UTC by surender khetavath
**Last Updated:** Thu Sep 11, 2014 12:00 PM UTC
**Owner:** Nagendra Kumar

changeset : 5697
model : 2n
configuration : 1App,1SG,5SUs with 3comps each, 5SIs with 3CSIs each
si-si deps configured as SI1-SI2-SI3-SI4.
SU1 is active, SU2 is standby.
SU1 is mapped to SC-1 and SU2 to SC-2,SU3 to PL-3 and SU4,5 to PL-4
saAmfSGAutoRepair=1(True)
SuFailover=1(True)

Test : 
Do not start opensaf on PL-3 and PL-4
Bring up a model in which a few SUs are hosted on PL-3 and PL-4.
Unlock-in and unlock the SUs.

The SUs get unlocked, but their operational state does not move to DISABLED, 
even though the components' operational state is DISABLED.

safSu=SU3,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=Empty
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=SU4,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=Empty
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=SU5,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=Empty
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)

safComp=COMP3SU3TWONAPP,safSu=SU3,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP1SU4TWONAPP,safSu=SU4,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP2SU4TWONAPP,safSu=SU4,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP3SU4TWONAPP,safSu=SU4,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP1SU5TWONAPP,safSu=SU5,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP2SU5TWONAPP,safSu=SU5,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)
safComp=COMP3SU5TWONAPP,safSu=SU5,safSg=SGONE,safApp=TWONAPP
saAmfCompOperState=DISABLED(2)
saAmfCompPresenceState=UNINSTANTIATED(1)
saAmfCompReadinessState=OUT-OF-SERVICE(1)






[tickets] [opensaf:tickets] #1077 opensaf randomly and frequently fails to start with trace enabled

2014-09-15 Thread Hans Feldt



---

** [tickets:#1077] opensaf randomly and frequently fails to start with trace 
enabled**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Mon Sep 15, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Mon Sep 15, 2014 07:08 AM UTC
**Owner:** Hans Feldt

In IMM and NTF, logging and tracing are done between fork and exec. Together 
with the added call to tzset() in logtrace, this creates a deadlock in the 
child. Here's an example of how immpbed hangs forever (there is no supervision 
in immnd):

The system appears to have started correctly, but configuration changes time 
out:

root@SC-1:/# immcfg -a saAmfClusterStartupTimeout=100 
safAmfCluster=myAmfCluster
error - saImmOmCcbObjectModify_2 FAILED: SA_AIS_ERR_TRY_AGAIN (6)
error - immcfg command timed out (alarm)


All processes, including pbe, have started:

root   391  0.0  0.0 146660  1144 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafrded
root   405  0.0  0.0 148848  1144 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osaffmd
root   414  0.0  0.0 157324  1428 ?  SNsl  07:08   0:00 /usr/local/lib/opensaf/osafimmd
root   423  0.0  0.0 238192  2600 ?  SNsl  07:08   0:00 /usr/local/lib/opensaf/osafimmnd --tracemask=0x
root   437  0.0  0.0 227412  3884 ?  SNsl  07:08   0:00 /usr/local/lib/opensaf/osaflogd
root   449  0.0  0.0 159552  1564 ?  SNsl  07:08   0:00 /usr/local/lib/opensaf/osafntfd
root   459  0.0  0.0 157892  1708 ?  SNsl  07:08   0:00 /usr/local/lib/opensaf/osafclmd
root   464  0.0  0.0 164344  1052 ?  SN    07:08   0:00 /usr/local/lib/opensaf/osafimmnd --tracemask=0x
root   469  0.0  0.0 146656  1156 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafclmna
root   477  0.0  0.0 167104  2712 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafamfd
root   486  0.0  0.0 225600  1964 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafamfnd
root   499  0.0  0.0 148728  1036 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafsmfnd
root   504  0.0  0.0 254200  1840 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafsmfd
root   536  0.0  0.0 157928  1904 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafckptnd
root   555  0.0  0.0 146644  1036 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafamfwd
root   596  0.0  0.0 153592  1332 ?  Ssl   07:08   0:00 /usr/local/lib/opensaf/osafckptd

The gdb backtrace shows that pbe is hanging in the newly added tzset in logtrace:

(gdb) bt
#0  __lll_lock_wait_private () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x7f180cee39de in _L_lock_2427 () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f180cee37b1 in __tzset () at tzset.c:598
#3  0x7f180db5e8cf in output (file=0x4722ff "immnd_proc.c", line=1577, 
priority=priority@entry=7, category=category@entry=1, 
format=format@entry=0x472452 "Exec: %s %s %s", ap=ap@entry=0x7fff58b213d8)
at logtrace.c:96
#4  0x7f180db5ed9b in _logtrace_trace (file=file@entry=0x4722ff 
"immnd_proc.c", line=line@entry=1577, category=category@entry=1, 
format=format@entry=0x472452 "Exec: %s %s %s") at logtrace.c:173
#5  0x00409cee in immnd_forkPbe (cb=cb@entry=0x691540 <_immnd_cb>) at 
immnd_proc.c:1577
#6  0x0041e570 in immnd_proc_server 
(timeout=timeout@entry=0x7fff58b21fd8) at immnd_proc.c:2111
#7  0x0040a763 in main (argc=<optimized out>, argv=<optimized out>) at 
immnd_main.c:355
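
The hang in the backtrace fits a general POSIX hazard. When a multithreaded process forks, only the calling thread exists in the child, so a libc lock held by another thread at the moment of fork (here, presumably the one taken inside tzset()) is never released in the child. A common way to avoid this is to call only async-signal-safe functions between fork and exec. The following is a minimal sketch of that pattern and not the actual OpenSAF fix; `spawn_and_wait` and `CHILD_NOTE` are hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper macro: emit a fixed message with write(2) only.
 * write() is async-signal-safe, unlike syslog(), printf() or tzset(). */
#define CHILD_NOTE(msg)                                              \
    do {                                                             \
        ssize_t ignored_ = write(STDERR_FILENO, (msg), sizeof(msg) - 1); \
        (void)ignored_;                                              \
    } while (0)

/* Hypothetical function: fork, exec 'path' with 'argv', wait for the child.
 * Returns the child's exit status, 127 if exec failed, -1 on other errors. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no logging or tracing library calls here, because they
         * may need locks that were held by other threads in the parent. */
        CHILD_NOTE("child: about to exec\n");
        execv(path, argv);
        /* Only reached if execv() failed. */
        CHILD_NOTE("child: exec failed\n");
        _exit(127);
    }

    /* Parent: full logging is safe again; wait for the child to finish. */
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In this sketch any trace message that would normally be produced between fork and exec is either emitted by the parent before calling fork, or reduced to a fixed string written with write(2) in the child.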

Sep 13 07:08:17 SC-1 osafimmnd[423]: NO STARTING PBE process.
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO 
pbe-db-file-path:/srv/shared/imm//imm.db VETERAN:0 B:0
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO Implementer connected: 2 
(safClmService) 13, 2010f
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO implementer for class 'SaClmNode' is 
safClmService => class extent is safe.
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO implementer for class 'SaClmCluster' is 
safClmService => class extent is safe.
Sep 13 07:08:17 SC-1 osafclmna[469]: Started
Sep 13 07:08:17 SC-1 osafclmna[469]: NO safNode=SC-1,safCluster=myClmCluster 
Joined cluster, nodeid=2010f
Sep 13 07:08:17 SC-1 osafamfd[477]: Started
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeAMFWDOG'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO COMPONENT_FAILOVER(3) used instead of 
NO_RECOMMENDATION(1) for 'safVersion=4.0.0,safCompType=OpenSafCompTypeAMFWDOG'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeCPND'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO COMPONENT_FAILOVER(3) used instead of 
NO_RECOMMENDATION(1) for 'safVersion=4.0.0,safCompType=OpenSafCompTypeCPND'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeSMFND'
Sep 13 07:08:17 SC-1 

[tickets] [opensaf:tickets] #1075 2PBE: pbed aborts at sqlite_prepare_ccb

2014-09-15 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned
- **assigned_to**: Anders Bjornerstedt



---

** [tickets:#1075] 2PBE: pbed aborts at sqlite_prepare_ccb**

**Status:** assigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 06:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 06:45 AM UTC
**Owner:** Anders Bjornerstedt


[tickets] [opensaf:tickets] #1076 2PBE: pbed aborts at pbeClosePrepareTrans

2014-09-15 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned
- **assigned_to**: Anders Bjornerstedt
- **Milestone**: 4.3.3 --> 4.4.1



---

** [tickets:#1076] 2PBE: pbed aborts at pbeClosePrepareTrans**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:52 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 06:52 AM UTC
**Owner:** Anders Bjornerstedt


[tickets] [opensaf:tickets] #1075 2PBE: pbed aborts at sqlite_prepare_ccb

2014-09-15 Thread Anders Bjornerstedt
- **Milestone**: 4.3.3 --> 4.4.1



---

** [tickets:#1075] 2PBE: pbed aborts at sqlite_prepare_ccb**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 07:19 AM UTC
**Owner:** Anders Bjornerstedt

The issue is seen on SLES X86 running 2PBE with 50k objects. OpenSAF is at 
changeset 5697 plus the #946 patch. IMM applications were running with 
switchovers in progress.

Syslog of SC-1:


Sep 12 18:01:34 SLES-64BIT-SLOT1 osafamfnd[2526]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
431 0, 2020f (@OpenSafImmReplicatorA)
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:100a5/4294967461 numOps:1
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 446 (@OpenSafImmReplicatorA) 0, 2020f
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 447 (@OpenSafImmReplicatorB) 333, 2010f
Sep 12 18:01:34 SLES-64BIT-SLOT1 osafntfimcnd[4781]: NO Started
Sep 12 18:01:35 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:35 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:36 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:36 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:37 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:37 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:38 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:38 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:39 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a5
Sep 12 18:01:39 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100a5 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 18:01:40 SLES-64BIT-SLOT1 osafimmnd[2452]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 18:01:44 SLES-64BIT-SLOT1 osafimmnd[2452]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
422 0, 2020f (safAmfService)
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer (applier) 
connected: 448 (@safAmfService2020f) 0, 2020f
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafamfd[2516]: NO Switching StandBy --> 
Active State
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
435 12, 2010f (@safAmfService2010f)
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer connected: 449 
(safAmfService) 12, 2010f
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafrded[2423]: NO RDE role set to ACTIVE
Sep 12 18:01:45 SLES-64BIT-SLOT1 osafamfd[2516]: NO Controller switch over done
Sep 12 18:01:46 SLES-64BIT-SLOT1 osafimmnd[2452]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100a5)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100a6
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmpbed: ER sqlite_prepare_ccb invoked 
when no sqlite transaction has been started.
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: ER PBE PRTAttrs Update 
continuation missing! invoc:165
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer locally 
disconnected. Marking it as doomed 437 313, 2010f (@OpenSafImmPBE)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer locally 
disconnected. Marking it as doomed 438 314, 2010f (OsafImmPbeRt_B)
Sep 12 18:01:49 SLES-64BIT-SLOT1 osafimmnd[2452]: NO Implementer disconnected 
437 313, 2010f (@OpenSafImmPBE)

(gdb) bt full
  #0  0x7f8be076ab55 in raise () from /lib64/libc.so.6
No symbol table info available.
  #1  0x7f8be076c131 in abort () from /lib64/libc.so.6
No symbol table info available.
  #2  0x00406cec in sqlite_prepare_ccb(unsigned long long, unsigned 
long long, CcbUtilOperationData*) () at immpbe_daemon.cc:92
category_mask = 0
max_waiting_time_ms = 5000
ccb_id_string = 0x41fc32 ccbId
rtCallbacks = {
  saImmOiAdminOperationCallback = 0x407a94 
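
The numeric codes in the syslog above (for example `PBE rc:20` and `2PBE Error (21)`) look like SaAisErrorT values. As a reading aid, here is a minimal sketch of a lookup for those codes. It assumes the numbers are SaAisErrorT values as defined in the SA Forum `saAis.h` header; the helper name `ais_error_name_of` is hypothetical and is not part of OpenSAF.

```c
#include <stddef.h>

/* Subset of SaAisErrorT values as defined in the SA Forum saAis.h header.
 * Only a few codes relevant to the syslog excerpts are listed. */
typedef struct {
    int code;
    const char *name;
} ais_error_entry;

static const ais_error_entry AIS_ERRORS[] = {
    {1,  "SA_AIS_OK"},
    {5,  "SA_AIS_ERR_TIMEOUT"},
    {6,  "SA_AIS_ERR_TRY_AGAIN"},
    {20, "SA_AIS_ERR_BAD_OPERATION"},
    {21, "SA_AIS_ERR_FAILED_OPERATION"},
};

/* Hypothetical helper: map a numeric code from a log line to its
 * symbolic name, or "UNKNOWN" if the code is not in the table. */
const char *ais_error_name_of(int code)
{
    size_t n = sizeof(AIS_ERRORS) / sizeof(AIS_ERRORS[0]);
    for (size_t i = 0; i < n; i++) {
        if (AIS_ERRORS[i].code == code)
            return AIS_ERRORS[i].name;
    }
    return "UNKNOWN";
}
```

Under that assumption, `rc:20` reads as SA_AIS_ERR_BAD_OPERATION and `Error (21)` as SA_AIS_ERR_FAILED_OPERATION. Codes printed by other layers (for example an MDS `rc`) use a different numbering and should not be decoded with this table.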

[tickets] [opensaf:tickets] #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Anders Bjornerstedt
- **Version**:  --> 4.3
- **Milestone**: 4.6.FC --> 4.3.3



---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Fri Sep 12, 2014 09:20 PM UTC
**Owner:** Anders Bjornerstedt

Communication is MDS over TCP. The cluster is 2+3, and the scenario is: 
start the SCs; start the first payload; wait for sync; start the second 
payload; wait for sync; start the third payload. The third one fails, or 
sometimes the fourth.

It is not possible to get more than 2 or 3 payloads synchronized, because 
this sequence triggers a bug consistently.

The following is triggered in the loading immnd, causing the joining node to 
time out and fail to start up.

Sep  6  6:58:02.096550 osafimmnd [502:immsv_evt.c:5382] T8 Received: 
IMMND_EVT_A2ND_SEARCHNEXT (17) from 2020f
Sep  6  6:58:02.096575 osafimmnd [502:immnd_evt.c:1443]  
immnd_evt_proc_search_next
Sep  6  6:58:02.096613 osafimmnd [502:immnd_evt.c:1454] T2 SEARCH NEXT, Look 
for id:1664
Sep  6  6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too 
many pending incoming fevs messages ( 16) rejecting sync iteration next request
Sep  6  6:58:02.096725 osafimmnd [502:immnd_evt.c:1676]  
immnd_evt_proc_search_next
Sep  6  6:58:03.133230 osafimmnd [502:immnd_proc.c:1980] IN Sync Phase-3: 
step:540

I have managed to work around this bug temporarily with the following patch:

+++ b/osaf/libs/common/immsv/include/immsv_api.h	Sat Sep 06 08:38:16 2014 +
@@ -70,7 +70,7 @@

 /*Max # of outstanding fevs messages towards director.*/
 /*Note max-max is 255. cb->fevs_replies_pending is an uint8_t*/
-#define IMMSV_DEFAULT_FEVS_MAX_PENDING 16
+#define IMMSV_DEFAULT_FEVS_MAX_PENDING 255

 #define IMMSV_MAX_OBJECTS 1
 #define IMMSV_MAX_ATTRIBUTES 128

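The comment in the patch notes that the pending counter is a `uint8_t`, so 255 is the highest value the limit can take. The sketch below is a simplified model of such a bounded window, not OpenSAF code: the type `fevs_window` and both functions are hypothetical names, and a rejected send stands in for the ERR_TRY_AGAIN seen in the trace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow-control state; field names are illustrative only. */
typedef struct {
    uint8_t replies_pending; /* outstanding messages, at most 255 */
    uint8_t max_pending;     /* 16 before the patch, 255 after it */
} fevs_window;

/* Try to reserve a slot for one more outstanding message.
 * Returning false models the ERR_TRY_AGAIN rejection. */
bool fevs_try_send(fevs_window *w)
{
    if (w->replies_pending >= w->max_pending)
        return false;
    w->replies_pending++; /* cannot wrap: guarded by the check above */
    return true;
}

/* Release a slot when a reply comes back. */
void fevs_reply_received(fevs_window *w)
{
    if (w->replies_pending > 0)
        w->replies_pending--;
}
```

In this model, raising the limit from 16 to 255 only makes rejections rarer. A sender that outpaces the replies still fills the window eventually.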


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #1076 2PBE: pbed aborts at pbeClosePrepareTrans

2014-09-15 Thread Anders Bjornerstedt
When writing tickets on 2PBE, please always include syslogs from both SCs
that cover the time of the incident. 


---

** [tickets:#1076] 2PBE: pbed aborts at pbeClosePrepareTrans**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:52 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 07:22 AM UTC
**Owner:** Anders Bjornerstedt

The issue is seen on SLES X86 with 2PBE and 50k objects. Opensaf is running on 
changeset 5697 + #946 patches

Syslog on SC-2:

Sep 12 19:15:00 SLES-64BIT-SLOT2 osafamfnd[2409]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
618 0, 2010f (@OpenSafImmReplicatorA)
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 640 (@OpenSafImmReplicatorA) 0, 2010f
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100f0 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Got error on non local rt 
object update err: 6
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
610 0, 2010f (safAmfService)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 641 (@safAmfService2010f) 0, 2010f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Switching StandBy --> 
Active State
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
623 14, 2020f (@safAmfService2020f)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer connected: 642 
(safAmfService) 14, 2020f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafrded[2303]: NO RDE role set to ACTIVE
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafclmd[2377]: NO ACTIVE request
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Controller switch over done
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA s_info->to_svc == 0 
reply context destroyed before this reply could be made
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: ER Failed to send response to 
agent/client over MDS rc:2
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100f0)
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmnd[2332]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:21
Sep 12 19:15:15 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Create of class 
testMA_verifyPrimNoResponseDelCallback_101 is PERSISTENT.
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: IN Create of class 
testMA_verifyPrimNoResponseDelCallback_101 committing with ccbId:100ee
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: ER pbePrepareTrans was called 
when sqliteTransLock(0)!=1
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:17 SLES-64BIT-SLOT2 osafimmnd[2332]: WA SLAVE PBE process has 
apparently died at non coord

Program terminated with signal 6, Aborted.
  #0  0x7fd4af31fb55 in raise () from /lib64/libc.so.6
(gdb) bt
  #0  0x7fd4af31fb55 in raise () from 

[tickets] [opensaf:tickets] #1075 2PBE: pbed aborts at sqlite_prepare_ccb

2014-09-15 Thread Anders Bjornerstedt
When writing tickets on 2PBE, please always include syslogs from both SCs
that cover the time of the incident. 


---

** [tickets:#1075] 2PBE: pbed aborts at sqlite_prepare_ccb**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 07:23 AM UTC
**Owner:** Anders Bjornerstedt


[tickets] [opensaf:tickets] #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Anders Bjornerstedt
- **status**: unassigned --> invalid
- **Comment**:

The symptoms indicate a performance problem: the resources in this setup
are not sufficient for the load of this test.

The test setup manages to get sync (plus presumably other traffic?) to 
overload fevs. 

OpenSAF currently has no overload protection or load regulation, so 
overloading the system will inevitably cause degradation of service.
Here this results in the failure of a sync. 
This should lead to the joining payload retrying the sync.

If the 3rd/4th payload always fails to join, it means this resource and
load configuration cannot support more than 2 SCs and 2 payloads.
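
The retry mentioned in the comment can be pictured as a generic retry loop with capped exponential backoff. This is a sketch of the general technique only, not of how OpenSAF retries a sync. All names (`backoff_ms`, `sync_with_retry`, `attempt_after_failures`) and the 100 ms / 5000 ms values are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types: one sync attempt, and a wait hook. */
typedef bool (*sync_attempt_fn)(void *ctx);
typedef void (*wait_fn)(unsigned ms);

/* Delay before retry number `attempt` (0-based): doubles each time,
 * never exceeding cap_ms. */
unsigned backoff_ms(unsigned attempt, unsigned base_ms, unsigned cap_ms)
{
    unsigned delay = base_ms;
    for (unsigned i = 0; i < attempt; i++) {
        if (delay > cap_ms / 2)
            return cap_ms; /* doubling would pass the cap */
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}

/* Run attempts until one succeeds. Returns the 1-based number of the
 * successful attempt, or 0 if all max_attempts failed. */
unsigned sync_with_retry(sync_attempt_fn attempt, void *ctx,
                         unsigned max_attempts, wait_fn wait)
{
    for (unsigned i = 0; i < max_attempts; i++) {
        if (attempt(ctx))
            return i + 1;
        if (wait != NULL && i + 1 < max_attempts)
            wait(backoff_ms(i, 100, 5000)); /* assumed base and cap */
    }
    return 0;
}

/* Demo attempt function: fails until the counter in ctx reaches zero. */
bool attempt_after_failures(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Backing off between attempts gives an overloaded system room to drain, whereas an immediate retry adds to the load that caused the failure.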




---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Mon Sep 15, 2014 07:25 AM UTC
**Owner:** Anders Bjornerstedt





[tickets] [opensaf:tickets] #1048 Amf: su shutdown should return SA_AIS_ERR_INTERRUPT

2014-09-15 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   5777:7b76c9933b05
branch:      opensaf-4.5.x
parent:      5775:beb05ae5f068
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 13:20:54 2014 +0530
summary:     amfd: respond with ERR_INTERRUPT to su shutdown op [#1048]

changeset:   5778:572380024fdb
tag:         tip
parent:      5776:dbf909b68ddc
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 13:21:05 2014 +0530
summary:     amfd: respond with ERR_INTERRUPT to su shutdown op [#1048]

[staging:7b76c9]
[staging:572380]




---

** [tickets:#1048] Amf: su shutdown should return SA_AIS_ERR_INTERRUPT**

**Status:** fixed
**Milestone:** 4.3.3
**Created:** Mon Sep 08, 2014 10:38 AM UTC by Nagendra Kumar
**Last Updated:** Tue Sep 09, 2014 11:28 AM UTC
**Owner:** Nagendra Kumar

An SU shutdown that is interrupted by an SU lock should return 
SA_AIS_ERR_INTERRUPT rather than OK.
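
The expected behaviour can be pictured with a toy state model. This is a hedged sketch, not the amfd implementation: the types and functions are hypothetical, and `OP_OK` / `OP_INTERRUPTED` stand in for SA_AIS_OK / SA_AIS_ERR_INTERRUPT.

```c
/* Result reported to the caller of the shutdown operation. */
typedef enum { OP_PENDING, OP_OK, OP_INTERRUPTED } op_result;

/* Simplified administrative state of a service unit. */
typedef enum { SU_UNLOCKED, SU_SHUTTING_DOWN, SU_LOCKED } su_admin_state;

typedef struct {
    su_admin_state state;
    op_result shutdown_result;
} su_model;

/* Start a shutdown; its result stays pending until it ends. */
void su_begin_shutdown(su_model *su)
{
    su->state = SU_SHUTTING_DOWN;
    su->shutdown_result = OP_PENDING;
}

/* A lock that arrives during a shutdown interrupts it, so the
 * shutdown must report INTERRUPTED instead of OK. */
void su_lock(su_model *su)
{
    if (su->state == SU_SHUTTING_DOWN)
        su->shutdown_result = OP_INTERRUPTED;
    su->state = SU_LOCKED;
}

/* Normal completion: only counts if the shutdown is still running. */
void su_shutdown_complete(su_model *su)
{
    if (su->state == SU_SHUTTING_DOWN) {
        su->shutdown_result = OP_OK;
        su->state = SU_LOCKED;
    }
}
```

The point of the model is that the interrupted result is set once and a late completion cannot overwrite it with OK.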




[tickets] [opensaf:tickets] #1025 Amf : osafamfnd crashed during restart failure

2014-09-15 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   5779:6c62a01ef630
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 13:34:49 2014 +0530
summary:     amfnd: perform su failover if npi su translates into inst fail state [#1025]

changeset:   5780:9ac53ee22ac2
branch:      opensaf-4.5.x
parent:      5777:7b76c9933b05
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 13:35:18 2014 +0530
summary:     amfnd: perform su failover if npi su translates into inst fail state [#1025]

changeset:   5781:b62f09e680af
branch:      opensaf-4.4.x
tag:         tip
parent:      5771:ca844aed9b16
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Mon Sep 15 13:35:38 2014 +0530
summary:     amfnd: perform su failover if npi su translates into inst fail state [#1025]

[staging:6c62a0]
[staging:9ac53e]
[staging:b62f09]




---

** [tickets:#1025] Amf : osafamfnd crashed during restart failure **

**Status:** fixed
**Milestone:** 4.3.3
**Created:** Wed Aug 27, 2014 05:31 PM UTC by KANG-SEN LU
**Last Updated:** Mon Sep 15, 2014 08:21 AM UTC
**Owner:** Nagendra Kumar

We are running opensaf 4.4.0.

Here is a gdb stack trace of osafamfnd crash:

==
(gdb) bt
#0  0x7f457067f425 in __GI_raise (sig=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x7f4570682b8b in __GI_abort () at abort.c:91
#2  0x7f4572105f21 in __osafassert_fail (
    __file=0x448498 "/home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/di.cc",
    line=569,
    func=0x4488b0 <avnd_di_susi_resp_send(avnd_cb_tag*, avnd_su_tag*, avnd_su_si_rec*)::__FUNCTION__> "avnd_di_susi_resp_send",
    __assertion=0x44837a "m_AVND_SU_IS_ASSIGN_PEND(su)")
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/libs/core/leap/sysf_def.c:278
#3  0x00427a42 in avnd_di_susi_resp_send (cb=cb@entry=0x65e4a0 <_avnd_cb>,
    su=su@entry=0x2444980, si=si@entry=0x2439720)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/di.cc:569
#4  0x00438c21 in avnd_su_pres_st_chng_prc (
    final_st=SA_AMF_PRESENCE_INSTANTIATION_FAILED,
    prv_st=SA_AMF_PRESENCE_INSTANTIATED, su=0x2444980,
    cb=0x65e4a0 <_avnd_cb>)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/susm.cc:1608
#5  avnd_su_pres_fsm_run (cb=cb@entry=0x65e4a0 <_avnd_cb>, su=0x2444980,
    comp=comp@entry=0x2444bb0, ev=<optimized out>)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/susm.cc:1394
#6  0x004188b3 in avnd_comp_clc_st_chng_prc (cb=cb@entry=0x65e4a0 <_avnd_cb>,
    comp=comp@entry=0x2444bb0, prv_st=prv_st@entry=SA_AMF_PRESENCE_RESTARTING,
    final_st=final_st@entry=SA_AMF_PRESENCE_INSTANTIATION_FAILED)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/clc.cc:1298
#7  0x0041a512 in avnd_comp_clc_fsm_run (cb=cb@entry=0x65e4a0 <_avnd_cb>,
    comp=comp@entry=0x2444bb0, ev=<optimized out>)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/clc.cc:862
#8  0x0041aa39 in avnd_evt_clc_resp_evh (cb=0x65e4a0 <_avnd_cb>,
    evt=0x7f45640008c0)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/clc.cc:416
#9  0x0042c23c in avnd_evt_process (evt=0x7f45640008c0)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/main.cc:678
#10 avnd_main_process ()
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/main.cc:619
#11 0x00405328 in main (argc=1, argv=0x7fff7bcfc988)
    at /home/ksenlu/sandbox/klu_main/cae/extern/opensaf4/opensaf-4.4.0/osaf/services/saf/amf/amfnd/main.cc:178
(gdb)






[tickets] [opensaf:tickets] #1075 2PBE: pbed aborts at sqlite_prepare_ccb

2014-09-15 Thread Sirisha Alla
Traces for SC-2 are also attached to #1076


Attachment: SLOT2.tar.bz2 (11.8 MB; application/x-bzip) 


---

** [tickets:#1075] 2PBE: pbed aborts at sqlite_prepare_ccb**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 08:54 AM UTC
**Owner:** Anders Bjornerstedt


[tickets] [opensaf:tickets] #1076 2PBE: pbed aborts at pbeClosePrepareTrans

2014-09-15 Thread Sirisha Alla
Traces for SC-1 are also attached to #1075 as mentioned in the ticket


Attachment: SLOT1.tar.bz2 (13.0 MB; application/x-bzip) 


---

** [tickets:#1076] 2PBE: pbed aborts at pbeClosePrepareTrans**

**Status:** assigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 06:52 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 07:34 AM UTC
**Owner:** Anders Bjornerstedt

The issue is seen on SLES X86 with 2PBE and 50k objects. Opensaf is running on 
changeset 5697 + #946 patches

Syslog on SC-2:

Sep 12 19:15:00 SLES-64BIT-SLOT2 osafamfnd[2409]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
618 0, 2010f (@OpenSafImmReplicatorA)
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:00 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 640 (@OpenSafImmReplicatorA) 0, 2010f
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:01 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:02 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:100f0
Sep 12 19:15:05 SLES-64BIT-SLOT2 osafimmpbed: NO Slave PBE time-out in waiting 
on porepare for PRTA update ccb:100f0 
dn:safNode=PL-3,safCluster=myClmCluster
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:06 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Got error on non local rt 
object update err: 6
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
610 0, 2010f (safAmfService)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer (applier) 
connected: 641 (@safAmfService2010f) 0, 2010f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Switching StandBy -- 
Active State
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
623 14, 2020f (@safAmfService2020f)
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer connected: 642 
(safAmfService) 14, 2020f
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafrded[2303]: NO RDE role set to ACTIVE
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafclmd[2377]: NO ACTIVE request
Sep 12 19:15:07 SLES-64BIT-SLOT2 osafamfd[2396]: NO Controller switch over done
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA Timeout on Persistent 
runtime Object Mutation, waiting on PBE
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: WA s_info-to_svc == 0 
reply context destroyed before this reply could be made
Sep 12 19:15:12 SLES-64BIT-SLOT2 osafimmnd[2332]: ER Failed to send response to 
agent/client over MDS rc:2
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmpbed: NO 2PBE Error (21) in PRTA update 
(ccbId:100f0)
Sep 12 19:15:14 SLES-64BIT-SLOT2 osafimmnd[2332]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:21
Sep 12 19:15:15 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Create of class 
testMA_verifyPrimNoResponseDelCallback_101 is PERSISTENT.
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: IN Create of class 
testMA_verifyPrimNoResponseDelCallback_101 committing with ccbId:100ee
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmpbed: ER pbePrepareTrans was called 
when sqliteTransLock(0)!=1
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer locally 
disconnected. Marking it as doomed 626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
625 315, 2020f (@OpenSafImmPBE)
Sep 12 19:15:16 SLES-64BIT-SLOT2 osafimmnd[2332]: NO Implementer disconnected 
626 316, 2020f (OsafImmPbeRt_B)
Sep 12 19:15:17 SLES-64BIT-SLOT2 osafimmnd[2332]: WA SLAVE PBE process has 
apparently died at non coord

Program terminated with signal 6, Aborted.
  #0  0x7fd4af31fb55 in raise () from /lib64/libc.so.6
(gdb) bt
  #0  0x7fd4af31fb55 in raise () 

[tickets] [opensaf:tickets] #1008 IMM: Changes to access control attribute should be local validated

2014-09-15 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Comment**:

opensaf-4.5.x :

changeset:   5782:8b7f5de7bd3a
branch:      opensaf-4.5.x
parent:      5780:9ac53ee22ac2
user:        Zoran Milinkovic zoran.milinko...@ericsson.com
date:        Tue Sep 09 13:30:56 2014 +0200
summary:     imm: add a validation for accessControlMode [#1008]

--

default(4.6) :

changeset:   5783:6799e094d3b2
tag:         tip
parent:      5779:6c62a01ef630
user:        Zoran Milinkovic zoran.milinko...@ericsson.com
date:        Tue Sep 09 13:30:56 2014 +0200
summary:     imm: add a validation for accessControlMode [#1008]



---

** [tickets:#1008] IMM: Changes to access control attribute should be local validated**

**Status:** fixed
**Milestone:** 4.5.0
**Created:** Fri Aug 22, 2014 01:52 PM UTC by Anders Bjornerstedt
**Last Updated:** Tue Sep 09, 2014 01:50 PM UTC
**Owner:** Zoran Milinkovic

Validation logic should be added to ImmModel::ccbObjectModify, checking
that updates to the 'accessControlMode' attribute in the OpenSAF service
object are within the value range of the enum defined for it.

Local validation (meaning local to the modify operation) can be done here
because there is no inter-object relation from this attribute with any other
imm object (currently) in the imm model.

Create operations are not an issue because the immsv already rejects creates
of instances of either class 'OpensafImm' or 'SaImmMngt'.
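
The local range check described above could be sketched roughly as follows. This is only an illustration: the enum, its enumerator names and values, and the function `isValidAccessControlMode` are assumptions made for the example and are not copied from the OpenSAF headers or from the committed changeset.

```cpp
#include <cstdint>

// Hypothetical mirror of the access-control enum. The names and values are
// assumptions for illustration, not taken from the OpenSAF sources.
enum AccessControlMode : std::uint32_t {
  kAccessControlDisabled = 0,
  kAccessControlPermissive = 1,
  kAccessControlEnforcing = 2,
};

// Returns true when `raw` maps onto one of the enumerators above.
// The check is purely local: it inspects only the proposed attribute value
// and needs no lookup of any other object.
bool isValidAccessControlMode(std::uint32_t raw) {
  switch (raw) {
    case kAccessControlDisabled:
    case kAccessControlPermissive:
    case kAccessControlEnforcing:
      return true;
    default:
      return false;
  }
}
```

A modify operation would call such a check on the new value and reject the operation when it returns false.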


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #1044 few si's are not assigned after su admin repair

2014-09-15 Thread Praveen
- **status**: review --> fixed
- **Comment**:

changeset:   5784:10d475cc2d08
branch:      opensaf-4.4.x
parent:      5781:b62f09e680af
user:        praveen.malv...@oracle.com
date:        Mon Sep 15 14:54:57 2014 +0530
summary:     amfd: respond node level admin op at sufailover during node unlock [#1044]

changeset:   5785:ea9d40a93b09
branch:      opensaf-4.5.x
parent:      5782:8b7f5de7bd3a
user:        praveen.malv...@oracle.com
date:        Mon Sep 15 14:55:19 2014 +0530
summary:     amfd: respond node level admin op at sufailover during node unlock [#1044]

changeset:   5786:dcae8bcf532f
tag:         tip
parent:      5783:6799e094d3b2
user:        praveen.malv...@oracle.com
date:        Mon Sep 15 14:55:44 2014 +0530
summary:     amfd: respond node level admin op at sufailover during node unlock [#1044]

[staging:10d475]
[staging:ea9d40]
[staging:dcae8b]




---

** [tickets:#1044] few si's are not assigned after su admin repair**

**Status:** fixed
**Milestone:** 4.4.1
**Created:** Thu Sep 04, 2014 06:38 AM UTC by surender khetavath
**Last Updated:** Tue Sep 09, 2014 09:42 AM UTC
**Owner:** Praveen

changeset : 5697
model : 2n
configuration : 1App,1SG,5SUs with 3comps each, 5SIs with 3CSIs each
si-si deps configured as SI1-SI2-SI3-SI4.
SU1 is active, SU2 is standby.
SU1 is mapped to SC-1 and SU2 to SC-2,SU3 to PL-3 and SU4,5 to PL-4
saAmfSGAutoRepair=1(True)
SuFailover=1(True)

Test:
Lock the node hosting the active SU.
Reject the new active assignments.
Unlock the node.

Due to the continuous faults, all the SUs move to disabled, uninstantiated, OOS.
Repairing SU1 then leaves only some SIs assigned and the rest unassigned, as 
shown below.

safSi=TWONSI2,safApp=TWONAPP
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=TWONSI3,safApp=TWONAPP
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=TWONSI4,safApp=TWONAPP
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=UNASSIGNED(1)
safSi=TWONSI5,safApp=TWONAPP
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)
safSi=TWONSI1,safApp=TWONAPP
saAmfSIAdminState=UNLOCKED(1)
saAmfSIAssignmentState=PARTIALLY_ASSIGNED(3)

safSu=SU2,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=SU3,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=SU5,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)
safSu=SU1,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=ENABLED(1)
saAmfSUPresenceState=INSTANTIATED(3)
saAmfSUReadinessState=IN-SERVICE(2)
safSu=SU4,safSg=SGONE,safApp=TWONAPP
saAmfSUAdminState=UNLOCKED(1)
saAmfSUOperState=DISABLED(2)
saAmfSUPresenceState=UNINSTANTIATED(1)
saAmfSUReadinessState=OUT-OF-SERVICE(1)

safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI5,safApp=TWONAPP
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI1,safApp=TWONAPP
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SU1\,safSg=SGONE\,safApp=TWONAPP,safSi=TWONSI2,safApp=TWONAPP
saAmfSISUHAState=ACTIVE(1)


safAmfNode=PL-3,safAmfCluster=myAmfCluster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)
safAmfNode=PL-4,safAmfCluster=myAmfCluster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)
safAmfNode=PL-5,safAmfCluster=myAmfCluster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)
safAmfNode=SC-1,safAmfCluster=myAmfCluster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)
safAmfNode=SC-2,safAmfCluster=myAmfCluster
saAmfNodeAdminState=UNLOCKED(1)
saAmfNodeOperState=ENABLED(1)




[tickets] [opensaf:tickets] #1078 Quiesced controller fails to become Active

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1078] Quiesced controller fails to become Active**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 09:39 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 09:39 AM UTC
**Owner:** nobody

The issue is seen on SLES X86 with 2PBE and 50k objects. OpenSAF is running 
with changeset 5697 plus the #946 patch.

IMM applications are running while switchovers are in progress. After SC-2 
moved to Quiesced, SC-1 rebooted because of #1067. SC-2, which was Quiesced, 
tried to become Active, but the implementer set timed out for amfd and the 
cluster rebooted.

Syslog of SC-1:

Sep 15 09:38:48 SLES-64BIT-SLOT1 osafclmd[2471]: ER saImmOiClassImplementerSet 
failed for class SaClmNode rc:9, exiting
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: NO 
'safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: ER 
safComp=CLM,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Sep 15 09:38:48 SLES-64BIT-SLOT1 osafamfnd[2510]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
Sep 15 09:38:48 SLES-64BIT-SLOT1 opensaf_reboot: Rebooting local node; 
timeout=60

This is a known issue #1067

Syslog of SC-2:


Sep 15 09:38:52 SLES-64BIT-SLOT2 osafimmnd[2340]: NO Epoch set to 55 in ImmModel
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER saImmOiImplementerSet 
failed 5
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: ER avd_imm_applier_set FAILED
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfd[2400]: role.cc:592: 
avd_mds_qsd_role_evh: Assertion '0' failed.
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: ER AMF director unexpectedly 
crashed
Sep 15 09:38:56 SLES-64BIT-SLOT2 osafamfnd[2410]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) 
received, OwnNodeId = 131599, SupervisionTime = 60
Sep 15 09:38:56 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node; 
timeout=60

AMFD Traces on SC-2:

Sep 15  9:38:45.443707 osafamfd [2400:imm.cc:1252]  avd_imm_applier_set
Sep 15  9:38:50.128084 osafamfd [2400:mds.cc:0453] TR avnd 2010f97b02025 down
Sep 15  9:38:50.128319 osafamfd [2400:mbcsv_mds.c:0435] T1 RED_DOWN event. 
pwe_hdl: 65537, anchor:564116434460706
Sep 15  9:38:50.128343 osafamfd [2400:mbcsv_pwe_anc.c:0122]  
mbcsv_rmv_pwe_anc_entry
Sep 15  9:38:50.128359 osafamfd [2400:mbcsv_pwe_anc.c:0144]  
mbcsv_rmv_pwe_anc_entry
..

Sep 15  9:38:54.535469 osafamfd [2400:timer.cc:0169]  avd_tmr_exp
Sep 15  9:38:56.456335 osafamfd [2400:imm.cc:1256] ER saImmOiImplementerSet 
failed 5
Sep 15  9:38:56.456351 osafamfd [2400:role.cc:0591] ER avd_imm_applier_set 
FAILED
Sep 15  9:41:03.730331 osafamfd [2439:main.cc:0464]  initialize

IMMND Traces on SC-2:

Sep 15  9:38:45.447989 osafimmnd [2340:ImmModel.cc:12117]  implementerSet
Sep 15  9:38:45.448038 osafimmnd [2340:ImmModel.cc:12158] T7 Re-using 
implementer for @safAmfService2020f
Sep 15  9:38:45.448091 osafimmnd [2340:ImmModel.cc:12201] TR TRY_AGAIN: ccb 27 
is active on object 'attrName_testMA_verifyObjApplRejModifyCallback_101' bound 
to object applier '@safAmfService2020f'. Can not re-attach applier
Sep 15  9:38:45.448129 osafimmnd [2340:ImmModel.cc:12303]  implementerSet

attrName_testMA_verifyObjApplRejModifyCallback_101 is an application class 
object. Syslog and IMMND traces for both the controllers are attached.
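
The IMMND trace above shows implementerSet answering TRY_AGAIN while a ccb is still active, and amfd later logging error 5 and asserting. A minimal sketch of a bounded retry around such a call is given here for illustration. `Rc`, `retryWhileTryAgain`, and the attempt budget are hypothetical names, and the numeric values are assumed to mirror the SaAisErrorT codes seen in the logs (5 for timeout, 6 for try-again). It is not the amfd code and is not a proposed fix for this ticket.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Local stand-ins for the return codes seen in the logs (assumed to match
// SaAisErrorT: 1 = OK, 5 = TIMEOUT, 6 = TRY_AGAIN). Hypothetical names.
enum Rc { kOk = 1, kTimeout = 5, kTryAgain = 6 };

// Calls `op` until it stops returning kTryAgain or the attempt budget is
// used up. Returns the last code seen, so the caller can decide how to
// degrade instead of asserting.
Rc retryWhileTryAgain(const std::function<Rc()>& op, int maxAttempts,
                      std::chrono::milliseconds delay) {
  Rc rc = kTryAgain;
  for (int attempt = 0; attempt < maxAttempts && rc == kTryAgain; ++attempt) {
    if (attempt > 0) {
      std::this_thread::sleep_for(delay);  // back off between attempts
    }
    rc = op();
  }
  return rc;
}
```

Whether retrying is appropriate here depends on how long the blocking ccb can stay active, which the traces alone do not settle.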





[tickets] [opensaf:tickets] #1078 Quiesced controller fails to become Active

2014-09-15 Thread Neelakanta Reddy
- **status**: unassigned --> assigned
- **assigned_to**: Neelakanta Reddy
- **Part**: - --> nd



---

** [tickets:#1078] Quiesced controller fails to become Active**

**Status:** assigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 09:39 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 09:39 AM UTC
**Owner:** Neelakanta Reddy



[tickets] [opensaf:tickets] #1062 imm: immnd may crash in resourceDisplay

2014-09-15 Thread Neelakanta Reddy
opName is checked in immnd; if the check fails, INVALID_PARAM is returned 
before the call reaches ImmModel::resourceDisplay. In general, opName will not 
be passed as NULL to ImmModel::resourceDisplay, but it is good to have the 
verification in ImmModel::resourceDisplay as well.




---

** [tickets:#1062] imm: immnd may crash in resourceDisplay**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Wed Sep 10, 2014 02:10 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Sep 12, 2014 07:48 AM UTC
**Owner:** Neelakanta Reddy

immnd may crash in immModel_resourceDisplay and ImmModel::resourceDisplay if 
operation name is not provided.
In this case, opName is NULL, and strcmp() functions may crash.
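
A guard of the kind discussed in this ticket might look like the sketch below. The function `checkOpName`, the result enum, and the operation-name strings are placeholders invented for the example; they are not the immnd or ImmModel code.

```cpp
#include <cstring>

// Hypothetical result codes for illustration only.
enum DisplayRc { kDisplayOk, kDisplayInvalidParam };

// Hypothetical handler: rejects a missing operation name before any strcmp().
// The accepted names below are placeholders, not the real operation names.
DisplayRc checkOpName(const char* opName) {
  if (opName == nullptr) {
    // Passing a null pointer to strcmp() is undefined behaviour,
    // so bail out before any comparison is attempted.
    return kDisplayInvalidParam;
  }
  if (std::strcmp(opName, "display") == 0 ||
      std::strcmp(opName, "displayverbose") == 0) {
    return kDisplayOk;
  }
  return kDisplayInvalidParam;
}
```

Repeating the check in the inner function costs one pointer comparison and protects against any future caller that skips the outer validation.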




[tickets] [opensaf:tickets] #1062 imm: immnd may crash in resourceDisplay

2014-09-15 Thread Neelakanta Reddy
- **status**: accepted --> review



---

** [tickets:#1062] imm: immnd may crash in resourceDisplay**

**Status:** review
**Milestone:** 4.5.0
**Created:** Wed Sep 10, 2014 02:10 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 09:44 AM UTC
**Owner:** Neelakanta Reddy



[tickets] [opensaf:tickets] Re: #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Adrian Szwej
I don't think it is a performance problem.
Nothing indicates a CPU, memory, or I/O bandwidth bottleneck.
A simple node join seems to trigger some logical bug.
There is no application running, just pure OpenSAF.

I am now experimenting with different MDS configuration options and MDS 
buffer settings, together with MTU 9000, just to see if it makes any 
difference in triggering this bug.

OpenSAF is running inside containers, meaning there is no virtualization 
overhead.

Could you give me a hint about what could cause the outstanding messages to 
reach 16?
E.g. could message loss or a timing issue lead to this?
I have more nodes configured than are actually joining at the moment; 
around 10. But I am bringing them into the cluster one by one.



---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Mon Sep 15, 2014 07:45 AM UTC
**Owner:** Anders Bjornerstedt

Communication is MDS over TCP. The cluster is 2+3, and the scenario is: 
start the SCs; start the first payload; wait for sync; start the second 
payload; wait for sync; start the third payload. The third one fails, or 
sometimes it might be the fourth.

It is not possible to get more than 2 or 3 payloads synchronized, because a 
bug is triggered consistently.

The following is triggered in the loading immnd, causing the joining node to 
time out and fail to start up.

Sep  6  6:58:02.096550 osafimmnd [502:immsv_evt.c:5382] T8 Received: 
IMMND_EVT_A2ND_SEARCHNEXT (17) from 2020f
Sep  6  6:58:02.096575 osafimmnd [502:immnd_evt.c:1443]  
immnd_evt_proc_search_next
Sep  6  6:58:02.096613 osafimmnd [502:immnd_evt.c:1454] T2 SEARCH NEXT, Look 
for id:1664
Sep  6  6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too 
many pending incoming fevs messages ( 16) rejecting sync iteration next request
Sep  6  6:58:02.096725 osafimmnd [502:immnd_evt.c:1676]  
immnd_evt_proc_search_next
Sep  6  6:58:03.133230 osafimmnd [502:immnd_proc.c:1980] IN Sync Phase-3: 
step:540

I have managed to work around this bug temporarily with the following patch:

+++ b/osaf/libs/common/immsv/include/immsv_api.h  Sat Sep 06 08:38:16 2014 +
@@ -70,7 +70,7 @@

 /*Max # of outstanding fevs messages towards director.*/
 /*Note max-max is 255. cb->fevs_replies_pending is an uint8_t*/
-#define IMMSV_DEFAULT_FEVS_MAX_PENDING 16
+#define IMMSV_DEFAULT_FEVS_MAX_PENDING 255

 #define IMMSV_MAX_OBJECTS 1
 #define IMMSV_MAX_ATTRIBUTES 128





[tickets] [opensaf:tickets] #520 Mds: Tune MDS logging to minimal informative

2014-09-15 Thread A V Mahesh (AVM)
- **status**: review --> fixed
- **Comment**:

changeset:   5787:1ae3fab58c88
branch:      opensaf-4.5.x
parent:      5785:ea9d40a93b09
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:42:43 2014 +0530
summary:     mds: change mds logging prefix readability [#520]

changeset:   5788:47bb76fa989b
branch:      opensaf-4.5.x
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:45:07 2014 +0530
summary:     mds: use ncsmds_svc_names mapping array of ncsmds_svc_id for logging [#520]

changeset:   5789:be5d69ad7c82
branch:      opensaf-4.5.x
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:46:34 2014 +0530
summary:     mds: update adest details for log readability [#520]

changeset:   5790:43f758230e5f
branch:      opensaf-4.5.x
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:47:33 2014 +0530
summary:     mds: update logs according for readability [#520]

changeset:   5791:e970f1e8ff6b
parent:      5786:dcae8bcf532f
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:51:41 2014 +0530
summary:     mds: change mds logging prefix readability [#520]

changeset:   5792:f8dbeb5ee3c0
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:52:40 2014 +0530
summary:     mds: use ncsmds_svc_names mapping array of ncsmds_svc_id for logging [#520]

changeset:   5793:a9233c37c652
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:53:20 2014 +0530
summary:     mds: update adest details for log readability [#520]

changeset:   5794:c01bcaff9ca7
tag:         tip
user:        A V Mahesh mahesh.va...@oracle.com
date:        Mon Sep 15 14:54:02 2014 +0530
summary:     mds: update logs according for readability [#520]



---

** [tickets:#520] Mds: Tune MDS logging to minimal informative**

**Status:** fixed
**Milestone:** 4.5.0
**Created:** Thu Jul 25, 2013 01:19 PM UTC by hano
**Last Updated:** Mon Aug 18, 2014 04:43 AM UTC
**Owner:** A V Mahesh (AVM)

Minimize MDS logging to only what is required, so that the log does not 
reach the 1 MB rotation size so quickly.


An amfnd core dump is produced when the amfnd main thread (10720) is waiting 
for a pthread mutex, gl_mds_library_mutex, which is held by the mds thread 
(10723). The amf watchdog detects this (no healthchecks received) and sends an 
abort signal to amfnd. Holding a mutex during file operations in MDS is not 
correct and should be fixed. (HR50165)
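
One common way to avoid holding a lock across file I/O is to queue log lines in memory under the lock and perform the write after the lock has been released. The sketch below illustrates that pattern with hypothetical names (`DeferredLog`, `takeBatch`, `writeBatch`). It is not the MDS logging code, and the changesets listed in this ticket may well take a different approach.

```cpp
#include <cstdio>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical logger: names and structure are illustrative only.
class DeferredLog {
 public:
  // Fast path: only appends to memory while the lock is held.
  void add(const std::string& line) {
    std::lock_guard<std::mutex> guard(mutex_);
    queue_.push_back(line);
  }

  // Swaps the queued lines out under the lock and returns them, so the
  // caller can write them with no lock held.
  std::vector<std::string> takeBatch() {
    std::vector<std::string> batch;
    std::lock_guard<std::mutex> guard(mutex_);
    batch.swap(queue_);
    return batch;
  }

 private:
  std::mutex mutex_;
  std::vector<std::string> queue_;
};

// Performs the potentially slow file write outside any lock, so a blocked
// write cannot stall threads that only want to call add().
void writeBatch(std::FILE* out, const std::vector<std::string>& batch) {
  for (const std::string& line : batch) {
    std::fputs(line.c_str(), out);
    std::fputc('\n', out);
  }
}
```

A logging thread would call `writeBatch(file, log.takeBatch())` periodically, while other threads only ever touch `add()`.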
 

 #0  0x7f7830d70294 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) p gl_mds_library_mutex
$1 = {__data = {__lock = 2, __count = 1, __owner = 10723, __nusers = 1, __kind = 1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
  __size =  ã ) , ' ' repeats 22 times, __align = 4294967298}
(gdb) info thr
  Id   Target Id         Frame
  4    Thread 0x7f7832263b00 (LWP 10723) 0x7f783083e20d in write () from /lib64/libc.so.6
  3    Thread 0x7f7832283b00 (LWP 10722) 0x7f7830844f53 in select () from /lib64/libc.so.6
  2    Thread 0x7f7832243b00 (LWP 10724) 0x7f7830d7076d in read () from /lib64/libpthread.so.0
* 1    Thread 0x7f7832286700 (LWP 10720) 0x7f7830d70294 in __lll_lock_wait () from /lib64/libpthread.so.0

 




[tickets] [opensaf:tickets] #1061 imm: memory leak in dumping resources in PBE

2014-09-15 Thread Neelakanta Reddy
- **status**: unassigned --> accepted
- **assigned_to**: Zoran Milinkovic --> Neelakanta Reddy
- **Part**: - --> tools



---

** [tickets:#1061] imm: memory leak in dumping resources in PBE**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Wed Sep 10, 2014 12:35 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Sep 10, 2014 12:35 PM UTC
**Owner:** Neelakanta Reddy

In dumping resources (in PBE), data are collected, sent as a result, but 
allocated memory (rparams) is not freed.

SaImmAdminOperationParamsT_2 ** rparams;
rparams = (SaImmAdminOperationParamsT_2 **) realloc(NULL, 
sizeof(SaImmAdminOperationParamsT_2 *));
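
The allocate-then-release pairing that this ticket asks for can be illustrated with a small stand-alone sketch. `Param`, `ParamList`, `append`, and `release` are hypothetical stand-ins invented for the example; the real SAF parameter type has more fields, and the actual fix in the PBE code is not shown here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for an admin-operation parameter; not the SAF type.
struct Param {
  char* name;
};

// NULL-terminated array of parameters, grown with realloc().
struct ParamList {
  Param** items = nullptr;
  std::size_t count = 0;
};

// Appends a copy of `name`. Returns false on allocation failure, in which
// case the list is left valid and still NULL-terminated.
bool append(ParamList& list, const char* name) {
  Param** grown = static_cast<Param**>(
      std::realloc(list.items, (list.count + 2) * sizeof(Param*)));
  if (grown == nullptr) return false;
  list.items = grown;
  grown[list.count] = nullptr;  // keep the terminator if a later step fails

  Param* p = static_cast<Param*>(std::malloc(sizeof(Param)));
  if (p == nullptr) return false;
  const std::size_t len = std::strlen(name) + 1;
  p->name = static_cast<char*>(std::malloc(len));
  if (p->name == nullptr) {
    std::free(p);
    return false;
  }
  std::memcpy(p->name, name, len);

  grown[list.count] = p;
  grown[list.count + 1] = nullptr;
  ++list.count;
  return true;
}

// Frees every entry and then the array itself, which is the step the ticket
// reports as missing. Returns the number of entries released.
std::size_t release(ParamList& list) {
  std::size_t freed = 0;
  if (list.items != nullptr) {
    for (Param** it = list.items; *it != nullptr; ++it) {
      std::free((*it)->name);
      std::free(*it);
      ++freed;
    }
    std::free(list.items);
  }
  list.items = nullptr;
  list.count = 0;
  return freed;
}
```

Calling `release` once the result has been sent keeps the array from leaking on every dump request.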




[tickets] [opensaf:tickets] Re: #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Anders Bjornerstedt
Well, one hint is that you managed to bypass the problem (temporarily) by 
increasing a queue size.

The error:
Sep 6 6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too 
many pending incoming fevs messages ( 16) rejecting sync iteration next request
is very rarely seen, but can happen when the fevs turnaround is slower than 
the rate of generated traffic.

So the question for you is simply why this happens in your setup and with 
your traffic, or whether there is anything else unusual about your setup or 
traffic.
If the *only* imm traffic is sync traffic, then it is really strange.

Again, this is a rare problem (in fact, no one has complained about this 
before that I can recall) and it involves a mechanism that has been there 
since the start of OpenSAF.

If the same problem had popped up in testing of 4.5, it would indicate some 
newly introduced problem.
But no one has reported any problem like this.

/AndersBj


From: Adrian Szwej [mailto:adriansz...@users.sf.net]
Sent: den 15 september 2014 12:04
To: [opensaf:tickets]
Subject: [opensaf:tickets] Re: #1072 Sync stop after few payload nodes joining 
the cluster (TCP)


I don't think it is performance problems.
There is nothing indicating CPU load; memory; nor IO bandwith.
This is just a simple node joining seem to trigger some logical bug.
There is no application; but just pure opensaf.

I am now trying to elaborate with different MDS configuration options and MDS 
buffer settings together with MTU 9000 just to see if there is any difference 
in triggering this bug.

Opensaf is running inside containers; meaning there is no virtualization 
overhead.

Could you give me a hint about what could cause the outstanding messages to 
reach 16? E.g. could message loss or a timing issue lead to this?
I have more nodes configured than are actually joining at the moment 
(around 10), but I am bringing them into the cluster one by one.



[tickets:#1072]http://sourceforge.net/p/opensaf/tickets/1072 Sync stop after 
few payload nodes joining the cluster (TCP)

Status: invalid
Milestone: 4.3.3
Created: Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
Last Updated: Mon Sep 15, 2014 07:45 AM UTC
Owner: Anders Bjornerstedt

Communication is MDS over TCP. Cluster 2+3; the scenario is:
Start SCs; start 1 payload; wait for sync; start second payload; wait for sync; 
start 3rd payload. Third one fails; or sometimes it might be the fourth.

It is not possible to get more than 2 or 3 payloads synchronized, because the 
bug is triggered consistently.

The following is triggered in the loading immnd, causing the joining node to 
time out and fail to start up.

Sep 6 6:58:02.096550 osafimmnd [502:immsv_evt.c:5382] T8 Received: 
IMMND_EVT_A2ND_SEARCHNEXT (17) from 2020f
Sep 6 6:58:02.096575 osafimmnd [502:immnd_evt.c:1443]  
immnd_evt_proc_search_next
Sep 6 6:58:02.096613 osafimmnd [502:immnd_evt.c:1454] T2 SEARCH NEXT, Look for 
id:1664
Sep 6 6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too 
many pending incoming fevs messages ( 16) rejecting sync iteration next request
Sep 6 6:58:02.096725 osafimmnd [502:immnd_evt.c:1676]  
immnd_evt_proc_search_next
Sep 6 6:58:03.133230 osafimmnd [502:immnd_proc.c:1980] IN Sync Phase-3: step:540

I have managed to work around this bug temporarily with the following patch:

+++ b/osaf/libs/common/immsv/include/immsv_api.h  Sat Sep 06 08:38:16 2014 +
@@ -70,7 +70,7 @@

 /*Max # of outstanding fevs messages towards director.*/
 /*Note max-max is 255. cb->fevs_replies_pending is an uint8_t*/
-#define IMMSV_DEFAULT_FEVS_MAX_PENDING 16
+#define IMMSV_DEFAULT_FEVS_MAX_PENDING 255

 #define IMMSV_MAX_OBJECTS 1
 #define IMMSV_MAX_ATTRIBUTES 128




Sent from sourceforge.net because you indicated interest in 
https://sourceforge.net/p/opensaf/tickets/1072/

To unsubscribe from further messages, please visit 
https://sourceforge.net/auth/subscriptions/




[tickets] [opensaf:tickets] #1079 longDn : searchInitialize is not default to 100 handles

2014-09-15 Thread surender khetavath



---

** [tickets:#1079] longDn : searchInitialize is not default to 100 handles**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:00 AM UTC by surender khetavath
**Last Updated:** Mon Sep 15, 2014 11:00 AM UTC
**Owner:** nobody

changeset : 5697

Test:
export IMMA_MAX_OPEN_SEARCHES_PER_HANDLE=-1
OmInit()
Call searchInit() in a loop 100 times. The expectation is that the 101st search 
request must result in ERR_NO_RESOURCES, but the return value is SA_AIS_OK.

Statement from the README:
A default maximum of 100 concurrently open search handles per om-handle
are allowed.
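
The behaviour the test expects (an invalid value such as -1 falling back to the documented default of 100) could be sketched as below. This is only an illustration of one way to read such a limit from the environment; the helper names demo_parse_limit and demo_limit_from_env are hypothetical and are not the IMMA library code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Default taken from the README statement quoted in this ticket. */
#define DEMO_DEFAULT_MAX_SEARCHES 100

/* Parse a limit from a string. Anything that is missing, not a plain
 * decimal number, out of range, or not positive yields the default. */
static long demo_parse_limit(const char *raw)
{
    if (raw == NULL || *raw == '\0')
        return DEMO_DEFAULT_MAX_SEARCHES;

    char *end = NULL;
    errno = 0;
    long value = strtol(raw, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0)
        return DEMO_DEFAULT_MAX_SEARCHES;
    return value;
}

/* Read the limit from the named environment variable. */
static long demo_limit_from_env(const char *var)
{
    assert(var != NULL);
    return demo_parse_limit(getenv(var));
}
```

With this sketch, demo_limit_from_env("IMMA_MAX_OPEN_SEARCHES_PER_HANDLE") would return 100 when the variable is set to -1, so the 101st open search could be rejected.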


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.


[tickets] [opensaf:tickets] #1080 2PBE: pbed crashed at immpbe_dump.cc:2273

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1080] 2PBE: pbed crashed at immpbe_dump.cc:2273**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:32 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 11:32 AM UTC
**Owner:** nobody

The issue is seen on SLES X86 VMs running on opensaf changeset 5697 + #946 
patch and 50k objects.

The cluster is brought up with a new DB. Application CCBs are in progress. When 
the same application was run previously, #1063 was observed.

Syslog on SC-2:

Sep 15 15:14:51 SLES-64BIT-SLOT2 osafimmnd[2366]: NO ccbResult: CCB 2 in 
critical state! Commit/apply in progress
Sep 15 15:14:51 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Validation error 
(BAD_OPERATION) reported by implementer 'OpenSafImmPBE', Ccb 2 will be aborted
Sep 15 15:14:51 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Ccb 2 ABORTED 
(immcfg_SLES-64BIT-SLOT2_3938)
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer locally 
disconnected. Marking it as doomed 16 304, 2020f (@OpenSafImmPBE)
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer locally 
disconnected. Marking it as doomed 17 305, 2020f (OsafImmPbeRt_B)
Sep 15 15:14:56 SLES-64BIT-SLOT2 kernel: [ 2550.400926] osafimmpbed[2916] 
general protection ip:7f7168a3a812 sp:7f71689b9578 error:0 in 
libc-2.11.3.so[7f71689bb000+16b000]
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer disconnected 
16 304, 2020f (@OpenSafImmPBE)
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer disconnected 
17 305, 2020f (OsafImmPbeRt_B)
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Validation error 
(BAD_OPERATION) reported by implementer 'OpenSafImmPBE', Ccb 3 will be aborted
Sep 15 15:14:56 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Ccb 3 ABORTED 
(immcfg_SLES-64BIT-SLOT2_3945)

Backtrace of the core file:

(gdb) bt full
  #0  0x7f7168a3a812 in __strlen_sse2 () from /lib64/libc.so.6
No symbol table info available.
  #1  0x7f7169f71fe0 in std::basic_string<char, std::char_traits<char>, 
std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) 
()
   from /usr/lib/../lib64/libstdc++.so.6
No symbol table info available.
  #2  0x00417953 in objectToPBE(std::string, SaImmAttrValuesT_2 
const**, std::map<std::string, ClassInfo*, std::less<std::string>, 
std::allocator<std::pair<std::string const, ClassInfo*> > >*, void*, unsigned 
int, char*, unsigned long long) () at immpbe_dump.cc:2273
sPbeFileName = {static npos = <optimized out>, _M_dataplus = 
{<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data 
fields>}, <No data fields>}, 
_M_p = 0x67ec48 "/home/sirisha/immsv/immpbe//imm.db.2020f"}}
preparedSql = {
  0x422700 "INSERT INTO objects_int_multi (obj_id, attr_name, int_val) VALUES (?, ?, ?)", 
  0x422750 "INSERT INTO objects_real_multi (obj_id, attr_name, real_val) VALUES (?, ?, ?)", 
  0x4227a0 "INSERT INTO objects_text_multi (obj_id, attr_name, text_val) VALUES (?, ?, ?)", 
  0x4227f0 "INSERT INTO classes (class_id, class_name, class_category) VALUES (?, ?, ?)", 
  0x422840 "INSERT INTO attr_def (class_id, attr_name, attr_type, attr_flags) VALUES (?, ?, ?, ?)", 
  0x422898 "INSERT INTO attr_dflt (class_id, attr_name, int_dflt) VALUES (?, ?, ?)", 
  0x4228e0 "INSERT INTO attr_dflt (class_id, attr_name, real_dflt) VALUES (?, ?, ?)", 
  0x422928 "INSERT INTO attr_dflt (class_id, attr_name, text_dflt) VALUES (?, ?, ?)", 
  0x422970 "INSERT INTO objects (obj_id, class_id, dn, last_ccb) VALUES (?, ?, ?, ?)", 
  0x4229c0 "INSERT INTO ccb_commits (ccb_id, epoch, commit_time) VALUES (?, ?, ?)", 
  0x422a08 "SELECT obj_id FROM objects WHERE class_id = ?", 
  0x422a38 "SELECT obj_id,class_id FROM objects WHERE dn = ?", 
  0x422a70 "SELECT class_name FROM classes WHERE class_id = ?", 
  0x422aa8 "SELECT class_id FROM classes WHERE class_name = ?", 
  0x422ae0 "SELECT attr_type,attr_flags FROM attr_def WHERE class_id = ? AND attr_name = ?", 
  0x422b30 "SELECT epoch, commit_time FROM ccb_commits WHERE ccb_id = ?", 
  0x422b70 "DELETE FROM classes WHERE class_id = ?", 
  0x422b98 "DELETE FROM objects WHERE obj_id = ?", 
  0x422bc0 "DELETE FROM attr_def WHERE class_id = ?", 
  0x422be8 "DELETE FROM attr_dflt WHERE class_id = ?", 
  0x422c18 "DELETE FROM objects_int_multi WHERE obj_id = ?", 
  0x422c48 "DELETE FROM objects_real_multi WHERE obj_id = ?", 
  0x422c78 "DELETE FROM objects_text_multi WHERE obj_id = ?", 
  0x422ca8 "DELETE FROM objects_int_multi WHERE obj_id = ? AND attr_name = ?", 
  0x422cf0 "DELETE FROM objects_real_multi WHERE obj_id = ? AND attr_name = ?", 
  0x422d38 "DELETE FROM objects_text_multi WHERE obj_id = ? AND attr_name = ?", 
  0x422d80 "DELETE FROM objects_int_multi WHERE obj_id = ? AND attr_name = ? AND int_val = ?", 
  0x422dd8 "DELETE FROM objects_real_multi WHERE obj_id = ? AND attr_name = ? AND real_val = ?", 
  0x422e30 "DELETE FROM objects_text_multi WHERE obj_id = ? AND 

[tickets] [opensaf:tickets] #1077 opensaf randomly and frequently fails to start with trace enabled

2014-09-15 Thread Hans Feldt
- **status**: accepted --> review



---

** [tickets:#1077] opensaf randomly and frequently fails to start with trace 
enabled**

**Status:** review
**Milestone:** 4.5.0
**Created:** Mon Sep 15, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Mon Sep 15, 2014 07:08 AM UTC
**Owner:** Hans Feldt

In IMM and NTF, logging and tracing are done between fork and exec. This, 
together with the added call to tzset() in logtrace, creates a deadlock in the 
child. Here's an example of how immpbed hangs forever (no supervision in immnd):

The system appears to have started correctly, but configuration changes time 
out:

root@SC-1:/# immcfg -a saAmfClusterStartupTimeout=100 
safAmfCluster=myAmfCluster
error - saImmOmCcbObjectModify_2 FAILED: SA_AIS_ERR_TRY_AGAIN (6)
error - immcfg command timed out (alarm)


All processes, including pbe, have started:

root   391  0.0  0.0 146660  1144 ?Ssl 07:08   0:00 
/usr/local/lib/opensaf/osafrded
root   405  0.0  0.0 148848  1144 ?Ssl 07:08   0:00 
/usr/local/lib/opensaf/osaffmd
root   414  0.0  0.0 157324  1428 ?SNsl 07:08   0:00 
/usr/local/lib/opensaf/osafimmd
root   423  0.0  0.0 238192  2600 ?SNsl 07:08   0:00 
/usr/local/lib/opensaf/osafimmnd --tracemask=0x
root   437  0.0  0.0 227412  3884 ?SNsl 07:08   0:00 
/usr/local/lib/opensaf/osaflogd
root   449  0.0  0.0 159552  1564 ?SNsl 07:08   0:00 
/usr/local/lib/opensaf/osafntfd
root   459  0.0  0.0 157892  1708 ?SNsl 07:08   0:00 
/usr/local/lib/opensaf/osafclmd
root   464  0.0  0.0 164344  1052 ?SN   07:08   0:00 
/usr/local/lib/opensaf/osafimmnd --tracemask=0x
root   469  0.0  0.0 146656  1156 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafclmna
root   477  0.0  0.0 167104  2712 ?Ssl 07:08   0:00 
/usr/local/lib/opensaf/osafamfd
root   486  0.0  0.0 225600  1964 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafamfnd
root   499  0.0  0.0 148728  1036 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafsmfnd
root   504  0.0  0.0 254200  1840 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafsmfd
root   536  0.0  0.0 157928  1904 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafckptnd
root   555  0.0  0.0 146644  1036 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafamfwd
root   596  0.0  0.0 153592  1332 ?Ssl  07:08   0:00 
/usr/local/lib/opensaf/osafckptd

The gdb backtrace shows that pbe is hanging in the newly added tzset() call in logtrace:

(gdb) bt
#0  __lll_lock_wait_private () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x7f180cee39de in _L_lock_2427 () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f180cee37b1 in __tzset () at tzset.c:598
#3  0x7f180db5e8cf in output (file=0x4722ff "immnd_proc.c", line=1577, 
priority=priority@entry=7, category=category@entry=1, 
format=format@entry=0x472452 "Exec: %s %s %s", ap=ap@entry=0x7fff58b213d8)
at logtrace.c:96
#4  0x7f180db5ed9b in _logtrace_trace (file=file@entry=0x4722ff 
"immnd_proc.c", line=line@entry=1577, category=category@entry=1, 
format=format@entry=0x472452 "Exec: %s %s %s") at logtrace.c:173
#5  0x00409cee in immnd_forkPbe (cb=cb@entry=0x691540 <_immnd_cb>) at 
immnd_proc.c:1577
#6  0x0041e570 in immnd_proc_server 
(timeout=timeout@entry=0x7fff58b21fd8) at immnd_proc.c:2111
#7  0x0040a763 in main (argc=<optimized out>, argv=<optimized out>) at 
immnd_main.c:355
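
The backtrace shows a trace call made in the child between fork and exec. In a multithreaded process, POSIX only guarantees async-signal-safe functions in the child until exec, because a lock (such as the one tzset() takes inside libc) may have been held by another thread at the moment of the fork. A minimal sketch of a safer pattern follows, assuming a hypothetical helper demo_spawn that is not part of OpenSAF: log in the parent before forking, and call only execv(), write() and _exit() in the child. The waitpid() call is there only to make the sketch easy to try out; a daemon such as immnd would not block on its child like this.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv[0]. Returns the child's exit status, or -1 if the
 * fork or wait failed or the child did not exit normally. */
static int demo_spawn(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    /* All logging happens here, in the parent, before fork(). */
    fprintf(stderr, "Exec: %s\n", argv[0]);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on. */
        execv(argv[0], argv);

        /* execv() returned, so it failed. write() and _exit() are
         * async-signal-safe; printf-style logging is not. */
        static const char msg[] = "exec failed\n";
        if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
            /* Nothing safe to do about a failed write here. */
        }
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```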

Sep 13 07:08:17 SC-1 osafimmnd[423]: NO STARTING PBE process.
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO 
pbe-db-file-path:/srv/shared/imm//imm.db VETERAN:0 B:0
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO Implementer connected: 2 
(safClmService) 13, 2010f
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO implementer for class 'SaClmNode' is 
safClmService = class extent is safe.
Sep 13 07:08:17 SC-1 osafimmnd[423]: NO implementer for class 'SaClmCluster' is 
safClmService = class extent is safe.
Sep 13 07:08:17 SC-1 osafclmna[469]: Started
Sep 13 07:08:17 SC-1 osafclmna[469]: NO safNode=SC-1,safCluster=myClmCluster 
Joined cluster, nodeid=2010f
Sep 13 07:08:17 SC-1 osafamfd[477]: Started
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeAMFWDOG'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO COMPONENT_FAILOVER(3) used instead of 
NO_RECOMMENDATION(1) for 'safVersion=4.0.0,safCompType=OpenSafCompTypeAMFWDOG'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeCPND'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO COMPONENT_FAILOVER(3) used instead of 
NO_RECOMMENDATION(1) for 'safVersion=4.0.0,safCompType=OpenSafCompTypeCPND'
Sep 13 07:08:17 SC-1 osafamfd[477]: NO Invalid configuration, 
saAmfCtDefRecoveryOnError=NO_RECOMMENDATION(1) for 
'safVersion=4.0.0,safCompType=OpenSafCompTypeSMFND'
Sep 

[tickets] [opensaf:tickets] #1081 Node went for reboot in the middle of switchover

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1081] Node went for reboot in the middle of switchover**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 11:45 AM UTC
**Owner:** nobody

The issue is seen on SLES X86 VMs running with opensaf changeset 5697 +#946 
patch. IMM Db is loaded with 50k objects.

IMM Application along with switchovers is in progress.

Syslog on SC-2:

Sep 15 15:16:03 SLES-64BIT-SLOT2 osafamfnd[2452]: NO Assigned 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer disconnected 2 
0, 2010f (@OpenSafImmReplicatorA)
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer (applier) 
connected: 32 (@OpenSafImmReplicatorA) 0, 2010f
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10036/4294967350 numOps:1
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10036/4294967350 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer (applier) 
connected: 33 (@OpenSafImmReplicatorB) 348, 2020f
Sep 15 15:16:03 SLES-64BIT-SLOT2 osafntfimcnd[4058]: NO Started
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10036/4294967350 numOps:1
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10036/4294967350 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10036/4294967350 numOps:1
Sep 15 15:16:04 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10036/4294967350 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10036/4294967350 numOps:1
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10036/4294967350 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10036/4294967350 numOps:1
Sep 15 15:16:05 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10036/4294967350 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer locally 
disconnected. Marking it as doomed 28 5, 2020f (safClmService)
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer disconnected 
28 5, 2020f (safClmService)
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer connected: 34 
(safClmService) 353, 2020f
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmpbed: IN ccb-prepare received at PBE 
slave ccbId:10037/4294967351 numOps:1
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmpbed: NO Prepare 
ccb:10037/4294967351 received at Pbe slave when Prior Ccb 22 still 
processing
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmpbed: IN PBE slave waiting for prepare 
from primary on PRTA update ccb:10036
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafclmd[2417]: ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafclmd[2417]: ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafclmd[2417]: ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafamfnd[2452]: NO 
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafamfnd[2452]: ER 
safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafamfnd[2452]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131599, SupervisionTime = 60
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer locally 
disconnected. Marking it as doomed 34 353, 2020f (safClmService)
Sep 15 15:16:06 SLES-64BIT-SLOT2 osafimmnd[2366]: NO Implementer disconnected 
34 353, 2020f (safClmService)
Sep 15 

[tickets] [opensaf:tickets] #1061 imm: memory leak in dumping resources in PBE

2014-09-15 Thread Neelakanta Reddy
- **status**: accepted --> review



---

** [tickets:#1061] imm: memory leak in dumping resources in PBE**

**Status:** review
**Milestone:** 4.5.0
**Created:** Wed Sep 10, 2014 12:35 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 10:18 AM UTC
**Owner:** Neelakanta Reddy

When dumping resources (in PBE), data is collected and sent as a result, but 
the allocated memory (rparams) is not freed.

SaImmAdminOperationParamsT_2 ** rparams;
rparams = (SaImmAdminOperationParamsT_2 **) realloc(NULL, 
sizeof(SaImmAdminOperationParamsT_2 *));
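
As a sketch of the ownership pattern this ticket is about, the example below grows a NULL-terminated array of heap-allocated parameters with realloc() and releases everything once the result has been handed over. demo_param_t is a simplified stand-in for SaImmAdminOperationParamsT_2, and the helper names are hypothetical; the actual fix in PBE may look different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for SaImmAdminOperationParamsT_2. */
typedef struct {
    char *paramName;
    long long value;
} demo_param_t;

/* Append one parameter to a NULL-terminated array. Returns the (possibly
 * moved) array, or NULL on allocation failure, in which case the original
 * array is left untouched and still owned by the caller. */
static demo_param_t **demo_params_add(demo_param_t **arr, size_t *count,
                                      const char *name, long long value)
{
    assert(count != NULL && name != NULL);

    demo_param_t *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->paramName = malloc(strlen(name) + 1);
    if (p->paramName == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->paramName, name);
    p->value = value;

    /* Room for the existing entries, the new one, and the terminator. */
    demo_param_t **grown = realloc(arr, (*count + 2) * sizeof(*grown));
    if (grown == NULL) {
        free(p->paramName);
        free(p);
        return NULL;
    }
    grown[*count] = p;
    grown[*count + 1] = NULL;
    *count += 1;
    return grown;
}

/* Free every element, its name, and finally the array itself. */
static void demo_params_free(demo_param_t **arr)
{
    if (arr == NULL)
        return;
    for (size_t i = 0; arr[i] != NULL; i++) {
        free(arr[i]->paramName);
        free(arr[i]);
    }
    free(arr);
}
```

The point of the sketch is the last step: after the collected parameters have been sent as the result, a call like demo_params_free(rparams) is what prevents the leak.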




[tickets] [opensaf:tickets] #1082 imm: check the adminowner of same name exists before creating

2014-09-15 Thread Neelakanta Reddy



---

** [tickets:#1082] imm: check the adminowner of same name exists before 
creating**

**Status:** unassigned
**Milestone:** 4.5.0
**Created:** Mon Sep 15, 2014 12:20 PM UTC by Neelakanta Reddy
**Last Updated:** Mon Sep 15, 2014 12:20 PM UTC
**Owner:** nobody

1. Without PBE enabled, the number of adminowners is:

immadm -O displayverbose -p resource:SA_STRING_T:adminowners 
opensafImm=opensafImm,safApp=safImmService
[using admin-owner: 'safImmService']
Object: opensafImm=opensafImm,safApp=safImmService
Name   Type Value(s)

safImmService  SA_INT64_T   131343 (0x2010f)

2. With PBE enabled, the number of adminowners is:
immcfg -m -a saImmRepositoryInit=1 safRdn=immManagement,safApp=safImmService

 immadm -O displayverbose -p resource:SA_STRING_T:adminowners 
opensafImm=opensafImm,safApp=safImmService
[using admin-owner: 'safImmService']
Object: opensafImm=opensafImm,safApp=safImmService
Name   Type Value(s)

safImmService  SA_INT64_T   131343 (0x2010f)
safImmService  SA_INT64_T   131343 (0x2010f)

When creating a new adminowner, check whether an adminowner with the same name 
is already present and assign it, instead of creating a new adminowner.
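
The proposed check amounts to a "get or create" lookup by name. A minimal sketch follows, using a made-up fixed-size table (demo_owner_table_t) rather than the real IMM data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_OWNERS 8
#define DEMO_NAME_LEN 64

/* Hypothetical admin-owner record; not the IMM model's type. */
typedef struct {
    char name[DEMO_NAME_LEN];
    unsigned refs; /* how many users share this owner */
} demo_owner_t;

typedef struct {
    demo_owner_t owners[DEMO_MAX_OWNERS];
    size_t count;
} demo_owner_table_t;

/* Return the existing owner with this name if there is one; otherwise
 * create it. Returns NULL if the table is full or the name is too long. */
static demo_owner_t *demo_owner_get_or_create(demo_owner_table_t *t,
                                              const char *name)
{
    assert(t != NULL && name != NULL);

    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->owners[i].name, name) == 0) {
            t->owners[i].refs++;
            return &t->owners[i];
        }
    }

    if (t->count == DEMO_MAX_OWNERS || strlen(name) >= DEMO_NAME_LEN)
        return NULL;

    demo_owner_t *o = &t->owners[t->count++];
    strcpy(o->name, name);
    o->refs = 1;
    return o;
}
```

With a lookup like this, requesting 'safImmService' twice would leave one entry in the table instead of the two shown in the second listing.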





[tickets] [opensaf:tickets] #1081 Node went for reboot in the middle of switchover

2014-09-15 Thread Neelakanta Reddy
- **status**: unassigned --> duplicate
- **Comment**:

Duplicate of #1067. Please retry with #1067.

Evidence below (same as #1067):

Sep 15 15:16:06.199563 osafclmd [2417:clms_imm.c:0488] IN saImmOiRtObjectUpdate 
for safNode=PL-3,safCluster=myClmCluster failed with rc = 20. Reinit with IMM

Sep 15 15:16:06.203193 osafclmd [2417:clms_imm.c:2319]  clm_imm_reinit_bg
Sep 15 15:16:06.203224 osafclmd [2417:clms_imm.c:2326]  clm_imm_reinit_bg


Sep 15 15:16:06.203336 osafclmd [2417:clms_imm.c:0488] IN saImmOiRtObjectUpdate 
for safNode=PL-4,safCluster=myClmCluster failed with rc = 9. Reinit with IMM
Sep 15 15:16:06.203343 osafclmd [2417:imma_oi_api.c:0622]  saImmOiFinalize
Sep 15 15:16:06.203348 osafclmd [2417:imma_oi_api.c:0626] T2 ERR_BAD_HANDLE: No 
initialized handle exists!
Sep 15 15:16:06.203353 osafclmd [2417:clms_imm.c:2319]  clm_imm_reinit_bg
Sep 15 15:16:06.203380 osafclmd [2417:clms_imm.c:2326]  clm_imm_reinit_bg


Sep 15 15:16:06.203468 osafclmd [2417:clms_imm.c:0488] IN saImmOiRtObjectUpdate 
for safNode=SC-1,safCluster=myClmCluster failed with rc = 9. Reinit with IMM
Sep 15 15:16:06.203473 osafclmd [2417:imma_oi_api.c:0622]  saImmOiFinalize
Sep 15 15:16:06.203477 osafclmd [2417:imma_oi_api.c:0626] T2 ERR_BAD_HANDLE: No 
initialized handle exists!
Sep 15 15:16:06.203482 osafclmd [2417:clms_imm.c:2319]  clm_imm_reinit_bg
Sep 15 15:16:06.203501 osafclmd [2417:clms_imm.c:2326]  clm_imm_reinit_bg


Sep 15 15:16:06.203590 osafclmd [2417:imma_oi_api.c:2275] T2 ERR_BAD_HANDLE: No 
initialized handle exists!
Sep 15 15:16:06.203596 osafclmd [2417:clms_imm.c:0488] IN saImmOiRtObjectUpdate 
for safNode=SC-2,safCluster=myClmCluster failed with rc = 9. Reinit with IMM
Sep 15 15:16:06.203601 osafclmd [2417:imma_oi_api.c:0622]  saImmOiFinalize
Sep 15 15:16:06.203606 osafclmd [2417:imma_oi_api.c:0626] T2 ERR_BAD_HANDLE: No 
initialized handle exists!
Sep 15 15:16:06.203611 osafclmd [2417:clms_imm.c:2319]  clm_imm_reinit_bg
Sep 15 15:16:06.203632 osafclmd [2417:clms_imm.c:2326]  clm_imm_reinit_bg




Sep 15 15:16:06.617115 osafclmd [2417:clms_imm.c:2286] ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 15 15:16:06.617600 osafclmd [2417:clms_imm.c:2286] ER saImmOiImplementerSet 
failed rc:14, exiting
Sep 15 15:16:06.617872 osafclmd [2417:clms_imm.c:2286] ER saImmOiImplementerSet 
failed rc:14, exiting




---

** [tickets:#1081] Node went for reboot in the middle of switchover**

**Status:** duplicate
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:45 AM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 11:45 AM UTC
**Owner:** nobody


[tickets] [opensaf:tickets] #966 osaf: Global constant name 'kMaxDnLength' lacks prefix

2014-09-15 Thread Anders Widell
- **status**: assigned --> accepted



---

** [tickets:#966] osaf: Global constant name 'kMaxDnLength' lacks prefix**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Thu Jul 24, 2014 11:12 AM UTC by Anders Bjornerstedt
**Last Updated:** Fri Aug 15, 2014 03:12 PM UTC
**Owner:** Anders Widell

Ticket #191 added support for extended names (long DNs) to opensaf.
A new sanity limit is still defined by the global constant kMaxDnLength.
But that constant lacks any prefix, so it is not clear where it belongs 
(which component owns it), and it could potentially cause naming 
conflicts in the future. 






[tickets] [opensaf:tickets] #1051 ais_name_borrow/lend ruins trace of services

2014-09-15 Thread Anders Widell
- **status**: assigned --> accepted



---

** [tickets:#1051] ais_name_borrow/lend ruins trace of services**

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:46 AM UTC by Hans Feldt
**Last Updated:** Tue Sep 09, 2014 07:46 AM UTC
**Owner:** Anders Widell

Utility functions like these, which are called a lot, cannot have any TRACE 
macros in them. The TRACE macros need to be removed.




[tickets] [opensaf:tickets] #1083 imm: runtime attributes are trimmed away for PBE

2014-09-15 Thread Zoran Milinkovic



---

** [tickets:#1083] imm: runtime attributes are trimmed away for PBE**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 01:29 PM UTC
**Owner:** nobody

Runtime attributes are trimmed away for PBE.

The problem is in the condition of the if statement, which is always true if 
the attribute flags contain the SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

The ~ operator should be replaced with !.
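
The difference can be shown in isolation. In C, ~ is the bitwise complement, so ~(x) is nonzero (true) for every x except the all-ones value, whereas ! is logical negation. The sketch below uses made-up flag values (DEMO_ATTR_*), not the real SA_IMM_ATTR_* constants, and the function names are for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, chosen only for this illustration. */
#define DEMO_ATTR_RUNTIME    0x00010000u
#define DEMO_ATTR_PERSISTENT 0x00020000u
#define DEMO_ATTR_NOTIFY     0x02000000u

/* Form with ~: the two trailing terms are never zero for these masks,
 * so the result depends on the RUNTIME bit alone. */
static bool cond_buggy(uint32_t flags)
{
    return (flags & DEMO_ATTR_RUNTIME) &&
           ~(flags & DEMO_ATTR_PERSISTENT) &&
           ~(flags & DEMO_ATTR_NOTIFY);
}

/* Form with !: true only for runtime attributes that are neither
 * persistent nor marked for notification. */
static bool cond_fixed(uint32_t flags)
{
    return (flags & DEMO_ATTR_RUNTIME) &&
           !(flags & DEMO_ATTR_PERSISTENT) &&
           !(flags & DEMO_ATTR_NOTIFY);
}

/* Small self-check showing where the two forms disagree. */
static void demo_self_check(void)
{
    uint32_t persistent_rt = DEMO_ATTR_RUNTIME | DEMO_ATTR_PERSISTENT;
    assert(cond_buggy(persistent_rt));  /* wrongly matches */
    assert(!cond_fixed(persistent_rt)); /* correctly excluded */
}
```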




[tickets] [opensaf:tickets] #229 Toggle TIPC_USE_SUBSLOT_ID automatically during installation by providing a configure option --enable-subslot

2014-09-15 Thread Mathi Naickan
- **Milestone**: 4.5.0 --> 4.6.FC



---

** [tickets:#229] Toggle TIPC_USE_SUBSLOT_ID automatically during installation 
by providing a configure option --enable-subslot**

**Status:** assigned
**Milestone:** 4.6.FC
**Created:** Wed May 15, 2013 10:31 AM UTC by Mathi Naickan
**Last Updated:** Fri Aug 15, 2014 03:17 PM UTC
**Owner:** Mathi Naickan

After installation, we suggest that the user toggle the variable 
TIPC_USE_SUBSLOT_ID in nid.conf. 
However, generic upgrade scripts need not have the intelligence to know what 
options opensaf was built with and accordingly take a backup of nid.conf contents.
By providing a configure option --enable-subslot and automatically toggling 
TIPC_USE_SUBSLOT_ID during installation, the upgrade scripts need not be 
modified for toggling this variable while upgrading from 4.2 to 4.3.




[tickets] [opensaf:tickets] #738 Handling of negative timeout values in API

2014-09-15 Thread Mathi Naickan
- **Milestone**: 4.5.0 --> 4.6.FC
- **Comment**:

In preparation for 4.5.RC1.



---

** [tickets:#738] Handling of negative timeout values in API**

**Status:** assigned
**Milestone:** 4.6.FC
**Created:** Wed Jan 22, 2014 08:35 AM UTC by Sirisha Alla
**Last Updated:** Fri Aug 15, 2014 03:16 PM UTC
**Owner:** Mathi Naickan

The issue is seen on changeset 4733 | patches for #220.

saClmClusterNodeGet_4 API is invoked with -3 as input for the timeout 
parameter. ERR_TIMEOUT is returned, whereas the closest matching return value 
according to the specification would be INVALID_PARAM.
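
A minimal sketch of the expected behaviour, assuming the timeout is a signed 64-bit value: the argument is checked before any waiting starts, so a negative value is reported as an invalid parameter rather than as a timeout. The `Rc` enum and `validate_timeout` are hypothetical stand-ins, not the CLM library's code.

```cpp
#include <cstdint>

// Local stand-ins for the two outcomes discussed in the ticket.
enum class Rc { Ok, InvalidParam };

// Hypothetical helper: reject a negative timeout up front so the caller
// never reaches the code path that would eventually report a timeout.
Rc validate_timeout(std::int64_t timeout) {
    if (timeout < 0) {
        return Rc::InvalidParam;
    }
    return Rc::Ok;
}
```

Calling `validate_timeout(-3)` yields `Rc::InvalidParam`, matching the return value the ticket argues for.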




[tickets] [opensaf:tickets] #1084 imm: wrong trace log in setImplementer

2014-09-15 Thread Zoran Milinkovic



---

** [tickets:#1084] imm: wrong trace log in setImplementer**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:41 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 01:41 PM UTC
**Owner:** nobody

In ImmModel::setImplementer for regular OI, a wrong trace is logged if a class 
implementer has already been set on the object.

TRACE_7("ERR_EXIST: Object '%s' already has class "
    "implementer %s "
    "conflicts with setting object implementer",
    objectName.c_str(),
    obj->mImplementer->mImplementerName.c_str());

This code may crash immnd if obj->mImplementer is NULL.
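
One way to make such a trace argument safe is to resolve the name through a helper that tolerates a missing implementer. This is a sketch under assumptions: `ImplementerInfo` and `ObjectInfo` are reduced, hypothetical stand-ins for the structures named in the ticket, and the `"<none>"` placeholder is an arbitrary choice.

```cpp
#include <string>

// Reduced, hypothetical stand-ins for the structures named in the ticket.
struct ImplementerInfo { std::string mImplementerName; };
struct ObjectInfo { ImplementerInfo* mImplementer = nullptr; };

// Returns a string that is safe to pass to a "%s" trace argument even when
// no implementer is attached to the object.
const char* implementer_name_for_trace(const ObjectInfo& obj) {
    return obj.mImplementer ? obj.mImplementer->mImplementerName.c_str()
                            : "<none>";
}
```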




[tickets] [opensaf:tickets] #1084 imm: wrong trace log in setImplementer

2014-09-15 Thread Zoran Milinkovic
- **Component**: unknown --> imm
- **Part**: - --> nd



---

** [tickets:#1084] imm: wrong trace log in setImplementer**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:41 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 01:41 PM UTC
**Owner:** nobody

In ImmModel::setImplementer for regular OI, a wrong trace is logged if a class 
implementer has already been set on the object.

TRACE_7("ERR_EXIST: Object '%s' already has class "
    "implementer %s "
    "conflicts with setting object implementer",
    objectName.c_str(),
    obj->mImplementer->mImplementerName.c_str());

This code may crash immnd if obj->mImplementer is NULL.




[tickets] [opensaf:tickets] #1050 amfnd sometimes fails to start due to ERR_LIBRARY from saImmOmInitialize

2014-09-15 Thread Hans Feldt
Please check/test these patches


Attachment: osaf-1050.tgz (5.5 kB; application/x-compressed-tar) 


---

** [tickets:#1050] amfnd sometimes fails to start due to ERR_LIBRARY from 
saImmOmInitialize**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:08 AM UTC by Hans Feldt
**Last Updated:** Fri Sep 12, 2014 09:26 AM UTC
**Owner:** Hans Feldt

With MDS/TIPC amfnd randomly fails to start causing failed opensaf start.

osafimmnd logs the infamous immnd_evt_proc_imm_init: ... MDS problem?

Reason is a random timing variation of the TIPC topology DOWN event. This 
sometimes causes the DOWN event to wrongly delete a newly added process_info 
entry.

The trigger for this problem is that some IMM clients in opensaf, like amfnd, 
do not reuse IMM handles but initialize/finalize in a far from optimal way. 
This should also be fixed.

The solution under test consists of two parts:
1) The MDS down event just starts a timer in MDS; when the timeout event 
happens, the process_info entry is deleted.

2) A new explicit disconnect() is added to the MDS API, which is used by the IMMA 
library when it is about to close down the whole core library.
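
The two parts above can be sketched as a small registry. This is not the MDS data structure: `ProcessRegistry`, its method names, and the integer peer id are hypothetical. The ticket does not say how a later UP event interacts with a pending timer; the sketch assumes it cancels the pending removal.

```cpp
#include <chrono>
#include <map>

using Clock = std::chrono::steady_clock;

// Hypothetical registry keyed by a peer id.
class ProcessRegistry {
public:
    explicit ProcessRegistry(Clock::duration grace) : grace_(grace) {}

    // UP event: (re)add the entry and cancel any pending removal (assumption).
    void on_up(int peer) { entries_[peer] = Entry{false, Clock::time_point{}}; }

    // DOWN event: do not erase the entry, only arm a deadline.
    void on_down(int peer, Clock::time_point now) {
        auto it = entries_.find(peer);
        if (it != entries_.end()) {
            it->second = Entry{true, now + grace_};
        }
    }

    // Explicit disconnect: the peer announced it is leaving, erase at once.
    void on_disconnect(int peer) { entries_.erase(peer); }

    // Timer tick: erase entries whose deadline has passed.
    void on_tick(Clock::time_point now) {
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (it->second.pending && now >= it->second.deadline) {
                it = entries_.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool contains(int peer) const { return entries_.count(peer) != 0; }

private:
    struct Entry {
        bool pending;                // true once a DOWN event armed the timer
        Clock::time_point deadline;  // when the entry may be erased
    };
    Clock::duration grace_;
    std::map<int, Entry> entries_;
};
```

Passing the current time into `on_down` and `on_tick` keeps the sketch deterministic and easy to exercise without real timers.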





[tickets] [opensaf:tickets] #984 imm: IMM PR doc update for 4.5

2014-09-15 Thread Neelakanta Reddy
- **status**: assigned --> accepted



---

** [tickets:#984] imm: IMM PR doc update for 4.5 **

**Status:** accepted
**Milestone:** 4.5.0
**Created:** Thu Aug 14, 2014 10:15 AM UTC by Neelakanta Reddy
**Last Updated:** Fri Aug 15, 2014 09:50 PM UTC
**Owner:** Neelakanta Reddy

Update the IMMsv PR document with the 4.5 enhancements:

1. Support for saImmOmCcbAbort() and saImmOmCcbValidate() -- #798
2. Allow admin-operations directly targeting an implementer or applier -- #799
3. Attribute 'longDnsAllowed' added to class 'OpensafImm'  -- #897
4. Allow schema change to add attribute default  -- #895
5. Support for configurable OI callback timeout -- #16
6. Notes on upgrading from OpenSAF 4.[1,2,3,4] to OpenSAF 4.5





[tickets] [opensaf:tickets] #1085 imm: memory leak in finalizeSync

2014-09-15 Thread Zoran Milinkovic



---

** [tickets:#1085] imm: memory leak in finalizeSync**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 02:03 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 02:03 PM UTC
**Owner:** nobody

In ImmModel::finalizeSync, memory is not deleted for CcbInfo and 
AdminOwnerInfo structs if errors happen in the same block.

For AdminOwnerInfo:

AdminOwnerInfo* info = new AdminOwnerInfo;

...

if((info->mDying) && (!(info->mReleaseOnFinalize))) {
    LOG_ER("finalizeSync client: Admo is dying yet "
        "releaseOnFinalize is false");
    err = SA_AIS_ERR_FAILED_OPERATION;
    goto done;
}

...

if(oi == sObjectMap.end()) {
    LOG_ER("Sync client failed to locate object: "
        "%s, will restart.", objectName.c_str());
    err = SA_AIS_ERR_FAILED_OPERATION;
    goto done;
}

For CcbInfo:

CcbInfo* newCcb = new CcbInfo;

...

if(newCcb->mWaitStartTime <= ((time_t) 0)) {
    LOG_ER("newCcb->mWaitStartTime <= 0");
    err = SA_AIS_ERR_FAILED_OPERATION;
    goto done;
}

...

if((newCcb->isActive())) {
    LOG_ER("Can not sync Ccb that is active");
    err = SA_AIS_ERR_FAILED_OPERATION;
    goto done;
}
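
One common way to avoid this kind of leak is to hold the new object in a `std::unique_ptr` until validation has succeeded, so that every early exit frees it. The sketch below is an assumption about how that could look, not the fix applied in IMM; `AdminOwnerStub` and `build_admin_owner` are hypothetical, and the live-object counter exists only to make the effect observable.

```cpp
#include <memory>

// Counts live objects so the effect of an early exit can be observed.
struct AdminOwnerStub {
    static int live;
    AdminOwnerStub() { ++live; }
    ~AdminOwnerStub() { --live; }
    bool mDying = false;
    bool mReleaseOnFinalize = false;
};
int AdminOwnerStub::live = 0;

// Hypothetical reduction of the error path: the object is owned by a
// std::unique_ptr until validation succeeds, so an early return frees it.
// Returns the validated object, or nullptr on failure.
std::unique_ptr<AdminOwnerStub> build_admin_owner(bool dying, bool release) {
    auto info = std::make_unique<AdminOwnerStub>();
    info->mDying = dying;
    info->mReleaseOnFinalize = release;
    if (info->mDying && !info->mReleaseOnFinalize) {
        return nullptr;  // info is destroyed here; nothing leaks
    }
    return info;
}
```

The same ownership pattern would apply to the CcbInfo case; with raw pointers, the equivalent is an explicit `delete` before each `goto done`.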





[tickets] [opensaf:tickets] #1086 imm: wrong trace log in cleanTheBasement

2014-09-15 Thread Zoran Milinkovic



---

** [tickets:#1086] imm: wrong trace log in cleanTheBasement**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 02:20 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 02:20 PM UTC
**Owner:** nobody

Wrong trace log in ImmModel::cleanTheBasement.

TRACE_5("Timeout on sImplDetachTime implid:%u", ci2->second.mConn);

The same trace may crash IMM, because ci2 is sPbeRtReqContinuationMap.end() 
from the previous step.
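
The hazard is the usual one with map iterators: an iterator equal to `end()` must not be dereferenced. A minimal sketch of the guarded pattern follows; `Continuation` and `describe_timeout` are hypothetical names, and the map type is a simplification of whatever IMM uses.

```cpp
#include <map>
#include <string>

struct Continuation { unsigned mConn; };

// Hypothetical helper: builds the trace text only from a valid iterator.
// Dereferencing map.end() is undefined behaviour, so the lookup result is
// checked before ->second is touched.
std::string describe_timeout(const std::map<int, Continuation>& m, int key) {
    auto it = m.find(key);
    if (it == m.end()) {
        return "no continuation for key " + std::to_string(key);
    }
    return "timeout, conn:" + std::to_string(it->second.mConn);
}
```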




[tickets] [opensaf:tickets] #1087 imm: assign instead of compare in immnd_evt_proc_rt_object_delete

2014-09-15 Thread Zoran Milinkovic



---

** [tickets:#1087] imm: assign instead of compare in 
immnd_evt_proc_rt_object_delete**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 02:25 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 02:25 PM UTC
**Owner:** nobody

In the IF statement in immnd_evt_proc_rt_object_delete, a value is assigned 
instead of compared. 
It should be (err == SA_AIS_OK).

if(spApplConn && (err = SA_AIS_OK) && !delayedReply) {
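
The effect of the assignment can be shown in isolation. The constants below are illustrative integer codes (the only property that matters is that the "ok" code is nonzero), and both function names are hypothetical.

```cpp
constexpr int kOk = 1;       // illustrative "success" code (nonzero)
constexpr int kFailed = 21;  // illustrative error code

// Assignment form: overwrites err with kOk and evaluates to kOk (nonzero),
// so the condition no longer depends on the incoming error.
bool should_reply_buggy(bool conn, int err, bool delayed) {
    return conn && (err = kOk) && !delayed;
}

// Comparison form, as proposed in the ticket.
bool should_reply_fixed(bool conn, int err, bool delayed) {
    return conn && (err == kOk) && !delayed;
}
```

With an incoming error, the assignment form still takes the branch and also discards the original error value, while the comparison form does not.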






[tickets] [opensaf:tickets] #1079 longDn : searchInitialize is not default to 100 handles

2014-09-15 Thread Anders Bjornerstedt
This is a strange ticket.

The default is 100.
The client can OVERRIDE the default by defining the environment variable.

Here in this test the value assigned is (-1) which makes no sense.
The value is naturally expected to be an unsigned int (positive integer).

Since the value is an unsigned int you will instead get the maxint
positive value in this test.
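
The wrap-around described here can be reproduced with the standard C library. How the IMMA library actually parses the variable is not shown in this thread, so the sketch below is an assumption: it uses `strtoul` (which returns `unsigned long`, not `unsigned int`), and both function names are hypothetical.

```cpp
#include <climits>
#include <cstdlib>

// Naive parse: strtoul accepts a leading minus sign and negates the result
// in the unsigned return type, so "-1" becomes ULONG_MAX.
unsigned long parse_limit_naive(const char* text) {
    return std::strtoul(text, nullptr, 10);
}

// Stricter variant (an assumption, not the library's behaviour): fall back
// to the default when the text is missing, empty, starts with '-', or has
// trailing characters after the number.
unsigned long parse_limit_strict(const char* text, unsigned long fallback) {
    if (text == nullptr || *text == '\0' || *text == '-') return fallback;
    char* end = nullptr;
    unsigned long v = std::strtoul(text, &end, 10);
    return (*end == '\0') ? v : fallback;
}
```

With the naive parse, a limit of `-1` is effectively "no limit", which is consistent with the 101st search request succeeding in the reported test.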




---

** [tickets:#1079] longDn : searchInitialize is not default to 100 handles**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:00 AM UTC by surender khetavath
**Last Updated:** Mon Sep 15, 2014 11:00 AM UTC
**Owner:** nobody

changeset : 5697

Test:
export IMMA_MAX_OPEN_SEARCHES_PER_HANDLE=-1
OmInit()
searchInit() in loop till 100. Expectation is that the 101st search request must 
result in ERR_NO_RESOURCES. But the return value is SA_AIS_OK.

statement from README
A default maximum of 100 concurrently open search handles per om-handle
are allowed.




[tickets] [opensaf:tickets] #1051 ais_name_borrow/lend ruins trace of services

2014-09-15 Thread Anders Widell
- **status**: accepted --> review



---

** [tickets:#1051] ais_name_borrow/lend ruins trace of services**

**Status:** review
**Milestone:** 4.5.0
**Created:** Tue Sep 09, 2014 07:46 AM UTC by Hans Feldt
**Last Updated:** Mon Sep 15, 2014 01:16 PM UTC
**Owner:** Anders Widell

Utility functions like this that are called a lot cannot have any TRACE macros 
in them. The TRACE macros need to be removed.




Re: [tickets] [opensaf:tickets] #1083 imm: runtime attributes are trimmed away for PBE

2014-09-15 Thread Anders Björnerstedt
I don't think they are trimmed from the PBE.
They are trimmed from the special-applier.

/AnersBj


From: Zoran Milinkovic [mailto:zmilinko...@users.sf.net]
Sent: den 15 september 2014 15:30
To: opensaf-tickets@lists.sourceforge.net
Subject: [tickets] [opensaf:tickets] #1083 imm: runtime attributes are trimmed 
away for PBE



[tickets:#1083] (http://sourceforge.net/p/opensaf/tickets/1083) imm: runtime 
attributes are trimmed away for PBE

Status: unassigned
Milestone: 4.3.3
Created: Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
Last Updated: Mon Sep 15, 2014 01:29 PM UTC
Owner: nobody

Runtime attributes are trimmed away for PBE.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) &&
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {


Sign ~ should be replaced with !.





[tickets] [opensaf:tickets] Re: #1083 imm: runtime attributes are trimmed away for PBE

2014-09-15 Thread Anders Bjornerstedt
I don't think they are trimmed from the PBE.
They are trimmed from the special-applier.

/AnersBj


From: Zoran Milinkovic [mailto:zmilinko...@users.sf.net]
Sent: den 15 september 2014 15:30
To: opensaf-tickets@lists.sourceforge.net
Subject: [tickets] [opensaf:tickets] #1083 imm: runtime attributes are trimmed 
away for PBE



[tickets:#1083] (http://sourceforge.net/p/opensaf/tickets/1083) imm: runtime 
attributes are trimmed away for PBE

Status: unassigned
Milestone: 4.3.3
Created: Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
Last Updated: Mon Sep 15, 2014 01:29 PM UTC
Owner: nobody

Runtime attributes are trimmed away for PBE.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) &&
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {


Sign ~ should be replaced with !.






---

** [tickets:#1083] imm: runtime attributes are trimmed away for PBE**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 01:29 PM UTC
**Owner:** nobody

Runtime attributes are trimmed away for PBE.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

Sign ~ should be replaced with !.




[tickets] [opensaf:tickets] #1083 imm: runtime attributes are trimmed away for Special-applier

2014-09-15 Thread Anders Bjornerstedt
- **summary**: imm: runtime attributes are trimmed away for PBE --> imm: 
runtime attributes are trimmed away for Special-applier



---

** [tickets:#1083] imm: runtime attributes are trimmed away for 
Special-applier**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 01:29 PM UTC
**Owner:** nobody

Runtime attributes are trimmed away for PBE.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

Sign ~ should be replaced with !.




[tickets] [opensaf:tickets] #1083 imm: persistent runtime attributes are trimmed away for Special-applier

2014-09-15 Thread Anders Bjornerstedt
- **summary**: imm: runtime attributes are trimmed away for Special-applier --> 
imm: persistent runtime attributes are trimmed away for Special-applier



---

** [tickets:#1083] imm: persistent runtime attributes are trimmed away for 
Special-applier**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 03:01 PM UTC
**Owner:** nobody

Runtime attributes are trimmed away for PBE.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

Sign ~ should be replaced with !.




[tickets] [opensaf:tickets] #1083 imm: ccbCreate - default values for PRTA's do not reach PBE or special applier

2014-09-15 Thread Anders Bjornerstedt
- **summary**: imm: persistent runtime attributes are trimmed away for 
Special-applier --> imm: ccbCreate - default values for PRTA's do not reach PBE 
or special applier
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,11 @@
-Runtime attributes are trimmed away for PBE.
+Runtime attributes (persistent or not) that are
+
+  - part of a config object.
+  - have a default value.
+  - have not been explicitly assigned a value
+
+Will not get the default value written to PBE or included in special applier
+callback.
 
 The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.
 






---

** [tickets:#1083] imm: ccbCreate - default values for PRTA's do not reach PBE 
or special applier**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 03:10 PM UTC
**Owner:** nobody

Runtime attributes (persistent or not) that are

  - part of a config object.
  - have a default value.
  - have not been explicitly assigned a value

Will not get the default value written to PBE or included in special applier
callback.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

Sign ~ should be replaced with !.




[tickets] [opensaf:tickets] #1083 imm: ccbCreate - default values for PRTA's do not reach PBE or special applier

2014-09-15 Thread Anders Bjornerstedt
- **status**: unassigned --> assigned
- **assigned_to**: Zoran Milinkovic



---

** [tickets:#1083] imm: ccbCreate - default values for PRTA's do not reach PBE 
or special applier**

**Status:** assigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 01:29 PM UTC by Zoran Milinkovic
**Last Updated:** Mon Sep 15, 2014 03:28 PM UTC
**Owner:** Zoran Milinkovic

Runtime attributes (persistent or not) that are

  - part of a config object.
  - have a default value.
  - have not been explicitly assigned a value

Will not get the default value written to PBE or included in special applier
callback.

The problem is in the calculation in IF statement, which is always true if 
attr-flags contains SA_IMM_ATTR_RUNTIME flag.

if((attr->mFlags & SA_IMM_ATTR_RUNTIME) && 
  ~(attr->mFlags & SA_IMM_ATTR_PERSISTENT) &&
  ~(attr->mFlags & SA_IMM_ATTR_NOTIFY)) {

Sign ~ should be replaced with !.




[tickets] [opensaf:tickets] #1079 longDn : searchInitialize is not default to 100 handles

2014-09-15 Thread Anders Bjornerstedt
- **status**: unassigned --> invalid
- **Comment**:

Setting this ticket to invalid.
If I have misunderstood the problem then you may re-open and provide
a clearer problem description.



---

** [tickets:#1079] longDn : searchInitialize is not default to 100 handles**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 11:00 AM UTC by surender khetavath
**Last Updated:** Mon Sep 15, 2014 02:38 PM UTC
**Owner:** nobody

changeset : 5697

Test:
export IMMA_MAX_OPEN_SEARCHES_PER_HANDLE=-1
OmInit()
searchInit() in loop till 100. Expectation is that the 101st search request must 
result in ERR_NO_RESOURCES. But the return value is SA_AIS_OK.

statement from README
A default maximum of 100 concurrently open search handles per om-handle
are allowed.




[tickets] [opensaf:tickets] #1091 2PBE: class create takes longer time

2014-09-15 Thread Sirisha Alla



---

** [tickets:#1091] 2PBE: class create takes longer time**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 03:45 PM UTC
**Owner:** nobody

The issue is seen on SLES X86 VMs running with opensaf changeset 5697+#946 
patch. The IMM DB is loaded with 50k objects.

Class Create returns TIMEOUT, and a further call very frequently returns 
ERR_BUSY. 

Agent traces:

Sep 15 18:16:23.961232 imma [8200:imma_om_api.c:4303] >> saImmOmClassCreate_2
Sep 15 18:16:23.961245 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:29.441017 imma [8200:imma_om_api.c:4640] << saImmOmClassCreate_2

Total time taken is more than IMMA_SYNCR_TIMEOUT, which defaults to 10 
seconds. The application does not have any customized SYNCR_TIMEOUT.

Further call to Class Create returns ERR_BUSY.

Sep 15 18:16:29.568306 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:29.568394 imma [8200:ntfa_mds.c:0369] TR NTFS down
Sep 15 18:16:29.942518 imma [8200:imma_om_api.c:4303] >> saImmOmClassCreate_2
Sep 15 18:16:29.942608 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:30.495659 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:30.495742 imma [8200:ntfa_mds.c:0382] T2 MSG from NTFS 
NCSMDS_NEW_ACTIVE/UP
Sep 15 18:16:30.524268 imma [8200:imma_om_api.c:4594] TR Return code:10
Sep 15 18:16:30.524315 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2
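
The numeric return codes in the agent traces are SaAisErrorT values. A minimal sketch for decoding them, assuming the standard SA Forum AIS numbering; only the codes seen in this ticket are listed, and the helper name `decode_return_codes` is hypothetical, not part of OpenSAF:

```python
import re

# Subset of SaAisErrorT (SA Forum AIS); only codes that occur in this ticket.
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    10: "SA_AIS_ERR_BUSY",
}

RETURN_CODE_RE = re.compile(r"Return code:(\d+)")


def decode_return_codes(trace):
    """Return the symbolic name for every 'Return code:N' found in a trace."""
    names = []
    for match in RETURN_CODE_RE.finditer(trace):
        code = int(match.group(1))
        names.append(SA_AIS_ERRORS.get(code, "unknown ({})".format(code)))
    return names


# The two return lines quoted in the agent traces above.
trace = """\
Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:30.524268 imma [8200:imma_om_api.c:4594] TR Return code:10
"""
print(decode_return_codes(trace))
```

This matches the description: the first call ends in TIMEOUT and the retry in ERR_BUSY.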

Syslog on SC-2:

Sep 15 18:16:33 SLES-64BIT-SLOT2 osafamfnd[2467]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmd[2356]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2010f)
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafntfimcnd[2427]: NO exiting on signal 15
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO ERR_BUSY: Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 received while class with same 
name is already being mutated
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 124 
(safMsgGrpService) 312, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 125 
(safLogService) 3, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 126 
(safCheckPointService) 308, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafsmfd[2597]: NO Backup create cmd = 
/usr/lib64/opensaf/smf-backup-create

Syslog on SC-1:

Sep 15 18:16:46 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjApplRejModifyCallback_101 committing with ccbId:1003a
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjApplRejModifyCallback_101 is PERSISTENT.
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: WA Primary PBE failed to create 
class towards slave PBE. Library or immsv replied Rc:6 - ignoring
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 committing with ccbId:1003b
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:1003c/4294967356
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 is PERSISTENT.
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA Start prepare for ccb: 
1003c/4294967356 towards slave PBE returned: '6' from Immsv
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA 
update Ccb:1003c/4294967356 towards PBE-B
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO 2PBE Error (20) in PRTA update 
(ccbId:1003c)
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmnd[7804]: WA update of PERSISTENT 
runtime attributes in object 'safNode=PL-3,safCluster=myClmCluster' REVERTED. 
PBE rc:20
Sep 15 18:16:50 SLES-64BIT-SLOT1 

[tickets] [opensaf:tickets] #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Adrian Szwej
I had 1 controller and 4 payloads up and running.
Normally "Messages pending" stays at 2 and sometimes goes up to 3 or 4.
I cycled the 5th payload down and up around 10-15 times:
*while ( true ); do /etc/init.d/opensafd stop && /etc/init.d/opensafd start; 
done*

*tail -f /var/log/opensaf/osafimmnd | grep "Messages pending:"*
Sep 15 21:12:50.691919 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:2
Sep 15 21:12:50.724038 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:2
Sep 15 21:12:50.957123 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:2
Sep 15 21:12:50.961528 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:3
Sep 15 21:12:51.215563 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:2
Sep 15 21:12:52.785945 osafimmnd [368:immnd_evt.c:2674] TR Messages 
pending:2
Sep 15 21:12:52.799428 osafimmnd [368:immnd_evt.c:2674] TR Messages 
pending:2
Sep 15 21:12:57.923195 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:2
Sep 15 21:12:58.355613 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:3
Sep 15 21:12:58.369637 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:5
Sep 15 21:12:58.372522 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:6
Sep 15 21:12:58.394801 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:8
Sep 15 21:12:58.458708 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:10
Sep 15 21:12:58.470905 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:12
Sep 15 21:12:58.480655 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:14
Sep 15 21:12:58.484411 osafimmnd [368:immnd_evt.c:0960] TR Messages 
pending:16

Once this happens, terminating the 5th payload does not help.
Some minutes later a cluster reset is triggered:
osafimmnd [738:immnd_mds.c:0573] TR Resetting fevs replies pending to zero.
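
To spot the point where the pending count saturates, the trace can be scanned mechanically. A rough sketch, assuming the `Messages pending:N` wording shown above and the default limit of 16 quoted in the ticket description; the function names are hypothetical:

```python
import re

# Limit taken from the "( 16)" in the ERR_TRY_AGAIN trace of this ticket.
FEVS_MAX_PENDING = 16

# \s+ tolerates the mail-wrapped line break between "Messages" and "pending".
PENDING_RE = re.compile(r"Messages\s+pending:(\d+)")


def pending_counts(log_text):
    """Extract every 'Messages pending' value, in order of appearance."""
    return [int(n) for n in PENDING_RE.findall(log_text)]


def first_saturation(counts, limit=FEVS_MAX_PENDING):
    """Index of the first sample at or above the limit, or None."""
    for index, count in enumerate(counts):
        if count >= limit:
            return index
    return None


sample = """\
Sep 15 21:12:58.355613 osafimmnd [368:immnd_evt.c:0960] TR Messages
pending:3
Sep 15 21:12:58.480655 osafimmnd [368:immnd_evt.c:0960] TR Messages
pending:14
Sep 15 21:12:58.484411 osafimmnd [368:immnd_evt.c:0960] TR Messages
pending:16
"""
counts = pending_counts(sample)
print(counts, first_saturation(counts))
```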



---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Mon Sep 15, 2014 07:45 AM UTC
**Owner:** Anders Bjornerstedt

Communication is MDS over TCP. The cluster is 2+3; the scenario is: 
start SCs; start 1st payload; wait for sync; start 2nd payload; wait for sync; 
start 3rd payload. The third one fails, or sometimes it is the fourth.

There is a problem getting more than 2 or 3 payloads synchronized; the bug can 
be triggered consistently.

The following is triggered in the loading immnd, causing the joining node to 
time out or fail to start up.

Sep  6  6:58:02.096550 osafimmnd [502:immsv_evt.c:5382] T8 Received: 
IMMND_EVT_A2ND_SEARCHNEXT (17) from 2020f
Sep  6  6:58:02.096575 osafimmnd [502:immnd_evt.c:1443]  
immnd_evt_proc_search_next
Sep  6  6:58:02.096613 osafimmnd [502:immnd_evt.c:1454] T2 SEARCH NEXT, Look 
for id:1664
Sep  6  6:58:02.096641 osafimmnd [502:ImmModel.cc:1366] T2 ERR_TRY_AGAIN: Too 
many pending incoming fevs messages ( 16) rejecting sync iteration next request
Sep  6  6:58:02.096725 osafimmnd [502:immnd_evt.c:1676]  
immnd_evt_proc_search_next
Sep  6  6:58:03.133230 osafimmnd [502:immnd_proc.c:1980] IN Sync Phase-3: 
step:540

I have managed to work around this bug temporarily with the following patch:

+++ b/osaf/libs/common/immsv/include/immsv_api.h Sat Sep 06 08:38:16 2014 +
@@ -70,7 +70,7 @@

 /*Max # of outstanding fevs messages towards director.*/
 /*Note max-max is 255. cb->fevs_replies_pending is an uint8_t*/
-#define IMMSV_DEFAULT_FEVS_MAX_PENDING 16
+#define IMMSV_DEFAULT_FEVS_MAX_PENDING 255

 #define IMMSV_MAX_OBJECTS 1
 #define IMMSV_MAX_ATTRIBUTES 128
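
The comment in the patch explains why 255 is the ceiling: the counter is a uint8_t. A small sketch of what would happen beyond that, using Python's ctypes to mimic C unsigned 8-bit wraparound; `store_as_uint8` is a hypothetical helper for illustration only:

```python
import ctypes

UINT8_MAX = 2**8 - 1  # 255, the largest value a uint8_t can hold


def store_as_uint8(value):
    """Mimic assigning an int to a C uint8_t: the value wraps modulo 256."""
    return ctypes.c_uint8(value).value


for limit in (16, 255, 256, 300):
    stored = store_as_uint8(limit)
    note = "ok" if stored == limit else "wraps"
    print(limit, "->", stored, note)
```

A limit above 255 could not be represented by the counter, so raising it further would also require widening the field.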



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.


[tickets] [opensaf:tickets] #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread Adrian Szwej
I have also tried the following variations:
**Larger MDS buffers**

export MDS_SOCK_SND_RCV_BUF_SIZE=126976 
DTM_SOCK_SND_RCV_BUF_SIZE=126976

**Longer keep alive settings**

**OpenSAF build 4.5**

**MTU 9000**
veth4e51  Link encap:Ethernet  HWaddr aa:a6:f0:5f:0f:82  
  UP BROADCAST RUNNING  MTU:9000  Metric:1
--
veth76a4  Link encap:Ethernet  HWaddr 9a:ea:07:f4:be:55  
  UP BROADCAST RUNNING  MTU:9000  Metric:1
--
vethb5f5  Link encap:Ethernet  HWaddr 22:98:e3:39:32:34  
  UP BROADCAST RUNNING  MTU:9000  Metric:1
--
vethb9e3  Link encap:Ethernet  HWaddr d2:ec:18:c4:f9:2d  
  UP BROADCAST RUNNING  MTU:9000  Metric:1
--
vethd703  Link encap:Ethernet  HWaddr 3e:a0:49:c0:f0:73  
  UP BROADCAST RUNNING  MTU:9000  Metric:1
--
vethf736  Link encap:Ethernet  HWaddr 4e:c4:6e:74:fc:03  
  UP BROADCAST RUNNING  MTU:9000  Metric:1

Ping between containers during sync shows a latency of 0.250-0.500 ms.

The result is the same.
I can provoke the problem by cycling start/stop of the 6th OpenSAF instance in 
a Linux container:

while ( true ); do /etc/init.d/opensafd stop && /etc/init.d/opensafd start; 
done


---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Mon Sep 15, 2014 09:48 PM UTC
**Owner:** Anders Bjornerstedt



[tickets] [opensaf:tickets] Re: #1072 Sync stop after few payload nodes joining the cluster (TCP)

2014-09-15 Thread A V Mahesh (AVM)
Some time back, I brought up 30 nodes with TCP transport without any issue. At 
that time, in addition to increasing the MDS buffers 
(MDS_SOCK_SND_RCV_BUF_SIZE and DTM_SOCK_SND_RCV_BUF_SIZE), I also 
increased wmem_max and rmem_max. You could give that a try as well.

sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_max=33554432
sysctl -w net.ipv4.tcp_rmem="4096 87380 33554432"
sysctl -w net.ipv4.tcp_wmem="4096 87380 33554432"
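
For reference, net.ipv4.tcp_rmem and net.ipv4.tcp_wmem each take three values in bytes: minimum, default, and maximum. A minimal sketch that parses such a triple and reports the maximum in MiB; the `TcpMem` type and `parse_tcp_mem` helper are hypothetical names used only for illustration:

```python
from typing import NamedTuple


class TcpMem(NamedTuple):
    """The three fields of a tcp_rmem / tcp_wmem setting, in bytes."""
    min_bytes: int
    default_bytes: int
    max_bytes: int


def parse_tcp_mem(value):
    """Parse a 'min default max' string as used by the sysctl values above."""
    fields = value.split()
    if len(fields) != 3:
        raise ValueError("expected three fields, got {}".format(len(fields)))
    return TcpMem(*(int(field) for field in fields))


setting = parse_tcp_mem("4096 87380 33554432")
print(setting.max_bytes // (1024 * 1024), "MiB maximum")
```

With the values suggested above, the per-socket maximum is 32 MiB, the same figure used for net.core.rmem_max and net.core.wmem_max.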


---

** [tickets:#1072] Sync stop after few payload nodes joining the cluster (TCP)**

**Status:** invalid
**Milestone:** 4.3.3
**Created:** Fri Sep 12, 2014 09:20 PM UTC by Adrian Szwej
**Last Updated:** Mon Sep 15, 2014 10:46 PM UTC
**Owner:** Anders Bjornerstedt



[tickets] [opensaf:tickets] #1091 2PBE: class create timesout before default SYNCR_TIMEOUT

2014-09-15 Thread Sirisha Alla
- **summary**: 2PBE: class create takes longer time --> 2PBE: class create 
timesout before default SYNCR_TIMEOUT
- Description has changed:

Diff:



--- old
+++ new
@@ -9,7 +9,7 @@
 Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
 Sep 15 18:16:29.441017 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2
 
-Total time taken is more than IMMA_SYNCR_TIMEOUT which is defaulted to 10 
seconds. Application does not have any SYNCR_TIMEOUT customized.
+Total time taken is not even IMMA_SYNCR_TIMEOUT which is defaulted to 10 
seconds. Application does not have any SYNCR_TIMEOUT customized.
 
 Further call to Class Create returns ERR_BUSY.
 






---

** [tickets:#1091] 2PBE: class create timesout before default SYNCR_TIMEOUT**

**Status:** unassigned
**Milestone:** 4.3.3
**Created:** Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla
**Last Updated:** Mon Sep 15, 2014 03:45 PM UTC
**Owner:** nobody

The issue is seen on SLES x86 VMs running OpenSAF changeset 5697 plus the #946 
patch. The IMM DB is loaded with 50k objects.

Class create returns TIMEOUT and, on subsequent calls, very frequently returns 
ERR_BUSY.

Agent traces:

Sep 15 18:16:23.961232 imma [8200:imma_om_api.c:4303]  saImmOmClassCreate_2
Sep 15 18:16:23.961245 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:29.441017 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2

Total time taken is not even IMMA_SYNCR_TIMEOUT which is defaulted to 10 
seconds. Application does not have any SYNCR_TIMEOUT customized.
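
The elapsed time can be checked directly from the two agent trace timestamps above. A minimal sketch, assuming the `Mon DD HH:MM:SS.ffffff` prefix seen in these traces and the year 2014 from the thread date (the traces themselves carry no year); `elapsed_seconds` is a hypothetical helper:

```python
from datetime import datetime

# Layout assumed from the imma trace lines quoted in this ticket.
# The year is prepended because the trace prefix does not include one.
TRACE_TS_FORMAT = "%Y %b %d %H:%M:%S.%f"
ASSUMED_YEAR = 2014

# Default IMMA_SYNCR_TIMEOUT as stated in the ticket, in seconds.
DEFAULT_SYNCR_TIMEOUT_S = 10


def elapsed_seconds(start, end, year=ASSUMED_YEAR):
    """Seconds between two trace timestamps taken in the same year."""
    t0 = datetime.strptime("{} {}".format(year, start), TRACE_TS_FORMAT)
    t1 = datetime.strptime("{} {}".format(year, end), TRACE_TS_FORMAT)
    return (t1 - t0).total_seconds()


# Entry into saImmOmClassCreate_2 and the "Return code:5" line.
took = elapsed_seconds("Sep 15 18:16:23.961232", "Sep 15 18:16:29.440968")
print("{:.6f} s, below default timeout: {}".format(
    took, took < DEFAULT_SYNCR_TIMEOUT_S))
```

The call returned after roughly 5.48 seconds, well short of the 10-second default.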

Further call to Class Create returns ERR_BUSY.

Sep 15 18:16:29.568306 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:29.568394 imma [8200:ntfa_mds.c:0369] TR NTFS down
Sep 15 18:16:29.942518 imma [8200:imma_om_api.c:4303]  saImmOmClassCreate_2
Sep 15 18:16:29.942608 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:30.495659 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:30.495742 imma [8200:ntfa_mds.c:0382] T2 MSG from NTFS 
NCSMDS_NEW_ACTIVE/UP
Sep 15 18:16:30.524268 imma [8200:imma_om_api.c:4594] TR Return code:10
Sep 15 18:16:30.524315 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2

Syslog on SC-2:

Sep 15 18:16:33 SLES-64BIT-SLOT2 osafamfnd[2467]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmd[2356]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2010f)
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafntfimcnd[2427]: NO exiting on signal 15
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO ERR_BUSY: Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 received while class with same 
name is already being mutated
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 124 
(safMsgGrpService) 312, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 125 
(safLogService) 3, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 126 
(safCheckPointService) 308, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafsmfd[2597]: NO Backup create cmd = 
/usr/lib64/opensaf/smf-backup-create

Syslog on SC-1:

Sep 15 18:16:46 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjApplRejModifyCallback_101 committing with ccbId:1003a
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjApplRejModifyCallback_101 is PERSISTENT.
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: WA Primary PBE failed to create 
class towards slave PBE. Library or immsv replied Rc:6 - ignoring
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 committing with ccbId:1003b
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:1003c/4294967356
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 is PERSISTENT.
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 

[tickets] [opensaf:tickets] #1091 2PBE: class create timesout before default SYNCR_TIMEOUT

2014-09-15 Thread Anders Bjornerstedt
- **Version**: 4.5 FC --> 4.4
- **Priority**: major --> minor
- **Milestone**: 4.3.3 --> 4.4.1
- **Comment**:

Reducing severity to minor.

This is a pure performance issue related to code that was added in 4.4
and has since then not changed in any significant way.




---

** [tickets:#1091] 2PBE: class create timesout before default SYNCR_TIMEOUT**

**Status:** unassigned
**Milestone:** 4.4.1
**Created:** Mon Sep 15, 2014 03:45 PM UTC by Sirisha Alla
**Last Updated:** Tue Sep 16, 2014 04:44 AM UTC
**Owner:** nobody

The issue is seen on SLES x86 VMs running OpenSAF changeset 5697 plus the #946 
patch. The IMM DB is loaded with 50k objects.

Class create returns TIMEOUT and, on subsequent calls, very frequently returns 
ERR_BUSY.

Agent traces:

Sep 15 18:16:23.961232 imma [8200:imma_om_api.c:4303]  saImmOmClassCreate_2
Sep 15 18:16:23.961245 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:29.440968 imma [8200:imma_om_api.c:4594] TR Return code:5
Sep 15 18:16:29.441017 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2

Total time taken is not even IMMA_SYNCR_TIMEOUT which is defaulted to 10 
seconds. Application does not have any SYNCR_TIMEOUT customized.

Further call to Class Create returns ERR_BUSY.

Sep 15 18:16:29.568306 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:29.568394 imma [8200:ntfa_mds.c:0369] TR NTFS down
Sep 15 18:16:29.942518 imma [8200:imma_om_api.c:4303]  saImmOmClassCreate_2
Sep 15 18:16:29.942608 imma [8200:imma_om_api.c:4434] TR name: 
testMA_verifyObjPrimNoResponseModCallback_101 category:1
Sep 15 18:16:30.495659 imma [8200:ntfa_mds.c:0359] T2 NTFA Rcvd MDS subscribe 
evt from svc 28
Sep 15 18:16:30.495742 imma [8200:ntfa_mds.c:0382] T2 MSG from NTFS 
NCSMDS_NEW_ACTIVE/UP
Sep 15 18:16:30.524268 imma [8200:imma_om_api.c:4594] TR Return code:10
Sep 15 18:16:30.524315 imma [8200:imma_om_api.c:4640]  saImmOmClassCreate_2

Syslog on SC-2:

Sep 15 18:16:33 SLES-64BIT-SLOT2 osafamfnd[2467]: NO Assigning 
'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-2,safSg=2N,safApp=OpenSAF'
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmd[2356]: WA IMMD not re-electing coord 
for switch-over (si-swap) coord at (2010f)
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafntfimcnd[2427]: NO exiting on signal 15
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO ERR_BUSY: Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 received while class with same 
name is already being mutated
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 124 
(safMsgGrpService) 312, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 125 
(safLogService) 3, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafimmnd[2393]: NO Implementer connected: 126 
(safCheckPointService) 308, 2020f
Sep 15 18:16:33 SLES-64BIT-SLOT2 osafsmfd[2597]: NO Backup create cmd = 
/usr/lib64/opensaf/smf-backup-create

Syslog on SC-1:

Sep 15 18:16:46 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjApplRejModifyCallback_101 committing with ccbId:1003a
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjApplRejModifyCallback_101 is PERSISTENT.
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: WA Primary PBE failed to create 
class towards slave PBE. Library or immsv replied Rc:6 - ignoring
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 committing with ccbId:1003b
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmpbed: IN Starting distributed PBE 
commit for PRTA update Ccb:1003c/4294967356
Sep 15 18:16:47 SLES-64BIT-SLOT1 osafimmnd[7804]: NO Create of class 
testMA_verifyObjPrimNoResponseModCallback_101 is PERSISTENT.
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:48 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:49 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: NO Slave PBE 6 or Immsv 
(4294901760) replied with transient error on prepare for 
ccb:1003c/4294967356
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA Start prepare for ccb: 
1003c/4294967356 towards slave PBE returned: '6' from Immsv
Sep 15 18:16:50 SLES-64BIT-SLOT1 osafimmpbed: WA PBE-A failed to prepare PRTA 
update Ccb:1003c/4294967356 towards PBE-B
Sep 15 18:16:50