[tickets] [opensaf:tickets] #2205 imm: IMMND crashes when receiving D2ND_ABORT_CCB

2016-11-23 Thread Hung Nguyen
backtrace


~~~
Reading symbols from /usr/lib64/opensaf/osafimmnd...done.
[New LWP 5608]
[New LWP 5612]
[New LWP 5610]
[New LWP 5611]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafimmnd --tracemask=0x'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7efe3c1690c7 in raise () from /lib64/libc.so.6
### BT ###
#0  0x7efe3c1690c7 in raise () from /lib64/libc.so.6
#1  0x7efe3c16a478 in abort () from /lib64/libc.so.6
#2  0x7efe3d5e129e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../../../../../opensaf/osaf/libs/core/leap/sysf_def.c:281
#3  0x00430a14 in ImmModel::ccbAbort (this=0x119c840, 
ccbId=ccbId@entry=79, connVector=..., clVector=..., 
nodeId=nodeId@entry=0x7ffcaaaf5314, 
pbeNodeIdPtr=pbeNodeIdPtr@entry=0x7ffcaaaf5318) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/ImmModel.cc:6169
#4  0x00430dd2 in immModel_ccbAbort (cb=cb@entry=0x69e5a0 <_immnd_cb>, 
ccbId=ccbId@entry=79, arrSize=arrSize@entry=0x7ffcaaaf530c, 
implConnArr=implConnArr@entry=0x7ffcaaaf5320, 
clientArr=clientArr@entry=0x7ffcaaaf5328, 
clArrSize=clArrSize@entry=0x7ffcaaaf5310, nodeId=nodeId@entry=0x7ffcaaaf5314, 
pbeNodeId=0x7ffcaaaf5318) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/ImmModel.cc:1353
#5  0x00404276 in immnd_evt_ccb_abort (cb=cb@entry=0x69e5a0 
<_immnd_cb>, ccbId=79, clientArr=clientArr@entry=0x7ffcaaaf5548, 
clArrsize=clArrsize@entry=0x7ffcaaaf5538, nodeId=nodeId@entry=0x7ffcaaaf553c) 
at ../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_evt.c:6931
#6  0x00407ff8 in immnd_evt_proc_ccb_finalize (cb=cb@entry=0x69e5a0 
<_immnd_cb>, evt=evt@entry=0x7ffcaaaf57e0, 
originatedAtThisNd=originatedAtThisNd@entry=SA_FALSE, 
clnt_hdl=clnt_hdl@entry=0, reply_dest=0) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_evt.c:7687
#7  0x00416afd in immnd_evt_proc_fevs_dispatch (cb=cb@entry=0x69e5a0 
<_immnd_cb>, msg=msg@entry=0x7efe34003568, 
originatedAtThisNd=originatedAtThisNd@entry=SA_FALSE, 
clnt_hdl=clnt_hdl@entry=0, reply_dest=reply_dest@entry=0, 
msgNo=msgNo@entry=5671) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_evt.c:8318
#8  0x00419b86 in immnd_evt_proc_fevs_rcv (sinfo=0x7efe34003680, 
evt=0x7efe34003540, cb=0x69e5a0 <_immnd_cb>) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_evt.c:9145
#9  immnd_process_evt () at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_evt.c:684
#10 0x0040bc36 in main (argc=, argv=) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/immnd_main.c:370
### BT FULL ###
#0  0x7efe3c1690c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7efe3c16a478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7efe3d5e129e in __osafassert_fail (__file=, 
__line=, __func=, __assertion=) at 
../../../../../../opensaf/osaf/libs/core/leap/sysf_def.c:281
No locals.
#3  0x00430a14 in ImmModel::ccbAbort (this=0x119c840, 
ccbId=ccbId@entry=79, connVector=..., clVector=..., 
nodeId=nodeId@entry=0x7ffcaaaf5314, 
pbeNodeIdPtr=pbeNodeIdPtr@entry=0x7ffcaaaf5318) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/ImmModel.cc:6169
pbeConn = 0
__FUNCTION__ = "ccbAbort"
i = 
isi = 
ccb = 0x1405350
#4  0x00430dd2 in immModel_ccbAbort (cb=cb@entry=0x69e5a0 <_immnd_cb>, 
ccbId=ccbId@entry=79, arrSize=arrSize@entry=0x7ffcaaaf530c, 
implConnArr=implConnArr@entry=0x7ffcaaaf5320, 
clientArr=clientArr@entry=0x7ffcaaaf5328, 
clArrSize=clArrSize@entry=0x7ffcaaaf5310, nodeId=nodeId@entry=0x7ffcaaaf5314, 
pbeNodeId=0x7ffcaaaf5318) at 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/ImmModel.cc:1353
aborted = 
__FUNCTION__ = "immModel_ccbAbort"
cv = {
  > = {
_M_impl = {
   = {
<__gnu_cxx::new_allocator> = {}, 
}, 
  members of std::_Vector_base >::_Vector_impl: 
  _M_start = 0x0, 
  _M_finish = 0x0, 
~~~

[tickets] [opensaf:tickets] #2205 imm: IMMND crashes when receiving D2ND_ABORT_CCB

2016-11-23 Thread Hung Nguyen



---

** [tickets:#2205] imm: IMMND crashes when receiving D2ND_ABORT_CCB**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Thu Nov 24, 2016 07:23 AM UTC by Hung Nguyen
**Last Updated:** Thu Nov 24, 2016 07:23 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- 
[osafNode.immnd.bz2](https://sourceforge.net/p/opensaf/tickets/2205/attachment/osafNode.immnd.bz2)
 (18.9 MB; application/octet-stream)


~~~
Nov 16 10:06:17 SC-2-1 osafimmnd[5608]: 
../../../../../../../opensaf/osaf/services/saf/immsv/immnd/ImmModel.cc:6169: 
ccbAbort: Assertion '*nodeId == ccb->mAugCcbParent->mOriginatingNode' failed.
~~~

~~~
Nov 16 10:06:17.260296 osafimmnd [5608:immsv_evt.c:5473] T8 Received: 
IMMND_EVT_A2ND_OI_CCB_AUG_INIT (91) from 0
Nov 16 10:06:17.260303 osafimmnd [5608:immnd_evt.c:10304] >> 
immnd_evt_ccb_augment_init
Nov 16 10:06:17.260310 osafimmnd [5608:ImmModel.cc:6502] >> ccbAugmentInit
Nov 16 10:06:17.260323 osafimmnd [5608:ImmModel.cc:6555] TR Augment CCB in 
state MODIFY_OP
Nov 16 10:06:17.260329 osafimmnd [5608:ImmModel.cc:6592] TR 
omuti->second:0x14051f0
Nov 16 10:06:17.260359 osafimmnd [5608:ImmModel.cc:6593] TR 
omuti->second->mContinuationId:24 == rsp->inv:24
Nov 16 10:06:17.260366 osafimmnd [5608:ImmModel.cc:6600] TR obj:0x1405460
Nov 16 10:06:17.260371 osafimmnd [5608:ImmModel.cc:6658] << ccbAugmentInit

Nov 16 10:06:17.261479 osafimmnd [5608:immsv_evt.c:5473] T8 Received: 
IMMND_EVT_D2ND_ABORT_CCB (62) from 0
Nov 16 10:06:17.261486 osafimmnd [5608:immnd_evt.c:7684] >> 
immnd_evt_proc_ccb_finalize
Nov 16 10:06:17.261490 osafimmnd [5608:immnd_evt.c:6921] >> immnd_evt_ccb_abort
Nov 16 10:06:17.261495 osafimmnd [5608:immnd_evt.c:6925] TR We expect there to 
be a PBE
Nov 16 10:06:17.261501 osafimmnd [5608:ImmModel.cc:6079] >> ccbAbort
Nov 16 10:06:17.261506 osafimmnd [5608:ImmModel.cc:6088] T5 ABORT CCB 79
Nov 16 10:06:17.261539 osafimmnd [5608:ImmModel.cc:6151] NO Ccb 79 ABORTED 
(immcfg_SC-2-1_9735)
~~~


When IMMND received A2ND_OI_CCB_AUG_INIT, the CCB state was changed to CCB_READY.
Then, when the D2ND_ABORT_CCB message arrived, ImmModel::ccbAbort() did not
update \*nodeId, and the following assertion later failed:

~~~
osafassert(*nodeId == ccb->mAugCcbParent->mOriginatingNode);
~~~
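
The failure pattern described above can be illustrated with a small C sketch. It is not the ImmModel code: the `ccb_sketch` struct, the `SKETCH_CCB_*` states, and both functions are hypothetical, and the assumption that the abort path skips the update exactly when the CCB is in the READY state is an inference from this report, not something confirmed by the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the CCB data. */
enum ccb_state_sketch { SKETCH_CCB_MODIFY_OP, SKETCH_CCB_READY };

struct ccb_sketch {
    enum ccb_state_sketch state;
    uint32_t originating_node;
    const struct ccb_sketch *aug_parent; /* non-NULL while the CCB is augmented */
};

/* Assumed behaviour: the out-parameter is written on only one path,
 * so the caller's initial value survives when the state is READY. */
void abort_fill_node_id(const struct ccb_sketch *ccb, uint32_t *node_id)
{
    assert(ccb != NULL && node_id != NULL);
    if (ccb->state != SKETCH_CCB_READY)
        *node_id = ccb->originating_node;
}

/* Same shape as the failing comparison, but returns a bool instead of aborting. */
bool node_id_matches_parent(const struct ccb_sketch *ccb, uint32_t node_id)
{
    return ccb->aug_parent == NULL ||
           node_id == ccb->aug_parent->originating_node;
}
```

With a caller that initialises the node id to 0, an augmented CCB in the READY state leaves the value untouched, so the comparison against the parent's originating node fails. Under these assumptions, a fix would either fill in the out-parameter on every path or skip the check when it was not filled in.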

The IMMND traces are attached.



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2116 EDS faulted on new Active controller after being promoted from QUIESCED to ACTIVE

2016-11-23 Thread Praveen
- **Milestone**: 5.2.FC --> 5.0.2



---

** [tickets:#2116] EDS faulted on new Active controller after being promoted 
from QUIESCED to ACTIVE**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Thu Oct 13, 2016 09:49 AM UTC by Ritu Raj
**Last Updated:** Thu Oct 13, 2016 09:50 AM UTC
**Owner:** nobody
**Attachments:**

- 
[messages](https://sourceforge.net/p/opensaf/tickets/2116/attachment/messages) 
(2.9 MB; application/octet-stream)
- 
[osafevtd](https://sourceforge.net/p/opensaf/tickets/2116/attachment/osafevtd) 
(102.4 kB; application/octet-stream)


# Environment details
OS : Suse 64bit
Changeset : 8190 ( 5.1.GA)
Setup : 3 nodes ( 3 controllers with headless feature enabled & PBE disabled)

# Summary
EDS faulted on new Active controller after being promoted from QUIESCED to 
ACTIVE

# Steps followed & Observed behaviour
1. Started OpenSAF on 3 controllers with the HEADLESS feature enabled 
(SC-1 ACTIVE, SC-2 STANDBY, SC-3 QUIESCED)
2. Stopped OpenSAF on both controllers (ACTIVE and STANDBY) simultaneously
3. The QUIESCED controller became ACTIVE as clmna started to promote this node 
to a system controller

Oct 13 14:29:05 SCALE_SLOT-73 osafclmna[3434]: NO Starting to promote this node 
to a system controller
Oct 13 14:29:05 SCALE_SLOT-73 osafrded[3443]: NO Requesting ACTIVE role


Oct 13 14:29:10 SCALE_SLOT-73 osafimmd[3462]: IN AMF HA ACTIVE request
Oct 13 14:29:10 SCALE_SLOT-73 osaffmd[3452]: NO Stopped activation supervision 
due to new AMF state 1
Oct 13 14:29:10 SCALE_SLOT-73 osafamfd[3513]: NO Received node_up from 2030f: 
msg_id 1
Oct 13 14:29:10 SCALE_SLOT-73 osafamfd[3513]: NO Node 'SC-3' joined the cluster

4. After a few seconds, EDS faulted and the node went for a reboot

Oct 13 14:30:11 SCALE_SLOT-73 osafamfnd[3523]: NO 
'safComp=EDS,safSu=SC-3,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Oct 13 14:30:11 SCALE_SLOT-73 osafamfnd[3523]: ER 
safComp=EDS,safSu=SC-3,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Oct 13 14:30:11 SCALE_SLOT-73 osafamfnd[3523]: Rebooting OpenSAF NodeId = 
131855 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131855, SupervisionTime = 60


# Notes
1. Syslog attached
2. osafevtd trace attached



[tickets] [opensaf:tickets] #2134 AMF: Update RTA saAmfSISUHAState to IMM

2016-11-23 Thread Praveen
- **Milestone**: 5.2.FC --> never



---

** [tickets:#2134] AMF: Update RTA saAmfSISUHAState to IMM**

**Status:** wontfix
**Milestone:** never
**Created:** Thu Oct 20, 2016 07:58 PM UTC by Minh Hon Chau
**Last Updated:** Thu Nov 10, 2016 06:48 AM UTC
**Owner:** nobody


In a 2N SI-swap scenario, when AMFD sends a su_si assignment message (QUIESCED, 
for example) to AMFND that changes the HA state of a SUSI assignment, AMFD 
updates its local state AVD_SU_SI_REL::state and checkpoints this change to the 
standby AMFD. However, AMFD does not update saAmfSISUHAState until it receives 
the su_si assignment response. Question:
(1). Should AMFD update the runtime attribute saAmfSISUHAState in IMM as soon 
as the local @state gets updated in the implementer, so that IMM, the active 
AMFD, and the standby AMFD are all in sync?
(2). Or should AMFD update saAmfSISUHAState in IMM only when it receives the 
su_si assignment response from AMFND, as is currently implemented for some 
reason (to avoid exposing the change of saAmfSISUHAState to the user too early?)

A grep for "avd_susi_update", which updates saAmfSISUHAState in IMM, also shows 
an inconsistency in usage: avd_susi_mod_send() sends the su_si msg and updates 
saAmfSISUHAState immediately, while avd_sg_su_si_mod_snd does not. 

Headless recovery relies on IMM to restore the state. If saAmfSISUHAState is 
not updated promptly and the node is rebooted during the headless stage, then 
after headless the saAmfSISUHAState read from IMM does not match many other 
states (SG fsm, SUSI fsm, saAmfSISUHAState of the other SUSIs).

My question is whether doing (1) will cause any problem for a normal cluster. 
The pending patches for #1725 part 2 currently implement (1).
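
The difference between options (1) and (2) can be shown with a minimal C sketch. Everything here is hypothetical and only models the timing: `susi_sketch`, `update_policy`, and the functions are illustration names, not AMFD code, and the "IMM" value is just a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical HA states and update policies, for illustration only. */
enum ha_state_sketch { SKETCH_HA_ACTIVE, SKETCH_HA_STANDBY, SKETCH_HA_QUIESCED };
enum update_policy { UPDATE_ON_SEND /* option (1) */, UPDATE_ON_RESPONSE /* option (2) */ };

struct susi_sketch {
    enum ha_state_sketch local_state; /* in-memory state held by the director */
    enum ha_state_sketch imm_state;   /* value last written to IMM */
};

/* Models sending the assignment: local state always changes at once,
 * the IMM copy changes here only under option (1). */
void send_assignment(struct susi_sketch *s, enum ha_state_sketch target,
                     enum update_policy policy)
{
    assert(s != NULL);
    s->local_state = target;
    if (policy == UPDATE_ON_SEND)
        s->imm_state = target;
}

/* Models receiving the assignment response: the IMM copy catches up. */
void on_assignment_response(struct susi_sketch *s)
{
    assert(s != NULL);
    s->imm_state = s->local_state;
}

/* True when a restart that reads IMM would see the same state as the director had. */
bool imm_in_sync(const struct susi_sketch *s)
{
    return s->local_state == s->imm_state;
}
```

Under option (2) there is a window between `send_assignment` and `on_assignment_response` in which `imm_in_sync` is false. A reboot inside that window is the case where the state read back from IMM after headless would disagree with the other states.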




[tickets] [opensaf:tickets] #2191 ntf: not handle TRY_AGAIN of saClmInitialize_4()

2016-11-23 Thread Vu Minh Nguyen
- **status**: review --> fixed
- **assigned_to**: Vu Minh Nguyen -->  nobody 
- **Comment**:

changeset:   8357:2f1901bd5e3f
tag:         tip
parent:      8354:288991466a47
user:        Vu Minh Nguyen
date:        Thu Nov 17 13:10:24 2016 +0700
summary:     ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]

changeset:   8356:898ed6baeb2f
branch:      opensaf-5.1.x
parent:      8352:15416dce3e2d
user:        Vu Minh Nguyen
date:        Thu Nov 17 13:10:24 2016 +0700
summary:     ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]

changeset:   8355:6833e9b1d421
branch:      opensaf-5.0.x
parent:      8351:4eb1ebe62d35
user:        Vu Minh Nguyen
date:        Thu Nov 17 13:10:24 2016 +0700
summary:     ntf: handle error code TRY_AGAIN of saClmInitialize() [#2191]



---

** [tickets:#2191] ntf: not handle TRY_AGAIN of saClmInitialize_4()**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Thu Nov 17, 2016 03:12 AM UTC by Vu Minh Nguyen
**Last Updated:** Thu Nov 17, 2016 06:31 AM UTC
**Owner:** nobody


In the current implementation, ntfsv does not handle `SA_AIS_ERR_TRY_AGAIN` 
when calling `saClmInitialize`. When that error is returned, ntfsv exits and, 
as a consequence, the node is rebooted.

The log below was captured while testing roaming SC:

> 2016-11-15 18:13:25 SC-3 osafntfd[458]: ER saClmInitialize failed with error: 
> 6
> 2016-11-15 18:13:25 SC-3 osafrded[412]: NO Peer down on node 0x2020f
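
A minimal sketch of the kind of handling this ticket asks for, assuming a bounded retry loop is acceptable. It is not the committed fix: `init_with_retry`, `init_fn`, `pause_fn`, and `stub_init` are hypothetical names, and the stub only simulates a call such as `saClmInitialize` so that the example can run without OpenSAF.

```c
#include <assert.h>
#include <stddef.h>

/* SA_AIS_ERR_TRY_AGAIN = 6 matches the error in the log above;
 * SA_AIS_OK = 1 is the value defined by the SAF AIS headers. */
enum { SA_AIS_OK = 1, SA_AIS_ERR_TRY_AGAIN = 6 };

/* Hypothetical signature standing in for an initialize call. */
typedef int (*init_fn)(void *ctx);

/* Hypothetical helper: call `init` until it returns something other than
 * TRY_AGAIN or `max_attempts` calls have been made. `pause_fn` is invoked
 * between attempts and may be NULL. Returns the last return code. */
int init_with_retry(init_fn init, void *ctx, int max_attempts,
                    void (*pause_fn)(void))
{
    int rc = SA_AIS_ERR_TRY_AGAIN;
    assert(init != NULL && max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = init(ctx);
        if (rc != SA_AIS_ERR_TRY_AGAIN)
            break;
        if (pause_fn != NULL && attempt + 1 < max_attempts)
            pause_fn();
    }
    return rc;
}

/* Illustration-only stub: returns TRY_AGAIN until the counter in ctx reaches zero. */
int stub_init(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return SA_AIS_ERR_TRY_AGAIN;
    }
    return SA_AIS_OK;
}
```

The caller decides what to do when the retry budget runs out. Exiting only at that point, rather than on the first `SA_AIS_ERR_TRY_AGAIN`, is what would avoid the reboot described above.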



[tickets] [opensaf:tickets] #2158 OSAF: IMMND dies before IMMND becomes AMF component

2016-11-23 Thread Hans Nordebäck
Ticket [#2204] created 


---

** [tickets:#2158] OSAF: IMMND dies before IMMND becomes AMF component**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Wed Nov 02, 2016 05:20 AM UTC by Minh Hon Chau
**Last Updated:** Wed Nov 23, 2016 12:30 PM UTC
**Owner:** nobody
**Attachments:**

- 
[osafamfnd_sc2](https://sourceforge.net/p/opensaf/tickets/2158/attachment/osafamfnd_sc2)
 (264.2 kB; application/octet-stream)


If IMMND dies during the OpenSAF startup phase, it is not restarted by AMF. The 
issue has been observed in the following situation:
- Restart the cluster
- While the active controller is starting up, a critical component dies, which 
causes a node failfast
Oct 25 12:51:21 SC-1 osafamfnd[7642]: ER 
safComp=ABC,safSu=1,safSg=2N,safApp=ABC Faulted due to:csiSetcallbackTimeout 
Recovery is:nodeFailfast
Oct 25 12:51:21 SC-1 osafamfnd[7642]: Rebooting OpenSAF NodeId = 131343 EE Name 
= , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, 
SupervisionTime = 60
- In the meantime, the standby controller is requested to become active
Oct 25 12:51:27 SC-2 tipclog[16221]: Lost link <1.1.2:eth0-1.1.1:eth0> on 
network plane A
Oct 25 12:51:27 SC-2 osafclmna[4336]: NO Starting to promote this node to a 
system controller
Oct 25 12:51:27 SC-2 osafrded[4387]: NO Requesting ACTIVE role
- IMMND also dies a bit later
Oct 25 12:51:29 SC-2 osafimmnd[4536]: ER MESSAGE:44816 OUT OF ORDER my highest 
processed:44814 - exiting
Oct 25 12:51:29 SC-2 osafamfnd[7414]: NO saClmDispatch BAD_HANDLE
- Other services cannot initialize the services they depend on because IMMND is dead
Oct 25 12:51:39 SC-2 osafamfd[7400]: WA saClmInitialize_4 returned 5
Oct 25 12:51:39 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:51:39 SC-2 osafntfimcnd[7501]: WA ntfimcn_ntf_init saNtfInitialize( 
returned SA_AIS_ERR_TIMEOUT (5)
Oct 25 12:51:39 SC-2 osafclmd[7386]: WA saImmOiImplementerSet returned 9
Oct 25 12:51:39 SC-2 osafntfd[7372]: WA saLogInitialize returns try again, 
retries...
Oct 25 12:51:39 SC-2 osaflogd[7358]: WA saImmOiImplementerSet returned 
SA_AIS_ERR_BAD_HANDLE (9)
Oct 25 12:51:39 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5

Oct 25 12:51:49 SC-2 osafamfd[7400]: WA saClmInitialize_4 returned 5
Oct 25 12:51:50 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:51:50 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5

Oct 25 12:52:00 SC-2 osafamfd[7400]: WA saClmInitialize_4 returned 5
Oct 25 12:52:00 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:52:00 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5

Oct 25 12:52:20 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5
Oct 25 12:52:20 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:52:20 SC-2 osafimmd[4489]: NO Extended intro from node 2210f

- In the end, the AMFD heartbeat times out
Oct 25 12:53:57 SC-2 osafntfimcnd[7501]: WA ntfimcn_ntf_init saNtfInitialize( 
returned SA_AIS_ERR_TIMEOUT (5)
Oct 25 12:54:01 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5
Oct 25 12:54:01 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:54:01 SC-2 osafamfd[7400]: WA saClmInitialize_4 returned 5
Oct 25 12:54:07 SC-2 osafntfimcnd[7501]: WA ntfimcn_ntf_init saNtfInitialize( 
returned SA_AIS_ERR_TIMEOUT (5)
Oct 25 12:54:11 SC-2 osafamfnd[7414]: WA saClmInitialize_4 returned 5
Oct 25 12:54:11 SC-2 osafamfd[7400]: WA saClmInitialize_4 returned 5
Oct 25 12:54:11 SC-2 osafamfd[7400]: WA saNtfInitialize returned 5
Oct 25 12:54:15 SC-2 osafamfnd[7414]: ER AMF director heart beat timeout, 
generating core for amfd

According to the AMFND trace on SC-2, AMFND did not receive su_pres from AMFD 
and therefore could not initiate the middleware components (including IMMND). 
As a result, AMFND was not aware of IMMND's death and could not restart it. The 
problem here is slightly different from #1828, which happened on a newly 
promoted SC (with the roamingSC feature) where AMFND had IMMND registered.




[tickets] [opensaf:tickets] #2184 AMF: Return TRY_AGAIN if SG Fsm State is not STABLE

2016-11-23 Thread Minh Hon Chau
- **labels**:  --> SA_AIS_ERR_BAD_OPERATION for ccb_complete if unstable sg
- **status**: assigned --> review



---

** [tickets:#2184] AMF: Return TRY_AGAIN if SG Fsm State is not STABLE**

**Status:** review
**Milestone:** 5.0.2
**Labels:** SA_AIS_ERR_BAD_OPERATION for ccb_complete if unstable sg 
**Created:** Sat Nov 12, 2016 11:59 AM UTC by Minh Hon Chau
**Last Updated:** Wed Nov 23, 2016 08:47 AM UTC
**Owner:** Minh Hon Chau


Some AMFD code paths return SA_AIS_ERR_TRY_AGAIN to IMM if the SG fsm state is 
not STABLE, while others return SA_AIS_ERR_BAD_OPERATION.
According to the AMF spec, section 9.4:

"SA_AIS_ERR_TRY_AGAIN - The service cannot be provided at this time. The client 
may retry later. This error generally should be returned when the requested 
action is valid but not currently possible, probably because another operation 
is acting upon the logical entity on which the administrative operation is 
invoked. Such an operation can be another administrative operation or an error 
recovery initiated by the Availability Management Framework."

9.4.2 SA_AMF_ADMIN_UNLOCK
SA_AIS_ERR_BAD_OPERATION - The operation was not successful because the target 
entity is in locked-instantiation administrative state.
9.4.3 SA_AMF_ADMIN_LOCK
SA_AIS_ERR_BAD_OPERATION - The operation was not successful because the target 
entity is in locked-instantiation administrative state.
... and so on

SA_AIS_ERR_BAD_OPERATION should be returned when an operation cannot possibly 
be executed under the given circumstances. 

One application has seen "immcfg -a saAmfSIPrefActiveAssignments" sometimes 
return SA_AIS_ERR_FAILED_OPERATION while the SG is performing a SUSI 
assignment, while at other times immcfg succeeds. 

Nov 10 03:20:03 SC-2-1 ABC: NOTICE: immcfg -a saAmfSIPrefActiveAssignments=2 
safSi=All-NWayActive,safApp=ABC returned error - saImmOmCcbApply FAILED: 
SA_AIS_ERR_FAILED_OPERATION (21)#012OI reports: IMM: Validation abort: 
Completed validation fails (ERR_BAD_OPERATION)#012OI reports: 
SG'safSg=NWayActive,safApp=ABC' is not stable (1)
…
Nov 10 03:20:03 SC-2-1 osafamfnd[12061]: NO 
'safSu=SC-1,safSg=NWayActive,safApp=ABC' Presence State INSTANTIATING => 
INSTANTIATED
Nov 10 03:20:03 SC-2-1 osafamfnd[12061]: NO Assigned 
'safSi=All-NWayActive,safApp=ABC' ACTIVE to 
'safSu=SC-1,safSg=NWayActive,safApp=ABC'

This ticket will change SA_AIS_ERR_BAD_OPERATION to SA_AIS_ERR_TRY_AGAIN in all 
other places where the SG is not STABLE, so that the application can try again 
until the SG becomes STABLE (as it should be).
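
The distinction between the two error codes can be sketched in C as follows. This is an illustration under stated assumptions, not AMFD code: `sg_view` and `validate_ccb_change` are hypothetical, and the two boolean fields stand in for whatever checks the real validation performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as defined by the SAF AIS headers. */
enum { SA_AIS_OK = 1, SA_AIS_ERR_TRY_AGAIN = 6, SA_AIS_ERR_BAD_OPERATION = 20 };

/* Hypothetical, simplified view of a service group during validation. */
struct sg_view {
    bool fsm_stable;      /* true when the SG fsm state is STABLE */
    bool change_is_valid; /* true when the requested change is allowed in principle */
};

/* Hypothetical validation: BAD_OPERATION for requests that can never succeed,
 * TRY_AGAIN for valid requests that are only blocked by an unstable SG. */
int validate_ccb_change(const struct sg_view *sg)
{
    assert(sg != NULL);
    if (!sg->change_is_valid)
        return SA_AIS_ERR_BAD_OPERATION;
    if (!sg->fsm_stable)
        return SA_AIS_ERR_TRY_AGAIN;
    return SA_AIS_OK;
}
```

With this split, a caller can treat `SA_AIS_ERR_TRY_AGAIN` as a signal to repeat the same request later, and `SA_AIS_ERR_BAD_OPERATION` as a signal that repeating it unchanged will not help.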



[tickets] [opensaf:tickets] #2202 cpnd: osafckptnd core dump in high memory load

2016-11-23 Thread Vo Minh Hoang
- **status**: accepted --> review



---

** [tickets:#2202] cpnd: osafckptnd core dump in high memory load**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Wed Nov 23, 2016 09:18 AM UTC by Vo Minh Hoang
**Last Updated:** Wed Nov 23, 2016 09:18 AM UTC
**Owner:** Vo Minh Hoang


A core dump occurs while creating a checkpoint section under high memory load 
when the shared memory guarantee is not enabled.

~~~
Core was generated by `/usr/lib64/opensaf/osafckptnd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f38f8513109 in __strtok_r_1c () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install 
opensaf-ckpt-nodedirector-debuginfo-5.1.0-690.0.d0f65c1.sle12.x86_64
(gdb) where
#0  0x7f38f8513109 in __strtok_r_1c () from /lib64/libc.so.6
#1  0x7f38f9fc074a in memcpy (__len=, __src=, 
__dest=) at /usr/include/bits/string3.h:51
#2  ncs_os_posix_shm (req=req@entry=0x7fff5de1f6b0)
at ../../../../../../opensaf/osaf/libs/core/leap/os_defs.c:858
#3  0x00415f6f in cpnd_sec_hdr_update 
(sec_info=sec_info@entry=0x19dc880, 
cp_node=cp_node@entry=0x19dc3e0)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_proc.c:1875
#4  0x0040673a in cpnd_ckpt_sec_add (cp_node=0x19dc3e0, 
id=0x7f38f0008a00, 
exp_time=1478796221720867000, gen_flag=gen_flag@entry=0)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_db.c:456
#5  0x0040d718 in cpnd_evt_proc_ckpt_sect_create 
(cb=cb@entry=0x18337f0, 
evt=evt@entry=0x7f38f000ad80, sinfo=sinfo@entry=0x7f38f000b3d8)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_evt.c:2244
#6  0x0040eff4 in cpnd_process_evt (evt=0x7f38f000ad70)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_evt.c:227
#7  0x00410bcd in cpnd_main_process (cb=cb@entry=0x18337f0)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_init.c:579
#8  0x00405a83 in main (argc=, argv=)
at ../../../../../../../opensaf/osaf/services/saf/cpsv/cpnd/cpnd_main.c:79
~~~


