[tickets] [opensaf:tickets] #2321 Incorrect error messages "mkfifo already exists" observed in syslog

2017-02-22 Thread Ritu Raj



---

** [tickets:#2321] Incorrect error messages "mkfifo already exists" observed in syslog**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Thu Feb 23, 2017 05:46 AM UTC by Ritu Raj
**Last Updated:** Thu Feb 23, 2017 05:46 AM UTC
**Owner:** nobody


# Environment details
OS : Suse 64bit
Changeset :  8603( 5.2.MO-1)

# Summary
Incorrect error messages "mkfifo already exists" are observed in syslog after 
performing an OpenSAF stop and start operation.

# Steps
1. Start OpenSAF on a single controller.
2. Stop OpenSAF and start it again. While OpenSAF starts again on the same node, 
the following error messages are observed in syslog for the components osafamfnd 
and osafamfwd:

Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: mkfifo already exists: 
/var/lib/opensaf/osafamfnd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: Started

Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: mkfifo already exists: 
/var/lib/opensaf/osafamfwd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: Started
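
The messages above are what `mkfifo()` reports through `EEXIST` when the FIFO file from the previous run is still present. A minimal sketch of one way to treat that case as normal, assuming POSIX `mkfifo()` and `stat()`; the helper name `ensure_fifo` is hypothetical and this is not the OpenSAF code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not OpenSAF code: make sure a FIFO exists at path.
 * Returns 0 if the FIFO was created or was already there, -1 otherwise. */
int ensure_fifo(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (mkfifo(path, S_IRUSR | S_IWUSR) == 0)
        return 0; /* freshly created */
    if (errno != EEXIST) {
        perror("mkfifo"); /* a real failure that deserves an error message */
        return -1;
    }
    /* Something already exists at path: accept it only if it is a FIFO. */
    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    if (!S_ISFIFO(st.st_mode)) {
        errno = EEXIST;
        return -1;
    }
    return 0; /* leftover FIFO from the previous run, nothing to report */
}
```

With a helper like this, a second start on the same node would return 0 without writing anything to the log, and only unexpected failures would be reported.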





---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2313 osaf: saflog function does not handle long record

2017-02-22 Thread Nguyen TK Luu
- **status**: assigned --> review



---

** [tickets:#2313] osaf: saflog function does not handle long record**

**Status:** review
**Milestone:** 5.0.2
**Created:** Fri Feb 17, 2017 06:45 AM UTC by Vu Minh Nguyen
**Last Updated:** Mon Feb 20, 2017 12:45 PM UTC
**Owner:** Nguyen TK Luu


`saflog` is a utility function provided for other OpenSAF services to write a 
log record to OpenSAF LOG.
The `logBuf` in that function is limited to 255 bytes.

So, when a log record longer than 255 bytes is written, the record is 
truncated, while `logBufSize` holds the string length that would have been 
written without truncation. As a result, `logBufSize` and `logBuf` become 
inconsistent, and the call eventually fails with `SA_AIS_ERR_INVALID_PARAM`.

So, to solve the problem, the `logBuf` capacity should be extended to the 
maximum log record size (65 KB) so that log records are not truncated.
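
The mismatch described above is the usual behaviour of `snprintf`-style functions: on truncation they return the length that would have been written, not the length stored. A minimal sketch of that effect, assuming the record is formatted with such a call; `format_record` is a hypothetical helper, and clamping is shown only as an illustration, not as the fix proposed in this ticket:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a record into buf and return the number of
 * bytes actually stored (excluding the terminating NUL), never the longer
 * "would have been written" length that vsnprintf reports on truncation. */
size_t format_record(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int wanted;

    assert(buf != NULL && cap > 0 && fmt != NULL);
    va_start(ap, fmt);
    wanted = vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    if (wanted < 0) {
        buf[0] = '\0';
        return 0; /* encoding error: report an empty record */
    }
    if ((size_t)wanted >= cap)
        return cap - 1; /* truncated: clamp to what buf really holds */
    return (size_t)wanted;
}
```

Using the raw return value of `vsnprintf` as the buffer size is what makes size and content disagree. Either clamping as above or a buffer large enough for the maximum record keeps the two consistent.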


---



[tickets] [opensaf:tickets] #2146 log: implement SaLogFilterSetCallbackT

2017-02-22 Thread Vu Minh Nguyen
- **status**: review --> fixed
- **assigned_to**: Canh Truong -->  nobody 
- **Comment**:

changeset:   8610:224db03b3ec0
tag:         tip
user:        Canh Van Truong
date:        Wed Feb 22 15:05:51 2017 +0700
summary:     log: implement SaLogFilterSetCallbackT and version handling [#2146]




---

** [tickets:#2146] log: implement SaLogFilterSetCallbackT**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Fri Oct 28, 2016 09:01 AM UTC by Vu Minh Nguyen
**Last Updated:** Thu Jan 19, 2017 11:35 AM UTC
**Owner:** nobody


This ticket is to implement SaLogFilterSetCallbackT, which is described in 
section `3.6.5 SaLogFilterSetCallbackT` of the AIS LOG document.


---



[tickets] [opensaf:tickets] #2241 dtm: Use number suffix for mds log backup files

2017-02-22 Thread Anders Widell
- **status**: review --> fixed
- **Comment**:

changeset:   8609:22b5d41ac612
user:        Anders Widell
date:        Wed Feb 22 15:47:41 2017 +0100
summary:     dtm: Use .1 file name extension for MDS log backup file [#2241]

[staging:22b5d4]



---

** [tickets:#2241] dtm: Use number suffix for mds log backup files**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu Dec 22, 2016 10:12 AM UTC by Anders Widell
**Last Updated:** Tue Feb 21, 2017 05:19 PM UTC
**Owner:** Anders Widell


Use the suffix .0 instead of .bak for the backup file of the mds log, to align 
with log rotation normally used in /var/log, and to enable the possibility to 
implement support for multiple backup files.
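
A numbered suffix makes it straightforward to keep several backups by shifting each one up by one before the current log is moved aside. A minimal sketch of that idea, assuming POSIX `rename()` semantics and the `.1` suffix used in the changeset summary; `rotate_backups` and its parameters are hypothetical and not the MDS log implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: shift path.1 -> path.2, ..., then path -> path.1,
 * keeping at most max_backups numbered backups. Returns 0 on success, -1 if
 * a generated name does not fit or the final rename fails. */
int rotate_backups(const char *path, int max_backups)
{
    char from[4096];
    char to[4096];
    int i;

    assert(path != NULL && max_backups >= 1);
    for (i = max_backups - 1; i >= 1; --i) {
        if (snprintf(from, sizeof from, "%s.%d", path, i) >= (int)sizeof from ||
            snprintf(to, sizeof to, "%s.%d", path, i + 1) >= (int)sizeof to)
            return -1;
        (void)rename(from, to); /* a missing older backup is not an error */
    }
    if (snprintf(to, sizeof to, "%s.1", path) >= (int)sizeof to)
        return -1;
    return rename(path, to); /* on POSIX this replaces any previous path.1 */
}
```

Calling `rotate_backups(path, 1)` gives the single-backup behaviour, and a larger `max_backups` keeps older generations as `path.2`, `path.3`, and so on.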


---



[tickets] [opensaf:tickets] #2320 clm: standby clmd crashes due to missing node information

2017-02-22 Thread Zoran Milinkovic
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,25 @@
 The standby CLMD service crashed due to missing PL-3 information.
+
+syslog from SC-2:
+~~~
+Feb 13 00:43:31 SC-2-2 osafamfd[5082]: NO Cold sync complete!
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: Ruling epoch noted as:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: SaImmRepositoryInitModeT 
changed and noted as 'SA_IMM_KEEP_REPOSITORY'
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 19082
+Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO Epoch set to 5 in ImmModel
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2020f old epoch: 4  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2010f old epoch: 4  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
+Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2030f old epoch: 0  new epoch:5
+Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ER Node is NULL,problem with the 
database.
+Feb 13 00:43:31 SC-2-2 osafclmd[5066]: 
../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' 
failed.
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: NO 
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: ER 
safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
+Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
+Feb 13 00:43:32 SC-2-2 opensaf_reboot: Rebooting local node; timeout=60
+~~~
 
 Coredump:
 ~~~






---

** [tickets:#2320] clm: standby clmd crashes due to missing node information**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Wed Feb 22, 2017 12:55 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Feb 22, 2017 12:55 PM UTC
**Owner:** nobody


The standby CLMD service crashed due to missing PL-3 information.

syslog from SC-2:
~~~
Feb 13 00:43:31 SC-2-2 osafamfd[5082]: NO Cold sync complete!
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: Ruling epoch noted as:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: SaImmRepositoryInitModeT changed 
and noted as 'SA_IMM_KEEP_REPOSITORY'
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 19082
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO Epoch set to 5 in ImmModel
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2020f old epoch: 4  new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2010f old epoch: 4  new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at 
node 2030f old epoch: 0  new epoch:5
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ER Node is NULL,problem with the 
database.
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: 
../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' 
failed.
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: NO 
'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: ER 
safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
Feb 13 00:43:32 SC-2-2 opensaf_reboot: Rebooting local node; timeout=60
~~~

Coredump:
~~~
[New LWP 5066]
[New LWP 5069]
[New LWP 5068]
[New LWP 5070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafclmd'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
### BT ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (cb=, 
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
#4  0x7fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=, 
cb=0x7fbc7e3be100 

[tickets] [opensaf:tickets] #2320 clm: standby clmd crashes due to missing node information

2017-02-22 Thread Zoran Milinkovic



---

** [tickets:#2320] clm: standby clmd crashes due to missing node information**

**Status:** unassigned
**Milestone:** 5.2.FC
**Created:** Wed Feb 22, 2017 12:55 PM UTC by Zoran Milinkovic
**Last Updated:** Wed Feb 22, 2017 12:55 PM UTC
**Owner:** nobody


The standby CLMD service crashed due to missing PL-3 information.

Coredump:
~~~
[New LWP 5066]
[New LWP 5069]
[New LWP 5068]
[New LWP 5070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafclmd'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
### BT ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (cb=, 
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
#4  0x7fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=, 
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
#5  ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
#6  mbcsv_callback (arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
#7  0x7fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60, 
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
#8  0x7fbc7c857146 in ncs_mbcsv_rcv_async_update (peer=0x7fbc7f217a60, 
evt=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:440
#9  0x7fbc7c85dd30 in mbcsv_process_events (rcvd_evt=0x7fbc740036e0, 
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at 
../../opensaf/src/mbc/mbcsv_pr_evts.c:168
#10 0x7fbc7c85de9b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753, 
mbx=mbx@entry=4283432961) at ../../opensaf/src/mbc/mbcsv_pr_evts.c:272
#11 0x7fbc7c8586c2 in mbcsv_process_dispatch_request (arg=0x7fff27b6b310) 
at ../../opensaf/src/mbc/mbcsv_api.c:423
#12 0x7fbc7e1aa7be in clms_mbcsv_dispatch (mbcsv_hdl=) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:687
#13 0x7fbc7e19e4e4 in main (argc=, argv=) at 
../../opensaf/src/clm/clmd/clms_main.c:535
### BT FULL ###
#0  0x7fbc7be880c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fbc7be89478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50 
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468, 
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec", 
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at 
../../opensaf/src/base/sysf_def.c:281
No locals.
#3  0x7fbc7e1aa016 in ckpt_proc_node_rec (cb=, 
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
param = 0x7fbc7f218a60
node = 0x0
ip = 0x0
__FUNCTION__ = "ckpt_proc_node_rec"
#4  0x7fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=, 
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
ckpt_cluster_rec = 
rc = 1
num_bytes = 
hdr = 0x7fbc7f218a50
ckpt_finalize_rec = 
ckpt_node_rec = 
ckpt_node_config_rec = 
ckpt_node_del_rec = 
ckpt_node_down_rec = 
ckpt_msg = 0x7fbc7f218a50
ckpt_client_rec = 
ckpt_csync_node_rec = 
ckpt_agent_down = 
#5  ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
rc = 1
msg_fmt_version = 1
#6  mbcsv_callback (arg=0x7fff27b6b1a0) at 
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
rc = 1
__FUNCTION__ = "mbcsv_callback"
#7  0x7fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60, 
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
parg = {
  i_op = NCS_MBCSV_CBOP_DEC,
  i_client_hdl = 0,
  i_ckpt_hdl = 4292870177,
  info = {
encode = {
  io_msg_type = NCS_MBCSV_MSG_ASYNC_UPDATE,
  io_action = NCS_MBCSV_ACT_ADD,
  io_reo_type = 6,
  io_reo_hdl = 0,
  io_uba = {
start = 0x0,
ub = 0x0,
bufp = 0x70 ,
res = 112,
ttl = 0,
max = 2132894108
  },
  io_req_context = 9209973925752930305,
  i_peer_version = 21264
},
decode = {
  i_msg_type = NCS_MBCSV_MSG_ASYNC_UPDATE,
  i_action = NCS_MBCSV_ACT_ADD,
  i_reo_type = 6,
  i_uba = {
 

[tickets] [opensaf:tickets] #2319 base: Add a base64 encoding function

2017-02-22 Thread Anders Widell



---

** [tickets:#2319] base: Add a base64 encoding function**

**Status:** assigned
**Milestone:** next
**Created:** Wed Feb 22, 2017 12:27 PM UTC by Anders Widell
**Last Updated:** Wed Feb 22, 2017 12:27 PM UTC
**Owner:** Anders Widell


The hash function implementation contains a private implementation of base64. 
Consider making this function generic and exposing a public API for it.
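
As a rough idea of what a generic function could look like, here is a minimal sketch of a standard base64 encoder (RFC 4648 alphabet with `=` padding). The name `base64_encode` and its signature are assumptions for illustration, and the existing private implementation may differ:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic encoder (standard alphabet, '=' padding).
 * Writes a NUL-terminated string to out. Returns 0 on success, or -1 if
 * out_cap is smaller than the 4 * ceil(len / 3) + 1 bytes that are needed. */
int base64_encode(const unsigned char *in, size_t len, char *out, size_t out_cap)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t needed = 4 * ((len + 2) / 3) + 1;
    size_t i = 0, o = 0;

    assert(out != NULL && (in != NULL || len == 0));
    if (out_cap < needed)
        return -1;
    while (i + 3 <= len) { /* full 3-byte groups -> 4 characters */
        unsigned long v = ((unsigned long)in[i] << 16) |
                          ((unsigned long)in[i + 1] << 8) | in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
        i += 3;
    }
    if (len - i == 1) { /* one leftover byte -> 2 characters + "==" */
        unsigned long v = (unsigned long)in[i] << 16;
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) { /* two leftover bytes -> 3 characters + "=" */
        unsigned long v = ((unsigned long)in[i] << 16) |
                          ((unsigned long)in[i + 1] << 8);
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = '=';
    }
    out[o] = '\0';
    return 0;
}
```

Taking a caller-supplied output buffer keeps the function free of allocations, which may or may not match what a public API in base would want.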


---



[tickets] [opensaf:tickets] #2318 base: Optimize the performance of the hash function

2017-02-22 Thread Anders Widell



---

** [tickets:#2318] base: Optimize the performance of the hash function**

**Status:** assigned
**Milestone:** next
**Created:** Wed Feb 22, 2017 12:24 PM UTC by Anders Widell
**Last Updated:** Wed Feb 22, 2017 12:24 PM UTC
**Owner:** Anders Widell


The hash function in base is currently unoptimized. Perform some basic 
optimizations so that the performance is comparable to other implementations.


---



[tickets] [opensaf:tickets] #2317 base: Add a raw interface to the hash function

2017-02-22 Thread Anders Widell



---

** [tickets:#2317] base: Add a raw interface to the hash function**

**Status:** assigned
**Milestone:** next
**Created:** Wed Feb 22, 2017 12:20 PM UTC by Anders Widell
**Last Updated:** Wed Feb 22, 2017 12:20 PM UTC
**Owner:** Anders Widell


Add a raw interface as an alternative to the C++ string interface for the hash 
function. This raw interface would take naked pointers as parameters and 
produce a raw sequence of bytes as output, i.e. without any encoding.
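
To illustrate the shape such an interface could have, here is a minimal sketch that takes a pointer and a size and writes the unencoded digest bytes into a caller-supplied array. The names `hash_raw` and `RAW_HASH_SIZE` are hypothetical, and 64-bit FNV-1a is used purely as a stand-in, since the ticket does not say which hash function base uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RAW_HASH_SIZE = 8 }; /* digest size of the stand-in hash, in bytes */

/* Hypothetical raw interface: hashes size bytes at data and stores the
 * unencoded digest in digest. FNV-1a (64-bit) is only a stand-in here. */
void hash_raw(const void *data, size_t size, unsigned char digest[RAW_HASH_SIZE])
{
    const unsigned char *p = data;
    uint64_t h = UINT64_C(0xcbf29ce484222325); /* FNV offset basis */
    size_t i;
    int b;

    assert(digest != NULL && (data != NULL || size == 0));
    for (i = 0; i < size; ++i) {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3); /* FNV prime */
    }
    for (b = 0; b < RAW_HASH_SIZE; ++b) /* store in big-endian byte order */
        digest[b] = (unsigned char)(h >> (56 - 8 * b));
}
```

A string-based interface could then be layered on top by encoding the bytes in `digest`, so the raw and encoded variants would share one implementation.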


---



[tickets] [opensaf:tickets] #2304 imm: osafimmpbed creates coredump due to double free memory

2017-02-22 Thread Zoran Milinkovic
- **status**: review --> fixed
- **Comment**:

default(5.2):

changeset:   8608:b50e7fd1fa07
tag:         tip
parent:      8605:0c6da910d0d4
user:        Zoran Milinkovic
date:        Mon Feb 13 13:36:49 2017 +0100
summary:     imm: fix PBE coredump for double freeing memory [#2304]



---

** [tickets:#2304] imm: osafimmpbed creates coredump due to double free memory**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Mon Feb 13, 2017 11:57 AM UTC by Zoran Milinkovic
**Last Updated:** Mon Feb 13, 2017 12:48 PM UTC
**Owner:** Zoran Milinkovic


When IMM is running with code coverage, osafimmpbed often dumps core.
The problem comes from a double exit call from two threads, the main thread and 
the MDS thread. Both threads try to call the destructor for a static variable 
in the IMM PBE library.

I think this is a timing issue and we haven't seen this error earlier. With 
the code coverage flag, the problem occurs approx. once a day.
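
Calling `exit()` more than once in a program is undefined behaviour in C, which fits the double destructor run described above. One common way to avoid it is to let only the first thread that wants to exit actually call `exit()`. A minimal sketch of that pattern, assuming C11 atomics and POSIX `pause()`; `claim_exit` and `exit_once` are hypothetical names, and this is not necessarily how the ticket was fixed:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>

atomic_flag exit_claimed = ATOMIC_FLAG_INIT;

/* Returns 1 for the first caller only; every later caller gets 0. */
int claim_exit(void)
{
    return !atomic_flag_test_and_set(&exit_claimed);
}

/* Hypothetical wrapper for exit(): only the thread that claims the exit runs
 * the exit handlers and static destructors; any other thread parks here
 * until the process is gone. */
void exit_once(int status)
{
    if (claim_exit())
        exit(status);
    for (;;)
        pause();
}
```

Both the main thread and the MDS thread would call `exit_once()` instead of `exit()`, so the static destructors in the library run at most once.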

GDB coredump backtrace:
~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
result = 32713
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
millisecond_round_up = {tv_sec = 0, tv_nsec = 99}
max_possible_timeout = {tv_sec = 2147483, tv_nsec = 64700}
start_time = {tv_sec = 17179869186, tv_nsec = 140501895252736}
time_left_ts = {tv_sec = 1, tv_nsec = 1}
result = 615339859
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
rc = 1
inds_rmvd = 1
next_delay = 0
tv = {tv_sec = 16777215, tv_usec = 0}
ts_current = {tv_sec = 216961, tv_nsec = 620030550}
ts = {tv_sec = 16777215, tv_nsec = 0}
set = {fd = 8, events = 1, revents = 0}
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
__res = 
pd = 0x7fc9258e8b00
now = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140501895252736, 
-8535808571625374835, 1, 1, 140501895253440, 140501895252736, 
8509828887122344845, 8509832138929142669}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 
pagesize_m1 = 
sp = 
freesize = 
__PRETTY_FUNCTION__ = "start_thread"
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
resultvar = 18446744073709551104
#1  __libc_fcntl (fd=22, cmd=) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
ap = {{gp_offset = 16, fp_offset = 32713, overflow_arg_area = 
0x7ffd194a7070, reg_save_area = 0x7ffd194a7030}}
arg = 0x7ffd194a7070
oldtype = 0
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
atfct = 
onfct = 
cxafct = 
f = 
#5  0x7fc923fc21f5 in __GI_exit (status=) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
error = SA_AIS_OK
ci = {first = , second = }
__FUNCTION__ = "pbeDaemon"
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354
localTmpFilename = ""
pbeRecoverFile = true
dbHandle = 0x55828bb010e8
classIdMap = std::map with 62 elements = {["OpenSafLogConfig"] = 
0x55828bb83290, ["OpenSafLogCurrentConfig"] = 0x55828bb792a0, 
["OpenSafSmfCampRestartIndicator"] = 0x55828bb7cdb0, 
["OpenSafSmfCampRestartInfo"] = 0x55828bb83b80, ["OpenSafSmfConfig"] = 
0x55828bb79090, ["OpenSafSmfExecControl"] = 0x55828bb7d0d0, ["OpenSafSmfMisc"] 
= 0x55828bb7bbc0, ["OpenSafSmfPbeIndicator"] = 0x55828bb79420, 
["OpenSafSmfRollbackData"] = 

[tickets] [opensaf:tickets] #2261 imm: PBE is crashing when OpenSAF is shutting down

2017-02-22 Thread Zoran Milinkovic
- **status**: assigned --> duplicate
- **Comment**:

Duplicate of #2304



---

** [tickets:#2261] imm: PBE is crashing when OpenSAF is shutting down**

**Status:** duplicate
**Milestone:** 5.0.2
**Created:** Fri Jan 13, 2017 09:26 AM UTC by Zoran Milinkovic
**Last Updated:** Fri Jan 13, 2017 09:26 AM UTC
**Owner:** Zoran Milinkovic


When OpenSAF is shutting down, PBE sometimes crashes.

~~~
[New LWP 1888]
[New LWP 1884]
[New LWP 1887]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/local/lib/opensaf/osafimmpbed --pbe 
/srv/shared/imm//imm.db'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

Thread 3 (Thread 0x7fc9258e8b00 (LWP 1887)):
#0  0x7fc924072fdd in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0x7fc924ad909b in osaf_poll_no_timeout (io_fds=0x7fc9258e8290, 
i_nfds=1) at src/base/osaf_poll.c:32
#2  0x7fc924ad9248 in osaf_ppoll (io_fds=0x7fc9258e8290, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at src/base/osaf_poll.c:79
#3  0x7fc924ae95cf in ncs_tmr_wait () at src/base/sysf_tmr.c:409
#4  0x7fc924353184 in start_thread (arg=0x7fc9258e8b00) at 
pthread_create.c:312
#5  0x7fc92408037d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 2 (Thread 0x7fc9258eb780 (LWP 1884)):
#0  0x7fc92435a64a in do_fcntl (arg=0x7ffd194a7070, cmd=7, fd=22) at 
../sysdeps/unix/sysv/linux/fcntl.c:39
#1  __libc_fcntl (fd=22, cmd=) at 
../sysdeps/unix/sysv/linux/fcntl.c:92
#2  0x7fc925270985 in __gcov_open () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#3  0x7fc9252714ee in gcov_exit () from 
/usr/local/lib/opensaf/libosaf_common.so.0
No symbol table info available.
#4  0x7fc923fc21a9 in __run_exit_handlers (status=1, listp=0x7fc9243446c8 
<__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#5  0x7fc923fc21f5 in __GI_exit (status=) at exit.c:104
No locals.
#6  0x55828aa7c60c in pbeDaemon (immHandle=4230542917903, 
dbHandle=0x55828bb010e8, ownerHandle=1483565869334821379, 
classIdMap=0x7ffd194abc10, objCount=335, pbe2=false, pbe2B=false) at 
src/imm/immpbed/immpbe_daemon.cc:2343
#7  0x55828aa6b408 in main (argc=3, argv=0x7ffd194abdd8) at 
src/imm/immpbed/immpbe.cc:354

Thread 1 (Thread 0x7fc9258c8b00 (LWP 1888)):
#0  0x7fc923fbcc37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fc923fc0028 in __GI_abort () at abort.c:89
#2  0x7fc923ff92a4 in __libc_message (do_abort=do_abort@entry=1, 
fmt=fmt@entry=0x7fc9241076b0 "*** Error in `%s': %s: 0x%s ***\n") at 
../sysdeps/posix/libc_fatal.c:175
#3  0x7fc92400555e in malloc_printerr (ptr=, 
str=0x7fc924107878 "double free or corruption (fasttop)", action=1) at 
malloc.c:4996
#4  _int_free (av=, p=, have_lock=0) at 
malloc.c:3840
#5  0x7fc92483936f in std::basic_string::~basic_string() () from 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x7fc923fc253a in __cxa_finalize (d=0x7fc9256c6c80) at cxa_finalize.c:56
#7  0x7fc925493833 in __do_global_dtors_aux () from 
/usr/local/lib/opensaf/libimmpbe_dump.so.0
No symbol table info available.
#8  0x7fc9258c7dc0 in ?? ()
No symbol table info available.
#9  0x7fc9256e870a in _dl_fini () at dl-fini.c:252
Backtrace stopped: frame did not save the PC
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
~~~


---



[tickets] [opensaf:tickets] #2309 imm: IMMNDs on PLs fail to discard local OI when headless

2017-02-22 Thread Hung Nguyen
- **status**: review --> fixed
- **Comment**:

default(5.2) [staging:0c6da9]
changeset:   8605:0c6da910d0d4
user:        Hung Nguyen
date:        Wed Feb 22 16:50:02 2017 +0700
summary:     imm: Cleanup orphaned implementers and admowners when headless [#2309]

opensaf-5.1.x [staging:f667c9]
changeset:   8606:f667c97dab51
user:        Hung Nguyen
date:        Wed Feb 22 16:51:55 2017 +0700
summary:     imm: Cleanup orphaned implementers and admowners when headless [#2309]

opensaf-5.0.x [staging:adc96b]
changeset:   8607:adc96bde4277
user:        Hung Nguyen
date:        Wed Feb 22 16:52:52 2017 +0700
summary:     imm: Cleanup orphaned implementers and admowners when headless [#2309]




---

** [tickets:#2309] imm: IMMNDs on PLs fail to discard local OI when headless**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Wed Feb 15, 2017 04:22 AM UTC by Hung Nguyen
**Last Updated:** Fri Feb 17, 2017 08:09 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2309/attachment/log.tgz) 
(251.2 kB; application/x-compressed)


When a PL-based OI is killed right before the cluster goes headless, IMMND 
fails to discard the implementer.
The implementer is only discarded locally, not really discarded.

As a result, the implementer is stuck in the "dying" state, and any attempt to 
set the implementer gets ERR_TRY_AGAIN.
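
From the client side, the symptom is a retry loop that never succeeds. A minimal sketch of a bounded retry that makes this visible, assuming POSIX `nanosleep()`; the result codes, `stuck_implementer_set` and `retry_try_again` are local stand-ins for illustration and not the IMM API:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Local stand-ins for the AIS result codes, so the sketch is self-contained. */
typedef enum { RC_OK, RC_TRY_AGAIN, RC_FAILED } rc_t;

/* Hypothetical operation that behaves like setting an implementer that is
 * stuck in the "dying" state: it never stops asking the caller to retry. */
rc_t stuck_implementer_set(void *ctx)
{
    ++*(int *)ctx; /* count the attempts */
    return RC_TRY_AGAIN;
}

/* Calls op(ctx) until it returns something other than RC_TRY_AGAIN or
 * max_attempts calls have been made, sleeping delay_ms between calls. */
rc_t retry_try_again(rc_t (*op)(void *), void *ctx, int max_attempts, long delay_ms)
{
    struct timespec delay;
    rc_t rc = RC_TRY_AGAIN;
    int attempt;

    assert(op != NULL && max_attempts >= 1 && delay_ms >= 0);
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;
    for (attempt = 1; attempt <= max_attempts; ++attempt) {
        rc = op(ctx);
        if (rc != RC_TRY_AGAIN)
            break;
        if (attempt < max_attempts)
            nanosleep(&delay, NULL);
    }
    return rc;
}
```

With an implementer in the state described above, such a loop would use up all its attempts and still end with the try-again result, which is how the problem shows up to an application.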

~~~
Feb 15 10:56:58 PL-3 osafimmnd[1127]: NO Implementer connected: 6 (xhunngu) 
<29, 2030f>
Feb 15 10:56:58 PL-3 osafimmnd[1127]: NO implementer for class 'Test' is 
xhunngu => class extent is safe.
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Implementer locally disconnected. 
Marking it as doomed 6 <29, 2030f> (xhunngu)
Feb 15 10:57:20 PL-3 osafimmnd[1127]: WA SC Absence IS allowed:1800 IMMD 
service is DOWN
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO IMMD SERVICE IS DOWN, HYDRA IS 
CONFIGURED => UNREGISTERING IMMND form MDS
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Implementer disconnected 1 <0, 
2010f(down)> (safLogService)
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Implementer disconnected 2 <0, 
2010f(down)> (@safLogService_appl)
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Implementer disconnected 3 <0, 
2010f(down)> (safClmService)
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Implementer disconnected 4 <0, 
2010f(down)> (safAmfService)
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO Impl Discarded node 2010f
Feb 15 10:57:20 PL-3 osafimmnd[1127]: NO MDS unregisterede. sleeping ...
Feb 15 10:57:21 PL-3 osafimmnd[1127]: NO Sleep done registering IMMND with MDS
Feb 15 10:57:21 PL-3 osafimmnd[1127]: NO SUCCESS IN REGISTERING IMMND WITH MDS
Feb 15 10:57:21 PL-3 osafimmnd[1127]: NO Re-introduce-me highestProcessed:653 
highestReceived:653
Feb 15 10:57:22 PL-3 osafclmna[1136]: NO Starting to promote this node to a 
system controller
Feb 15 10:57:24 PL-3 osafamfnd[1144]: WA AMF director unexpectedly crashed
Feb 15 10:57:24 PL-3 osafamfnd[1144]: NO Checking 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' for pending messages
Feb 15 10:57:26 PL-3 osafimmnd[1127]: WA MDS Send Failed to service:IMMD rc:2
Feb 15 10:57:27 PL-3 osafimmnd[1127]: NO Re-introduce-me highestProcessed:653 
highestReceived:653
Feb 15 10:57:27 PL-3 osafimmnd[1127]: WA MDS Send Failed to service:IMMD rc:2
Feb 15 10:57:28 PL-3 osafimmnd[1127]: NO Re-introduce-me highestProcessed:653 
highestReceived:653
~~~


---
