[tickets] [opensaf:tickets] #2550 smf: Smfd fails to create SaAmfNodeSwBundle due to IMM sync

2017-08-28 Thread Vijay Roy via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

Acknowledged by Mahesh AV, Lennart.



---

** [tickets:#2550] smf: Smfd fails to create SaAmfNodeSwBundle due to IMM sync**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Mon Aug 14, 2017 04:35 AM UTC by A V Mahesh (AVM)
**Last Updated:** Thu Aug 17, 2017 12:08 PM UTC
**Owner:** Vijay Roy


As a result of the fix for #2389, SmfImmCreateOperation returns TRY_AGAIN when 
OmCcbObjectCreate is aborted due to IMM sync (resource abort).
However, the callers, SmfImmUtils::doImmOperation and 
SmfUpgradeStep::createOneSaAmfNodeSwBundle, do not handle the TRY_AGAIN return 
code.

This causes the upgrade to fail.

Jun 27 23:22:55 SC_2_2 osafsmfd[28490]: NO STEP: Create new SaAmfNodeSwBundle 
objects
Jun 27 23:22:53 SC_2_2 osafimmd[28350]: NO Node 20c0f request sync 
sync-pid:13429 epoch:0 
Jun 27 23:22:56 SC_2_2 osafimmnd[28364]: NO Announce sync, epoch:376
Jun 27 23:22:56 SC_2_2 osafimmd[28350]: NO Successfully announced sync. New 
ruling epoch:376
Jun 27 23:23:01 SC_2_2 osafimmnd[28364]: WA Aborting ccbId 2959 to start sync
Jun 27 23:23:01 SC_2_2 osafimmloadd: NO Sync starting
Jun 27 23:23:02 SC_2_2 osafimmloadd: IN Synced 7868 objects in total
Jun 27 23:23:02 SC_2_2 osafimmnd[28364]: NO NODE STATE-> 
IMM_NODE_FULLY_AVAILABLE 16888
Jun 27 23:23:02 SC_2_2 osafimmloadd: NO Sync ending normally
Jun 27 23:23:03 SC_2_2 osafimmnd[28364]: NO ERR_FAILED_OPERATION: ccb 2959 is 
in an error state rejecting ccbObjectCreate operation 
Jun 27 23:23:03 SC_2_2 osafsmfd[28490]: NO Failed to create object of 
class=[SaAmfNodeSwBundle] to 
parent=[safAmfNode=PL-18,safAmfCluster=myAmfCluster]. 
rc=SA_AIS_ERR_FAILED_OPERATION (21),
Jun 27 23:23:03 SC_2_2 osafsmfd[28490]: NO Creation of object failed, 
rc=SA_AIS_ERR_FAILED_OPERATION (21), class=[SaAmfNodeSwBundle], 
parent=[safAmfNode=PL-18,safAmfCluster=myAmfCluster]
Jun 27 23:23:03 SC_2_2 osafsmfd[28490]: ER Failed to create new 
SaAmfNodeSwBundle objects in step=safSmfStep=0001

Proposed fix: SmfUpgradeStep::createOneSaAmfNodeSwBundle should handle the 
TRY_AGAIN return code.
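
A minimal sketch of what handling TRY_AGAIN in a caller could look like. This is not the committed fix: `RetryOnTryAgain`, the attempt limit and the delay are hypothetical, and the enum is a local stand-in for the AIS return codes (only the value 21 for SA_AIS_ERR_FAILED_OPERATION is confirmed by the log above; the others are assumed to match saAis.h).

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Local stand-ins for the AIS return codes; a real build would use saAis.h.
enum AisRc { kAisOk = 1, kAisErrTryAgain = 6, kAisErrFailedOperation = 21 };

// Hypothetical helper: rerun `op` while it reports TRY_AGAIN, at most
// `max_attempts` times, sleeping `delay` after each TRY_AGAIN result.
inline AisRc RetryOnTryAgain(const std::function<AisRc()>& op,
                             int max_attempts,
                             std::chrono::milliseconds delay) {
  AisRc rc = kAisErrTryAgain;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    rc = op();
    if (rc != kAisErrTryAgain) break;  // success or a non-retryable error
    std::this_thread::sleep_for(delay);
  }
  return rc;  // still TRY_AGAIN if every attempt was refused
}
```

Because the log shows the CCB itself being aborted, the retried operation would presumably have to set up a new CCB rather than repeat only the object create on the old one.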


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2547 amfd: payload cannot join cluster

2017-08-28 Thread Gary Lee via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

develop:

commit f921600fa2affd69e898a8beb0848c75924cfae1
Author: Gary Lee 
Date:   Tue Aug 29 13:42:50 2017 +1000

amfd: postpone deletion of node from node_id_db [#2547]

CLM and MDS callbacks are delivered to the main thread via different paths.
If a node is restarted quickly, sometimes CLM JOIN is processed before the
prior MDS down. This means the node will not be able to join the cluster
as it is not in node_id_db (deleted in MDS down processing).

This patch ensures addition to, and removal from node_id_db is only done
from CLM callbacks to avoid race conditions such as above.



---

** [tickets:#2547] amfd: payload cannot join cluster**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Wed Aug 09, 2017 12:41 AM UTC by Gary Lee
**Last Updated:** Mon Aug 14, 2017 03:49 AM UTC
**Owner:** Gary Lee


If a payload is stopped and restarted quickly, it is sometimes unable to 
re-join the cluster.

CLM and MDS events are delivered to the main thread via separate paths. Here we 
can see an MDS DOWN event arriving out of order, after CLM JOIN.

~~~
Jul 27 11:45:15.259963 osafamfd [264:264:src/clm/agent/clma_api.c:0829] >> 
saClmDispatch
Jul 27 11:45:15.260082 osafamfd [264:264:src/amf/amfd/clm.cc:0222] >> 
clm_track_cb: '0' '4' '1'
Jul 27 11:45:15.260103 osafamfd [264:264:src/amf/amfd/clm.cc:0238] TR 
numberOfMembers:'4', numberOfItems:'1'
Jul 27 11:45:15.260121 osafamfd [264:264:src/amf/amfd/clm.cc:0244] TR i = 0, 
node:'safNode=PL-4,safCluster=myClmCluster', clusterChange:3
Jul 27 11:45:15.260133 osafamfd [264:264:src/amf/amfd/clm.cc:0299] TR  Node 
Left: rootCauseEntity safNode=PL-4,safCluster=myClmCluster for node 132111

Jul 27 11:45:15.279492 osafamfd [264:264:src/clm/agent/clma_api.c:0829] >> 
saClmDispatch
Jul 27 11:45:15.279574 osafamfd [264:264:src/amf/amfd/clm.cc:0222] >> 
clm_track_cb: '0' '4' '1'
Jul 27 11:45:15.279581 osafamfd [264:264:src/amf/amfd/clm.cc:0238] TR 
numberOfMembers:'5', numberOfItems:'1'
Jul 27 11:45:15.279589 osafamfd [264:264:src/amf/amfd/clm.cc:0244] TR i = 0, 
node:'safNode=PL-4,safCluster=myClmCluster', clusterChange:2
Jul 27 11:45:15.279609 osafamfd [264:264:src/amf/amfd/node.cc:0052] TR added 
node 132111
Jul 27 11:45:15.279620 osafamfd [264:264:src/amf/amfd/clm.cc:0380] TR Node 
Joined 'safNode=PL-4,safCluster=myClmCluster' '36'

Jul 27 11:45:15.287973 osafamfd [264:264:src/amf/amfd/main.cc:0770] >> 
process_event: evt->rcv_evt 21
Jul 27 11:45:15.287979 osafamfd [264:264:src/amf/amfd/ndfsm.cc:0771] >> 
avd_mds_avnd_down_evh: 2040f, 0x55c93b1dfda0
Jul 27 11:45:15.287986 osafamfd [264:264:src/amf/amfd/ndproc.cc:1219] >> 
avd_node_failover: 'safAmfNode=PL-4,safAmfCluster=myAmfCluster'
Jul 27 11:45:15.287991 osafamfd [264:264:src/amf/amfd/ndfsm.cc:1110] >> 
avd_node_mark_absent

Jul 27 11:45:15.785245 osafamfd [264:264:src/amf/amfd/ndfsm.cc:0296] >> 
avd_node_up_evh: from 2040f, safAmfNode=PL-4,safAmfCluster=myAmfCluster
Jul 27 11:45:15.785261 osafamfd [264:264:src/amf/amfd/ndfsm.cc:0363] TR invalid 
node ID (2040f)
~~~
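
The ordering problem in the trace can be replayed on a toy model. The sketch below is not the amfd code: `NodeIdDb` and its method names are hypothetical, and it only illustrates the idea from the commit message that entries are added and removed on CLM callbacks, while MDS down leaves the map alone.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Simplified stand-in for amfd's node_id_db: node id -> AMF node name.
class NodeIdDb {
 public:
  // Called only from the CLM "node joined" path.
  void OnClmJoin(std::uint32_t node_id, const std::string& name) {
    db_[node_id] = name;
  }
  // Called only from the CLM "node left" path.
  void OnClmLeft(std::uint32_t node_id) { db_.erase(node_id); }
  // MDS down handling no longer touches the map, so a late MDS DOWN
  // cannot remove a node that CLM has already re-admitted.
  void OnMdsDown(std::uint32_t /*node_id*/) {}
  bool Contains(std::uint32_t node_id) const {
    return db_.count(node_id) != 0;
  }

 private:
  std::map<std::uint32_t, std::string> db_;
};
```

Replaying the sequence from the trace (CLM left, CLM joined, late MDS down for node 0x2040f) leaves the node in the map, so the later node-up message would find a valid node ID.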



---



[tickets] [opensaf:tickets] #2532 mds: TCP SVC_UP event is not received after subscribing

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
>>Just to be clear, are you requesting those traces for TIPC?

I also tried to reproduce the issue following your steps, but could not 
reproduce it on either TCP or TIPC.
I checked the logs and did not find much of a clue; I will go through the logs 
again to see whether anything turns up.

If required, we will have to reproduce the issue with an additional IMMA/MDS 
debugging patch on the same setup (TIPC or TCP) where you observed the issue.

-AVM


---

** [tickets:#2532] mds: TCP SVC_UP event is not received after subscribing**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Fri Jul 21, 2017 05:59 AM UTC by Hung Nguyen
**Last Updated:** Mon Aug 28, 2017 03:27 AM UTC
**Owner:** nobody
**Attachments:**

- 
[logs_n_traces.tgz](https://sourceforge.net/p/opensaf/tickets/2532/attachment/logs_n_traces.tgz)
 (1.5 MB; application/x-compressed)


MDS was installed successfully on IMMA, and IMMA subscribed to IMMND 
successfully. IMMND received the SVC_UP event for IMMA, but IMMA did not 
receive the SVC_UP event for IMMND.

~~~
<142>1 2017-07-20T13:00:36.072773+02:00 PL-4 immomtest 278 mds.log [meta 
sequenceId="14043"] MCM:API: svc_id = IMMA_OM(26) on VDEST id = 65535, 
SVC_PVT_VER = 0 Install Successfull
> ...
<142>1 2017-07-20T13:00:36.073091+02:00 PL-4 immomtest 278 mds.log [meta 
sequenceId="14074"] MCM:API: svc_subscribe :svc_id = IMMA_OM(26) on VDEST id = 
65535 Subscription to svc_id = IMMND(25) Successful
> ...
<142>1 2017-07-20T13:00:36.073904+02:00 PL-4 osafimmnd 177 mds.log [meta 
sequenceId="96185"] MCM:API: svc_up : svc_id = IMMND(25) on DEST id = 65535 got 
UP for svc_id = IMMA_OM(26) on Adest = , 
rem_svc_pvt_ver=0, rem_svc_archword=10
~~~


IMMA waited for the SVC_UP event for 30 sec but didn't receive anything.
~~~
Jul 20 13:00:36.071465 imma [278:278:src/imm/agent/imma_init.cc:0263] >> 
imma_startup 
Jul 20 13:00:36.071474 imma [278:278:src/imm/agent/imma_init.cc:0273] TR use 
count 0
Jul 20 13:00:36.071484 imma [278:278:src/base/ncs_main_pub.c:0220] TR 
NCS:PROCESS_ID=278
Jul 20 13:00:36.071494 imma [278:278:src/base/sysf_def.c:0089] TR INITIALIZING 
LEAP ENVIRONMENT
Jul 20 13:00:36.071584 imma [278:278:src/base/sysf_def.c:0124] TR DONE 
INITIALIZING LEAP ENVIRONMENT
Jul 20 13:00:36.071832 imma [278:278:src/base/ncs_main_pub.c:0757] TR 
NCS:NODE_ID=0x0002040F
Jul 20 13:00:36.072329 imma [278:278:src/mbc/mbcsv_dl_api.c:0059] >> 
mbcsv_lib_req 
Jul 20 13:00:36.072350 imma [278:278:src/mbc/mbcsv_dl_api.c:0096] >> 
mbcsv_lib_init 
Jul 20 13:00:36.072378 imma [278:278:src/mbc/mbcsv_mbx.c:0174] >> 
mbcsv_initialize_mbx_list 
Jul 20 13:00:36.072389 imma [278:278:src/mbc/mbcsv_mbx.c:0189] << 
mbcsv_initialize_mbx_list 
Jul 20 13:00:36.072399 imma [278:278:src/mbc/mbcsv_pwe_anc.c:0158] >> 
mbcsv_initialize_peer_list 
Jul 20 13:00:36.072409 imma [278:278:src/mbc/mbcsv_pwe_anc.c:0173] << 
mbcsv_initialize_peer_list 
Jul 20 13:00:36.072419 imma [278:278:src/mbc/mbcsv_dl_api.c:0075] << 
mbcsv_lib_req 
Jul 20 13:00:36.072440 imma [278:278:src/base/ncs_main_pub.c:0389] TR 
MBCSV:MBCA:ON
Jul 20 13:00:36.073104 imma [278:278:src/imm/agent/imma_init.cc:0063] >> 
imma_sync_with_immnd 
Jul 20 13:00:36.073114 imma [278:278:src/imm/agent/imma_init.cc:0071] TR 
Blocking first client
Jul 20 13:01:06.102156 imma [278:278:src/imm/agent/imma_init.cc:0081] TR 
Blocking wait released
Jul 20 13:01:06.102375 imma [278:278:src/imm/agent/imma_init.cc:0091] << 
imma_sync_with_immnd 
Jul 20 13:01:06.102413 imma [278:278:src/imm/agent/imma_init.cc:0179] TR Client 
agent successfully initialized
Jul 20 13:01:06.102427 imma [278:278:src/imm/agent/imma_init.cc:0296] << 
imma_startup: use count 1
~~~
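
The 30-second gap between "Blocking first client" and "Blocking wait released" is the signature of a timed wait that expired without the event arriving. The sketch below shows that pattern in isolation; `SvcUpLatch` is a hypothetical class built on the C++ standard library, not the mechanism IMMA actually uses.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical latch: one side waits for "service up", the other signals it.
class SvcUpLatch {
 public:
  // Called when the SVC_UP event is delivered.
  void NotifyUp() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      up_ = true;
    }
    cv_.notify_all();
  }
  // Returns true if the service came up before `timeout` expired,
  // false if the wait was released by the timeout alone.
  bool WaitUp(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cv_.wait_for(lock, timeout, [this] { return up_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool up_ = false;
};
```

In this model the trace above corresponds to `WaitUp` returning false: the caller carries on after the timeout even though no SVC_UP was ever seen.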


Traces and logs are attached.




---



[tickets] [opensaf:tickets] #2546 log: debug assertion failed in log agent during scale-in

2017-08-28 Thread Vu Minh Nguyen via Opensaf-tickets
- **status**: fixed --> accepted
- **assigned_to**: Vu Minh Nguyen
- **Comment**:

The coredump still occurs with this fix included, so I am re-opening the ticket.



---

** [tickets:#2546] log: debug assertion failed in log agent during scale-in**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Tue Aug 08, 2017 07:56 AM UTC by Vu Minh Nguyen
**Last Updated:** Wed Aug 09, 2017 03:20 AM UTC
**Owner:** Vu Minh Nguyen


During scale-in, the LOG application dumped core due to a failed assertion in 
the log agent:

> 320e :
> _ZN9LogClient17RestoreRefCounterE16RefCounterDegreeb():
> /src/log/agent/lga_client.cc:139 (discriminator 1)
> 320e: 48 8d 0d 8b a7 00 00    lea    0xa78b(%rip),%rcx    # d9a0 
> 

The disassembly above points to this line of code: `assert(ref_counter_ >= -1);`

:::C++
void LogClient::RestoreRefCounter(RefCounterDegree value, bool updated) {
  TRACE_ENTER();
  if (updated == false) return;
  ScopeLock scopeLock(ref_counter_mutex_);
  ref_counter_ -= value;
  TRACE("%s: value = %d", __func__, ref_counter_);
  // Don't expect the @ref_counter_ is less than (-1)
  assert(ref_counter_ >= -1);
}




---



[tickets] [opensaf:tickets] Re: #2532 mds: TCP SVC_UP event is not received after subscribing

2017-08-28 Thread Hung Nguyen via Opensaf-tickets
Hi,

In the logs_n_traces.tgz file, I have already included:

* IMMND trace (osafimmnd)
* IMMA trace (imma.trace)
* MDS log with MDS_LOG_LEVEL=5 for osafimmnd and immomtest (mds.log)
* MDS log with MDS_LOG_LEVEL=5 for immomtest (mds.log)


Just to be clear, are you requesting those traces for TIPC?

Thanks,


---



[tickets] [opensaf:tickets] #2475 amf: support for SC status change Callback, non SAF.

2017-08-28 Thread Praveen via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 00c185144de728f7938f775fd3ce65ee95b01032
Author: Praveen 
Date:   Mon Aug 28 14:32:32 2017 +0530

amf: update readme for SC status change callback [#2475]

commit b93cf244b3fb64bc213d82125e1665b50b80f2c6
Author: Praveen 
Date:   Mon Aug 28 14:32:33 2017 +0530

amf: support SC status change callback, non SAF [#2475]

commit 81e2878c1fa3287e37238a38a1bb054951489e86
Author: Praveen 
Date:   Mon Aug 28 14:32:33 2017 +0530

amf: add sample apps for SC status change callback [#2475]

commit a79bb4c527ec3c59a61ce6552184c18213fe4acd
Author: Praveen 
Date:   Mon Aug 28 14:32:33 2017 +0530

amf: add api test cases for sc status change callback [#2475]




---

** [tickets:#2475] amf: support for SC status change Callback, non SAF.**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu Jun 01, 2017 10:19 AM UTC by Praveen
**Last Updated:** Mon Aug 14, 2017 08:27 AM UTC
**Owner:** Praveen


This enhancement adds two resources to AMFA that let an application learn 
about the absence and presence of the SCs as they go down and come up.

Information about the resources:
* A callback that AMFA invokes whenever an SC joins the cluster, and whenever
  both SCs leave the cluster if the SC Absence feature is enabled.

  -Callback and its argument:

  void (*OsafAmfSCStatusChangeCallbackT)(OsafAmfSCStatusT state)
  where OsafAmfSCStatusT is defined as:
  typedef enum {
    OSAF_AMF_SC_PRESENT = 1,
    OSAF_AMF_SC_ABSENT = 2,
  } OsafAmfSCStatusT;

  This callback can be integrated with a standard AMF component (even a
  legacy one).

  -Return codes:
   SA_AIS_OK - The function returned successfully.
   SA_AIS_ERR_LIBRARY - An unexpected problem occurred in the library (such as
   corruption). The library cannot be used anymore.
   SA_AIS_ERR_BAD_HANDLE - The handle amfHandle is invalid, since it is
   corrupted, uninitialized, or has already been finalized.
   SA_AIS_ERR_INVALID_PARAM - A parameter is not set correctly (callback).

* An API to register/install the above callback function:
   void osafAmfInstallSCStatusChangeCallback(SaAmfHandleT amfHandle,
       void (*OsafAmfSCStatusChangeCallbackT)(OsafAmfSCStatusT status));
   If 0 is passed as amfHandle, the callback is invoked in the context of
   the MDS thread. If a valid amfHandle is passed, the callback is invoked
   in the context of the thread that calls saAmfDispatch() with this handle.
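
To make the callback flow concrete, here is a small self-contained sketch. The type names mirror the ticket, but they are redeclared locally so the example compiles without OpenSAF headers; `MockInstallSCStatusChangeCallback` and `MockDeliver` are hypothetical stand-ins for the library side and are not the real API.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring the ticket; the real declarations live in the
// OpenSAF AMF headers.
typedef std::uint64_t SaAmfHandleT;
typedef enum {
  OSAF_AMF_SC_PRESENT = 1,
  OSAF_AMF_SC_ABSENT = 2,
} OsafAmfSCStatusT;
typedef void (*OsafAmfSCStatusChangeCallbackT)(OsafAmfSCStatusT status);

// Mock of the install call: it only remembers the callback pointer.
static OsafAmfSCStatusChangeCallbackT g_installed = nullptr;
void MockInstallSCStatusChangeCallback(SaAmfHandleT /*amfHandle*/,
                                       OsafAmfSCStatusChangeCallbackT cb) {
  g_installed = cb;
}

// Application side: remember the most recently reported SC status.
static OsafAmfSCStatusT g_last_status = OSAF_AMF_SC_PRESENT;
void OnScStatusChange(OsafAmfSCStatusT status) { g_last_status = status; }

// Mock of AMFA delivering a status change to the installed callback.
void MockDeliver(OsafAmfSCStatusT status) {
  if (g_installed != nullptr) g_installed(status);
}
```

An application would register `OnScStatusChange` once and then react to `OSAF_AMF_SC_ABSENT`, for example by deferring requests that need the SCs until `OSAF_AMF_SC_PRESENT` is reported again.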




---



[tickets] [opensaf:tickets] #466 Length of the objectnames is more by one for configuration object notifications

2017-08-28 Thread Srinivas Siva Mangipudy via Opensaf-tickets
- **status**: unassigned --> assigned
- **assigned_to**: Srinivas Siva Mangipudy



---

** [tickets:#466] Length of the objectnames is more by one for configuration 
object notifications**

**Status:** assigned
**Milestone:** future
**Created:** Thu Jun 20, 2013 09:08 AM UTC by Sirisha Alla
**Last Updated:** Thu Aug 24, 2017 05:48 AM UTC
**Owner:** Srinivas Siva Mangipudy


When ntfimcnd sends notifications for configuration object 
creation/modification/deletion, the lengths of the notifying object and the 
notification object are reported incorrectly. The IMM callback gives the 
length of the notification object correctly.

Notification object length in the IMM callback:
objectName->length: 37
objectName->value: 'attrName_testSA_registerSA_Node_37_69'

The object create/modify/delete notifications report the length of the 
notification object as 38, and the length of the notifying object as 15 for 
"safApp=OpenSaf" (14 characters).

This issue is reproducible.
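
The ticket does not say where the extra byte comes from. One common way to get exactly this discrepancy is to count the terminating NUL as part of the name; the sketch below only demonstrates that arithmetic and is an assumption about the cause, not a diagnosis. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Length of the name counting characters only.
inline std::size_t NameLength(const char* name) { return std::strlen(name); }

// Length as it comes out if the terminating NUL is counted as well.
inline std::size_t NameLengthWithNul(const char* name) {
  return std::strlen(name) + 1;
}
```

For the two names in the report, the first helper gives 37 and 14, and the second gives 38 and 15, which matches the values seen in the notifications.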


---



[tickets] [opensaf:tickets] #872 osafdtmd asserts after connect with non member node

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#872] osafdtmd asserts after connect with non member node**

**Status:** assigned
**Milestone:** future
**Created:** Wed Apr 23, 2014 06:45 AM UTC by Hans Feldt
**Last Updated:** Tue Nov 15, 2016 06:36 AM UTC
**Owner:** nobody


100% reproducible.

By mistake I had opensaf started on my native system (named xubuntu-13 below). 
Then I launched a virtual cluster, which keeps crashing. SC-1 in the 
virtual cluster stays up, but all other nodes keep crashing with the following 
assert:

Apr 23 08:35:27 SC-2 osafdtmd[352]: NO Established contact with 'xubuntu-13'
Apr 23 08:35:27 SC-2 osafdtmd[352]: dtm_node.c:108: dtm_process_node_info: 
Assertion '0' failed.

Apr 23 08:35:38 PL-3 osafdtmd[350]: NO Established contact with 'xubuntu-13'
Apr 23 08:35:38 PL-3 osafdtmd[350]: NO Established contact with 'SC-2'
Apr 23 08:35:38 PL-3 osafdtmd[350]: dtm_node.c:108: dtm_process_node_info: 
Assertion '0' failed.




---



[tickets] [opensaf:tickets] #2301 cpsv: replace patricia trees with cpp Map/trees

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2301] cpsv: replace patricia trees with cpp Map/trees**

**Status:** assigned
**Milestone:** future
**Created:** Mon Feb 13, 2017 06:29 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Feb 13, 2017 06:29 AM UTC
**Owner:** nobody


Replace the NCS patricia tree database with a C++ map
to improve efficiency.
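
A rough sketch of what such a replacement could look like with `std::map`. The record type, the key and the class name are hypothetical, since the ticket does not show the cpsv data structures; the methods correspond to the add, get, get-next and delete operations a tree-based database typically offers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record type; the real cpsv structures are not in the ticket.
struct CkptNode {
  std::uint64_t ckpt_id;
  std::string name;
};

// Ordered database keyed by checkpoint id.
class CkptDb {
 public:
  // Returns false if an entry with the same key already exists.
  bool Add(const CkptNode& node) {
    return db_.emplace(node.ckpt_id, node).second;
  }
  const CkptNode* Get(std::uint64_t id) const {
    auto it = db_.find(id);
    return it == db_.end() ? nullptr : &it->second;
  }
  // Entry with the smallest key greater than `id`, for in-order walks.
  const CkptNode* GetNext(std::uint64_t id) const {
    auto it = db_.upper_bound(id);
    return it == db_.end() ? nullptr : &it->second;
  }
  bool Del(std::uint64_t id) { return db_.erase(id) == 1; }

 private:
  std::map<std::uint64_t, CkptNode> db_;
};
```

`std::map` keeps keys ordered, so a get-next style traversal maps directly onto `upper_bound`; if ordered traversal is not needed, `std::unordered_map` would be another option.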


---



[tickets] [opensaf:tickets] #246 cpsv: Section create fails with random return values when mulitple processes try to create sections in the same checkpoint 70 node setup.

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#246] cpsv: Section create fails with random return values when 
mulitple processes try to create sections in the same checkpoint  70 node 
setup. **

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:37 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody


 from http://devel.opensaf.org/ticket/2386

 Changeset: 3065
Setup: 70 node SLES11 VM setup


Two applications per node are running on a 70-node setup. 


A collocated checkpoint is created. After the active replica is set from one 
process, section create with GENERATED_SECTION_ID as the section id is invoked 
from the rest of the processes. But the section create fails with ERR_EXIST, 
ERR_TIMEOUT, or ERR_TRY_AGAIN.


/var/log/messages for the two controllers will be shared.





---



[tickets] [opensaf:tickets] #444 osafdtmd needs to exit with failure rather than rebooting the system in case of missing configurations

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False
- **Milestone**: 5.17.08 --> future



---

** [tickets:#444] osafdtmd needs to exit with failure rather than rebooting the 
system in case of missing configurations**

**Status:** accepted
**Milestone:** future
**Created:** Thu Jun 06, 2013 09:44 AM UTC by Sirisha Alla
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** nobody


If the configuration is missing, it would be better to log the error in the 
syslog and exit rather than reboot the system. Currently the system goes into 
continuous reboots in such cases.

Starting OpenSAF Services: Jun  6 14:46:45 OEL-64BIT-SLOT4 osafdtmd[2199]: 
Started
Jun  6 14:46:45 OEL-64BIT-SLOT4 osafdtmd[2199]: ER DTM: Could not open file  
node_name 
Jun  6 14:46:45 OEL-64BIT-SLOT4 osafdtmd[2199]: ER DTM:Error reading 
/etc/opensaf/dtmd.conf.  errno : 2
Jun  6 14:46:45 OEL-64BIT-SLOT4 opensafd[2190]: ER Failed #012 DESC:TRANSPORT
Jun  6 14:46:45 OEL-64BIT-SLOT4 opensafd[2190]: ER Going for recovery
Jun  6 14:46:45 OEL-64BIT-SLOT4 opensafd[2190]: ER Trying To RESPAWN 
/usr/lib64/opensaf/clc-cli/osaf-transport attempt #1
Jun  6 14:46:45 OEL-64BIT-SLOT4 opensafd[2190]: ER Sending SIGKILL to 
TRANSPORT, pid=2191
Jun  6 14:46:45 OEL-64BIT-SLOT4 opensafd[2203]: ER Failed to exec while forking 
script, err=No such file or directory
Jun  6 14:47:00 OEL-64BIT-SLOT4 osafdtmd[2214]: Started
Jun  6 14:47:00 OEL-64BIT-SLOT4 osafdtmd[2214]: ER DTM: Could not open file  
node_name 
Jun  6 14:47:00 OEL-64BIT-SLOT4 osafdtmd[2214]: ER DTM:Error reading 
/etc/opensaf/dtmd.conf.  errno : 2
Jun  6 14:47:00 OEL-64BIT-SLOT4 opensafd[2190]: ER Could Not RESPAWN TRANSPORT
Jun  6 14:47:00 OEL-64BIT-SLOT4 opensafd[2190]: ER Failed #012 DESC:TRANSPORT
Jun  6 14:47:00 OEL-64BIT-SLOT4 opensafd[2190]: ER Trying To RESPAWN 
/usr/lib64/opensaf/clc-cli/osaf-transport attempt #2
Jun  6 14:47:00 OEL-64BIT-SLOT4 opensafd[2190]: ER Sending SIGKILL to 
TRANSPORT, pid=2206
Jun  6 14:47:00 OEL-64BIT-SLOT4 opensafd[2217]: ER Failed to exec while forking 
script, err=No such file or directory
Jun  6 14:47:15 OEL-64BIT-SLOT4 osafdtmd[2229]: Started
Jun  6 14:47:15 OEL-64BIT-SLOT4 osafdtmd[2229]: ER DTM: Could not open file  
node_name 
Jun  6 14:47:15 OEL-64BIT-SLOT4 osafdtmd[2229]: ER DTM:Error reading 
/etc/opensaf/dtmd.conf.  errno : 2
Jun  6 14:47:15 OEL-64BIT-SLOT4 opensafd[2190]: ER Could Not RESPAWN TRANSPORT
Jun  6 14:47:15 OEL-64BIT-SLOT4 opensafd[2190]: ER Failed #012 DESC:TRANSPORT
Jun  6 14:47:15 OEL-64BIT-SLOT4 opensafd[2190]: ER FAILED TO RESPAWN
Jun  6 14:47:15 OEL-64BIT-SLOT4 osafdtmd: osafdtmd Process down, Rebooting the 
node
Jun  6 14:47:15 OEL-64BIT-SLOT4 opensaf_reboot: Rebooting local node

Here the node_name file is missing.



---



[tickets] [opensaf:tickets] #457 Dtm: standby joins as active after restart in a 70 node setup

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#457] Dtm: standby joins as active after restart in a 70 node 
setup**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Jun 14, 2013 06:48 AM UTC by Neelakanta Reddy
**Last Updated:** Wed Jul 15, 2015 02:21 PM UTC
**Owner:** nobody
**Attachments:**

- 
[messages_SC1](https://sourceforge.net/p/opensaf/tickets/457/attachment/messages_SC1)
 (65.5 kB; application/octet-stream)
- 
[messages_SC2](https://sourceforge.net/p/opensaf/tickets/457/attachment/messages_SC2)
 (208.0 kB; application/octet-stream)


Analysis of the logs shows the following:

Slot1 is active and slot2 is standby.

1. IMMND is killed in slot-2

Jun 11 21:29:46 SLES-64BIT-SLOT2 osafamfnd[3750]: NO 
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery is 'componentRestart'


2. The active IMMD detects that the slot-2 IMMND has been discarded

Jun 11 15:54:02 SLES-64BIT-SLOT1 osafimmnd[3746]: NO Global discard node 
received for nodeId:2020f pid:3668


3. The new IMMND at slot2 requests sync

Jun 11 21:29:46 SLES-64BIT-SLOT2 osafimmnd[7315]: Started

Jun 11 15:54:03 SLES-64BIT-SLOT1 osafimmd[3736]: NO Node 2020f request sync 
sync-pid:7315 epoch:0

4. IMMD is killed and slot2 goes for reboot

Jun 11 21:29:49 SLES-64BIT-SLOT2 osafamfnd[3750]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Jun 11 21:29:49 SLES-64BIT-SLOT2 osafamfnd[3750]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Component faulted: recovery is node failfast
Jun 11 21:29:49 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node

5. After coming back up, slot 2 takes the active role (slot 1 is still active)

Jun 11 21:30:22 SLES-64BIT-SLOT2 osafrded[2095]: NO Peer not available => 
Active role
Jun 11 21:30:23 SLES-64BIT-SLOT2 osaffmd[2108]: Started
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmd[2117]: Started
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmnd[2127]: Started


6. After taking the active role, the node starts loading

Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmnd[2127]: NO This IMMND is now the NEW 
Coord

7. After some time, there is a connection established to the active node

Jun 11 21:30:23 SLES-64BIT-SLOT2 osafdtmd[2077]: NO Established contact with 
'SC-1
Jun 11 15:54:39 SLES-64BIT-SLOT1 osafdtmd[3696]: NO Established contact with 
'SC-2'


8. Once connected, the loading event reaches the active IMMD on slot 1. The 
IMMND up event is not received, because the connection between the two nodes 
was not yet established when the IMMND came up.

Jun 11 15:54:42 SLES-64BIT-SLOT1 osafimmd[3736]: WA Wrong PID 0 != 2127

9. AMFD tries to reconnect to IMM, because IMMND returns BAD_HANDLE when AMFD 
issues another request on a handle whose previous synchronous call has not yet 
completed.

Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: NO Re-initializing with IMM
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafimmnd[3746]: WA IMMND - Client Node Get 
Failed for cli_hdl 85899477263
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: ER saImmOiImplementerSet 
failed 14
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: ER exiting since 
avd_imm_impl_set failed


Conclusion:

MDS on slot 2 connected to slot 1 only after loading had been initiated in 
IMMND; because of this, slot 2 got the active role.
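
A side note on reading the logs above: SC-2 shows up both as `nodeId:2020f` (IMM messages) and as `NodeId = 131599` (the AMF reboot message in step 4). These are the same number written in hexadecimal and in decimal. A minimal C sketch of the conversion follows; `demo_hex_to_node_id` is a hypothetical helper name used only for illustration, not an OpenSAF function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of OpenSAF): parse a node id printed in
 * hex without a 0x prefix, as in "nodeId:2020f", into its numeric value. */
static uint32_t demo_hex_to_node_id(const char *hex)
{
    assert(hex != NULL);
    return (uint32_t)strtoul(hex, NULL, 16);
}
```

For instance, `demo_hex_to_node_id("2020f")` yields 131599, which matches the reboot message in step 4.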


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.


[tickets] [opensaf:tickets] #272 checkpoint overwrite returns timeout when controllers are running with different compatible versions

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#272] checkpoint overwrite returns timeout when controllers are 
running with different compatible versions**

**Status:** assigned
**Milestone:** future
**Created:** Fri May 17, 2013 11:40 AM UTC by Sirisha Alla
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- 
[logs.tar.gz](https://sourceforge.net/p/opensaf/tickets/272/attachment/logs.tar.gz)
 (175.5 kB; application/x-gzip)


The issue is seen on OEL6.4 TCP setup. Changeset being used is 4241 with 
patches 2794 and 3117.

The active controller (SC-1) is running version 4.3, while the standby 
controller (SC-2) is running cs3533 (4.2.x).

A non-collocated checkpoint replica is created on the active controller.
A section is created in the checkpoint.
The write and read APIs succeed, but the overwrite API keeps returning timeout 
for 5 seconds, after which the application times out and exits.

No ckptnd or agent crashes were observed. When the same application is run on 
SC-2, it runs without any error.

Attaching the journal and the traces of ckptnd and ckptd on both the 
controllers.




[tickets] [opensaf:tickets] #238 cpsv : Write for asynchronous non collocated checkpoint returns SA_AIS_ERR_NOT_EXIST in some processes

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#238] cpsv : Write for asynchronous non collocated checkpoint 
returns SA_AIS_ERR_NOT_EXIST in some processes**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:17 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody


From http://devel.opensaf.org/ticket/2384

Changeset : 3065
Setup: 70 node SLES11 VM setup.


Problem Description:



70 processes are running the test scenario below, with each node hosting a 
single process.


1) The application running on SC-1 opens a non-collocated checkpoint and 
creates a section in the checkpoint.
2) The rest of the applications create the checkpoint and, once the section 
create is successful on SC-1, write into the same section.


Some of the applications get SA_AIS_ERR_NOT_EXIST for the write operation.


Traces are not enabled on the setup; /var/log/messages from both 
controllers can be provided.









[tickets] [opensaf:tickets] #241 cpsv : saCkptCheckpointOpen writes to const SaNameT

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#241] cpsv : saCkptCheckpointOpen writes to const SaNameT**

**Status:** unassigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:28 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Apr 03, 2017 06:47 PM UTC
**Owner:** nobody


from http://devel.opensaf.org/ticket/1731

Problem:
osaf/libs/agents/saf/cpa/cpa_api.c line 648: 
m_CPSV_SET_SANAMET(checkpointName);
However, checkpointName is declared as: const SaNameT *checkpointName
and m_CPSV_SET_SANAMET does memset( (uns8 *)&name->value[name->length], 0, 
(SA_MAX_NAME_LENGTH - name->length) )


This causes a segfault if the value passed in is in read-only memory.


The bug is present in opensaf-staging/1057c1e6ebba; I'm not sure what version 
that is.


Example:
#define CKPT_NAME "safCkpt=My_Ckpt,safApp=safCkptService"
const SaNameT ckpt_name = { sizeof(CKPT_NAME) - 1, CKPT_NAME };


Then call saCkptCheckpointOpen on it
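
One way to avoid writing through the caller's const pointer is to zero-pad a private copy instead. The sketch below only illustrates that idea and is not the actual fix. `DemoNameT`, `DEMO_MAX_NAME_LENGTH` and `demo_copy_padded_name` are stand-in names that assume the classic fixed-size SaNameT layout (a 16-bit length followed by a 256-byte value array), so the block compiles without the SAF headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in definitions (assumption: they mirror the classic fixed-size
 * SaNameT layout) so the sketch compiles without the SAF headers. */
#define DEMO_MAX_NAME_LENGTH 256

typedef struct {
    uint16_t length;
    uint8_t value[DEMO_MAX_NAME_LENGTH];
} DemoNameT;

/* Hypothetical helper: build a zero-padded private copy of the name
 * instead of padding the caller's (possibly read-only) object in place.
 * Returns 0 on success, -1 if the length field is out of range. */
static int demo_copy_padded_name(const DemoNameT *in, DemoNameT *out)
{
    assert(in != NULL && out != NULL);
    if (in->length > DEMO_MAX_NAME_LENGTH)
        return -1;
    memset(out, 0, sizeof(*out));              /* zero the whole copy, tail included */
    out->length = in->length;
    memcpy(out->value, in->value, in->length); /* copy only the used bytes */
    return 0;
}
```

With a helper like this, the const object from the example above is only ever read, so it can live in read-only memory.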







[tickets] [opensaf:tickets] #239 cpsv : section create returns ERR_EXIST after few try agains on 70 node cluster

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#239] cpsv : section create returns ERR_EXIST after few try agains 
on 70 node cluster**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:19 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody


From http://devel.opensaf.org/ticket/3042

This is seen on 70 SLES VM setup. One checkpoint application runs on each node.


1) The checkpoint application on the active controller creates an asynchronous 
collocated checkpoint. The applications on the other nodes open the same 
checkpoint.
2) The replica is set active on the active controller and a section is created.
3) The section create API returns TRY_AGAIN a few times and then returns 
ERR_EXIST.


When the application gets TRY_AGAIN, the section should not have been created 
in the checkpoint. This is not always reproducible.


snippet from test journal:


520|0 15 00130961 1 21| FAILED : Section 11 created in active colloc ckpt
520|0 15 00130961 1 22| Return Value : SA_AIS_ERR_TRY_AGAIN
520|0 15 00130961 1 23|
520|0 15 00130961 1 24| Try again count : 8 
520|0 15 00130961 1 25|
520|0 15 00130961 1 26| FAILED : Section 11 created in active colloc ckpt 
520|0 15 00130961 1 27| Return Value : SA_AIS_ERR_EXIST
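
The journal above is what a client-side retry loop would see. Below is a minimal sketch of such a loop, written so that an EXIST result that arrives only after at least one TRY_AGAIN is reported separately from an EXIST on the first attempt, which is the case this ticket is about. All names (`demo_rc_t`, `demo_create_with_retry`, the scripted stub) are hypothetical; the real call would be the checkpoint section create API, and a real loop would sleep between attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the return codes of interest (not the real SaAisErrorT). */
typedef enum { DEMO_OK, DEMO_TRY_AGAIN, DEMO_EXIST, DEMO_OTHER } demo_rc_t;

typedef enum {
    DEMO_CREATED,           /* create succeeded */
    DEMO_ALREADY_EXISTED,   /* EXIST on the very first attempt */
    DEMO_EXIST_AFTER_RETRY, /* EXIST after one or more TRY_AGAIN results */
    DEMO_GAVE_UP,           /* only TRY_AGAIN until the attempts ran out */
    DEMO_FAILED             /* any other error */
} demo_outcome_t;

typedef demo_rc_t (*demo_create_fn)(void *ctx);

static demo_outcome_t demo_create_with_retry(demo_create_fn create, void *ctx,
                                             int max_tries)
{
    int saw_try_again = 0;

    assert(create != NULL && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        switch (create(ctx)) {
        case DEMO_OK:
            return DEMO_CREATED;
        case DEMO_TRY_AGAIN:
            saw_try_again = 1; /* real code would back off here */
            break;
        case DEMO_EXIST:
            return saw_try_again ? DEMO_EXIST_AFTER_RETRY
                                 : DEMO_ALREADY_EXISTED;
        default:
            return DEMO_FAILED;
        }
    }
    return DEMO_GAVE_UP;
}

/* Scripted stub that replays a fixed sequence of return codes. */
struct demo_script { const demo_rc_t *codes; int len; int pos; };

static demo_rc_t demo_scripted_create(void *ctx)
{
    struct demo_script *s = (struct demo_script *)ctx;
    return s->pos < s->len ? s->codes[s->pos++] : DEMO_OTHER;
}
```

Replaying TRY_AGAIN, TRY_AGAIN, EXIST through the stub gives `DEMO_EXIST_AFTER_RETRY`, the pattern seen in the journal.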


Attaching CPD and CPND traces of both the controllers







[tickets] [opensaf:tickets] #467 checkpoint with COLLOCATED flag forcing to register for arrival callback

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#467] checkpoint with COLLOCATED flag forcing to register for 
arrival callback**

**Status:** assigned
**Milestone:** future
**Created:** Mon Jun 24, 2013 06:36 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody


I am using opensaf 4.0.0
http://devel.opensaf.org/ticket/1866


I am running a simple Amf demo for counting which uses checkpoint.


my checkpoint creation flags are : SA_CKPT_CHECKPOINT_COLLOCATED| 
SA_CKPT_WR_ALL_REPLICAS


I tested it on a 2-node cluster (both target hardware and UML nodes).


The problem is that unless I register for the arrival callback, my standby 
component faults; AMF reports a healthcheck timeout.


I also tested with SA_CKPT_CHECKPOINT_COLLOCATED | SA_CKPT_WR_ACTIVE_REPLICA. 
I am facing the same issue.


If I remove the collocated flag, it works fine. 







[tickets] [opensaf:tickets] #673 dtm: leak mentioned by valgrind at mdtm_process_poll_recv_data_tcp()

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#673] dtm: leak mentioned  by valgrind at 
mdtm_process_poll_recv_data_tcp()**

**Status:** assigned
**Milestone:** future
**Created:** Wed Dec 18, 2013 09:22 AM UTC by A V Mahesh (AVM)
**Last Updated:** Sun Nov 01, 2015 09:36 PM UTC
**Owner:** nobody


Leak mentioned by valgrind at mdtm_process_poll_recv_data_tcp()
==
mdtm_process_poll_recv_data_tcp:
   :
	recd_bytes = recv(tcp_cb->DBSRsock, tcp_cb->len_buff, 2, 0);
	if (0 == recd_bytes) {
		LOG_ER("MDTM:socket_recv() = %d, conn lost with dh server, exiting library err :%s", recd_bytes, strerror(errno));
		close(tcp_cb->DBSRsock);
		exit(0);
==




[tickets] [opensaf:tickets] #638 node cannot join AMF cluster after restart

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False
- **Milestone**: 5.17.08 --> future



---

** [tickets:#638] node cannot join AMF cluster after restart**

**Status:** accepted
**Milestone:** future
**Created:** Fri Nov 22, 2013 02:54 PM UTC by Hans Feldt
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** nobody


OpenSAF 4.2.2 changeset 3796, 79 extra patches
System: RHEL based, 2 node cluster, MDS/TIPC

After node reboot of the standby controller it cannot join the cluster again. 
This can be seen in the syslog on the active controller:


Nov 17 17:15:20 notice atrcxb3166 osafamfd[6038]: Cold sync complete!
Nov 19 17:40:07 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' joined the cluster
Nov 19 17:42:08 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 19 17:42:28 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f

Nov 21 16:24:21 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 21 16:29:04 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' joined the cluster
Nov 21 16:29:24 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:29:44 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:04 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:24 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:30:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:31:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:32:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:33:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:34:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:35:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:14 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:34 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 16:36:54 warning atrcxb3166 osafamfd[6712]: invalid node state 1 for 
node 2020f
Nov 21 17:41:58 err atrcxb3166 osafamfd[6712]: avd_d2n_msg_dequeue: ncsmds_api 
failed 2
Nov 21 17:42:08 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 21 17:42:18 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:42:38 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:42:58 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:18 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:43:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:44:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:45:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:46:19 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
node ID (2020f)
Nov 21 17:46:39 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 210, from 2020f should be 1
Nov 21 17:46:59 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 211, from 2020f should be 1

Nov 21 18:00:40 warning atrcxb3166 osafamfd[6712]: avd_msg_sanity_chk: invalid 
msg id 252, from 2020f should be 1
Nov 21 18:01:00 notice atrcxb3166 osafamfd[6712]: Node 'SC-2' left the cluster
Nov 22 11:44:37 notice atrcxb3166 osafamfd[6712]: 

[tickets] [opensaf:tickets] #1285 MDS TCP: zero bytes recvd results in application exit

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1285] MDS TCP: zero bytes recvd results in application exit**

**Status:** assigned
**Milestone:** future
**Created:** Thu Mar 26, 2015 09:49 AM UTC by Girish
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody


Sometimes an application using OpenSAF exits with the message below:

 Feb 20 15:24:59 fedvm1 RIB[28549]: MDTM:socket_recv() = 0, conn lost with dh 
server, exiting library err :Success
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO 
'safSu=SU1,safSg=app-simplex,safApp=appos' component restart probation timer 
started (timeout: 40 ns)
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO Restarting a component of 
'safSu=SU1,safSg=app-simplex,safApp=appos' (comp restart count: 1)
Feb 20 15:24:59 fedvm1 osafamfnd[28263]: NO 
'safComp=App,safSu=SU1,safSg=app-simplex,safApp=appos' faulted due to 'avaDown' 
: Recovery is 'componentRestart'

Exits at location 
osaf/libs/core/mds/mds_dt_trans.c::mdtm_process_poll_recv_data_tcp

recd_bytes = recv(tcp_cb->DBSRsock, tcp_cb->buffer, local_len_buf, 0);
if (recd_bytes < 0) {
	return;
} else if (0 == recd_bytes) {
	syslog(LOG_ERR, "MDTM:socket_recv() = %d, conn lost with dh server, exiting library err :%d len:%d", recd_bytes, errno,
	       local_len_buf);
	close(tcp_cb->DBSRsock);
	exit(0);
} else if (local_len_buf > recd_bytes) {


local_len_buf turns out to be 0
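
One possible reading of this observation: on Linux, recv() on a stream socket may return 0 simply because zero bytes were requested (the recv(2) man page notes this), so a 0 return with local_len_buf == 0 does not by itself mean the peer closed the connection. That would also fit the "Success" errno text in the log. The sketch below separates the two cases. `demo_classify_recv` and the enum are hypothetical names for illustration, not MDS code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Illustrative classification of a recv() result (hypothetical names). */
enum demo_recv_status {
    DEMO_RECV_ERROR,             /* recv() returned -1; inspect errno */
    DEMO_RECV_NOTHING_REQUESTED, /* 0 bytes were asked for, so 0 is expected */
    DEMO_RECV_PEER_CLOSED,       /* 0 returned although bytes were requested */
    DEMO_RECV_PARTIAL,           /* fewer bytes than requested */
    DEMO_RECV_COMPLETE           /* exactly the requested number of bytes */
};

static enum demo_recv_status demo_classify_recv(ssize_t recd_bytes,
                                                size_t requested)
{
    /* A successful recv() never returns more than was asked for. */
    assert(recd_bytes < 0 || (size_t)recd_bytes <= requested);

    if (recd_bytes < 0)
        return DEMO_RECV_ERROR;
    if (requested == 0)
        return DEMO_RECV_NOTHING_REQUESTED;
    if (recd_bytes == 0)
        return DEMO_RECV_PEER_CLOSED;
    if ((size_t)recd_bytes < requested)
        return DEMO_RECV_PARTIAL;
    return DEMO_RECV_COMPLETE;
}
```

Under this reading, only `DEMO_RECV_PEER_CLOSED` would justify treating the connection as lost.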




[tickets] [opensaf:tickets] #1598 leap: PAYLOAD_BUF_SIZE value is suppose to be equal to MDS_DIRECT_BUF_MAXSIZE

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1598] leap: PAYLOAD_BUF_SIZE value is suppose to be equal to 
MDS_DIRECT_BUF_MAXSIZE**

**Status:** assigned
**Milestone:** future
**Created:** Tue Nov 17, 2015 07:00 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 05:48 PM UTC
**Owner:** nobody


PAYLOAD_BUF_SIZE is supposed to be equal to MDS_DIRECT_BUF_MAXSIZE, that is
(65535 maximum packet size) - (56 MDS header). It was NOT changed as part of
the patch `MDS: Performance improvement [#654]` in release 4.5.FC, because in
OpenSAF releases below 4.5.FC the MDS MDTM_RECV_BUFFER_SIZE (mds_dt_tipc.c)
was limited to (8000 + MDS header). To support in-service upgrade involving
releases below 4.5.FC, the value was left unchanged in 4.5.FC.

Now, from 4.7 to the 4.6/4.5 releases, we can send messages of size
MDS_DIRECT_BUF_MAXSIZE ((65535 maximum packet size) - (56 MDS header)), so in
the current release PAYLOAD_BUF_SIZE, still limited to 8000, can possibly be
adjusted to MDS_DIRECT_BUF_MAXSIZE.


For example (of course not as a static array; we may need to use malloc()):

-#define PAYLOAD_BUF_SIZE 8000 /* default size of packet_data bufrs */
+#define PAYLOAD_BUF_SIZE  ((65535 / 100) * 91)  /* default size of packet_data bufrs */
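
To make the numbers concrete: with integer division, the proposed expression evaluates to 655 * 91 = 59605, while (65535 maximum packet size) - (56 MDS header) is 65479, so the example value stays below the stated maximum rather than being equal to it. A small sketch of that arithmetic follows; the `DEMO_*` names are made up for illustration and are not OpenSAF macros.

```c
#include <assert.h>

/* Illustrative constants only; the names are hypothetical. */
enum {
    DEMO_MAX_PACKET_SIZE    = 65535,
    DEMO_MDS_HEADER_SIZE    = 56,
    /* 65535 - 56 = 65479 */
    DEMO_DIRECT_BUF_MAXSIZE = DEMO_MAX_PACKET_SIZE - DEMO_MDS_HEADER_SIZE,
    /* integer division first: 65535 / 100 = 655, then 655 * 91 = 59605 */
    DEMO_PROPOSED_PAYLOAD   = (DEMO_MAX_PACKET_SIZE / 100) * 91
};

/* Bytes by which the proposed payload size stays below the maximum. */
static int demo_payload_headroom(void)
{
    assert(DEMO_PROPOSED_PAYLOAD <= DEMO_DIRECT_BUF_MAXSIZE);
    return DEMO_DIRECT_BUF_MAXSIZE - DEMO_PROPOSED_PAYLOAD;
}
```

The headroom works out to 65479 - 59605 = 5874 bytes.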





[tickets] [opensaf:tickets] #1700 cpsv: re-create the checkpoint without any sections in case the all replicas is lost

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1700] cpsv: re-create the checkpoint without any sections in case 
the all replicas is lost**

**Status:** assigned
**Milestone:** future
**Created:** Fri Mar 11, 2016 04:15 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Apr 04, 2016 04:18 AM UTC
**Owner:** nobody


This is an extension of the ticket `checkpoint replicas during headless state V3 
[#1621]`.

While working on ticket #1621, the suggestion was to re-create the checkpoint 
without any sections in case all replicas are lost. If the sections were
re-created, the application wouldn't know that data has been lost. I
think the BAD_HANDLE approach is okay since we have used it in other
services, but I see it as kind of a hack solution that is not really
in line with the specs.

The specs never intended BAD_HANDLE to be something that can happen
spontaneously on a previously valid handle, unless you are suffering
from memory corruption. In the future we could consider the
feasibility of avoiding spontaneous BAD_HANDLE where possible, and
in CKPT I think it may be possible by re-creating the checkpoints.


This change is substantial and requires a detailed design covering
different scenarios. I would suggest creating an enhancement ticket for 
this.

More detailed information on the limitations is in the cpsv service's 
README.HEADLESS file.




[tickets] [opensaf:tickets] #1317 ckpt : stale replicas observed in a 70 node cluster

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1317] ckpt : stale replicas observed in a 70 node cluster**

**Status:** assigned
**Milestone:** future
**Created:** Wed Apr 15, 2015 10:16 AM UTC by Sirisha Alla
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- 
[logs.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1317/attachment/logs.tar.bz2)
 (6.5 MB; application/x-bzip)


This issue is observed on cs6377 (46FC Tag). The cluster has 70 nodes, and 2 
checkpoint applications run on each node. The application running on the active 
controller creates the checkpoint, while the applications running on the other 
nodes open the same checkpoint and use it. After sections are created, 
written and read, all the applications finalize the handles they used. The 
retention duration of the checkpoint is set to a minimal value of 1000 
nanoseconds.

Contents of /dev/shm on the active controller after the applications exited:

SLES-64BIT-SLOT1:~ # date;ls -lrt /dev/shm/
Wed Apr 15 14:25:09 IST 2015
total 1772
-rw-r--r-- 1 opensaf opensaf 1076040 Apr 15 13:38 opensaf_NCS_MQND_QUEUE_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  328000 Apr 15 13:38 opensaf_NCS_GLND_RES_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  16 Apr 15 13:38 opensaf_NCS_GLND_LCK_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf   88000 Apr 15 13:38 opensaf_NCS_GLND_EVT_CKPT_INFO
-rw-r--r-- 1 opensaf opensaf  704008 Apr 15 13:38 opensaf_CPND_CHECKPOINT_INFO_131343
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:55 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_4
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:56 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_9
-rw-r--r-- 1 opensaf opensaf   79848 Apr 15 13:57 opensaf_safCkpt=active_replica_ckpt_name_1_sysgrou_131343_16
SLES-64BIT-SLOT1:~ # date;immfind|grep -i ckpt
Wed Apr 15 14:25:11 IST 2015
safApp=safCkptService
SLES-64BIT-SLOT1:~ # 

When an attempt is made to create a checkpoint with the same name again, the 
checkpoint service does not create a new replica in shared memory.

cpd,cpnd traces are attached.




[tickets] [opensaf:tickets] #1679 osaf: enhance TRACE/LOGS of all Opensaf services by adding sender Node Name

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1679] osaf: enhance TRACE/LOGS of all Opensaf services by adding 
sender Node Name**

**Status:** assigned
**Milestone:** future
**Created:** Fri Feb 05, 2016 07:17 AM UTC by A V Mahesh (AVM)
**Last Updated:** Fri Aug 05, 2016 04:13 AM UTC
**Owner:** nobody


The solution for ticket #1522 enhances both the TCP and TIPC transports of 
OpenSAF to provide the node name of the sender as part of ncsmds_callback_info 
(NCSMDS_CALLBACK_INFO).

So now we can debug more efficiently by including the remote node name in LOG 
messages.

See more details in https://sourceforge.net/p/opensaf/tickets/1522/




[tickets] [opensaf:tickets] #2217 mds: optimize use of gl_mds_library_mutex

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False
- **Milestone**: 5.17.08 --> future



---

** [tickets:#2217] mds: optimize use of gl_mds_library_mutex**

**Status:** accepted
**Milestone:** future
**Created:** Tue Dec 06, 2016 09:55 AM UTC by Mathi Naickan
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** nobody


A prototyping exercise to remove this lock was done long ago, but it resulted 
in problems such as out of order. MDS has evolved since then. 
We could revisit the way MDS uses gl_mds_library_mutex.
This ticket aims to identify optimizations in the way MDS uses 
gl_mds_library_mutex.

Details TBD




[tickets] [opensaf:tickets] #1733 Payload got rebooted when cpnd is killed on payload

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1733] Payload got rebooted when cpnd is killed on payload**

**Status:** assigned
**Milestone:** future
**Created:** Wed Apr 06, 2016 11:05 AM UTC by Madhurika Koppula
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** nobody
**Attachments:**

- 
[cpsv.tgz](https://sourceforge.net/p/opensaf/tickets/1733/attachment/cpsv.tgz) 
(15.0 MB; application/octet-stream)


Setup:
Changeset- 7436
Version - opensaf 5.0
4 nodes configured with single PBE

Issue observed (it occurs randomly):

1) When CPND is killed on the payload, the component restart of CPND fails 
because the component registration timer expires.
2) The node goes for reboot while the test application is running.

Below are the timestamped logs from PL-4:

Apr  6 10:52:00 OEL_M-SLOT-4 osafamfnd[3015]: NO 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' component restart probation timer 
started (timeout: 600 ns)
Apr  6 10:52:00 OEL_M-SLOT-4 osafamfnd[3015]: NO Restarting a component of 
'safSu=PL-4,safSg=NoRed,safApp=OpenSAF' (comp restart count: 1)

Apr  6 10:52:00 OEL_M-SLOT-4 osafamfnd[3015]: NO 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'componentRestart'

Apr  6 10:52:00 OEL_M-SLOT-4 osafckptnd[6263]: Started
Apr  6 10:52:10 OEL_M-SLOT-4 osafamfnd[3015]: NO Instantiation of 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' failed
Apr  6 10:52:10 OEL_M-SLOT-4 osafamfnd[3015]: NO Reason: component registration 
timer expired
Apr  6 10:52:10 OEL_M-SLOT-4 osafckptnd[6294]: Started

Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: NO Instantiation of 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' failed

Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: NO Reason: component registration 
timer expired
Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: WA 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF' Presence State RESTARTING 
=> INSTANTIATION_FAILED
Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: NO Component Failover trigerred 
for 'safSu=PL-4,safSg=NoRed,safApp=OpenSAF': Failed component: 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF'
Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: ER 
'safComp=CPND,safSu=PL-4,safSg=NoRed,safApp=OpenSAF'got Inst failed
Apr  6 10:52:20 OEL_M-SLOT-4 osafamfnd[3015]: Rebooting OpenSAF NodeId = 132111 
EE Name = , Reason: NCS component Instantiation failed, OwnNodeId = 132111, 
SupervisionTime = 60
Apr  6 10:52:20 OEL_M-SLOT-4 opensaf_reboot: Rebooting local node; timeout=60
Apr  6 10:52:46 OEL_M-SLOT-4 kernel: imklog 5.8.10, log source = /proc/kmsg 
started.

3) Below is the syslog from the active controller:

Apr  6 10:51:59 OEL_M-SLOT-1 osafimmd[6916]: WA No coordinator IMMND known 
(case B) - ignoring sync request
Apr  6 10:51:59 OEL_M-SLOT-1 osafimmd[6916]: NO Node 2040f request sync 
sync-pid:2980 epoch:0
Apr  6 10:52:24 OEL_M-SLOT-1 kernel: TIPC: Resetting link 
<1.1.1:eth3-1.1.4:eth3>, peer not responding
Apr  6 10:52:24 OEL_M-SLOT-1 kernel: TIPC: Lost link <1.1.1:eth3-1.1.4:eth3> on 
network plane A
Apr  6 10:52:24 OEL_M-SLOT-1 kernel: TIPC: Lost contact with <1.1.4>
Apr  6 10:52:24 OEL_M-SLOT-1 osafamfd[7003]: NO Node 'PL-4' left the cluster
Apr  6 10:52:24 OEL_M-SLOT-1 osafclmd[6988]: NO Node 132111 went down. Not 
sending track callback for agents on that node
Apr  6 10:52:24 OEL_M-SLOT-1 osafclmd[6988]: NO Node 132111 went down. Not 
sending track callback for agents on that node
Apr  6 10:52:24 OEL_M-SLOT-1 osafimmnd[3728]: NO Global discard node received 
for nodeId:2040f pid:2980
Apr  6 10:52:24 OEL_M-SLOT-1 osafimmnd[3728]: NO Implementer connected: 1539 
(MsgQueueService132111) <12283, 2010f>





[tickets] [opensaf:tickets] #1815 mds: suspected message loss in large cluster deployments

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1815] mds: suspected message loss in large cluster deployments**

**Status:** assigned
**Milestone:** future
**Created:** Mon May 09, 2016 06:45 AM UTC by Gary Lee
**Last Updated:** Tue Nov 15, 2016 06:39 AM UTC
**Owner:** nobody


It has been observed that CLM callbacks to amfd can become 'lost'
in a large cluster. It seems to be occurring in MDS, when the callbacks are
sent around the same time as amfd is calling avd_imm_config_get().

It seems avd_imm_config_get() generates a large
amount of traffic through MDS.




[tickets] [opensaf:tickets] #1837 TIPC: loading model gives: "osafimmpbed: ER Failed in saImmOmSearchNext_2:5 - exiting" and "osafimmpbed: ER immpbe.cc dumpObjectsToPbe failed - exiting (line:265)

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1837] TIPC: loading model gives: "osafimmpbed: ER Failed in 
saImmOmSearchNext_2:5 - exiting" and "osafimmpbed: ER immpbe.cc 
dumpObjectsToPbe failed - exiting (line:265)**

**Status:** assigned
**Milestone:** future
**Created:** Wed May 18, 2016 05:41 AM UTC by beatriz brandao
**Last Updated:** Mon Apr 03, 2017 04:59 AM UTC
**Owner:** nobody
**Attachments:**

- 
[C:\Docs\lixo\osaftestLog-2016-04-19_04-04-26.gz](https://sourceforge.net/p/opensaf/tickets/1837/attachment/C%3A%5CDocs%5Clixo%5CosaftestLog-2016-04-19_04-04-26.gz)
 (1.4 MB; application/x-gzip-compressed)


Testcase:
osaftest.tests.amf.functest.config_changes.test_comptype_attr_chg.Test.test_chg_ct_def_disable_restart
Note: this testcase is run with TIPC enabled.

Testcase starts @:
2016-04-19 03:44:28 INFO - TestCase:setUp Start | 
test_chg_ct_def_disable_restart (osaftest.tests.amf.functest.
config_changes.test_comptype_attr_chg.Test)

Testcase ends @:
2016-04-19 03:45:16 DEBUG: Powered off cluster

First analysis done by Zoran:
From syslogs, I cannot see what caused ERR_TIMEOUT in searchNext().

According to MDS logs, it seems that this might be an MDS problem.

From MDS logs:
Apr 19  3:44:36.237379 osaflogd[446] NOTIFY  |MDTM: svc up event for svc_id = 
LGA(21), subscri. by svc_id = LGS(20) 
pwe_id=1 Adest = 
Apr 19  3:44:36.238518 osafntfd[461] NOTIFY  |MDTM: svc up event for svc_id = 
NTFA(29), subscri. by svc_id = NTFS(28) 
pwe_id=1 Adest = 
Apr 19  3:44:36.239261 osafclmd[477] NOTIFY  |MDTM: svc up event for svc_id = 
CLMA(35), subscri. by svc_id = CLMS(34) 
pwe_id=1 Adest = 
Apr 19  3:44:38.788267 osaflogd[446] NOTIFY  |MDTM: svc up event for svc_id = 
LGA(21), subscri. by svc_id = LGS(20) 
pwe_id=1 Adest = 
Apr 19  3:44:44.911298 osafimmpbed[453] ERR  |MDS_SND_RCV: Timeout or Error 
occured
Apr 19  3:44:44.912049 osafimmpbed[453] ERR  |MDS_SND_RCV: Timeout occured on 
sndrsp message
Apr 19  3:44:44.912128 osafimmpbed[453] ERR  |MDS_SND_RCV: 
Adest=<0x0002010f,1637493776>
Apr 19  3:44:44.919827 osafimmnd[432] NOTIFY  |MDTM: svc down event for svc_id 
= IMMA_OM(26), subscri. by svc_id = 
IMMND(25) pwe_id=1 Adest = 
Apr 19  3:44:45.413550 osafimmpbed[679] NOTIFY  |BEGIN MDS LOGGING| 
PID= | ARCHW=a|64bit=1

There was no MDS message between 3:44:38.788267 and 3:44:44.911298.

At 3:44:44.911298, the PBE's MDS send/receive request timed out.
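
The length of that silent window can be checked with a little arithmetic. The sketch below is illustrative only: `to_micros` and `gap_micros` are hypothetical helpers, not part of OpenSAF, and they assume the `H:MM:SS.ffffff` time-of-day format (six fractional digits) seen in the MDS log lines above.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: converts an "H:MM:SS.ffffff" time of day (the format
// assumed from the MDS log lines) to microseconds since midnight.
// Returns -1 if the string does not match the expected shape.
int64_t to_micros(const std::string& hms) {
  int h = 0, m = 0, s = 0, us = 0;
  if (std::sscanf(hms.c_str(), "%d:%d:%d.%6d", &h, &m, &s, &us) != 4) {
    return -1;
  }
  return (h * 3600LL + m * 60LL + s) * 1000000LL + us;
}

// Gap in microseconds between two valid time-of-day stamps on the same day.
int64_t gap_micros(const std::string& from, const std::string& to) {
  return to_micros(to) - to_micros(from);
}
```

For the two stamps quoted above, `gap_micros("3:44:38.788267", "3:44:44.911298")` evaluates to 6123031, i.e. roughly 6.1 seconds with no MDS log entry before the timeout was reported.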





[tickets] [opensaf:tickets] #1929 osaf: Build fails with GCC 6.1.0

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#1929] osaf: Build fails with GCC 6.1.0**

**Status:** assigned
**Milestone:** future
**Created:** Tue Aug 02, 2016 09:21 AM UTC by A V Mahesh (AVM)
**Last Updated:** Tue Sep 20, 2016 05:36 PM UTC
**Owner:** nobody


OpenSAF fails to build with GCC 6.1.0, due to new compiler warnings:
# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-6.1.0/configure --prefix=/usr --enable-shared 
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu 
--enable-languages=c,c++ --disable-multilib --disable-bootstrap 
--with-system-zlib --with-gmp=/usr/local/gmp-6.1.1 
--with-mpfr=/usr/local/mpfr-3.1.4 --with-mpc=/usr/local/mpc-1.0.3
Thread model: posix
gcc version 6.1.0 (GCC)


make[5]: Entering directory `/avm/opensaf/osaf/tools/safimm/immdump'
g++ -DHAVE_CONFIG_H -I. -I../../../..  -DSA_EXTENDED_NAME_SOURCE 
-I../../../../osaf/libs/saf/include -I../../../../osaf/libs/core/include 
-I../../../../osaf/libs/core/leap/include 
-I../../../../osaf/libs/core/mds/include 
-I../../../../osaf/libs/core/common/include  
-I../../../../osaf/libs/common/immsv/include  -Wall -fno-strict-aliasing 
-Werror -fPIC -D__STDC_FORMAT_MACROS -D_FORTIFY_SOURCE=2 -fstack-protector 
-DINTERNAL_VERSION_ID='""'  -I/usr/include/libxml2 -g -O2 -MT 
immdump-imm_dumper.o -MD -MP -MF .deps/immdump-imm_dumper.Tpo -c -o 
immdump-imm_dumper.o `test -f 'imm_dumper.cc' || echo './'`imm_dumper.cc
imm_dumper.cc: In function ‘int main(int, char**)’:
imm_dumper.cc:144:5: error: this ‘if’ clause does not guard... 
[-Werror=misleading-indentation]
 if ((c = getopt_long(argc, argv, "hp:x:c:", long_options, NULL)) == -1)
 ^~
imm_dumper.cc:147:13: note: ...this statement, but the latter is misleadingly 
indented as if it is guarded by the ‘if’
 switch (c) {
 ^~
cc1plus: all warnings being treated as errors
make[5]: *** [immdump-imm_dumper.o] Error 1
make[5]: Leaving directory `/avm/opensaf/osaf/tools/safimm/immdump'
make[4]: *** [all-recursive] Error 1
make[4]: Leaving directory `/avm/opensaf/osaf/tools/safimm'
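
The diagnostic above is about layout rather than behaviour: an unbraced `if` is followed by a statement indented as though it belonged to the `if`. The sketch below is not the actual imm_dumper.cc code (only the two lines quoted in the log are known); `next_action` is a hypothetical function that shows the rejected shape in a comment and one way to make the control flow explicit with braces. The option letters are taken from the `"hp:x:c:"` string in the log.

```cpp
#include <string>

// Shape that GCC 6 rejects under -Werror=misleading-indentation:
//
//     if (c == -1)
//         break;
//         switch (c) { ... }   // indented as though the 'if' guarded it
//
// Bracing the 'if' body and aligning the 'switch' with the 'if' keeps the
// behaviour the same and makes the layout match it.
std::string next_action(int c) {
  if (c == -1) {
    return "stop";  // no more options to parse
  }
  switch (c) {
    case 'h':
    case 'p':
    case 'x':
    case 'c':
      return "known option";
    default:
      return "unknown option";
  }
}
```

An alternative that leaves the code untouched would be to build with `-Wno-error=misleading-indentation`, but fixing the indentation keeps `-Werror` useful for this warning.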




[tickets] [opensaf:tickets] #2082 CKPT : Track cbk not invoked for section creation after cpnd restart

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2082] CKPT : Track cbk not invoked for section creation after cpnd 
restart**

**Status:** assigned
**Milestone:** future
**Created:** Thu Sep 29, 2016 11:06 AM UTC by Srikanth R
**Last Updated:** Tue Nov 15, 2016 06:37 AM UTC
**Owner:** nobody


Changeset: 7997 5.1.FC

The track callback is not invoked after a cpnd restart. Below are the APIs 
called from the applications spawned on two nodes (i.e. payloads).


On the first node:

-> Initialize with cpsv.
-> Create a ckpt with the ACTIVE REPLICA flag.

On the second node:
-> Initialize with cpsv.

On the first node:
-> Open the checkpoint in writing mode.
-> Open the checkpoint in reading mode.
-> Kill the cpnd process.
-> Register for the track callback.

On the second node:
-> Open the ckpt in read mode.
-> Kill the cpnd process.
-> Register for the track callback.


After ensuring that both agents have registered for the track callback, create 
a section from the application on the first node. On section creation, the 
callback should be invoked for the applications on both nodes.

Currently the callback is not invoked for the application on the second node. 
Without a cpnd restart, the callback is invoked for both applications.




[tickets] [opensaf:tickets] #2007 EVT: Service got hanged for 2 hours after saEvtEventPublish

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2007] EVT: Service got hanged for 2 hours after saEvtEventPublish**

**Status:** assigned
**Milestone:** future
**Created:** Wed Sep 07, 2016 09:39 AM UTC by Chani Srivastava
**Last Updated:** Fri Sep 09, 2016 05:20 AM UTC
**Owner:** nobody


OS : Suse PPC 64bit 
Changeset : 7997  ( 5.1.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 
no PBE )


The following APIs are called in sequence:
saEvtInitialize()
saEvtChannelOpen()
saEvtEventAllocate()
saEvtEventAttributesSet()
saEvtEventPublish()

The application hung.

(gdb) bt
0x0fff93e819f4 in .__poll () from /lib64/libc.so.6
1  0x0fff93be45c4 in osaf_poll_no_timeout (io_fds=0xfff9412e728, i_nfds=1) 
at osaf_poll.c:32
2  0x0fff93be4738 in osaf_ppoll (io_fds=0xfff9412e728, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at osaf_poll.c:79
3  0x0fff93bf2f88 in ncs_tmr_wait () at sysf_tmr.c:409
4  0x0fff9402c818 in .start_thread () from /lib64/libpthread.so.0
5  0x0fff93e8db2c in .__clone () from /lib64/libc.so.6
(gdb) thread apply all bt

Thread 4 (LWP 4698):
0  0x0fff94034f30 in .sem_wait () from /lib64/libpthread.so.0
1  0x0fff93c03630 in hm_block_me (cell=0x100b76e0, pool_id=1 '\001') at 
hj_hdl.c:697
2  0x0fff93c025a0 in ncshm_destroy_hdl (id=NCS_SERVICE_ID_EDA, 
uhdl=4028633082) at hj_hdl.c:366
3  0x0fff940dd3bc in eda_channel_hdl_rec_del (list_head=0x100c00e0, 
rm_node=0x100c1c90) at eda_hdl.c:317
4  0x0fff940d7368 in saEvtChannelClose (channelHandle=4028633082) at 
eda_saf_api.c:895
5  0x1002ffcc in tet_saEvtChannelClose (ptrChannelHandle=0x1007a8e8 
)
at src/tet_edsv_wrappers.c:198
6  0x1000ed78 in tet_RetentionTimeClear_Thread () at src/tet_eda.c:4790
7  0x10011804 in tet_invoketp (icnum=300, tpnum=1) at src/tet_eda.c:6279
8  0x10032cbc in call_1tp (icnum=0, tpnum=0, testnum=0) at 
tcm_main.c:581
9  0x100333b0 in call_tps (tpcount=, icnum=) at tcm_main.c:477
10 tet_tcm_main (argc=1, argv=0xfffc40a76c8) at tcm_main.c:432
11 0x10035fa4 in main (argc=,
argv=) at main.c:83

Thread 3 (LWP 4727):
0  0x0fff93e53d68 in .__GI___libc_nanosleep () from /lib64/libc.so.6
1  0x0fff93e53ae0 in .__sleep () from /lib64/libc.so.6
2  0x10031b30 in eda_selection_thread () at src/tet_edsv_wrappers.c:643
3  0x0fff9402c818 in .start_thread () from /lib64/libpthread.so.0
4  0x0fff93e8db2c in .__clone () from /lib64/libc.so.6

Thread 2 (LWP 4701):
0  0x0fff93e819f4 in .__poll () from /lib64/libc.so.6
1  0x0fff93c4b5a8 in mdtm_process_recv_events () at mds_dt_tipc.c:665
2  0x0fff9402c818 in .start_thread () from /lib64/libpthread.so.0
3  0x0fff93e8db2c in .__clone () from /lib64/libc.so.6

Thread 1 (LWP 4700):
0  0x0fff93e819f4 in .__poll () from /lib64/libc.so.6
1  0x0fff93be45c4 in osaf_poll_no_timeout (io_fds=0xfff9412e728, i_nfds=1) 
at osaf_poll.c:32
2  0x0fff93be4738 in osaf_ppoll (io_fds=0xfff9412e728, i_nfds=1, 
i_timeout_ts=0x0, i_sigmask=0x0) at osaf_poll.c:79
3  0x0fff93bf2f88 in ncs_tmr_wait () at sysf_tmr.c:409
4  0x0fff9402c818 in .start_thread () from /lib64/libpthread.so.0
5  0x0fff93e8db2c in .__clone () from /lib64/libc.so.6
(gdb) q





[tickets] [opensaf:tickets] #2011 ckptd seg faulted on active controller when trying to create checkpoint

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False
- **Milestone**: 5.17.08 --> future



---

** [tickets:#2011] ckptd seg faulted on active controller when trying to create 
checkpoint**

**Status:** accepted
**Milestone:** future
**Created:** Thu Sep 08, 2016 07:28 AM UTC by Ritu Raj
**Last Updated:** Mon Apr 10, 2017 01:40 PM UTC
**Owner:** nobody
**Attachments:**

- 
[ckptd_bt](https://sourceforge.net/p/opensaf/tickets/2011/attachment/ckptd_bt) 
(2.6 kB; application/octet-stream)
- 
[messages-20160907.bz2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/messages-20160907.bz2)
 (380.1 kB; application/x-bzip)
- [syslog2](https://sourceforge.net/p/opensaf/tickets/2011/attachment/syslog2) 
(1.4 MB; application/octet-stream)


Environment details

OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 
1PBE enabled with 30K objects )

Summary :

ckptd crashed on active controller when trying to create checkpoint during 
failover

Steps followed & Observed behaviour

1. Initially ran some CKPT test scenarios, along with failovers. After the 
test scenarios ended, the following IMM objects and replicas were not deleted:
sofo-s3:/dev/shm # immfind | grep 101
safCkpt=all_replicas_ckpt_name_101
safCkpt=collocated_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=PL-3\,safCluster=myClmCluster,safCkpt=collocated_ckpt_name_101
safReplica=safNode=SC-1\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101
safReplica=safNode=SC-2\,safCluster=myClmCluster,safCkpt=all_replicas_ckpt_name_101

2. When a ckpt is created with the earlier name (all_replicas_ckpt_name_101), 
the following error is observed in syslog. CkptOpen also failed with ERR_LIBRARY.

>>   saImmOiRtObjectCreate_2 failed with error = 14
>>
Sep  7 17:21:11 sofo-s2 osafimmnd[2137]: NO PBE-OI established on this SC. 
Dumping incrementally to file imm.db
Sep  7 17:21:12 sofo-s2 osafckptd[2284]: ER create_runtime_ckpt_object - 
saImmOiRtObjectCreate_2 failed with error = 14
Sep  7 17:21:12 sofo-s2 osafckptd[2284]: ER create runtime ckpt object failed 
with error: 14
Sep  7 17:21:12 sofo-s2 osafckptd[2284]: ER cpd db add ckpt_node failed for 
ckpt_id:2


4. After some time, ckptd seg faulted on the active controller
>>
Sep  7 17:21:43 sofo-s2 osafamfnd[2187]: NO 
'safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'
Sep  7 17:21:43 sofo-s2 osafamfnd[2187]: ER 
safComp=CPD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
Sep  7 17:21:43 sofo-s2 osafamfnd[2187]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131599, SupervisionTime = 60
Sep  7 17:21:43 sofo-s2 opensaf_reboot: Rebooting local node; timeout=60

5. Below is the backtrace:

0-  0x7fbbd5ffcb20 in memcmp () from /lib64/libc.so.6
1-  0x7fbbd7a10929 in ncs_patricia_tree_get (pTree=0x67b4c8, 
pKey=0x7d22531c "\017\001\002") at patricia.c:435

2-  0x0040800d in cpd_cpnd_info_node_get (cpnd_tree=0x67b4c8, 
dest=0x67ec60, cpnd_info_node=0x7d225350) at cpd_db.c:706

3-  0x0040cd56 in cpd_evt_proc_mds_evt (cb=0x67b340, evt=0x67ec50) at 
cpd_evt.c:1378

4-  0x004091cb in cpd_process_evt (evt=0x67ec40) at cpd_evt.c:107
5-  0x0041185f in cpd_main_process (cb=0x67b340) at cpd_init.c:661
6 - 0x00411b89 in main (argc=1, argv=0x7d225578) at cpd_main.c:74


Notes:
1. Syslog attached
2. bt attached 
3. ckptd traces not enabled
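
The numeric return codes in the syslog excerpts can be mapped back to SA Forum AIS error names. The sketch below lists only a handful of values, as they are commonly defined for `SaAisErrorT`; the table is an assumption to be checked against `saAis.h`, and `ais_error_name` is a hypothetical helper, not an OpenSAF function.

```cpp
#include <string>

// Hypothetical helper: names for a few SaAisErrorT values. The numbers are
// assumptions and should be verified against the AIS header in use.
std::string ais_error_name(int rc) {
  switch (rc) {
    case 1:
      return "SA_AIS_OK";
    case 2:
      return "SA_AIS_ERR_LIBRARY";
    case 5:
      return "SA_AIS_ERR_TIMEOUT";
    case 6:
      return "SA_AIS_ERR_TRY_AGAIN";
    case 14:
      return "SA_AIS_ERR_EXIST";
    case 21:
      return "SA_AIS_ERR_FAILED_OPERATION";
    default:
      return "unknown (" + std::to_string(rc) + ")";
  }
}
```

Under that mapping, `error = 14` from saImmOiRtObjectCreate_2 in step 2 reads as SA_AIS_ERR_EXIST, which would be consistent with the leftover safCkpt=all_replicas_ckpt_name_101 object listed in step 1.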




[tickets] [opensaf:tickets] #2085 CKPT : IMM attributes for ckpt table are increased by 1, when ckpt open returns TIME_OUT

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2085] CKPT : IMM attributes for ckpt table are increased by 1, 
when ckpt open returns TIME_OUT**

**Status:** assigned
**Milestone:** future
**Created:** Fri Sep 30, 2016 05:13 AM UTC by Srikanth R
**Last Updated:** Tue Nov 15, 2016 06:37 AM UTC
**Owner:** nobody


Changeset : 7997 5.1.FC

The IMM attributes of the ckpt table object are increased by 1 when ckpt open 
returns TIME_OUT. Below is the sequence of steps in which the application uses 
CKPT.

-> Initialize with ckpt with callbacks. API returned SA_AIS_OK
-> Invoke selection object. API returned SA_AIS_OK
-> Create a checkpoint using async option. API returned SA_AIS_OK
-> Kill ckpnd process.
-> Check for the callbacks and check the IMM attribute of CKPT object.
The callback is invoked with a return value of ERR_TIMEOUT. The spec mandates 
that the API should be called again to check whether the checkpoint creation 
was successful. If the further call returns ERR_EXIST, the previous call was 
successful; if the further call returns SA_AIS_OK, the previous call was 
unsuccessful.

-> As the callback returned SA_AIS_ERR_TIMEOUT, the async checkpoint creation 
API was invoked again. This time, both the API and the callback returned 
SA_AIS_OK.

Now, if you check the attributes of the CKPT table object, 
saCkptCheckpointNumOpeners, saCkptCheckpointNumReaders and 
saCkptCheckpointNumWriters have a value of 2 instead of the expected value 
of 1.
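
The retry rule described above can be written down as a small decision function. This is only a sketch of the rule as the reporter states it, not of the checkpoint service itself: `Rc`, `FirstCall` and `classify_first_call` are hypothetical names, and the enum is a stand-in for the real `SaAisErrorT` values.

```cpp
// Stand-ins for the SaAisErrorT results involved; not the real enum.
enum class Rc { kOk, kErrTimeout, kErrExist, kOther };

enum class FirstCall { kSucceeded, kFailed, kUndetermined };

// Hypothetical helper encoding the rule quoted in this ticket: after a
// timed-out create, the result of the retry tells whether the first call
// took effect.
FirstCall classify_first_call(Rc first, Rc retry) {
  if (first == Rc::kOk) return FirstCall::kSucceeded;
  if (first != Rc::kErrTimeout) return FirstCall::kFailed;
  switch (retry) {
    case Rc::kErrExist:
      return FirstCall::kSucceeded;  // object already there: first call worked
    case Rc::kOk:
      return FirstCall::kFailed;  // retry created it: first call did not
    default:
      return FirstCall::kUndetermined;
  }
}
```

In the reported run the retry returned SA_AIS_OK, which by this rule means the first open did not take effect. The opener, reader and writer counts of 2 contradict that, which is the inconsistency this ticket describes.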




[tickets] [opensaf:tickets] #2302 mds: replace patricia trees with cpp Map/trees

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2302] mds: replace patricia trees with cpp Map/trees**

**Status:** assigned
**Milestone:** future
**Created:** Mon Feb 13, 2017 06:30 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Feb 13, 2017 06:32 AM UTC
**Owner:** nobody


Replace the NCS PATRICIA TREE DB with a C++ Map
to improve efficiency.
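
A minimal sketch of what such a replacement could look like, assuming a database keyed by a 32-bit node id. `NodeInfo` and `NodeDb` are hypothetical names chosen for illustration; they do not correspond to existing MDS structures. `std::map` keeps keys ordered, so both exact lookup and "next key after" iteration, which tree-based databases typically rely on, remain available.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record type; the real MDS entries carry different fields.
struct NodeInfo {
  std::string name;
};

// Hypothetical ordered database built on std::map.
class NodeDb {
 public:
  // Returns false if the key is already present (nothing is overwritten).
  bool Add(uint32_t node_id, const NodeInfo& info) {
    return nodes_.emplace(node_id, info).second;
  }

  // Exact-match lookup; nullptr if the key is absent.
  const NodeInfo* Find(uint32_t node_id) const {
    auto it = nodes_.find(node_id);
    return it == nodes_.end() ? nullptr : &it->second;
  }

  // Entry with the smallest key strictly greater than node_id, or nullptr.
  const NodeInfo* FindNext(uint32_t node_id) const {
    auto it = nodes_.upper_bound(node_id);
    return it == nodes_.end() ? nullptr : &it->second;
  }

  // Returns true if an entry was removed.
  bool Remove(uint32_t node_id) { return nodes_.erase(node_id) > 0; }

 private:
  std::map<uint32_t, NodeInfo> nodes_;
};
```

Whether this is actually faster than the existing tree depends on key sizes and access patterns, so the efficiency claim would need to be measured.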




[tickets] [opensaf:tickets] #2303 dtm: replace patricia trees with cpp Map/trees

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2303] dtm: replace patricia trees with cpp Map/trees**

**Status:** assigned
**Milestone:** future
**Created:** Mon Feb 13, 2017 06:32 AM UTC by A V Mahesh (AVM)
**Last Updated:** Mon Feb 13, 2017 06:33 AM UTC
**Owner:** nobody


Replace the NCS PATRICIA TREE DB with a C++ Map
to improve efficiency.




[tickets] [opensaf:tickets] #2384 tools: IMM/tools/apitest fix all Cppcheck 1.77 issues

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 
- **Blocker**:  --> False



---

** [tickets:#2384] tools: IMM/tools/apitest fix all Cppcheck 1.77 issues**

**Status:** assigned
**Milestone:** future
**Created:** Fri Mar 17, 2017 04:57 AM UTC by A V Mahesh (AVM)
**Last Updated:** Fri Mar 17, 2017 04:57 AM UTC
**Owner:** nobody


[src/imm/agent/imma_db.cc:264]: (style) C-style pointer casting

[src/imm/apitest/immtest.c:151] -> [src/imm/apitest/immtest.c:184]: (style) 
Variable 'err' is reassigned a value before the old one has been used.
[src/imm/apitest/immtest.c:236] -> [src/imm/apitest/immtest.c:252]: (style) 
Variable 'err' is reassigned a value before the old one has been used.
[src/imm/apitest/implementer/applier.c:340]: (style) Consecutive return, break, 
continue, goto or throw statements are unnecessary.
[src/imm/apitest/implementer/applier.c:202]: (style) The scope of the variable 
'c' can be reduced.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:307]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:415]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:474]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:523]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:618]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:696]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:763]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:848]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:932]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiAdminOperation.c:1017]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_SaImmOiCcb.c:175]: (style) The scope of the 
variable 'ret' can be reduced.
[src/imm/apitest/implementer/test_SaImmOiCcb.c:228]: (style) The scope of the 
variable 'ret' can be reduced.
[src/imm/apitest/implementer/test_SaImmOiRtAttrUpdateCallbackT.c:55] -> 
[src/imm/apitest/implementer/test_SaImmOiRtAttrUpdateCallbackT.c:82]: (style) 
Variable 'err' is reassigned a value before the old one has been used.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:368]: (style) 
The scope of the variable 'ret' can be reduced.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:424]: (style) 
The scope of the variable 'ret' can be reduced.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:491]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:512]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:574]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:593]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:660]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:665]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:683]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:750]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
[src/imm/apitest/implementer/test_saImmOiAugmentCcbInitialize.c:755]: (style) 
Obsolescent function 'usleep' called. It is recommended to use 'nanosleep' or 
'setitimer' instead.
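
Most of the findings above are the same `usleep` warning. One possible way to address them in a single place is a small wrapper around POSIX `nanosleep`, sketched below. `usec_to_timespec` and `sleep_usec` are hypothetical names, not existing apitest helpers, and the sketch assumes a non-negative microsecond count.

```cpp
#include <cerrno>
#include <time.h>

// Hypothetical helper: split a non-negative microsecond count into a timespec.
struct timespec usec_to_timespec(long usec) {
  struct timespec ts;
  ts.tv_sec = usec / 1000000L;
  ts.tv_nsec = (usec % 1000000L) * 1000L;
  return ts;
}

// Hypothetical replacement for usleep(usec): sleeps with nanosleep() and
// resumes with the remaining time if a signal interrupts the sleep.
// Returns 0 on success, -1 on any error other than EINTR.
int sleep_usec(long usec) {
  struct timespec req = usec_to_timespec(usec);
  // POSIX allows the request and remainder arguments to be the same object.
  while (nanosleep(&req, &req) == -1) {
    if (errno != EINTR) return -1;
  }
  return 0;
}
```

A call such as `usleep(500000)` would then become `sleep_usec(500000)`.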

[tickets] [opensaf:tickets] #2444 mds : improve m_NCS_TMR_START() error handling in mds code

2017-08-28 Thread A V Mahesh (AVM) via Opensaf-tickets
- **assigned_to**: A V Mahesh (AVM) -->  nobody 



---

** [tickets:#2444] mds : improve m_NCS_TMR_START()  error handling in mds code**

**Status:** assigned
**Milestone:** future
**Created:** Fri Apr 28, 2017 03:45 AM UTC by A V Mahesh (AVM)
**Last Updated:** Fri Apr 28, 2017 03:45 AM UTC
**Owner:** nobody
**Attachments:**

- 
[mds_tmr_err.patch](https://sourceforge.net/p/opensaf/tickets/2444/attachment/mds_tmr_err.patch)
 (8.4 kB; application/octet-stream)


The attached patch will improve m_NCS_TMR_START()  error handling in mds code.




[tickets] [opensaf:tickets] #162 amfnd used 90 + % CPU on CLM unconfigured node because of saclmDispatch

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 
- **Blocker**: True --> False



---

** [tickets:#162] amfnd used 90 + % CPU on CLM unconfigured node because of 
saclmDispatch**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 04:33 AM UTC by Nagendra Kumar
**Last Updated:** Mon Aug 28, 2017 07:00 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/1339

Steps to reproduce:


1) Bring up 2 controllers and 2 payloads.
2) Lock one payload (CLM node) and delete that node from the database using the 
immcfg utility.
3) Lock the other payload (CLM node).


amfnd on payload one shows 100% CPU utilization.


Reason:
The CLM agent does not destroy the message delivered for an unconfigured node.


Changed 3 years ago by rameshb
- owner changed from sangeeta_meena to Nagendra
- status changed from new to assigned
- component changed from CLM to AvSv

There should not be any dispatch of pending callbacks on an unconfigured node. 
amfnd should handle the "SA_AIS_ERR_UNAVAILABLE" return code in this case 
(for example through CLM finalize or by rebooting the node).
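
The handling suggested in the comment above can be sketched as a small simulation. This is not amfnd code: `FakeClmHandle`, `poll_clm` and the wakeup count are hypothetical names, and the numeric return codes are assumed to match the SAF AIS header values.

```python
SA_AIS_OK = 1                # assumed SAF AIS value
SA_AIS_ERR_UNAVAILABLE = 31  # assumed SAF AIS value


class FakeClmHandle:
    """Hypothetical stand-in for a CLM handle on an unconfigured node."""

    def __init__(self):
        self.finalized = False
        self.dispatch_calls = 0

    def dispatch(self):
        # On an unconfigured node every dispatch reports UNAVAILABLE.
        self.dispatch_calls += 1
        return SA_AIS_ERR_UNAVAILABLE

    def finalize(self):
        self.finalized = True


def poll_clm(handle, wakeups, stop_on_unavailable=True):
    """Simulate `wakeups` poll iterations; return the number of dispatches."""
    for _ in range(wakeups):
        if handle.finalized:
            break  # handle is gone, so its fd is no longer polled
        rc = handle.dispatch()
        if rc == SA_AIS_ERR_UNAVAILABLE and stop_on_unavailable:
            handle.finalize()
    return handle.dispatch_calls
```

With the check enabled the loop dispatches once and stops; without it, every wakeup triggers another dispatch, which is the busy-loop pattern the ticket describes.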


Changed 2 years ago by jfournier
- owner changed from Nagendra to nagendra

Changed 2 years ago by jfournier
- milestone changed from 4.0.RC1 to 4.0.1





[tickets] [opensaf:tickets] #333 amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when SA_AMF_COMPONENT_NAME is not exported.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#333] amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when 
SA_AMF_COMPONENT_NAME is not exported.**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Mon May 27, 2013 04:48 AM UTC by Praveen
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2113.





Unless SA_AMF_COMPONENT_NAME is exported, the health check is not started for 
an unregistered process.

Appendix B on page 442 specifies that saAmfHealthcheckStart can be called in 
the context of any process, whether it is registered or not.





[tickets] [opensaf:tickets] #399 amf: SU admin state not updated after doing controller switchover and admin lock of SU.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#399] amf: SU admin state not updated after doing controller 
switchover and admin lock of SU.**

**Status:** unassigned
**Milestone:** future
**Created:** Fri May 31, 2013 05:21 AM UTC by Praveen
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2879.

changeset : 3796, 4.2.2
model : NplusM

Initial Configuration:

SI equal distribution
 saAmfSGNumPrefInserviceSUs=5 -a saAmfSGMaxActiveSIsperSU=2 -a 
saAmfSGMaxStandbySIsperSU=3 -a saAmfSGNumPrefActiveSUs=3 -a 
saAmfSGNumPrefStandbySUs=2
 saAmfSGAutoAdjust=1

6 SIs in locked state.
 saAmfSIPrefActiveAssignments=1 -a saAmfSIPrefStandbyAssignments=1

5 SUs with the same SURank, set to 5. The admin state of each SU was 
locked-instantiation.
 SU1, SU4, SU5 spawned on SC-1
 SU2 on SC-2
 SU3 on PL-4

Steps:
 1. Brought up the NplusM model with the above configuration.
 2. Performed the unlock-instantiation operation on each SU (SU1 to SU5).
 3. Performed the unlock operation on each SU (SU1 to SU5).
 4. Performed unlock of each SI (SI1 to SI6).

The SUSI assignments were observed to be equally distributed.

 5. On SC-1, triggered a controller switchover from the command line and 
 immediately, on SC-2, triggered the admin lock on SU1.

The controller switchover completed successfully, but the admin lock on SU1 
failed with SA_AIS_ERR_TIMEOUT.

A second attempt to lock SU1 failed with SA_AIS_ERR_NO_OP, and further retries 
kept failing with the same error. amf-state su showed the admin state of SU1 
as UNLOCKED, so the admin state of SU1 was not changing. 
All the SUSI assignments from SU1 were removed, but /var/log/messages was 
printing the messages below:
 

Oct 23 13:01:53 SLOT2 osafimmnd[7176]: Timeout on syncronous admin operation 1
 Oct 23 13:03:47 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
 Oct 23 13:06:15 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
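
One possible reading of these log lines, offered as an assumption rather than a diagnosis: if admin operation 2 is the lock request and "current state (2)" is LOCKED, the director may already treat SU1 as locked internally while the state shown by amf-state is still UNLOCKED(1). The sketch below models that divergence. `Director`, `internal_state` and `published_state` are hypothetical names; this is not osafamfd code.

```python
UNLOCKED = 1  # value printed in the ticket's state dump
LOCKED = 2    # assumed value, matching "current state (2)" in the log


class Director:
    """Toy model with an internal admin state and a separately published one."""

    def __init__(self):
        self.internal_state = UNLOCKED
        self.published_state = UNLOCKED

    def admin_lock(self, interrupted=False):
        if self.internal_state == LOCKED:
            # The operation has no effect on the state the director believes in.
            return "SA_AIS_ERR_NO_OP"
        self.internal_state = LOCKED
        if interrupted:
            # Hypothetical: the operation is cut short before the new state
            # is published, e.g. by a concurrent switchover.
            return "SA_AIS_ERR_TIMEOUT"
        self.published_state = LOCKED
        return "SA_AIS_OK"
```

In this model the first, interrupted lock returns a timeout, every retry returns NO_OP, and the published state never leaves UNLOCKED, which matches the sequence reported in the ticket.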
 

safSu=d_NplusM_1Norm_1,safSg=SG_d_npm,safApp=NpMApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)

safSu=d_NplusM_1Norm_2,safSg=SG_d_npm,safApp=NpMApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)

safSu=d_NplusM_1Norm_3,safSg=SG_d_npm,safApp=NpMApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)

safSu=d_NplusM_1Norm_4,safSg=SG_d_npm,safApp=NpMApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)

safSu=d_NplusM_1Norm_5,safSg=SG_d_npm,safApp=NpMApp
 saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)

safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 saAmfSISUHAState=STANDBY(2)

safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_6,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 saAmfSISUHAState=STANDBY(2)

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 saAmfSISUHAState=STANDBY(2)

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 saAmfSISUHAState=STANDBY(2)

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_4,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_5,safApp=NpMApp
 saAmfSISUHAState=ACTIVE(1)


Changed 7 months ago by shareef

Same issue also 

[tickets] [opensaf:tickets] #178 escalation policy is not happening till the restart count exceeds, instead of reaching saAmfSGCompRestartMax for NPI components

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 
- **Blocker**: False --> True



---

** [tickets:#178] escalation policy is not happening till the restart count 
exceeds, instead of reaching saAmfSGCompRestartMax for NPI components**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 06:24 AM UTC by Nagendra Kumar
**Last Updated:** Mon Aug 28, 2017 07:00 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2144

Error escalation does not happen until the restart count exceeds 
saAmfSGCompRestartMax for NPI components.


According to the spec, however, first-level escalation should happen when the 
restart count reaches saAmfSGCompRestartMax.


From the spec, section 3.11.2.2, page 203:


If this count reaches the saAmfSGCompRestartMax value before the end of the
"component restart" probation period, the Availability Management Framework
performs the first level of recovery escalation for that service unit: the
Availability Management Framework restarts the entire service unit
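
The difference between "exceeds" and "reaches" comes down to the comparison operator. The sketch below only illustrates that distinction; both function names are hypothetical and neither is taken from the AMF source.

```python
def escalates_as_reported(restart_count, comp_restart_max):
    """Behaviour described in the ticket: escalate only once the max is exceeded."""
    return restart_count > comp_restart_max


def escalates_per_spec(restart_count, comp_restart_max):
    """Behaviour quoted from the spec: escalate as soon as the max is reached."""
    return restart_count >= comp_restart_max
```

For example, with saAmfSGCompRestartMax set to 3, the third restart triggers escalation under the spec wording but not under the reported behaviour, which waits for a fourth.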







[tickets] [opensaf:tickets] #1795 AMF : haState should be marked QUIESCING in PG callback for shutdown op

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1795] AMF : haState should be marked QUIESCING in PG callback for 
shutdown op**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Apr 29, 2016 07:24 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Changeset : 7434

For the shutdown operation on the SI, the haState in the protection group 
callback is set to SA_AMF_HA_QUIESCED (3) instead of SA_AMF_HA_QUIESCING (4).


PROTECTION GROUP CALLBACK IS INVOKED
error :  1
numberOfMembers :  2
csiName :  safCsi=CSI1,safSi=TestApp_SI1,safApp=TestApp_TwoN
number of items in notification buffer is  2
{0: {'member': {'haState': 2,
                'compName': safComp=COMP1,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN,
                'rank': 1, 'haReadinessState': 1},
     'change': 1},
 1: {'member': {'haState': **3**,
                'compName': safComp=COMP1,safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN,
                'rank': 2, 'haReadinessState': 1},
     'change': 4}}
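
A check like the one below could be used to spot the problem in such a notification buffer. It is a sketch only: the helper name `comps_in_state` is hypothetical, and the compName values are quoted here so that the dictionary parses as Python.

```python
SA_AMF_HA_QUIESCED = 3   # value quoted in the ticket
SA_AMF_HA_QUIESCING = 4  # value quoted in the ticket

# Notification buffer from the ticket, with compName values turned into strings.
notification_buffer = {
    0: {"member": {"haState": 2,
                   "compName": "safComp=COMP1,safSu=TestApp_SU1,"
                               "safSg=TestApp_SG1,safApp=TestApp_TwoN",
                   "rank": 1, "haReadinessState": 1},
        "change": 1},
    1: {"member": {"haState": 3,
                   "compName": "safComp=COMP1,safSu=TestApp_SU2,"
                               "safSg=TestApp_SG1,safApp=TestApp_TwoN",
                   "rank": 2, "haReadinessState": 1},
        "change": 4},
}


def comps_in_state(buffer, ha_state):
    """Return the compName of every member reporting the given haState."""
    return [entry["member"]["compName"]
            for _, entry in sorted(buffer.items())
            if entry["member"]["haState"] == ha_state]
```

For the buffer above, no member reports SA_AMF_HA_QUIESCING, while the component in TestApp_SU2 reports SA_AMF_HA_QUIESCED.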





[tickets] [opensaf:tickets] #1799 AMF : csiName and csiFlags are not properly populated, during assignment removal ( proxy)

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1799] AMF : csiName and csiFlags are not properly populated, 
during assignment removal ( proxy)**

**Status:** unassigned
**Milestone:** future
**Created:** Sat Apr 30, 2016 06:17 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:58 AM UTC
**Owner:** nobody


Changeset : 7436
Setup : 2N redundancy model with both proxy and proxied components hosted on 
the same node.


* Initially the proxy and proxied components are in the fully assigned state.
* Perform a lock operation on the proxy SU. The quiesced callback and the CSI 
removal callback populate csiFlags as SA_AMF_CSI_TARGET_ALL and csiName as 
NULL, although the proxy component has active assignments, which would be 
removed according to the callback. The same happens for a lock operation on 
the proxied SU.

The expectation is that, for a lock operation on either the proxy or the 
proxied SU, csiFlags is populated as SA_AMF_CSI_TARGET_ONE with the 
corresponding CSI.
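
One way to read this expectation, stated here as an assumption: the callback should name a single CSI whenever the component still holds other assignments, and use the "all" form only when nothing else remains. The helper below is a hypothetical sketch of that rule; it uses the flag names as plain strings and is not AMF code.

```python
def removal_callback_args(csi_to_remove, assigned_csis):
    """Return the (csiFlags, csiName) pair for a CSI removal callback.

    Hypothetical rule: target one CSI if the component keeps other
    assignments, otherwise target all with no CSI name.
    """
    others = [csi for csi in assigned_csis if csi != csi_to_remove]
    if others:
        return ("SA_AMF_CSI_TARGET_ONE", csi_to_remove)
    return ("SA_AMF_CSI_TARGET_ALL", None)
```

Under this rule a proxy component that holds both its own CSI and a proxied one would receive SA_AMF_CSI_TARGET_ONE with the specific CSI name, as the ticket expects.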





[tickets] [opensaf:tickets] #1532 AMF : SI should be reverted to unlocked state, after shutdown operation of SI is rejected

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1532] AMF : SI should be reverted to unlocked state, after 
shutdown operation of SI is rejected**

**Status:** unassigned
**Milestone:** future
**Created:** Thu Oct 08, 2015 11:20 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Changeset : 6901
Application : 2N (two SUs and 4 SIs, with SI1 as sponsor for the remaining SIs)

Steps :

 * Initially all the SIs are in assigned state.
 * Invoked the shutdown operation on one of the dependent SIs, i.e. SI2.
 * For the quiescing callback, the component responded with FAILED_OP.

Oct  8 16:27:20 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCING to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 2)
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'

 * After recovery of SU1, SI2 assignments are also done, which is not expected.

Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'

 * Below is the SI state after the shutdown operation:
 safSi=TestApp_SI2,safApp=TestApp_TwoN
 saAmfSIAdminState=LOCKED(2)
 saAmfSIAssignmentState=FULLY_ASSIGNED(2)

 * A further unlock operation on the SI returned TIMEOUT.
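
The behaviour requested in the ticket title can be sketched as a small state model. `ServiceInstance` and its `shutdown` method are hypothetical names used for illustration; the state names are plain strings, not the AMF enum values.

```python
class ServiceInstance:
    """Toy model of an SI admin state during a shutdown operation."""

    def __init__(self):
        self.admin_state = "UNLOCKED"

    def shutdown(self, quiescing_succeeded):
        # The SI passes through SHUTTING_DOWN while the component quiesces.
        self.admin_state = "SHUTTING_DOWN"
        if quiescing_succeeded:
            self.admin_state = "LOCKED"
        else:
            # Expectation from the ticket: a rejected shutdown reverts the SI
            # to UNLOCKED instead of leaving it LOCKED but fully assigned.
            self.admin_state = "UNLOCKED"
        return self.admin_state
```

With this rule the admin state stays consistent with the assignment state: a LOCKED SI is never left fully assigned after a failed quiescing callback.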




[tickets] [opensaf:tickets] #162 amfnd used 90 + % CPU on CLM unconfigured node because of saclmDispatch

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> True



---

** [tickets:#162] amfnd used 90 + % CPU on CLM unconfigured node because of 
saclmDispatch**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 04:33 AM UTC by Nagendra Kumar
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar






[tickets] [opensaf:tickets] #178 escalation policy is not happening till the restart count exceeds, instead of reaching saAmfSGCompRestartMax for NPI components

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#178] escalation policy is not happening till the restart count 
exceeds, instead of reaching saAmfSGCompRestartMax for NPI components**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 06:24 AM UTC by Nagendra Kumar
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar









[tickets] [opensaf:tickets] #333 amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when SA_AMF_COMPONENT_NAME is not exported.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned



---

** [tickets:#333] amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when 
SA_AMF_COMPONENT_NAME is not exported.**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Mon May 27, 2013 04:48 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 07:59 AM UTC
**Owner:** Nagendra Kumar







[tickets] [opensaf:tickets] #1795 AMF : haState should be marked QUIESCING in PG callback for shutdown op

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#1795] AMF : haState should be marked QUIESCING in PG callback for 
shutdown op**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Apr 29, 2016 07:24 AM UTC by Srikanth R
**Last Updated:** Tue Sep 20, 2016 05:46 PM UTC
**Owner:** Nagendra Kumar







[tickets] [opensaf:tickets] #1799 AMF : csiName and csiFlags are not properly populated, during assignment removal ( proxy)

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#1799] AMF : csiName and csiFlags are not properly populated, 
during assignment removal ( proxy)**

**Status:** unassigned
**Milestone:** future
**Created:** Sat Apr 30, 2016 06:17 AM UTC by Srikanth R
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar







[tickets] [opensaf:tickets] #399 amf: SU admin state not updated after doing controller switchover and admin lock of SU.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#399] amf: SU admin state not updated after doing controller 
switchover and admin lock of SU.**

**Status:** unassigned
**Milestone:** future
**Created:** Fri May 31, 2013 05:21 AM UTC by Praveen
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar






[tickets] [opensaf:tickets] #1532 AMF : SI should be reverted to unlocked state, after shutdown operation of SI is rejected

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **Blocker**:  --> False



---

** [tickets:#1532] AMF : SI should be reverted to unlocked state, after 
shutdown operation of SI is rejected**

**Status:** unassigned
**Milestone:** future
**Created:** Thu Oct 08, 2015 11:20 AM UTC by Srikanth R
**Last Updated:** Tue Jun 07, 2016 11:22 AM UTC
**Owner:** Nagendra Kumar


Changeset : 6901
Application  : 2n ( two SUs and 4 SIs with SI1 as sponsor for the remaining SIs)

Steps :

 * Initially all the SIs are in assigned state.
 * Invoked shutdown operation on one of the dependent SI .i.e SI2.
 *  For the quiescing callback, component responded with FAILED_OP

Oct  8 16:27:20 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCING to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 2)
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'

 * After SU1 recovers, SI2 is also assigned to it again, which is not expected.

Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'

 * Below is the SI state after the shutdown operation:

safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)

 * A subsequent unlock operation on the SI returned TIMEOUT.
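
The behaviour requested in the ticket title can be sketched as a small state model. This is only an illustration of the expected outcome, not OpenSAF code: `AdminState`, `shutdown_si` and the `quiescing_ok` flag are hypothetical names, and the model ignores assignment states, SI dependencies and recovery.

```python
from enum import Enum


class AdminState(Enum):
    """Hypothetical stand-in for the SI admin states named in the report."""
    UNLOCKED = "UNLOCKED"
    SHUTTING_DOWN = "SHUTTING_DOWN"
    LOCKED = "LOCKED"


def shutdown_si(admin_state, quiescing_ok):
    """Return the SI admin state after a shutdown request (illustrative model)."""
    if admin_state is not AdminState.UNLOCKED:
        # Shutdown is only modelled from UNLOCKED; other states are left as-is.
        return admin_state
    # While the QUIESCING assignment is outstanding the SI is SHUTTING_DOWN.
    if quiescing_ok:
        # The component quiesced successfully, so the shutdown completes.
        return AdminState.LOCKED
    # The shutdown was rejected: revert, as the ticket title asks.
    return AdminState.UNLOCKED


# The report shows LOCKED for SI2 after the rejected shutdown.
observed = AdminState.LOCKED
expected = shutdown_si(AdminState.UNLOCKED, quiescing_ok=False)
print(f"observed={observed.value} expected={expected.value}")
```

Under this model, the mismatch between `observed` and `expected` is the defect being reported: the SI ends up LOCKED while it still holds assignments.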


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 4a65aa761b8eb399f96325793f0b1b87edc7e44e
Author: Nagendra Kumar 
Date:   Mon Aug 28 12:15:24 2017 +0530

amfa: return BAD HANDLE in error report or error clear [#248]




---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 08:01 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.
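
The expected ordering of checks can be illustrated with a toy model of a handle table. The ticket does not state the actual cause in the agent library, so this is a sketch under assumptions: `AmfAgentModel`, its methods, and the minimum version `("B", 4, 1)` for the `_4` calls are hypothetical, and return codes are modelled as plain strings.

```python
OK = "SA_AIS_OK"
ERR_VERSION = "SA_AIS_ERR_VERSION"
ERR_BAD_HANDLE = "SA_AIS_ERR_BAD_HANDLE"

# Assumed minimum version for the "_4" calls (illustrative only).
MIN_VERSION_FOR_4 = ("B", 4, 1)


class AmfAgentModel:
    """Toy handle table: maps each live handle to the version it was opened with."""

    def __init__(self):
        self._handles = {}
        self._next_handle = 1

    def initialize(self, version):
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = version
        return handle

    def finalize(self, handle):
        if handle not in self._handles:
            return ERR_BAD_HANDLE
        del self._handles[handle]
        return OK

    def error_report_4(self, handle):
        # Validate the handle first: a finalized handle has no version to check.
        version = self._handles.get(handle)
        if version is None:
            return ERR_BAD_HANDLE
        # Only a live handle can fail the version check.
        if version < MIN_VERSION_FOR_4:
            return ERR_VERSION
        return OK


amf = AmfAgentModel()
handle = amf.initialize(("B", 4, 1))
amf.finalize(handle)
print(amf.error_report_4(handle))
```

With the handle lookup done before the version comparison, a call on a finalized handle reports a bad handle, while a live handle opened with an older version still reports a version error.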


