[tickets] [opensaf:tickets] #3348 mds: fix typo error

2024-02-28 Thread Nagendra Kumar via Opensaf-tickets



---

**[tickets:#3348] mds: fix typo error**

**Status:** unassigned
**Milestone:** 5.24.09
**Created:** Wed Feb 28, 2024 06:13 PM UTC by Nagendra Kumar
**Last Updated:** Wed Feb 28, 2024 06:13 PM UTC
**Owner:** nobody


The following fixes the typo:
diff -rupN ori/src/mds/mds_svc_op.c mod/src/mds/mds_svc_op.c
--- ori/src/mds/mds_svc_op.c 2023-03-28 01:00:36.0 +0100
+++ mod/src/mds/mds_svc_op.c 2024-01-16 16:43:47.506848477 +
@@ -642,8 +642,8 @@ uint32_t mds_svc_op_unsubscribe(const NC
 mds_subtn_tbl_del(svc_hdl, info->info.svc_cancel.i_svc_ids[i]);
 MDS_SVC_LOG_INFO(UNSUBSCRIBE_TAG, info,
 "Unsubscribe to svc_id = %s(%d) successful",
-   get_svc_names(info->info.svc_subscribe.i_svc_ids[i]),
-   info->info.svc_subscribe.i_svc_ids[i]);
+   get_svc_names(info->info.svc_cancel.i_svc_ids[i]),
+   info->info.svc_cancel.i_svc_ids[i]);
 }
 m_MDS_LEAVE();
 return NCSCC_RC_SUCCESS;
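
Why the member name matters, as a minimal sketch. It assumes that `info->info` is a union and that the subscribe and cancel members lay out their fields differently. The `demo_*` types and functions below are hypothetical stand-ins for illustration, not the MDS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; NOT the OpenSAF types. */
typedef struct {
    int scope;
    int view;
    uint8_t num_svcs;
    const int *svc_ids;
} demo_subscribe_info;

typedef struct {
    uint8_t num_svcs;
    const int *svc_ids;
} demo_cancel_info;

typedef struct {
    union {
        demo_subscribe_info svc_subscribe;
        demo_cancel_info svc_cancel;
    } info;
} demo_request;

/* Read the i-th service id of an unsubscribe request through svc_cancel,
 * the member the caller filled in. */
int demo_cancelled_svc_id(const demo_request *req, size_t i)
{
    assert(req != NULL && i < req->info.svc_cancel.num_svcs);
    return req->info.svc_cancel.svc_ids[i];
}

/* Nonzero when svc_ids sits at different offsets in the two members, i.e.
 * when reading svc_subscribe.svc_ids would not see the cancel request's ids. */
int demo_svc_ids_offsets_differ(void)
{
    return offsetof(demo_subscribe_info, svc_ids) !=
           offsetof(demo_cancel_info, svc_ids);
}
```

With these stand-in layouts the two `svc_ids` fields do not overlap, so a log statement that reads the subscribe member after an unsubscribe request would print unrelated data.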



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #2526 amfd: Command unlock nodegroup timeout if su failover is escalated (>= 2SIs)

2022-10-10 Thread Nagendra Kumar via Opensaf-tickets
Trying to reproduce this on the latest OpenSAF.
Thanks
-Nagendra
High Availability Solutions(www.GetHighAvailability.com)


---

** [tickets:#2526] amfd: Command unlock nodegroup timeout if su failover is 
escalated (>= 2SIs)**

**Status:** accepted
**Milestone:** future
**Labels:** nodegroup timeout 
**Created:** Wed Jul 12, 2017 04:23 AM UTC by Minh Hon Chau
**Last Updated:** Mon Oct 10, 2022 12:20 PM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[app3_twon3su3si.xml](https://sourceforge.net/p/opensaf/tickets/2526/attachment/app3_twon3su3si.xml)
 (14.6 kB; text/xml)


- Configuration: 2N app, 3 SIs (model is attached); SU4 and SU5 are hosted on
PL4 and PL5 respectively
- Steps:
. Create a nodegroup consisting of PL4/PL5
. Unlock the nodegroup (ng)
. SU4 is assigned ACTIVE
. While a component of SU5 is being assigned STANDBY, kill a component of SU4
to escalate to a SuFailover
. SU4 now gets the STANDBY assignment and SU5 gets the ACTIVE assignment
. But the unlock ng command is held until TIMEOUT

Note: When the same test is repeated with only **1 SI**, the unlock ng command
returns OK




[tickets] [opensaf:tickets] #2526 amfd: Command unlock nodegroup timeout if su failover is escalated (>= 2SIs)

2022-10-10 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> accepted



---

** [tickets:#2526] amfd: Command unlock nodegroup timeout if su failover is 
escalated (>= 2SIs)**

**Status:** accepted
**Milestone:** future
**Labels:** nodegroup timeout 
**Created:** Wed Jul 12, 2017 04:23 AM UTC by Minh Hon Chau
**Last Updated:** Mon Oct 10, 2022 12:19 PM UTC
**Owner:** Nagendra Kumar



[tickets] [opensaf:tickets] #2526 amfd: Command unlock nodegroup timeout if su failover is escalated (>= 2SIs)

2022-10-10 Thread Nagendra Kumar via Opensaf-tickets
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2526] amfd: Command unlock nodegroup timeout if su failover is 
escalated (>= 2SIs)**

**Status:** assigned
**Milestone:** future
**Labels:** nodegroup timeout 
**Created:** Wed Jul 12, 2017 04:23 AM UTC by Minh Hon Chau
**Last Updated:** Wed Jan 09, 2019 09:43 PM UTC
**Owner:** Nagendra Kumar



[tickets] [opensaf:tickets] #3204 amf: support of node repair feature

2020-07-27 Thread Nagendra Kumar via Opensaf-tickets
Hi Thang,
I have just tested this scenario. It returns TRY_AGAIN, and a syslog entry says
that SC-2 is not part of the CLM cluster. The syslog entry appears because, to
repair a disabled node, the distinction between a node that is down and a node
that is faulty has to be made with CLM.
When you test using amf-adm or immadm and amf (or some other service) returns
TRY_AGAIN, the tool tries again and does not return TRY_AGAIN to you until it
times out.
Thanks
-Nagu
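
The retry behaviour described above can be sketched as follows. This is only an illustration under assumptions: the `demo_*` names and return codes are hypothetical, and this is not the code of amf-adm or immadm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for illustration; not the SA Forum definitions. */
enum demo_rc { DEMO_OK, DEMO_TRY_AGAIN, DEMO_TIMEOUT };

typedef enum demo_rc (*demo_op_fn)(void *ctx);

/* Retry op while it answers TRY_AGAIN. The caller never sees TRY_AGAIN:
 * it gets the final result, or a timeout once max_attempts are used up. */
enum demo_rc demo_call_with_retry(demo_op_fn op, void *ctx,
                                  unsigned max_attempts)
{
    assert(op != NULL);
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        enum demo_rc rc = op(ctx);
        if (rc != DEMO_TRY_AGAIN)
            return rc; /* success or a hard error: hand it straight back */
        /* A real tool would sleep between attempts; omitted here. */
    }
    return DEMO_TIMEOUT;
}

/* Test double: answers TRY_AGAIN until *remaining reaches zero. */
enum demo_rc demo_busy_then_ok(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining > 0) {
        --*remaining;
        return DEMO_TRY_AGAIN;
    }
    return DEMO_OK;
}
```

From the command line this looks like a single long-running call that ends either with the real result or with a timeout.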


---

** [tickets:#3204] amf: support of node repair feature**

**Status:** review
**Milestone:** 5.20.08
**Created:** Mon Jul 20, 2020 11:57 PM UTC by Anand Sundararaj
**Last Updated:** Mon Jul 27, 2020 07:43 AM UTC
**Owner:** Anand Sundararaj


Support of Administrative operation "SA_AMF_ADMIN_REPAIRED" on Amf Node
Amf Specs B.4.1 section: 9.4.10 SA_AMF_ADMIN_REPAIRED




[tickets] [opensaf:tickets] Re: #3205 amf: remove hard-coding in amfnd

2020-07-27 Thread Nagendra Kumar via Opensaf-tickets
That's right, Thang.
Thanks
-Nagu


---

** [tickets:#3205] amf: remove hard-coding in amfnd**

**Status:** review
**Milestone:** 5.20.08
**Created:** Tue Jul 21, 2020 12:35 AM UTC by Anand Sundararaj
**Last Updated:** Mon Jul 27, 2020 06:30 AM UTC
**Owner:** Anand Sundararaj
**Attachments:**

- 
[amfnd_non_root_default.patch](https://sourceforge.net/p/opensaf/tickets/3205/attachment/amfnd_non_root_default.patch)
 (1.2 kB; application/octet-stream)


Amfnd is hard-coded to run as root:
"src/amf/amfnd/main.cc":
  daemonize_as_user("root", argc, argv);
This needs to be removed.

This refers to a user query; the attached patch was provided by Praveen:

On 13-Apr-17 7:27 PM, Carroll, James R wrote:
> Hi,
> 
> I am using openSAF 5.0, and it appears that some of the openSAF (amfnd) 
> daemons are hard-coded to run as root.
> Is there any way to disable this feature, so that I do not have to run the 
> daemon as root?
> 
> I see the following note in the README documentation:
> Only two processes are running as root, amfnd and smfnd. Reason is 
> that amfnd need todo that for backwards compatible reasons and the programs 
> it starts might be designed to require root access.
> 
> We are trying to run all of our programs as non-root.  Regarding the 
> documentation noted above, if we can start all our programs as non-root, then 
> we would not need to run the opensaf as root.

As of now, amfnd is hard-coded to run as root.
Attached are patches for the default and 5.0 branches that enable amfnd to
start as non-root.
After installing OpenSAF, uncomment the "#AMFND_NON_ROOT=1" line in amfnd.conf
to make amfnd run as the user specified in amfnd.conf.
By default it runs as root.

Thanks
Praveen
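
A rough sketch of the selection logic described above, not the attached patch itself. It assumes the AMFND_NON_ROOT setting and the configured user name reach the program as strings; the function name and the "opensaf" account in the comment are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the account to daemonize as.
 * non_root_flag:   value of the AMFND_NON_ROOT setting, or NULL while the
 *                  line is still commented out.
 * configured_user: account named in the configuration (hypothetical input).
 * Falls back to "root" so that the default behaviour stays unchanged. */
const char *demo_amfnd_user(const char *non_root_flag,
                            const char *configured_user)
{
    if (non_root_flag != NULL && strcmp(non_root_flag, "1") == 0 &&
        configured_user != NULL && configured_user[0] != '\0')
        return configured_user;
    return "root";
}

/* Example: demo_amfnd_user(NULL, "opensaf") yields "root", while
 * demo_amfnd_user("1", "opensaf") yields "opensaf". */
```

The result would then be passed to the existing call in place of the hard-coded "root" string.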




[tickets] [opensaf:tickets] #3205 amf: remove hard-coding in amfnd

2020-07-26 Thread Nagendra Kumar via Opensaf-tickets
Hi Thang,
1. This is an option (flexibility) being provided; it does not break anything.
It is off by default.
2. As you can see in the ticket description, the requirement was reported on
the users list, so it looks like a real use case. This helps increase
OpenSAF's adoption across the globe.
3. Many users don't use Smf, and those who start from 5.20.05 onwards needn't
worry about release 4.2, which was released 6 or 7 years ago. So there is no
backward compatibility issue. Do you agree?
Please suggest.
Thanks
-Nagendra


---

** [tickets:#3205] amf: remove hard-coding in amfnd**

**Status:** review
**Milestone:** 5.20.08
**Created:** Tue Jul 21, 2020 12:35 AM UTC by Anand Sundararaj
**Last Updated:** Mon Jul 27, 2020 03:33 AM UTC
**Owner:** Anand Sundararaj



[tickets] [opensaf:tickets] #2898 imm: syslog number of recent fevs messages when immnd down

2018-09-29 Thread Nagendra Kumar via Opensaf-tickets
- **Component**: plm --> imm



---

** [tickets:#2898] imm: syslog number of recent fevs messages when immnd down**

**Status:** fixed
**Milestone:** 5.18.09
**Created:** Tue Jul 17, 2018 05:22 AM UTC by Vu Minh Nguyen
**Last Updated:** Sat Sep 29, 2018 01:17 PM UTC
**Owner:** Danh Vo


We have sometimes encountered the "OUT OF ORDER" error, and it is not easy to
find out which message was lost.

It would help if imm could syslog a number of recent fevs messages (e.g. 5)
when it detects that immnd is down for any reason, showing their sequence
numbers and message ids.
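
One way to keep the last few messages around is a small ring buffer. The sketch below is an assumption about how this could be done, not the implemented fix; all `demo_*` names and the field types are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RECENT_MAX 5 /* the ticket suggests about five entries */

typedef struct {
    uint64_t seq_no;
    uint32_t msg_id;
} demo_fevs_entry;

typedef struct {
    demo_fevs_entry slots[DEMO_RECENT_MAX];
    size_t next;  /* slot the next entry is written to */
    size_t count; /* number of valid entries, at most DEMO_RECENT_MAX */
} demo_fevs_history;

void demo_history_init(demo_fevs_history *h)
{
    assert(h != NULL);
    h->next = 0;
    h->count = 0;
}

/* Record one message, overwriting the oldest entry once the buffer is full. */
void demo_history_record(demo_fevs_history *h, uint64_t seq_no,
                         uint32_t msg_id)
{
    h->slots[h->next].seq_no = seq_no;
    h->slots[h->next].msg_id = msg_id;
    h->next = (h->next + 1) % DEMO_RECENT_MAX;
    if (h->count < DEMO_RECENT_MAX)
        ++h->count;
}

/* Copy the stored entries to out, oldest first; returns how many were copied.
 * out must have room for DEMO_RECENT_MAX entries. */
size_t demo_history_snapshot(const demo_fevs_history *h, demo_fevs_entry *out)
{
    size_t start = (h->next + DEMO_RECENT_MAX - h->count) % DEMO_RECENT_MAX;
    for (size_t i = 0; i < h->count; ++i)
        out[i] = h->slots[(start + i) % DEMO_RECENT_MAX];
    return h->count;
}
```

On detecting that immnd is down, the snapshot could be written to syslog one entry per line.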




[tickets] [opensaf:tickets] #2898 imm: syslog number of recent fevs messages when immnd down

2018-09-29 Thread Nagendra Kumar via Opensaf-tickets
- **Component**: imm --> plm



---

** [tickets:#2898] imm: syslog number of recent fevs messages when immnd down**

**Status:** fixed
**Milestone:** 5.18.09
**Created:** Tue Jul 17, 2018 05:22 AM UTC by Vu Minh Nguyen
**Last Updated:** Thu Aug 30, 2018 12:45 PM UTC
**Owner:** Danh Vo





[tickets] [opensaf:tickets] #2928 osaf: Programmer's Reference and README need update

2018-09-17 Thread Nagendra Kumar via Opensaf-tickets
- **Component**: unknown --> ais
- **Part**: - --> doc



---

** [tickets:#2928] osaf: Programmer's Reference and README need update**

**Status:** unassigned
**Milestone:** 5.18.09
**Created:** Mon Sep 17, 2018 01:24 PM UTC by Nagendra Kumar
**Last Updated:** Mon Sep 17, 2018 01:24 PM UTC
**Owner:** nobody


The Programmer's Reference (PR) and README need to be updated for better
usability, readability and consistency with OpenSAF functionality.

A few of the things to be corrected:
- References to devel list tickets, which no longer exist.
- Formatting is inconsistent across sections.
- References to the old code structure: opensaf/osaf/, osaf/tools
- References to devel list tickets, which are inaccessible.
- Sample output doesn't match the current demo output.
- Spelling mistakes.

Some specific issues:
Ntf:
The following command mentioned in the Ntf PR doesn't work:
ntfsend -T 0x5000 -s 4 --probableCause 74 -a “additional information”

Smf:
The PR mentions smf-bundle-import, smf-bundle-remove and smf-backup-restore,
which don't exist.

Clm:
The PR mentions "(see 1818)", but there is no link to it.




[tickets] [opensaf:tickets] #2928 osaf: Programmer's Reference and README need update

2018-09-17 Thread Nagendra Kumar via Opensaf-tickets



---

** [tickets:#2928] osaf: Programmer's Reference and README need update**

**Status:** unassigned
**Milestone:** 5.18.09
**Created:** Mon Sep 17, 2018 01:24 PM UTC by Nagendra Kumar
**Last Updated:** Mon Sep 17, 2018 01:24 PM UTC
**Owner:** nobody





[tickets] [opensaf:tickets] #209 plmd crashed while deleting plm entities at various points.

2018-09-15 Thread Nagendra Kumar via Opensaf-tickets
This looks very easy to reproduce, so it should be critical.


---

** [tickets:#209] plmd crashed while deleting plm entities at various points.**

**Status:** assigned
**Milestone:** 5.18.09
**Created:** Wed May 15, 2013 07:02 AM UTC by Mathi Naickan
**Last Updated:** Sat Sep 15, 2018 11:26 AM UTC
**Owner:** MeenakshiTK


When the command "immcfg -d safHE=7220_slot_1,safDomain=domain_1" is run, plmd
crashes with a segmentation fault.
The above object has three children: dpb_1, dpb_2 and PL-13.
plmd crashed with the following backtrace:
Program terminated with signal 11, Segmentation fault.
#0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360
1360 if (0 == strcmp(tail->plm_entity->dn_name_str,
(gdb) bt
#0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360
#1 0x08088c8d in plms_chld_get (ent=0x80fe450, chld_list=0xbbc5d9f8) at plms_utils.c:842
#2 0x0805ca90 in plms_delete_objects (obj_type=6, obj_name=0x810a2a8) at plms_imm.c:697
#3 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=68719608079, ccb_id=2) at plms_imm.c:1425
#4 0x032fc46f in imma_process_callback_info (cb=) at imma_proc.c:2005
#5 0x032fb393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592
#6 0x032ebcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548
#7 0x08051bff in main (argc=2, argv=0xbbc5e414) at plms_main.c:484
#8 0x033aee0c in libc_start_main () from /lib/libc.so.6
#9 0x0804c401 in _start ()
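
A defensive form of the comparison in frame #0 might look like the sketch below. It assumes the fault is a NULL pointer somewhere along `tail->plm_entity->dn_name_str`, which the backtrace alone does not confirm. The `demo_*` types are hypothetical stand-ins, not the plmd structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the plmd definitions. */
typedef struct demo_entity {
    const char *dn_name_str;
} demo_entity;

typedef struct demo_node {
    demo_entity *plm_entity;
    struct demo_node *next;
} demo_node;

/* Returns 1 if the list already holds an entity with the given DN.
 * Nodes with a missing entity or a missing name are skipped instead of
 * being dereferenced. */
int demo_list_contains(const demo_node *head, const char *dn)
{
    if (dn == NULL)
        return 0;
    for (const demo_node *n = head; n != NULL; n = n->next) {
        if (n->plm_entity == NULL || n->plm_entity->dn_name_str == NULL)
            continue;
        if (strcmp(n->plm_entity->dn_name_str, dn) == 0)
            return 1;
    }
    return 0;
}
```

If the pointer is dangling rather than NULL (for example a child freed earlier in the same delete), a guard like this would not help and the list bookkeeping would need fixing instead.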

While deleting an entity that has no children, it crashed with the following
backtrace:
#0 0x0805cb14 in plms_delete_objects (obj_type=7, obj_name=0x8109588) at plms_imm.c:707
#1 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=824633852431, ccb_id=6) at plms_imm.c:1425
#2 0x0189c46f in imma_process_callback_info (cb=) at imma_proc.c:2005
#3 0x0189b393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592
#4 0x0188bcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548
#5 0x08051bff in main (argc=2, argv=0xbe1d6db4) at plms_main.c:484
#6 0x0194ee0c in libc_start_main () from /lib/libc.so.6
#7 0x0804c401 in _start ()

Also check the following issue:
Crash in the plmc_err callback when ee_id is passed as an empty string.




[tickets] [opensaf:tickets] #2916 amfd: SU remains OUT_OF_SERVICE when unlock SU during cluster start up

2018-08-27 Thread Nagendra Kumar via Opensaf-tickets
Can you please update the ticket with the change sets?


---

** [tickets:#2916] amfd: SU remains OUT_OF_SERVICE when unlock SU during 
cluster start up**

**Status:** fixed
**Milestone:** 5.18.08
**Created:** Tue Aug 21, 2018 11:09 PM UTC by Minh Hon Chau
**Last Updated:** Thu Aug 23, 2018 10:16 PM UTC
**Owner:** Minh Hon Chau


The problem occurs with NWay Active SUs: all SUs get the assignment except the
SU on SC-1.
Scenario: in a single-step upgrade, after a cluster reboot, SMF issues the
unlock command during cluster startup.
During cluster startup, after SC-2 has joined the cluster but before SC-1
(standby) joins, unlock is issued for the SUs on SC-1 and SC-2. Only the SU on
SC-2 becomes IN_SERVICE.

  3001 2018-08-16T02:52:00.986+0200  SC-2 safApp=safAmfService NO: "Admin op "UNLOCK" initiated for 'safSu=SC-1,safSg=NWA,safApp=QWE-ABC-Amfproxy', invocation: 425201762315"
  3002 2018-08-16T02:52:00.987+0200  SC-2 safApp=safAmfService NO: "safSu=SC-1,safSg=NWA,safApp=QWE-ABC-Amfproxy AdmState LOCKED => UNLOCKED"
  3003 2018-08-16T02:52:00.988+0200  SC-2 safApp=safAmfService NO: "Admin op done for invocation: 425201762315, result 1"
  3004 2018-08-16T02:52:00.990+0200  SC-2 safApp=safAmfService NO: "Admin op "UNLOCK" initiated for 'safSu=SC-2,safSg=NWA,safApp=QWE-ABC-Amfproxy', invocation: 429496729612"
  3005 2018-08-16T02:52:00.990+0200  SC-2 safApp=safAmfService NO: "safSu=SC-2,safSg=NWA,safApp=QWE-ABC-Amfproxy AdmState LOCKED => UNLOCKED"
  3006 2018-08-16T02:52:00.991+0200  SC-2 safApp=safAmfService NO: "safSu=SC-2,safSg=NWA,safApp=QWE-ABC-Amfproxy ReadinessState OUT_OF_SERVICE => IN_SERVICE"

When the "cluster init timeout" expires, amfd starts assigning SIs to all SUs:

  3069 2018-08-16T02:52:10.058+0200  SC-2 safApp=safAmfService NO: "Cluster startup timeout, assigning SIs to SUs"

so we can see that the other SUs were assigned:

  3080 2018-08-16T02:52:10.129+0200  SC-2 safApp=safAmfService NO: "safSi=ABC.main-NWA-1,safApp=QWE-ABC-Amfproxy assigned to safSu=SC-2,safSg=NWA,safApp=QWE-ABC-Amfproxy HA State 'ACTIVE'"

  3311 2018-08-16T02:53:55.745+0200  SC-2 safApp=safAmfService NO: "safSu=PL-4,safSg=NWA,safApp=QWE-ABC-Amfproxy PresenceState INSTANTIATING => INSTANTIATED"
  3312 2018-08-16T02:53:55.745+0200  SC-2 safApp=safAmfService NO: "safSu=PL-4,safSg=NWA,safApp=QWE-ABC-Amfproxy ReadinessState OUT_OF_SERVICE => IN_SERVICE"

  3313 2018-08-16T02:53:55.750+0200  SC-2 safApp=safAmfService NO: "safSi=ABC.main-NWA-1,safApp=QWE-ABC-Amfproxy assigned to safSu=PL-4,safSg=NWA,safApp=QWE-ABC-Amfproxy HA State 'ACTIVE'"

  3360 2018-08-16T02:54:06.555+0200  SC-2 safApp=safAmfService NO: "safSu=PL-3,safSg=NWA,safApp=QWE-ABC-Amfproxy PresenceState INSTANTIATING => INSTANTIATED"
  3361 2018-08-16T02:54:06.555+0200  SC-2 safApp=safAmfService NO: "safSu=PL-3,safSg=NWA,safApp=QWE-ABC-Amfproxy ReadinessState OUT_OF_SERVICE => IN_SERVICE"

  3362 2018-08-16T02:54:06.560+0200  SC-2 safApp=safAmfService NO: "safSi=ABC.main-NWA-1,safApp=QWE-ABC-Amfproxy assigned to safSu=PL-3,safSg=NWA,safApp=QWE-ABC-Amfproxy HA State 'ACTIVE'"

Only the SU on SC-1 is not assigned, because it is still OUT_OF_SERVICE.
The unlock command does not bring the SU IN_SERVICE, because node SC-1 is still
DISABLED.
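
The dependency described above can be sketched as follows. This is a simplification under assumptions: the `demo_*` names are hypothetical, and only three of the states that feed into an SU's readiness are modelled. It is not amfd's actual logic.

```c
#include <assert.h>

typedef enum { DEMO_LOCKED, DEMO_UNLOCKED } demo_admin_state;
typedef enum { DEMO_DISABLED, DEMO_ENABLED } demo_oper_state;
typedef enum { DEMO_OUT_OF_SERVICE, DEMO_IN_SERVICE } demo_readiness;

/* Simplified readiness rule: an unlocked, enabled SU is IN_SERVICE only if
 * its hosting node is ENABLED as well. */
demo_readiness demo_su_readiness(demo_admin_state su_admin,
                                 demo_oper_state su_oper,
                                 demo_oper_state node_oper)
{
    if (su_admin == DEMO_UNLOCKED && su_oper == DEMO_ENABLED &&
        node_oper == DEMO_ENABLED)
        return DEMO_IN_SERVICE;
    return DEMO_OUT_OF_SERVICE;
}

/* Example: with SC-1 still DISABLED,
 * demo_su_readiness(DEMO_UNLOCKED, DEMO_ENABLED, DEMO_DISABLED)
 * yields DEMO_OUT_OF_SERVICE even though the SU itself was unlocked. */
```

Under this rule, an unlock issued while the node is DISABLED changes only the admin state; readiness has to be evaluated again once the node becomes ENABLED.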




[tickets] [opensaf:tickets] #162 amfnd used 90 + % CPU on CLM unconfigured node because of saclmDispatch

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 
- **Blocker**: True --> False



---

** [tickets:#162] amfnd used 90 + % CPU on CLM unconfigured node because of 
saclmDispatch**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 04:33 AM UTC by Nagendra Kumar
**Last Updated:** Mon Aug 28, 2017 07:00 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/1339

Steps to reproduce:


1) Bring up 2 controllers and 2 payloads
2) Lock one payload (CLM node) and delete that node from the database using
the immcfg utility
3) Lock the other payload (CLM node)


amfnd on payload one shows 100% CPU utilization


Reason:
The CLM agent does not destroy the message delivered for the unconfigured node.


Changed 3 years ago by rameshb
- owner changed from sangeeta_meena to Nagendra
- status changed from new to assigned
- component changed from CLM to AvSv
There should not be any dispatch of pending callbacks on an unconfigured node.
I think amfnd should handle the "SA_AIS_ERR_UNAVAILABLE" return code in this
case (say, through CLM finalize or by rebooting the node).
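
The handling suggested above could look roughly like this sketch. It is written against hypothetical `demo_*` callbacks rather than the real CLM API, and it assumes the busy loop comes from polling a handle that keeps reporting an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for illustration; not the SA Forum definitions. */
typedef enum { DEMO_OK, DEMO_ERR_UNAVAILABLE } demo_rc;

typedef struct {
    demo_rc (*dispatch)(void *ctx); /* stands in for the CLM dispatch call */
    void (*finalize)(void *ctx);    /* stands in for the CLM finalize call */
    void *ctx;
    int active; /* nonzero while the handle's fd should stay in the poll set */
} demo_clm_session;

/* Handle one readiness event on the CLM selection object.
 * Returns 1 to keep polling, 0 once the handle has been released. */
int demo_on_clm_event(demo_clm_session *s)
{
    assert(s != NULL);
    if (!s->active)
        return 0;
    if (s->dispatch(s->ctx) == DEMO_ERR_UNAVAILABLE) {
        s->finalize(s->ctx); /* release the handle instead of spinning */
        s->active = 0;
        return 0;
    }
    return 1;
}

/* Test doubles that only count how often they are called. */
typedef struct {
    int dispatch_calls;
    int finalize_calls;
} demo_counters;

demo_rc demo_always_unavailable(void *ctx)
{
    ((demo_counters *)ctx)->dispatch_calls++;
    return DEMO_ERR_UNAVAILABLE;
}

void demo_count_finalize(void *ctx)
{
    ((demo_counters *)ctx)->finalize_calls++;
}
```

Once the session is marked inactive, the caller drops the fd from its poll set, so the event loop no longer wakes up for it.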


Changed 2 years ago by jfournier
- owner changed from Nagendra to nagendra
Changed 2 years ago by jfournier
- milestone changed from 4.0.RC1 to 4.0.1





[tickets] [opensaf:tickets] #333 amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when SA_AMF_COMPONENT_NAME is not exported.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#333] amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when 
SA_AMF_COMPONENT_NAME is not exported.**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Mon May 27, 2013 04:48 AM UTC by Praveen
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2113.





Unless SA_AMF_COMPONENT_NAME is exported, the health check is not started for
an unregistered process.

APPENDIX B on page 442 specifies that saAmfHealthcheckStart can be called in
the context of any process, whether it is registered or not.





[tickets] [opensaf:tickets] #399 amf: SU admin state not updated after doing controller switchover and admin lock of SU.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#399] amf: SU admin state not updated after doing controller 
switchover and admin lock of SU.**

**Status:** unassigned
**Milestone:** future
**Created:** Fri May 31, 2013 05:21 AM UTC by Praveen
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2879.

changeset: 3796, 4.2.2
model: NplusM

Initial Configuration:
SI equal distribution
saAmfSGNumPrefInserviceSUs=5 -a saAmfSGMaxActiveSIsperSU=2 -a saAmfSGMaxStandbySIsperSU=3 -a saAmfSGNumPrefActiveSUs=3 -a saAmfSGNumPrefStandbySUs=2
saAmfSGAutoAdjust=1

6 SIs in locked state.
saAmfSIPrefActiveAssignments=1 -a saAmfSIPrefStandbyAssignments=1

5 SUs with the same SURank, set to 5. Each SU's admin state was
locked-instantiation.
SU1, SU4, SU5 spawned on SC-1
SU2 on SC-2
SU3 on PL-4
 

Steps:
1. Brought up the NplusM model with the above configuration.
2. Performed the unlock-instantiation operation on each SU (SU1 to SU5).
3. Performed the unlock operation on each SU (SU1 to SU5).
4. Unlocked each SI (SI1 to SI6).

Here it was observed that the SUSI assignments were equally distributed.

5. On SC-1, triggered a controller switchover from the command line and
immediately, on SC-2, triggered the admin lock on SU1.

Here it was observed that the controller switchover completed successfully,
but the admin lock on SU1 failed with SA_AIS_ERR_TIMEOUT.
 

Tried again to lock SU1, but this time it failed with SA_AIS_ERR_NO_OP, and it
kept failing with the same error after retries. amf-state su showed the admin
state of SU1 as UNLOCKED, so the admin state of SU1 was not changing.
Observed that all the SUSI assignments from SU1 got removed but the 


/var/log/messages showed the following messages:

Oct 23 13:01:53 SLOT2 osafimmnd[7176]: Timeout on syncronous admin operation 1
Oct 23 13:03:47 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on current state (2)
Oct 23 13:06:15 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on current state (2)
 

safSu=d_NplusM_1Norm_1,safSg=SG_d_npm,safApp=NpMApp
    saAmfSUAdminState=UNLOCKED(1)
    saAmfSUOperState=ENABLED(1)
    saAmfSUPresenceState=INSTANTIATED(3)
    saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_NplusM_1Norm_2,safSg=SG_d_npm,safApp=NpMApp
    saAmfSUAdminState=UNLOCKED(1)
    saAmfSUOperState=ENABLED(1)
    saAmfSUPresenceState=INSTANTIATED(3)
    saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_NplusM_1Norm_3,safSg=SG_d_npm,safApp=NpMApp
    saAmfSUAdminState=UNLOCKED(1)
    saAmfSUOperState=ENABLED(1)
    saAmfSUPresenceState=INSTANTIATED(3)
    saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_NplusM_1Norm_4,safSg=SG_d_npm,safApp=NpMApp
    saAmfSUAdminState=UNLOCKED(1)
    saAmfSUOperState=ENABLED(1)
    saAmfSUPresenceState=INSTANTIATED(3)
    saAmfSUReadinessState=IN-SERVICE(2)
safSu=d_NplusM_1Norm_5,safSg=SG_d_npm,safApp=NpMApp
    saAmfSUAdminState=UNLOCKED(1)
    saAmfSUOperState=ENABLED(1)
    saAmfSUPresenceState=INSTANTIATED(3)
    saAmfSUReadinessState=IN-SERVICE(2)
 

safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_6,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_4,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_5,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)


Changed 7 months ago by shareef 




Same issue a

[tickets] [opensaf:tickets] #178 escalation policy is not happening till the restart count exceeds, instead of reaching saAmfSGCompRestartMax for NPI components

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 
- **Blocker**: False --> True



---

** [tickets:#178] escalation policy is not happening till the restart count 
exceeds, instead of reaching saAmfSGCompRestartMax for NPI components**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 06:24 AM UTC by Nagendra Kumar
**Last Updated:** Mon Aug 28, 2017 07:00 AM UTC
**Owner:** nobody


Migrated from http://devel.opensaf.org/ticket/2144

Error escalation does not happen until the restart count exceeds 
saAmfSGCompRestartMax for components brought up as NPI.


According to the spec, however, first-level escalation should happen when the 
restart count reaches saAmfSGCompRestartMax.


From the spec, section 3.11.2.2, page 203:


If this count reaches the saAmfSGCompRestartMax value before the end of the
"component restart" probation period, the Availability Management Framework
performs the first level of recovery escalation for that service unit: the
Availability Management Framework restarts the entire service unit
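
The difference between "reaches" and "exceeds" comes down to one comparison operator. The following is only a sketch: both function names are hypothetical and are not taken from the amfnd sources, and restart_count stands for the restarts counted inside the probation period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not OpenSAF code: escalate to a service-unit
 * restart as soon as the count *reaches* the configured maximum,
 * which is how the quoted spec text reads. */
static bool escalate_when_reached(uint32_t restart_count,
                                  uint32_t comp_restart_max)
{
    return restart_count >= comp_restart_max;
}

/* Hypothetical helper modelling the behaviour reported in this ticket:
 * escalation only once the count *exceeds* the maximum. */
static bool escalate_when_exceeded(uint32_t restart_count,
                                   uint32_t comp_restart_max)
{
    return restart_count > comp_restart_max;
}
```

With saAmfSGCompRestartMax set to 3, the first helper escalates on the third restart, the second one only on the fourth.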





---



[tickets] [opensaf:tickets] #1795 AMF : haState should be marked QUIESCING in PG callback for shutdown op

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1795] AMF : haState should be marked QUIESCING in PG callback for 
shutdown op**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Apr 29, 2016 07:24 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Changeset : 7434

For the shutdown operation on the SI, the haState is filled up with the value 
SA_AMF_HA_QUIESCED (3), instead of SA_AMF_HA_QUIESCING (4)  in the protection 
group callback.


PROTECTION GROUP CALLBACK IS INVOKED
error :  1
numberOfMembers :  2
csiName :  safCsi=CSI1,safSi=TestApp_SI1,safApp=TestApp_TwoN
number of items in notification buffer is  2
{0: {'member': {'haState': 2, 'compName': 
safComp=COMP1,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN, 'rank': 
1, 'haReadinessState': 1}, 'change': 1}, 1: {'member': {'haState': **3**, 
'compName': 
safComp=COMP1,safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN, 'rank': 
2, 'haReadinessState': 1}, 'change': 4}}
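
As a rough illustration of the expectation, the HA state reported for the member being taken out of service could be derived from the admin operation. This is a sketch under assumptions: the enum and function names are local stand-ins (real code would use SaAmfHAStateT from saAmf.h), and only the numeric values come from the ticket text.

```c
/* Local stand-ins; the numeric values mirror those quoted in the ticket. */
typedef enum {
    HA_ACTIVE = 1,
    HA_STANDBY = 2,
    HA_QUIESCED = 3,
    HA_QUIESCING = 4
} ha_state_t;

/* Hypothetical admin operations on an SI. */
typedef enum { SI_ADMIN_LOCK, SI_ADMIN_SHUTDOWN } si_admin_op_t;

/* A shutdown lets the active member drain work (QUIESCING), while a
 * lock moves it straight to QUIESCED. */
static ha_state_t expected_pg_ha_state(si_admin_op_t op)
{
    return (op == SI_ADMIN_SHUTDOWN) ? HA_QUIESCING : HA_QUIESCED;
}
```

In the output above, the second member would then carry 'haState': 4 rather than 3.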



---



[tickets] [opensaf:tickets] #1799 AMF : csiName and csiFlags are not properly populated, during assignment removal ( proxy)

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1799] AMF : csiName and csiFlags are not properly populated, 
during assignment removal ( proxy)**

**Status:** unassigned
**Milestone:** future
**Created:** Sat Apr 30, 2016 06:17 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:58 AM UTC
**Owner:** nobody


Changeset : 7436
Setup : 2N redundancy model with both proxy and proxied hosted on the same node.


* Initially the proxy and proxied are in the fully assigned state.
* Now perform a lock operation on the proxy SU. The quiesced callback and the 
csi removal callback populate csiFlags as SA_AMF_CSI_TARGET_ALL and csiName 
as NULL. But the proxy component has active assignments, which are to be 
removed according to the callback. The same applies to a lock operation on 
the proxied SU.

 So the expectation is that for a lock operation on either the proxy or the 
proxied SU, csiFlags should be populated as SA_AMF_CSI_TARGET_ONE with the 
corresponding csi.
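
One way to picture the expected behaviour is a helper that picks the flag from how many of the component's CSIs are affected. This is a sketch only: the type, the function, and the counting rule are hypothetical, and the flag constants are local stand-ins whose values are assumed rather than read from saAmf.h.

```c
#include <stddef.h>

/* Local stand-ins for SA_AMF_CSI_TARGET_ONE / SA_AMF_CSI_TARGET_ALL;
 * the numeric values are an assumption. */
enum { CSI_TARGET_ONE = 0x2, CSI_TARGET_ALL = 0x4 };

/* Hypothetical description of what a removal callback would carry. */
typedef struct {
    unsigned flags;
    const char *csi_name; /* NULL when every CSI is targeted */
} csi_removal_t;

/* If only some of the CSIs held by the component are affected, name the
 * CSI and use TARGET_ONE; otherwise target all of them. */
static csi_removal_t describe_removal(const char *csi_name,
                                      size_t csis_removed,
                                      size_t csis_held)
{
    csi_removal_t r;
    if (csis_removed < csis_held) {
        r.flags = CSI_TARGET_ONE;
        r.csi_name = csi_name;
    } else {
        r.flags = CSI_TARGET_ALL;
        r.csi_name = NULL;
    }
    return r;
}
```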



---



[tickets] [opensaf:tickets] #1532 AMF : SI should be reverted to unlocked state, after shutdown operation of SI is rejected

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **assigned_to**: Nagendra Kumar -->  nobody 



---

** [tickets:#1532] AMF : SI should be reverted to unlocked state, after 
shutdown operation of SI is rejected**

**Status:** unassigned
**Milestone:** future
**Created:** Thu Oct 08, 2015 11:20 AM UTC by Srikanth R
**Last Updated:** Mon Aug 28, 2017 06:59 AM UTC
**Owner:** nobody


Changeset : 6901
Application  : 2n ( two SUs and 4 SIs with SI1 as sponsor for the remaining SIs)

Steps :

 * Initially all the SIs are in assigned state.
 * Invoked shutdown operation on one of the dependent SIs, i.e. SI2.
 * For the quiescing callback, the component responded with FAILED_OP.

Oct  8 16:27:20 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCING to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 2)
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'

 * After recovery of SU1, SI2 assignments are also done, which is not expected.

Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'

 * Below is the SI state after the shutdown operation
 safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)

* A further unlock operation on the SI returned TIMEOUT.
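
The behaviour requested in the ticket title can be sketched as a small decision on the outcome of the quiescing callback. All names here are hypothetical; the values 1 and 2 match the states printed above, while 4 for the shutting-down state is an assumption.

```c
/* Local stand-ins for the SI admin states. */
typedef enum {
    ADMIN_UNLOCKED = 1,
    ADMIN_LOCKED = 2,
    ADMIN_SHUTTING_DOWN = 4 /* assumed value */
} admin_state_t;

/* Hypothetical outcome of the quiescing callback. */
typedef enum { QUIESCING_OK, QUIESCING_FAILED_OP } quiescing_result_t;

/* A completed shutdown ends in LOCKED; a rejected one reverts the SI
 * to UNLOCKED instead of leaving it LOCKED but still assigned. */
static admin_state_t admin_state_after_shutdown(quiescing_result_t result)
{
    return (result == QUIESCING_OK) ? ADMIN_LOCKED : ADMIN_UNLOCKED;
}
```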


---



[tickets] [opensaf:tickets] #162 amfnd used 90 + % CPU on CLM unconfigured node because of saclmDispatch

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> True



---

** [tickets:#162] amfnd used 90 + % CPU on CLM unconfigured node because of 
saclmDispatch**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 04:33 AM UTC by Nagendra Kumar
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/1339

Steps to reproduce:


1) Bring up 2 controllers and 2 payloads
2) Lock one payload (CLM node) and delete that node from the database using the 
immcfg utility
3) Lock another payload (CLM node)


amfnd on payload one shows 100% CPU utilization


Reason:
The CLM agent does not destroy the message delivered for the unconfigured node.


Changed 3 years ago by rameshb
 * owner changed from sangeeta_meena to Nagendra
 * status changed from new to assigned
 * component changed from CLM to AvSv
There should not be any dispatch of pending cbks in case of unconfigured node, 
I see amfnd should handle this in case of "SA_AIS_ERR_UNAVAILABLE" return code 
(say through CLM-finalize or through reboot the node).


Changed 2 years ago by jfournier
 * owner changed from Nagendra to nagendra
Changed 2 years ago by jfournier
 * milestone changed from 4.0.RC1 to 4.0.1
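
The handling suggested in the comment above could look roughly like this. It is a sketch, not amfnd code: the struct, the callback type, and the stub are hypothetical, and the return codes are local stand-ins (31 for SA_AIS_ERR_UNAVAILABLE is an assumption).

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the AIS return codes; values are assumed. */
typedef enum { RC_OK = 1, RC_ERR_UNAVAILABLE = 31 } rc_t;

/* Hypothetical dispatch callback, standing in for the CLM dispatch call. */
typedef rc_t (*dispatch_fn)(void *ctx);

typedef struct {
    bool clm_fd_polled;      /* CLM selection object still in the poll set */
    unsigned dispatch_calls; /* how often dispatch was invoked */
} clm_client_t;

/* Called whenever the CLM descriptor is readable. Once dispatch reports
 * that the node is unavailable, stop polling the descriptor so the main
 * loop cannot spin on a message that never gets consumed. */
static void on_clm_fd_readable(clm_client_t *c, dispatch_fn dispatch, void *ctx)
{
    if (!c->clm_fd_polled)
        return;
    c->dispatch_calls++;
    if (dispatch(ctx) == RC_ERR_UNAVAILABLE)
        c->clm_fd_polled = false;
}

/* Demo stub simulating an unconfigured node. */
static rc_t always_unavailable(void *ctx)
{
    (void)ctx;
    return RC_ERR_UNAVAILABLE;
}
```

With the stub, dispatch is invoked exactly once no matter how often the descriptor is reported readable afterwards.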



---



[tickets] [opensaf:tickets] #178 escalation policy is not happening till the restart count exceeds, instead of reaching saAmfSGCompRestartMax for NPI components

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#178] escalation policy is not happening till the restart count 
exceeds, instead of reaching saAmfSGCompRestartMax for NPI components**

**Status:** unassigned
**Milestone:** future
**Created:** Tue May 14, 2013 06:24 AM UTC by Nagendra Kumar
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2144

Error escalation does not happen until the restart count exceeds 
saAmfSGCompRestartMax for components brought up as NPI.


According to the spec, however, first-level escalation should happen when the 
restart count reaches saAmfSGCompRestartMax.


From the spec, section 3.11.2.2, page 203:


If this count reaches the saAmfSGCompRestartMax value before the end of the
"component restart" probation period, the Availability Management Framework
performs the first level of recovery escalation for that service unit: the
Availability Management Framework restarts the entire service unit





---



[tickets] [opensaf:tickets] #333 amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when SA_AMF_COMPONENT_NAME is not exported.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned



---

** [tickets:#333] amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when 
SA_AMF_COMPONENT_NAME is not exported.**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Mon May 27, 2013 04:48 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 07:59 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2113.





Unless SA_AMF_COMPONENT_NAME is exported, the health check is not started for 
an unregistered process.
 

But APPENDIX B on page 442 specifies that saAmfHealthcheckStart can be called in 
the context of any process, whether it is registered or not.
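
Since saAmfHealthcheckStart takes the component name as a parameter, one possible reading of the expectation is that the environment variable is only a fallback. The helper below is a hypothetical sketch of that lookup order and is not taken from the AMF agent.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: prefer the component name passed by the caller
 * and fall back to the SA_AMF_COMPONENT_NAME environment variable.
 * Returns NULL when neither source provides a name. */
static const char *resolve_comp_name(const char *comp_name_arg)
{
    if (comp_name_arg != NULL && comp_name_arg[0] != '\0')
        return comp_name_arg;
    return getenv("SA_AMF_COMPONENT_NAME");
}
```

Under this reading, an unregistered process that passes an explicit name would not need the variable exported.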



---



[tickets] [opensaf:tickets] #1795 AMF : haState should be marked QUIESCING in PG callback for shutdown op

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#1795] AMF : haState should be marked QUIESCING in PG callback for 
shutdown op**

**Status:** unassigned
**Milestone:** future
**Created:** Fri Apr 29, 2016 07:24 AM UTC by Srikanth R
**Last Updated:** Tue Sep 20, 2016 05:46 PM UTC
**Owner:** Nagendra Kumar


Changeset : 7434

For the shutdown operation on the SI, the haState is filled up with the value 
SA_AMF_HA_QUIESCED (3), instead of SA_AMF_HA_QUIESCING (4)  in the protection 
group callback.


PROTECTION GROUP CALLBACK IS INVOKED
error :  1
numberOfMembers :  2
csiName :  safCsi=CSI1,safSi=TestApp_SI1,safApp=TestApp_TwoN
number of items in notification buffer is  2
{0: {'member': {'haState': 2, 'compName': 
safComp=COMP1,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN, 'rank': 
1, 'haReadinessState': 1}, 'change': 1}, 1: {'member': {'haState': **3**, 
'compName': 
safComp=COMP1,safSu=TestApp_SU2,safSg=TestApp_SG1,safApp=TestApp_TwoN, 'rank': 
2, 'haReadinessState': 1}, 'change': 4}}



---



[tickets] [opensaf:tickets] #1799 AMF : csiName and csiFlags are not properly populated, during assignment removal ( proxy)

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#1799] AMF : csiName and csiFlags are not properly populated, 
during assignment removal ( proxy)**

**Status:** unassigned
**Milestone:** future
**Created:** Sat Apr 30, 2016 06:17 AM UTC by Srikanth R
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar


Changeset : 7436
Setup : 2N redundancy model with both proxy and proxied hosted on the same node.


* Initially the proxy and proxied are in the fully assigned state.
* Now perform a lock operation on the proxy SU. The quiesced callback and the 
csi removal callback populate csiFlags as SA_AMF_CSI_TARGET_ALL and csiName 
as NULL. But the proxy component has active assignments, which are to be 
removed according to the callback. The same applies to a lock operation on 
the proxied SU.

 So the expectation is that for a lock operation on either the proxy or the 
proxied SU, csiFlags should be populated as SA_AMF_CSI_TARGET_ONE with the 
corresponding csi.



---



[tickets] [opensaf:tickets] #399 amf: SU admin state not updated after doing controller switchover and admin lock of SU.

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> unassigned
- **Blocker**:  --> False



---

** [tickets:#399] amf: SU admin state not updated after doing controller 
switchover and admin lock of SU.**

**Status:** unassigned
**Milestone:** future
**Created:** Fri May 31, 2013 05:21 AM UTC by Praveen
**Last Updated:** Tue Sep 20, 2016 06:04 PM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2879.

changeset : 3796, 4.2.2
 model : NplusM
 

Initial Configuration:-
 SI equal distribution
 saAmfSGNumPrefInserviceSUs=5 -a saAmfSGMaxActiveSIsperSU=2 -a 
saAmfSGMaxStandbySIsperSU=3 -a saAmfSGNumPrefActiveSUs=3 -a 
saAmfSGNumPrefStandbySUs=2
 saAmfSGAutoAdjust=1
 

6 SIs in locked state.
 saAmfSIPrefActiveAssignments=1 -a saAmfSIPrefStandbyAssignments=1
 

5 SUs with the same SURank, set to 5. The admin state of each SU was 
locked-instantiation.
 SU1, SU4, SU5 spawned on SC-1
 SU2 on SC-2
 SU3 on PL-4
 

Steps:-
 1. Brought up the NplusM model with above configuration.
 2. Performed unlock-instantiation operation on each SUs (SU1 to SU5)
 3. Performed unlock operation on each SUs (SU1 to SU5).
 4. Performed unlock of each SIs (SI1 to SI6)
 

Here observed that SUSI assignments were equally distributed.
 

5. Now on SC-1, command line trigger controller switchover
 and immediately on SC-2, trigger the admin lock on SU1.
 

Here observed that controller switchover successfully completed
 but the admin lock on SU1 failed with SA_AIS_ERR_TIMEOUT.
 

Tried to lock SU1 again, but this time it failed with SA_AIS_ERR_NO_OP. 
Retries kept failing with the same error, SA_AIS_ERR_NO_OP. amf-state su 
showed the admin state of SU1 as UNLOCKED, so the admin state of SU1 
was not changing. 
Observed that all the SUSI assignments from SU1 got removed but the 


/var/log/messages was printing the below messages:-
 

Oct 23 13:01:53 SLOT2 osafimmnd[7176]: Timeout on syncronous admin operation 1
 Oct 23 13:03:47 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
 Oct 23 13:06:15 SLOT2 osafamfd[7225]: Admin operation (2) has no effect on 
current state (2)
 

safSu=d_NplusM_1Norm_1,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_2,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_3,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_4,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSu=d_NplusM_1Norm_5,safSg=SG_d_npm,safApp=NpMApp
 


saAmfSUAdminState=UNLOCKED(1)
 saAmfSUOperState=ENABLED(1)
 saAmfSUPresenceState=INSTANTIATED(3)
 saAmfSUReadinessState=IN-SERVICE(2)
 

safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=SC-2\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed4,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=PL-4\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_6,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_2\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=STANDBY(2)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_1,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_2,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_5\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_4,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_4\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_3,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)
 

safSISU=safSu=d_NplusM_1Norm_3\,safSg=SG_d_npm\,safApp=NpMApp,safSi=d_NplusM_1Norm_5,safApp=NpMApp
 


saAmfSISUHAState=ACTIVE(1)


Changed 7 months ago by s

[tickets] [opensaf:tickets] #1532 AMF : SI should be reverted to unlocked state, after shutdown operation of SI is rejected

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **Blocker**:  --> False



---

** [tickets:#1532] AMF : SI should be reverted to unlocked state, after 
shutdown operation of SI is rejected**

**Status:** unassigned
**Milestone:** future
**Created:** Thu Oct 08, 2015 11:20 AM UTC by Srikanth R
**Last Updated:** Tue Jun 07, 2016 11:22 AM UTC
**Owner:** Nagendra Kumar


Changeset : 6901
Application  : 2n ( two SUs and 4 SIs with SI1 as sponsor for the remaining SIs)

Steps :

 * Initially all the SIs are in assigned state.
 * Invoked shutdown operation on one of the dependent SIs, i.e. SI2.
 * For the quiescing callback, the component responded with FAILED_OP.

Oct  8 16:27:20 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' QUIESCING to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Performing failover of 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' (SU failover count: 2)
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safComp=COMP2,safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' faulted 
due to 'csiSetcallbackTimeout' : Recovery is 'componentFailover'

 * After recovery of SU1, SI2 assignments are also done, which is not expected.

Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN' Presence State 
TERMINATING => INSTANTIATED
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI1,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI2,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'
Oct  8 16:27:30 SYSTEST-PLD-1 osafamfnd[4535]: NO Assigning 
'safSi=TestApp_SI3,safApp=TestApp_TwoN' STANDBY to 
'safSu=TestApp_SU1,safSg=TestApp_SG1,safApp=TestApp_TwoN'

 * Below is the SI state after the shutdown operation
 safSi=TestApp_SI2,safApp=TestApp_TwoN
saAmfSIAdminState=LOCKED(2)
saAmfSIAssignmentState=FULLY_ASSIGNED(2)

* A further unlock operation on the SI returned TIMEOUT.


---



[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-08-28 Thread Nagendra Kumar via Opensaf-tickets
- **status**: review --> fixed
- **Comment**:

commit 4a65aa761b8eb399f96325793f0b1b87edc7e44e
Author: Nagendra Kumar <nagendr...@oracle.com>
Date:   Mon Aug 28 12:15:24 2017 +0530

amfa: return BAD HANDLE in error report or error clear [#248]




---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 08:01 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
 When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both of them return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.



---



[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-07-20 Thread Nagendra Kumar via Opensaf-tickets
The basic reason for returning SA_AIS_ERR_VERSION(3) from a non-registered 
process after finalize is that in Finalize(), pend_dis is zero and 
ncs_ava_shutdown() is called, which in turn calls ava_destroy(). ava_destroy() 
deletes cb and sets gl_ava_hdl to zero.
So, when ComponentErrorReport_4() is called, ava_B4_ver_used(0) returns zero 
because neither cb nor gl_ava_hdl exists, and then the following code returns 
SA_AIS_ERR_VERSION:
  /* Version is previously set in in initialize function */
  if (!ava_B4_ver_used(0)) {
    TRACE_2(
        "Invalid AMF version, set correct AMF version using saAmfInitialize_4. "
        "Required version is: ReleaseCode = 'B', majorVersion = 0x04");
    rc = SA_AIS_ERR_VERSION;
    goto done;
  }

When called from a registered process in Dispatch context, pend_dis is not zero 
and ncs_ava_shutdown() is not called, hence ava_B4_ver_used(0) returns true, 
the code proceeds, and BAD_HANDLE is returned from a later point in the call.
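
The ordering problem described above can be modelled in isolation. This sketch is not the committed fix: the struct and function are hypothetical, the return codes are local stand-ins (3 is quoted above, 1 and 9 are assumed), and it only shows that checking for a torn-down agent before the version check yields BAD_HANDLE.

```c
#include <stdbool.h>

/* Local stand-ins for the AIS return codes. */
typedef enum { RC_OK = 1, RC_ERR_VERSION = 3, RC_ERR_BAD_HANDLE = 9 } rc_t;

/* Hypothetical view of the agent state seen by the API call. */
typedef struct {
    bool agent_alive; /* false once finalize has destroyed the control block */
    bool b4_version;  /* true if the handle was initialized with B.04 */
} agent_state_t;

/* Check for a destroyed agent first: with no control block there is no
 * valid handle, so BAD_HANDLE is reported before the version is looked at. */
static rc_t error_report_precheck(const agent_state_t *s)
{
    if (!s->agent_alive)
        return RC_ERR_BAD_HANDLE;
    if (!s->b4_version)
        return RC_ERR_VERSION;
    return RC_OK;
}
```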


---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** assigned
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Wed Jul 19, 2017 06:41 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
 When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both of them return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.



---



[tickets] [opensaf:tickets] #333 amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when SA_AMF_COMPONENT_NAME is not exported.

2017-07-20 Thread Nagendra Kumar via Opensaf-tickets
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar
- **Part**: - --> nd
- **Blocker**:  --> False
- **Milestone**: future --> 5.17.10



---

** [tickets:#333] amf: saAmfHealthcheckStart returns SA_AIS_ERR_NOT_EXIST when 
SA_AMF_COMPONENT_NAME is not exported.**

**Status:** assigned
**Milestone:** 5.17.10
**Created:** Mon May 27, 2013 04:48 AM UTC by Praveen
**Last Updated:** Mon Apr 03, 2017 06:47 PM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2113.





Unless SA_AMF_COMPONENT_NAME is exported, the health check is not started for 
an unregistered process.

However, Appendix B on page 442 specifies that saAmfHealthcheckStart can be 
called in the context of any process, whether it is registered or not.
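
As a rough illustration of the dependency on the environment variable, here is a minimal sketch. `resolve_comp_name()` is a hypothetical helper, not an OpenSAF function. The idea of preferring a name passed by the caller over the environment is an assumption made for the example, not the behaviour of the library.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: pick the component name for a health check request.
// Prefer the name passed by the caller and fall back to the environment only
// when no name was passed. An empty string means no name could be resolved.
std::string resolve_comp_name(const char* caller_name) {
  if (caller_name != nullptr && caller_name[0] != '\0') {
    return std::string(caller_name);
  }
  const char* env = std::getenv("SA_AMF_COMPONENT_NAME");
  if (env != nullptr && env[0] != '\0') {
    return std::string(env);
  }
  return std::string();  // the caller decides which error code to return
}
```

In this sketch an unregistered process that passes the component name explicitly does not need the variable to be exported.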





[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-07-20 Thread Nagendra Kumar via Opensaf-tickets
The patch is tested on CS #8791


---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 07:56 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both of them return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.





[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-07-20 Thread Nagendra Kumar via Opensaf-tickets
Attached the patch.


Attachments:

- 
[248_1.patch](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/e78dae72/2a55/attachment/248_1.patch)
 (1.1 kB; application/octet-stream)


---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** assigned
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 07:55 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both of them return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.





[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-07-20 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> review
- **Milestone**: 5.2.FC --> 5.17.10



---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Thu Jul 20, 2017 07:55 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are called 
after finalizing the amfHandle (by calling saAmfFinalize()), both of them return 
SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.





[tickets] [opensaf:tickets] #248 "amf: Incorrect return code from saAmfComponentErrorReport_4 () and saAmfComponentErrorClear_4()".

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
- **status**: unassigned --> assigned
- **Blocker**:  --> False



---

** [tickets:#248] "amf: Incorrect return code from saAmfComponentErrorReport_4 
() and saAmfComponentErrorClear_4()". **

**Status:** assigned
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:41 AM UTC by Praveen
**Last Updated:** Wed Apr 12, 2017 06:50 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2817.

Changeset:3728
When saAmfComponentErrorReport_4() and saAmfComponentErrorClear_4() are 
called after finalizing the amfHandle (by calling saAmfFinalize()), both of them 
return SA_AIS_ERR_VERSION instead of SA_AIS_ERR_BAD_HANDLE.





[tickets] [opensaf:tickets] #243 amf:Response_4 fails with ERR_VERSION even when invoked with correct versioned handle

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
- **status**: assigned --> invalid
- **Milestone**: future --> never
- **Comment**:

So, marking as invalid



---

** [tickets:#243] amf:Response_4 fails with ERR_VERSION even when invoked with 
correct versioned handle**

**Status:** invalid
**Milestone:** never
**Created:** Thu May 16, 2013 06:32 AM UTC by Praveen
**Last Updated:** Wed Jul 19, 2017 05:47 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[amf_demo_243.c](https://sourceforge.net/p/opensaf/tickets/243/attachment/amf_demo_243.c)
 (15.4 kB; application/octet-stream)


Migrated from http://devel.opensaf.org/ticket/2877.

The issue is seen on SLES 64bit VMs.
 

The component initially initializes with B.4.1. Another initialize is invoked 
with B.1.1 version. When callbacks arrived at the component, Response_4 is 
invoked with the handle obtained from B.4.1 initialization. Response_4 returned 
SA_AIS_ERR_VERSION.

Output from the component log:
Invoking the function  with the arguments 
(4289724417, 4271898630L, None, 1)
 ('Return Value of the function : <—', 1) ==> Response_4 invoked before doing 
Initialize with B.1.1 version
Invoking the function  
('Return Value of the function : <—', [1, 4290772994])
Invoking the function  with the arguments 
(4289724417, 4287627277L, None, 1) 
('Return Value of the function : <—', 3) ===> same handle as used before but 
got ERR_VERSION now.
 

Traces can be shared if required.





[tickets] [opensaf:tickets] #243 amf:Response_4 fails with ERR_VERSION even when invoked with correct versioned handle

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
Jul 19 11:11:24 PM_SC-1 osafamfnd[30940]: NO 
'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1' Presence State INSTANTIATING => 
INSTANTIATED
Jul 19 11:11:24 PM_SC-1 amf_demo[31113]: HC started with AMF 4
Jul 19 11:11:24 PM_SC-1 amf_demo[31113]: 1. saAmfComponentRegister FAILED, But 
Continueing 14
Jul 19 11:11:24 PM_SC-1 amf_demo[31113]: HC started with AMF for 1
Jul 19 11:11:24 PM_SC-1 amf_demo[31113]: Registered with AMF
Jul 19 11:11:24 PM_SC-1 amf_demo[31113]: Health check 1
Jul 19 11:11:34 PM_SC-1 amf_demo[31113]: Health check 2
Jul 19 11:11:44 PM_SC-1 amf_demo[31113]: Health check 3
Jul 19 11:11:54 PM_SC-1 amf_demo[31113]: Health check 4

The response was given to Amf 4.1 callback.


---

** [tickets:#243] amf:Response_4 fails with ERR_VERSION even when invoked with 
correct versioned handle**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:32 AM UTC by Praveen
**Last Updated:** Wed Jul 19, 2017 05:45 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[amf_demo_243.c](https://sourceforge.net/p/opensaf/tickets/243/attachment/amf_demo_243.c)
 (15.4 kB; application/octet-stream)


Migrated from http://devel.opensaf.org/ticket/2877.

The issue is seen on SLES 64bit VMs.
 

The component initially initializes with B.4.1. Another initialize is invoked 
with B.1.1 version. When callbacks arrived at the component, Response_4 is 
invoked with the handle obtained from B.4.1 initialization. Response_4 returned 
SA_AIS_ERR_VERSION.

Output from the component log:
Invoking the function  with the arguments 
(4289724417, 4271898630L, None, 1)
 ('Return Value of the function : <—', 1) ==> Response_4 invoked before doing 
Initialize with B.1.1 version
Invoking the function  
('Return Value of the function : <—', [1, 4290772994])
Invoking the function  with the arguments 
(4289724417, 4287627277L, None, 1) 
('Return Value of the function : <—', 3) ===> same handle as used before but 
got ERR_VERSION now.
 

Traces can be shared if required.





[tickets] [opensaf:tickets] #243 amf:Response_4 fails with ERR_VERSION even when invoked with correct versioned handle

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
Please find attached the sample file used in testing.
I tested on CS #8791 and did not see any issue. The health check continued.


---

** [tickets:#243] amf:Response_4 fails with ERR_VERSION even when invoked with 
correct versioned handle**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:32 AM UTC by Praveen
**Last Updated:** Wed Jul 19, 2017 05:43 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[amf_demo_243.c](https://sourceforge.net/p/opensaf/tickets/243/attachment/amf_demo_243.c)
 (15.4 kB; application/octet-stream)


Migrated from http://devel.opensaf.org/ticket/2877.

The issue is seen on SLES 64bit VMs.
 

The component initially initializes with B.4.1. Another initialize is invoked 
with B.1.1 version. When callbacks arrived at the component, Response_4 is 
invoked with the handle obtained from B.4.1 initialization. Response_4 returned 
SA_AIS_ERR_VERSION.

Output from the component log:
Invoking the function  with the arguments 
(4289724417, 4271898630L, None, 1)
 ('Return Value of the function : <—', 1) ==> Response_4 invoked before doing 
Initialize with B.1.1 version
Invoking the function  
('Return Value of the function : <—', [1, 4290772994])
Invoking the function  with the arguments 
(4289724417, 4287627277L, None, 1) 
('Return Value of the function : <—', 3) ===> same handle as used before but 
got ERR_VERSION now.
 

Traces can be shared if required.





[tickets] [opensaf:tickets] #243 amf:Response_4 fails with ERR_VERSION even when invoked with correct versioned handle

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
- Attachments has changed:

Diff:



--- old
+++ new
@@ -0,0 +1 @@
+amf_demo_243.c (15.4 kB; application/octet-stream)






---

** [tickets:#243] amf:Response_4 fails with ERR_VERSION even when invoked with 
correct versioned handle**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:32 AM UTC by Praveen
**Last Updated:** Tue Jul 18, 2017 10:34 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[amf_demo_243.c](https://sourceforge.net/p/opensaf/tickets/243/attachment/amf_demo_243.c)
 (15.4 kB; application/octet-stream)


Migrated from http://devel.opensaf.org/ticket/2877.

The issue is seen on SLES 64bit VMs.
 

The component initially initializes with B.4.1. Another initialize is invoked 
with B.1.1 version. When callbacks arrived at the component, Response_4 is 
invoked with the handle obtained from B.4.1 initialization. Response_4 returned 
SA_AIS_ERR_VERSION.

Output from the component log:
Invoking the function  with the arguments 
(4289724417, 4271898630L, None, 1)
 ('Return Value of the function : <—', 1) ==> Response_4 invoked before doing 
Initialize with B.1.1 version
Invoking the function  
('Return Value of the function : <—', [1, 4290772994])
Invoking the function  with the arguments 
(4289724417, 4287627277L, None, 1) 
('Return Value of the function : <—', 3) ===> same handle as used before but 
got ERR_VERSION now.
 

Traces can be shared if required.





[tickets] [opensaf:tickets] #243 amf:Response_4 fails with ERR_VERSION even when invoked with correct versioned handle

2017-07-18 Thread Nagendra Kumar via Opensaf-tickets
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar
- **Part**: - --> lib
- **Blocker**:  --> False



---

** [tickets:#243] amf:Response_4 fails with ERR_VERSION even when invoked with 
correct versioned handle**

**Status:** assigned
**Milestone:** future
**Created:** Thu May 16, 2013 06:32 AM UTC by Praveen
**Last Updated:** Mon Apr 03, 2017 06:47 PM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2877.

The issue is seen on SLES 64bit VMs.
 

The component initially initializes with B.4.1. Another initialize is invoked 
with B.1.1 version. When callbacks arrived at the component, Response_4 is 
invoked with the handle obtained from B.4.1 initialization. Response_4 returned 
SA_AIS_ERR_VERSION.

Output from the component log:
Invoking the function  with the arguments 
(4289724417, 4271898630L, None, 1)
 ('Return Value of the function : <—', 1) ==> Response_4 invoked before doing 
Initialize with B.1.1 version
Invoking the function  
('Return Value of the function : <—', [1, 4290772994])
Invoking the function  with the arguments 
(4289724417, 4287627277L, None, 1) 
('Return Value of the function : <—', 3) ===> same handle as used before but 
got ERR_VERSION now.
 

Traces can be shared if required.





[tickets] [opensaf:tickets] #2477 amfd: Cyclic reboot after SC absence period (in large cluster)

2017-06-07 Thread Nagendra Kumar via Opensaf-tickets
Hi Minh,
I agree that we need to avoid rebooting the controllers, 
but I am not sure that avoiding the assert is the way to do it. Let me check.

Thanks
-Nagu


---

** [tickets:#2477] amfd: Cyclic reboot after SC absence period (in large 
cluster)**

**Status:** review
**Milestone:** 5.17.06
**Labels:** assignment failover during stop of both SC 2416 
**Created:** Fri Jun 02, 2017 06:17 AM UTC by Minh Hon Chau
**Last Updated:** Wed Jun 07, 2017 09:00 AM UTC
**Owner:** Minh Hon Chau


The problem in this ticket occurs in the same scenario as reported in #2416.

After the SC absence period, amfd hits osafassert() and dumps core, and the 
problem repeats.

One of the patches for #2416 tried to call IMM sync as soon as possible, which 
works fine in a small cluster (5 nodes). In a large cluster of about 75 nodes, 
however, the change to the IMM sync calls has almost no effect.

In #2416, a problem was seen, under the assumption of unreliable IMM sync 
calls, in which after the SC absence period amfd had 3 assignments for a 2N SG: 
2 STANDBY SUSIs and 1 ACTIVE SUSI. It was fixed by the commit "amfd: Add 
iteration to failover all absent assignments [#2416]" (refer to: 
https://sourceforge.net/p/opensaf/tickets/2416/#f83b).

Another variant of the problem of unreliable IMM calls before both SCs go down 
is that amfd can have both SUs with ACTIVE assignments, which leads to the 
assert. So far this problem has been seen only in a large cluster.


Details of coredump:
 
~~~
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafamfd'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f784279b0c7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install 
opensaf-amf-director-debuginfo-5.2.0-469.0.6128a2d.sle12.x86_64
(gdb) bt full
#0  0x7f784279b0c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x7f784279c478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x7f78435fdf4e in __osafassert_fail (__file=, 
__line=, __func=, 
__assertion=) at ../../opensaf/src/base/sysf_def.c:286
No locals.
#3  0x7f78445671e8 in avd_sg_2n_act_susi (sg=, 
stby_susi=stby_susi@entry=0x7ffeef034998, cb=0x7f78447f2e80 <_control_block>)
at ../../opensaf/src/amf/amfd/sg_2n_fsm.cc:596
susi = 
a_susi_2 = 0x7f7845e0d0c0
s_susi_1 = 0x7f7845e0d0c0
su_2 = 
t_ = {trace_leave_called = false, file_ = 0x0, function_ = 0x0}
s_susi_2 = 0x7f7845e2a030
a_susi = 0x0
a_susi_1 = 0x7f7845e2a030
s_susi = 0x0
su_1 = 0x7f7845d69e60
#4  0x7f784456d5d6 in SG_2N::node_fail (this=0x7f7845d5f4f0, 
cb=0x7f78447f2e80 <_control_block>, su=0x7f7845d69e60)
at ../../opensaf/src/amf/amfd/sg_2n_fsm.cc:3402
a_susi = 
s_susi = 0x7f7845d69a68
o_su = 
flag = 
__FUNCTION__ = "node_fail"
su_ha_state = 
t_ = {trace_leave_called = false, file_ = 0x0, function_ = 0x0}
#5  0x7f784455de1a in AVD_SG::failover_absent_assignment 
(this=0x7f7845d5f4f0) at ../../opensaf/src/amf/amfd/sg.cc:2307
t_ = {trace_leave_called = false, file_ = 0x0, function_ = 0x0}
__FUNCTION__ = "failover_absent_assignment"
failed_su = 0x7f7845d69e60
#6  0x7f7844514125 in avd_cluster_tmr_init_evh (cb=0x7f78447f2e80 
<_control_block>, evt=)
at ../../opensaf/src/amf/amfd/cluster.cc:103
i_sg = 0x7f7845d5f4f0
__for_range = @0x7f7845ca2a90: {db = {_M_t = {
  _M_impl = 
{ const, AVD_SG*> > >> = 
{<__gnu_cxx::new_allocator const, AVD_SG*> > >> = {}, }, 
_M_key_compare = {, std::basic_string, bool>> = {}, 
}, _M_header = {_M_color = std::_S_red, 
  _M_parent = 0x7f7845d515e0, _M_left = 0x7f7845d03ed0, 
_M_right = 0x7f7845d81580}, _M_node_count = 28
t_ = {trace_leave_called = false, file_ = 0x0, function_ = 0x0}
__FUNCTION__ = "avd_cluster_tmr_init_evh"
su = 0x0
node = 
#7  0x7f784453ca2c in process_event (cb_now=0x7f78447f2e80 
<_control_block>, evt=0x7f78340013d0) at ../../opensaf/src/amf/amfd/main.cc:775
t_ = {trace_leave_called = false, file_ = 0x0, function_ = 0x0}
__FUNCTION__ = "process_event"
#8  0x7f78444f6abe in main_loop () at ../../opensaf/src/amf/amfd/main.cc:691
pollretval = 
evt = 0x7f78340013d0
polltmo = 0
term_fd = 24
cb = 0x7f78447f2e80 <_control_block>
error = 
old_sync_state = AVD_STBY_OUT_OF_SYNC
#9  main (argc=, argv=) at 
../../opensaf/src/amf/amfd/main.cc:848
No locals.
~~~
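
The two inconsistent states mentioned in this ticket (two STANDBY plus one ACTIVE in #2416, two ACTIVE here) can be expressed as a simple check. This is a minimal sketch only. `Assignment`, `Check2N` and `check_2n()` are hypothetical names and do not reflect the amfd data structures or the proposed patch.

```cpp
#include <string>
#include <vector>

enum class HaState { kActive, kStandby };

// Hypothetical, simplified view of one SU-SI assignment.
struct Assignment {
  std::string su;
  HaState state;
};

enum class Check2N { kConsistent, kMultipleActive, kMultipleStandby };

// For one SI in a 2N SG, expect at most one ACTIVE and at most one STANDBY
// assignment. Report the inconsistency instead of asserting on it.
Check2N check_2n(const std::vector<Assignment>& assignments) {
  int active = 0;
  int standby = 0;
  for (const auto& a : assignments) {
    if (a.state == HaState::kActive) {
      ++active;
    } else {
      ++standby;
    }
  }
  if (active > 1) return Check2N::kMultipleActive;
  if (standby > 1) return Check2N::kMultipleStandby;
  return Check2N::kConsistent;
}
```

A caller could use the result to choose a recovery action rather than aborting. Which action is appropriate is outside the scope of this sketch.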



---


[tickets] [opensaf:tickets] #2477 amfd: Cyclic reboot after SC absence period (in large cluster)

2017-06-07 Thread Nagendra Kumar via Opensaf-tickets
Also note that this is documented as a limitation in the AMF PR document, as 
quoted below, so this ticket qualifies as an enhancement (as #2416 could have 
as well):

2.2.11.3 Limitations
• Possible loss of RTA updates and SI assignment messages
If both SCs go down abruptly (SCs are immediately powered-off for instance), 
AMFD could fail to update RTA to IMM, the SI assignment messages sent from 
AMFND could not reach to AMFD, or vice versa. In such cases, recovery could be 
impossible, applications may have inappropriate assignment states.



---

** [tickets:#2477] amfd: Cyclic reboot after SC absence period (in large 
cluster)**

**Status:** review
**Milestone:** 5.17.06
**Labels:** assignment failover during stop of both SC 2416 
**Created:** Fri Jun 02, 2017 06:17 AM UTC by Minh Hon Chau
**Last Updated:** Mon Jun 05, 2017 10:18 AM UTC
**Owner:** Minh Hon Chau


The problem in this ticket occurs in the same scenario as reported in #2416.

After the SC absence period, amfd hits osafassert() and dumps core, and the 
problem repeats.

One of the patches for #2416 tried to call IMM sync as soon as possible, which 
works fine in a small cluster (5 nodes). In a large cluster of about 75 nodes, 
however, the change to the IMM sync calls has almost no effect.

In #2416, a problem was seen, under the assumption of unreliable IMM sync 
calls, in which after the SC absence period amfd had 3 assignments for a 2N SG: 
2 STANDBY SUSIs and 1 ACTIVE SUSI. It was fixed by the commit "amfd: Add 
iteration to failover all absent assignments [#2416]" (refer to: 
https://sourceforge.net/p/opensaf/tickets/2416/#f83b).

Another variant of the problem of unreliable IMM calls before both SCs go down 
is that amfd can have both SUs with ACTIVE assignments, which leads to the 
assert. So far this problem has been seen only in a large cluster.



[tickets] [opensaf:tickets] #2416 amf: Problem of assignment failover during stop of both SCs (SC Absence)

2017-05-11 Thread Nagendra Kumar
>>Initially, SU1 (in SC1) and SU2 (in SC2) have active and standby assignment. 
>>Abruptly stop SC1 and SC2, SU3 (PL-3) appears to have standby assignment.
Does this happen because the SC-1 Amfd sees that SC-2 is going down and so 
sends a standby assignment to SU3 (PL-3)? But this can happen only if the PL-3 
down has been received and processed by the SC-1 Amfd, which then sent Standby 
to SU3, while all the RTA updates were lost because SC-1 also went down. This 
results in SU1 being Active with two Standby SUs, SU2 and SU3. Am I right?
Do you have traces and the time at which it happened? I just want to analyse it.



---

** [tickets:#2416] amf: Problem of assignment failover during stop of both SCs 
(SC Absence)**

**Status:** review
**Milestone:** 5.17.06
**Created:** Mon Apr 10, 2017 04:39 AM UTC by Minh Hon Chau
**Last Updated:** Tue May 09, 2017 06:42 AM UTC
**Owner:** Minh Hon Chau


In a configuration of a 2N application in which the active SU is hosted on a 
controller and the standby SU is hosted on a payload, stopping both SCs could 
generate a su_si assignment message towards the standby SU to change its HA 
state to active. 

- If this su_si assignment message is buffered and arrives before 
MDSNCS_DOWN, the node is rebooted.
- In the other case, where MDSNCS_DOWN arrives before the su_si assignment, 
amfnd currently does not ignore the su_si assignment. amfnd should ignore this 
su_si assignment message, as it does other messages such as su_pres and su_reg.

Testing on similar application configurations continues; any problems found 
will be added in comments.
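
The second case above can be sketched as follows. This is an illustration only, under the assumption that amfnd keeps a flag that is cleared when the director-down event is seen. `NodeDirector`, `MsgType`, `on_director_down()` and `on_message()` are hypothetical names, not amfnd code.

```cpp
// Message kinds named in the ticket text; the enum itself is hypothetical.
enum class MsgType { kSuSiAssign, kSuPres, kSuReg };

// Hypothetical node-director state for this sketch.
struct NodeDirector {
  bool director_up = true;  // cleared when the director-down event is seen
  int processed = 0;
  int ignored = 0;
};

void on_director_down(NodeDirector& nd) { nd.director_up = false; }

// Returns true if the message was processed, false if it was ignored.
// Every message kind, including su_si assignment, is dropped once the
// director is known to be down.
bool on_message(NodeDirector& nd, MsgType /*type*/) {
  if (!nd.director_up) {
    ++nd.ignored;
    return false;
  }
  ++nd.processed;
  return true;
}
```

The sketch treats su_si assignment the same way as the other message types, which is the behaviour the ticket asks for.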




[tickets] [opensaf:tickets] #2428 Amf: Amfd crashes when su is unlocked

2017-04-14 Thread Nagendra Kumar



---

** [tickets:#2428] Amf: Amfd crashes when su is unlocked**

**Status:** assigned
**Milestone:** 5.17.08
**Created:** Fri Apr 14, 2017 01:23 PM UTC by Nagendra Kumar
**Last Updated:** Fri Apr 14, 2017 01:23 PM UTC
**Owner:** Nagendra Kumar


Steps to reproduce:

1. Start SC-1. Upload a demo app file.
immcfg -f /tmp/AppConfig-2N.xml_t1
2. Delete CtCs object.
immcfg -d 
"safSupportedCsType=safVersion=1\,safCSType=AmfDemo1,safVersion=1,safCompType=AmfDemo1"
3. Unlock-in the SU.
amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
4. Unlock the SU
amf-adm unlock safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1

Amfd crashed at:
Core was generated by `/usr/local/lib/opensaf/osafamfd --tracemask=0x'.
Program terminated with signal 11, Segmentation fault.
#0  avd_snd_susi_msg(cl_cb_tag*, AVD_SU*, avd_su_si_rel_tag*, AVSV_SUSI_ACT, 
bool, avd_comp_csi_rel_tag*) () at src/amf/amfd/util.cc:701
701 return ctcs_type->saAmfCtCompCapability;

The backtrace is:
(gdb) bt
#0  avd_snd_susi_msg(cl_cb_tag*, AVD_SU*, avd_su_si_rel_tag*, AVSV_SUSI_ACT, 
bool, avd_comp_csi_rel_tag*) () at src/amf/amfd/util.cc:701
#1  0x7fb48fd91e02 in avd_new_assgn_susi(cl_cb_tag*, AVD_SU*, AVD_SI*, 
SaAmfHAStateT, bool, avd_su_si_rel_tag**) () at src/amf/amfd/sgproc.cc:242
#2  0x7fb48fd71340 in avd_sg_2n_su_chose_asgn(cl_cb_tag*, AVD_SG*) () at 
src/amf/amfd/sg_2n_fsm.cc:639
#3  0x7fb48fd718d1 in SG_2N::su_insvc(cl_cb_tag*, AVD_SU*) () at 
src/amf/amfd/sg_2n_fsm.cc:1373
#4  0x7fb48fdb2107 in AVD_SU::unlock(unsigned long long, unsigned long 
long) () at src/amf/amfd/su.cc:916
#5  0x7fb48fdb5a8a in su_admin_op_cb(unsigned long long, unsigned long 
long, SaNameT const*, unsigned long long, SaImmAdminOperationParamsT_2 const**) 
() at src/amf/amfd/su.cc:1369
#6  0x7fb48fd41c20 in admin_operation_cb(unsigned long long, unsigned long 
long, SaNameT const*, unsigned long long, SaImmAdminOperationParamsT_2 const**) 
() at src/amf/amfd/imm.cc:846
#7  0x7fb48f491937 in imma_process_callback_info(imma_cb*, 
imma_client_node*, imma_callback_info*, unsigned long long) () at 
src/imm/agent/imma_proc.cc:2119
#8  0x7fb48f492d89 in imma_hdl_callbk_dispatch_all(imma_cb*, unsigned long 
long) () at src/imm/agent/imma_proc.cc:1761
#9  0x7fb48f489a7f in saImmOiDispatch () at src/imm/agent/imma_oi_api.cc:638
#10 0x7fb48fcfdf48 in main () at src/amf/amfd/main.cc:729


The crash originates in avd_snd_susi_msg(), which calls 
get_comp_capability() with the csi and comp as input parameters.
Because the CtCs object was deleted in step 2, ctcstype_db->find() in 
get_comp_capability() returns NULL for ctcs_type.
Amfd then crashes when it dereferences ctcs_type->saAmfCtCompCapability.
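
The failure mode above can be illustrated with a minimal sketch of a guarded lookup. This is not the amfd code: CtCsType, CtCsDb, find_ctcs() and get_comp_capability_checked() are hypothetical stand-ins, and a std::map replaces the real ctcstype_db. The only point is that the result of the lookup is tested before saAmfCtCompCapability is read.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the CtCs type object; only the field named in
// the backtrace is modelled.
struct CtCsType {
  int saAmfCtCompCapability;
};

// Hypothetical stand-in for ctcstype_db, keyed by the object's DN.
using CtCsDb = std::map<std::string, CtCsType>;

// Returns a pointer to the entry, or nullptr when the object does not exist
// (for example because it was deleted with immcfg -d).
const CtCsType* find_ctcs(const CtCsDb& db, const std::string& key) {
  auto it = db.find(key);
  return it == db.end() ? nullptr : &it->second;
}

// Writes the capability to *capability and returns true when the CtCs
// object exists; returns false and leaves *capability untouched otherwise,
// so the caller can fail the assignment instead of dereferencing NULL.
bool get_comp_capability_checked(const CtCsDb& db, const std::string& key,
                                 int* capability) {
  const CtCsType* ctcs_type = find_ctcs(db, key);
  if (ctcs_type == nullptr) {
    return false;
  }
  *capability = ctcs_type->saAmfCtCompCapability;
  return true;
}
```

A caller would check the boolean result and reject or postpone the SU assignment when it is false. Whether the real fix rejects the unlock or the CtCs deletion is not stated in this ticket.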



---



[tickets] [opensaf:tickets] #2083 amf: error in syslog when initiating SI SWAP

2017-04-10 Thread Nagendra Kumar
- **status**: unassigned --> duplicate
- **Milestone**: future --> never
- **Comment**:

Fixed as part of #1897



---

** [tickets:#2083] amf: error in syslog when initiating SI SWAP**

**Status:** duplicate
**Milestone:** never
**Created:** Thu Sep 29, 2016 12:01 PM UTC by Rafael
**Last Updated:** Thu Sep 29, 2016 12:12 PM UTC
**Owner:** nobody


SMF initiates an SI SWAP that fails, but the retry is successful. The failure 
result should not be logged as an error in syslog.

osafamfd[473]: ER safSi=SC-2N,safApp=OpenSAF SWAP failed - only one assignment
osafrded[407]: NO Peer up on node 0x2020f
osafrded[407]: NO Got peer info request from node 0x2020f with role STANDBY
osafrded[407]: NO Got peer info response from node 0x2020f with role STANDBY
osafimmd[426]: NO MDS event from svc_id 24 (change:5, dest:13)
osafimmnd[436]: NO Implementer (applier) connected: 49 (@safAmfService2020f) 
<0, 2020f>
osafimmnd[436]: NO Implementer (applier) connected: 50 (@OpenSafImmReplicatorB) 
<0, 2020f>
osafamfd[473]: ER safSi=SC-2N,safApp=OpenSAF SWAP failed - Cold sync in progress
osafamfd[473]: NO safSi=SC-2N,safApp=OpenSAF Swap initiated



---



[tickets] [opensaf:tickets] #2404 Amf : amfd crashed on active controller when executing the campaign for application upgrade.

2017-04-05 Thread Nagendra Kumar
- **status**: assigned --> review



---

** [tickets:#2404] Amf : amfd crashed on active controller when executing the 
campaign for application upgrade.**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Mar 30, 2017 09:49 AM UTC by Madhurika Koppula
**Last Updated:** Tue Apr 04, 2017 08:38 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[amfd_crash.tgz](https://sourceforge.net/p/opensaf/tickets/2404/attachment/amfd_crash.tgz)
 (739.4 kB; application/octet-stream)


**Environment Details:**

OS : Suse 64bit
GCC Version: 6.1
Setup : 4 nodes ( 2 controllers and 2 payloads with headless feature disabled & 
PBE disabled ).
Changeset : 8701 ( 5.2.RC1) 

**Summary**:  amfd crashed on active controller when executing the campaign 
modeled for testing SG upgrade of no redundancy model. 

**Steps followed & Observed behaviour:**

1) Brought up the four-node cluster successfully.
2) Brought up the No Redundancy model.
3) While executing the campaign for testing SG upgrade, observed an amfd crash 
on the active controller at the moment the SU hosted on PL-3 went to the 
instantiation-failed state after upgrade (because the script exits with 
non-zero status).
4) Amfd aborted while invoking saImmOiDispatch.

**Below is the timestamp on active controller SC-1:**

Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO adminOperation: 
immUtil.callAdminOperation() Fail SA_AIS_ERR_REPAIR_PENDING (29), Failed unit 
is 'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp'
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: ER Failed to Restart activation 
units
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: ER Step execution failed, Try 
undoing the step
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO SmfStepStateUndoing::execute 
start undoing step.
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO STEP: Rolling back AU restart 
step 
safSmfStep=0001,safSmfProc=amfClusterProc-1,safSmfCampaign=Campaign_1,safApp=safSmfService
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO STEP: Online installation of 
old software
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: WA SU: 
safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp failed after upgrade in 
campaign
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO STEP: Create old 
SaAmfNodeSwBundle objects
Apr 27 14:36:08 SLES-M-SLOT-1 osafimmnd[4115]: NO Ccb 56 COMMITTED (SMFSERVICE)
Apr 27 14:36:08 SLES-M-SLOT-1 osafsmfd[4199]: NO STEP: Reverse information 
model and set maintenance status for deactivation units
Apr 27 14:36:08 SLES-M-SLOT-1 osafimmnd[4115]: NO Ccb 57 COMMITTED (SMFSERVICE)

**Apr 27 14:36:08 SLES-M-SLOT-1 osafamfnd[4181]: ER AMFD has unexpectedly 
crashed. Rebooting node

Apr 27 14:36:08 SLES-M-SLOT-1 osafamfnd[4181]: Rebooting OpenSAF NodeId = 
131343 EE **
Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131343, SupervisionTime = 60


**Timestamp on PL-3:**

Apr  4 18:32:55 SLES-M-SLOT-3 osafamfnd[26567]: NO saAmfCompType changed to 
'safVersion=5.0.0,safCompType=Comp_PxyApp_Proxy1_1_1' for 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp'
Apr  4 18:32:55 SLES-M-SLOT-3 osafimmnd[26556]: NO Ccb 54 COMMITTED (SMFSERVICE)
Apr  4 18:32:55 SLES-M-SLOT-3 osafamfnd[26567]: NO Admin restart requested for 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp'
Apr  4 18:32:56 SLES-M-SLOT-3 osafamfnd[26567]: NO 
saAmfCtDefQuiescingCompleteTimeout for 
'safVersion=5.0.0,safCompType=Comp_PxyApp_Proxy1_1_1' initialized with 
saAmfCtDefCallbackTimeout
Apr  4 18:32:56 SLES-M-SLOT-3 osafamfnd[26567]: NO Instantiation of 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp' failed
Apr  4 18:32:56 SLES-M-SLOT-3 osafamfnd[26567]: NO Reason:'Exec of script 
success, but script exits with non-zero status'
Apr  4 18:32:56 SLES-M-SLOT-3 osafamfnd[26567]: NO Exit code: 1
Apr  4 18:32:59 SLES-M-SLOT-3 osafamfnd[26567]: NO Instantiation of 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp' failed
Apr  4 18:32:59 SLES-M-SLOT-3 osafamfnd[26567]: NO Reason:'Exec of script 
success, but script exits with non-zero status'
Apr  4 18:32:59 SLES-M-SLOT-3 osafamfnd[26567]: NO Exit code: 1
**Apr  4 18:33:02 SLES-M-SLOT-3 osafamfnd[26567]: NO Instantiation of 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp' failed
Apr  4 18:33:02 SLES-M-SLOT-3 osafamfnd[26567]: NO Reason:'Exec of script 
success, but script exits with non-zero status'**
Apr  4 18:33:02 SLES-M-SLOT-3 osafamfnd[26567]: NO Exit code: 1
Apr  4 18:33:05 SLES-M-SLOT-3 osafamfnd[26567]: WA 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp' 
Presence State RESTARTING => INSTANTIATION_FAILED
Apr  4 18:33:05 SLES-M-SLOT-3 osafamfnd[26567]: NO Component Failover trigerred 
for 'safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp': Failed component: 
'safComp=Proxy1,safSu=dummy_Proxy_1,safSg=SG_dummy_Proxy,safApp=PxyApp'
Apr  4 18:33:05 SLES-M-SLOT-3 osafamfnd[2656

[tickets] [opensaf:tickets] #2361 AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot

2017-03-29 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8736:c3c90b5fb832
branch:      opensaf-5.0.x
parent:      8732:ea44141c05ee
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Thu Mar 30 10:17:41 2017 +0530
summary:     amfd: handle BAD_HANDLE return during config read [#2361]

changeset:   8737:f9a5a957c16a
branch:      opensaf-5.1.x
parent:      8733:be2fd9824bc4
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Thu Mar 30 10:18:05 2017 +0530
summary:     amfd: handle BAD_HANDLE return during config read [#2361]

changeset:   8738:a10d52313ef5
tag:         tip
parent:      8735:68a5e668f807
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Thu Mar 30 10:18:25 2017 +0530
summary:     amfd: handle BAD_HANDLE return during config read [#2361]

[staging:c3c90b]
[staging:f9a5a9]
[staging:a10d52]




---

** [tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing 
both controllers to reboot**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Tue Mar 14, 2017 10:42 AM UTC
**Owner:** Nagendra Kumar


**Environment details**

OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )

**Step**

1. Bring up OpenSAF on four nodes and create a load of 1 lakh (100,000) objects.
2. IMM test cases were running on the standby controller.


SC-1 syslog

Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated 
from 'componentFailover' to 'suFailover'
Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'healthCheckcallbackTimeout' : Recovery is 'suFailover'
**Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: ER 
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due 
to:healthCheckcallbackTimeout Recovery is:suFailover
Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60**
Mar  7 19:45:58 OSAF-SC1 opensaf_reboot: Rebooting local node; timeout=60


SC-2 syslog

Mar  7 19:41:00 OSAF-SC2 osafamfd[4339]: ER Failed to read configuration, AMF 
will not start
Mar  7 19:41:00 OSAF-SC2 osafamfd[4339]: ER avd_imm_config_get FAILED
**Mar  7 19:41:00 OSAF-SC2 osafamfnd[4349]: ER AMFD has unexpectedly crashed. 
Rebooting node**
Mar  7 19:41:00 OSAF-SC2 osafamfnd[4349]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131599, SupervisionTime = 60
Mar  7 19:41:00 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60


The amfd, immnd and immd traces are shared separately because they are large.



---



[tickets] [opensaf:tickets] #2377 AMF: SG in unstable state after couple of admin operations during headless scenario

2017-03-15 Thread Nagendra Kumar
As per safLog, the issue occurred on Mar 14:
 11139 18:08:34 03/14/2017 NO safApp=safAmfService "Admin op invocation: 
5471788335253, err: 'SG not in STABLE state 
(safSg=TestApp_SG1,safApp=TestApp_TwoN)'"

The Amfd trace is not available for this time; it starts on Mar 15:
Mar 15  7:03:12.095487 osafamfd [3250:src/amf/amfd/main.cc:0502] >> initialize 

Please upload the Amfd traces covering Mar 14 18:08 and earlier.


---

** [tickets:#2377] AMF: SG in unstable state after couple of admin operations 
during headless scenario**

**Status:** assigned
**Milestone:** 5.2.RC2
**Created:** Wed Mar 15, 2017 04:54 AM UTC by Srikanth R
**Last Updated:** Thu Mar 16, 2017 05:28 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[logs.tgz](https://sourceforge.net/p/opensaf/tickets/2377/attachment/logs.tgz) 
(7.6 MB; application/x-compressed)


Changeset : 8634 5.2.FC
Setup : 2 controllers with 3 payloads ( Headless feature enabled)
AMF application : 2n application 2 SUs 4SIs ( si-si deps disabled)

Steps performed :

-> Initially brought up 5 nodes.

-> Deployed the attached configuration.

-> Performed admin operations on the SG, coupled with 2 headless operations.

-> Later performed shutdown operation of SG, which resulted in unstable state.

Attached logs :

-> syslog,amfd and amfnd traces of both controllers and PL-3.

-> AMF application


---



[tickets] [opensaf:tickets] #2377 AMF: SG in unstable state after couple of admin operations during headless scenario

2017-03-15 Thread Nagendra Kumar
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2377] AMF: SG in unstable state after couple of admin operations 
during headless scenario**

**Status:** assigned
**Milestone:** 5.2.RC2
**Created:** Wed Mar 15, 2017 04:54 AM UTC by Srikanth R
**Last Updated:** Wed Mar 15, 2017 04:54 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[logs.tgz](https://sourceforge.net/p/opensaf/tickets/2377/attachment/logs.tgz) 
(7.6 MB; application/x-compressed)




---



[tickets] [opensaf:tickets] #2361 AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot

2017-03-14 Thread Nagendra Kumar
- **status**: unassigned --> accepted
- **assigned_to**: Nagendra Kumar
- **Version**:  --> 5.2 FC
- **Comment**:

This is also reproducible on eb8089acf533+ (opensaf-5.1.x), the 5.1.GA/5.1.0 release.
Steps to reproduce:
1. Apply the following code change to reproduce the issue on the standby controller:
diff --git a/osaf/services/saf/amf/amfd/svctype.cc 
b/osaf/services/saf/amf/amfd/svctype.cc
--- a/osaf/services/saf/amf/amfd/svctype.cc
+++ b/osaf/services/saf/amf/amfd/svctype.cc
@@ -230,6 +230,9 @@ SaAisErrorT avd_svctype_config_get(void)
searchParam.searchOneAttr.attrName = 
const_cast("SaImmAttrClassName");
searchParam.searchOneAttr.attrValueType = SA_IMM_ATTR_SASTRINGT;
searchParam.searchOneAttr.attrValue = 
+   LOG_ER("1. Sleeping .");
+   sleep(1);
+   LOG_ER("2. Sleeping .");

if (immutil_saImmOmSearchInitialize_2(avd_cb->immOmHandle, nullptr, 
SA_IMM_SUBTREE,
SA_IMM_SEARCH_ONE_ATTR | SA_IMM_SEARCH_GET_ALL_ATTR, 
,

2. Start the active (SC-1) and standby (SC-2) controllers.
3. Kill immnd on SC-2, then kill immnd again when the following error appears:
"1. Sleeping ."

4. Amfd exits:
Mar 14 13:02:58 PM_SC-2 osafimmd[1586]: NO SBY: New Epoch for IMMND process at 
node 2010f old epoch: 26  new epoch:27
Mar 14 13:02:58 PM_SC-2 osafimmd[1586]: NO IMMND coord at 2010f
Mar 14 13:02:58 PM_SC-2 osafamfd[1637]: ER No objects found (1)
Mar 14 13:02:58 PM_SC-2 osafamfd[1637]: ER Failed to read configuration, AMF 
will not start
Mar 14 13:02:58 PM_SC-2 osafamfd[1637]: ER avd_imm_config_get FAILED
Mar 14 13:02:58 PM_SC-2 osafamfnd[1647]: WA AMF director unexpectedly crashed
Mar 14 13:02:58 PM_SC-2 osafamfnd[1647]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: local AVD down(Adest) or both AVD down(Vdest) received, 
OwnNodeId = 131599, SupervisionTime = 60
Mar 14 13:02:58 PM_SC-2 opensaf_reboot: Rebooting local node; timeout=60




---

** [tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing 
both controllers to reboot**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 10:29 AM UTC
**Owner:** Nagendra Kumar





---



[tickets] [opensaf:tickets] #2361 AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot

2017-03-14 Thread Nagendra Kumar
- **status**: accepted --> review



---

** [tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing 
both controllers to reboot**

**Status:** review
**Milestone:** 5.0.2
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Tue Mar 14, 2017 08:46 AM UTC
**Owner:** Nagendra Kumar





---



[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active

2017-03-10 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8687:43deca051ae2
branch:      opensaf-5.0.x
parent:      8682:50a2033a8a8d
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Fri Mar 10 15:30:59 2017 +0530
summary:     amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

changeset:   8688:c4271e0114d8
branch:      opensaf-5.1.x
parent:      8683:59e265654232
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Fri Mar 10 15:31:10 2017 +0530
summary:     amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

changeset:   8689:4cefc956fdf0
tag:         tip
parent:      8686:03647db14f06
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Fri Mar 10 15:31:28 2017 +0530
summary:     amfd: handle TIMEOUT for avd_imm_applier_set [#2338]

[staging:43deca]
[staging:c4271e]
[staging:4cefc9]




---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Wed Mar 08, 2017 08:39 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz)
 (2.8 MB; application/octet-stream)
- 
[syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z)
 (649.4 kB; application/octet-stream)


#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )


#Summary
amfd crashed while changing role from quiesced to active.

#Steps followed & Observed behaviour
   1. Invoke switchovers.
   2. After a few successful switchovers, SC-1 got the active role and SC-2 
got the standby role.
   3. Invoke one more switchover: SC-1 got the quiesced role and SC-2 
successfully became active. After this, cpd crashed on SC-2. While SC-1 was 
changing role from quiesced to active, amfd crashed on SC-1, which resulted 
in a cluster reset.

For the CPD crash, refer to ticket #2337.

Syslog of SC-1:
Mar  2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. 
Dumping incrementally to file imm.db
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for 
SaAmfNodeSwBundle, returned 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: 
avd_mds_qsd_role_evh: Assertion '0' failed.
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. 
Rebooting node
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 
EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131343, SupervisionTime = 60
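
The "returned 5" in the syslog above is a numeric SaAisErrorT value. As a reading aid, here is a small sketch that maps a few such codes to their names. The numeric values are quoted from the SA Forum AIS error enumeration as an assumption to check against saAis.h; 29 matches the SA_AIS_ERR_REPAIR_PENDING (29) logged elsewhere in this archive, and 5 as TIMEOUT matches the fix summary for this ticket. The helper names ais_error_name() and is_retryable() are hypothetical and are not OpenSAF functions.

```cpp
#include <string>

// Returns the SaAisErrorT name for a few numeric codes that appear in
// syslog. Only a subset of the enum is covered; anything else is "UNKNOWN".
std::string ais_error_name(int code) {
  switch (code) {
    case 1:  return "SA_AIS_OK";
    case 5:  return "SA_AIS_ERR_TIMEOUT";
    case 6:  return "SA_AIS_ERR_TRY_AGAIN";
    case 9:  return "SA_AIS_ERR_BAD_HANDLE";
    case 29: return "SA_AIS_ERR_REPAIR_PENDING";
    default: return "UNKNOWN";
  }
}

// Hypothetical policy helper: codes that a caller might retry rather than
// treat as fatal. The policy actually used by amfd may differ.
bool is_retryable(int code) {
  return code == 5 || code == 6;  // TIMEOUT or TRY_AGAIN
}
```

Read this way, the failure of avd_imm_applier_set with code 5 is a timeout, which is a transient condition and not a programming error. That is consistent with the fix handling TIMEOUT instead of hitting the assertion in avd_mds_qsd_role_evh.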



BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout=3) at src/base/osaf_poll.c:44
3  0x7f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=3) at 
src/base/osaf_poll.c:128
4  0x7f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", 
size=64) at src/rde/agent/rda_papi.cc:673
5  0x7f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at 
src/rde/agent/rda_papi.cc:150
6  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7  0x7f2e034209cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e04188958 in mdtm_process_recv_events () at 
src/mds/mds_dt_tipc.c:669
2  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3  0x7f2e034209cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, 
i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4  0x7f2e034209cd in clone () from /lib64/libc.so.6
5  0x in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0  0x7f2e0337bb55 in raise () from /lib64/libc.so.6
1  0x7f2e0337d131 in abort () from /lib64/libc.so.6
2  0x7f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f 
"src/amf/amfd/role.cc", __line=807,
__func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, 
AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0")
at src/base/sysf_def.c:281
3  0x7f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 
<_control_block>, evt

[tickets] [opensaf:tickets] #2361 AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot

2017-03-10 Thread Nagendra Kumar
- **Part**: - --> d
- **Comment**:

Log analysis for Amf:
When Immnd is killed, Amfd gets BAD_HANDLE, so it reinitializes itself with 
Imm and then starts reading the configuration. Immnd was killed again while 
the read was halfway through, so Amf could not read the complete 
configuration, which produced the error below:

Mar 7 19:41:00 OSAF-SC2 osafamfd[4339]: ER Failed to read configuration, AMF 
will not start

This bug also exists in older branches.
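
The sequence in this analysis can be sketched as a retry loop that restarts the configuration read whenever the handle becomes invalid partway through. This is an illustration under stated assumptions, not the committed fix: the Rc enum is a simplified stand-in for SaAisErrorT, and reinit, read_config and read_config_with_retry() are hypothetical names.

```cpp
#include <functional>

// Simplified result codes; stand-ins for the relevant SaAisErrorT values.
enum class Rc { kOk, kBadHandle, kFailed };

// Runs read_config and, if it reports a stale handle, re-initializes and
// reads again from the beginning, up to max_attempts reads in total.
// Both callbacks are hypothetical placeholders for the IMM interaction.
Rc read_config_with_retry(const std::function<Rc()>& reinit,
                          const std::function<Rc()>& read_config,
                          int max_attempts) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    Rc rc = read_config();
    if (rc != Rc::kBadHandle) {
      return rc;  // success, or a failure that a retry would not fix
    }
    // The handle went stale mid-read (for example immnd was restarted):
    // obtain a fresh handle before the next full read.
    if (reinit() != Rc::kOk) {
      return Rc::kFailed;
    }
  }
  return Rc::kBadHandle;  // still failing after max_attempts reads
}
```

The design choice shown is to restart the whole read after re-initialization instead of resuming, because a partially read configuration cannot be trusted once the handle has been invalidated.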



---

** [tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing 
both controllers to reboot**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 09:08 AM UTC
**Owner:** nobody





---



[tickets] [opensaf:tickets] #2361 AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot

2017-03-10 Thread Nagendra Kumar
- **Milestone**: 5.2.RC1 --> 5.0.2



---

** [tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing 
both controllers to reboot**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 10:29 AM UTC
**Owner:** nobody


**Environment details**

OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )

**Steps**

1. Bring up OpenSAF on four nodes and create a load of 1 lakh (100,000) objects.
2. Run the IMM test cases on the standby controller.


SC-1 syslog

Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated 
from 'componentFailover' to 'suFailover'
Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO 
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'healthCheckcallbackTimeout' : Recovery is 'suFailover'
**Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: ER 
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due 
to:healthCheckcallbackTimeout Recovery is:suFailover
Mar  7 19:45:58 OSAF-SC1 osafamfnd[4720]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 
131343, SupervisionTime = 60**
Mar  7 19:45:58 OSAF-SC1 opensaf_reboot: Rebooting local node; timeout=60


SC-2 syslog

Mar  7 19:41:00 OSAF-SC2 osafamfd[4339]: ER Failed to read configuration, AMF 
will not start
Mar  7 19:41:00 OSAF-SC2 osafamfd[4339]: ER avd_imm_config_get FAILED
**Mar  7 19:41:00 OSAF-SC2 osafamfnd[4349]: ER AMFD has unexpectedly crashed. 
Rebooting node**
Mar  7 19:41:00 OSAF-SC2 osafamfnd[4349]: Rebooting OpenSAF NodeId = 131599 EE 
Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131599, SupervisionTime = 60
Mar  7 19:41:00 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60
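
When comparing the two controllers' syslogs, it can help to split each line into its fields and keep only the `ER` entries. The following is a minimal sketch in Python, not an OpenSAF tool; `parse_line` and `errors_only` are hypothetical helper names, and the pattern assumes each log entry has first been re-joined onto a single line (the excerpts above are wrapped by the mail client).

```python
import re

# Pattern for lines such as:
#   Mar  7 19:41:00 OSAF-SC2 osafamfd[4339]: ER avd_imm_config_get FAILED
# The two-letter severity tag (ER, NO, ...) is optional because some lines
# in the excerpts above carry none.
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[\w.-]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?:(?P<sev>[A-Z]{2}) )?"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one syslog line, or None if it does not match."""
    # Strip surrounding whitespace and the '**' bold markers used in the ticket.
    m = LINE_RE.match(line.strip().strip("*"))
    return m.groupdict() if m else None


def errors_only(lines):
    """Keep only the records whose severity tag is ER."""
    return [rec for rec in map(parse_line, lines) if rec and rec["sev"] == "ER"]
```

Sorting the resulting records from both nodes by timestamp is only meaningful if the node clocks agree, which should be checked first.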


The amfd, immnd and immd traces are shared separately because they are very large.



---



[tickets] [opensaf:tickets] #2101 AMF: Heartbeat timeout observed after ImmNd restart

2017-03-10 Thread Nagendra Kumar
- **status**: accepted --> not-reproducible



---

** [tickets:#2101] AMF: Heartbeat timeout observed after ImmNd restart**

**Status:** not-reproducible
**Milestone:** 5.2.RC1
**Created:** Fri Oct 07, 2016 01:26 PM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 10:23 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[messages](https://sourceforge.net/p/opensaf/tickets/2101/attachment/messages) 
(777.0 kB; application/octet-stream)
- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/2101/attachment/osafamfd) 
(5.6 MB; application/octet-stream)


OS : Suse 64bit
Changeset : 8190
Setup : 4 physical nodes, 1 PBE enabled, with a load of 1 lakh (100,000) objects

Steps
1. Bring up OpenSAF on four nodes.
2. Run the IMM test cases with the ndrestart scenario on the standby controller.
3. After one of the ndrestarts, an amfd heartbeat timeout happened. The 
backtrace is below.

Coredump:
0 0x7fa7c680161c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
1  0x0044da7d in avd_imm_reinit_bg () at imm.cc:1949
2  0x00453e33 in main_loop () at main.cc:737
3  0x004541ff in main (argc=2, argv=0x7fff1cbfa268) at main.cc:848
(gdb) thread apply all bt

Thread 6 (Thread 0x7fa7c78fcb00 (LWP 3837)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout_ts=0x7fa7c78fc180, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout=3) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=13, i_timeout=3) at 
osaf_poll.c:128
4  0x7fa7c65f3a0c in rda_read_msg (sockfd=13, msg=0x7fa7c78fc260 "10 2", 
size=64) at rda_papi.cc:673
5  0x7fa7c65f31ec in rda_callback_task (rda_callback_cb=0x7963e0) at 
rda_papi.cc:150
6  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
7  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 5 (Thread 0x7fa7c4e0b700 (LWP 5557)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout_ts=0x7fa7c4e09d00, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout=1) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=29, i_timeout=1) at 
osaf_poll.c:128
4  0x7fa7c7535605 in mds_mcm_time_wait (sel_obj=0x80c3d8, time_val=1000) at 
mds_c_sndrcv.c:2570
5  0x7fa7c75351b1 in mcm_pvt_normal_svc_sndrsp (env_hdl=131071, 
fr_svc_id=26, msg=0x7fa7c4e0a0c0, to_dest=565214705385538, to_svc_id=25, 
req=0x7fa7c4e09e60, pri=MDS_SEND_PRIORITY_MEDIUM)
   at mds_c_sndrcv.c:2457
6  0x7fa7c7530d87 in mds_mcm_send (info=0x7fa7c4e09f90) at 
mds_c_sndrcv.c:690
7  0x7fa7c7530170 in mds_send (info=0x7fa7c4e09f90) at mds_c_sndrcv.c:390
8  0x7fa7c752fde1 in ncsmds_api (svc_to_mds_info=0x7fa7c4e09f90) at 
mds_papi.c:191
9  0x7fa7c6e527a7 in imma_mds_msg_sync_send (imma_mds_hdl=131071, 
destination=0x7fa7c707a850 <imma_cb+144>, i_evt=0x7fa7c4e0a0c0, 
o_evt=0x7fa7c4e0a058, timeout=1000) at imma_mds.c:604
10 0x7fa7c6e47bd6 in search_next_common (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a298, attributes=0x7fa7c4e0a318, bUseString=false) at 
imma_om_api.c:7584
11 0x7fa7c6e475f9 in saImmOmSearchNext_2 (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a370, attributes=0x7fa7c4e0a318) at imma_om_api.c:7444
12 0x004fb140 in immutil_saImmOmSearchNext_2 
(searchHandle=1475865130618188739, objectName=0x7fa7c4e0a370, 
attributes=0x7fa7c4e0a318) at immutil.c:1818
13 0x00431bc2 in avd_compcstype_config_get 
(name="safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF", comp=0x806e50) at 
compcstype.cc:306
14 0x00429c5c in avd_comp_config_get 
(su_name="safSu=SC-2,safSg=2N,safApp=OpenSAF", su=0x7b08c0) at comp.cc:756
15 0x004d9f63 in avd_su_config_get (sg_name="safSg=2N,safApp=OpenSAF", 
sg=0x7bb4c0) at su.cc:717
16 0x0048c7dc in avd_sg_config_get (app_dn="safApp=OpenSAF", 
app=0x7c08b0) at sg.cc:457
17 0x0040a88a in avd_app_config_get () at app.cc:460
18 0x0044c154 in avd_imm_config_get () at imm.cc:1574
19 0x0044d6e6 in avd_imm_reinit_bg_thread (_cb=0x75dba0 
<_control_block>) at imm.cc:1891
20 0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
21 0x7fa7c571b9cd in clone () from /lib64/libc.so.6
22 0x in ?? ()

Thread 4 (Thread 0x7fa7c791cb00 (LWP 3836)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c7549696 in mdtm_process_recv_events () at mds_dt_tipc.c:665
2  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
3  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 3 (Thread 0x7fa7b700 (LWP 5586)):
0  0x7fa7c6804294 in __lll_lock_wait () from /lib64/libpthread.so.0
1  

[tickets] [opensaf:tickets] #2101 AMF: Heartbeat timeout observed after ImmNd restart

2017-03-10 Thread Nagendra Kumar
Thanks for the information, Chani.
Please reopen the ticket with AMF and IMM traces if it gets reproduced later.


---

** [tickets:#2101] AMF: Heartbeat timeout observed after ImmNd restart**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Fri Oct 07, 2016 01:26 PM UTC by Chani Srivastava
**Last Updated:** Fri Mar 10, 2017 09:51 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[messages](https://sourceforge.net/p/opensaf/tickets/2101/attachment/messages) 
(777.0 kB; application/octet-stream)
- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/2101/attachment/osafamfd) 
(5.6 MB; application/octet-stream)


OS : Suse 64bit
Changeset : 8190
Setup : 4 physical nodes, 1 PBE enabled, with a load of 1 lakh (100,000) objects

Steps
1. Bring up OpenSAF on four nodes.
2. Run the IMM test cases with the ndrestart scenario on the standby controller.
3. After one of the ndrestarts, an amfd heartbeat timeout happened. The 
backtrace is below.

Coredump:
0 0x7fa7c680161c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
1  0x0044da7d in avd_imm_reinit_bg () at imm.cc:1949
2  0x00453e33 in main_loop () at main.cc:737
3  0x004541ff in main (argc=2, argv=0x7fff1cbfa268) at main.cc:848
(gdb) thread apply all bt

Thread 6 (Thread 0x7fa7c78fcb00 (LWP 3837)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout_ts=0x7fa7c78fc180, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout=3) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=13, i_timeout=3) at 
osaf_poll.c:128
4  0x7fa7c65f3a0c in rda_read_msg (sockfd=13, msg=0x7fa7c78fc260 "10 2", 
size=64) at rda_papi.cc:673
5  0x7fa7c65f31ec in rda_callback_task (rda_callback_cb=0x7963e0) at 
rda_papi.cc:150
6  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
7  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 5 (Thread 0x7fa7c4e0b700 (LWP 5557)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout_ts=0x7fa7c4e09d00, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout=1) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=29, i_timeout=1) at 
osaf_poll.c:128
4  0x7fa7c7535605 in mds_mcm_time_wait (sel_obj=0x80c3d8, time_val=1000) at 
mds_c_sndrcv.c:2570
5  0x7fa7c75351b1 in mcm_pvt_normal_svc_sndrsp (env_hdl=131071, 
fr_svc_id=26, msg=0x7fa7c4e0a0c0, to_dest=565214705385538, to_svc_id=25, 
req=0x7fa7c4e09e60, pri=MDS_SEND_PRIORITY_MEDIUM)
   at mds_c_sndrcv.c:2457
6  0x7fa7c7530d87 in mds_mcm_send (info=0x7fa7c4e09f90) at 
mds_c_sndrcv.c:690
7  0x7fa7c7530170 in mds_send (info=0x7fa7c4e09f90) at mds_c_sndrcv.c:390
8  0x7fa7c752fde1 in ncsmds_api (svc_to_mds_info=0x7fa7c4e09f90) at 
mds_papi.c:191
9  0x7fa7c6e527a7 in imma_mds_msg_sync_send (imma_mds_hdl=131071, 
destination=0x7fa7c707a850 <imma_cb+144>, i_evt=0x7fa7c4e0a0c0, 
o_evt=0x7fa7c4e0a058, timeout=1000) at imma_mds.c:604
10 0x7fa7c6e47bd6 in search_next_common (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a298, attributes=0x7fa7c4e0a318, bUseString=false) at 
imma_om_api.c:7584
11 0x7fa7c6e475f9 in saImmOmSearchNext_2 (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a370, attributes=0x7fa7c4e0a318) at imma_om_api.c:7444
12 0x004fb140 in immutil_saImmOmSearchNext_2 
(searchHandle=1475865130618188739, objectName=0x7fa7c4e0a370, 
attributes=0x7fa7c4e0a318) at immutil.c:1818
13 0x00431bc2 in avd_compcstype_config_get 
(name="safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF", comp=0x806e50) at 
compcstype.cc:306
14 0x00429c5c in avd_comp_config_get 
(su_name="safSu=SC-2,safSg=2N,safApp=OpenSAF", su=0x7b08c0) at comp.cc:756
15 0x004d9f63 in avd_su_config_get (sg_name="safSg=2N,safApp=OpenSAF", 
sg=0x7bb4c0) at su.cc:717
16 0x0048c7dc in avd_sg_config_get (app_dn="safApp=OpenSAF", 
app=0x7c08b0) at sg.cc:457
17 0x0040a88a in avd_app_config_get () at app.cc:460
18 0x0044c154 in avd_imm_config_get () at imm.cc:1574
19 0x0044d6e6 in avd_imm_reinit_bg_thread (_cb=0x75dba0 
<_control_block>) at imm.cc:1891
20 0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
21 0x7fa7c571b9cd in clone () from /lib64/libc.so.6
22 0x in ?? ()

Thread 4 (Thread 0x7fa7c791cb00 (LWP 3836)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c7549696 in mdtm_process_recv_events () at mds_dt_tipc.c:665
2  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
3  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 3 (Thread 0x7fa7b700 (LWP 5586)):
0  0x

[tickets] [opensaf:tickets] #1457 AMF: Standby controller reboots if adding additional SI in N+M model

2017-03-09 Thread Nagendra Kumar
- **assigned_to**: Minh Hon Chau -->  nobody 



---

** [tickets:#1457] AMF: Standby controller reboots if adding additional SI in 
N+M model**

**Status:** unassigned
**Milestone:** future
**Labels:** N+M additional SI 
**Created:** Mon Aug 24, 2015 02:58 AM UTC by Minh Hon Chau
**Last Updated:** Thu Mar 09, 2017 09:56 AM UTC
**Owner:** nobody
**Attachments:**

- 
[add_SIc.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/add_SIc.xml)
 (929 Bytes; text/xml)
- 
[app1_npm_2si_3su.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/app1_npm_2si_3su.xml)
 (15.6 kB; text/xml)
- 
[stbctlr_reboot_as_adding_NpM_SI.tgz](https://sourceforge.net/p/opensaf/tickets/1457/attachment/stbctlr_reboot_as_adding_NpM_SI.tgz)
 (3.6 MB; application/x-compressed-tar)


Steps to reproduce:
- Load model app1_npm_2si_3su.xml
- Unlock-in/Unlock SU1, SU2, SU3.
- Load model containing additional SI
-> Standby controller reboots, amfd coredump

bt:
(gdb) bt
#0  0x7f04f2eb8cc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f04f2ebc0d8 in __GI_abort () at abort.c:89
#2  0x7f04f463aa9e in __osafassert_fail (__file=,
__line=, __func=,
__assertion=) at sysf_def.c:281
#3  0x00410c0f in dec_si_su_curr_active (cb=0x6bdfa0 <_control_block>,
dec=) at ckpt_dec.cc:1738
#4  0x00409b97 in avsv_dequeue_async_update_msgs (
cb=cb@entry=0x6bdfa0 <_control_block>, pr_or_fr=pr_or_fr@entry=true)
at chkop.cc:1262
#5  0x0040a6a6 in avsv_mbcsv_process_dec_cb (arg=0x7fff015924c0,
cb=0x6bdfa0 <_control_block>) at chkop.cc:329
#6  avsv_mbcsv_cb (arg=0x7fff015924c0) at chkop.cc:171
#7  0x7f04f464ab26 in ncs_mbscv_rcv_decode (peer=peer@entry=0x2051bb0,
evt=evt@entry=0x7f04ec0064a0) at mbcsv_act.c:393
#8  0x7f04f464acf6 in ncs_mbcsv_rcv_async_update (peer=0x2051bb0,
evt=0x7f04ec0064a0) at mbcsv_act.c:440
#9  0x7f04f4651730 in mbcsv_process_events (rcvd_evt=0x7f04ec0064a0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at mbcsv_pr_evts.c:168
#10 0x7f04f465189b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4288675841) at mbcsv_pr_evts.c:272
#11 0x7f04f464c272 in mbcsv_process_dispatch_request (arg=0x7fff01592630)
at mbcsv_api.c:423
#12 0x004095cf in avsv_mbcsv_dispatch (
cb=cb@entry=0x6bdfa0 <_control_block>, flag=flag@entry=2) at chkop.cc:839
#13 0x0040568d in main_loop () at main.cc:716
#14 main (argc=, argv=) at main.cc:852
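
For long backtraces like the one above, the frame that tripped the assertion is the one directly after `__osafassert_fail`. The following is a minimal sketch in Python for pulling function names out of gdb `bt` output; `frames` and `assert_caller` are hypothetical helper names, and the pattern only covers frame headers that start with `#`, as in this backtrace.

```python
import re

# Matches gdb frame headers with or without an address, e.g.
#   "#3  0x00410c0f in dec_si_su_curr_active (cb=..."
#   "#6  avsv_mbcsv_cb (arg=0x7fff015924c0) at chkop.cc:171"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?([A-Za-z_]\w*) \(")


def frames(backtrace):
    """Return (frame_number, function_name) pairs in stack order."""
    out = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:  # continuation lines (arguments, "at file:line") are skipped
            out.append((int(m.group(1)), m.group(2)))
    return out


def assert_caller(backtrace, assert_fn="__osafassert_fail"):
    """Name of the function one frame above the assertion helper, if any."""
    fs = frames(backtrace)
    for i, (_, name) in enumerate(fs):
        if name == assert_fn and i + 1 < len(fs):
            return fs[i + 1][1]
    return None
```

Applied to the backtrace above, `assert_caller` returns `dec_si_su_curr_active`, the frame at `ckpt_dec.cc:1738`.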





---



[tickets] [opensaf:tickets] #1457 AMF: Standby controller reboots if adding additional SI in N+M model

2017-03-09 Thread Nagendra Kumar
Hi Minh, I am not sure whether you are still working on this; it looks like it 
has been pending for a long time. Please check and assign it to yourself again 
if you can start.
Thanks
-Nagu


---

** [tickets:#1457] AMF: Standby controller reboots if adding additional SI in 
N+M model**

**Status:** unassigned
**Milestone:** future
**Labels:** N+M additional SI 
**Created:** Mon Aug 24, 2015 02:58 AM UTC by Minh Hon Chau
**Last Updated:** Thu Mar 09, 2017 09:54 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- 
[add_SIc.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/add_SIc.xml)
 (929 Bytes; text/xml)
- 
[app1_npm_2si_3su.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/app1_npm_2si_3su.xml)
 (15.6 kB; text/xml)
- 
[stbctlr_reboot_as_adding_NpM_SI.tgz](https://sourceforge.net/p/opensaf/tickets/1457/attachment/stbctlr_reboot_as_adding_NpM_SI.tgz)
 (3.6 MB; application/x-compressed-tar)


Steps to reproduce:
- Load model app1_npm_2si_3su.xml
- Unlock-in/Unlock SU1, SU2, SU3.
- Load model containing additional SI
-> Standby controller reboots, amfd coredump

bt:
(gdb) bt
#0  0x7f04f2eb8cc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f04f2ebc0d8 in __GI_abort () at abort.c:89
#2  0x7f04f463aa9e in __osafassert_fail (__file=,
__line=, __func=,
__assertion=) at sysf_def.c:281
#3  0x00410c0f in dec_si_su_curr_active (cb=0x6bdfa0 <_control_block>,
dec=) at ckpt_dec.cc:1738
#4  0x00409b97 in avsv_dequeue_async_update_msgs (
cb=cb@entry=0x6bdfa0 <_control_block>, pr_or_fr=pr_or_fr@entry=true)
at chkop.cc:1262
#5  0x0040a6a6 in avsv_mbcsv_process_dec_cb (arg=0x7fff015924c0,
cb=0x6bdfa0 <_control_block>) at chkop.cc:329
#6  avsv_mbcsv_cb (arg=0x7fff015924c0) at chkop.cc:171
#7  0x7f04f464ab26 in ncs_mbscv_rcv_decode (peer=peer@entry=0x2051bb0,
evt=evt@entry=0x7f04ec0064a0) at mbcsv_act.c:393
#8  0x7f04f464acf6 in ncs_mbcsv_rcv_async_update (peer=0x2051bb0,
evt=0x7f04ec0064a0) at mbcsv_act.c:440
#9  0x7f04f4651730 in mbcsv_process_events (rcvd_evt=0x7f04ec0064a0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at mbcsv_pr_evts.c:168
#10 0x7f04f465189b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4288675841) at mbcsv_pr_evts.c:272
#11 0x7f04f464c272 in mbcsv_process_dispatch_request (arg=0x7fff01592630)
at mbcsv_api.c:423
#12 0x004095cf in avsv_mbcsv_dispatch (
cb=cb@entry=0x6bdfa0 <_control_block>, flag=flag@entry=2) at chkop.cc:839
#13 0x0040568d in main_loop () at main.cc:716
#14 main (argc=, argv=) at main.cc:852





---



[tickets] [opensaf:tickets] #1457 AMF: Standby controller reboots if adding additional SI in N+M model

2017-03-09 Thread Nagendra Kumar
- **status**: review --> unassigned
- **Milestone**: 5.0.FC --> future



---

** [tickets:#1457] AMF: Standby controller reboots if adding additional SI in 
N+M model**

**Status:** unassigned
**Milestone:** future
**Labels:** N+M additional SI 
**Created:** Mon Aug 24, 2015 02:58 AM UTC by Minh Hon Chau
**Last Updated:** Sun Nov 01, 2015 09:36 PM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- 
[add_SIc.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/add_SIc.xml)
 (929 Bytes; text/xml)
- 
[app1_npm_2si_3su.xml](https://sourceforge.net/p/opensaf/tickets/1457/attachment/app1_npm_2si_3su.xml)
 (15.6 kB; text/xml)
- 
[stbctlr_reboot_as_adding_NpM_SI.tgz](https://sourceforge.net/p/opensaf/tickets/1457/attachment/stbctlr_reboot_as_adding_NpM_SI.tgz)
 (3.6 MB; application/x-compressed-tar)


Steps to reproduce:
- Load model app1_npm_2si_3su.xml
- Unlock-in/Unlock SU1, SU2, SU3.
- Load model containing additional SI
-> Standby controller reboots, amfd coredump

bt:
(gdb) bt
#0  0x7f04f2eb8cc9 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f04f2ebc0d8 in __GI_abort () at abort.c:89
#2  0x7f04f463aa9e in __osafassert_fail (__file=,
__line=, __func=,
__assertion=) at sysf_def.c:281
#3  0x00410c0f in dec_si_su_curr_active (cb=0x6bdfa0 <_control_block>,
dec=) at ckpt_dec.cc:1738
#4  0x00409b97 in avsv_dequeue_async_update_msgs (
cb=cb@entry=0x6bdfa0 <_control_block>, pr_or_fr=pr_or_fr@entry=true)
at chkop.cc:1262
#5  0x0040a6a6 in avsv_mbcsv_process_dec_cb (arg=0x7fff015924c0,
cb=0x6bdfa0 <_control_block>) at chkop.cc:329
#6  avsv_mbcsv_cb (arg=0x7fff015924c0) at chkop.cc:171
#7  0x7f04f464ab26 in ncs_mbscv_rcv_decode (peer=peer@entry=0x2051bb0,
evt=evt@entry=0x7f04ec0064a0) at mbcsv_act.c:393
#8  0x7f04f464acf6 in ncs_mbcsv_rcv_async_update (peer=0x2051bb0,
evt=0x7f04ec0064a0) at mbcsv_act.c:440
#9  0x7f04f4651730 in mbcsv_process_events (rcvd_evt=0x7f04ec0064a0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at mbcsv_pr_evts.c:168
#10 0x7f04f465189b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4288675841) at mbcsv_pr_evts.c:272
#11 0x7f04f464c272 in mbcsv_process_dispatch_request (arg=0x7fff01592630)
at mbcsv_api.c:423
#12 0x004095cf in avsv_mbcsv_dispatch (
cb=cb@entry=0x6bdfa0 <_control_block>, flag=flag@entry=2) at chkop.cc:839
#13 0x0040568d in main_loop () at main.cc:716
#14 main (argc=, argv=) at main.cc:852





---



[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active

2017-03-07 Thread Nagendra Kumar
- **status**: assigned --> accepted



---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Tue Mar 07, 2017 07:27 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz)
 (2.8 MB; application/octet-stream)
- 
[syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z)
 (649.4 kB; application/octet-stream)


#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )


# Summary
amfd crashed while changing role from quiesced to active

# Steps followed & observed behaviour
   1. Invoke switchovers.
   2. After a few successful switchovers, SC-1 got the active role and SC-2 got 
the standby role.
   3. Invoke one more switchover, in which SC-1 got the quiesced role and SC-2 
successfully became active. After this, cpd crashed on SC-2. While SC-1 was 
changing role from quiesced to active, amfd crashed on SC-1, which resulted in 
a cluster reset.

For the cpd crash, refer to ticket #2337.

Syslog of SC-1:
Mar  2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. 
Dumping incrementally to file imm.db
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for 
SaAmfNodeSwBundle, returned 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: 
avd_mds_qsd_role_evh: Assertion '0' failed.
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. 
Rebooting node
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 
EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131343, SupervisionTime = 60
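
The reboot message prints the node ID in decimal. When correlating with logs that print node IDs in hexadecimal (an assumption about logs not shown in this ticket), a quick conversion helps. A small Python sketch, with `node_id_hex` as a hypothetical helper name:

```python
def node_id_hex(node_id):
    """Render a decimal node ID as lowercase hex without the 0x prefix."""
    return format(node_id, "x")


print(node_id_hex(131343))  # 2010f
```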



BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout=3) at src/base/osaf_poll.c:44
3  0x7f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=3) at 
src/base/osaf_poll.c:128
4  0x7f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", 
size=64) at src/rde/agent/rda_papi.cc:673
5  0x7f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at 
src/rde/agent/rda_papi.cc:150
6  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7  0x7f2e034209cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e04188958 in mdtm_process_recv_events () at 
src/mds/mds_dt_tipc.c:669
2  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3  0x7f2e034209cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, 
i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4  0x7f2e034209cd in clone () from /lib64/libc.so.6
5  0x in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0  0x7f2e0337bb55 in raise () from /lib64/libc.so.6
1  0x7f2e0337d131 in abort () from /lib64/libc.so.6
2  0x7f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f 
"src/amf/amfd/role.cc", __line=807,
__func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, 
AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0")
at src/base/sysf_def.c:281
3  0x7f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 
<_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4  0x7f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, 
evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5  0x7f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6  0x7f2e051566fd in main (argc=2, argv=0x7fff5826f318) at 
src/amf/amfd/main.cc:861
(gdb)





Notes:
1. Syslogs of both controllers attached
2. amfd bt attached
3. amfd trace attached

The two nodes are not in time sync; there is a time gap between them.
Relative to SC-2, SC-1 is about 50 minutes ahead.
Time Diff
==
TestBed-R1:~  date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~  date
Thu Mar 2 15:44:30 IST 2017
=
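
The exact offset can be worked out from the two `date` outputs above, assuming both commands were run at about the same moment. A minimal Python sketch (`parse_date_output` is a hypothetical helper name):

```python
from datetime import datetime


def parse_date_output(text):
    """Parse `date` output such as 'Thu Mar 2 16:34:45 IST 2017'.

    The zone abbreviation is dropped rather than interpreted, which is only
    valid when both samples use the same zone (both are IST here).
    """
    day, month, dom, clock, _zone, year = text.split()
    return datetime.strptime(
        f"{day} {month} {dom} {clock} {year}", "%a %b %d %H:%M:%S %Y"
    )


sc1 = parse_date_output("Thu Mar 2 16:34:45 IST 2017")  # TestBed-R1
sc2 = parse_date_output("Thu Mar 2 15:44:30 IST 2017")  # TestBed-R2
skew = sc1 - sc2
print(skew)  # 0:50:15
```

This offset has to be applied when lining up syslog timestamps from the two controllers.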


---


[tickets] [opensaf:tickets] #2106 amf: Admin Operations on middleware SUs / SIs should not be supported

2017-03-07 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8660:57a2078b876b
branch:      opensaf-5.0.x
parent:      8656:a56101161326
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:43:11 2017 +0530
summary:     amfd: dont create susi when node is absent [#2106]

changeset:   8661:9cd5911abd45
branch:      opensaf-5.1.x
parent:      8657:a203318fb21e
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:43:48 2017 +0530
summary:     amfd: dont create susi when node is absent [#2106]

changeset:   8662:d08eb402be70
tag:         tip
parent:      8659:8ef6c16048bc
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:44:13 2017 +0530
summary:     amfd: dont create susi when node is absent [#2106]

[staging:57a207]
[staging:9cd591]
[staging:d08eb4]




---

** [tickets:#2106] amf: Admin Operations on middleware SUs  / SIs should not be 
supported**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Sun Oct 09, 2016 11:18 AM UTC by Srikanth R
**Last Updated:** Wed Mar 01, 2017 10:39 AM UTC
**Owner:** Nagendra Kumar


Changeset : 8190 5.1.GA

-> Bring up a single controller, SC-1.
-> Now perform lock and unlock operations on a middleware SU, i.e. 
safSu=SC-2,safSg=NoRed,safApp=OpenSAF, which is hosted on SC-2.
-> The admin lock operation succeeds, but the admin unlock operation times out 
with the assignment to one of the middleware SIs.

 Following is the opensafd status after the unlock operation.
 
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)

  Admin operations on middleware objects should not be supported.
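
The `safSISU` names in the status output embed the SU's DN as an RDN value, with its commas escaped as `\,`. A minimal Python sketch for recovering the SU and SI from such a name (`split_dn` and `unescape` are hypothetical helper names; escaped backslashes are not handled):

```python
import re


def split_dn(dn):
    """Split a DN on commas that are not preceded by a backslash."""
    return re.split(r"(?<!\\),", dn)


def unescape(value):
    """Turn the escaped commas of an embedded DN back into plain commas."""
    return value.replace("\\,", ",")


dn = r"safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF"
rdns = split_dn(dn)
# rdns[0] is the association RDN; its value is the escaped SU DN.
su_dn = unescape(rdns[0].split("=", 1)[1])
# The remaining RDNs form the DN of the SI.
si_dn = ",".join(rdns[1:])
print(su_dn)  # safSu=SC-1,safSg=2N,safApp=OpenSAF
print(si_dn)  # safSi=SC-2N,safApp=OpenSAF
```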


---



[tickets] [opensaf:tickets] #2321 Incorrect error messages "mkfifo already exists" observed in syslog

2017-03-07 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8659:8ef6c16048bc
tag:         tip
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:30:29 2017 +0530
summary:     amf: replace exit with daemon_exit during shutdown [#2321]

[staging:8ef6c1]




---

** [tickets:#2321] Incorrect error messages "mkfifo already exists" observed in 
syslog**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Thu Feb 23, 2017 05:46 AM UTC by Ritu Raj
**Last Updated:** Thu Mar 02, 2017 07:11 AM UTC
**Owner:** Nagendra Kumar


# Environment details
OS : Suse 64bit
Changeset :  8603( 5.2.MO-1)

# Summary
Incorrect error messages "mkfifo already exists" observed in syslog after 
performing an OpenSAF stop and start operation.

# Steps
1. Start OpenSAF on a single controller.
2. Stop OpenSAF and start it again. While starting OpenSAF again on the same 
node, the following error message is observed in syslog for components 
osafamfnd and osafamfwd:

Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: mkfifo already exists: 
/var/lib/opensaf/osafamfnd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: Started

Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: mkfifo already exists: 
/var/lib/opensaf/osafamfwd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: Started
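
"File exists" is the standard text for `EEXIST`, which `mkfifo` reports when the path is already present, for instance because a previous run did not remove it on shutdown. The following is a minimal Python sketch of the general pattern, not the OpenSAF code; `ensure_fifo` and `remove_fifo` are hypothetical names, and the assumption is that a leftover FIFO from the earlier run is what triggers the message.

```python
import os
import stat


def ensure_fifo(path, mode=0o600):
    """Create a FIFO at `path`; return True if created, False if one already existed."""
    try:
        os.mkfifo(path, mode)
        return True
    except FileExistsError:
        # A leftover from a previous run. Reuse it only if it really is a FIFO.
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise
        return False


def remove_fifo(path):
    """Remove the FIFO on orderly shutdown so the next start creates it cleanly."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

With cleanup on the shutdown path, the next start creates the FIFO without hitting the "already exists" branch.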





---



[tickets] [opensaf:tickets] #2213 AMFND: Coredump if suFailover while shutting down

2017-03-07 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8656:a56101161326
branch:      opensaf-5.0.x
parent:      8651:a90faf589254
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:18:45 2017 +0530
summary:     amfnd: avoid null pointer access [#2213]

changeset:   8657:a203318fb21e
branch:      opensaf-5.1.x
parent:      8652:a7c62f1de1a3
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:19:02 2017 +0530
summary:     amfnd: avoid null pointer access [#2213]

changeset:   8658:136a8f432da6
tag:         tip
parent:      8655:45be1e612ab6
user:        Nagendra Kumar<nagendr...@oracle.com>
date:        Tue Mar 07 13:19:16 2017 +0530
summary:     amfnd: avoid null pointer access [#2213]

[staging:a56101]
[staging:a20331]
[staging:136a8f]




---

** [tickets:#2213] AMFND: Coredump if suFailover while shutting down**

**Status:** fixed
**Milestone:** 5.2.RC1
**Created:** Fri Dec 02, 2016 04:54 AM UTC by Minh Hon Chau
**Last Updated:** Tue Mar 07, 2017 07:21 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2213/attachment/log.tgz) 
(548.6 kB; application/x-compressed)


An amfnd coredump was seen on PL5 while the cluster was shutting down, with the backtrace below:
~~~
Thread 1 (Thread 0x7f92a8925780 (LWP 411)):
#0  __strcmp_sse2 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1358
No locals.
#1  0x00449cc9 in avsv_dblist_sastring_cmp (key1=, 
key2=) at util.c:361
i = 0
str1 = 
str2 = 
#2  0x7f92a84b1f95 in ncs_db_link_list_find (list_ptr=0x1ee89f0, 
key=0x656d6e6769737361 ) at ncsdlib.c:169
start_ptr = 0x1ee3168
#3  0x00416dc0 in avnd_comp_cmplete_all_csi_rec (cb=0x666940 
<_avnd_cb>, comp=0x1ee8200) at comp.cc:2652
curr = 0x1ee8060
prv = 0x1ee3150
__FUNCTION__ = "avnd_comp_cmplete_all_csi_rec"
#4  0x0040ca47 in avnd_instfail_su_failover (failed_comp=0x1ee8200, 
su=0x1ee74e0, cb=0x666940 <_avnd_cb>) at clc.cc:3161
rc = 
#5  avnd_comp_clc_st_chng_prc (cb=cb@entry=0x666940 <_avnd_cb>, 
comp=comp@entry=0x1ee8200, prv_st=prv_st@entry=
SA_AMF_PRESENCE_RESTARTING, 
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATION_FAILED) at clc.cc:967
csi = 0x0
__FUNCTION__ = "avnd_comp_clc_st_chng_prc"
ev = AVND_SU_PRES_FSM_EV_MAX
is_en = 
rc = 1
#6  0x0040f530 in avnd_comp_clc_fsm_run (cb=cb@entry=0x666940 
<_avnd_cb>, comp=comp@entry=0x1ee8200, ev=
AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_FAIL) at clc.cc:906
prv_st = 
final_st = 
rc = 1
__FUNCTION__ = "avnd_comp_clc_fsm_run"
#7  0x0040fdea in avnd_evt_clc_resp_evh (cb=0x666940 <_avnd_cb>, 
evt=0x7f9298c0) at clc.cc:414
__FUNCTION__ = "avnd_evt_clc_resp_evh"
ev = 
clc_evt = 0x7f9298e0
comp = 0x1ee8200
rc = 1
#8  0x0042676f in avnd_evt_process (evt=0x7f9298c0) at main.cc:626
cb = 0x666940 <_avnd_cb>
rc = 1
#9  avnd_main_process () at main.cc:577
ret = 
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 
0}, {fd = 0, events = 0, revents = 0}}
evt = 0x7f9298c0
__FUNCTION__ = "avnd_main_process"
result = 
rc = 
#10 0x004058f3 in main (argc=1, argv=0x7ffe700c5c78) at main.cc:202
error = 0
1358../sysdeps/x86_64/multiarch/../strcmp.S: No such file or directory.
~~~
In syslog of PL5:

2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=npm_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=npm_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=nway_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=nway_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=4,safSg=1,safApp=np

[tickets] [opensaf:tickets] #2338 amfd got crashed while changing role from queised to active

2017-03-06 Thread Nagendra Kumar
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2338] amfd got crashed while changing role from queised to active**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:42 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz) (2.8 MB; application/octet-stream)
- [syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z) (649.4 kB; application/octet-stream)


# Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )


# Summary
amfd crashed while changing role from quiesced to active.

# Steps followed & Observed behaviour
   1. Invoke switchovers.
   2. After a few successful switchovers, SC-1 got the active role and SC-2 got 
the standby role.
   3. Invoke one more switchover, in which SC-1 got the quiesced role and SC-2 
successfully became active. After this, cpd crashed on SC-2. While SC-1 was 
changing role from quiesced to active, amfd crashed on SC-1, which resulted in 
a cluster reset.

>> For the CPD crash, refer to ticket #2337.

Syslog of SC-1:
Mar  2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. 
Dumping incrementally to file imm.db
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for 
SaAmfNodeSwBundle, returned 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: 
avd_mds_qsd_role_evh: Assertion '0' failed.
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. 
Rebooting node
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 
EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 
131343, SupervisionTime = 60



BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, 
i_timeout=3) at src/base/osaf_poll.c:44
3  0x7f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=3) at 
src/base/osaf_poll.c:128
4  0x7f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", 
size=64) at src/rde/agent/rda_papi.cc:673
5  0x7f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at 
src/rde/agent/rda_papi.cc:150
6  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7  0x7f2e034209cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e04188958 in mdtm_process_recv_events () at 
src/mds/mds_dt_tipc.c:669
2  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3  0x7f2e034209cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0  0x7f2e034174f6 in poll () from /lib64/libc.so.6
1  0x7f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, 
i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x7f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3  0x7f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4  0x7f2e034209cd in clone () from /lib64/libc.so.6
5  0x in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0  0x7f2e0337bb55 in raise () from /lib64/libc.so.6
1  0x7f2e0337d131 in abort () from /lib64/libc.so.6
2  0x7f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f 
"src/amf/amfd/role.cc", __line=807,
__func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, 
AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0")
at src/base/sysf_def.c:281
3  0x7f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 
<_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4  0x7f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, 
evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5  0x7f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6  0x7f2e051566fd in main (argc=2, argv=0x7fff5826f318) at 
src/amf/amfd/main.cc:861
(gdb)





Notes:
1. Syslog of both controllers attached
2. amfd bt attached
3. amfd trace attached

The two nodes are not in time sync; there is a time gap between them.
Relative to SC-2, SC-1 is about 50 minutes ahead.
Time Diff
==
TestBed-R1:~  date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~  date
Thu Mar 2 15:44:30 IST 2017
=
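
For reference, the skew implied by the two `date` outputs above can be worked out as follows. This is a minimal sketch with hypothetical helper names; it assumes both readings are from the same day and time zone and were taken at roughly the same moment, so the result is only approximate.

```c
/* Convert a time of day to seconds since midnight. */
long hms_to_seconds(int h, int m, int s)
{
    return (long)h * 3600L + (long)m * 60L + (long)s;
}

/* Offset of clock A relative to clock B, in seconds.
 * Assumption: both readings fall on the same calendar day. */
long clock_offset_seconds(int a_h, int a_m, int a_s,
                          int b_h, int b_m, int b_s)
{
    return hms_to_seconds(a_h, a_m, a_s) - hms_to_seconds(b_h, b_m, b_s);
}
```

Calling `clock_offset_seconds(16, 34, 45, 15, 44, 30)` with the readings from TestBed-R1 and TestBed-R2 gives 3015 seconds, that is 50 minutes 15 seconds, which matches the "+50 min" noted above.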



[tickets] [opensaf:tickets] #2213 AMFND: Coredump if suFailover while shutting down

2017-03-06 Thread Nagendra Kumar
- **status**: assigned --> review
- **Version**:  --> 5.1 GA



---

** [tickets:#2213] AMFND: Coredump if suFailover while shutting down**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Fri Dec 02, 2016 04:54 AM UTC by Minh Hon Chau
**Last Updated:** Thu Mar 02, 2017 08:08 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2213/attachment/log.tgz) 
(548.6 kB; application/x-compressed)


An amfnd coredump was seen on PL5 while the cluster was shutting down, with the backtrace below:
~~~
Thread 1 (Thread 0x7f92a8925780 (LWP 411)):
#0  __strcmp_sse2 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1358
No locals.
#1  0x00449cc9 in avsv_dblist_sastring_cmp (key1=, 
key2=) at util.c:361
i = 0
str1 = 
str2 = 
#2  0x7f92a84b1f95 in ncs_db_link_list_find (list_ptr=0x1ee89f0, 
key=0x656d6e6769737361 ) at ncsdlib.c:169
start_ptr = 0x1ee3168
#3  0x00416dc0 in avnd_comp_cmplete_all_csi_rec (cb=0x666940 
<_avnd_cb>, comp=0x1ee8200) at comp.cc:2652
curr = 0x1ee8060
prv = 0x1ee3150
__FUNCTION__ = "avnd_comp_cmplete_all_csi_rec"
#4  0x0040ca47 in avnd_instfail_su_failover (failed_comp=0x1ee8200, 
su=0x1ee74e0, cb=0x666940 <_avnd_cb>) at clc.cc:3161
rc = 
#5  avnd_comp_clc_st_chng_prc (cb=cb@entry=0x666940 <_avnd_cb>, 
comp=comp@entry=0x1ee8200, prv_st=prv_st@entry=
SA_AMF_PRESENCE_RESTARTING, 
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATION_FAILED) at clc.cc:967
csi = 0x0
__FUNCTION__ = "avnd_comp_clc_st_chng_prc"
ev = AVND_SU_PRES_FSM_EV_MAX
is_en = 
rc = 1
#6  0x0040f530 in avnd_comp_clc_fsm_run (cb=cb@entry=0x666940 
<_avnd_cb>, comp=comp@entry=0x1ee8200, ev=
AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_FAIL) at clc.cc:906
prv_st = 
final_st = 
rc = 1
__FUNCTION__ = "avnd_comp_clc_fsm_run"
#7  0x0040fdea in avnd_evt_clc_resp_evh (cb=0x666940 <_avnd_cb>, 
evt=0x7f9298c0) at clc.cc:414
__FUNCTION__ = "avnd_evt_clc_resp_evh"
ev = 
clc_evt = 0x7f9298e0
comp = 0x1ee8200
rc = 1
#8  0x0042676f in avnd_evt_process (evt=0x7f9298c0) at main.cc:626
cb = 0x666940 <_avnd_cb>
rc = 1
#9  avnd_main_process () at main.cc:577
ret = 
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 
0}, {fd = 0, events = 0, revents = 0}}
evt = 0x7f9298c0
__FUNCTION__ = "avnd_main_process"
result = 
rc = 
#10 0x004058f3 in main (argc=1, argv=0x7ffe700c5c78) at main.cc:202
error = 0
1358../sysdeps/x86_64/multiarch/../strcmp.S: No such file or directory.
~~~
In syslog of PL5:

2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=npm_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=npm_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=nway_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=nway_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=4,safSg=1,safApp=npm_2' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=4,safSg=1,safApp=npm_2' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 amfclccli[729]: CLEANUP request 
'safComp=A,safSu=4,safSg=1,safApp=npm_2'
2016-11-20 22:01:21 PL-5 amfclccli[728]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=nway_1'
2016-11-20 22:01:21 PL-5 amfclccli[727]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=npm_1'
2016-11-20 22:02:12 PL-5 osafamfnd[411]: NO Removed 'safSi=2,safApp=nway_1' 
from 'safSu=3,safSg=1,safApp=nway_1'
2016-11-20 22:02:12 PL-5 osafimmnd[

[tickets] [opensaf:tickets] #2345 amf: standby controller reboots after switchover

2017-03-06 Thread Nagendra Kumar
- **summary**: amf: standby controller reboots during switchover --> amf: 
standby controller reboots after switchover



---

** [tickets:#2345] amf: standby controller reboots after switchover**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Mon Mar 06, 2017 07:27 AM UTC by Nagendra Kumar
**Last Updated:** Mon Mar 06, 2017 08:37 AM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
--
1. Make the following changes at Standby Amfd (SC-2) :
diff --git a/src/amf/amfnd/evt.cc b/src/amf/amfnd/evt.cc
--- a/src/amf/amfnd/evt.cc
+++ b/src/amf/amfnd/evt.cc
@@ -71,6 +71,11 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
/* fill the event specific fields */
switch (type) {
/* AvD event types */
+   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
+   case AVND_EVT_AVD_VERIFY_MSG:
+   evt->priority = NCS_IPC_PRIORITY_VERY_HIGH; /* bump up the priority */
+   evt->info.avd = (AVSV_DND_MSG *)info;
+   break;
case AVND_EVT_AVD_NODE_UP_MSG:
case AVND_EVT_AVD_REG_SU_MSG:
case AVND_EVT_AVD_REG_COMP_MSG:
@@ -79,12 +84,10 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
case AVND_EVT_AVD_PG_UPD_MSG:
case AVND_EVT_AVD_OPERATION_REQUEST_MSG:
case AVND_EVT_AVD_SU_PRES_MSG:
-   case AVND_EVT_AVD_VERIFY_MSG:
case AVND_EVT_AVD_ACK_MSG:
case AVND_EVT_AVD_SHUTDOWN_APP_SU_MSG:
case AVND_EVT_AVD_SET_LEDS_MSG:
case AVND_EVT_AVD_COMP_VALIDATION_RESP_MSG:
-   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
case AVND_EVT_AVD_ADMIN_OP_REQ_MSG:
case AVND_EVT_AVD_REBOOT_MSG:
case AVND_EVT_AVD_COMPCSI_ASSIGN_MSG:
diff --git a/src/amf/amfnd/mds.cc b/src/amf/amfnd/mds.cc
--- a/src/amf/amfnd/mds.cc
+++ b/src/amf/amfnd/mds.cc
@@ -543,6 +543,9 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
// DOWN is received for the old director ..
if (m_AVND_CB_IS_AVD_UP(cb)) {
m_AVND_CB_AVD_UP_RESET(cb);
+   LOG_NO("Before sleep");
+   sleep(2);
+   LOG_NO("After sleep");
}

evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
_info->i_dest, 0, 0, 0);

2. Start both controllers. Upload app configuration, hosting SC-1 on SC-2.
3. Issue amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
4. Then issue amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
Observed behaviour
--
Mar  6 12:53:58 PM_SC-2 osafamfnd[8501]: NO After sleep
Mar  6 12:54:01 PM_SC-2 osafamfnd[8501]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: Message ID mismatch, rec 2, expected 1, OwnNodeId = 
131599, SupervisionTime = 60
Mar  6 12:54:01 PM_SC-2 opensaf_reboot: Rebooting local node; timeout=60

This happens because the role change event (avnd_evt_avd_role_change_evh) is 
processed before avnd_evt_mds_avd_up_evh.

osafamfnd [9054:src/amf/amfnd/mds.cc:0540] NO AVD NEW_ACTIVE, adest:1
osafamfnd [9054:src/amf/amfnd/mds.cc:0546] NO Before sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0548] NO After sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0345] T1 Active AVD Adest = 565214626185244
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:9
osafamfnd [9054:src/amf/amfnd/verify.cc:0058] >> avnd_evt_avd_verify_evh: Data 
Verify message received from newly ACTIVE AVD
osafamfnd [9054:src/amf/amfnd/verify.cc:0071] T1 AVD send ID count: 7
osafamfnd [9054:src/amf/amfnd/verify.cc:0072] T1 AVND receive ID count: 7
osafamfnd [9054:src/amf/amfnd/di.cc:1087] >> avnd_di_ack_nack_msg_send: Receive 
id = 7
osafamfnd [9054:src/amf/amfnd/di.cc:1103] T1 MsgId=84,ACK=1
osafamfnd [9054:src/amf/amfnd/di.cc:1033] >> avnd_di_msg_send: Msg type '10'
osafamfnd [9054:src/amf/amfnd/di.cc:1043] T1 avnd_di_msg_send, Active AVD 
Adest: 565214626185244
osafamfnd [9054:src/amf/amfnd/mds.cc:1496] >> avnd_mds_red_send: Msg type '1'
osafamfnd [9054:src/amf/amfnd/mds.cc:1534] << avnd_mds_red_send: rc '1'
osafamfnd [9054:src/amf/amfnd/di.cc:1065] << avnd_di_msg_send: 1
osafamfnd [9054:src/amf/amfnd/di.cc:1112] << avnd_di_ack_nack_msg_send: retval=1
osafamfnd [9054:src/amf/amfnd/verify.cc:0095] T1 AVD receive ID count: 83
osafamfnd [9054:src/amf/amfnd/verify.cc:0096] T1 AVND send ID count: 83
osafamfnd [9054:src/amf/amfnd/di.cc:1571] >> avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/di.cc:1581] << avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/verify.cc:0143] << avnd_evt_avd_verify_evh
osafamfnd [9054:src/amf/amfnd/main.cc:0670] TR Evt Type:9 success
osafamfnd [9054:src/amf/amfnd/main.cc:0675] << avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:14
osafamfnd [9
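
The reordering described in this ticket can be modelled with a small sketch. It assumes a mailbox that keeps one FIFO per priority level and always drains the highest priority first, and that the AVD-up event is queued at a lower priority than the role change message. All type and function names here are hypothetical; this is not the OpenSAF IPC API.

```c
#include <stddef.h>

/* Hypothetical priority levels: a lower value is served first. */
enum prio { PRIO_VERY_HIGH = 0, PRIO_NORMAL = 1, PRIO_COUNT = 2 };

#define QCAP 8

/* One simple FIFO per priority level (illustrative model only). */
struct evt_queue {
    int items[PRIO_COUNT][QCAP];
    size_t head[PRIO_COUNT];
    size_t tail[PRIO_COUNT];
};

void q_init(struct evt_queue *q)
{
    for (int p = 0; p < PRIO_COUNT; p++) {
        q->head[p] = 0;
        q->tail[p] = 0;
    }
}

/* Append an event id at the given priority. Returns 0, or -1 if full. */
int q_post(struct evt_queue *q, enum prio p, int evt_id)
{
    if (q->tail[p] == QCAP)
        return -1;
    q->items[p][q->tail[p]++] = evt_id;
    return 0;
}

/* Take the next event: highest priority first, FIFO within a level.
 * Returns the event id, or -1 when every level is empty. */
int q_next(struct evt_queue *q)
{
    for (int p = 0; p < PRIO_COUNT; p++) {
        if (q->head[p] < q->tail[p])
            return q->items[p][q->head[p]++];
    }
    return -1;
}
```

If an "AVD up" event is posted first at `PRIO_NORMAL` and a "role change" event is posted afterwards at `PRIO_VERY_HIGH`, `q_next` hands out the role change first. In this model the arrival order is lost whenever two dependent events sit at different priority levels.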

[tickets] [opensaf:tickets] #2345 amf: standby controller reboots during switchover

2017-03-06 Thread Nagendra Kumar
- **status**: accepted --> review



---

** [tickets:#2345] amf: standby controller reboots during switchover**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Mon Mar 06, 2017 07:27 AM UTC by Nagendra Kumar
**Last Updated:** Mon Mar 06, 2017 07:27 AM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
--
1. Make the following changes at Standby Amfd (SC-2) :
diff --git a/src/amf/amfnd/evt.cc b/src/amf/amfnd/evt.cc
--- a/src/amf/amfnd/evt.cc
+++ b/src/amf/amfnd/evt.cc
@@ -71,6 +71,11 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
/* fill the event specific fields */
switch (type) {
/* AvD event types */
+   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
+   case AVND_EVT_AVD_VERIFY_MSG:
+   evt->priority = NCS_IPC_PRIORITY_VERY_HIGH; /* bump up the priority */
+   evt->info.avd = (AVSV_DND_MSG *)info;
+   break;
case AVND_EVT_AVD_NODE_UP_MSG:
case AVND_EVT_AVD_REG_SU_MSG:
case AVND_EVT_AVD_REG_COMP_MSG:
@@ -79,12 +84,10 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
case AVND_EVT_AVD_PG_UPD_MSG:
case AVND_EVT_AVD_OPERATION_REQUEST_MSG:
case AVND_EVT_AVD_SU_PRES_MSG:
-   case AVND_EVT_AVD_VERIFY_MSG:
case AVND_EVT_AVD_ACK_MSG:
case AVND_EVT_AVD_SHUTDOWN_APP_SU_MSG:
case AVND_EVT_AVD_SET_LEDS_MSG:
case AVND_EVT_AVD_COMP_VALIDATION_RESP_MSG:
-   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
case AVND_EVT_AVD_ADMIN_OP_REQ_MSG:
case AVND_EVT_AVD_REBOOT_MSG:
case AVND_EVT_AVD_COMPCSI_ASSIGN_MSG:
diff --git a/src/amf/amfnd/mds.cc b/src/amf/amfnd/mds.cc
--- a/src/amf/amfnd/mds.cc
+++ b/src/amf/amfnd/mds.cc
@@ -543,6 +543,9 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
// DOWN is received for the old director ..
if (m_AVND_CB_IS_AVD_UP(cb)) {
m_AVND_CB_AVD_UP_RESET(cb);
+   LOG_NO("Before sleep");
+   sleep(2);
+   LOG_NO("After sleep");
}

evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
_info->i_dest, 0, 0, 0);

2. Start both controllers. Upload app configuration, hosting SC-1 on SC-2.
3. Issue amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
4. Then issue amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
Observed behaviour
--
Mar  6 12:53:58 PM_SC-2 osafamfnd[8501]: NO After sleep
Mar  6 12:54:01 PM_SC-2 osafamfnd[8501]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: Message ID mismatch, rec 2, expected 1, OwnNodeId = 
131599, SupervisionTime = 60
Mar  6 12:54:01 PM_SC-2 opensaf_reboot: Rebooting local node; timeout=60

This happens because the role change event (avnd_evt_avd_role_change_evh) is 
processed before avnd_evt_mds_avd_up_evh.

osafamfnd [9054:src/amf/amfnd/mds.cc:0540] NO AVD NEW_ACTIVE, adest:1
osafamfnd [9054:src/amf/amfnd/mds.cc:0546] NO Before sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0548] NO After sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0345] T1 Active AVD Adest = 565214626185244
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:9
osafamfnd [9054:src/amf/amfnd/verify.cc:0058] >> avnd_evt_avd_verify_evh: Data 
Verify message received from newly ACTIVE AVD
osafamfnd [9054:src/amf/amfnd/verify.cc:0071] T1 AVD send ID count: 7
osafamfnd [9054:src/amf/amfnd/verify.cc:0072] T1 AVND receive ID count: 7
osafamfnd [9054:src/amf/amfnd/di.cc:1087] >> avnd_di_ack_nack_msg_send: Receive 
id = 7
osafamfnd [9054:src/amf/amfnd/di.cc:1103] T1 MsgId=84,ACK=1
osafamfnd [9054:src/amf/amfnd/di.cc:1033] >> avnd_di_msg_send: Msg type '10'
osafamfnd [9054:src/amf/amfnd/di.cc:1043] T1 avnd_di_msg_send, Active AVD 
Adest: 565214626185244
osafamfnd [9054:src/amf/amfnd/mds.cc:1496] >> avnd_mds_red_send: Msg type '1'
osafamfnd [9054:src/amf/amfnd/mds.cc:1534] << avnd_mds_red_send: rc '1'
osafamfnd [9054:src/amf/amfnd/di.cc:1065] << avnd_di_msg_send: 1
osafamfnd [9054:src/amf/amfnd/di.cc:1112] << avnd_di_ack_nack_msg_send: retval=1
osafamfnd [9054:src/amf/amfnd/verify.cc:0095] T1 AVD receive ID count: 83
osafamfnd [9054:src/amf/amfnd/verify.cc:0096] T1 AVND send ID count: 83
osafamfnd [9054:src/amf/amfnd/di.cc:1571] >> avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/di.cc:1581] << avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/verify.cc:0143] << avnd_evt_avd_verify_evh
osafamfnd [9054:src/amf/amfnd/main.cc:0670] TR Evt Type:9 success
osafamfnd [9054:src/amf/amfnd/main.cc:0675] << avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:14
osafamfnd [9054:src/amf/amfnd/di.cc:1503] >> avnd_evt_avd_role_change_evh
osafamfnd [9054:src/

[tickets] [opensaf:tickets] #2345 amf: standby controller reboots during switchover

2017-03-05 Thread Nagendra Kumar



---

** [tickets:#2345] amf: standby controller reboots during switchover**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Mon Mar 06, 2017 07:27 AM UTC by Nagendra Kumar
**Last Updated:** Mon Mar 06, 2017 07:27 AM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
--
1. Make the following changes at Standby Amfd (SC-2) :
diff --git a/src/amf/amfnd/evt.cc b/src/amf/amfnd/evt.cc
--- a/src/amf/amfnd/evt.cc
+++ b/src/amf/amfnd/evt.cc
@@ -71,6 +71,11 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
/* fill the event specific fields */
switch (type) {
/* AvD event types */
+   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
+   case AVND_EVT_AVD_VERIFY_MSG:
+   evt->priority = NCS_IPC_PRIORITY_VERY_HIGH; /* bump up the priority */
+   evt->info.avd = (AVSV_DND_MSG *)info;
+   break;
case AVND_EVT_AVD_NODE_UP_MSG:
case AVND_EVT_AVD_REG_SU_MSG:
case AVND_EVT_AVD_REG_COMP_MSG:
@@ -79,12 +84,10 @@ AVND_EVT *avnd_evt_create(AVND_CB *cb,
case AVND_EVT_AVD_PG_UPD_MSG:
case AVND_EVT_AVD_OPERATION_REQUEST_MSG:
case AVND_EVT_AVD_SU_PRES_MSG:
-   case AVND_EVT_AVD_VERIFY_MSG:
case AVND_EVT_AVD_ACK_MSG:
case AVND_EVT_AVD_SHUTDOWN_APP_SU_MSG:
case AVND_EVT_AVD_SET_LEDS_MSG:
case AVND_EVT_AVD_COMP_VALIDATION_RESP_MSG:
-   case AVND_EVT_AVD_ROLE_CHANGE_MSG:
case AVND_EVT_AVD_ADMIN_OP_REQ_MSG:
case AVND_EVT_AVD_REBOOT_MSG:
case AVND_EVT_AVD_COMPCSI_ASSIGN_MSG:
diff --git a/src/amf/amfnd/mds.cc b/src/amf/amfnd/mds.cc
--- a/src/amf/amfnd/mds.cc
+++ b/src/amf/amfnd/mds.cc
@@ -543,6 +543,9 @@ uint32_t avnd_mds_svc_evt(AVND_CB *cb, M
// DOWN is received for the old director ..
if (m_AVND_CB_IS_AVD_UP(cb)) {
m_AVND_CB_AVD_UP_RESET(cb);
+   LOG_NO("Before sleep");
+   sleep(2);
+   LOG_NO("After sleep");
}

evt = avnd_evt_create(cb, AVND_EVT_MDS_AVD_UP, 0, 
_info->i_dest, 0, 0, 0);

2. Start both controllers. Upload app configuration, hosting SC-1 on SC-2.
3. Issue amf-adm si-swap safSi=SC-2N,safApp=OpenSAF
4. Then issue amf-adm unlock-in safSu=SU1,safSg=AmfDemo_2N,safApp=AmfDemo1
Observed behaviour
--
Mar  6 12:53:58 PM_SC-2 osafamfnd[8501]: NO After sleep
Mar  6 12:54:01 PM_SC-2 osafamfnd[8501]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: Message ID mismatch, rec 2, expected 1, OwnNodeId = 
131599, SupervisionTime = 60
Mar  6 12:54:01 PM_SC-2 opensaf_reboot: Rebooting local node; timeout=60

This happens because the role change event (avnd_evt_avd_role_change_evh) is 
processed before avnd_evt_mds_avd_up_evh.

osafamfnd [9054:src/amf/amfnd/mds.cc:0540] NO AVD NEW_ACTIVE, adest:1
osafamfnd [9054:src/amf/amfnd/mds.cc:0546] NO Before sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0548] NO After sleep
osafamfnd [9054:src/amf/amfnd/mds.cc:0345] T1 Active AVD Adest = 565214626185244
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:9
osafamfnd [9054:src/amf/amfnd/verify.cc:0058] >> avnd_evt_avd_verify_evh: Data 
Verify message received from newly ACTIVE AVD
osafamfnd [9054:src/amf/amfnd/verify.cc:0071] T1 AVD send ID count: 7
osafamfnd [9054:src/amf/amfnd/verify.cc:0072] T1 AVND receive ID count: 7
osafamfnd [9054:src/amf/amfnd/di.cc:1087] >> avnd_di_ack_nack_msg_send: Receive 
id = 7
osafamfnd [9054:src/amf/amfnd/di.cc:1103] T1 MsgId=84,ACK=1
osafamfnd [9054:src/amf/amfnd/di.cc:1033] >> avnd_di_msg_send: Msg type '10'
osafamfnd [9054:src/amf/amfnd/di.cc:1043] T1 avnd_di_msg_send, Active AVD 
Adest: 565214626185244
osafamfnd [9054:src/amf/amfnd/mds.cc:1496] >> avnd_mds_red_send: Msg type '1'
osafamfnd [9054:src/amf/amfnd/mds.cc:1534] << avnd_mds_red_send: rc '1'
osafamfnd [9054:src/amf/amfnd/di.cc:1065] << avnd_di_msg_send: 1
osafamfnd [9054:src/amf/amfnd/di.cc:1112] << avnd_di_ack_nack_msg_send: retval=1
osafamfnd [9054:src/amf/amfnd/verify.cc:0095] T1 AVD receive ID count: 83
osafamfnd [9054:src/amf/amfnd/verify.cc:0096] T1 AVND send ID count: 83
osafamfnd [9054:src/amf/amfnd/di.cc:1571] >> avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/di.cc:1581] << avnd_di_resend_pg_start_track
osafamfnd [9054:src/amf/amfnd/verify.cc:0143] << avnd_evt_avd_verify_evh
osafamfnd [9054:src/amf/amfnd/main.cc:0670] TR Evt Type:9 success
osafamfnd [9054:src/amf/amfnd/main.cc:0675] << avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0647] >> avnd_evt_process
osafamfnd [9054:src/amf/amfnd/main.cc:0664] TR Evt type:14
osafamfnd [9054:src/amf/amfnd/di.cc:1503] >> avnd_evt_avd_role_change_evh
osafamfnd [9054:src/amf/amfnd/di.cc:1507] IN AVD is n

[tickets] [opensaf:tickets] #2321 Incorrect error messages "mkfifo already exists" observed in syslog

2017-03-01 Thread Nagendra Kumar
Thanks. I have sent a patch for review. Please review it.



---

** [tickets:#2321] Incorrect error messages "mkfifo already exists" observed in 
syslog**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Thu Feb 23, 2017 05:46 AM UTC by Ritu Raj
**Last Updated:** Thu Mar 02, 2017 07:00 AM UTC
**Owner:** Nagendra Kumar


# Environment details
OS : Suse 64bit
Changeset :  8603( 5.2.MO-1)

# Summary
Incorrect error messages "mkfifo already exists" observed in syslog after 
performing an OpenSAF stop and start operation.

# Steps
1. Start OpenSAF on a single controller.
2. Stop OpenSAF and start it again. While OpenSAF is starting again on the same 
node, the following error messages are observed in syslog for the components 
osafamfnd and osafamfwd:

Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: mkfifo already exists: 
/var/lib/opensaf/osafamfnd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: Started

Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: mkfifo already exists: 
/var/lib/opensaf/osafamfwd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: Started







[tickets] [opensaf:tickets] #2213 AMFND: Coredump if suFailover while shutting down

2017-03-01 Thread Nagendra Kumar
Will this work? :
diff --git a/src/amf/amfnd/comp.cc b/src/amf/amfnd/comp.cc
--- a/src/amf/amfnd/comp.cc
+++ b/src/amf/amfnd/comp.cc
@@ -2650,6 +2650,9 @@ void avnd_comp_cmplete_all_csi_rec(AVND_
/* generate csi-remove-done event... csi may be 
deleted */
(void)avnd_comp_csi_remove_done(cb, comp, curr);

+   if (curr == nullptr)
+   break;
+
if (0 == m_AVND_COMPDB_REC_CSI_GET(*comp, 
curr->name.c_str())) {
curr =
(prv) ?
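
The patch above touches a loop in which the current record may be deleted by the call made on it. As a general illustration of that hazard (not the amfnd code), here is a minimal sketch with hypothetical names `rec`, `rec_done` and `complete_all`. It assumes the callback frees at most the record it is given, and it saves the successor before making the call so that the freed record is never dereferenced.

```c
#include <stdlib.h>

/* Hypothetical record type standing in for a list entry such as a CSI record. */
struct rec {
    int id;
    struct rec *next;
};

/* Push a new record at the head of the list. Returns the new head,
 * or the unchanged head if allocation fails. */
struct rec *rec_push(struct rec *head, int id)
{
    struct rec *r = malloc(sizeof *r);
    if (r == NULL)
        return head;
    r->id = id;
    r->next = head;
    return r;
}

/* Hypothetical "done" handler: unlinks and frees `r` when its id is even.
 * Returns 1 if the record was freed, 0 otherwise. */
int rec_done(struct rec **head, struct rec *r)
{
    struct rec **pp;

    if (r->id % 2 != 0)
        return 0;
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == r) {
            *pp = r->next;
            free(r);
            return 1;
        }
    }
    return 0;
}

/* Walk the list and call rec_done on every record. The successor is read
 * before the call, so `curr` is never touched after it may have been freed.
 * Returns the number of freed records. */
int complete_all(struct rec **head)
{
    int freed = 0;
    struct rec *curr = *head;

    while (curr != NULL) {
        struct rec *next = curr->next; /* read before curr can go away */
        freed += rec_done(head, curr);
        curr = next;
    }
    return freed;
}
```

This pattern only helps when the callback can remove the current record and nothing else. If the callback may also remove the successor, the saved pointer becomes stale as well, and the walk would need to restart from the list head or re-look-up its position after each call.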



---

** [tickets:#2213] AMFND: Coredump if suFailover while shutting down**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Dec 02, 2016 04:54 AM UTC by Minh Hon Chau
**Last Updated:** Thu Mar 02, 2017 06:09 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2213/attachment/log.tgz) 
(548.6 kB; application/x-compressed)


An amfnd coredump was seen on PL5 while the cluster was shutting down, with the backtrace below:
~~~
Thread 1 (Thread 0x7f92a8925780 (LWP 411)):
#0  __strcmp_sse2 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1358
No locals.
#1  0x00449cc9 in avsv_dblist_sastring_cmp (key1=, 
key2=) at util.c:361
i = 0
str1 = 
str2 = 
#2  0x7f92a84b1f95 in ncs_db_link_list_find (list_ptr=0x1ee89f0, 
key=0x656d6e6769737361 ) at ncsdlib.c:169
start_ptr = 0x1ee3168
#3  0x00416dc0 in avnd_comp_cmplete_all_csi_rec (cb=0x666940 
<_avnd_cb>, comp=0x1ee8200) at comp.cc:2652
curr = 0x1ee8060
prv = 0x1ee3150
__FUNCTION__ = "avnd_comp_cmplete_all_csi_rec"
#4  0x0040ca47 in avnd_instfail_su_failover (failed_comp=0x1ee8200, 
su=0x1ee74e0, cb=0x666940 <_avnd_cb>) at clc.cc:3161
rc = 
#5  avnd_comp_clc_st_chng_prc (cb=cb@entry=0x666940 <_avnd_cb>, 
comp=comp@entry=0x1ee8200, prv_st=prv_st@entry=
SA_AMF_PRESENCE_RESTARTING, 
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATION_FAILED) at clc.cc:967
csi = 0x0
__FUNCTION__ = "avnd_comp_clc_st_chng_prc"
ev = AVND_SU_PRES_FSM_EV_MAX
is_en = 
rc = 1
#6  0x0040f530 in avnd_comp_clc_fsm_run (cb=cb@entry=0x666940 
<_avnd_cb>, comp=comp@entry=0x1ee8200, ev=
AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_FAIL) at clc.cc:906
prv_st = 
final_st = 
rc = 1
__FUNCTION__ = "avnd_comp_clc_fsm_run"
#7  0x0040fdea in avnd_evt_clc_resp_evh (cb=0x666940 <_avnd_cb>, 
evt=0x7f9298c0) at clc.cc:414
__FUNCTION__ = "avnd_evt_clc_resp_evh"
ev = 
clc_evt = 0x7f9298e0
comp = 0x1ee8200
rc = 1
#8  0x0042676f in avnd_evt_process (evt=0x7f9298c0) at main.cc:626
cb = 0x666940 <_avnd_cb>
rc = 1
#9  avnd_main_process () at main.cc:577
ret = 
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 
0}, {fd = 0, events = 0, revents = 0}}
evt = 0x7f9298c0
__FUNCTION__ = "avnd_main_process"
result = 
rc = 
#10 0x004058f3 in main (argc=1, argv=0x7ffe700c5c78) at main.cc:202
error = 0
1358    ../sysdeps/x86_64/multiarch/../strcmp.S: No such file or directory.
~~~
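
One detail in frame #2 of the backtrace above may help narrow this down. The `key` argument, 0x656d6e6769737361, does not look like the other heap pointers in the trace (0x1ee...). Read as eight little-endian bytes it is printable ASCII, which often means a pointer field was overwritten with string data, for example through a freed or overrun record. This is a reading of the dump, not a confirmed root cause. A minimal sketch of the decoding follows; `pointer_bytes_as_text` is a hypothetical helper, not part of OpenSAF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debugging helper (not OpenSAF code): render the 8 bytes of a
 * suspicious pointer value as text, in little-endian (x86_64 memory) order.
 * Non-printable bytes are shown as '.'. out must hold at least 9 chars. */
void pointer_bytes_as_text(uint64_t value, char out[9]) {
  assert(out != NULL);
  for (int i = 0; i < 8; i++) {
    /* Byte i of the value is the byte stored at address + i on x86_64. */
    unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xff);
    out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
  }
  out[8] = '\0';
}
```

Called with the key from frame #2, the helper yields `assignme`, so the list key appears to hold the start of a text string rather than an address.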
In syslog of PL5:

2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=npm_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=npm_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=nway_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=nway_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=4,safSg=1,safApp=npm_2' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=4,safSg=1,safApp=npm_2' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRest

[tickets] [opensaf:tickets] #2321 Incorrect error messages "mkfifo already exists" observed in syslog

2017-03-01 Thread Nagendra Kumar
- **status**: unassigned --> review
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2321] Incorrect error messages "mkfifo already exists" observed in 
syslog**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Thu Feb 23, 2017 05:46 AM UTC by Ritu Raj
**Last Updated:** Wed Mar 01, 2017 02:25 PM UTC
**Owner:** Nagendra Kumar


# Environment details
OS : Suse 64bit
Changeset :  8603( 5.2.MO-1)

# Summary
Incorrect "mkfifo already exists" error messages are observed in syslog after 
performing an OpenSAF stop and start operation.

# Steps
1. Start OpenSAF on a single controller.
2. Stop OpenSAF and start it again. While OpenSAF is starting again on the same node, 
the following error messages are observed in syslog for the components osafamfnd and 
osafamfwd:

Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: mkfifo already exists: 
/var/lib/opensaf/osafamfnd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: Started

Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: mkfifo already exists: 
/var/lib/opensaf/osafamfwd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: Started
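
The messages above are what `mkfifo()` produces when the FIFO from the previous run is still on disk: the call fails with `EEXIST`, which is harmless after a stop and start. A minimal sketch of one way to stay quiet in that case, assuming the daemon only needs the FIFO to exist; `ensure_fifo` is a hypothetical helper, not the OpenSAF implementation, and the actual fix may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not the OpenSAF implementation): make sure a FIFO
 * exists at path. A FIFO left over from the previous run is reused silently;
 * only real failures are reported. Returns 0 on success, -1 on failure. */
int ensure_fifo(const char *path, mode_t mode) {
  assert(path != NULL);
  if (mkfifo(path, mode) == 0) {
    return 0; /* created a fresh FIFO */
  }
  int saved_errno = errno; /* stat() below may overwrite errno */
  if (saved_errno == EEXIST) {
    struct stat st;
    if (stat(path, &st) == 0 && S_ISFIFO(st.st_mode)) {
      return 0; /* expected after a stop/start cycle: nothing to log */
    }
  }
  /* Either a different error, or the path exists but is not a FIFO. */
  fprintf(stderr, "mkfifo failed: %s: %s\n", path, strerror(saved_errno));
  return -1;
}
```

Calling `ensure_fifo` twice on the same path succeeds both times without printing anything, which matches the stop/start sequence in the steps.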







[tickets] [opensaf:tickets] #2213 AMFND: Coredump if suFailover while shutting down

2017-03-01 Thread Nagendra Kumar
Hi Minh, I cannot find any clue about the faults. Could you please provide 
traces if possible (via email)?
Thanks
-Nagu


---

** [tickets:#2213] AMFND: Coredump if suFailover while shutting down**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Dec 02, 2016 04:54 AM UTC by Minh Hon Chau
**Last Updated:** Wed Mar 01, 2017 10:57 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2213/attachment/log.tgz) 
(548.6 kB; application/x-compressed)


An amfnd coredump was seen on PL5 while the cluster was shutting down; the backtrace (bt) is below:
~~~
Thread 1 (Thread 0x7f92a8925780 (LWP 411)):
#0  __strcmp_sse2 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1358
No locals.
#1  0x00449cc9 in avsv_dblist_sastring_cmp (key1=, 
key2=) at util.c:361
i = 0
str1 = 
str2 = 
#2  0x7f92a84b1f95 in ncs_db_link_list_find (list_ptr=0x1ee89f0, 
key=0x656d6e6769737361 ) at ncsdlib.c:169
start_ptr = 0x1ee3168
#3  0x00416dc0 in avnd_comp_cmplete_all_csi_rec (cb=0x666940 
<_avnd_cb>, comp=0x1ee8200) at comp.cc:2652
curr = 0x1ee8060
prv = 0x1ee3150
__FUNCTION__ = "avnd_comp_cmplete_all_csi_rec"
#4  0x0040ca47 in avnd_instfail_su_failover (failed_comp=0x1ee8200, 
su=0x1ee74e0, cb=0x666940 <_avnd_cb>) at clc.cc:3161
rc = 
#5  avnd_comp_clc_st_chng_prc (cb=cb@entry=0x666940 <_avnd_cb>, 
comp=comp@entry=0x1ee8200, prv_st=prv_st@entry=
SA_AMF_PRESENCE_RESTARTING, 
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATION_FAILED) at clc.cc:967
csi = 0x0
__FUNCTION__ = "avnd_comp_clc_st_chng_prc"
ev = AVND_SU_PRES_FSM_EV_MAX
is_en = 
rc = 1
#6  0x0040f530 in avnd_comp_clc_fsm_run (cb=cb@entry=0x666940 
<_avnd_cb>, comp=comp@entry=0x1ee8200, ev=
AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_FAIL) at clc.cc:906
prv_st = 
final_st = 
rc = 1
__FUNCTION__ = "avnd_comp_clc_fsm_run"
#7  0x0040fdea in avnd_evt_clc_resp_evh (cb=0x666940 <_avnd_cb>, 
evt=0x7f9298c0) at clc.cc:414
__FUNCTION__ = "avnd_evt_clc_resp_evh"
ev = 
clc_evt = 0x7f9298e0
comp = 0x1ee8200
rc = 1
#8  0x0042676f in avnd_evt_process (evt=0x7f9298c0) at main.cc:626
cb = 0x666940 <_avnd_cb>
rc = 1
#9  avnd_main_process () at main.cc:577
ret = 
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 
0}, {fd = 0, events = 0, revents = 0}}
evt = 0x7f9298c0
__FUNCTION__ = "avnd_main_process"
result = 
rc = 
#10 0x004058f3 in main (argc=1, argv=0x7ffe700c5c78) at main.cc:202
error = 0
1358    ../sysdeps/x86_64/multiarch/../strcmp.S: No such file or directory.
~~~
In syslog of PL5:

2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=npm_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=npm_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=nway_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=nway_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=4,safSg=1,safApp=npm_2' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=4,safSg=1,safApp=npm_2' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 amfclccli[729]: CLEANUP request 
'safComp=A,safSu=4,safSg=1,safApp=npm_2'
2016-11-20 22:01:21 PL-5 amfclccli[728]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=nway_1'
2016-11-20 22:01:21 PL-5 amfclccli[727]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=npm_1'
2016-11-20 22:02:12 PL-5 osafamfnd[411]: NO Removed 'safSi=2,safApp=nway_1' 
from 'safSu=3,

[tickets] [opensaf:tickets] #2321 Incorrect error messages "mkfifo already exists" observed in syslog

2017-03-01 Thread Nagendra Kumar
Hi Hans N, are these logs required in syslog?


---

** [tickets:#2321] Incorrect error messages "mkfifo already exists" observed in 
syslog**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Thu Feb 23, 2017 05:46 AM UTC by Ritu Raj
**Last Updated:** Wed Mar 01, 2017 07:39 AM UTC
**Owner:** nobody


# Environment details
OS : Suse 64bit
Changeset :  8603( 5.2.MO-1)

# Summary
Incorrect "mkfifo already exists" error messages are observed in syslog after 
performing an OpenSAF stop and start operation.

# Steps
1. Start OpenSAF on a single controller.
2. Stop OpenSAF and start it again. While OpenSAF is starting again on the same node, 
the following error messages are observed in syslog for the components osafamfnd and 
osafamfwd:

Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: mkfifo already exists: 
/var/lib/opensaf/osafamfnd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfnd[21955]: Started

Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: mkfifo already exists: 
/var/lib/opensaf/osafamfwd.fifo File exists
Feb 23 16:21:34 SO-SLOT-1 osafamfwd[22062]: Started







[tickets] [opensaf:tickets] #2213 AMFND: Coredump if suFailover while shutting down

2017-03-01 Thread Nagendra Kumar
- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2213] AMFND: Coredump if suFailover while shutting down**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Dec 02, 2016 04:54 AM UTC by Minh Hon Chau
**Last Updated:** Fri Dec 02, 2016 04:54 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2213/attachment/log.tgz) 
(548.6 kB; application/x-compressed)


An amfnd coredump was seen on PL5 while the cluster was shutting down; the backtrace (bt) is below:
~~~
Thread 1 (Thread 0x7f92a8925780 (LWP 411)):
#0  __strcmp_sse2 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1358
No locals.
#1  0x00449cc9 in avsv_dblist_sastring_cmp (key1=, 
key2=) at util.c:361
i = 0
str1 = 
str2 = 
#2  0x7f92a84b1f95 in ncs_db_link_list_find (list_ptr=0x1ee89f0, 
key=0x656d6e6769737361 ) at ncsdlib.c:169
start_ptr = 0x1ee3168
#3  0x00416dc0 in avnd_comp_cmplete_all_csi_rec (cb=0x666940 
<_avnd_cb>, comp=0x1ee8200) at comp.cc:2652
curr = 0x1ee8060
prv = 0x1ee3150
__FUNCTION__ = "avnd_comp_cmplete_all_csi_rec"
#4  0x0040ca47 in avnd_instfail_su_failover (failed_comp=0x1ee8200, 
su=0x1ee74e0, cb=0x666940 <_avnd_cb>) at clc.cc:3161
rc = 
#5  avnd_comp_clc_st_chng_prc (cb=cb@entry=0x666940 <_avnd_cb>, 
comp=comp@entry=0x1ee8200, prv_st=prv_st@entry=
SA_AMF_PRESENCE_RESTARTING, 
final_st=final_st@entry=SA_AMF_PRESENCE_TERMINATION_FAILED) at clc.cc:967
csi = 0x0
__FUNCTION__ = "avnd_comp_clc_st_chng_prc"
ev = AVND_SU_PRES_FSM_EV_MAX
is_en = 
rc = 1
#6  0x0040f530 in avnd_comp_clc_fsm_run (cb=cb@entry=0x666940 
<_avnd_cb>, comp=comp@entry=0x1ee8200, ev=
AVND_COMP_CLC_PRES_FSM_EV_CLEANUP_FAIL) at clc.cc:906
prv_st = 
final_st = 
rc = 1
__FUNCTION__ = "avnd_comp_clc_fsm_run"
#7  0x0040fdea in avnd_evt_clc_resp_evh (cb=0x666940 <_avnd_cb>, 
evt=0x7f9298c0) at clc.cc:414
__FUNCTION__ = "avnd_evt_clc_resp_evh"
ev = 
clc_evt = 0x7f9298e0
comp = 0x1ee8200
rc = 1
#8  0x0042676f in avnd_evt_process (evt=0x7f9298c0) at main.cc:626
cb = 0x666940 <_avnd_cb>
rc = 1
#9  avnd_main_process () at main.cc:577
ret = 
fds = {{fd = 12, events = 1, revents = 1}, {fd = 16, events = 1, 
revents = 0}, {fd = 14, events = 1, revents = 
0}, {fd = 0, events = 0, revents = 0}}
evt = 0x7f9298c0
__FUNCTION__ = "avnd_main_process"
result = 
rc = 
#10 0x004058f3 in main (argc=1, argv=0x7ffe700c5c78) at main.cc:202
error = 0
1358    ../sysdeps/x86_64/multiarch/../strcmp.S: No such file or directory.
~~~
In syslog of PL5:

2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=npm_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=npm_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=npm_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=3,safSg=1,safApp=nway_1' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=3,safSg=1,safApp=nway_1' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=3,safSg=1,safApp=nway_1' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
component restart probation timer started (timeout: 600 ns)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO Restarting a component of 
'safSu=4,safSg=1,safApp=npm_2' (comp restart count: 1)
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 
'safComp=A,safSu=4,safSg=1,safApp=npm_2' faulted due to 
'csiRemovecallbackTimeout' : Recovery is 'componentRestart'
2016-11-20 22:01:21 PL-5 osafamfnd[411]: NO 'safSu=4,safSg=1,safApp=npm_2' 
Presence State INSTANTIATED => RESTARTING
2016-11-20 22:01:21 PL-5 amfclccli[729]: CLEANUP request 
'safComp=A,safSu=4,safSg=1,safApp=npm_2'
2016-11-20 22:01:21 PL-5 amfclccli[728]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=nway_1'
2016-11-20 22:01:21 PL-5 amfclccli[727]: CLEANUP request 
'safComp=A,safSu=3,safSg=1,safApp=npm_1'
2016-11-20 22:02:12 PL-5 osafamfnd[411]: NO Removed 'safSi=2,safApp=nway_1' 
from 'safSu=3,safSg=1,safApp=nway_1'
2016-11-20 22:02:12 PL-5 

[tickets] [opensaf:tickets] #2106 amf: Admin Operations on middleware SUs / SIs should not be supported

2017-03-01 Thread Nagendra Kumar
- **status**: accepted --> review



---

** [tickets:#2106] amf: Admin Operations on middleware SUs  / SIs should not be 
supported**

**Status:** review
**Milestone:** 5.2.RC1
**Created:** Sun Oct 09, 2016 11:18 AM UTC by Srikanth R
**Last Updated:** Wed Mar 01, 2017 10:02 AM UTC
**Owner:** Nagendra Kumar


Changeset : 8190 5.1.GA

-> Bring up a single controller SC-1
-> Now perform lock and unlock operations on a middleware SU, i.e. 
safSu=SC-2,safSg=NoRed,safApp=OpenSAF, which is hosted on SC-2.
-> The admin lock operation succeeds, but the admin unlock operation times out with the 
assignment to one of the middleware SIs.

 Following is the opensafd status after the unlock operation.
 
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)

  Admin operations on middleware objects should not be supported.
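
One way to express the requested behaviour is a check on the target DN before an admin operation is accepted. The sketch below assumes that middleware objects are exactly those under `safApp=OpenSAF`, as the SU in this ticket is; `is_middleware_dn` is a hypothetical name, and where AMF would actually hook such a check is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: middleware objects are the ones whose DN ends in this RDN,
 * as in the SU named in the ticket. */
static const char kMiddlewareApp[] = "safApp=OpenSAF";

/* Hypothetical check (not OpenSAF code): true when dn is the middleware
 * application itself or an object below it. */
bool is_middleware_dn(const char *dn) {
  assert(dn != NULL);
  size_t dn_len = strlen(dn);
  size_t app_len = sizeof(kMiddlewareApp) - 1;
  if (dn_len < app_len) {
    return false;
  }
  const char *tail = dn + (dn_len - app_len);
  if (strcmp(tail, kMiddlewareApp) != 0) {
    return false;
  }
  /* The match must start at an RDN boundary, so "xsafApp=OpenSAF" fails. */
  return tail == dn || tail[-1] == ',';
}
```

With this check, `safSu=SC-2,safSg=NoRed,safApp=OpenSAF` is classified as middleware, so a lock or unlock request on it could be rejected up front instead of timing out.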




[tickets] [opensaf:tickets] #2106 amf: Admin Operations on middleware SUs / SIs should not be supported

2017-03-01 Thread Nagendra Kumar
- **status**: unassigned --> accepted
- **assigned_to**: Nagendra Kumar
- **Part**: - --> d
- **Version**:  --> 5.1 GA



---

** [tickets:#2106] amf: Admin Operations on middleware SUs  / SIs should not be 
supported**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Sun Oct 09, 2016 11:18 AM UTC by Srikanth R
**Last Updated:** Sun Oct 09, 2016 11:18 AM UTC
**Owner:** Nagendra Kumar


Changeset : 8190 5.1.GA

-> Bring up a single controller SC-1
-> Now perform lock and unlock operations on a middleware SU, i.e. 
safSu=SC-2,safSg=NoRed,safApp=OpenSAF, which is hosted on SC-2.
-> The admin lock operation succeeds, but the admin unlock operation times out with the 
assignment to one of the middleware SIs.

 Following is the opensafd status after the unlock operation.
 
safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)
safSISU=safSu=SC-2\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
saAmfSISUHAState=ACTIVE(1)

  Admin operations on middleware objects should not be supported.




[tickets] [opensaf:tickets] #2101 AMF: Heartbeat timeout observed after ImmNd restart

2017-02-28 Thread Nagendra Kumar
- **status**: unassigned --> accepted
- **assigned_to**: Nagendra Kumar



---

** [tickets:#2101] AMF: Heartbeat timeout observed after ImmNd restart**

**Status:** accepted
**Milestone:** 5.2.RC1
**Created:** Fri Oct 07, 2016 01:26 PM UTC by Chani Srivastava
**Last Updated:** Wed Mar 01, 2017 06:17 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[messages](https://sourceforge.net/p/opensaf/tickets/2101/attachment/messages) 
(777.0 kB; application/octet-stream)
- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/2101/attachment/osafamfd) 
(5.6 MB; application/octet-stream)


OS : Suse 64bit
Changeset : 8190
Setup : 4 physical nodes, 1 PBE enabled with 1Lakh load

Steps 
1. Bring up OpenSAF on four nodes.
2. Run IMM test cases with the ndrestart scenario on the standby controller.
3. After one of the ndrestarts, an amfd heartbeat timeout happened. Below is the 
backtrace.

Coredump:
0 0x7fa7c680161c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
1  0x0044da7d in avd_imm_reinit_bg () at imm.cc:1949
2  0x00453e33 in main_loop () at main.cc:737
3  0x004541ff in main (argc=2, argv=0x7fff1cbfa268) at main.cc:848
(gdb) thread apply all bt

Thread 6 (Thread 0x7fa7c78fcb00 (LWP 3837)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout_ts=0x7fa7c78fc180, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout=3) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=13, i_timeout=3) at 
osaf_poll.c:128
4  0x7fa7c65f3a0c in rda_read_msg (sockfd=13, msg=0x7fa7c78fc260 "10 2", 
size=64) at rda_papi.cc:673
5  0x7fa7c65f31ec in rda_callback_task (rda_callback_cb=0x7963e0) at 
rda_papi.cc:150
6  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
7  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 5 (Thread 0x7fa7c4e0b700 (LWP 5557)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout_ts=0x7fa7c4e09d00, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout=1) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=29, i_timeout=1) at 
osaf_poll.c:128
4  0x7fa7c7535605 in mds_mcm_time_wait (sel_obj=0x80c3d8, time_val=1000) at 
mds_c_sndrcv.c:2570
5  0x7fa7c75351b1 in mcm_pvt_normal_svc_sndrsp (env_hdl=131071, 
fr_svc_id=26, msg=0x7fa7c4e0a0c0, to_dest=565214705385538, to_svc_id=25, 
req=0x7fa7c4e09e60, pri=MDS_SEND_PRIORITY_MEDIUM)
   at mds_c_sndrcv.c:2457
6  0x7fa7c7530d87 in mds_mcm_send (info=0x7fa7c4e09f90) at 
mds_c_sndrcv.c:690
7  0x7fa7c7530170 in mds_send (info=0x7fa7c4e09f90) at mds_c_sndrcv.c:390
8  0x7fa7c752fde1 in ncsmds_api (svc_to_mds_info=0x7fa7c4e09f90) at 
mds_papi.c:191
9  0x7fa7c6e527a7 in imma_mds_msg_sync_send (imma_mds_hdl=131071, 
destination=0x7fa7c707a850 <imma_cb+144>, i_evt=0x7fa7c4e0a0c0, 
o_evt=0x7fa7c4e0a058, timeout=1000) at imma_mds.c:604
10 0x7fa7c6e47bd6 in search_next_common (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a298, attributes=0x7fa7c4e0a318, bUseString=false) at 
imma_om_api.c:7584
11 0x7fa7c6e475f9 in saImmOmSearchNext_2 (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a370, attributes=0x7fa7c4e0a318) at imma_om_api.c:7444
12 0x004fb140 in immutil_saImmOmSearchNext_2 
(searchHandle=1475865130618188739, objectName=0x7fa7c4e0a370, 
attributes=0x7fa7c4e0a318) at immutil.c:1818
13 0x00431bc2 in avd_compcstype_config_get 
(name="safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF", comp=0x806e50) at 
compcstype.cc:306
14 0x00429c5c in avd_comp_config_get 
(su_name="safSu=SC-2,safSg=2N,safApp=OpenSAF", su=0x7b08c0) at comp.cc:756
15 0x004d9f63 in avd_su_config_get (sg_name="safSg=2N,safApp=OpenSAF", 
sg=0x7bb4c0) at su.cc:717
16 0x0048c7dc in avd_sg_config_get (app_dn="safApp=OpenSAF", 
app=0x7c08b0) at sg.cc:457
17 0x0040a88a in avd_app_config_get () at app.cc:460
18 0x0044c154 in avd_imm_config_get () at imm.cc:1574
19 0x0044d6e6 in avd_imm_reinit_bg_thread (_cb=0x75dba0 
<_control_block>) at imm.cc:1891
20 0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
21 0x7fa7c571b9cd in clone () from /lib64/libc.so.6
22 0x in ?? ()

Thread 4 (Thread 0x7fa7c791cb00 (LWP 3836)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c7549696 in mdtm_process_recv_events () at mds_dt_tipc.c:665
2  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
3  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
4  0x in ?? ()

Thread 3 (Thread 0x7fa7b700 (LWP 5586)):
0  0x7fa7c6804294 in __lll_lock_wait () from /li

[tickets] [opensaf:tickets] #2101 AMF: Heartbeat timeout observed after ImmNd restart

2017-02-28 Thread Nagendra Kumar
This might have been fixed by:
changeset:   8365:1108027c16f0
branch:  opensaf-5.1.x
parent:  8362:13742b479d92
user:Praveen Malviya 
date:Fri Nov 25 15:56:05 2016 +0530
summary: amfd: do not spawn multiple threads for imm init[#2188] V2

Please retest on 5.2 FC release.


---

** [tickets:#2101] AMF: Heartbeat timeout observed after ImmNd restart**

**Status:** unassigned
**Milestone:** 5.2.RC1
**Created:** Fri Oct 07, 2016 01:26 PM UTC by Chani Srivastava
**Last Updated:** Fri Oct 07, 2016 02:42 PM UTC
**Owner:** nobody
**Attachments:**

- 
[messages](https://sourceforge.net/p/opensaf/tickets/2101/attachment/messages) 
(777.0 kB; application/octet-stream)
- 
[osafamfd](https://sourceforge.net/p/opensaf/tickets/2101/attachment/osafamfd) 
(5.6 MB; application/octet-stream)


OS : Suse 64bit
Changeset : 8190
Setup : 4 physical nodes, 1 PBE enabled with 1Lakh load

Steps 
1. Bring up OpenSAF on four nodes.
2. Run IMM test cases with the ndrestart scenario on the standby controller.
3. After one of the ndrestarts, an amfd heartbeat timeout happened. Below is the 
backtrace.

Coredump:
0 0x7fa7c680161c in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
1  0x0044da7d in avd_imm_reinit_bg () at imm.cc:1949
2  0x00453e33 in main_loop () at main.cc:737
3  0x004541ff in main (argc=2, argv=0x7fff1cbfa268) at main.cc:848
(gdb) thread apply all bt

Thread 6 (Thread 0x7fa7c78fcb00 (LWP 3837)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout_ts=0x7fa7c78fc180, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c78fc1c0, i_nfds=1, 
i_timeout=3) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=13, i_timeout=3) at 
osaf_poll.c:128
4  0x7fa7c65f3a0c in rda_read_msg (sockfd=13, msg=0x7fa7c78fc260 "10 2", 
size=64) at rda_papi.cc:673
5  0x7fa7c65f31ec in rda_callback_task (rda_callback_cb=0x7963e0) at 
rda_papi.cc:150
6  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
7  0x7fa7c571b9cd in clone () from /lib64/libc.so.6
8  0x in ?? ()

Thread 5 (Thread 0x7fa7c4e0b700 (LWP 5557)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c74f80d3 in osaf_ppoll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout_ts=0x7fa7c4e09d00, i_sigmask=0x0) at osaf_poll.c:105
2  0x7fa7c74f7ff9 in osaf_poll (io_fds=0x7fa7c4e09d40, i_nfds=1, 
i_timeout=1) at osaf_poll.c:44
3  0x7fa7c74f81c8 in osaf_poll_one_fd (i_fd=29, i_timeout=1) at 
osaf_poll.c:128
4  0x7fa7c7535605 in mds_mcm_time_wait (sel_obj=0x80c3d8, time_val=1000) at 
mds_c_sndrcv.c:2570
5  0x7fa7c75351b1 in mcm_pvt_normal_svc_sndrsp (env_hdl=131071, 
fr_svc_id=26, msg=0x7fa7c4e0a0c0, to_dest=565214705385538, to_svc_id=25, 
req=0x7fa7c4e09e60, pri=MDS_SEND_PRIORITY_MEDIUM)
   at mds_c_sndrcv.c:2457
6  0x7fa7c7530d87 in mds_mcm_send (info=0x7fa7c4e09f90) at 
mds_c_sndrcv.c:690
7  0x7fa7c7530170 in mds_send (info=0x7fa7c4e09f90) at mds_c_sndrcv.c:390
8  0x7fa7c752fde1 in ncsmds_api (svc_to_mds_info=0x7fa7c4e09f90) at 
mds_papi.c:191
9  0x7fa7c6e527a7 in imma_mds_msg_sync_send (imma_mds_hdl=131071, 
destination=0x7fa7c707a850 <imma_cb+144>, i_evt=0x7fa7c4e0a0c0, 
o_evt=0x7fa7c4e0a058, timeout=1000) at imma_mds.c:604
10 0x7fa7c6e47bd6 in search_next_common (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a298, attributes=0x7fa7c4e0a318, bUseString=false) at 
imma_om_api.c:7584
11 0x7fa7c6e475f9 in saImmOmSearchNext_2 (searchHandle=1475865130618188739, 
objectName=0x7fa7c4e0a370, attributes=0x7fa7c4e0a318) at imma_om_api.c:7444
12 0x004fb140 in immutil_saImmOmSearchNext_2 
(searchHandle=1475865130618188739, objectName=0x7fa7c4e0a370, 
attributes=0x7fa7c4e0a318) at immutil.c:1818
13 0x00431bc2 in avd_compcstype_config_get 
(name="safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF", comp=0x806e50) at 
compcstype.cc:306
14 0x00429c5c in avd_comp_config_get 
(su_name="safSu=SC-2,safSg=2N,safApp=OpenSAF", su=0x7b08c0) at comp.cc:756
15 0x004d9f63 in avd_su_config_get (sg_name="safSg=2N,safApp=OpenSAF", 
sg=0x7bb4c0) at su.cc:717
16 0x0048c7dc in avd_sg_config_get (app_dn="safApp=OpenSAF", 
app=0x7c08b0) at sg.cc:457
17 0x0040a88a in avd_app_config_get () at app.cc:460
18 0x0044c154 in avd_imm_config_get () at imm.cc:1574
19 0x0044d6e6 in avd_imm_reinit_bg_thread (_cb=0x75dba0 
<_control_block>) at imm.cc:1891
20 0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
21 0x7fa7c571b9cd in clone () from /lib64/libc.so.6
22 0x in ?? ()

Thread 4 (Thread 0x7fa7c791cb00 (LWP 3836)):
0  0x7fa7c57124f6 in poll () from /lib64/libc.so.6
1  0x7fa7c7549696 in mdtm_process_recv_events () at mds_dt_tipc.c:665
2  0x7fa7c67fd7b6 in start_thread () from /lib64/libpthread.so.0
3  

[tickets] [opensaf:tickets] #2322 Amf: Amfd crashed during cold sync

2017-02-27 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8619:addb5d754d71
branch:  opensaf-5.0.x
parent:  8607:adc96bde4277
user:    Nagendra Kumar<nagendr...@oracle.com>
date:Mon Feb 27 13:49:09 2017 +0530
files:   osaf/services/saf/amf/amfd/sutcomptype.cc
description:
amfd: fix null ptr accessing issue [#2322]


changeset:   8620:5be5a71b5069
branch:  opensaf-5.1.x
parent:  8617:37f663fdfaaa
user:    Nagendra Kumar<nagendr...@oracle.com>
date:Mon Feb 27 13:49:44 2017 +0530
files:   osaf/services/saf/amf/amfd/sutcomptype.cc
description:
amfd: fix null ptr accessing issue [#2322]


changeset:   8621:7e68a07b0742
tag: tip
parent:  8618:2e299171c00e
user:    Nagendra Kumar<nagendr...@oracle.com>
date:Mon Feb 27 13:50:04 2017 +0530
files:   src/amf/amfd/sutcomptype.cc
description:
amfd: fix null ptr accessing issue [#2322]


[staging:addb5d]
[staging:5be5a7]
[staging:7e68a0]



---

** [tickets:#2322] Amf: Amfd crashed during cold sync**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Thu Feb 23, 2017 11:33 AM UTC by Nagendra Kumar
**Last Updated:** Thu Feb 23, 2017 12:36 PM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
1. Start SC-1, upload a demo app.
2. Add a 5-second delay at the following point in the code:
diff --git a/src/amf/amfd/imm.cc b/src/amf/amfd/imm.cc
--- a/src/amf/amfd/imm.cc
+++ b/src/amf/amfd/imm.cc
@@ -1602,6 +1602,10 @@ unsigned int avd_imm_config_get(void)
if (avd_comptype_config_get() != SA_AIS_OK)
goto done;

+   LOG_ER("1.  Before sleep ");
+   sleep(5);
+   LOG_ER("2.  After sleep");
+
/* SaAmfSUType needed by SaAmfSGType */
if (avd_sutype_config_get() != SA_AIS_OK)
goto done;
3. Start SC-2. When it prints "Before sleep", run the command below on SC-1:
immcfg -d 
"safMemberCompType=safVersion=1\,safCompType=AmfDemo1,safVersion=1,safSuType=AmfDemo1"

Observed behaviour
--

When SC-2 comes up after sleep, it crashes at:
Core was generated by `/usr/local/lib/opensaf/osafamfd --tracemask=0x'.
Program terminated with signal 11, Segmentation fault.
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at 
src/amf/amfd/sutcomptype.cc:131
131 if (sutcomptype->curr_num_components == 0) {
(gdb) bt
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at 
src/amf/amfd/sutcomptype.cc:131
#1  0x7f64bc2d9757 in ccb_completed_cb(unsigned long long, unsigned long 
long) () at src/amf/amfd/imm.cc:1065
#2  0x7f64bb81dd64 in imma_process_callback_info(imma_cb*, 
imma_client_node*, imma_callback_info*, unsigned long long) ()
at src/imm/agent/imma_proc.cc:2174
#3  0x7f64bb81f739 in imma_hdl_callbk_dispatch_all(imma_cb*, unsigned long 
long) () at src/imm/agent/imma_proc.cc:1732
#4  0x7f64bb81659a in saImmOiDispatch () at src/imm/agent/imma_oi_api.cc:609
#5  0x7f64bc296ab8 in main () at src/amf/amfd/main.cc:729
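
The crashing line dereferences `sutcomptype`, and the fix is described only as a null pointer access fix. A plausible reading is that SC-2 had not yet read this object (its configuration read was paused by the sleep) when the delete CCB arrived, so the lookup returned null. A minimal sketch of guarding that dereference follows. The types, function names and return codes are hypothetical; only the field `curr_num_components` comes from the ticket, and how the real patch reacts to a missing object is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the amfd type; only curr_num_components is
 * taken from the crashing line in the ticket. */
typedef struct {
  const char *dn;
  unsigned curr_num_components;
} SutCompType;

typedef enum {
  DELETE_ALLOWED,
  DELETE_REJECTED_IN_USE,
  DELETE_OBJECT_NOT_LOADED
} DeleteCheck;

/* Linear lookup; returns NULL when dn is not in the local database. */
static const SutCompType *find_sutcomptype(const SutCompType *db, size_t count,
                                           const char *dn) {
  for (size_t i = 0; i < count; i++) {
    if (strcmp(db[i].dn, dn) == 0) {
      return &db[i];
    }
  }
  return NULL;
}

/* Decide whether a delete request may proceed, without assuming that the
 * object is already known locally. */
DeleteCheck check_delete(const SutCompType *db, size_t count, const char *dn) {
  assert(dn != NULL);
  const SutCompType *sutcomptype = find_sutcomptype(db, count, dn);
  if (sutcomptype == NULL) {
    /* Guard: the standby may not have read this object yet. */
    return DELETE_OBJECT_NOT_LOADED;
  }
  return sutcomptype->curr_num_components == 0 ? DELETE_ALLOWED
                                               : DELETE_REJECTED_IN_USE;
}
```

Returning a distinct code for the missing object keeps the decision about what to do in that case (accept, reject, or retry later) with the caller.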






[tickets] [opensaf:tickets] #2322 Amf: Amfd crashed during cold sync

2017-02-23 Thread Nagendra Kumar
- **status**: accepted --> review



---

** [tickets:#2322] Amf: Amfd crashed during cold sync**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Feb 23, 2017 11:33 AM UTC by Nagendra Kumar
**Last Updated:** Thu Feb 23, 2017 11:33 AM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
1. Start SC-1, upload a demo app.
2. Add a 5-second delay at the following point in the code:
diff --git a/src/amf/amfd/imm.cc b/src/amf/amfd/imm.cc
--- a/src/amf/amfd/imm.cc
+++ b/src/amf/amfd/imm.cc
@@ -1602,6 +1602,10 @@ unsigned int avd_imm_config_get(void)
if (avd_comptype_config_get() != SA_AIS_OK)
goto done;

+   LOG_ER("1.  Before sleep ");
+   sleep(5);
+   LOG_ER("2.  After sleep");
+
/* SaAmfSUType needed by SaAmfSGType */
if (avd_sutype_config_get() != SA_AIS_OK)
goto done;
3. Start SC-2. When it prints "Before sleep", run the following command on SC-1:
immcfg -d 
"safMemberCompType=safVersion=1\,safCompType=AmfDemo1,safVersion=1,safSuType=AmfDemo1"

Observed behaviour
--

When SC-2 comes up after sleep, it crashes at:
Core was generated by `/usr/local/lib/opensaf/osafamfd --tracemask=0x'.
Program terminated with signal 11, Segmentation fault.
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at src/amf/amfd/sutcomptype.cc:131
131 if (sutcomptype->curr_num_components == 0) {
(gdb) bt
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at src/amf/amfd/sutcomptype.cc:131
#1  0x7f64bc2d9757 in ccb_completed_cb(unsigned long long, unsigned long long) () at src/amf/amfd/imm.cc:1065
#2  0x7f64bb81dd64 in imma_process_callback_info(imma_cb*, imma_client_node*, imma_callback_info*, unsigned long long) () at src/imm/agent/imma_proc.cc:2174
#3  0x7f64bb81f739 in imma_hdl_callbk_dispatch_all(imma_cb*, unsigned long long) () at src/imm/agent/imma_proc.cc:1732
#4  0x7f64bb81659a in saImmOiDispatch () at src/imm/agent/imma_oi_api.cc:609
#5  0x7f64bc296ab8 in main () at src/amf/amfd/main.cc:729
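
The faulting line dereferences `sutcomptype`, so one plausible reading of the trace is that the lookup for the object being deleted returned a null pointer on SC-2, which had not yet read that part of the configuration. The following is only a sketch of a guarded lookup under that assumption. `SuTypeCompType`, `SuTypeCompTypeDb`, `CcbResult` and `delete_completed` are hypothetical names, not OpenSAF code, and the ticket does not say which return value the real callback should use when the object is unknown.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; none of these names are taken from the OpenSAF sources.
struct SuTypeCompType {
  unsigned curr_num_components;
};

enum class CcbResult { Ok, BadOperation };

// Toy database keyed by DN. find() returns nullptr when the DN is unknown,
// for example because the configuration has not been read yet.
class SuTypeCompTypeDb {
 public:
  void insert(const std::string& dn, SuTypeCompType value) {
    objects_[dn] = value;
  }
  SuTypeCompType* find(const std::string& dn) {
    auto it = objects_.find(dn);
    return it == objects_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, SuTypeCompType> objects_;
};

// Validate a delete request for the object named by dn.
CcbResult delete_completed(SuTypeCompTypeDb& db, const std::string& dn) {
  SuTypeCompType* obj = db.find(dn);
  if (obj == nullptr) {
    // Object not known locally: reject instead of dereferencing a null pointer.
    // (Assumption: rejecting is acceptable; the real daemon may need to accept.)
    return CcbResult::BadOperation;
  }
  // Assumption: deletion is allowed only when no component still uses the type.
  return obj->curr_num_components == 0 ? CcbResult::Ok
                                       : CcbResult::BadOperation;
}
```

With an empty `SuTypeCompTypeDb`, `delete_completed(db, "missing")` yields `CcbResult::BadOperation` rather than crashing.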






[tickets] [opensaf:tickets] #2322 Amf: Amfd crashed during cold sync

2017-02-23 Thread Nagendra Kumar



---

** [tickets:#2322] Amf: Amfd crashed during cold sync**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Thu Feb 23, 2017 11:33 AM UTC by Nagendra Kumar
**Last Updated:** Thu Feb 23, 2017 11:33 AM UTC
**Owner:** Nagendra Kumar


Steps to reproduce
1. Start SC-1, upload a demo app.
2. Add a 5-second delay at the following point in the code:
diff --git a/src/amf/amfd/imm.cc b/src/amf/amfd/imm.cc
--- a/src/amf/amfd/imm.cc
+++ b/src/amf/amfd/imm.cc
@@ -1602,6 +1602,10 @@ unsigned int avd_imm_config_get(void)
if (avd_comptype_config_get() != SA_AIS_OK)
goto done;

+   LOG_ER("1.  Before sleep ");
+   sleep(5);
+   LOG_ER("2.  After sleep");
+
/* SaAmfSUType needed by SaAmfSGType */
if (avd_sutype_config_get() != SA_AIS_OK)
goto done;
3. Start SC-2. When it prints "Before sleep", run the following command on SC-1:
immcfg -d 
"safMemberCompType=safVersion=1\,safCompType=AmfDemo1,safVersion=1,safSuType=AmfDemo1"

Observed behaviour
--

When SC-2 comes up after sleep, it crashes at:
Core was generated by `/usr/local/lib/opensaf/osafamfd --tracemask=0x'.
Program terminated with signal 11, Segmentation fault.
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at src/amf/amfd/sutcomptype.cc:131
131 if (sutcomptype->curr_num_components == 0) {
(gdb) bt
#0  sutcomptype_ccb_completed_cb(CcbUtilOperationData*) () at src/amf/amfd/sutcomptype.cc:131
#1  0x7f64bc2d9757 in ccb_completed_cb(unsigned long long, unsigned long long) () at src/amf/amfd/imm.cc:1065
#2  0x7f64bb81dd64 in imma_process_callback_info(imma_cb*, imma_client_node*, imma_callback_info*, unsigned long long) () at src/imm/agent/imma_proc.cc:2174
#3  0x7f64bb81f739 in imma_hdl_callbk_dispatch_all(imma_cb*, unsigned long long) () at src/imm/agent/imma_proc.cc:1732
#4  0x7f64bb81659a in saImmOiDispatch () at src/imm/agent/imma_oi_api.cc:609
#5  0x7f64bc296ab8 in main () at src/amf/amfd/main.cc:729






[tickets] [opensaf:tickets] #245 amf: saAmfComponentErrorClear_4() does not returns SA_AIS_ERR_NO_OP for operationally enabled comp.

2017-02-21 Thread Nagendra Kumar
- **status**: fixed --> assigned



---

** [tickets:#245] amf: saAmfComponentErrorClear_4() does not returns 
SA_AIS_ERR_NO_OP for operationally enabled comp.**

**Status:** assigned
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:37 AM UTC by Praveen
**Last Updated:** Thu Dec 01, 2016 06:14 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2818.

Changeset: 3728
When saAmfComponentErrorClear_4() is called for an operationally enabled
component, it returns SA_AIS_OK. According to the spec (B.04.01, section 7.12.2,
page 329), the return value should be SA_AIS_ERR_NO_OP.
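
As a minimal sketch of the behaviour the ticket asks for, the return code can be chosen from the component's operational state. `OperState`, `Rc` and `error_clear_rc` are hypothetical stand-ins for illustration; only the names SA_AIS_OK and SA_AIS_ERR_NO_OP come from the ticket, and the sketch assumes the clear itself succeeds for a disabled component.

```cpp
#include <cassert>

// Hypothetical stand-ins for the SA Forum types used by the real API.
enum class OperState { Enabled, Disabled };
enum class Rc {
  Ok,   // stands for SA_AIS_OK
  NoOp  // stands for SA_AIS_ERR_NO_OP
};

// Return code an error-clear request should produce for a component in the
// given operational state, following the ticket's reading of the spec.
constexpr Rc error_clear_rc(OperState state) {
  return state == OperState::Enabled ? Rc::NoOp : Rc::Ok;
}
```

For example, `error_clear_rc(OperState::Enabled)` evaluates to `Rc::NoOp`, while the behaviour reported in the ticket corresponds to returning `Rc::Ok` in that case.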





[tickets] [opensaf:tickets] #245 amf: saAmfComponentErrorClear_4() does not returns SA_AIS_ERR_NO_OP for operationally enabled comp.

2017-02-21 Thread Nagendra Kumar
- **status**: assigned --> fixed
- **Comment**:

changeset:   8604:33f9c7a3df4a
tag:         tip
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Wed Feb 22 10:33:22 2017 +0530
summary:     amfnd: fix nullptr issue [#245]

[staging:33f9c7]



---

** [tickets:#245] amf: saAmfComponentErrorClear_4() does not returns 
SA_AIS_ERR_NO_OP for operationally enabled comp.**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:37 AM UTC by Praveen
**Last Updated:** Wed Feb 22, 2017 05:52 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2818.

Changeset: 3728
When saAmfComponentErrorClear_4() is called for an operationally enabled
component, it returns SA_AIS_OK. According to the spec (B.04.01, section 7.12.2,
page 329), the return value should be SA_AIS_ERR_NO_OP.





[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-20 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8592:f13798019501
tag:         tip
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Tue Feb 21 10:01:48 2017 +0530
summary:     amfd: return TRY_AGAIN on rollback of shutdown admin op [#2133]

[staging:f13798]

Documentation Changes:
changeset:   208:f21e52b1f0d1
tag:         tip
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Tue Feb 21 10:28:42 2017 +0530
summary:     amf: deviations on SI shutdown [#2133]

[staging:f21e52]




---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Tue Feb 07, 2017 07:14 AM UTC
**Owner:** Nagendra Kumar


In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.
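
The changeset summary for this ticket ("return TRY_AGAIN on rollback of shutdown admin op") suggests a single rule that every redundancy model could share. The sketch below illustrates that rule only. `AdminState`, `Reply`, `Si` and `on_hosting_node_failed` are hypothetical names and do not reflect the actual amfd data structures or the committed patch.

```cpp
#include <cassert>

// Hypothetical model of an SI with a pending admin shutdown.
enum class AdminState { Unlocked, Locked, ShuttingDown };
enum class Reply {
  None,     // no admin operation was waiting for an answer
  TryAgain  // stands for SA_AIS_ERR_TRY_AGAIN
};

struct Si {
  AdminState admin_state;
  bool shutdown_pending;  // true while the QUIESCING callback is outstanding
};

// Called when the node hosting the SU with the pending callback fails.
// Rolls the admin state back and tells the caller to retry the operation.
Reply on_hosting_node_failed(Si& si) {
  if (si.admin_state == AdminState::ShuttingDown && si.shutdown_pending) {
    si.admin_state = AdminState::Unlocked;
    si.shutdown_pending = false;
    return Reply::TryAgain;
  }
  return Reply::None;
}
```

Applying the same function in the 2N, Nway and NpM paths would give the consistent outcome the ticket asks about, assuming rollback is the chosen behaviour.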




[tickets] [opensaf:tickets] #1749 amf: Incorrect ER in syslog

2017-02-15 Thread Nagendra Kumar
- **status**: review --> fixed
- **Comment**:

changeset:   8585:56c0fedf1706
tag:         tip
user:        Nagendra Kumar <nagendr...@oracle.com>
date:        Thu Feb 16 11:05:09 2017 +0530
summary:     amfd: change LOG_ER to LOG_NO [#1749]




---

** [tickets:#1749] amf: Incorrect ER in syslog**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Tue Apr 12, 2016 12:22 PM UTC by elunlen
**Last Updated:** Thu Feb 09, 2017 11:16 AM UTC
**Owner:** Nagendra Kumar


When requesting AMF to do an SI swap, the following message may appear in the
syslog:
2016-04-11 17:35:37 SC-1 osafamfd[500]: ER safSi=SC-2N,safApp=OpenSAF SWAP failed - Cold sync in progress
This is also the case if the response to the operation is SA_AIS_ERR_TRY_AGAIN
or SA_AIS_ERR_BUSY.
These responses are not errors and should not result in an ER message in the
syslog.
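
One way to express the requested behaviour is to derive the log severity from the return code, so that retryable responses are logged as notices. This is a sketch under assumptions: `Rc`, `Severity`, `is_retryable` and `severity_for_failure` are hypothetical names, with `Severity::Notice` and `Severity::Error` standing in for the LOG_NO and LOG_ER levels named in the changeset summary.

```cpp
#include <cassert>

// Hypothetical subset of failure codes an admin operation may return.
enum class Rc { TryAgain, Busy, BadOperation };

// Stand-ins for the LOG_NO (notice) and LOG_ER (error) syslog levels.
enum class Severity { Notice, Error };

// TRY_AGAIN and BUSY only ask the caller to retry later.
constexpr bool is_retryable(Rc rc) {
  return rc == Rc::TryAgain || rc == Rc::Busy;
}

// Severity to use when logging a failed admin operation.
constexpr Severity severity_for_failure(Rc rc) {
  return is_retryable(rc) ? Severity::Notice : Severity::Error;
}
```

Here `severity_for_failure(Rc::TryAgain)` gives `Severity::Notice`, so a swap rejected during cold sync would no longer produce an ER line.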




[tickets] [opensaf:tickets] #1749 amf: Incorrect ER in syslog

2017-02-09 Thread Nagendra Kumar
- **status**: accepted --> review



---

** [tickets:#1749] amf: Incorrect ER in syslog**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Tue Apr 12, 2016 12:22 PM UTC by elunlen
**Last Updated:** Tue Feb 07, 2017 10:06 AM UTC
**Owner:** Nagendra Kumar


When requesting AMF to do an SI swap, the following message may appear in the
syslog:
2016-04-11 17:35:37 SC-1 osafamfd[500]: ER safSi=SC-2N,safApp=OpenSAF SWAP failed - Cold sync in progress
This is also the case if the response to the operation is SA_AIS_ERR_TRY_AGAIN
or SA_AIS_ERR_BUSY.
These responses are not errors and should not result in an ER message in the
syslog.




[tickets] [opensaf:tickets] #2278 mds: Blocking send causes AMF health check time-out

2017-02-09 Thread Nagendra Kumar
Also, please share mds.log and complete syslog.


---

** [tickets:#2278] mds: Blocking send causes AMF health check time-out**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Thu Jan 26, 2017 09:49 AM UTC by Anders Widell
**Last Updated:** Thu Feb 09, 2017 08:31 AM UTC
**Owner:** A V Mahesh (AVM)


AMF health-check time-out is seen on SC-1 after restarting SC-2. The system is 
using OpenSAF 5.1.0 configured with TCP communication.

Syslog:

~~~
2017-01-20T18:29:04.405982+01:00 local0.err SC-1 osafamfnd[2820]: ER AMF director heart beat timeout, generating core for amfd
2017-01-20T18:29:05.408819+01:00 local0.crit SC-1 osafamfnd[2820]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMF director heart beat timeout, OwnNodeId = 131343, SupervisionTime = 0
~~~

Back-trace of osafamfd:

~~~
0x7fa316cceb60 osaf_poll_no_timeout (osaf/libs/core/common/osaf_poll.c:33)
0x7fa316ccede5 osaf_poll (osaf/libs/core/common/osaf_poll.c:45)
0x7fa316ccee25 osaf_poll_one_fd (osaf/libs/core/common/osaf_poll.c:129)
0x7fa316cfab67 mds_mcm_time_wait (osaf/libs/core/common/include/osaf_utility.h:79)
0x7fa316cfae51 mds_subtn_tbl_add_disc_queue (osaf/libs/core/mds/mds_c_sndrcv.c:1808)
0x7fa316cfb03d mds_mcm_process_disc_queue_checks_redundant (osaf/libs/core/mds/mds_c_sndrcv.c:2338)
0x7fa316cfbcd1 mcm_pvt_red_snd_process_common (osaf/libs/core/mds/mds_c_sndrcv.c:2257)
0x7fa316cfd04d mcm_pvt_red_svc_snd (osaf/libs/core/mds/mds_c_sndrcv.c:2174)
0x7fa316cff8f9 mds_send (osaf/libs/core/mds/mds_c_sndrcv.c:736)
0x7fa316cf9068 ncsmds_api (osaf/libs/core/mds/mds_papi.c:191)
0x7fa316ce6f5f mbcsv_mds_send_msg (osaf/libs/core/mbcsv/mbcsv_mds.c:239)
0x7fa316cec440 mbcsv_send_ckpt_data_to_all_peers (osaf/libs/core/mbcsv/mbcsv_util.c:479)
0x7fa316ce56d7 mbcsv_process_snd_ckpt_request (osaf/libs/core/mbcsv/mbcsv_api.c:862)
0x40bfc0 avsv_send_ckpt_data(cl_cb_tag*, unsigned int, unsigned long, unsigned int, unsigned int) (osaf/services/saf/amf/amfd/chkop.cc:1062)
0x446649 avd_node_oper_state_set(AVD_AVND*, SaAmfOperationalStateT) (osaf/services/saf/amf/amfd/node.cc:505)
0x44040c avd_node_mark_absent(AVD_AVND*) (osaf/services/saf/amf/amfd/ndfsm.cc:1018)
0x4438ba avd_node_failover(AVD_AVND*) (osaf/services/saf/amf/amfd/ndproc.cc:1141)

~~~




[tickets] [opensaf:tickets] #2278 mds: Blocking send causes AMF health check time-out

2017-02-09 Thread Nagendra Kumar
Hi Anders, 
Do you see it frequently, or does it occur only under specific test conditions
or on a specific platform?


---

** [tickets:#2278] mds: Blocking send causes AMF health check time-out**

**Status:** assigned
**Milestone:** 5.1.1
**Created:** Thu Jan 26, 2017 09:49 AM UTC by Anders Widell
**Last Updated:** Thu Feb 02, 2017 03:33 AM UTC
**Owner:** A V Mahesh (AVM)


AMF health-check time-out is seen on SC-1 after restarting SC-2. The system is 
using OpenSAF 5.1.0 configured with TCP communication.

Syslog:

~~~
2017-01-20T18:29:04.405982+01:00 local0.err SC-1 osafamfnd[2820]: ER AMF director heart beat timeout, generating core for amfd
2017-01-20T18:29:05.408819+01:00 local0.crit SC-1 osafamfnd[2820]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMF director heart beat timeout, OwnNodeId = 131343, SupervisionTime = 0
~~~

Back-trace of osafamfd:

~~~
0x7fa316cceb60 osaf_poll_no_timeout (osaf/libs/core/common/osaf_poll.c:33)
0x7fa316ccede5 osaf_poll (osaf/libs/core/common/osaf_poll.c:45)
0x7fa316ccee25 osaf_poll_one_fd (osaf/libs/core/common/osaf_poll.c:129)
0x7fa316cfab67 mds_mcm_time_wait (osaf/libs/core/common/include/osaf_utility.h:79)
0x7fa316cfae51 mds_subtn_tbl_add_disc_queue (osaf/libs/core/mds/mds_c_sndrcv.c:1808)
0x7fa316cfb03d mds_mcm_process_disc_queue_checks_redundant (osaf/libs/core/mds/mds_c_sndrcv.c:2338)
0x7fa316cfbcd1 mcm_pvt_red_snd_process_common (osaf/libs/core/mds/mds_c_sndrcv.c:2257)
0x7fa316cfd04d mcm_pvt_red_svc_snd (osaf/libs/core/mds/mds_c_sndrcv.c:2174)
0x7fa316cff8f9 mds_send (osaf/libs/core/mds/mds_c_sndrcv.c:736)
0x7fa316cf9068 ncsmds_api (osaf/libs/core/mds/mds_papi.c:191)
0x7fa316ce6f5f mbcsv_mds_send_msg (osaf/libs/core/mbcsv/mbcsv_mds.c:239)
0x7fa316cec440 mbcsv_send_ckpt_data_to_all_peers (osaf/libs/core/mbcsv/mbcsv_util.c:479)
0x7fa316ce56d7 mbcsv_process_snd_ckpt_request (osaf/libs/core/mbcsv/mbcsv_api.c:862)
0x40bfc0 avsv_send_ckpt_data(cl_cb_tag*, unsigned int, unsigned long, unsigned int, unsigned int) (osaf/services/saf/amf/amfd/chkop.cc:1062)
0x446649 avd_node_oper_state_set(AVD_AVND*, SaAmfOperationalStateT) (osaf/services/saf/amf/amfd/node.cc:505)
0x44040c avd_node_mark_absent(AVD_AVND*) (osaf/services/saf/amf/amfd/ndfsm.cc:1018)
0x4438ba avd_node_failover(AVD_AVND*) (osaf/services/saf/amf/amfd/ndproc.cc:1141)

~~~




[tickets] [opensaf:tickets] #1749 amf: Incorrect ER in syslog

2017-02-07 Thread Nagendra Kumar
- **Component**: osaf --> amf
- **Part**: - --> d
- **Comment**:

I will fix the relevant error logs in AMF; owners of other services can create
their own tickets.



---

** [tickets:#1749] amf: Incorrect ER in syslog**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Tue Apr 12, 2016 12:22 PM UTC by elunlen
**Last Updated:** Mon Aug 29, 2016 08:11 PM UTC
**Owner:** Nagendra Kumar


When requesting AMF to do an SI swap, the following message may appear in the
syslog:
2016-04-11 17:35:37 SC-1 osafamfd[500]: ER safSi=SC-2N,safApp=OpenSAF SWAP failed - Cold sync in progress
This is also the case if the response to the operation is SA_AIS_ERR_TRY_AGAIN
or SA_AIS_ERR_BUSY.
These responses are not errors and should not result in an ER message in the
syslog.




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-06 Thread Nagendra Kumar
- **status**: accepted --> review
- **Comment**:

Sent patch for review with the above implementation.



---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Thu Feb 02, 2017 09:56 AM UTC
**Owner:** Nagendra Kumar


In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.




[tickets] [opensaf:tickets] Re: #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-02 Thread Nagendra Kumar
Hi Minh,

I would also prefer not to roll back, but because of the internal
implementation I am just reusing the existing code. I prefer TRY_AGAIN as the
return code because an error has occurred and the operation can't be completed.

 

Thanks

-Nagu

 

From: Minh Hon Chau [mailto:minh-c...@users.sf.net] 
Sent: 02 February 2017 09:56
To: opensaf-tickets@lists.sourceforge.net
Subject: [tickets] [opensaf:tickets] Re: #2133 AMF: Rollback admin 
shutdown/lock SI operation if node failover

 

Hi Nagu,

I prefer not to roll back the operations (as commented by Praveen earlier) if
the rollback is due to internal implementation rather than a specific use case.
Anyway, if we have no way to correct it, then we have to accept it. I don't have
a clear indication of which error code should be returned; both TRY_AGAIN and
TIMEOUT seem OK, since the caller will have to retry the operation.

Thanks,
Minh

---

[tickets:#2133](https://sourceforge.net/p/opensaf/tickets/2133/) AMF:
Rollback admin shutdown/lock SI operation if node failover

Status: accepted
Milestone: 5.2.FC
Created: Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
Last Updated: Wed Feb 01, 2017 08:50 AM UTC
Owner: Nagendra Kumar

In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.




---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Thu Feb 02, 2017 09:56 AM UTC
**Owner:** Nagendra Kumar


In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-02 Thread Nagendra Kumar
Hi Minh,
I would also prefer not to roll back, but because of the internal
implementation I am just reusing the existing code. I prefer TRY_AGAIN as the
return code because an error has occurred and the operation can't be completed.

Thanks
-Nagu


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Wed Feb 01, 2017 08:50 AM UTC
**Owner:** Nagendra Kumar


In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.




Re: [tickets] [opensaf:tickets] Re: #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-01 Thread Nagendra Kumar
Hi Minh,

I would also prefer not to roll back, but because of the internal
implementation I am just reusing the existing code. I prefer TRY_AGAIN as the
return code because an error has occurred and the operation can't be completed.

 

Thanks

-Nagu

 

From: Minh Hon Chau [mailto:minh-c...@users.sf.net] 
Sent: 02 February 2017 09:56
To: opensaf-tickets@lists.sourceforge.net
Subject: [tickets] [opensaf:tickets] Re: #2133 AMF: Rollback admin 
shutdown/lock SI operation if node failover

 

Hi Nagu,

I prefer not to roll back the operations (as commented by Praveen earlier) if
the rollback is due to internal implementation rather than a specific use case.
Anyway, if we have no way to correct it, then we have to accept it. I don't have
a clear indication of which error code should be returned; both TRY_AGAIN and
TIMEOUT seem OK, since the caller will have to retry the operation.

Thanks,
Minh

---

[tickets:#2133](https://sourceforge.net/p/opensaf/tickets/2133/) AMF:
Rollback admin shutdown/lock SI operation if node failover

Status: accepted
Milestone: 5.2.FC
Created: Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
Last Updated: Wed Feb 01, 2017 08:50 AM UTC
Owner: Nagendra Kumar

In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.



[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-02-01 Thread Nagendra Kumar
- **Type**: defect --> enhancement
- **Comment**:

Changing to enhancement, as it changes the return codes of a few of the admin
operations mentioned above.



---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Wed Feb 01, 2017 07:49 AM UTC
**Owner:** Nagendra Kumar


In the scenario where an SI is shut down and the QUIESCING CSI callback is
delayed, reboot the node hosting the SU that has this CSI callback pending. The
result of this operation differs between SG redundancy models:
- For 2N: the SI admin state is rolled back to UNLOCKED
- For Nway: the SI admin state moves to LOCKED
- For NpM: not tested; from browsing SG_NPM::node_fail_si_oper, it looks like
the SI admin state is rolled back to UNLOCKED

My question is whether the results of these scenarios should be consistent, and
what the expected outcome is.
Also, the handling of node_fail_si_oper for admin lock is not consistent: for
2N the admin state remains LOCKED, while NpM rolls back to UNLOCKED.




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-31 Thread Nagendra Kumar
Any comments? I am preparing the patch with TRY_AGAIN.


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Mon Jan 30, 2017 06:48 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-29 Thread Nagendra Kumar
>>I think it's a good idea that we return other codes (TIMEOUT,...?) in case of 
>>error escalation that rolls back the shutdown command.

I think it is better to return TRY_AGAIN, as the spec's wording leaves some 
margin for an error having occurred:

"SA_AIS_ERR_TRY_AGAIN - The service cannot be provided at this time. The client 
may retry later. This error generally should be returned when the requested 
action is valid but not currently possible, probably because another operation 
is acting upon the logical entity on which the administrative operation is 
invoked. Such an operation can be another administrative operation or an error 
recovery initiated by the Availability Management Framework."
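
As a rough illustration of what TRY_AGAIN asks of the caller, here is a minimal retry sketch. It is not OpenSAF or SAF API code: `AisError` is a local stand-in named after the AIS return codes, and `invoke_with_retry`, `max_attempts`, `delay_s` and `sleep` are hypothetical names chosen for this example.

```python
import time
from enum import IntEnum


class AisError(IntEnum):
    # Local stand-ins named after SAF AIS return codes (illustrative only).
    OK = 1
    TIMEOUT = 5
    TRY_AGAIN = 6


def invoke_with_retry(admin_op, max_attempts=5, delay_s=0.0, sleep=time.sleep):
    """Call admin_op() until it returns something other than TRY_AGAIN.

    admin_op is any zero-argument callable returning an AisError. The last
    return code is handed back once max_attempts calls have been made.
    """
    rc = AisError.TRY_AGAIN
    for attempt in range(max_attempts):
        rc = admin_op()
        if rc != AisError.TRY_AGAIN:
            return rc
        if attempt + 1 < max_attempts:
            sleep(delay_s)  # back off before the next attempt
    return rc
```

With TRY_AGAIN the client knows the request was valid and can simply reissue it once the error recovery has finished, whereas any other code, such as TIMEOUT, is returned to the caller immediately.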


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Mon Jan 23, 2017 11:21 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-23 Thread Nagendra Kumar
I will provide a fix for the 2N redundancy model for the 5.2 release. The fix 
would be to return TIMEOUT for the admin shutdown failure cases in which the 
shutdown admin op gets reverted and the admin state is rolled back to Unlocked.


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Mon Jan 23, 2017 11:15 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-23 Thread Nagendra Kumar
>>Comparing to your findings, is there something we have to do with "single SI 
>>assignment, the admin state is locked." for su f/o and node f/o?

No. I think that for a single SI assignment we can return Success and mark the 
admin state as Locked.

>>Another question, do you know use case's motivation or technical problem 
>>behind that we had this deviation/inconsistency?

It is for ease of flow. For a single SI, we can easily mark it Locked and 
remove the assignments from the active SU. For SIs having two assignments, the 
2N redundancy model reuses the SU switchover code, which leaves the SI in the 
Unlocked state.


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Mon Jan 23, 2017 08:29 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-23 Thread Nagendra Kumar
- **status**: unassigned --> accepted
- **assigned_to**: Nagendra Kumar
- **Milestone**: future --> 5.2.FC



---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** accepted
**Milestone:** 5.2.FC
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Thu Jan 19, 2017 03:27 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2162 AMF: Headless recovery failed if SC failover during headless sync

2017-01-22 Thread Nagendra Kumar
Please find attached the logs for the TC mentioned in the email.


Attachments:

- [Logs-tc.rar](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/f093e418/d8dc/attachment/Logs-tc.rar) (477.7 kB; application/octet-stream)


---

** [tickets:#2162] AMF: Headless recovery failed if SC failover during headless 
sync**

**Status:** review
**Milestone:** 5.2.FC
**Labels:** headless recovery 
**Created:** Thu Nov 03, 2016 11:01 AM UTC by Minh Hon Chau
**Last Updated:** Mon Jan 09, 2017 11:24 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2162/attachment/log.tgz) 
(1.4 MB; application/x-compressed)


Test steps:
- Set up a 2N assignment: PL4 hosts SU4 (active assignment), PL5 hosts SU5 
(standby assignment)
- Stop the SCs
- Stop PL4
- Restart SC1
- Restart SC2
- Since PL4 is stopped, the headless sync will time out in 10 seconds. During 
these 10 seconds, reboot SC1 to trigger an SC failover

Observation: SC2 becomes the active controller and cold sync completes, but SU5 
still has the standby assignment.

When SC2 becomes the active controller, the part of the code that performs 
headless recovery (function failover_absent_assignment()) is not executed. 
Therefore, the transient assignments remain after the SC failover.

Logs/traces are attached.




[tickets] [opensaf:tickets] #2133 AMF: Rollback admin shutdown/lock SI operation if node failover

2017-01-18 Thread Nagendra Kumar
I will check the admin op return codes and will try to fix them.


---

** [tickets:#2133] AMF: Rollback admin shutdown/lock SI operation if node 
failover**

**Status:** unassigned
**Milestone:** future
**Created:** Thu Oct 20, 2016 06:49 PM UTC by Minh Hon Chau
**Last Updated:** Wed Jan 18, 2017 11:04 AM UTC
**Owner:** nobody



