[tickets] [opensaf:tickets] #2421 amfd: is_swbdl_delete_ok_for_node should also check for SG and Node admin state

2017-04-20 Thread Praveen
- **status**: assigned --> accepted



---

** [tickets:#2421] amfd: is_swbdl_delete_ok_for_node should also check for SG 
and Node admin state**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 09:33 AM UTC by Tai Dinh
**Last Updated:** Thu Apr 20, 2017 04:07 AM UTC
**Owner:** Praveen


When a NodeSwBundle object is deleted, AMF only checks whether the SU's admin 
state is LOCKED_INSTANTIATION. This means that the deletion is rejected even 
when the SG or the Node is in the LOCKED_INSTANTIATION state, which implicitly 
means that the SU is UNINSTANTIATED.
This currently blocks the rollback of an SMF campaign in some situations.

The admin state of the SU's node and of the SU's SG should also be checked, and 
the deletion should be allowed if one of the above states is 
LOCKED_INSTANTIATION.

/Tai
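
The proposed check can be sketched in C as below. This is only an illustration 
of the rule described in the ticket, not the code in amfd: the enum, its 
values, and the function name `swbdl_delete_ok` are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the AMF admin states; real code would use
 * the definitions from the AMF headers instead. */
typedef enum {
  ADMIN_UNLOCKED = 1,
  ADMIN_LOCKED,
  ADMIN_LOCKED_INSTANTIATION,
  ADMIN_SHUTTING_DOWN
} admin_state_t;

/* Returns true when the SU can be treated as uninstantiated, i.e. when
 * the SU itself, its SG, or its hosting node is LOCKED_INSTANTIATION. */
bool swbdl_delete_ok(admin_state_t su, admin_state_t sg, admin_state_t node)
{
  return su == ADMIN_LOCKED_INSTANTIATION ||
         sg == ADMIN_LOCKED_INSTANTIATION ||
         node == ADMIN_LOCKED_INSTANTIATION;
}
```

With only the SU check, the second and third conditions would be missing, 
which is the case the ticket reports as blocking the rollback.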


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #1765 ckpt : saCkptCheckpointOpen api call failed and returing SA_AIS_ERR_LIBRARY after couple of failover

2017-04-20 Thread Hung Nguyen
- **status**: review --> fixed
- **Milestone**: 5.0.2 --> 5.17.06
- **Comment**:

5.17.08 (develop) [code:bfebed]
~~~
commit bfebede5783121fc363f63536bbb89ba3355152e
Author: Hoang Vo 
Date:   Fri Apr 21 09:13:18 2017 +0700

cpd: to correct failover behavior of cpsv [#1765]

Problem:
When failover happens multiple times, the cpnd is down for a moment, so no 
cpnd has the specific checkpoint open.
This triggers the retention timer.
When the cpnd comes up again it has a different pid, so the retention timer 
is not stopped.
The replica is deleted when the retention timer expires, while its 
information is still in the ckpt database.

Fix:
- Stop the timer of the removed node.
- Update data in patricia trees (for retention value consistency).
~~~

5.17.06 (release) [code:90973e]
~~~
commit 90973efa1f9b4002590450fd21e6b1a71f085296
Author: Hoang Vo 
Date:   Fri Apr 21 09:13:18 2017 +0700

cpd: to correct failover behavior of cpsv [#1765]

Problem:
When failover happens multiple times, the cpnd is down for a moment, so no 
cpnd has the specific checkpoint open.
This triggers the retention timer.
When the cpnd comes up again it has a different pid, so the retention timer 
is not stopped.
The replica is deleted when the retention timer expires, while its 
information is still in the ckpt database.

Fix:
- Stop the timer of the removed node.
- Update data in patricia trees (for retention value consistency).
~~~

default (mercurial) [staging:edc930]
~~~
changeset:   8774:edc930fcc8fc
user:        Hoang Vo
date:        Fri Apr 21 09:32:25 2017 +0700
summary:     cpd: to correct failover behavior of cpsv [#1765]
~~~



---

** [tickets:#1765] ckpt : saCkptCheckpointOpen api call failed and returing 
SA_AIS_ERR_LIBRARY after couple of failover**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Fri Apr 15, 2016 06:26 AM UTC by Ritu Raj
**Last Updated:** Tue Apr 04, 2017 01:34 PM UTC
**Owner:** Vo Minh Hoang
**Attachments:**

- 
[ckpt_trace.tar.bz2](https://sourceforge.net/p/opensaf/tickets/1765/attachment/ckpt_trace.tar.bz2)
 (3.2 MB; application/x-bzip)


setup:
Changeset- 7436
Version - opensaf 5.0 FC
4 nodes configured with single PBE and a load of 30K objects

* Issue observed:
The saCkptCheckpointOpen API call failed and returned SA_AIS_ERR_LIBRARY after 
a couple of failovers.

* Steps to reproduce:
> Ran a couple of failovers and observed that saCkptCheckpointOpen failed.
> Below is a snippet of the agent trace:

Apr 15  8:08:50.275115 cpa [28883:cpa_mds.c:0776] << cpa_mds_msg_sync_send: retval = 1
Apr 15  8:08:50.275128 cpa [28883:cpa_api.c:1043] T4 Cpa CkptOpen failed with return value:2,ckptHandle:63
Apr 15  8:08:50.275141 cpa [28883:cpa_api.c:1146] << **saCkptCheckpointOpen: API return code = 2**

> Traces of both controllers and the agent trace of the payload are attached.





[tickets] [opensaf:tickets] #1969 smf: One step upgrade with cluster reboot does not wait for nodes to start

2017-04-20 Thread elunlen
- **status**: assigned --> review
- **assigned_to**: Rafael
- **Milestone**: future --> 5.17.06



---

** [tickets:#1969] smf: One step upgrade with cluster reboot does not wait for 
nodes to start**

**Status:** review
**Milestone:** 5.17.06
**Created:** Wed Aug 24, 2016 01:01 PM UTC by elunlen
**Last Updated:** Wed Apr 12, 2017 01:43 PM UTC
**Owner:** Rafael


When using the one step upgrade feature with a cluster reboot, all nodes will 
restart, including the SC-nodes. This is done as the last action in the upgrade 
step. After the active SC-node is up again, SMF continues with the procedure 
wrapup. When collecting information to prepare the wrapup, the node destination 
is requested for all nodes in the campaign. However, this information can only 
be collected from nodes that are started and have joined the cluster 
(unlocked).
The problem is that SMF does not seem to wait to give all nodes a chance to 
join the cluster, and if SMF fails to get the node destination from any of the 
nodes, the campaign fails as seen in the log below. When reading the node 
destination there is a 10 sec “try again” loop waiting for “node up” for each 
node. It is not unlikely that the active SC-node comes up before some of the 
other nodes and that it takes more than 10 sec after that before some of the 
other nodes join the cluster. If that's the case, the campaign will fail.
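
One possible way to give slow nodes more time is to poll all nodes against a 
single overall deadline instead of a fixed 10 sec window per node. The C sketch 
below only illustrates that idea and is not SMF code: `wait_for_nodes`, the 
`node_up_fn` probe type, and the `up_after_polls` demo probe are hypothetical 
names, and the timeout value would have to be chosen by the implementer.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical probe: returns true once the named node has joined. */
typedef bool (*node_up_fn)(const char *node, void *ctx);

/* Polls until every node reports "up" or timeout_sec has elapsed.
 * Returns the number of nodes still missing (0 means all joined).
 * Elapsed time is measured in whole seconds, so the deadline can be
 * hit up to one second early. */
size_t wait_for_nodes(const char *const *nodes, size_t count,
                      node_up_fn is_up, void *ctx,
                      unsigned timeout_sec, unsigned poll_ms)
{
  struct timespec start, now, delay;

  delay.tv_sec = poll_ms / 1000;
  delay.tv_nsec = (long)(poll_ms % 1000) * 1000000L;
  clock_gettime(CLOCK_MONOTONIC, &start);

  for (;;) {
    size_t missing = 0;
    for (size_t i = 0; i < count; i++) {
      if (!is_up(nodes[i], ctx))
        missing++;
    }
    if (missing == 0)
      return 0;

    clock_gettime(CLOCK_MONOTONIC, &now);
    if ((unsigned long)(now.tv_sec - start.tv_sec) >= timeout_sec)
      return missing;

    nanosleep(&delay, NULL);
  }
}

/* Demo probe for illustration: reports "down" for the first *ctx calls,
 * then "up" on every later call. */
bool up_after_polls(const char *node, void *ctx)
{
  unsigned *remaining = ctx;
  (void)node;
  if (*remaining == 0)
    return true;
  (*remaining)--;
  return false;
}
```

A caller could fail the campaign only when the return value is non-zero after 
the overall deadline, rather than after the first node that is late.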




[tickets] [opensaf:tickets] #2431 smf: imm version changes need to be updated to latest

2017-04-20 Thread Neelakanta Reddy
 [a4c626]


---

** [tickets:#2431] smf: imm version changes need to be updated to latest **

**Status:** review
**Milestone:** 5.17.08
**Created:** Tue Apr 18, 2017 12:49 PM UTC by Neelakanta Reddy
**Last Updated:** Thu Apr 20, 2017 11:00 AM UTC
**Owner:** Neelakanta Reddy


Update the IMM version from A.2.1 to A.2.18 (latest version) in SMFD, to 
support saImmOmCcbGetErrorStrings.

The return value logic for IMM operations has to be corrected so that 
TRY_AGAIN is supported only for "Resource abort".
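
A rough C sketch of such retry logic follows. It is one reading of the 
sentence above, not the SMFD implementation. The `imm_rc_t` enum and 
`ccb_should_retry` are hypothetical, the "IMM: Resource abort: " prefix is an 
assumption about the error string format, and the strings are assumed to 
arrive as a NULL-terminated array like the one saImmOmCcbGetErrorStrings is 
expected to fill in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes standing in for the SaAisErrorT values. */
typedef enum {
  RC_OK,
  RC_TRY_AGAIN,
  RC_FAILED_OPERATION,
  RC_OTHER
} imm_rc_t;

/* Assumed prefix that marks a CCB aborted for resource reasons. */
#define RESOURCE_ABORT_PREFIX "IMM: Resource abort: "

/* Returns true when the operation is worth retrying: either the call
 * itself asked for a retry, or it failed and one of the CCB error
 * strings marks the failure as a resource abort. */
bool ccb_should_retry(imm_rc_t rc, const char *const *error_strings)
{
  if (rc == RC_TRY_AGAIN)
    return true;
  if (rc != RC_FAILED_OPERATION || error_strings == NULL)
    return false;

  for (size_t i = 0; error_strings[i] != NULL; i++) {
    if (strncmp(error_strings[i], RESOURCE_ABORT_PREFIX,
                strlen(RESOURCE_ABORT_PREFIX)) == 0)
      return true;
  }
  return false;
}
```

Any other failure, for example a validation error, is then reported to the 
caller instead of being retried.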





[tickets] [opensaf:tickets] #2431 smf: imm version changes need to be updated to latest

2017-04-20 Thread Neelakanta Reddy
https://sourceforge.net/u/neelakanta/review/ci/a4c626d17618f69abffc35893294f26e3cde2887/


---

** [tickets:#2431] smf: imm version changes need to be updated to latest **

**Status:** review
**Milestone:** 5.17.08
**Created:** Tue Apr 18, 2017 12:49 PM UTC by Neelakanta Reddy
**Last Updated:** Thu Apr 20, 2017 11:04 AM UTC
**Owner:** Neelakanta Reddy







[tickets] [opensaf:tickets] #2431 smf: imm version changes need to be updated to latest

2017-04-20 Thread Neelakanta Reddy
- **status**: accepted --> review



---

** [tickets:#2431] smf: imm version changes need to be updated to latest **

**Status:** review
**Milestone:** 5.17.08
**Created:** Tue Apr 18, 2017 12:49 PM UTC by Neelakanta Reddy
**Last Updated:** Tue Apr 18, 2017 12:49 PM UTC
**Owner:** Neelakanta Reddy







[tickets] [opensaf:tickets] #2435 amf: make auto repair restriction optional

2017-04-20 Thread Gary Lee
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,4 @@
-Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a non-backwards compatibility issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
+Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have backwards compatibility issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
 
 It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.
 






---

** [tickets:#2435] amf: make auto repair restriction optional**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Thu Apr 20, 2017 10:22 AM UTC by Gary Lee
**Last Updated:** Thu Apr 20, 2017 10:23 AM UTC
**Owner:** Gary Lee


Ticket #2144 added support for restricting auto-repair in accordance with the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have backwards compatibility issue. 
Therefore we should make the auto-repair restriction behaviour optional, and 
make it default to the legacy behaviour.

It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.

'''
diff --git a/src/amf/config/amf_classes.xml b/src/amf/config/amf_classes.xml
index ee0f185..d082990 100644
--- a/src/amf/config/amf_classes.xml
+++ b/src/amf/config/amf_classes.xml
@@ -1438,4 +1438,20 @@
SA_WRITABLE


+   
+   SA_CONFIG
+   
+   amfConfig
+   SA_STRING_T
+   SA_CONFIG
+   SA_INITIALIZED
+   
+   
+   amfRestrictAutoRepairEnable
+   SA_UINT32_T
+   SA_CONFIG
+   SA_WRITABLE
+   0
+   
+   
 
diff --git a/src/amf/config/amf_objects.xml b/src/amf/config/amf_objects.xml
index 502fc2f..31fc3e5 100644
--- a/src/amf/config/amf_objects.xml
+++ b/src/amf/config/amf_objects.xml
@@ -1,5 +1,12 @@
 
 http://www.saforum.org/IMMSchema; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:noNamespaceSchemaLocation="SAI-AIS-IMM-XSD-A.02.13.xsd">
+   
+   amfConfig=1,safApp=safAmfService
+   
+   amfRestrictAutoRepairEnable
+   0
+   
+   

safAppType=OpenSafApplicationType

'''
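
A minimal C sketch of how such an attribute might gate the behaviour is shown 
below. Only the attribute name amfRestrictAutoRepairEnable and its default of 0 
come from the patch above. The `amf_config_t` struct, the function 
`auto_repair_permitted`, and the `restricted_case` parameter are hypothetical 
and do not describe how amfd actually reads or applies the value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the proposed configuration object. */
typedef struct {
  uint32_t amfRestrictAutoRepairEnable; /* 0 = legacy behaviour (default) */
} amf_config_t;

/* Returns true when auto-repair may proceed.
 * restricted_case is the caller's judgement that this repair is one the
 * #2144 restriction would block; how that is decided is out of scope.
 * A missing configuration object is treated like the default (legacy). */
bool auto_repair_permitted(const amf_config_t *cfg, bool restricted_case)
{
  bool restriction_on =
      (cfg != NULL) && (cfg->amfRestrictAutoRepairEnable != 0);
  return !(restriction_on && restricted_case);
}
```

With the default value of 0 the function always permits the repair, which 
matches the intent of defaulting to the legacy behaviour.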




[tickets] [opensaf:tickets] #2435 amf: make auto repair restriction optional

2017-04-20 Thread Gary Lee
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,4 @@
-Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have backwards compatibility issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
+Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a backwards compatibility issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
 
 It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.
 






---

** [tickets:#2435] amf: make auto repair restriction optional**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Thu Apr 20, 2017 10:22 AM UTC by Gary Lee
**Last Updated:** Thu Apr 20, 2017 10:29 AM UTC
**Owner:** Gary Lee


Ticket #2144 added support for restricting auto-repair in accordance with the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a backwards compatibility issue. 
Therefore we should make the auto-repair restriction behaviour optional, and 
make it default to the legacy behaviour.

It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.





[tickets] [opensaf:tickets] #2435 amf: make auto repair restriction optional

2017-04-20 Thread Gary Lee
- Description has changed:

Diff:



--- old
+++ new
@@ -1,4 +1,4 @@
-Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a non-backwards compatiblity issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
+Ticket #2144 added support for restricting auto-repair in accordance to the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a non-backwards compatibility issue. 
Therefore we should make the auto-repair restriction behavior optional, and 
make it default to the legacy haviour.
 
 It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.
 






---

** [tickets:#2435] amf: make auto repair restriction optional**

**Status:** accepted
**Milestone:** 5.17.06
**Created:** Thu Apr 20, 2017 10:22 AM UTC by Gary Lee
**Last Updated:** Thu Apr 20, 2017 10:22 AM UTC
**Owner:** Gary Lee


Ticket #2144 added support for restricting auto-repair in accordance with the 
AMF spec.  However, it has become clear some applications have relied on the 
legacy auto-repair behaviour. We now have a non-backwards compatibility issue. 
Therefore we should make the auto-repair restriction behaviour optional, and 
make it default to the legacy behaviour.

It is proposed to make this behaviour configurable via an attribute in a new 
AMF configuration object.





[tickets] [opensaf:tickets] #2434 log: lgs doesn't checkpoint dest_names in open stream request

2017-04-20 Thread Canh Truong
- Description has changed:

Diff:



--- old
+++ new
@@ -1,3 +1,2 @@
-
 In case failover when one node is rebooted then back again, stb_dest_names has 
not been checkpointed to standby (this node) if the log agent send request open 
the stream again before cold sync is completed.
 



- **status**: unassigned --> accepted
- **assigned_to**: Canh Truong



---

** [tickets:#2434] log: lgs doesn't checkpoint dest_names in open stream 
request**

**Status:** accepted
**Milestone:** 5.17.08
**Created:** Wed Apr 19, 2017 01:25 PM UTC by Canh Truong
**Last Updated:** Wed Apr 19, 2017 01:25 PM UTC
**Owner:** Canh Truong


After a failover in which one node is rebooted and comes back up, 
stb_dest_names is not checkpointed to the standby (this node) if the log agent 
sends the open stream request again before the cold sync has completed.






[tickets] [opensaf:tickets] #2420 imm: IMMND on PL hangs when headless

2017-04-20 Thread Hung Nguyen
- **status**: review --> fixed
- **Milestone**: 5.0.2 --> 5.17.06
- **Comment**:

5.17.08 (develop) [code:11325e]
~~~
commit 11325e3b7643c4d0500771ef7e022fcc47f1d31a
Author: Hung Nguyen 
Date:   Thu Apr 20 14:37:18 2017 +0700

imm: Use waitpid with WNOHANG to check for sync process and pbe process 
[#2420]

Use waitpid with WNOHANG to check for sync process and pbe process.
The processes are checked before resending the intro message.
The intro message is only sent when those processes exit.
~~~

5.17.06 (release) [code:51233a]
~~~
commit 51233a54a11809ac48e27c043361b0ac95c5b71a
Author: Hung Nguyen 
Date:   Thu Apr 20 14:37:18 2017 +0700

imm: Use waitpid with WNOHANG to check for sync process and pbe process 
[#2420]

Use waitpid with WNOHANG to check for sync process and pbe process.
The processes are checked before resending the intro message.
The intro message is only sent when those processes exit.
~~~

default (mercurial) [staging:2aa1ed]
~~~
changeset:   8773:2aa1edbd41e9
user:        Hung Nguyen
date:        Tue Apr 11 19:05:48 2017 +0700
summary:     imm: Use waitpid with WNOHANG to check for sync process and pbe process [#2420]
~~~



---

** [tickets:#2420] imm: IMMND on PL hangs when headless**

**Status:** fixed
**Milestone:** 5.17.06
**Created:** Tue Apr 11, 2017 07:13 AM UTC by Hung Nguyen
**Last Updated:** Tue Apr 11, 2017 12:11 PM UTC
**Owner:** Hung Nguyen


IMMND on PL hangs at waitpid() after coordinator removal.

When the pbe process is in D state (uninterruptible sleep, usually I/O), 
waitpid() blocks indefinitely unless WNOHANG is specified.

~~~
LOG_WA("SC were absent and PBE appears hung, sending SIGKILL");
kill(cb->pbePid, SIGKILL);
waitpid(cb->pbePid, NULL, 0);
~~~
The bug was introduced by [#2296].

Solution: Use waitpid() with WNOHANG specified. Check that the pbe/sync 
process has exited before sending the intro message while headless.
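
The non-blocking pattern can be sketched as below. This is a standalone 
illustration of waitpid() with WNOHANG, not the IMMND patch: 
`child_has_exited` and `demo_kill_and_reap` are hypothetical names, and the 
demo forks a child that only sleeps, standing in for the pbe process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns true once the child has been reaped or there is nothing left
 * to wait for, and false while it is still running. Never blocks. */
bool child_has_exited(pid_t pid)
{
  int status;
  pid_t rc;

  do {
    rc = waitpid(pid, &status, WNOHANG);
  } while (rc == -1 && errno == EINTR);

  /* rc == 0: the child exists but has not terminated yet. */
  return rc != 0;
}

/* Demo: start a child that only sleeps, send SIGKILL, then poll for its
 * exit instead of blocking in waitpid(). Returns 0 on success. */
int demo_kill_and_reap(void)
{
  struct timespec delay = {0, 10 * 1000 * 1000}; /* 10 ms */
  pid_t pid = fork();

  if (pid < 0)
    return -1;
  if (pid == 0) {
    for (;;)
      pause();
  }
  if (child_has_exited(pid))
    return -1; /* the child should still be running here */

  kill(pid, SIGKILL);
  for (int i = 0; i < 500; i++) {
    if (child_has_exited(pid))
      return 0;
    nanosleep(&delay, NULL);
  }
  return -1; /* gave up without ever blocking */
}
```

In a daemon the polling would normally be driven by the existing event loop 
or a timer rather than by a sleep loop, so that other work continues while 
the process is stuck in D state.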




[tickets] [opensaf:tickets] #2432 dtm: Node reboot because transportd reads invalid pid of dtmd

2017-04-20 Thread Minh Hon Chau
- **status**: accepted --> review



---

** [tickets:#2432] dtm: Node reboot because transportd reads invalid pid of 
dtmd**

**Status:** review
**Milestone:** 5.17.06
**Created:** Wed Apr 19, 2017 12:02 AM UTC by Minh Hon Chau
**Last Updated:** Wed Apr 19, 2017 12:26 PM UTC
**Owner:** Minh Hon Chau


There's an unexpected node reboot during OpenSAF node startup:

2017-01-24 18:19:53 SC-1 opensafd: Starting OpenSAF Services(5.2.M0 - 
8532:b6df9e2a2b8b:default) (Using TCP)
2017-01-24 18:19:53 SC-1 osaftransportd[398]: Started
2017-01-24 18:19:53 SC-1 osafdtmd[393]: mkfifo already exists: 
/var/lib/opensaf/osafdtmd.fifo File exists
2017-01-24 18:19:53 SC-1 osaftransportd[398]: Rebooting OpenSAF NodeId = 0 EE 
Name = No EE Mapped, Reason: osafdtmd failed to start, OwnNodeId = 0, 
SupervisionTime = 60
2017-01-24 18:19:53 SC-1 osafdtmd[393]: Started

Another attempt to reproduce the problem, with more debug logging added:

2017-04-18 18:01:14 SC-1 opensafd: Starting OpenSAF Services(5.2.0 - 
0:) (Using TCP)
2017-04-18 18:01:14 SC-1 osaftransportd[380]: fifo_file 
/var/lib/opensaf/osaftransportd.fifo
2017-04-18 18:01:14 SC-1 osaftransportd[380]: mkfifo already exists: 
/var/lib/opensaf/osaftransportd.fifo File exists
2017-04-18 18:01:14 SC-1 osafdtmd[386]: fifo_file /var/lib/opensaf/osafdtmd.fifo
2017-04-18 18:01:14 SC-1 osafdtmd[386]: mkfifo already exists: 
/var/lib/opensaf/osafdtmd.fifo File exists
2017-04-18 18:01:15 SC-1 osaftransportd[380]: __pidfile 
/var/run/opensaf/osaftransportd.pid
2017-04-18 18:01:15 SC-1 osaftransportd[380]: Started
2017-04-18 18:01:15 SC-1 osaftransportd[380]: WA file_path_:/var/run/opensaf, 
file_name_:osafdtmd.pid
2017-04-18 18:01:15 SC-1 osafdtmd[386]: __pidfile /var/run/opensaf/osafdtmd.pid
2017-04-18 18:01:15 SC-1 osaftransportd[380]: WA file name: osafdtmd.pid created
2017-04-18 18:01:15 SC-1 osaftransportd[380]: WA rdstate 6, pid: 4294967295
2017-04-18 18:01:15 SC-1 osaftransportd[380]: Rebooting OpenSAF NodeId = 0 EE 
Name = No EE Mapped, Reason: osafdtmd failed to start, OwnNodeId = 0, 
SupervisionTime = 60
2017-04-18 18:01:15 SC-1 osafdtmd[386]: Started

This could be because osaftransportd fails to read the pid from osafdtmd.pid.
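
The value `pid: 4294967295` in the log is what -1 looks like when printed as 
an unsigned 32-bit number, which would be consistent with the pid file being 
read after it was created but before a pid was written to it. That is an 
interpretation of the log, not a confirmed root cause. Under that assumption, 
a reader could treat an empty or unparsable file as "not ready yet" and retry, 
as in this C sketch (`read_pid_file` is a hypothetical helper, not the 
osaftransportd code):

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Reads a pid from the file at path. Returns the pid (> 0) on success,
 * or -1 if the file is missing, still empty, or does not contain a
 * single positive decimal number. */
pid_t read_pid_file(const char *path)
{
  char buf[32];
  char *end;
  char *line;
  long val;
  FILE *f = fopen(path, "r");

  if (f == NULL)
    return -1;
  line = fgets(buf, sizeof buf, f);
  fclose(f);
  if (line == NULL)
    return -1; /* file exists but nothing has been written yet */

  errno = 0;
  val = strtol(buf, &end, 10);
  if (errno != 0 || end == buf || val <= 0 || val > INT_MAX)
    return -1;
  if (*end != '\0' && *end != '\n')
    return -1; /* trailing garbage after the number */

  return (pid_t)val;
}
```

A caller that gets -1 can wait briefly and try again within its supervision 
time instead of treating the first failed read as a failed start.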


