[tickets] [opensaf:tickets] #245 amf: saAmfComponentErrorClear_4() does not returns SA_AIS_ERR_NO_OP for operationally enabled comp.

2017-02-21 Thread Nagendra Kumar
- **status**: fixed --> assigned



---

** [tickets:#245] amf: saAmfComponentErrorClear_4() does not returns 
SA_AIS_ERR_NO_OP for operationally enabled comp.**

**Status:** assigned
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:37 AM UTC by Praveen
**Last Updated:** Thu Dec 01, 2016 06:14 AM UTC
**Owner:** Nagendra Kumar


Migrated from http://devel.opensaf.org/ticket/2818.

Changeset:3728
When saAmfComponentErrorClear_4() is called for an operationally enabled 
component, it returns SA_AIS_OK. According to the spec (B.04.01, section 7.12.2, 
page 329), the return value should be SA_AIS_ERR_NO_OP.
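
The expected behaviour can be modelled as a small decision function. This is only an illustrative Python sketch, not OpenSAF code: the names `AisRc`, `error_clear_result` and `oper_enabled` are hypothetical, and only the two return-code names come from the ticket. It also assumes that clearing the error on a disabled component simply succeeds, ignoring every other error path.

```python
from enum import Enum


class AisRc(Enum):
    # Only the two return codes named in the ticket are modelled here.
    SA_AIS_OK = "SA_AIS_OK"
    SA_AIS_ERR_NO_OP = "SA_AIS_ERR_NO_OP"


def error_clear_result(oper_enabled: bool) -> AisRc:
    """Return the code the ticket expects from an error-clear request.

    Hypothetical model: `oper_enabled` stands for the component's
    operational state at the time of the call.
    """
    if oper_enabled:
        # Nothing to clear on an enabled component, so report a no-op.
        return AisRc.SA_AIS_ERR_NO_OP
    # Assumption: clearing the error on a disabled component succeeds.
    return AisRc.SA_AIS_OK
```

The reported bug corresponds to the first branch returning SA_AIS_OK instead of SA_AIS_ERR_NO_OP.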



---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot___
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets


[tickets] [opensaf:tickets] #245 amf: saAmfComponentErrorClear_4() does not returns SA_AIS_ERR_NO_OP for operationally enabled comp.

2017-02-21 Thread Nagendra Kumar
- **status**: assigned --> fixed
- **Comment**:

changeset:   8604:33f9c7a3df4a
tag:         tip
user:        Nagendra Kumar
date:        Wed Feb 22 10:33:22 2017 +0530
summary:     amfnd: fix nullptr issue [#245]

[staging:33f9c7]



---

** [tickets:#245] amf: saAmfComponentErrorClear_4() does not returns 
SA_AIS_ERR_NO_OP for operationally enabled comp.**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Thu May 16, 2013 06:37 AM UTC by Praveen
**Last Updated:** Wed Feb 22, 2017 05:52 AM UTC
**Owner:** Nagendra Kumar




[tickets] [opensaf:tickets] #2316 amfnd: amfnd is stuck in removing assignment in shutting down phase

2017-02-21 Thread Minh Hon Chau
- **status**: unassigned --> accepted
- **assigned_to**: Minh Hon Chau



---

** [tickets:#2316] amfnd: amfnd is stuck in removing assignment in shutting 
down phase**

**Status:** accepted
**Milestone:** 5.0.2
**Created:** Wed Feb 22, 2017 01:00 AM UTC by Minh Hon Chau
**Last Updated:** Wed Feb 22, 2017 01:04 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- [app3_twon2su1si.xml](https://sourceforge.net/p/opensaf/tickets/2316/attachment/app3_twon2su1si.xml) (9.2 kB; text/xml)
- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2316/attachment/log.tgz) (696.0 kB; application/x-compressed)


- Configuration: use the attached model
- Steps:
  - Load the application from the attached model
  - SU5 on PL5 has the standby assignment
  - Kill amf_demo on PL5, which is a component of SU5
  - amf_demo is re-instantiated and given its assignment again
  - Delay the CSI set callback for the standby assignment in the amf_demo component
  - Stop opensafd: /opensafd stop
  - Release the CSI set callback in the amf_demo component
- Observation: amfnd is stuck removing the assignment until it is killed by 
opensafd

> 2017-02-21 23:04:54 PL-5 amf_demo[543]: exiting (caught term signal)
> 2017-02-21 23:04:54 PL-5 osafamfnd[412]: NO 'safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' component restart probation timer started (timeout: 300 ns)
> 2017-02-21 23:04:54 PL-5 osafamfnd[412]: NO Restarting a component of 'safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' (comp restart count: 1)
> 2017-02-21 23:04:54 PL-5 osafamfnd[412]: NO 'safComp=AmfDemo,safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' faulted due to 'avaDown' : Recovery is 'componentRestart'
> 2017-02-21 23:04:54 PL-5 osafamfnd[412]: NO 'safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' Presence State INSTANTIATED => RESTARTING
> 2017-02-21 23:04:54 PL-5 amf_demo[605]: 'safComp=AmfDemo,safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' started
> 2017-02-21 23:04:54 PL-5 amf_demo_script: safComp=AmfDemo,safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3
> 2017-02-21 23:04:54 PL-5 amf_demo_script: Starting /srv/osaftest/amf_demo/amf_demo succeeded, rc:0
> 2017-02-21 23:04:54 PL-5 osafamfnd[412]: NO 'safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' Presence State RESTARTING => INSTANTIATED
> 2017-02-21 23:04:54 PL-5 amf_demo[605]: Registered with AMF and HC started
> 2017-02-21 23:04:54 PL-5 amf_demo[605]: CSI Set - add 'safCsi=AmfDemo3,safSi=AmfDemo3,safApp=AmfDemo3' HAState Standby
> 2017-02-21 23:04:54 PL-5 amf_demo[605]: debug delay 2 secs
> 2017-02-21 23:04:57 PL-5 opensafd: Stopping OpenSAF Services
> 2017-02-21 23:04:57 PL-5 osafamfnd[412]: NO Shutdown initiated
> 2017-02-21 23:04:57 PL-5 osafamfnd[412]: NO Removing assignments from AMF components
> 2017-02-21 23:05:00 PL-5 amf_demo[605]: message repeated 3 times: [ debug delay 2 secs]
> 2017-02-21 23:05:02 PL-5 amf_demo[605]: Health check 1
> 2017-02-21 23:05:24 PL-5 osafamfnd[412]: NO 'safSu=SU5,safSg=AmfDemoTwon,safApp=AmfDemo3' Component or SU restart probation timer expired
> 2017-02-21 23:05:57 PL-5 opensafd: amfnd has not yet exited, killing it forcibly.
> 2017-02-21 23:05:57 PL-5 amf_demo[605]: AL AMF Node Director is down, terminate this process
> 2017-02-21 23:05:57 PL-5 amf_demo[605]: exiting (caught term signal)
> 2017-02-21 23:05:57 PL-5 osafclmna[404]: AL AMF Node Director is down, terminate this process
> 2017-02-21 23:05:57 PL-5 osafimmnd[395]: AL AMF Node Director is down, terminate this process
> 2017-02-21 23:05:57 PL-5 osafckptnd[431]: AL AMF Node Director is down, terminate this process
> 2017-02-21 23:05:57 PL-5 osafsmfnd[453]: AL AMF Node Director is down, terminate this process
> 2017-02-21 23:05:57 PL-5 osafamfwd[422]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 132367, SupervisionTime = 60

log/trace is attached





[tickets] [opensaf:tickets] #2210 AMFD: Loss of RT attribute update before headless

2017-02-21 Thread Minh Hon Chau
- **status**: review --> fixed
- **assigned_to**: Minh Hon Chau -->  nobody 



---

** [tickets:#2210] AMFD: Loss of RT attribute update before headless**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Mon Nov 28, 2016 10:18 PM UTC by Minh Hon Chau
**Last Updated:** Tue Feb 07, 2017 03:45 AM UTC
**Owner:** nobody


An IMM RT update of saAmfSIAdminState in AMFD has been seen to be lost just 
before the cluster goes headless. This results in a coredump after the headless 
period.

One scenario is:
- Issue amf-admin shutdown on the SI and delay the CSI quiescing callback
- Stop the SCs, then release the CSI quiescing callback
- Restart the SCs

Observation: saAmfSIAdminState is read as UNLOCKED while the related SUSI was 
QUIESCED, and a coredump occurs as shown below

~~~
Thread 1 (Thread 0x7fec174a0780 (LWP 493)):
#0  0x004fbfd5 in SG_2N::node_fail_si_oper (this=0x24109d0, su=0x2413440) at sg_2n_fsm.cc:3102
        s_susi = 0x8f5000b
        susi_temp = 0x5fa169
        o_su = 0x2417f98
        __FUNCTION__ = "node_fail_si_oper"
        cb = 0x919240 <_control_block>
#1  0x004fe69c in SG_2N::node_fail (this=0x24109d0, cb=0x919240 <_control_block>, su=0x2413440) at sg_2n_fsm.cc:3469
        a_susi = 0x1
        s_susi = 0x7fffedecd2d0
        o_su = 0x5a50bd
        flag = 2
        __FUNCTION__ = "node_fail"
        su_ha_state = 0
#2  0x00513010 in AVD_SG::failover_absent_assignment (this=0x24109d0) at sg.cc:2273
        su = @0x2411330: 0x2413440
        __for_range = std::vector of length 2, capacity 2 = {0x2413440, 0x24111e0}
        __for_begin = 
        __for_end = 
        __FUNCTION__ = "failover_absent_assignment"
#3  0x0043be65 in avd_cluster_tmr_init_evh (cb=0x919240 <_control_block>, evt=0x7fec04000df0) at cluster.cc:103
        i_sg = 0x24109d0
        it = {first = "safSg=1,safApp=osaftest", second = }
        __FUNCTION__ = "avd_cluster_tmr_init_evh"
        su = 0x0
        node = 0x240f9b0
~~~





[tickets] [opensaf:tickets] #2316 amfnd: amfnd is stuck in removing assignment in shutting down phase

2017-02-21 Thread Minh Hon Chau
- **summary**: amfnd: amfnd is stucking in removing assignment in shutting down 
phase --> amfnd: amfnd is stuck in removing assignment in shutting down phase



---

** [tickets:#2316] amfnd: amfnd is stuck in removing assignment in shutting 
down phase**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Wed Feb 22, 2017 01:00 AM UTC by Minh Hon Chau
**Last Updated:** Wed Feb 22, 2017 01:00 AM UTC
**Owner:** nobody


[tickets] [opensaf:tickets] #2316 amfnd: amfnd is stucking in removing assignment in shutting down phase

2017-02-21 Thread Minh Hon Chau



---

** [tickets:#2316] amfnd: amfnd is stucking in removing assignment in shutting 
down phase**

**Status:** unassigned
**Milestone:** 5.0.2
**Created:** Wed Feb 22, 2017 01:00 AM UTC by Minh Hon Chau
**Last Updated:** Wed Feb 22, 2017 01:00 AM UTC
**Owner:** nobody


[tickets] [opensaf:tickets] #2241 dtm: Use number suffix for mds log backup files

2017-02-21 Thread Anders Widell
- **status**: accepted --> review



---

** [tickets:#2241] dtm: Use number suffix for mds log backup files**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Thu Dec 22, 2016 10:12 AM UTC by Anders Widell
**Last Updated:** Thu Dec 22, 2016 10:12 AM UTC
**Owner:** Anders Widell


Use the suffix .0 instead of .bak for the backup file of the MDS log. This 
aligns with the log rotation normally used in /var/log and makes it possible to 
support multiple backup files later.
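
A numbered suffix extends naturally to several backups. The following Python sketch is only an illustration under stated assumptions, not the dtm implementation: the function name `rotation_plan`, the `max_backups` parameter, and the file name used in the example are all hypothetical. It computes the renames one rotation would perform, with .0 as the newest backup.

```python
def rotation_plan(log_path, max_backups=1):
    """Return the renames for one rotation, in the order to apply them.

    Hypothetical scheme: backups are numbered from .0 (newest) upward,
    and the oldest backup is overwritten by the rename targeting its name.
    """
    plan = []
    # Shift existing backups up by one, oldest first, so that no file
    # is overwritten before it has been moved.
    for i in range(max_backups - 1, 0, -1):
        plan.append((f"{log_path}.{i - 1}", f"{log_path}.{i}"))
    # Finally the live log becomes backup .0.
    plan.append((log_path, f"{log_path}.0"))
    return plan
```

With a single backup the plan is one rename, from the log to its .0 file, which is the behaviour this ticket asks for. Larger values of `max_backups` only add shifts in front of that rename.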




[tickets] [opensaf:tickets] #2166 PLM: retry HPI call when retry error code is returned

2017-02-21 Thread Alex Jones
- **Milestone**: 5.2.FC --> future



---

** [tickets:#2166] PLM: retry HPI call when retry error code is returned**

**Status:** accepted
**Milestone:** future
**Created:** Thu Nov 03, 2016 07:18 PM UTC by Alex Jones
**Last Updated:** Thu Nov 03, 2016 07:18 PM UTC
**Owner:** Alex Jones


When trying to deactivate an HE while the shelf manager is failing over, we 
have seen OpenHPI (using ipmidirect plugin) return some different error codes 
(e.g. SA_ERR_HPI_TIMEOUT, SA_ERR_HPI_INTERNAL_ERROR). The PLM code should 
handle these and retry the call before declaring management lost.
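
The retry behaviour requested here can be sketched as a bounded loop. This is a hypothetical Python illustration, not PLM code: `call_with_retry`, `hpi_call`, `max_attempts` and `delay_s` are invented names, return codes are modelled as plain strings, and the set of retryable codes contains only the two examples named in this ticket.

```python
import time

# Assumption: only the two codes mentioned in the ticket are retryable.
RETRYABLE = {"SA_ERR_HPI_TIMEOUT", "SA_ERR_HPI_INTERNAL_ERROR"}


def call_with_retry(hpi_call, max_attempts=3, delay_s=0.0):
    """Call `hpi_call` until it returns a non-retryable code.

    `hpi_call` is a hypothetical zero-argument callable that returns a
    return-code name. The last code seen is returned if every attempt
    yields a retryable error.
    """
    rc = None
    for attempt in range(max_attempts):
        rc = hpi_call()
        if rc not in RETRYABLE:
            return rc
        # Wait before the next attempt, but not after the final one.
        if attempt + 1 < max_attempts:
            time.sleep(delay_s)
    return rc
```

Only when the loop runs out of attempts would the caller go on to declare management lost.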




[tickets] [opensaf:tickets] #1884 plm: add support for running on spare nodes

2017-02-21 Thread Alex Jones
- **Milestone**: 5.2.FC --> future



---

** [tickets:#1884] plm: add support for running on spare nodes**

**Status:** accepted
**Milestone:** future
**Created:** Mon Jun 20, 2016 09:02 AM UTC by Mathi Naickan
**Last Updated:** Mon Aug 29, 2016 08:10 PM UTC
**Owner:** Alex Jones


When #79 was implemented to support spare SCs (2n + spares), PLM was not 
enhanced to match.
This ticket adds the spare-related changes to PLM as well.




[tickets] [opensaf:tickets] #2314 base: ncs_edp_sanamet doesn't handle arrays

2017-02-21 Thread Alex Jones
- **status**: review --> fixed
- **Comment**:

changeset:   8602:1b0c40f815a1
branch:      opensaf-5.1.x
tag:         tip
parent:      8600:56bee7d30df9
user:        Alex Jones
date:        Tue Feb 21 11:04:24 2017 -0500
summary:     base: fix ncs_edp_sanamet for arrays [#2314]

changeset:   8601:6e8905f7e04b
parent:      8597:e99deda2b7e8
user:        Alex Jones
date:        Tue Feb 21 11:01:21 2017 -0500
summary:     base: fix ncs_edp_sanamet for arrays [#2314]




---

** [tickets:#2314] base: ncs_edp_sanamet doesn't handle arrays**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Mon Feb 20, 2017 07:19 PM UTC by Alex Jones
**Last Updated:** Mon Feb 20, 2017 07:31 PM UTC
**Owner:** Alex Jones


A recent change to ncs_edp_sanamet broke encode/decode for SaNameT arrays. 
These are used by PLM and MSG. PLM and MSG standby no longer come up in 5.1.0.

This ticket proposes to put back the old functionality for arrays, until it is 
clear how to properly solve this issue.




[tickets] [opensaf:tickets] #2308 MSG: failover of msgq after node down can take 20 seconds

2017-02-21 Thread Alex Jones
- **status**: review --> fixed



---

** [tickets:#2308] MSG: failover of msgq after node down can take 20 seconds**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Tue Feb 14, 2017 08:06 PM UTC by Alex Jones
**Last Updated:** Tue Feb 21, 2017 03:57 PM UTC
**Owner:** Alex Jones


If a node that hosts a message queue is rebooted, it takes 20 seconds for the 
message queue to fail over to another node.

This happens because msgd is not listening to CLM. The fix is to make sure 
that msgd listens to CLM.




[tickets] [opensaf:tickets] #2308 MSG: failover of msgq after node down can take 20 seconds

2017-02-21 Thread Alex Jones
changeset:   8600:56bee7d30df9
branch:      opensaf-5.1.x
tag:         tip
parent:      8594:15aceb2ce9dd
user:        Alex Jones
date:        Tue Feb 21 10:31:13 2017 -0500
summary:     msgd: only call saClmFinalize if CLM init fails [#2308]

changeset:   8599:2470ad030846
branch:      opensaf-5.0.x
user:        Alex Jones
date:        Tue Feb 21 10:31:13 2017 -0500
summary:     msgd: only call saClmFinalize if CLM init fails [#2308]

changeset:   8598:2cd8b0e7323a
branch:      opensaf-5.0.x
parent:      8595:78b886a029c4
user:        Alex Jones
date:        Tue Feb 21 09:26:49 2017 -0500
summary:     msgd: only call saClmFinalize if CLM init fails [#2308]

changeset:   8597:e99deda2b7e8
user:        Alex Jones
date:        Tue Feb 21 09:26:49 2017 -0500
summary:     msgd: only call saClmFinalize if CLM init fails [#2308]



---

** [tickets:#2308] MSG: failover of msgq after node down can take 20 seconds**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Tue Feb 14, 2017 08:06 PM UTC by Alex Jones
**Last Updated:** Tue Feb 14, 2017 08:45 PM UTC
**Owner:** Alex Jones




[tickets] [opensaf:tickets] #1945 AMF: Refactoring for 5.2

2017-02-21 Thread Praveen
Hi Gary,
Please change the status of this ticket if there are no patches pending 
review under this refactoring ticket.
The ticket also needs to be updated with the changeset information.

Thanks,
Praveen


---

** [tickets:#1945] AMF: Refactoring for 5.2**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Wed Aug 10, 2016 05:37 AM UTC by Gary Lee
**Last Updated:** Mon Aug 29, 2016 10:38 AM UTC
**Owner:** nobody


This is the 5.2 ticket to continue code re-factoring of the AMF service. The 
ticket for 5.0 was #1520.




[tickets] [opensaf:tickets] #2291 dtm: MDS log is overwritten after a node restart

2017-02-21 Thread Anders Widell
- **status**: review --> fixed
- **Comment**:

changeset:   8596:4333bb62d42d
parent:      8593:21aab7e03190
user:        Anders Widell
date:        Tue Feb 21 12:52:39 2017 +0100
summary:     dtm: Append existing log file after a node restart [#2291]

[staging:4333bb]



---

** [tickets:#2291] dtm: MDS log is overwritten after a node restart**

**Status:** fixed
**Milestone:** 5.2.FC
**Created:** Tue Feb 07, 2017 03:07 PM UTC by Anders Widell
**Last Updated:** Wed Feb 15, 2017 09:28 AM UTC
**Owner:** Anders Widell


osaftransportd overwrites an existing MDS log file after a node restart. If an 
MDS log file already exists, new entries should be appended to it.
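
The difference between the two behaviours comes down to how the file is opened. The following minimal Python sketch illustrates the idea only and is not the osaftransportd code: the helper name `append_log_line` is hypothetical.

```python
def append_log_line(path, line):
    """Append one line to the log at `path`, creating the file if needed."""
    # Mode "a" keeps whatever is already in the file; mode "w" would
    # truncate it, which matches the overwrite behaviour reported here.
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

Opening in append mode also covers the first start, because the file is created when it does not exist yet.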




[tickets] [opensaf:tickets] #2266 base: Add a hash function

2017-02-21 Thread Anders Widell
- **status**: accepted --> review



---

** [tickets:#2266] base: Add a hash function**

**Status:** review
**Milestone:** 5.2.FC
**Created:** Tue Jan 17, 2017 02:39 PM UTC by Anders Widell
**Last Updated:** Thu Jan 26, 2017 03:44 PM UTC
**Owner:** Anders Widell


A collision-resistant hash function is needed for implementing ticket [#2258], 
but it can be useful in other contexts as well so the proposal is to implement 
it as a common library function.




[tickets] [opensaf:tickets] #2296 imm: IMMND on payload crashes after SC absence

2017-02-21 Thread Hung Nguyen
- **status**: review --> fixed
- **Comment**:

default (5.2) [staging:21aab7]
changeset:   8593:21aab7e03190
user:        Hung Nguyen
date:        Tue Feb 21 14:46:41 2017 +0700
summary:     imm: Fix problems with removing coordinator role when cluster goes headless [#2296]

opensaf-5.1.x [staging:15aceb]
changeset:   8594:15aceb2ce9dd
user:        Hung Nguyen
date:        Tue Feb 21 14:49:28 2017 +0700
summary:     imm: Fix problems with removing coordinator role when cluster goes headless [#2296]

opensaf-5.0.x [staging:78b886]
changeset:   8595:78b886a029c4
user:        Hung Nguyen
date:        Tue Feb 21 14:49:28 2017 +0700
summary:     imm: Fix problems with removing coordinator role when cluster goes headless [#2296]




---

** [tickets:#2296] imm: IMMND on payload crashes after SC absence**

**Status:** fixed
**Milestone:** 5.0.2
**Created:** Thu Feb 09, 2017 08:44 AM UTC by Hung Nguyen
**Last Updated:** Fri Feb 10, 2017 07:27 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [logs.tgz](https://sourceforge.net/p/opensaf/tickets/2296/attachment/logs.tgz) (5.2 MB; application/x-compressed)


Removal of IMMND coordinator was introduced in [#1692].
Some cleanup actions are delayed until **immnd_proc_server()** is executed.

If the cluster recovers from the headless state too quickly, 
**immnd_proc_server()** will not be executed and IMMND will crash later.
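
The timing problem described above can be illustrated with a toy model. This is only a sketch under assumptions: the class `ToyNode`, its flags and its method names are hypothetical and do not correspond to the actual IMMND code. It only shows how cleanup that is deferred to a periodic function gets skipped when the cluster returns before that function has run.

```python
class ToyNode:
    """Toy model (hypothetical, not IMMND code) of deferred cleanup."""

    def __init__(self):
        self.is_coordinator = True    # hypothetical role flag
        self.pending_cleanup = False  # set when the cluster goes headless

    def on_headless(self):
        # Cleanup is only scheduled here, not performed.
        self.pending_cleanup = True

    def proc_server(self):
        # Periodic tick: the only place where the deferred cleanup runs.
        if self.pending_cleanup:
            self.is_coordinator = False
            self.pending_cleanup = False

    def on_directors_back(self):
        # If no tick ran while headless, the old state is still in place.
        return "stale" if self.pending_cleanup else "clean"


def simulate(tick_before_return):
    """Go headless, optionally run one tick, then let the cluster return."""
    node = ToyNode()
    node.on_headless()
    if tick_before_return:
        node.proc_server()
    return node.on_directors_back()
```

In this model, `simulate(True)` ends in the "clean" state, while `simulate(False)` leaves the node "stale", which corresponds to the fast-recovery case in the log below.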

~~~
2017-02-05 21:36:41 PL-5 osafimmnd[406]: NO Announce sync, epoch:28
2017-02-05 21:36:41 PL-5 osafimmnd[406]: NO SERVER STATE: IMM_SERVER_READY --> IMM_SERVER_SYNC_SERVER
2017-02-05 21:36:41 PL-5 osafimmnd[406]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
2017-02-05 21:36:41 PL-5 osafimmloadd: NO Sync starting
2017-02-05 21:36:42 PL-5 osafdtmd[393]: NO Lost contact with 'SC-1'
2017-02-05 21:36:42 PL-5 osafimmnd[406]: WA Director Service in NOACTIVE state - fevs replies pending:16 fevs highest processed:13154
2017-02-05 21:36:43 PL-5 osafimmnd[406]: WA SC Absence IS allowed:900 IMMD service is DOWN
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO IMMD SERVICE IS DOWN, HYDRA IS CONFIGURED => UNREGISTERING IMMND form MDS
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Removing client id:290002050f sv_id:26
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Removing client id:14d0002050f sv_id:26
2017-02-05 21:36:43 PL-5 osafimmnd[406]: WA Postponing hard delete of admin owner with id:41 when imm is not writable state
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Removing client id:1530002050f sv_id:27
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 147 <339, 2050f> (OpenSafImmPBE)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Removing client id:1550002050f sv_id:26
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 144 <0, 2010f(down)> (safLogService)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 145 <0, 2010f(down)> (@safLogService_appl)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 146 <0, 2010f(down)> (@OpenSafImmReplicatorA)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 143 <0, 2010f(down)> (safClmService)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Implementer disconnected 142 <0, 2010f(down)> (safAmfService)
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO Impl Discarded node 2010f
2017-02-05 21:36:43 PL-5 osafimmnd[406]: NO MDS unregisterede. sleeping ...
2017-02-05 21:36:43 PL-5 osafimmpbed: WA PBE lost contact with parent IMMND - Exiting
2017-02-05 21:36:44 PL-5 osafimmnd[406]: NO Sleep done registering IMMND with MDS
2017-02-05 21:36:44 PL-5 osafimmnd[406]: NO SUCCESS IN REGISTERING IMMND WITH MDS
2017-02-05 21:36:44 PL-5 osafimmnd[406]: NO MDS: mds_register_callback: dest 2050f01e8 already exist
2017-02-05 21:36:44 PL-5 osafimmnd[406]: WA IMMND - Client Node Get Failed for cli_hdl:1464583980303
2017-02-05 21:36:45 PL-5 osafdtmd[393]: NO Established contact with 'SC-1'
2017-02-05 21:36:49 PL-5 osafimmnd[406]: WA MDS Send Failed
2017-02-05 21:36:49 PL-5 osafimmnd[406]: WA Error code 2 returned for message type 17 - ignoring
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO IMMD service is UP ... ScAbsenseAllowed?:900 introduced?:2
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO Re-introduce-me highestProcessed:13154 highestReceived:13154
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO Epoch set to 29 in ImmModel
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO Re-introduce-me highestProcessed:13154 highestReceived:13154
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO ERR_BAD_HANDLE: admin owner id 42 does not exist
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO Implementer connected: 149 (OpenSafImmPBE) <0, 2040f>
2017-02-05 21:36:49 PL-5 osafimmnd[406]: NO Re-introduce-me highestProcessed:13157 highestReceived:13158
2017-02-05 21:36:49 PL-5 osafimmnd[406]: ER Node is in a state that cannot accept start of sync, will terminate
~~~

IMMND failed to revert back to