[tickets] [opensaf:tickets] #2037 IMM: Immd asserted on active controller in backward compatability

2016-09-20 Thread Anders Widell
- **Milestone**: 4.7.2 --> 5.0.2



---

**[tickets:#2037] IMM: Immd asserted on active controller in backward compatability**

**Status:** assigned
**Milestone:** 5.0.2
**Created:** Thu Sep 15, 2016 06:51 AM UTC by Madhurika Koppula
**Last Updated:** Thu Sep 15, 2016 10:31 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [immnd_immd_cores.rtf](https://sourceforge.net/p/opensaf/tickets/2037/attachment/immnd_immd_cores.rtf) (8.5 kB; application/rtf)


**Environment Details:**

OS: SUSE 64-bit
Setup: 4 nodes (2 controllers and 2 payloads, with the headless feature disabled and 1PBE enabled).

Backward compatibility:
OpenSAF versions on nodes:
SC-1 (5.0), SC-2 (5.1 FC), PL-3 (5.0), PL-4 (5.1 FC).

**Summary:** IMMD asserted on active controller after immnd crash.

**Steps followed & Observed behaviour:**

1) SC-1 has the standby role; SC-2 has the active role.
2) The APIs were called in the following sequence:
a) saImmOiInitialize()
b) saImmOiImplementerSet()
c) kill -9 `pidof osafimmnd`
d) saImmOiRtObjectDelete()
e) saImmOiFinalize()

Observations:

1) First, immnd asserted on the active controller in immnd_evt_proc_fevs_rcv.
2) Second, the active controller rebooted after an immd assertion failure.

Below is a log snippet from the active controller SC-2:

Sep 20 20:52:47 SCALE_SLOT-42 osafntfimcnd[15091]: NO saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: Started
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO Persistent Back-End 
capability configured, Pbe file:imm.db (suffix may get added)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO MDS event from svc_id 25 
(change:3, dest:565216648273948)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO IMMD service is UP ... 
ScAbsenseAllowed?:0 introduced?:0
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO SERVER STATE: 
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: immnd_evt.c:9146: 
immnd_evt_proc_fevs_rcv: Assertion '!reply_dest || (reply_dest == 
cb->immnd_mdest_id) || isObjSync' failed.
Sep 20 20:52:47 SCALE_SLOT-42 python2.5: WA imma_mds_svc_evt: 
mds_auth_server_connect failed
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO MDS event from svc_id 25 
(change:4, dest:565216648273948)
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO Restarting a component of 
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 6)
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO 
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery

Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA Error code 2 returned for 
message type 82 - ignoring
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO Extended intro from node 2020f
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: immd_evt.c:816: 
immd_accept_node: Assertion 'node_info->immnd_key != cb->node_id' failed.
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO 
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'

Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast

Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131599, SupervisionTime = 60
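
The two assertion failures are easy to miss among the surrounding notices. As a minimal sketch (not part of OpenSAF; the function name `find_assertions` and the regular expression are assumptions inferred only from the line format in the snippet above), the following Python re-joins the lines wrapped by the mail client and extracts the daemon, source location, function, and failed expression of each assertion:

```python
import re

# Matches entries such as:
#   osafimmd[12159]: immd_evt.c:816: immd_accept_node: Assertion '...' failed.
# The format is inferred from the snippet above, not from OpenSAF documentation.
ASSERT_RE = re.compile(
    r"(?P<daemon>\w+)\[(?P<pid>\d+)\]: "
    r"(?P<file>[\w.]+):(?P<line>\d+): "
    r"(?P<func>\w+): Assertion '(?P<expr>.*?)' failed\."
)


def find_assertions(log_text):
    """Return (daemon, file, line, function, expression) per assertion failure."""
    # Mail wrapping splits one log entry over several lines, so collapse
    # every run of whitespace (including newlines) into a single space.
    flat = " ".join(log_text.split())
    return [
        (m["daemon"], m["file"], int(m["line"]), m["func"], m["expr"])
        for m in ASSERT_RE.finditer(flat)
    ]


if __name__ == "__main__":
    # Two wrapped lines copied from the snippet above.
    sample = (
        "Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: immd_evt.c:816: \n"
        "immd_accept_node: Assertion 'node_info->immnd_key != cb->node_id' failed.\n"
    )
    for hit in find_assertions(sample):
        print(hit)
```

Run against the full snippet, this should list the immnd assertion in `immnd_evt_proc_fevs_rcv` first and the immd assertion in `immd_accept_node` second, matching the order of the observations.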

After the reboot, the timestamps are as below:

Sep 20 20:52:47 SCALE_SLOT-42 opensaf_reboot: Rebooting local node; timeout=60
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA DISCARD DUPLICATE FEVS 
message:12996
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA Error code 2 returned for 
message type 82 - ignoring
Sep 20 20:52:48 SCALE_SLOT-42 osafimmnd[15136]: NO SERVER STATE: 
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 20 20:52:48 SCALE_SLOT-42 osafimmnd[15136]: NO SERVER STATE: 
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 22 02:13:05 SCALE_SLOT-42 syslog-ng[1133]: syslog-ng starting up; 
version='2.0.9'
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1650]: Version 1.2.3 starting
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1650]: Backgrounding to notify hosts...
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1651]: Running as root.  chown 
/var/lib/nfs to choose different user
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1651]: DNS resolution of CONN-PC 
failed; retrying later
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Version 1.2.3 starting
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Flags: TI-RPC
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Running as root.  chown 
/var/lib/nfs to choose different user
Sep 22 02:13:07 SCALE_SLOT-42 opensafd: Starting OpenSAF Services(5.1.FC - ) 
(Using TIPC)
Sep 22 02:13:07 SCALE_SLOT-42 osafclmna[1745]: Started
Sep 22 02:13:07 SCALE_SLOT-42 osafclmna[1745]: NO 
safNode=SC-2,safCluster=myClmCluster Joined cluster, nodeid=2020f
Sep 22 02:13:07 SCALE_SLOT-42 osafrded[1754]: Started
Sep 22 02:13:08 
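
The log reports the same node in two notations: osafimmd and osafclmna print the node ID in hexadecimal (`2020f`), while osafamfnd prints it in decimal (`131599`). A minimal sketch for confirming that both values refer to the same node; the helper name `same_node` is hypothetical:

```python
def same_node(hex_id, dec_id):
    """Return True if a hexadecimal node ID string equals a decimal node ID."""
    return int(hex_id, 16) == dec_id


# Values taken from the log snippet above.
print(same_node("2020f", 131599))  # prints True
```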

[tickets] [opensaf:tickets] #2037 IMM: Immd asserted on active controller in backward compatability

2016-09-15 Thread Neelakanta Reddy
immd traces are not available for the time when the assertion happened:
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: immnd_evt.c:9146: 
immnd_evt_proc_fevs_rcv: Assertion '!reply_dest || (reply_dest == 
cb->immnd_mdest_id) || isObjSync' failed.

Can you check that system resources are not exhausted, for example the hard disk (check whether the disk is full)?
This also looks like a memory corruption problem.

Please make sure sufficient resources are available and try running the test again.
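
As a rough sketch of the disk check suggested above, the following Python flags a nearly full filesystem. The helper name `disk_nearly_full`, the 5% threshold, and the use of `/` as the mount point are assumptions for illustration, not values from the ticket; substitute whichever filesystem holds the OpenSAF logs and the IMM database on the node.

```python
import shutil


def disk_nearly_full(path, min_free_fraction=0.05):
    """Return True if the filesystem holding `path` is below the free-space threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free / usage.total < min_free_fraction


if __name__ == "__main__":
    # "/" is used only so that the sketch runs anywhere; pick the relevant mount point.
    for mount in ("/",):
        state = "LOW" if disk_nearly_full(mount) else "ok"
        print(f"{mount}: {state}")
```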



---

**[tickets:#2037] IMM: Immd asserted on active controller in backward compatability**

**Status:** assigned
**Milestone:** 4.7.2
**Created:** Thu Sep 15, 2016 06:51 AM UTC by Madhurika Koppula
**Last Updated:** Thu Sep 15, 2016 07:19 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [immnd_immd_cores.rtf](https://sourceforge.net/p/opensaf/tickets/2037/attachment/immnd_immd_cores.rtf) (8.5 kB; application/rtf)



[tickets] [opensaf:tickets] #2037 IMM: Immd asserted on active controller in backward compatability

2016-09-15 Thread Neelakanta Reddy
- **status**: unassigned --> assigned
- **assigned_to**: Neelakanta Reddy
- **Part**: - --> d



---

**[tickets:#2037] IMM: Immd asserted on active controller in backward compatability**

**Status:** assigned
**Milestone:** 4.7.2
**Created:** Thu Sep 15, 2016 06:51 AM UTC by Madhurika Koppula
**Last Updated:** Thu Sep 15, 2016 06:52 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**

- [immnd_immd_cores.rtf](https://sourceforge.net/p/opensaf/tickets/2037/attachment/immnd_immd_cores.rtf) (8.5 kB; application/rtf)



[tickets] [opensaf:tickets] #2037 IMM: Immd asserted on active controller in backward compatability

2016-09-15 Thread Madhurika Koppula
Logs are attached.


Attachments:

- [immd_assert.tgz](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/98733abe/7f1c/attachment/immd_assert.tgz) (18.7 MB; application/octet-stream)


---

**[tickets:#2037] IMM: Immd asserted on active controller in backward compatability**

**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Sep 15, 2016 06:51 AM UTC by Madhurika Koppula
**Last Updated:** Thu Sep 15, 2016 06:51 AM UTC
**Owner:** nobody
**Attachments:**

- [immnd_immd_cores.rtf](https://sourceforge.net/p/opensaf/tickets/2037/attachment/immnd_immd_cores.rtf) (8.5 kB; application/rtf)



[tickets] [opensaf:tickets] #2037 IMM: Immd asserted on active controller in backward compatability

2016-09-15 Thread Madhurika Koppula



---

**[tickets:#2037] IMM: Immd asserted on active controller in backward compatability**

**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Sep 15, 2016 06:51 AM UTC by Madhurika Koppula
**Last Updated:** Thu Sep 15, 2016 06:51 AM UTC
**Owner:** nobody
**Attachments:**

- [immnd_immd_cores.rtf](https://sourceforge.net/p/opensaf/tickets/2037/attachment/immnd_immd_cores.rtf) (8.5 kB; application/rtf)


**Environment Details:**

OS: SUSE 64-bit
Setup: 4 nodes (2 controllers and 2 payloads, with the headless feature disabled and 1PBE enabled).

Backward compatibility:
OpenSAF versions on nodes:
SC-1 (5.0), SC-2 (5.1 FC), PL-3 (5.0), PL-4 (5.1 FC).

**Summary:** IMMD asserted on active controller after immnd crash.

**Steps followed & Observed behaviour:**

1) SC-1 has the standby role; SC-2 has the active role.
2) The APIs were called in the following sequence:
a) saImmOiInitialize()
b) saImmOiImplementerSet()
c) kill -9 `pidof osafimmnd`
d) saImmOiRtObjectDelete()
e) saImmOiFinalize()

Observations:

1) First, immnd asserted on the active controller in immnd_evt_proc_fevs_rcv.
2) Second, the active controller rebooted after an immd assertion failure.

Below is a log snippet from the active controller SC-2:

Sep 20 20:52:47 SCALE_SLOT-42 osafntfimcnd[15091]: NO saImmOiDispatch() Fail 
SA_AIS_ERR_BAD_HANDLE (9)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: Started
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO Persistent Back-End 
capability configured, Pbe file:imm.db (suffix may get added)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO MDS event from svc_id 25 
(change:3, dest:565216648273948)
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO IMMD service is UP ... 
ScAbsenseAllowed?:0 introduced?:0
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: NO SERVER STATE: 
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15114]: immnd_evt.c:9146: 
immnd_evt_proc_fevs_rcv: Assertion '!reply_dest || (reply_dest == 
cb->immnd_mdest_id) || isObjSync' failed.
Sep 20 20:52:47 SCALE_SLOT-42 python2.5: WA imma_mds_svc_evt: 
mds_auth_server_connect failed
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO MDS event from svc_id 25 
(change:4, dest:565216648273948)
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO Restarting a component of 
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 6)
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO 
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown' 
: Recovery

Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA Error code 2 returned for 
message type 82 - ignoring
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: NO Extended intro from node 2020f
Sep 20 20:52:47 SCALE_SLOT-42 osafimmd[12159]: immd_evt.c:816: 
immd_accept_node: Assertion 'node_info->immnd_key != cb->node_id' failed.
Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: NO 
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : 
Recovery is 'nodeFailfast'

Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: ER 
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast

Sep 20 20:52:47 SCALE_SLOT-42 osafamfnd[12239]: Rebooting OpenSAF NodeId = 
131599 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131599, SupervisionTime = 60

After the reboot, the timestamps are as below:

Sep 20 20:52:47 SCALE_SLOT-42 opensaf_reboot: Rebooting local node; timeout=60
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA DISCARD DUPLICATE FEVS 
message:12996
Sep 20 20:52:47 SCALE_SLOT-42 osafimmnd[15136]: WA Error code 2 returned for 
message type 82 - ignoring
Sep 20 20:52:48 SCALE_SLOT-42 osafimmnd[15136]: NO SERVER STATE: 
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Sep 20 20:52:48 SCALE_SLOT-42 osafimmnd[15136]: NO SERVER STATE: 
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Sep 22 02:13:05 SCALE_SLOT-42 syslog-ng[1133]: syslog-ng starting up; 
version='2.0.9'
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1650]: Version 1.2.3 starting
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1650]: Backgrounding to notify hosts...
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1651]: Running as root.  chown 
/var/lib/nfs to choose different user
Sep 22 02:13:06 SCALE_SLOT-42 sm-notify[1651]: DNS resolution of CONN-PC 
failed; retrying later
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Version 1.2.3 starting
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Flags: TI-RPC
Sep 22 02:13:06 SCALE_SLOT-42 rpc.statd[1662]: Running as root.  chown 
/var/lib/nfs to choose different user
Sep 22 02:13:07 SCALE_SLOT-42 opensafd: Starting OpenSAF Services(5.1.FC - ) 
(Using TIPC)
Sep 22 02:13:07 SCALE_SLOT-42 osafclmna[1745]: Started
Sep 22 02:13:07 SCALE_SLOT-42 osafclmna[1745]: NO 
safNode=SC-2,safCluster=myClmCluster Joined cluster, nodeid=2020f
Sep 22 02:13:07 SCALE_SLOT-42 osafrded[1754]: Started
Sep 22 02:13:08 SCALE_SLOT-42 osaffmd[1763]: Started


Below is the