- **status**: unassigned --> assigned
- **assigned_to**: Neelakanta Reddy
- **Comment**:
The assertion that failed is:
src/imm/immd/immd_evt.c:813: immd_accept_node: Assertion
'node_info->immnd_key != cb->node_id' failed.
1. cb->node_id is set in immd_mds_register.
Here cb->node_id has to be 2020f, since this IMMD runs on node 2020f.
2. The node that arrived is printed by:
LOG_NO("Extended intro from node %x", node_info->immnd_key);
Mar 20 23:38:09.111387 osafimmd [17384:src/imm/immd/immd_evt.c:1563] NO
Extended intro from node 2020f
So, per the trace above, node_info->immnd_key is also 2020f.
There is probably some memory corruption problem here.
After how many IMMND restarts do you observe this? The shared immd logs do not
start from the beginning.
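
The node ids in this ticket are printed in two bases: immd logs the
arriving node in hex (2020f), while the amfnd reboot message in the ticket
body reports OwnNodeId = 131599 in decimal. A minimal Python sketch to confirm
that both name the same node; the helper name node_id_hex is hypothetical and
not part of OpenSAF:

```python
def node_id_hex(node_id: int) -> str:
    """Format a decimal node id the way the immd '%x' log format prints it."""
    return format(node_id, "x")


# Values copied from the logs in this ticket.
own_node_id = 131599    # amfnd: "OwnNodeId = 131599"
intro_node = "2020f"    # immd: "Extended intro from node 2020f"

# True when the decimal and hex values refer to the same node.
print(node_id_hex(own_node_id) == intro_node)
```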
---
** [tickets:#2388] imm: active node rebooted due immd assertion failure**
**Status:** assigned
**Milestone:** 5.2.RC2
**Created:** Tue Mar 21, 2017 07:18 AM UTC by M Chandrasekhar
**Last Updated:** Tue Mar 21, 2017 07:18 AM UTC
**Owner:** Neelakanta Reddy
**Attachments:**
-
[logs.tar](https://sourceforge.net/p/opensaf/tickets/2388/attachment/logs.tar)
(38.0 MB; application/octet-stream)
###Environment details
OS : Suse 64bit
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )
SC-1 and PL-3 installed with 4.7GA
SC-2 and PL-4 installed with 5.2RC1
###Summary
The active controller was rebooted because immd hit an assertion failure after
a few immnd restarts.
Steps followed:
1. Bring up SC-1 and PL-3 with the 4.7GA version.
2. Bring up SC-2 and PL-4 with the 5.2RC1 version.
3. Do an si-swap to make SC-2 active.
4. Run a few regression tests and immnd restarts; the issue was then noticed.
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE
2927
Mar 20 23:38:02 fos2 osafimmd[17384]: NO ACT: New Epoch for IMMND process at
node 2010f old epoch: 29 new epoch:30
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO RepositoryInitModeT is
SA_IMM_KEEP_REPOSITORY
Mar 20 23:38:02 fos2 osafimmnd[27544]: WA IMM Access Control mode is DISABLED!
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Epoch set to 30 in ImmModel
Mar 20 23:38:02 fos2 test_immsv: IN Received PROC_STALE_CLIENTS
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO SERVER STATE: IMM_SERVER_SYNC_CLIENT
--> IMM_SERVER_READY
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO ImmModel received scAbsenceAllowed 0
Mar 20 23:38:02 fos2 osafimmd[17384]: NO ACT: New Epoch for IMMND process at
node 2030f old epoch: 29 new epoch:30
Mar 20 23:38:02 fos2 osafimmd[17384]: NO ACT: New Epoch for IMMND process at
node 2040f old epoch: 29 new epoch:30
Mar 20 23:38:02 fos2 osafimmd[17384]: NO ACT: New Epoch for IMMND process at
node 2020f old epoch: 0 new epoch:30
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 944
(safSmfService) <315, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 945
(safEvtService) <123, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 946
(safLogService) <127, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 947
(safCheckPointService) <134, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 948
(safClmService) <131, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 949
(safLckService) <135, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 950
(MsgQueueService131599) <12777, 2020f>
Mar 20 23:38:02 fos2 osafimmnd[27544]: NO Implementer connected: 951
(safAmfService) <129, 2020f>
Mar 20 23:38:03 fos2 osafimmnd[27544]: NO Implementer (applier) connected: 952
(@OpenSafImmReplicatorB) <13770, 2020f>
Mar 20 23:38:03 fos2 osafntfimcnd[27526]: NO Started
Mar 20 23:38:03 fos2 osafimmnd[27544]: NO PBE-OI established on other SC.
Dumping incrementally to file imm.db
Mar 20 23:38:08 fos2 sudo: tet : TTY=unknown ; PWD=/tmp/26815aa ;
USER=root ; COMMAND=/bin/kill -9 27544
Mar 20 23:38:08 fos2 osafimmd[17384]: NO MDS event from svc_id 25 (change:4,
dest:565217221926950)
Mar 20 23:38:08 fos2 osafamfnd[17445]: NO Restarting a component of
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 10)
Mar 20 23:38:08 fos2 osafamfnd[17445]: NO
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Mar 20 23:38:08 fos2 osafntfimcnd[27526]: NO saImmOiDispatch() Fail
SA_AIS_ERR_BAD_HANDLE (9)
Mar 20 23:38:08 fos2 osafimmnd[27586]: mkfifo already exists:
/var/lib/opensaf/osafimmnd.fifo File exists
Mar 20 23:38:08 fos2 osafimmnd[27586]: Started
Mar 20 23:38:08 fos2 osafimmnd[27586]: NO Persistent Back-End capability
configured, Pbe file:imm.db (suffix may get added)
Mar 20 23:38:08 fos2 osafimmd[17384]: NO MDS event from svc_id 25 (change:3,
dest:565217221935144)
Mar 20 23:38:08 fos2 osafimmnd[27586]: NO IMMD service is UP ...
ScAbsenseAllowed?:0 introduced?:0
Mar 20 23:38:08 fos2 osafimmnd[27586]: NO SERVER STATE: IMM_SERVER_ANONYMOUS
--> IMM_SERVER_CLUSTER_WAITING
Mar 20 23:38:08 fos2 osafimmnd[27586]: NO Fevs count adjusted to 64649
preLoadPid: 0
Mar 20 23:38:08 fos2 osafimmnd[27586]: src/imm/immnd/immnd_evt.c:9125:
immnd_evt_proc_fevs_rcv: Assertion '!reply_dest || (reply_dest ==
cb->immnd_mdest_id) || isObjSync' failed.
Mar 20 23:38:08 fos2 osafimmd[17384]: NO MDS event from svc_id 25 (change:4,
dest:565217221935144)
Mar 20 23:38:08 fos2 osafamfnd[17445]: NO Restarting a component of
'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 11)
Mar 20 23:38:08 fos2 osafamfnd[17445]: NO
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
Mar 20 23:38:08 fos2 osafimmnd[27609]: mkfifo already exists:
/var/lib/opensaf/osafimmnd.fifo File exists
Mar 20 23:38:08 fos2 osafimmnd[27609]: Started
Mar 20 23:38:08 fos2 osafimmnd[27609]: NO Persistent Back-End capability
configured, Pbe file:imm.db (suffix may get added)
Mar 20 23:38:08 fos2 osafimmd[17384]: NO MDS event from svc_id 25 (change:3,
dest:565217227268166)
Mar 20 23:38:08 fos2 osafimmnd[27609]: NO IMMD service is UP ...
ScAbsenseAllowed?:0 introduced?:0
Mar 20 23:38:08 fos2 osafimmnd[27609]: NO SERVER STATE: IMM_SERVER_ANONYMOUS
--> IMM_SERVER_CLUSTER_WAITING
Mar 20 23:38:08 fos2 osafimmnd[27609]: NO Fevs count adjusted to 64651
preLoadPid: 0
Mar 20 23:38:08 fos2 osafimmnd[27609]: WA DISCARD DUPLICATE FEVS message:64652
Mar 20 23:38:08 fos2 osafimmnd[27609]: WA Error code 2 returned for message
type 82 - ignoring
Mar 20 23:38:09 fos2 osafimmnd[27609]: WA DISCARD DUPLICATE FEVS message:64653
Mar 20 23:38:09 fos2 osafimmnd[27609]: WA Error code 2 returned for message
type 82 - ignoring
Mar 20 23:38:09 fos2 osafimmd[17384]: NO Extended intro from node 2020f
Mar 20 23:38:09 fos2 osafimmd[17384]: src/imm/immd/immd_evt.c:813:
immd_accept_node: Assertion 'node_info->immnd_key != cb->node_id' failed.
Mar 20 23:38:09 fos2 osafamfnd[17445]: NO
'safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' :
Recovery is 'nodeFailfast'
Mar 20 23:38:09 fos2 osafamfnd[17445]: ER
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Mar 20 23:38:09 fos2 osafamfnd[17445]: Rebooting OpenSAF NodeId = 131599 EE
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId =
131599, SupervisionTime = 60
Mar 20 23:38:09 fos2 osafimmnd[27609]: WA DISCARD DUPLICATE FEVS message:64653
immnd and immd traces are attached
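
To relate the failure to the number of IMMND restarts, the restart counters
that amfnd prints can be pulled out of the syslog. The following is only a
sketch: the function name restart_counts is hypothetical, and it assumes that
every "comp restart count" line refers to the IMMND component, as is the case
in the excerpt above. The sample lines are the two amfnd messages from this
ticket, with the mail line wrapping undone.

```python
import re
from typing import List

# Matches the counter amfnd appends to its "Restarting a component" message.
RESTART_RE = re.compile(r"comp restart count: (\d+)")


def restart_counts(log_text: str) -> List[int]:
    """Return every component restart count found in the log, in order."""
    return [int(m.group(1)) for m in RESTART_RE.finditer(log_text)]


sample = (
    "Mar 20 23:38:08 fos2 osafamfnd[17445]: NO Restarting a component of "
    "'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 10)\n"
    "Mar 20 23:38:08 fos2 osafamfnd[17445]: NO Restarting a component of "
    "'safSu=SC-2,safSg=NoRed,safApp=OpenSAF' (comp restart count: 11)\n"
)

print(restart_counts(sample))
```

Because the pattern only needs the text inside the parentheses, it also works
on the wrapped form of the log, where the counter still sits on a single line.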