The problem occurs because, after a restart, the immnd process inherits
environment variables from amfnd, including MDS_SOCK_SERVER_CONNECT. The
following patch also fixes the problem:
diff --git a/osaf/services/saf/immsv/immnd/immnd_main.c b/osaf/services/saf/immsv/immnd/immnd_main.c
--- a/osaf/services/saf/immsv/immnd/immnd_main.c
+++ b/osaf/services/saf/immsv/immnd/immnd_main.c
@@ -119,6 +119,9 @@ static uint32_t immnd_initialize(char *p
 	setenv("MDS_SOCK_SERVER_NAME", name, 1);
 	putenv("MDS_SOCK_SERVER_CREATE=YES");
+	LOG_NO("MDS_SOCK_SERVER_CONNECT: %p", getenv("MDS_SOCK_SERVER_CONNECT"));
+	unsetenv("MDS_SOCK_SERVER_CONNECT");
+
 	if (ncs_agents_startup() != NCSCC_RC_SUCCESS) {
 		LOG_ER("ncs_agents_startup FAILED");
 		goto done;
The root cause is an AMF defect (an old ticket for it should already exist):
AMF leaks its environment variables to its child processes.
Either an MDS change or an immnd change would work, but I am not yet sure
which one is best. Ideally AMF should be fixed!
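
For illustration only, here is a rough sketch of what a fix on the parent side could look like: the child drops the variable between fork() and exec(), so the restarted process never sees it. This is not amfnd code; `spawn_without_env` is a hypothetical helper and the shell command is just a stand-in for the real component start script.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of OpenSAF): fork a child, remove one
 * variable from the child's environment, then run a shell command.
 * Returns the command's exit status, or -1 on failure.
 *
 * Caveat: unsetenv() is not async-signal-safe, so in a multithreaded
 * parent it would be safer to build a filtered envp and call execve().
 */
static int spawn_without_env(const char *var, const char *cmd)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: only this process loses the variable. */
		unsetenv(var);
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);	/* exec failed */
	}

	/* Parent: its own environment is untouched. */
	int status;
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as `spawn_without_env("MDS_SOCK_SERVER_CONNECT", cmd)` would start `cmd` without the variable while the parent keeps it. The immnd patch above reaches the same result from the child side.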
---
** [tickets:#990] mds: restart of IMMND fails**
**Status:** assigned
**Milestone:** 4.5.0
**Created:** Sun Aug 17, 2014 12:32 PM UTC by Neelakanta Reddy
**Last Updated:** Mon Aug 18, 2014 08:35 AM UTC
**Owner:** Hans Feldt
Bring up the cluster and restart the IMMND process. IMMND fails to
re-initialize in MDS.
IMMND traces:
Aug 14 14:29:36.850338 osafimmnd [4569:sysf_def.c:0090] TR INITIALIZING LEAP ENVIRONMENT
Aug 14 14:29:36.850615 osafimmnd [4569:sysf_def.c:0123] TR DONE INITIALIZING LEAP ENVIRONMENT
Aug 14 14:29:36.850828 osafimmnd [4569:ncs_main_pub.c:0755] TR NCS:NODE_ID=0x0002010F
Aug 14 14:29:36.851056 osafimmnd [4569:osaf_secutil.c:0178] >> osaf_auth_server_create
Aug 14 14:29:36.851153 osafimmnd [4569:osaf_secutil.c:0200] << osaf_auth_server_create
Aug 14 14:29:36.851169 osafimmnd [4569:osaf_secutil.c:0256] >> osaf_auth_server_connect
Aug 14 14:29:36.851190 osafimmnd [4569:osaf_secutil.c:0151] >> auth_server_main
Aug 14 14:29:36.851202 osafimmnd [4569:osaf_secutil.c:0064] >> handle_new_connection
Aug 14 14:29:36.851215 osafimmnd [4569:mds_main.c:0132] >> mds_register_callback: fd:13, pid:4569
Aug 14 14:29:36.851226 osafimmnd [4569:mds_main.c:0152] TR mds: received 77 from 2010f62fc0040, pid 4569
Aug 14 14:29:46.861479 osafimmnd [4569:osaf_secutil.c:0285] T3 poll timeout 10000
Aug 14 14:29:46.861717 osafimmnd [4569:osaf_secutil.c:0308] << osaf_auth_server_connect
Aug 14 14:29:46.861777 osafimmnd [4569:mds_main.c:0213] T3 tmo
Aug 14 14:29:46.861888 osafimmnd [4569:ncs_main_pub.c:0289] T4 ERROR: MDS lib_req failed
Aug 14 14:29:46.861916 osafimmnd [4569:ncs_main_pub.c:0344] T4 ERROR: MDS startup failed
Aug 14 14:29:46.861936 osafimmnd [4569:ncs_main_pub.c:0346] TR IN LEAP_DBG_SINK
Aug 14 14:29:46.862004 osafimmnd [4569:immnd_main.c:0123] ER ncs_agents_startup FAILED
Aug 14 14:29:46.862027 osafimmnd [4569:immnd_main.c:0231] << immnd_initialize
Aug 14 14:29:46.862067 osafimmnd [4569:immnd_main.c:0258] ER initialize_immd failed
Aug 14 14:29:46.862107 osafimmnd [4569:immnd_main.c:0370] ER Failed, exiting...