>
>
> getenv() and setenv() are not required by posix to be thread safe (see
> man pthreads).  I assume there is a good reason for this.
>
> >From my interpretation of that man page, it appears syslog() is required
> to be thread safe.  I took a look at glibc2.9 and it does indeed use
> strftime_l inside its syslog calls (which is on the non-threadsafe
> list).
>
> One workaround is simply to remove the use of the syslog libc API and
> implement it directly.  I am not too enthusiastic about the current
> syslog implementation in libc after reading through the code (memory
> allocations, etc).  This was done in the past but at some point was
> reverted.
>
> The issue remains regarding the proper use of getenv() and setenv()
> inside corosync.  Ideally we wouldn't use thread unsafe apis, but we do
> and getenv/setenv seem pretty important.  I thought of using weak (or
> maybe its called lazy) bindings to override getenv and setenv, but these
> apis appear to not support lazy(/or weak i forget term) bindings.
>
> To correct this problem fully I suggest the following changes:
> 1) write our own syslog implementation which connects to the syslog
> daemon socket and uses the write system call to write the output.
> Examples of this are in older versions of openais
> 2) add getenv() and setenv() apis to the coroapi definitions
>  these getenv/setenv call the libc getenv/setenv with a mutex held to
> mutually exclude the environment.  Pacemaker and internal corosync code
> can then call these getenv/setenv calls where necessary to ensure
> correct operation.
>
> There may be some issue with pacemaker using setenv after forking which
> could then operate on an invalid mutex that I would like Andrew's
> feedback regarding.
>
> bzs:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=561180
>
> https://bugzilla.redhat.com/show_bug.cgi?id=561181
>
> Feel free to add your analysis
>
> Regards
> -steve
>

Thank you very much. I added a test program and some comments on bugzilla.
Also I tested my two proposals since I need some fix right now. It worked
OK. Only thing I needed to export HA_cluster_type=openais. I added this to
/etc/init.d/corosync before start(). I could change the pacemaker source to
use openais by default if HA_cluster_type is not set, then I don't need to
export. But it's OK. Until the fix is available, I will use my workaround.

hj
Index: corosync/exec/main.c
===================================================================
--- corosync/exec/main.c	(revision 406)
+++ corosync/exec/main.c	(working copy)
@@ -756,6 +756,10 @@
 		corosync_exit_error (AIS_DONE_INIT_SERVICES);
 	}
 	evil_init (api);
+
+        /* Create logsys worker thread after pcmk_startup() finishes 
+           to prevent a seg fault from setenv()/getenv() */
+        logsys_fork_completed();
 }
 
 int main (int argc, char **argv)
@@ -994,7 +998,6 @@
 	if (background) {
 		corosync_tty_detach ();
 	}
-	logsys_fork_completed();
 
 	/*
 	 * Sleep for a while to let other nodes in the cluster
Index: pacemaker/lib/ais/plugin.c
===================================================================
--- pacemaker/lib/ais/plugin.c	(revision 406)
+++ pacemaker/lib/ais/plugin.c	(working copy)
@@ -325,11 +325,11 @@
     get_config_opt(pcmk_api, local_handle, "debug", &value, "on");
     if(ais_get_boolean(value)) {
 	plugin_log_level = LOG_DEBUG;
-	setenv("HA_debug",  "1", 1);
+	/*setenv("HA_debug",  "1", 1);*/
 	
     } else {
 	plugin_log_level = LOG_INFO;
-	setenv("HA_debug",  "0", 1);
+	/*setenv("HA_debug",  "0", 1);*/
     }    
     
     get_config_opt(pcmk_api, local_handle, "to_file", &value, "off");
@@ -346,13 +346,13 @@
 	if(value == NULL) {
 	    ais_err("Logging to a file requested but no log file specified");
 	} else {
-	    setenv("HA_logfile",  value, 1);
+	    /*setenv("HA_logfile",  value, 1);*/
 	}
     }
     
     get_config_opt(pcmk_api, local_handle, "syslog_facility", &value, "daemon");
-    setenv("HA_logfacility",  value, 1);
-    setenv("HA_LOGFACILITY",  value, 1);
+    /*setenv("HA_logfacility",  value, 1);*/
+    /*setenv("HA_LOGFACILITY",  value, 1);*/
 
     config_find_done(pcmk_api, local_handle);
     
@@ -371,7 +371,7 @@
     local_cname_len = strlen(local_cname);
     
     get_config_opt(pcmk_api, local_handle, "use_logd", &value, "no");
-    setenv("HA_use_logd", value, 1);
+    /*setenv("HA_use_logd", value, 1);*/
 
     get_config_opt(pcmk_api, local_handle, "use_mgmtd", &value, "no");
     if(ais_get_boolean(value) == FALSE) {
@@ -511,8 +511,8 @@
     membership_notify_list = g_hash_table_new(g_direct_hash, g_direct_equal);
     ipc_client_list = g_hash_table_new(g_direct_hash, g_direct_equal);
     
-    setenv("HA_COMPRESSION",  "bz2", 1);
-    setenv("HA_cluster_type", "openais", 1);
+    /*setenv("HA_COMPRESSION",  "bz2", 1);
+    setenv("HA_cluster_type", "openais", 1);*/
     
     ais_info("CRM: Initialized");
     log_printf(LOG_INFO, "Logging: Initialized %s\n", __PRETTY_FUNCTION__);
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais

Reply via email to