On Wed, 2010-02-03 at 01:15 -0700, Steven Dake wrote: > On Tue, 2010-02-02 at 18:48 -0700, hj lee wrote: > > > > > > getenv() and setenv() are not required by posix to be thread > > safe (see > > man pthreads). I assume there is a good reason for this. > > > > >From my interpretation of that man page, it appears syslog() > > is required > > to be thread safe. I took a look at glibc2.9 and it does > > indeed use > > strftime_l inside its syslog calls (which is on the > > non-threadsafe > > list). > > > > One workaround is simply to remove the use of the syslog libc > > API and > > implement it directly. I am not too enthusiastic about the > > current > > syslog implementation in libc after reading through the code > > (memory > > allocations, etc). This was done in the past but at some > > point was > > reverted. > > > > The issue remains regarding the proper use of getenv() and > > setenv() > > inside corosync. Ideally we wouldn't use thread unsafe apis, > > but we do > > and getenv/setenv seem pretty important. I thought of using > > weak (or > > maybe its called lazy) bindings to override getenv and setenv, > > but these > > apis appear to not support lazy(/or weak i forget term) > > bindings. > > > > To correct this problem fully I suggest the following changes: > > 1) write our own syslog implementation which connects to the > > syslog > > daemon socket and uses the write system call to write the > > output. > > Examples of this are in older versions of openais > > 2) add getenv() and setenv() apis to the coroapi definitions > > these getenv/setenv call the libc getenv/setenv with a mutex > > held to > > mutually exclude the environment. Pacemaker and internal > > corosync code > > can then call these getenv/setenv calls where necessary to > > ensure > > correct operation. > > > > There may be some issue with pacemaker using setenv after > > forking which > > could then operate on an invalid mutex that I would like > > Andrew's > > feedback regarding. > > > > bzs: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=561180 > > > > https://bugzilla.redhat.com/show_bug.cgi?id=561181 > > > > Feel free to add your analysis > > > > Regards > > -steve > > > > Thank you very much. I added a test program and some comments on > > bugzilla. Also I tested my two proposals since I need some fix right > > now. It worked OK. Only thing I needed to export > > HA_cluster_type=openais. I added this to /etc/init.d/corosync before > > start(). I could change the pacemaker source to use openais by default > > if HA_cluster_type is not set, then I don't need to export. But it's > > OK. Until the fix is available, I will use my workaround. > > > > hj > > Would you please test the attached fix. It should provide a workaround. >
hj Another option for workaround attached which I prefer you test instead. glibc's tzset call has code in it to check if it has been called, and if so, to exit. This results in tzset's getenv being executed before any threads are spawned, vs in a thread context inside a syslog call stack. Thanks > Regards > -steve > _______________________________________________ > Openais mailing list > [email protected] > https://lists.linux-foundation.org/mailman/listinfo/openais
Index: exec/main.c =================================================================== --- exec/main.c (revision 2651) +++ exec/main.c (working copy) @@ -1202,6 +1202,8 @@ char corosync_lib_dir[PATH_MAX]; hdb_handle_t object_runtime_handle; + tzset (); + #if defined(HAVE_PTHREAD_SPIN_LOCK) pthread_spin_init (&serialize_spin, 0); #endif
_______________________________________________ Openais mailing list [email protected] https://lists.linux-foundation.org/mailman/listinfo/openais
