Hi,

We have an application that uses lttng-ust for logging.  We are seeing a crash 
in getenv here:

#0  __GI_getenv (name=0xb6eb06de "TNG_UST_WITHOUT_BADDR_STATEDUMP") at getenv.c:85
#1  0xb6e7b350 in do_baddr_statedump (owner=0xb6ecf300 <global_apps>) at lttng-ust-statedump.c:315
#2  do_lttng_ust_statedump (owner=owner@entry=0xb6ecf300 <global_apps>) at lttng-ust-statedump.c:341
#3  0xb6e71ef4 in lttng_handle_pending_statedump (owner=owner@entry=0xb6ecf300 <global_apps>) at lttng-events.c:856
#4  0xb6e690ac in handle_pending_statedump (sock_info=0xb6ecf300 <global_apps>) at lttng-ust-comm.c:581
#5  handle_message (lum=0xb48fe66c, sock=<optimized out>, sock_info=<optimized out>) at lttng-ust-comm.c:966
#6  ust_listener_thread (arg=0xb6ecf300 <global_apps>) at lttng-ust-comm.c:1490
#7  0xb6e33f6c in start_thread (arg=0xb48ff220) at pthread_create.c:339

The core shows that this thread is one of three threads in the child process
just after a fork().  After the fork(), the one application thread in the child
calls setenv() to set up the environment and then execs another program.  The
problem is that setenv() is not thread-safe, especially when it has to resize
the environment vector.  If the application thread calls setenv() to add a new
environment variable while this lttng listener thread is inside getenv(), the
vector can be resized while it is being searched, which sends getenv() off
into the weeds.
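
One application-side way to sidestep the race, sketched here only as an
illustration: never call setenv() in the child.  Instead, build a private copy
of the environment before fork() and hand it to execve(), so the global
environ vector is only ever read.  The helper names env_copy_with(),
env_copy_free() and spawn_with_env() are hypothetical and are not part of our
application or of lttng-ust.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: return a newly allocated, NULL-terminated vector that
 * holds every entry of `envp` except an existing definition of `name`, plus
 * a new "name=value" entry at the end. `envp` itself is only read.
 */
char **env_copy_with(char *const envp[], const char *name, const char *value)
{
    assert(name != NULL && value != NULL && strchr(name, '=') == NULL);

    size_t count = 0;
    while (envp[count] != NULL)
        count++;

    size_t name_len = strlen(name);
    size_t entry_len = name_len + 1 + strlen(value) + 1; /* name=value\0 */
    char *entry = malloc(entry_len);
    /* Room for all old entries, the new entry, and the NULL terminator. */
    char **copy = malloc((count + 2) * sizeof(*copy));
    if (entry == NULL || copy == NULL) {
        free(entry);
        free(copy);
        return NULL;
    }
    snprintf(entry, entry_len, "%s=%s", name, value);

    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        /* Drop any previous definition of `name`. */
        if (strncmp(envp[i], name, name_len) == 0 && envp[i][name_len] == '=')
            continue;
        copy[out++] = envp[i]; /* borrowed pointer, not duplicated */
    }
    copy[out++] = entry;
    copy[out] = NULL;
    return copy;
}

/* Free a vector returned by env_copy_with(); only the last entry is owned. */
void env_copy_free(char **copy)
{
    if (copy == NULL)
        return;
    size_t n = 0;
    while (copy[n] != NULL)
        n++;
    free(copy[n - 1]);
    free(copy);
}

/* Hypothetical usage: exec `path` with one extra variable, without setenv(). */
pid_t spawn_with_env(const char *path, char *const argv[],
                     const char *name, const char *value)
{
    char **child_env = env_copy_with(environ, name, value); /* before fork() */
    if (child_env == NULL)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        execve(path, argv, child_env);
        _exit(127); /* only reached if execve() failed */
    }
    env_copy_free(child_env); /* parent's copy; the child has its own */
    return pid;
}
```

Because the vector is built before fork() and the child goes straight to
execve(), nothing in the child modifies environ while the listener thread may
be reading it.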

I assume that this listener thread is created because we have preloaded
liblttng-ust-fork.  I see no reason that this particular process really needs
to preload that library, so one workaround is probably to just remove it.  The
problem is that this process inherits LD_PRELOAD from a parent process (one
similar to init) that launches many other daemons, some of which might actually
require liblttng-ust-fork, so removing the library from the process that is
crashing is not entirely trivial.  This crash also raises the question of
whether we could encounter similar crashes in other processes that use
liblttng-ust.  We only see this crash after many hours of intensive testing.

Would it be safe to say that it is probably a bug for an lttng thread to call
a non-thread-safe function like getenv()?  What's the best way to fix this?
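
To make the question concrete, here is one possible direction on the library
side, offered as a rough sketch and not as a description of how lttng-ust
works: read the variable once from a constructor, before any listener thread
exists, and have the statedump code consult the cached value.  The names
env_cache_init() and env_cache_without_baddr_statedump() are hypothetical, and
I am assuming the full variable name is LTTNG_UST_WITHOUT_BADDR_STATEDUMP (it
appears truncated in the backtrace above).

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed full name of the variable seen (truncated) in the backtrace. */
#define WITHOUT_BADDR_ENV "LTTNG_UST_WITHOUT_BADDR_STATEDUMP"

static int without_baddr_statedump; /* cached: 1 if the variable was set */
static int env_cache_ready;         /* 1 once env_cache_init() has run */

/*
 * Hypothetical init hook. The GCC/Clang constructor attribute runs it when
 * the library is loaded, before the library has started any thread of its
 * own, so this is the only place that calls getenv().
 */
__attribute__((constructor))
void env_cache_init(void)
{
    without_baddr_statedump = (getenv(WITHOUT_BADDR_ENV) != NULL);
    env_cache_ready = 1;
}

/* What the listener thread would call in place of getenv(). */
int env_cache_without_baddr_statedump(void)
{
    assert(env_cache_ready);
    return without_baddr_statedump;
}
```

A child created by fork() inherits the cached flag, so its listener thread
never has to search environ.  The trade-off is that changes made to the
variable after the library is loaded are not seen.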

Thanks,
Doug



_______________________________________________
lttng-dev mailing list
[email protected]
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
