In article <[EMAIL PROTECTED]>,
Maxim Sobolev <[EMAIL PROTECTED]> wrote:
> I'm not sure what exactly caused this behaviour (I can guess two potential
> victims: O'Brien's changes in crt stuff and recent Polstra's changes in
> libgcc_r), but it seems that some programs built on the previous -current from
> 27 October immediately segfault when I'm trying to run them on a system installed
> from today's sources. The segfault disappeared when I recompiled affected
> program. With this message I'm attaching short backtrace.
> Program received signal SIGSEGV, Segmentation fault.
> 0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> (gdb) bt
> #0 0x287de417 in pthread_mutex_lock () from /usr/lib/libc_r.so.4
> #1 0x806e782 in __register_frame_info ()
> #2 0x287a3137 in _init () from /usr/lib/libc_r.so.4
> #3 0x2879ffe5 in _init () from /usr/lib/libc_r.so.4
> #4 0x280797fd in _rtld () from /usr/libexec/ld-elf.so.1
Here are all the random facts which, when put together, explain what
is going on.
Your old application was (like all -pthread programs) linked
with "/usr/lib/libgcc_r.a". That library contains a function
"__register_frame_info" which uses some of the facilities of the
pthreads library "libc_r".
The pthreads library has to be initialized before it can be used, by
a call to _thread_init. If some functions such as pthread_mutex_lock
are called before the library has been initialized, a segmentation
fault occurs.
_thread_init is called automatically from libc_r's _init function
when the dynamic linker loads the library. Unfortunately, that
isn't early enough. libgcc_r is the first thing to be initialized,
and it calls pthread_mutex_lock before _thread_init has been called.
Or rather I should say that OLD versions of libgcc_r did that --
because they were buggy.
In other words, your old application was linked with a buggy version
of libgcc_r, but it didn't become apparent until now.
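To make the ordering problem concrete, here is a toy model in C. It is
only a sketch, not the actual libc_r or libgcc_r code: every toy_* name
is made up for illustration, and the assumption that the uninitialized
state shows up as a null pointer is mine. Where the real library would
fault, the toy version returns -1 so that the effect can be observed
safely.

```c
#include <stddef.h>

/* Hypothetical stand-in for the thread library's per-thread state. */
struct toy_thread {
    int locks_held;
};

static struct toy_thread toy_initial_thread;

/* NULL until the library has been initialized (an assumption of this sketch). */
static struct toy_thread *toy_current = NULL;

/* Stand-in for _thread_init: sets up the state the lock code relies on. */
static void toy_thread_init(void)
{
    toy_current = &toy_initial_thread;
}

/*
 * Stand-in for pthread_mutex_lock.  It returns -1 at the point where
 * uninitialized state would be used; the real library crashes instead.
 */
static int toy_mutex_lock(void)
{
    if (toy_current == NULL)
        return -1;
    toy_current->locks_held++;
    return 0;
}

/* Stand-in for the old __register_frame_info, which takes a lock. */
static int toy_register_frame_info(void)
{
    return toy_mutex_lock();
}
```

Calling toy_register_frame_info() before toy_thread_init() fails, and
calling it afterwards succeeds. That is the same ordering dependency the
old binaries run into when frame registration happens ahead of
_thread_init.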
It didn't become apparent until now because our crtbegin.o and
crtend.o were also buggy. They failed to call __register_frame_info.
This was a problem for C++ programs using exceptions, especially when
the gcc port was used and DWARF2 exception handling was selected.
Now we have fixed crtbegin.o and crtend.o, and we have fixed
libgcc_r.a. But it causes problems for your old application because
the new crtbegin.o and crtend.o (linked into the new shared libraries
such as libc_r) call __register_frame_info in your old, buggy,
statically linked libgcc_r.a.
Are you dizzy yet? To sum up, your old executable contains the bug but
it wasn't triggered until the recent changes.
Now, what can or should we do about this? Arguably we should simply
say in the release notes, "Relink your old multithreaded applications.
They had a bug which is now fixed." But if there are binary-only
commercial apps which exhibit the problem, this solution is useless.
I don't know whether there are any such apps, but I doubt it. N.B.,
Linux apps don't count because they were never linked with our
libgcc_r in the first place.
Or we can try to work around it, but there aren't any perfectly nice
ways to do so. Here are some possibilities:
- Put a hack in the threads library so that whenever
pthread_mutex_lock is called it checks to make sure that the
threads library has been initialized, and if not, it calls
_thread_init. This is a poor solution because it adds overhead to
a rather performance-critical function -- though admittedly the
overhead is very small. Another potential problem is that there
could be a race condition if several threads all called
pthread_mutex_lock at once before the threads library had been
initialized. I don't think the race condition would materialize,
though, since the first call would come from libgcc_r, well before
the application had gotten control.
- Put a hack into the dynamic linker to call _thread_init very early
if that symbol was defined. I like this solution even less,
because it's too hackish. The dynamic linker isn't the place for
special hooks like that.
- Put a hack into crtbegin.o or crtend.o. But we are using the
standard GNU versions of these, and I really really don't want to
change that. In any case, it's the wrong place for the hack.
Overall I would lean toward putting the hack into pthread_mutex_lock.
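A minimal sketch of what the first option might look like, again as a
toy model and not as a patch against libc_r. The guard_* names, the
flag, and the int used as a mutex are all hypothetical; the real
_thread_init and pthread_mutex_lock have their own internals that are
not shown here.

```c
#include <stddef.h>

static int guard_initialized = 0;   /* hypothetical "library is ready" flag */
static int guard_init_calls = 0;    /* counts real initializations, for checking */

/* Stand-in for _thread_init; safe to call more than once. */
static void guard_thread_init(void)
{
    if (guard_initialized)
        return;
    guard_initialized = 1;
    guard_init_calls++;
}

/*
 * Stand-in for pthread_mutex_lock with the proposed check added.
 * The extra cost on the normal path is one test of a flag.
 * The toy "mutex" is just an int that is set to 1 when taken.
 */
static int guard_mutex_lock(int *mutex)
{
    if (!guard_initialized)
        guard_thread_init();    /* early caller, e.g. frame registration */
    if (mutex == NULL)
        return -1;
    *mutex = 1;
    return 0;
}
```

The first call to guard_mutex_lock() triggers the initialization and
later calls only pay for the flag test. The sketch does nothing about
the race mentioned above; it relies on the same argument, namely that
the first call happens before any other thread can exist.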
John Polstra [EMAIL PROTECTED]
John D. Polstra & Co., Inc. Seattle, Washington USA
"Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa