On 03/24/2011 12:27 AM, Mike Frysinger wrote:
> 2011/3/23 Timo Teräs:
>> On 03/23/2011 06:33 PM, Mike Frysinger wrote:
>>> On Wed, Mar 23, 2011 at 12:29 PM, Carmelo AMOROSO wrote:
>>>> indeed we have seen also some issue with dlopen, when TLS variables were
>>>> involved. One suspect we have is actually thread-safety. glibc use some
>>>> locking primitives to access TLS block, while we don't.
>>>> Not yet had a free time slot to look at this deeply.
>>>
>>> i'm not sure how thread safety would play a role with TLS variables.
>>> TLS, by definition, is per-thread and that is guaranteed by the
>>> ABI/compiler/C lib/kernel down to the variable level, and that whole
>>> stack shouldnt be poking into any shared state (beyond shared .text).
>>
>> ldso needs to allocate each global TLS variable from globally single
>> place. That's how it organises the per-thread data area. So yes, when dl
>> allocates the TLS variables it needs to go fiddle global variables.
>
> mmm my understanding was that each thread got its own chunk of TLS
> area at thread creation time, and after that the ldso only existed
> when the thread needed to know the address of its TLS. what port are
> you talking about exactly ?
Using TLS variables is safe obviously.
But if after startup, someone does dlopen(libfoo.so) and libfoo defines
global TLS variables, those variables need to be allocated from the
global TLS tables. I believe that's mostly done in _dl_add_to_slotinfo.
However, I think this is not my problem.
I think the problem is _dl_loaded_modules, as suspected first. What
likely happens is this (foo.so and bar.so both depend on the same
module, zap.so):
Thread A                      Thread B
dlopen(foo.so)
  loads zap.so
dlclose(foo.so)
  start unloading zap.so
  (finds it refcount=0)
                              dlopen(bar.so)
                                finds zap.so from _dl_loaded_modules
                                refcount not checked, it's just
                                referenced unconditionally
  unmaps zap.so
                                goes along with reference to module
                                being unloaded
                                -> BOOM
So yes, we seem to have a bunch of globals unprotected, _dl_loaded_modules
and _dl_symbol_tables amongst others. (BTW, a singly linked list for
loaded_modules is really inefficient; we should probably make it a
hash table (indexed with st_dev and st_ino) so we can search it fast in
_dl_load_elf_shared_library().)
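To illustrate the hash table idea, here is a minimal sketch assuming a
chained table keyed on (st_dev, st_ino). The struct, the bucket count and
the function names are all hypothetical and are not the actual ldso data
structures; the hash function is just one plausible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MODULE_BUCKETS 64 /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for the ldso's per-module record. */
struct module {
    dev_t st_dev;              /* device of the backing file */
    ino_t st_ino;              /* inode of the backing file */
    int refcount;
    struct module *hash_next;  /* next entry in the same bucket */
};

static struct module *module_table[MODULE_BUCKETS];

/* Mix inode and device numbers into a bucket index. */
size_t module_hash(dev_t dev, ino_t ino)
{
    unsigned long long h = (unsigned long long)ino * 2654435761ULL;
    h ^= (unsigned long long)dev;
    return (size_t)(h & (MODULE_BUCKETS - 1));
}

/* Push a module onto the front of its bucket chain. */
void module_insert(struct module *m)
{
    size_t b = module_hash(m->st_dev, m->st_ino);
    assert(m->hash_next == NULL); /* must not already be chained */
    m->hash_next = module_table[b];
    module_table[b] = m;
}

/* Walk one bucket instead of the whole list of loaded modules. */
struct module *module_lookup(dev_t dev, ino_t ino)
{
    struct module *m;
    for (m = module_table[module_hash(dev, ino)]; m != NULL; m = m->hash_next)
        if (m->st_dev == dev && m->st_ino == ino)
            return m;
    return NULL;
}
```

The loader would stat() the candidate file once and call module_lookup()
with the result. Note that this table is itself a shared global, so it
needs the same locking as the list it replaces.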
So, as an immediate band-aid, we probably need to add an rwlock (or just
a regular mutex) to dlopen (rw)/dlsym (ro)/dlclose (rw). That should at
least prevent crashes. But yes, more fine-grained locking (or even RCU)
would be preferable.
- Timo