On 16/01/2012 9.09, Carmelo Amoroso wrote:
> On 16/01/2012 8.53, Khem Raj wrote:
>> On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
>> <[email protected]> wrote:
>>> On 15/01/2012 7.22, Khem Raj wrote:
>>>> On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj <[email protected]> wrote:
>>>>> On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj <[email protected]> wrote:
>>>>>> On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj <[email protected]> wrote:
>>>>>>> On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO 
>>>>>>> <[email protected]> wrote:
>>>>>>>>> and since I see the same issue on all architectures, it is probably
>>>>>>>>> not the elfinterp changes either. Most likely it is in the way the
>>>>>>>>> scopes are being handled
>>>>>>>>>
>>>>>>>>
>>>>>>>> We reviewed this change several times before committing. Anyway, we
>>>>>>>> will review it again. We have never seen any lookup failure in any
>>>>>>>> of our tests. The only change in the way the symbol scope is
>>>>>>>> created is in where ld.so is added.
>>>>>>>> In the original code it was the last entry of the global scope, while
>>>>>>>> with the new structure in place it is added as soon as it is found (as
>>>>>>>> glibc actually does), and I don't really think this could have any
>>>>>>>> impact.
>>>>>>>
>>>>>>> I tried to reverse it as well but the problem remained.
>>>>>>>
>>>>>>>>
>>>>>>>> We are trying to start up an X system on our platform. Is there any
>>>>>>>> simple X app we can run to show the failure?
>>>>>>>>
>>>>>>>> Is some .so failing to be dlopen-ed due to an unresolved symbol?
>>>>>>>
>>>>>>> This is possible. I will try to debug it.
>>>>>>
>>>>>> This is the problem that happens with the new scoping and does not
>>>>>> happen without it
>>>>>>
>>>>>> Error reading Pango modules file
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
>>>>>> No builtin or dynamically loaded modules were found.
>>>>>> PangoFc will not work correctly.
>>>>>> This probably means there was an error in the creation of:
>>>>>>  '/etc/pango/pango.modules'
>>>>>> You should create this file by running:
>>>>>>  pango-querymodules > '/etc/pango/pango.modules'
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>> expect ugly output. engine-type='PangoRenderFc', script='latin'
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>> expect ugly output. engine-type='PangoRenderFc', script='common'
>>>>>
>>>>> here is the error
>>>>>
>>>>> /usr/bin/pango-querymodules: can't resolve symbol
>>>>> '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.
>>>>>
>>>>> This does not happen without the scope patch.
>>>>>
>>>>> pango-querymodules loads a shared library,
>>>>> /usr/lib/pango/1.6.0/modules/pango-basic-fc.so, using dlopen, and this
>>>>> library has libstdc++.so.6 in its DT_NEEDED entries.
>>>>>
>>>>> I was trying to create a small testcase: a small binary that dlopens
>>>>> another .so which has libstdc++ in DT_NEEDED in its header. So far I
>>>>> have not been able to reproduce the problem with it, but I am making
>>>>> some progress.
>>>>
>>>>
>>>> I might have a test case here: http://uclibc.org/~kraj/reproducer_v2.tar.gz
>>>> Untar it on the target, run make and then ./run.sh
>>>>
>>>> With the buggy libraries I get
>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>> 1)main:dlopen  libA.so
>>>> 4)libC:dlopen  libB.so
>>>> 5)libC:atexit(libC_fini)
>>>> 6)main:dlclose libA.so
>>>> /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
>>>> in lib './/libC.so'.
>>>>
>>>> whereas without the scopes patch I get
>>>>
>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>> 1)main:dlopen  libA.so
>>>> 4)libC:dlopen  libB.so
>>>> 5)libC:atexit(libC_fini)
>>>> 6)main:dlclose libA.so
>>>> 7)libC:finish - atexit()
>>>> 8)main:finish main
>>>> root@qemux86:~/rep/reproducer_v2#
>>>>
>>>>
>>>> I think that's the problem I am facing in pango-querymodules as well.
>>>> Another data point: if I use BIND_NOW, it works too.
>>>>
>>>> Let me know if you can reproduce it with this testcase.
>>>>
>>>> Thanks
>>>> -Khem
>>>>
>>>
>>> Thanks, Khem, for your effort in reproducing this.
>>> I'll let you know asap.
>>>
>>> We will focus 100% on this from now on.
>>>
>>> Carmelo
>>
>> I have a patch (sort of) which fixes this issue; have a look at it.
>> The problem is that it's trying to unload sub scopes after the dlopened
>> library has been removed from the global scope, so I just delayed that
>> removal.
>>
> 
> what is triggering the problem is the use of atexit()
> 

I'd ask: is it correct for a dlopen-ed shared library to install
a function via atexit() to be called at program exit, when the shared
library could be unloaded at any time during the program's life?

I'd say that with the old code it was working only by luck!

The shared library image is actually unmapped, so why should we
expect some of its symbols to still be alive?
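
To make the concern concrete, here is a toy model in C of how exit
handlers could be tied to the module that registered them, so that they
run when that module is unloaded instead of at program exit. As far as I
know, this is roughly the behaviour the Linux atexit(3) man page
describes for glibc. It is only a sketch under stated assumptions:
toy_atexit(), toy_finalize(), sample_fini() and the owner token are
hypothetical names, and nothing here is uClibc or glibc code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 32

typedef void (*exit_fn)(void);

struct exit_entry {
    exit_fn fn;         /* handler to call, NULL once it has run */
    const void *owner;  /* hypothetical token for the registering module */
};

static struct exit_entry handlers[MAX_HANDLERS];
static size_t handler_count;

/* Counter bumped by the sample handler, so callers can observe calls. */
int sample_fini_calls;

/* Stand-in for a library's cleanup function (hypothetical). */
void sample_fini(void)
{
    sample_fini_calls++;
}

/* Register fn on behalf of owner.
 * Returns 0 on success, -1 if the table is full. */
int toy_atexit(exit_fn fn, const void *owner)
{
    assert(fn != NULL);
    if (handler_count == MAX_HANDLERS)
        return -1;
    handlers[handler_count].fn = fn;
    handlers[handler_count].owner = owner;
    handler_count++;
    return 0;
}

/* Run, in reverse registration order, every pending handler that belongs
 * to owner. Passing NULL runs all pending handlers (the program exit case).
 * Returns the number of handlers that were run. */
size_t toy_finalize(const void *owner)
{
    size_t ran = 0;

    for (size_t i = handler_count; i-- > 0; ) {
        if (handlers[i].fn == NULL)
            continue;  /* already run */
        if (owner != NULL && handlers[i].owner != owner)
            continue;  /* belongs to another module */

        exit_fn fn = handlers[i].fn;
        handlers[i].fn = NULL;  /* clear first so it can never run twice */
        fn();
        ran++;
    }
    return ran;
}
```

In this model, registering sample_fini() for a token and then calling
toy_finalize() with that token (the dlclose step) runs the handler once,
while the library is still mapped. A later toy_finalize(NULL) (the
program exit step) finds nothing left to call, so no handler is ever
invoked after its library is gone.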

>> http://www.uclibc.org/~kraj/fix_libdl.patch
>>
> 
> looking at it

Setting aside the concerns about the use of atexit(), this patch is
correct. Could we avoid the unlink_local_scope guard and test the
stored_ls pointer directly?

carmelo
_______________________________________________
uClibc mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/uclibc
