On 16/01/2012 9.09, Carmelo Amoroso wrote:
> On 16/01/2012 8.53, Khem Raj wrote:
>> On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
>> <[email protected]> wrote:
>>> On 15/01/2012 7.22, Khem Raj wrote:
>>>> On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj <[email protected]> wrote:
>>>>> On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj <[email protected]> wrote:
>>>>>> On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj <[email protected]> wrote:
>>>>>>> On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO
>>>>>>> <[email protected]> wrote:
>>>>>>>>> and since I see the same issue on all architectures probably its not
>>>>>>>>> elfinterp changes
>>>>>>>>> too. Mostly it seems likely that it could be in the way the scopes are
>>>>>>>>> being handled
>>>>>>>>>
>>>>>>>>
>>>>>>>> we have reviewed several times this change before committing. Anyway we
>>>>>>>> will review it again. We have not ever seen any failure in the lookup
>>>>>>>> with all of our tests. The only change in the way the symbol scope is
>>>>>>>> created is in where the ld.so is added.
>>>>>>>> In the original code it was the last entry of the global scope, while
>>>>>>>> with the new structure in place it was added as soon as found (as glibc
>>>>>>>> actually does).... and I don't really think this could have some
>>>>>>>> impact.
>>>>>>>
>>>>>>> I tried to reverse it as well but the problem remained.
>>>>>>>
>>>>>>>>
>>>>>>>> We are trying to startup a X system on our platform. Is there any
>>>>>>>> simple
>>>>>>>> X app we can run to show the failure ?
>>>>>>>>
>>>>>>>> Is some .so failing to be dl-opened due to unresolved symbol ?
>>>>>>>
>>>>>>> this is potentially possible. I will try to debug it through
>>>>>>
>>>>>> This is the problem that happens with the new scoping and does not
>>>>>> happen without it
>>>>>>
>>>>>> Error reading Pango modules file
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
>>>>>> No builtin or dynamically loaded modules were found.
>>>>>> PangoFc will not work correctly.
>>>>>> This probably means there was an error in the creation of:
>>>>>> '/etc/pango/pango.modules'
>>>>>> You should create this file by running:
>>>>>> pango-querymodules > '/etc/pango/pango.modules'
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>> expect ugly output. engine-type='PangoRenderFc', script='latin'
>>>>>>
>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>> expect ugly output. engine-type='PangoRenderFc', script='common'
>>>>>
>>>>> here is the error
>>>>>
>>>>> /usr/bin/pango-querymodules: can't resolve symbol
>>>>> '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.
>>>>>
>>>>> this does not happen without scope patch
>>>>>
>>>>> pango-querymodules loads a shared library
>>>>> /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
>>>>> library had libstdc++.so.6 in its DT_NEEDED entries
>>>>>
>>>>> I was trying to create a small testcase where I created a small binary
>>>>> which would dlopen another .so which has libstdc++ in DT_NEEDED in its
>>>>> header so not able to reproduce a small testcase but making some
>>>>> progress
>>>>
>>>>
>>>> I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
>>>> untar it on target and run make and the ./run.sh
>>>>
>>>> with buggy libraries i get
>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>> 1)main:dlopen libA.so
>>>> 4)libC:dlopen libB.so
>>>> 5)libC:atexit(libC_fini)
>>>> 6)main:dlclose libA.so
>>>> /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
>>>> in lib './/libC.so'.
>>>>
>>>> whereas without the scopes patch I get
>>>>
>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>> 1)main:dlopen libA.so
>>>> 4)libC:dlopen libB.so
>>>> 5)libC:atexit(libC_fini)
>>>> 6)main:dlclose libA.so
>>>> 7)libC:finish - atexit()
>>>> 8)main:finish main
>>>> root@qemux86:~/rep/reproducer_v2#
>>>>
>>>>
>>>> I think thats the problem that I am facing in pango-querymodules as well
>>>> another data point is if I use BIND_NOW then it works too.
>>>>
>>>> let me know if you can reproduce it with this testcase
>>>>
>>>> Thanks
>>>> -Khem
>>>>
>>>
>>> Thanks khem for your effort in reproducing.
>>> I-ll let you know asap.
>>>
>>> We will focus on this 100% since now.
>>>
>>> Carmelo
>>
>> I have a patch (sort of) which fixes this issue have a look at it.
>> Problem is that its trying to unload sub scopes after it has been
>> removed from global scope so I just delayed the removal of dlopened
>> library
>>
>
> what is triggering the problem is the use of atexit()
>
I'd ask: is it correct for a dlopen-ed shared library to install a
function via atexit() to be called at program exit, if the shared
library can be unloaded at any time during the program's life?
I'd say that with the old code it only worked by luck!
Once the shared library image has actually been unmapped, why should we
expect any of its symbols to still be alive?

>> http://www.uclibc.org/~kraj/fix_libdl.patch
>>
>
> looking at it not considering the concerns on the use of atexit, this patch is correct.

Could we avoid using the unlink_local_scope guard and test the stored_ls
pointer directly?

carmelo
_______________________________________________
uClibc mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/uclibc
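
As an illustration of the atexit() question raised above, here is a toy model in C of one way a handler registered by a dlopen-ed library can be prevented from outliving that library: each handler is tagged with its owner and is run when the owner is unloaded, not at program exit. This imitates the idea behind __cxa_atexit() and __cxa_finalize() from the Itanium C++ ABI. It is only a sketch: model_atexit(), model_finalize(), libC_token and demo() are made-up names, nothing here is taken from uClibc or from fix_libdl.patch, and whether uClibc's atexit() behaves this way in the build under discussion is not established in this thread.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy model only: these names are hypothetical, not uClibc internals. */
#define MAX_HANDLERS 16

typedef void (*exit_fn)(void);

struct handler {
    exit_fn fn;         /* cleanup function to call */
    const void *owner;  /* token identifying the registering library */
    int active;         /* 1 until the handler has been run */
};

static struct handler table[MAX_HANDLERS];
static size_t count;

/* Analogue of __cxa_atexit(): remember which library owns the handler. */
static int model_atexit(exit_fn fn, const void *owner)
{
    if (count == MAX_HANDLERS)
        return -1;
    table[count].fn = fn;
    table[count].owner = owner;
    table[count].active = 1;
    count++;
    return 0;
}

/*
 * Analogue of __cxa_finalize(owner): run, in reverse registration order,
 * the still-active handlers owned by the library being unloaded.
 * owner == NULL stands for program exit and runs whatever is left.
 * Returns the number of handlers that were run.
 */
static int model_finalize(const void *owner)
{
    int ran = 0;
    for (size_t i = count; i-- > 0; ) {
        if (table[i].active && (owner == NULL || table[i].owner == owner)) {
            table[i].active = 0;   /* retire first so it never runs twice */
            table[i].fn();
            ran++;
        }
    }
    return ran;
}

static int libC_fini_calls;

static void libC_fini(void)
{
    libC_fini_calls++;
    puts("7)libC:finish - atexit()");
}

/* Returns 1 if the handler ran exactly once, at "dlclose" time. */
int demo(void)
{
    static char libC_token;  /* stands in for libC's per-library handle */

    model_atexit(libC_fini, &libC_token);           /* 5)libC:atexit(libC_fini) */
    int at_dlclose = model_finalize(&libC_token);   /* 6)main:dlclose libA.so   */
    int at_exit = model_finalize(NULL);             /* program exit             */

    return at_dlclose == 1 && at_exit == 0 && libC_fini_calls == 1;
}
```

Calling demo() from a main() runs the handler once, at the simulated dlclose, and leaves nothing to run at the simulated exit. That ordering matches the working log quoted above, where step 7 is printed between the dlclose in step 6 and the end of main in step 8.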
