On 16/01/2012 10.36, Carmelo Amoroso wrote:
> On 16/01/2012 9.09, Carmelo Amoroso wrote:
>> On 16/01/2012 8.53, Khem Raj wrote:
>>> On Sun, Jan 15, 2012 at 11:46 PM, Carmelo AMOROSO
>>> <carmelo.amor...@st.com> wrote:
>>>> On 15/01/2012 7.22, Khem Raj wrote:
>>>>> On Sat, Jan 14, 2012 at 6:10 PM, Khem Raj <raj.k...@gmail.com> wrote:
>>>>>> On Fri, Jan 13, 2012 at 4:13 PM, Khem Raj <raj.k...@gmail.com> wrote:
>>>>>>> On Fri, Jan 13, 2012 at 3:45 PM, Khem Raj <raj.k...@gmail.com> wrote:
>>>>>>>> On Fri, Jan 13, 2012 at 1:37 AM, Carmelo AMOROSO
>>>>>>>> <carmelo.amor...@st.com> wrote:
>>>>>>>>>> and since I see the same issue on all architectures probably its not
>>>>>>>>>> elfinterp changes too. Mostly it seems likely that it could be in the
>>>>>>>>>> way the scopes are being handled
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> we have reviewed several times this change before committing. Anyway we
>>>>>>>>> will review it again. We have not ever seen any failure in the lookup
>>>>>>>>> with all of our tests. The only change in the way the symbol scope is
>>>>>>>>> created is in where the ld.so is added.
>>>>>>>>> In the original code it was the last entry of the global scope, while
>>>>>>>>> with the new structure in place it was added as soon as found (as glibc
>>>>>>>>> actually does).... and I don't really think this could have some
>>>>>>>>> impact.
>>>>>>>>
>>>>>>>> I tried to reverse it as well but the problem remained.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> We are trying to startup a X system on our platform. Is there any simple
>>>>>>>>> X app we can run to show the failure ?
>>>>>>>>>
>>>>>>>>> Is some .so failing to be dl-opened due to unresolved symbol ?
>>>>>>>>
>>>>>>>> this is potentially possible. I will try to debug it through
>>>>>>>
>>>>>>> This is the problem that happens with the new scoping and does not
>>>>>>> happen without it
>>>>>>>
>>>>>>> Error reading Pango modules file
>>>>>>>
>>>>>>> (matchbox-desktop:1058): Pango-CRITICAL **: No modules found:
>>>>>>> No builtin or dynamically loaded modules were found.
>>>>>>> PangoFc will not work correctly.
>>>>>>> This probably means there was an error in the creation of:
>>>>>>> '/etc/pango/pango.modules'
>>>>>>> You should create this file by running:
>>>>>>> pango-querymodules > '/etc/pango/pango.modules'
>>>>>>>
>>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>>> expect ugly output. engine-type='PangoRenderFc', script='latin'
>>>>>>>
>>>>>>> (matchbox-desktop:1058): Pango-WARNING **: failed to choose a font,
>>>>>>> expect ugly output. engine-type='PangoRenderFc', script='common'
>>>>>>
>>>>>> here is the error
>>>>>>
>>>>>> /usr/bin/pango-querymodules: can't resolve symbol
>>>>>> '_ZNSt14error_categoryD2Ev' in lib '/usr/lib/libstdc++.so.6'.
>>>>>>
>>>>>> this does not happen without scope patch
>>>>>>
>>>>>> pango-querymodules loads a shared library
>>>>>> /usr/lib/pango/1.6.0/modules/pango-basic-fc.so using dlopen and this
>>>>>> library had libstdc++.so.6 in its DT_NEEDED entries
>>>>>>
>>>>>> I was trying to create a small testcase where I created a small binary
>>>>>> which would dlopen another .so which has libstdc++ in DT_NEEDED in its
>>>>>> header so not able to reproduce a small testcase but making some
>>>>>> progress
>>>>>
>>>>>
>>>>> I might have a test case here http://uclibc.org/~kraj/reproducer_v2.tar.gz
>>>>> untar it on target and run make and the ./run.sh
>>>>>
>>>>> with buggy libraries i get
>>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>>> 1)main:dlopen libA.so
>>>>> 4)libC:dlopen libB.so
>>>>> 5)libC:atexit(libC_fini)
>>>>> 6)main:dlclose libA.so
>>>>> /home/root/rep/reproducer_v2/main: can't resolve symbol '_libC_fini'
>>>>> in lib './/libC.so'.
>>>>>
>>>>> whereas without the scopes patch I get
>>>>>
>>>>> root@qemux86:~/rep/reproducer_v2# ./run.sh
>>>>> 1)main:dlopen libA.so
>>>>> 4)libC:dlopen libB.so
>>>>> 5)libC:atexit(libC_fini)
>>>>> 6)main:dlclose libA.so
>>>>> 7)libC:finish - atexit()
>>>>> 8)main:finish main
>>>>> root@qemux86:~/rep/reproducer_v2#
>>>>>
>>>>>
>>>>> I think thats the problem that I am facing in pango-querymodules as well
>>>>> another data point is if I use BIND_NOW then it works too.
>>>>>
>>>>> let me know if you can reproduce it with this testcase
>>>>>
>>>>> Thanks
>>>>> -Khem
>>>>>
>>>>
>>>> Thanks khem for your effort in reproducing.
>>>> I-ll let you know asap.
>>>>
>>>> We will focus on this 100% since now.
>>>>
>>>> Carmelo
>>>
>>> I have a patch (sort of) which fixes this issue have a look at it.
>>> Problem is that its trying to unload sub scopes after it has been
>>> removed from global scope so I just delayed the removal of dlopened
>>> library
>>>
>>
>> what is triggering the problem is the use of atexit()
>>
>
> I'd ask.. is it correct that a dlopen-ed shared library install
> a function via atexit() to be called at program exit, if the shared
> library could be un-loaded at any time during the program's life ?
>
> I'd say that with the old it was just working fortunately !
>
> The shared library image is actually un-mapped from the system, why we
> should expect to have some of its symbols still alive ?
>
>>> http://www.uclibc.org/~kraj/fix_libdl.patch
>>>
>>
>> looking at it
>
> not considering the concerns on the use of atexit, this patch is
> correct. Could we avoid to use the unlink_local_scope guard and test the
> stored_ls pointer directly ?
>
> carmelo
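
To make the atexit() concern above concrete, here is a minimal toy model in C. It is only a sketch and not uClibc code: the names model_atexit, model_finalize, exit_entry and demo_unload_then_exit are hypothetical, and the "module handle" is just an address. It models one possible policy, similar in spirit to the __cxa_atexit/__cxa_finalize pair from the Itanium C++ ABI, where each handler remembers which module registered it, so that handlers owned by a module can be run when that module is unloaded instead of at program exit, after its image is gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model only: NOT uClibc's implementation. All names are hypothetical. */

#define MAX_HANDLERS 32

struct exit_entry {
    void (*fn)(void);   /* handler to call */
    const void *owner;  /* module that registered it (NULL = main program) */
    int pending;        /* 1 until the handler has been run */
};

static struct exit_entry table[MAX_HANDLERS];
static size_t n_entries;

static int libC_fini_calls;
static int main_fini_calls;

/* Stand-ins for the handlers in the reproducer output quoted above. */
static void libC_fini(void) { libC_fini_calls++; puts("libC: fini handler"); }
static void main_fini(void) { main_fini_calls++; puts("main: fini handler"); }

/* Register fn on behalf of owner. Returns 0 on success, -1 if the table is full. */
int model_atexit(void (*fn)(void), const void *owner)
{
    if (n_entries == MAX_HANDLERS)
        return -1;
    table[n_entries].fn = fn;
    table[n_entries].owner = owner;
    table[n_entries].pending = 1;
    n_entries++;
    return 0;
}

/* Run pending handlers in reverse registration order.
 * owner == NULL runs every pending handler (models program exit);
 * otherwise only handlers registered by that module run (models unload).
 * Returns the number of handlers that were run. */
int model_finalize(const void *owner)
{
    int ran = 0;
    for (size_t i = n_entries; i-- > 0; ) {
        if (!table[i].pending)
            continue;
        if (owner != NULL && table[i].owner != owner)
            continue;
        table[i].pending = 0;   /* mark first so a handler never runs twice */
        table[i].fn();
        ran++;
    }
    return ran;
}

/* Mirrors the reproducer's order: register, unload the library, then exit. */
int demo_unload_then_exit(void)
{
    static const char libC_module = 0;  /* its address acts as the module handle */
    int at_unload, at_exit;

    if (model_atexit(main_fini, NULL) != 0)
        return -1;
    if (model_atexit(libC_fini, &libC_module) != 0)
        return -1;

    at_unload = model_finalize(&libC_module); /* "dlclose": runs libC_fini only */
    at_exit = model_finalize(NULL);           /* "exit": libC_fini is no longer pending */

    assert(at_unload == 1 && at_exit == 1);
    return at_unload + at_exit;
}
```

In this model, calling demo_unload_then_exit() runs the library's handler while the library is still "loaded", and nothing owned by the library is left to call at exit. Whether a real loader should behave this way, or keep the library mapped until exit, is exactly the open question in the thread.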
hum.... still wondering how original lookup mechanism worked !?

_______________________________________________
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc