Hi,

I want to share some results about relocations during loading (with
RTLD_LAZY).

Relocation counts for a single .so (libqt5), without optimization:
    R_ARM_GLOB_DAT: 1585
    R_ARM_RELATIVE: 9823
    R_ARM_ABS32: 19489
    R_ARM_JUMP_SLOT: 16998

Relocation counts for a single .so (libqt5), with optimization:
    R_ARM_GLOB_DAT: 1578
    R_ARM_RELATIVE: 28227
    R_ARM_ABS32: 435
    R_ARM_JUMP_SLOT: 290
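
For reference, counts like the ones above can be produced by tallying the
type column of `readelf -r` output. The following is only a minimal sketch,
assuming the usual binutils text layout (offset, info, type, ...). The names
`count_relocations` and `SAMPLE` are illustrative, and apart from the first
entry (taken from the example further down) the sample lines and the section
offset are made up.

```python
import re
from collections import Counter

# A relocation entry starts with two hex fields (offset, info) followed by
# the relocation type; header lines do not match this pattern.
_ENTRY = re.compile(r"^\s*[0-9a-fA-F]+\s+[0-9a-fA-F]+\s+(R_[A-Z0-9_]+)")


def count_relocations(readelf_text):
    """Tally relocation types in the text printed by `readelf -r`."""
    counts = Counter()
    for line in readelf_text.splitlines():
        match = _ENTRY.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: only the first entry comes from the real library.
SAMPLE = """\
Relocation section '.rel.dyn' at offset 0x1000 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
007b0124  0011a802 R_ARM_ABS32       00311e4b   _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE
007b0128  00000017 R_ARM_RELATIVE
007b012c  00000017 R_ARM_RELATIVE
"""

for reloc_type, n in sorted(count_relocations(SAMPLE).items()):
    print(f"{reloc_type}: {n}")
```

In practice the text would come from running `readelf -r` on the library,
for example via `subprocess.run(["readelf", "-r", path], capture_output=True,
text=True).stdout`.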

The only optimization done here is changing the visibility of exported
symbols from "default" to "protected", thanks to Thiago's blog ;).
So:

- the number of R_ARM_JUMP_SLOT relocations is reduced significantly, but
  these are only resolved at run time (because of RTLD_LAZY), so they are
  irrelevant to loading performance.

- the number of R_ARM_RELATIVE relocations has increased, but this type of
  relocation is very fast.

- for loading time, the actual bottleneck is the R_ARM_ABS32 relocations,
  which are now reduced by roughly 98% (19489 to 435)!

As a result, the overall loading time is reduced from ~10-20s to ~1s.
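
The relative changes can be checked directly from the two tables above. A
small sketch using only the numbers already listed (the names `before`,
`after` and `percent_change` are just for illustration):

```python
# Relocation counts copied from the two listings above.
before = {
    "R_ARM_GLOB_DAT": 1585,
    "R_ARM_RELATIVE": 9823,
    "R_ARM_ABS32": 19489,
    "R_ARM_JUMP_SLOT": 16998,
}
after = {
    "R_ARM_GLOB_DAT": 1578,
    "R_ARM_RELATIVE": 28227,
    "R_ARM_ABS32": 435,
    "R_ARM_JUMP_SLOT": 290,
}


def percent_change(old, new):
    """Signed change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


changes = {name: percent_change(before[name], after[name]) for name in before}

for name, pct in changes.items():
    print(f"{name:16s} {before[name]:6d} -> {after[name]:6d}  ({pct:+.1f}%)")
```

This gives roughly -97.8% for R_ARM_ABS32, -98.3% for R_ARM_JUMP_SLOT and
+187.4% for R_ARM_RELATIVE, while R_ARM_GLOB_DAT is almost unchanged.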

But I still have some questions about the R_ARM_ABS32 relocation.

It seems that if a function is virtual (with "default" visibility), it is
added to .rel.dyn as an R_ARM_ABS32 relocation, for example:
007b0124  0011a802 R_ARM_ABS32            00311e4b   _ZN20QEventDispatcherUNIX13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE

Could someone help with the questions below:
1. Why does a virtual function with "default" visibility need a relocation
even if it's implemented inside the library?
2. When changed to "protected" visibility, I guess it's optimized into a
GOT.PLT entry with an R_ARM_RELATIVE relocation, is that true?

Thanks,
Song

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of 
ext Thiago Macieira
Sent: Tuesday, July 24, 2012 10:29 PM
To: [email protected]
Subject: Re: [Development] how to reduce the relocation <-- Use static qt 
libraries

On Tuesday, 24 July 2012 13.22.25, [email protected] wrote:
> Yes, the bottleneck of the loading now is the local relocations 
> instead of inter-library's.
> 
> So what we want to do will be reducing the number of local relocation.
> 
> Based on my understanding, this local relocation should be caused by 
> the "symbol inter-positioning".

That's not exactly the case. Some types of relocations do permit symbol 
interpositioning. But some types of code require relocations even if they're 
not interposable.

In my listing, all the "local" relocations are non-interposable.

More information:
http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
http://www.macieira.org/blog/2012/01/update-and-benchmark-on-the-dynamic-library-proposals/

> And from gcc option -Bsymbolic:
> "
> When creating a shared library, bind references to global symbols to 
> the definition within the shared library, if any. Normally, it is 
> possible for a program linked against a shared library to override the 
> definition within the shared library. This option is only meaningful 
> on ELF platforms which support shared libraries. "
> 
> But for my case, it's not needed to override the definition within the 
> libqt5.so.

Yes, it is.

But you didn't realise that your code requires relocations even if the symbols 
can't be overridden.

In order to do that, you need a fully position *dependent* code that can't be 
moved. Executables on Linux are like that, but all libraries are movable in 
memory, even those compiled without -fPIC.

Since you're not running Linux, check if your OS supports that. Note that 
you'll need to know the exact load address at build time and that it must match 
the loaded address for the ROM if you want to do XIP.

> So, besides the prelink solution, I think the compiler (I mean
> armlink) should provide the ability to disable this symbol 
> inter-positioning, just like the -Bsymbolic in gcc.
> 
> Does anyone have idea from the compiler point of view ?

Sorry, you're barking up the wrong tree.

Your only option to reduce the number of relocations is to prelink to the exact 
load address. There are two ways of doing that:

1) the ELF prelinker, which prelinks all relocations to a given address, but 
does still allow relocating if the shared object is loaded at a different 
address. The code is PIC, so XIP should work just fine.

2) compile without PIC and prelink at a specific address at link time, which 
means that the code must be loaded there or it will fail to run. This is the 
Windows DLL model.

> 
> Also I see that Qt also uses the "-Bsymbolic-functions" to do some 
> optimization, is that similar case to reduce the relocation ?

Yes. Read my blogs for more detail.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
_______________________________________________
Development mailing list
[email protected]
http://lists.qt-project.org/mailman/listinfo/development
