On 05/04/2018 12:57 PM, Ralf Ramsauer wrote:
> 
> 
> On 05/04/2018 12:38 PM, Jan Kiszka wrote:
>> On 2018-05-04 11:12, Ralf Ramsauer wrote:
>>> On 05/02/2018 06:18 PM, Jan Kiszka wrote:
>>>> On 2018-05-02 16:51, Ralf Ramsauer wrote:
>>>>>
>>>>>
>>>>> On 05/01/2018 06:54 PM, Jan Kiszka wrote:
>>>>>> On 2018-05-01 10:54, Ralf Ramsauer wrote:
>>>>>>> On 04/27/2018 08:21 PM, Jan Kiszka wrote:
>>>>>>>> On 2018-04-27 11:36, Ralf Ramsauer wrote:
>>>>>>>>> This won't drop symbols that are marked as used.
>>>>>>>>>
>>>>>>>>> The static, relocatable inmate library lib.a is created by ar. When
>>>>>>>>> linking executables, unreferenced symbols may be dropped, even if they
>>>>>>>>> are attributed as used.
>>>>>>>>>
>>>>>>>>> --whole-archive ensures that those symbols will be linked.
>>>>>>>>>
>>>>>>>>> Whereas on x86, we need --gc-sections.
>>>>>>>>
>>>>>>>> That's something I do not understand yet: With [1] we will build the
>>>>>>>> whole hypervisor, including x86, with --whole-archive, and that works
>>>>>>>> fine on x86 as well.
>>>>>>>
>>>>>>> That's what happens if I compile x86 inmates with --whole-archive
>>>>>>> instead of --gc-sections:
>>>>>>>
>>>>>>>   LD
>>>>>>> /home/ralf/workspace/jailhouse/inmates/demos/x86/32-bit-demo-linked.o
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(ioapic-32.o): In
>>>>>>> function `ioapic_init':
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/ioapic.c:48: undefined
>>>>>>> reference to `map_range'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(smp-32.o): In
>>>>>>> function `smp_start_cpu':
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:59: undefined
>>>>>>> reference to `delay_us'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:61: undefined
>>>>>>> reference to `delay_us'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>> function `pci_find_device':
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:47: undefined
>>>>>>> reference to `pci_read_config'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:51: undefined
>>>>>>> reference to `pci_read_config'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>> function `pci_find_cap':
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:61: undefined
>>>>>>> reference to `pci_read_config'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:65: undefined
>>>>>>> reference to `pci_read_config'
>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:68: undefined
>>>>>>> reference to `pci_read_config'
>>>>>>>
>>>>>>> Interestingly, this only happens to the 32-bit demo inmate.
>>>>>>>
>>>>>>
>>>>>> Because we keep everything, something might be missing now: The 32-bit
>>>>>> lib does not provide support for all features that its big 64-bit
>>>>>> brother has.
>>>>>>
>>>>>> I still think this approach is too much of a big hammer. Try
>>>>>> --print-gc-sections on your specific problem (uart section loss) and
>>>>>> play with --undefined as suggested by the ld man page. Not sure if there
>>>>>> is also some linker script statement that can do that trick, but it
>>>>>> might be worth checking.
>>>>>
>>>>> KEEP, together with --whole-archive does the trick:
>>>>>
>>>>
>>>> Why still --whole-archive? We should have a clear reason here.
>>>
>>> Doesn't work without --whole-archive, ld still drops it then. It's the
>>> combination of --whole-archive and KEEP() that makes it work for ARM
>>> on the one hand, and doesn't break the 32-bit x86 demo inmate on the other.
>>>
>>> I guess the reason is that KEEP() only keeps symbols that enter a 'final
>>> stage' of the linking process. And as those symbols are never referenced
>>> anywhere, they seem to be dropped before, like they never enter that
>>> stage of the process where they could be kept.
>>
>> Should be possible to confirm this: We have records of all stages (*.o,
>> *.a).
>>
>>>
>>> See [1], 4.6.4.4:
>>>   When link-time garbage collection is in use (--gc-sections), it is
>>>   often useful to mark sections that should not be eliminated. This is
>>>   accomplished by surrounding an input section's wildcard entry with
>>>   KEEP(), as in KEEP(*(.init)) or KEEP(SORT(*)(.ctors)).
>>
>> I'm wondering the following:
>>
>> - Is "--whole-archive --gc-sections" a reasonable combination (it
>>   sounds contradictory), or does this just happen to work and will break
>>   again with a different compiler version or some tuning elsewhere?

Looks like a dead end at the moment, but at least I understand what's
going on:

When linking, ld starts at the ENTRY point and recursively resolves
referenced symbols until no undefined references remain. [1]

So if, e.g., gic_init() is found in gic.o, which is embedded in lib.a,
then ld will include _all_ symbols from gic.o and later garbage collect
the unreferenced ones.

In the above case, symbols in the uart drivers (e.g. uart-8250.o, which
is also inside lib.a) are referenced nowhere, so the whole object file
will _never_ be considered for linking, even if there is a KEEP() around
the section. KEEP() only respects symbols that the linker has seen, but
it has never seen any symbol of uart-8250.o.
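
For readers unfamiliar with this kind of registration, here is a minimal
sketch of a section-based driver table. It is not the Jailhouse code:
struct uart_desc, REGISTER_UART, find_uart(), count_uarts() and the base
addresses are made up for illustration. It also assumes GNU ld on an ELF
target and uses the section name "uarts" (no leading dot), because ld
then provides __start_uarts/__stop_uarts on its own; a ".uarts" section
would need start/end symbols from the linker script instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver descriptor, for illustration only. */
struct uart_desc {
	const char *name;
	unsigned long base;
};

/*
 * Store a pointer to the descriptor in the "uarts" section. "used" only
 * keeps the compiler from discarding the unreferenced object; it does
 * not make the linker extract the object file from an archive.
 */
#define REGISTER_UART(desc)						\
	static const struct uart_desc *const desc##_ptr			\
	__attribute__((used, section("uarts"))) = &desc

/* Provided by GNU ld for sections whose name is a valid C identifier. */
extern const struct uart_desc *const __start_uarts[];
extern const struct uart_desc *const __stop_uarts[];

/* Illustrative entries; the addresses are placeholders. */
static const struct uart_desc uart_8250 = { "8250", 0x3f8 };
static const struct uart_desc uart_pl011 = { "pl011", 0x09000000 };
REGISTER_UART(uart_8250);
REGISTER_UART(uart_pl011);

/* Walk the section like an array of descriptor pointers. */
const struct uart_desc *find_uart(const char *name)
{
	const struct uart_desc *const *p;

	assert(name != NULL);
	for (p = __start_uarts; p < __stop_uarts; p++)
		if (strcmp((*p)->name, name) == 0)
			return *p;
	return NULL;
}

size_t count_uarts(void)
{
	return (size_t)(__stop_uarts - __start_uarts);
}
```

In a single translation unit like this, both entries are found. Once
the REGISTER_UART() lines live in an archive member that nothing else
references, the table can end up empty, which is the situation described
above.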

I can confirm this behavior by testing: If I introduce a global int
foobar = 42; to uart-8250.c and reference foobar from uart-demo.c, then
suddenly the .uarts section from uart-8250.c will be linked, as the
object file is now considered for linking.
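
The behavior can be modeled in a few lines. The following is only a toy
simulation of archive member extraction, not ld's implementation: the
struct, the helper names and the symbol lists (printk, uart_8250_ops)
are assumptions made up for illustration. A member is pulled only if it
defines a symbol that is currently undefined, and its own undefined
references are then added to the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4	/* per list, including the NULL terminator */
#define MAX_UNDEF 16

/* Toy model of one archive member; all names are illustrative. */
struct member {
	const char *name;
	const char *defines[MAX_SYMS];	/* NULL-terminated */
	const char *needs[MAX_SYMS];	/* NULL-terminated */
	bool pulled;
};

static bool in_list(const char *const *list, const char *sym)
{
	for (; *list; list++)
		if (strcmp(*list, sym) == 0)
			return true;
	return false;
}

/*
 * Pull every member that defines a currently undefined symbol, add the
 * member's own undefined references, and repeat until nothing changes.
 */
static void resolve(struct member *lib, int n, const char *const *roots)
{
	const char *undef[MAX_UNDEF];
	int n_undef = 0;
	bool progress = true;

	for (; *roots; roots++) {
		assert(n_undef < MAX_UNDEF);
		undef[n_undef++] = *roots;
	}

	while (progress) {
		progress = false;
		for (int i = 0; i < n; i++) {
			bool wanted = false;

			if (lib[i].pulled)
				continue;
			for (int u = 0; u < n_undef; u++)
				wanted |= in_list(lib[i].defines, undef[u]);
			if (!wanted)
				continue;

			lib[i].pulled = true;
			progress = true;
			for (const char *const *s = lib[i].needs; *s; s++) {
				assert(n_undef < MAX_UNDEF);
				undef[n_undef++] = *s;
			}
		}
	}
}

/* Returns whether the UART member is extracted from the toy archive. */
bool uart_member_pulled(bool demo_references_foobar)
{
	struct member lib[] = {
		{ "printk.o",    { "printk" },                  { NULL } },
		{ "gic.o",       { "gic_init" },                { "printk" } },
		{ "uart-8250.o", { "uart_8250_ops", "foobar" }, { NULL } },
	};
	const char *roots_plain[] = { "gic_init", NULL };
	const char *roots_foobar[] = { "gic_init", "foobar", NULL };

	resolve(lib, 3, demo_references_foobar ? roots_foobar : roots_plain);
	return lib[2].pulled;
}
```

In this model, uart_member_pulled(false) leaves uart-8250.o out of the
link, while uart_member_pulled(true) extracts it, mirroring the foobar
experiment.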

Crud.

This [2] is a similar issue, and they propose to add --whole-archive.

So in our case, the --gc-sections --whole-archive combo does indeed make
some sense, but unfortunately it pulls (e.g.) the gic_* functions into
the uart-demo, where they are not actually needed.

>>
>> - There is no size regression due to --whole-archive for the final
>>   binaries, on all architectures?

And now I can also answer the question of how we get those size
regressions on ARM: --whole-archive considers all object files for
linking. While unused functions will be kicked out again through
--gc-sections, the linker can't garbage collect (for some reason) global
static functions. Anything referenced from there will of course be
included...
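
The "anything referenced from there" effect follows from how section
garbage collection works in general: it marks everything reachable from
a set of roots (the entry point, sections matched by KEEP()) and drops
the rest. The sketch below is a toy model of that mark phase only. The
section names and the reference graph are invented for illustration, and
it does not claim to explain which sections ld actually treats as roots
on ARM.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REFS 4

/* Toy input section; names and references are illustrative only. */
struct section {
	const char *name;
	bool root;		/* entry point or matched by KEEP() */
	int refs[MAX_REFS];	/* indices of referenced sections, -1 ends */
	bool live;
};

/* Recursively mark a section and everything it references. */
static void mark(struct section *secs, int n, int idx)
{
	assert(idx >= 0 && idx < n);
	if (secs[idx].live)
		return;
	secs[idx].live = true;
	for (int i = 0; i < MAX_REFS && secs[idx].refs[i] >= 0; i++)
		mark(secs, n, secs[idx].refs[i]);
}

/* Mark from all roots; returns the number of sections that survive. */
static int collect(struct section *secs, int n)
{
	int live = 0;

	for (int i = 0; i < n; i++)
		if (secs[i].root)
			mark(secs, n, i);
	for (int i = 0; i < n; i++)
		live += secs[i].live;
	return live;
}

int live_sections(bool whole_archive)
{
	struct section secs[] = {
		{ ".text.inmate_main",   true,  { 1, -1 }, false },
		{ ".text.printk",        false, { -1 },    false },
		{ ".uarts",              true,  { 3, -1 }, false },
		{ ".text.uart_init",     false, { 4, -1 }, false },
		{ ".text.strcmp",        false, { -1 },    false },
		{ ".text.unused_helper", false, { -1 },    false },
	};
	/*
	 * Assumption of this model: without --whole-archive only the
	 * first two sections are loaded at all; with it, all six are
	 * inputs and the kept ".uarts" section becomes a second root.
	 */
	int n = whole_archive ? 6 : 2;

	return collect(secs, n);
}
```

Here live_sections(false) yields 2, and live_sections(true) yields 5:
the kept section drags in its (hypothetical) strcmp dependency, while
the unreferenced helper is still collected.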

Hm. Maybe I can somehow cheat around that. Pretty sure the kernel has
some similar issues somewhere, as they make heavy use of --gc-sections.

  Ralf

[1] https://elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdf

[2] https://stackoverflow.com/questions/19101088/why-does-lds-keep-does-not-keep-my-symbols

> 
> No size regression on x86, but on arm. ld seems to behave differently.
> Compared the objdumps on arm: inmates indeed get some really unused
> stuff in their binaries, like strcmp, ...
> 
> Hmm. Will look for a better fix.



> 
> Thanks
>   Ralf
> 
>>
>>>
>>> BTW: I read that __attribute__((used)) never affects the linking
>>> process, it only forces the compiler to not optimize away unreferenced
>>> things during compilation.
>>
>> That makes sense.
>>
>> Jan
>>
> 
