On 2018-05-04 15:44, Ralf Ramsauer wrote:
>
>
> On 05/04/2018 12:57 PM, Ralf Ramsauer wrote:
>>
>>
>> On 05/04/2018 12:38 PM, Jan Kiszka wrote:
>>> On 2018-05-04 11:12, Ralf Ramsauer wrote:
>>>> On 05/02/2018 06:18 PM, Jan Kiszka wrote:
>>>>> On 2018-05-02 16:51, Ralf Ramsauer wrote:
>>>>>>
>>>>>>
>>>>>> On 05/01/2018 06:54 PM, Jan Kiszka wrote:
>>>>>>> On 2018-05-01 10:54, Ralf Ramsauer wrote:
>>>>>>>> On 04/27/2018 08:21 PM, Jan Kiszka wrote:
>>>>>>>>> On 2018-04-27 11:36, Ralf Ramsauer wrote:
>>>>>>>>>> This won't drop symbols that are marked as used.
>>>>>>>>>>
>>>>>>>>>> The static, relocatable inmate library lib.a is created by ar. When
>>>>>>>>>> linking executables, unreferenced symbols may be dropped, even if
>>>>>>>>>> they are attributed as used.
>>>>>>>>>>
>>>>>>>>>> --whole-archive ensures that those symbols will be linked.
>>>>>>>>>>
>>>>>>>>>> Whereas on x86, we need --gc-sections.
>>>>>>>>>
>>>>>>>>> That's something I do not understand yet: With [1] we will build the
>>>>>>>>> whole hypervisor, including x86, with --whole-archive, and that works
>>>>>>>>> fine on x86 as well.
>>>>>>>>
>>>>>>>> That's what happens if I compile x86 inmates with --whole-archive
>>>>>>>> instead of --gc-sections:
>>>>>>>>
>>>>>>>> LD
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/demos/x86/32-bit-demo-linked.o
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(ioapic-32.o): In
>>>>>>>> function `ioapic_init':
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/ioapic.c:48: undefined
>>>>>>>> reference to `map_range'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(smp-32.o): In
>>>>>>>> function `smp_start_cpu':
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:59: undefined
>>>>>>>> reference to `delay_us'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:61: undefined
>>>>>>>> reference to `delay_us'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>>> function `pci_find_device':
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:47: undefined
>>>>>>>> reference to `pci_read_config'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:51: undefined
>>>>>>>> reference to `pci_read_config'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>>> function `pci_find_cap':
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:61: undefined
>>>>>>>> reference to `pci_read_config'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:65: undefined
>>>>>>>> reference to `pci_read_config'
>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:68: undefined
>>>>>>>> reference to `pci_read_config'
>>>>>>>>
>>>>>>>> Interestingly, this only happens to the 32-bit demo inmate.
>>>>>>>>
>>>>>>>
>>>>>>> Because we keep everything, something might be missing now: The 32-bit
>>>>>>> lib does not provide support for all features that its big 64-bit
>>>>>>> brother has.
>>>>>>>
>>>>>>> I still think this approach is too much of a big hammer. Try
>>>>>>> --print-gc-sections on your specific problem (uart section loss) and
>>>>>>> play with --undefined as suggested by the ld man page. Not sure if there
>>>>>>> is also some linker script statement that can do that trick, but it
>>>>>>> might be worth checking.
>>>>>>
>>>>>> KEEP, together with --whole-archive does the trick:
>>>>>>
>>>>>
>>>>> Why still --whole-archive? We should have a clear reason here.
>>>>
>>>> Doesn't work without --whole-archive; ld still drops it then. It's the
>>>> combination of --whole-archive and KEEP() that makes it work for ARM
>>>> on the one hand, and doesn't break the 32-bit x86 demo inmate on the other.
>>>>
>>>> I guess the reason is that KEEP() only keeps symbols that enter a 'final
>>>> stage' of the linking process. And as those symbols are never referenced
>>>> anywhere, they seem to be dropped before, like they never enter that
>>>> stage of the process where they could be kept.
>>>
>>> Should be possible to confirm this: We have records of all stages (*.o,
>>> *.a).
>>>
>>>>
>>>> See [1], 4.6.4.4:
>>>> When link-time garbage collection is in use (--gc-sections), it is
>>>> often useful to mark sections that should not be eliminated. This is
>>>> accomplished by surrounding an input section's wildcard entry with
>>>> KEEP(), as in KEEP(*(.init)) or KEEP(SORT(*)(.ctors)).
>>>
>>> I'm wondering the following:
>>>
>>> - Is "--whole-archive --gc-sections" a reasonable combination (it
>>> sounds contradictory), or does this just happen to work and will break
>>> again with a different compiler version or some tuning elsewhere?
>
> Looks like a dead end at the moment, but at least I understand what's
> going on:
>
> When linking, ld searches for the ENTRY point, and recursively looks up
> referenced symbols until there are no more undef'd refs. [1]
>
> So if, e.g., gic_init() is found in gic.o which is embedded in lib.a,
> then ld will include _all_ symbols from gic.o and later garbage collect
> the unreferenced ones.
>
> In the above case, symbols in uart drivers (e.g. uart-8250.o, which is
> also inside lib.a) are referenced nowhere, so the whole object file will
> _never_ be considered for linking, even if there is a KEEP() around the
> section. KEEP() only respects symbols that the linker has seen, but it
> has never seen any symbol of uart-8250.o.
>
> I can confirm this behavior by testing: If I add a global
> "int foobar = 42;" to uart-8250.c and reference foobar from uart-demo.c,
> then suddenly the .uarts section from uart-8250.c will be linked, as the
> object file is now considered for linking.
>
That's possibly where these could come into play:
    -u symbol
    --undefined=symbol
        Force symbol to be entered in the output file as an undefined
        symbol. Doing this may, for example, trigger linking of
        additional modules from standard libraries. -u may be
        repeated with different option arguments to enter additional
        undefined symbols. This option is equivalent to the "EXTERN"
        linker script command.

        If this option is being used to force additional modules to
        be pulled into the link, and if it is an error for the symbol
        to remain undefined, then the option --require-defined should
        be used instead.

    --require-defined=symbol
        Require that symbol is defined in the output file. This
        option is the same as option --undefined except that if
        symbol is not defined in the output file then the linker will
        issue an error and exit. The same effect can be achieved in
        a linker script by using "EXTERN", "ASSERT" and "DEFINED"
        together. This option can be used multiple times to require
        additional symbols.

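
To make the member-selection rule concrete, here is a toy model in Python.
It is only a sketch of the behavior described in your analysis, not ld's
actual algorithm. The link() helper and the symbol names (inmate_main,
printk, uart_8250_ops) are made up for illustration and need not match the
real inmate library.

```python
# Toy model of how a linker picks members out of a static archive.
# Assumption: a member is loaded only if it defines a symbol that is
# currently undefined. Real ld has many more rules than this.

def link(objects, archive, undefined=(), whole_archive=False):
    """Return the sorted names of all object files that end up in the link.

    objects and archive map a file name to a pair
    (defined_symbols, referenced_symbols).
    undefined mimics -u/--undefined: symbols forced into the undefined set.
    """
    loaded = dict(objects)
    if whole_archive:
        # --whole-archive: every member is loaded, referenced or not.
        loaded.update(archive)
        return sorted(loaded)

    changed = True
    while changed:
        changed = False
        defined = {s for defs, _ in loaded.values() for s in defs}
        wanted = {s for _, refs in loaded.values() for s in refs}
        unresolved = (wanted | set(undefined)) - defined
        for name, (defs, _) in archive.items():
            if name not in loaded and unresolved & set(defs):
                # This member resolves something, so it is pulled in.
                loaded[name] = archive[name]
                changed = True
                break  # recompute the unresolved set before continuing
    return sorted(loaded)


# Hypothetical inputs: one demo object plus a three-member lib.a.
objects = {"uart-demo.o": ({"inmate_main"}, {"printk"})}
archive = {
    "printk.o": ({"printk"}, set()),
    "gic.o": ({"gic_init"}, set()),
    "uart-8250.o": ({"uart_8250_ops"}, set()),
}

print(link(objects, archive))
print(link(objects, archive, undefined=["uart_8250_ops"]))
print(link(objects, archive, whole_archive=True))
```

In this model the plain link never looks at uart-8250.o, forcing one of its
symbols to be undefined pulls in just that member, and whole_archive also
drags in gic.o, which is the size concern raised for --whole-archive.
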
> Crud.
>
> This [2] is a similar issue, and they propose to add --whole-archive.
>
> So in our case, the --gc-sections --whole-archive combo does indeed make
> some sense, but unfortunately, this pulls (e.g.) gic_* functions into
> the uart-demo, where they are actually not needed.
>
>>>
>>> - There is no size regression due to --whole-archive for the final
>>> binaries, on all architectures?
>
> And now I can also answer the question how we get those size regressions
> on ARM: --whole-archive considers all object files for linking. While
> unused functions will be kicked again through --gc-sections, the linker
> can't garbage collect (for some reason) global static functions.
> Anything referenced from there will of course be included...
>
> Hm. Maybe I can somehow cheat around that. Pretty sure the kernel has
> some similar issues somewhere, as they make heavy use of --gc-sections.
>
OK, thanks for the analysis so far. If my other pointers do not help
either, I will reconsider the pros and cons of this approach.
Jan
--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux