On 05/04/2018 04:08 PM, Jan Kiszka wrote:
> On 2018-05-04 15:44, Ralf Ramsauer wrote:
>>
>>
>> On 05/04/2018 12:57 PM, Ralf Ramsauer wrote:
>>>
>>>
>>> On 05/04/2018 12:38 PM, Jan Kiszka wrote:
>>>> On 2018-05-04 11:12, Ralf Ramsauer wrote:
>>>>> On 05/02/2018 06:18 PM, Jan Kiszka wrote:
>>>>>> On 2018-05-02 16:51, Ralf Ramsauer wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 05/01/2018 06:54 PM, Jan Kiszka wrote:
>>>>>>>> On 2018-05-01 10:54, Ralf Ramsauer wrote:
>>>>>>>>> On 04/27/2018 08:21 PM, Jan Kiszka wrote:
>>>>>>>>>> On 2018-04-27 11:36, Ralf Ramsauer wrote:
>>>>>>>>>>> This won't drop symbols that are marked as used.
>>>>>>>>>>>
>>>>>>>>>>> The static, relocatable inmate library lib.a is created by ar. When
>>>>>>>>>>> linking executables, unreferenced symbols may be dropped, even if
>>>>>>>>>>> they are attributed as used.
>>>>>>>>>>>
>>>>>>>>>>> --whole-archive ensures that those symbols will be linked.
>>>>>>>>>>>
>>>>>>>>>>> Whereas on x86, we need --gc-sections.
>>>>>>>>>>
>>>>>>>>>> That's something I do not understand yet: With [1] we will build the
>>>>>>>>>> whole hypervisor, including x86, with --whole-archive, and that works
>>>>>>>>>> fine on x86 as well.
>>>>>>>>>
>>>>>>>>> That's what happens if I compile x86 inmates with --whole-archive
>>>>>>>>> instead of --gc-sections:
>>>>>>>>>
>>>>>>>>> LD
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/demos/x86/32-bit-demo-linked.o
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(ioapic-32.o):
>>>>>>>>> In
>>>>>>>>> function `ioapic_init':
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/ioapic.c:48: undefined
>>>>>>>>> reference to `map_range'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(smp-32.o): In
>>>>>>>>> function `smp_start_cpu':
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:59: undefined
>>>>>>>>> reference to `delay_us'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/smp.c:61: undefined
>>>>>>>>> reference to `delay_us'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>>>> function `pci_find_device':
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:47: undefined
>>>>>>>>> reference to `pci_read_config'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:51: undefined
>>>>>>>>> reference to `pci_read_config'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/lib32.a(pci-32.o): In
>>>>>>>>> function `pci_find_cap':
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:61: undefined
>>>>>>>>> reference to `pci_read_config'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:65: undefined
>>>>>>>>> reference to `pci_read_config'
>>>>>>>>> /home/ralf/workspace/jailhouse/inmates/lib/x86/../pci.c:68: undefined
>>>>>>>>> reference to `pci_read_config'
>>>>>>>>>
>>>>>>>>> Interestingly, this only happens to the 32-bit demo inmate.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Because we keep everything, something might be missing now: The 32-bit
>>>>>>>> lib does not provide support for all features that its big 64-bit
>>>>>>>> brother has.
>>>>>>>>
>>>>>>>> I still think this approach is too much of a big hammer. Try
>>>>>>>> --print-gc-sections on your specific problem (uart section loss) and
>>>>>>>> play with --undefined as suggested by the ld man page. Not sure if
>>>>>>>> there
>>>>>>>> is also some linker script statement that can do that trick, but it
>>>>>>>> might be worth checking.
>>>>>>>
>>>>>>> KEEP, together with --whole-archive does the trick:
>>>>>>>
>>>>>>
>>>>>> Why still --whole-archive? We should have a clear reason here.
>>>>>
>>>>> Doesn't work without --whole-archive, ld still drops it then. It's the
>>>>> combination of --whole-archive and KEEP() that makes it work for ARM
>>>>> on the one hand, and doesn't break the 32-bit x86 demo inmate on the
>>>>> other.
>>>>>
>>>>> I guess the reason is that KEEP() only keeps symbols that enter a 'final
>>>>> stage' of the linking process. And as those symbols are never referenced
>>>>> anywhere, they seem to be dropped before, like they never enter that
>>>>> stage of the process where they could be kept.
>>>>
>>>> Should be possible to confirm this: We have records of all stages (*.o,
>>>> *.a).
>>>>
>>>>>
>>>>> See [1], 4.6.4.4:
>>>>> When link-time garbage collection is in use (-gc-sections), it is
>>>>> often useful to mark sections that should not be eliminated. This is
>>>>> accomplished by surrounding an input section's wildcard entry with
>>>>> KEEP(), as in KEEP(*(.init)) or KEEP(SORT(*)(.ctors)).
>>>>
>>>> I'm wondering the following:
>>>>
>>>> - Is "--whole-archive --gc-sections" a reasonable combination (it
>>>> sounds contradictory), or does this just happen to work and will break
>>>> again with a different compiler version or some tuning elsewhere?
>>
>> Looks like a dead end at the moment, but at least I understand what's
>> going on:
>>
>> When linking, ld searches for the ENTRY point, and recursively looks up
>> referenced symbols until there are no more undef'd refs. [1]
>>
>> So if, e.g., gic_init() is found in gic.o which is embedded in lib.a,
>> then ld will include _all_ symbols from gic.o and later garbage collect
>> undef'd ones.
>>
>> In above case, symbols in uart drivers (e.g. uart_8250.o which is also
>> inside lib.a) are referenced nowhere, so the whole object file will
>> _never_ be considered for linking, even if there is a KEEP() around the
>> section. KEEP() only respects symbols that the linker has seen, but it
>> never has seen any symbol of uart_8250.o .
>>
>> I can confirm this behavior by testing: If I introduce a global int
>> foobar = 42; to uart-8250.c and reference foobar from uart-demo.c, then
>> suddenly the .uarts section from uart-8250.c will be linked, as the
>> object file is now considered for linking.
>>
>
> That's possibly where these could come into play:
>
> -u symbol
> --undefined=symbol
> Force symbol to be entered in the output file as an undefined
> symbol. Doing this may, for example, trigger linking of
> additional modules from standard libraries. -u may be
> repeated with different option arguments to enter additional
> undefined symbols. This option is equivalent to the "EXTERN"
> linker script command.
>
> If this option is being used to force additional modules to
> be pulled into the link, and if it is an error for the symbol
> to remain undefined, then the option --require-defined should
> be used instead.
>
> --require-defined=symbol
> Require that symbol is defined in the output file. This
> option is the same as option --undefined except that if
> symbol is not defined in the output file then the linker will
> issue an error and exit. The same effect can be achieved in
> a linker script by using "EXTERN", "ASSERT" and "DEFINED"
> together. This option can be used multiple times to require
> additional symbols.

Yes, this works, but I need to specify every single symbol that I want
to link; there's no wildcard functionality (at least none documented).
I mean, you don't want to touch the linker file every time someone adds
a UART driver. Happy debugging if someone forgets the pinning in the
linker file. ;-)

So this should definitely be automated somehow, if this really is the
solution.

I see two ways to accomplish that:
- Let the DEFINE_UART() macro emit EXTERN(symbol_name) under
  #if __ASSEMBLY__ and somehow include the UART driver sources when
  preprocessing inmate.lds.S
- Do some Makefile magic and define the UART driver object names in a
  separate variable. Append that variable to objs-y, then substitute the
  names and pass the resulting --undefined arguments via LDFLAGS.
  This requires the basename of the driver source file to be the same
  as the basename of the symbol (which currently is the case as far as
  I can see)
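
A minimal sketch of the substitution step in the second option, written
as plain shell so it can be tried outside the build system. Everything
named here is an assumption for illustration: the UART_OBJS list, the
uart_undefined_flags helper, and in particular the "_ops" symbol suffix,
which only stands in for whatever symbol DEFINE_UART() really emits. In
the Makefile itself, the same mapping could presumably be done with
GNU make's patsubst, subst and addprefix functions.

```shell
#!/bin/sh
# Hypothetical list of UART driver objects; in the real build this would
# live in its own Makefile variable that is also appended to objs-y.
UART_OBJS="uart-8250.o uart-pl011.o"

# Map each object name to one --undefined=<symbol> linker argument.
# Assumption: the symbol is the object's basename with '-' replaced by
# '_' plus an "_ops" suffix. The suffix is illustrative only.
uart_undefined_flags() {
    flags=""
    for obj in "$@"; do
        base=${obj%.o}                           # uart-8250.o -> uart-8250
        sym=$(printf '%s' "$base" | tr '-' '_')  # uart-8250   -> uart_8250
        flags="$flags --undefined=${sym}_ops"
    done
    # Strip the leading space left over from the first concatenation.
    printf '%s\n' "${flags# }"
}

# Unquoted on purpose: the list must be split into one argument per object.
UART_LDFLAGS=$(uart_undefined_flags $UART_OBJS)
echo "$UART_LDFLAGS"
```

With this, adding a driver would only mean adding its object to the
list. The price is the naming convention mentioned above: if a file and
its symbol ever diverge, the generated symbol name is simply wrong, and
--require-defined instead of --undefined would at least turn that into a
link error.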

Other suggestions? Both solutions are ugly, but defining symbol names
manually is unappealing as well.

Phew, that patch series really exploded in complexity; I didn't expect
that. I just wanted to implement a poor man's dynamic hardware
configuration for ARM inmates, which in the end will really help me with
my PSCI patches. Then I ended up with cache coherence issues and deep
linker internals...

Thanks
Ralf
>
>> Crud.
>>
>> This [2] is a similar issue, and they propose to add --whole-archive.
>>
>> So in our case, the --gc-sections --whole-archive combo does indeed
>> make some sense, but unfortunately, this pulls (e.g.) gic_* functions
>> into the uart-demo, where they are actually not needed.
>>
>>>>
>>>> - There is no size regression due to --whole-archive for the final
>>>> binaries, on all architectures?
>>
>> And now I can also answer the question how we get those size regressions
>> on ARM: --whole-archive considers all object files for linking. While
>> unused functions will be kicked again through --gc-sections, the linker
>> can't garbage collect (for some reason) global static functions.
>> Anything referenced from there will of course be included...
>>
>> Hm. Maybe I can somehow cheat around that. Pretty sure the kernel has
>> some similar issues somewhere, as they make heavy use of --gc-sections.
>>
>
> OK, thanks for the analysis so far. If my other pointers will not help
> either, I will reconsider pros and cons of this approach.
>
> Jan
>