> On Dec 23, 2021, at 08:50, yadong.li <[email protected]> wrote:
> 
> Hi
> 
>  *   We learned about CAmkES groups from 
> https://github.com/seL4/camkes-tool/blob/master/docs/index.md
>  *   A group can colocate two component instances in a single address space. We 
> want to share variables this way, for example by passing a string's virtual 
> address over a direct call. But we are confused about global symbols 
> (functions and variables) that have the same name in a single address space. 
> Their behavior seems undefined.

My information on how this works is out of date and I can’t answer all your 
questions, but I can tell you how this feature was originally implemented.

Single address space components were originally implemented using 
post-compilation symbol mangling. As you’ve noticed, naive linking of two 
component instances together presents two problems:
1. Unrelated homonyms between the instances now conflict. A symbol called `foo` 
in component instance A and a symbol called `foo` in component instance B now 
refer to the same thing when linked together.
2. References to glue code have the opposite problem: their names may differ in 
component instance A and component instance B, but need to refer to the same 
functions when the two are linked together.
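
To make the two problems concrete, here is a toy model in Python. It is not how a linker or CAmkES works internally; it only treats each instance as a set of defined and undefined symbol names. The names `run`, `a_out_send` and `b_in_send` are hypothetical.

```python
def naive_link(a, b):
    """Merge two instances' symbol sets the way a plain link would see them.

    Each instance is a dict with "defined" and "undefined" sets of names.
    Returns (conflicts, unresolved).
    """
    # Problem 1: the same name defined on both sides collides.
    conflicts = a["defined"] & b["defined"]
    # Problem 2: a reference whose name matches no definition stays unresolved.
    defined = a["defined"] | b["defined"]
    unresolved = (a["undefined"] | b["undefined"]) - defined
    return conflicts, unresolved


# Hypothetical instances: both define an unrelated `foo`; A calls glue code
# under the name `a_out_send`, while B provides it as `b_in_send`.
inst_a = {"defined": {"foo", "run"}, "undefined": {"a_out_send"}}
inst_b = {"defined": {"foo", "b_in_send"}, "undefined": set()}

conflicts, unresolved = naive_link(inst_a, inst_b)
```

Here `conflicts` contains `foo` (problem 1) and `unresolved` contains `a_out_send` (problem 2), because nothing tells the linker that `a_out_send` and `b_in_send` are meant to be the same function.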

Both problems were solved with the same mechanism: GNU objcopy. The invocations 
to objcopy were template-generated so they could take advantage of knowledge 
about the system being compiled. A generated objcopy invocation would do the 
following:
1. Adjust all non-generated symbols to have internal visibility. Component 
instance A and component instance B have already been (partially) linked, so at 
this point the only unresolved symbols that need to remain externally visible 
are those related to the connection(s) between them. This solves problem 1 
above.
2. Name-mangle all remaining symbols to something prefixed with the relevant 
connection instance’s name. IIRC the name mangling scheme was something like 
‘<connection instance name><space><original symbol name>’. This guaranteed 
uniqueness because <space> is not a character that can appear in C or assembly 
symbols. Whether using a space in a symbol name is legal or not, I don’t know, 
but all the binutils seemed fine with this.
3. Rename the second part of the mangled name above (the original symbol name) 
to a common name shared by both component instances. I don’t recall exactly how 
this name was chosen, but this solves problem 2 above.
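
As a rough illustration of the three steps, the Python sketch below builds objcopy command lines for one partially linked component instance. It is not the original CAmkES template. The flags `--keep-global-symbol` and `--redefine-sym` are real GNU objcopy options, but whether the original templates used these particular flags is an assumption, as are the function name, the symbol names and the file names. The sketch also splits the work into two invocations so that it does not depend on the order in which objcopy applies its options.

```python
def objcopy_commands(conn, glue_symbols, in_obj, out_obj):
    """Return two objcopy argv lists for one partially linked instance.

    conn         -- connection instance name (hypothetical, e.g. "conn1")
    glue_symbols -- dict mapping this instance's glue symbol name to the
                    common name both sides should end up with
    """
    tmp_obj = out_obj + ".tmp"

    # Step 1: keep only the glue symbols global; objcopy makes every other
    # symbol local, so unrelated homonyms no longer clash.
    localize = ["objcopy"]
    for sym in sorted(glue_symbols):
        localize.append(f"--keep-global-symbol={sym}")
    localize += [in_obj, tmp_obj]

    # Steps 2 and 3: rename each glue symbol to
    # "<connection name><space><common name>".
    rename = ["objcopy"]
    for sym in sorted(glue_symbols):
        rename += ["--redefine-sym", f"{sym}={conn} {glue_symbols[sym]}"]
    rename += [tmp_obj, out_obj]

    return [localize, rename]


# Hypothetical usage: both sides of connection "conn1" end up agreeing on
# the symbol "conn1 send".
cmds_a = objcopy_commands("conn1", {"a_out_send": "send"}, "a.o", "a_mangled.o")
cmds_b = objcopy_commands("conn1", {"b_in_send": "send"}, "b.o", "b_mangled.o")
```

Each argv list would be passed to something like `subprocess.run` without a shell, which keeps the space inside the new symbol name within a single argument.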

This sounds pretty unorthodox and brittle, but it actually worked surprisingly 
well. All combinations of single address space component systems seemed to Just 
Work, with a few notable sharp edges:
* Anything involving MMIO was tricky. These symbols frequently needed to remain 
externally visible and the two component instances would often have differing 
names but the same addresses for them.
* GNU ld was more or less required. The multiple partial-linking steps were 
only supported by GNU ld and Gold at the time. LLVM’s lld may have caught up in 
the intervening years.
* The objcopy name mangling broke cross-section references used by GCC’s 
implementation of Link-Time Optimization. As a result, any LTO compilation 
effectively fell back to compiling without LTO. This wouldn’t have been a big deal except that 
one of the primary reasons to put two components in a single address space is 
to enable cross-component inlining, usually facilitated by LTO. AFAICT this 
(playing objcopy tricks and expecting LTO to still work) was simply not a 
supported use case. We explored how to work around this and got some one-off 
efforts working for benchmarking, but proper support would have involved 
altering the way binutils work.