On Mon, May 25 2026, Simon Glass <[email protected]> wrote:

> Hi Rasmus,
>
> On 2026-05-22T21:27:48, Rasmus Villemoes <[email protected]> wrote:
>> linker_lists.h: emit lots of meta-data for debugging and sanity checking
>>
>> This is an attempt to make it possible to catch various problems
>> with linker lists at build time, without affecting the size of the
>> final binary. Also, I did not want to have to modify all the linker
>> scripts in order to not get all this meta-data thrown away by garbage
>> collection.
>>
>> The basic trick is that one can use the .size assembly directive to
>> set the st_size of an ELF symbol, without that symbol actually
>> occupying that amount of space in any section or the final binary. And
>> all of the .size, .type, .{push,pop}section directives are completely
>> generic, so this should work for all architectures.
>>
>> So whenever we declare a start or end symbol for use by C code, also
>> emit information about the size and alignment of the type that the
>> list is supposed to be made of.
>>
>> A relatively straight-forward script (next patch) can then parse the
>> ELF file and report on any inconsistencies. For example, suppose two
>> [...]
>>
>> include/linker_lists.h | 43 +++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 43 insertions(+)
>
> I see this:
>
> test/cmd_ut.c:67:1: note: in expansion of macro ‘SUITE_DECL’
>    67 | SUITE_DECL(mbr);
>       | ^~~~~~~~~~
> include/linker_lists.h:45:17: error: expected ‘)’ before ‘:’ token
>    45 |                 : : "i"(sizeof(_type)), "i"(__alignof__(_type)))
>       |                 ^
> include/linker_lists.h:48:9: note: in expansion of macro ‘ll_emit_type_info’
>    48 |         ll_emit_type_info(_list, _type);          \
>       |         ^~~~~~~~~~~~~~~~~
> include/linker_lists.h:297:9: note: in expansion of macro ‘ll_emit_start_symbol’
>   297 |         ll_emit_start_symbol(_list, _type);          \
>       |         ^~~~~~~~~~~~~~~~~~~~
>
> See below

I don't see that, and I did use sandbox as one of the targets I
built. Running the script on the sandbox u-boot ELF file, I get

List                                    Start           End             Size    Align   Count
acpi_writer                             0x00449440      0x00449580      32      8       10
bootdev_hunter                          0x00449580      0x00449658      24      8       9
cmd                                     0x00449660      0x0044cb50      56      8       242
[...]
usb_driver_entry                        0x0045a9a0      0x0045a9d0      16      8       3
ut                                      0x0045a9e0      0x00461ce0      32      8       920
ut_2_addrmap                            0x0045a9e0      0x0045aa00      32      8       1
ut_2_bdinfo                             0x0045aa00      0x0045aa80      32      8       4
ut_2_bloblist                           0x0045aa80      0x0045ac40      32      8       14
[...]
ut_2_upl                                0x00461c60      0x00461ce0      32      8       4
w1_driver_entry                         0x00461ce0      0x00461ce0      16      8       0

i.e. ut is both shown as the whole list with 920 entries, and each
sublist is also shown, exactly as I wanted it to appear.
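
The Count column can be cross-checked by hand, since it should equal
(End - Start) / Size. A minimal sketch in C, assuming a hypothetical
helper ll_entry_count() that is not part of the patch or of the script:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch or the script): number of
 * entries in a linker list, given the addresses of its start and end
 * markers and the item size recorded in the metadata.
 */
static uint64_t ll_entry_count(uint64_t start, uint64_t end,
			       uint64_t item_size)
{
	assert(item_size > 0 && end >= start);
	/* A list of whole items spans an exact multiple of item_size. */
	assert((end - start) % item_size == 0);
	return (end - start) / item_size;
}
```

For the cmd row above, that is (0x0044cb50 - 0x00449660) / 56 = 13552 / 56
= 242, matching the table.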


>> diff --git a/include/linker_lists.h b/include/linker_lists.h
>> @@ -23,7 +24,45 @@
>> +#define ll_emit_type_info(_list, _type) \
>> +     __asm__(                                                \
>> +             ".pushsection "ll_info_section_name(_list)',\'aw\'\n'   \
>> +             ".type "ll_size_symbol_name(_list)", STT_OBJECT\n"      \
>> +             ".size "ll_size_symbol_name(_list)", %c0\n"             \
>> +             ".type "ll_align_symbol_name(_list)", STT_OBJECT\n"     \
>> +             ".size "ll_align_symbol_name(_list)", %c1\n"            \
>> +             ll_size_symbol_name(_list)':\n'                         \
>> +             ll_align_symbol_name(_list)':\n'                        \
>> +             '.popsection\n'                                         \
>> +             : : 'i'(sizeof(_type)), 'i'(__alignof__(_type)))
>
> This is extended asm (it has operand constraints), so GCC requires it
> to be inside a function.

Not exactly. My 'info gcc' has this to say:

  Similarly to basic ‘asm’, extended ‘asm’ statements may be used both
  inside a C function or at file scope ("top-level"), where you can use
  this technique to emit assembler directives, define assembly language
  macros that can be invoked elsewhere in the file, or write entire
  functions in assembly language.  Extended ‘asm’ statements outside of
  functions may not use any qualifiers, may not specify clobbers, may
  not use ‘%’, ‘+’ or ‘&’ modifiers in constraints and can only use
  constraints which don't allow using any register.

and since the only constraints I use are those that provide an immediate
to the asm, that should be ok (and WorksForMe).

I'm not sure how it has happened, but in your reply, some of the double
quotes have turned into single quotes (e.g. "i" has become 'i', ":\n"
has become ':\n'). Both here and further down. I hope you haven't tried
to compile the code in that form.

> ll_start_decl()/ll_end_decl() pull this into
> ll_emit_start_symbol/ll_emit_end_symbol at file scope - see
> SUITE_DECL() in test/cmd_ut.c
>
> One way out is to drop the operands and have the C side emit a
> zero-initialised marker object whose array dimensions encode
> sizeof/__alignof__ - e.g. a static struct in a dedicated section whose
> two members are sized sizeof(_type) and __alignof__(_type). That costs
> a few bytes per list per TU but keeps the trick working at file scope.
>
>> diff --git a/include/linker_lists.h b/include/linker_lists.h
>> @@ -23,7 +24,45 @@
>> +#define ll_emit_start_symbol(_list, _type)                           \
>> +     ll_emit_type_info(_list, _type);                                \
>> +     __asm__(                                                        \
>> +             ".pushsection "ll_start_section_name(_list)',\'aw\'\n'  \
>> +             ".type "__stringify(ll_start_symbol_name(_list))", STT_OBJECT\n" \
>> +             ".size "__stringify(ll_start_symbol_name(_list))", 0\n" \
>> +             __stringify(ll_start_symbol_name(_list))':\n'           \
>> +             '.popsection\n'                                         \
>> +             )
>
> These multi-statement macros are not wrapped, so they only work in
> contexts that accept two declarations/statements separated by ';'.
> ll_start_decl() then chains another declaration after them, leaving a
> stray ';' at file scope

No. The definition of ll_emit_start_symbol() does not end with a
semi-colon, the user supplies that, so that it reads e.g.

  #define ll_start_decl(_sym, _type, _list)                             \
        ll_emit_start_symbol(_list, _type);                             \
        static _type _sym[0] __aligned(CONFIG_LINKER_LIST_ALIGN)        \
                __section(ll_start_section_name(_list))

and the user of ll_start_decl() in turn supplies a ; at the invocation
site.

There's no stray ; anywhere that I can see.
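
The convention can be illustrated with a reduced sketch. All the demo_*
names below are hypothetical stand-ins, not the real linker_lists.h
macros, and the bodies are plain C declarations rather than asm:

```c
#include <assert.h>

/* Stand-in for ll_emit_type_info(): one declaration, no trailing ';'. */
#define demo_emit_type_info(_list, _type) \
	static_assert(sizeof(_type) > 0, #_list ": item type must not be empty")

/* Stand-in for ll_emit_start_symbol(): reuses the macro above. */
#define demo_emit_start_symbol(_list, _type) \
	demo_emit_type_info(_list, _type);				\
	static const unsigned long demo_##_list##_item_size = sizeof(_type)

/* Stand-in for ll_start_decl(): again no trailing ';'. */
#define demo_start_decl(_sym, _type, _list) \
	demo_emit_start_symbol(_list, _type);				\
	static _type _sym[1]

struct demo_item { long a; char b; };

/* The invocation site supplies the one and only final ';'. */
demo_start_decl(demo_items, struct demo_item, widget);
```

Each macro contributes its internal separators and leaves the last
declaration open, so nesting them never produces an empty declaration at
file scope.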

> - accepted as a GCC extension but rejected
> under -Wpedantic. Please either fold the two asm blocks into one (a
> single .pushsection ... .popsection can contain all the
> .type/.size/label directives), or wrap them in a way that gives a
> single declaration/statement. The latter is awkward at file scope,
> which is another argument for the single-asm approach.

Sorry, I don't see the purpose. I still need at least one statement
beyond the existing symbol definition, and because the
statement-generating macros never include a trailing semi-colon, they
can freely include other such macros. And I don't want to duplicate the
type_info macro, that's why it's split off in the first place.

>> diff --git a/include/linker_lists.h b/include/linker_lists.h
>> @@ -23,7 +24,45 @@
>> +#define ll_info_section_name(_list)         '__u_boot_list_'#_list'_0'
>
> Worth mentioning in the commit message that this is matched by the
> existing KEEP(*(SORT(__u_boot_list*))) globs in all the linker
> scripts, so the new _0 sections end up in the linker_list region of
> the final image. Since each section is zero bytes that does not affect
> binary size, but it does mean the metadata is reachable from the ELF
> symtab without changing any board's linker script - which is exactly
> what you say you wanted.

Yes, and I don't really need those sections to include the #_list part,
I could just create a single __u_boot_list___metadata section. I used
the above mostly so that the metadata symbols for a given list end
up immediately preceding the items belonging to that list in an 'nm -n'
dump.

>> diff --git a/include/linker_lists.h b/include/linker_lists.h
>> @@ -23,7 +24,45 @@
>> +#define ll_info_section_name(_list)         '__u_boot_list_'#_list'_0'
>> +#define ll_size_symbol_name(_list)          '_u_boot_list_'#_list'_0_item_size'
>> +#define ll_align_symbol_name(_list)         '_u_boot_list_'#_list'_0_item_align'
>
> The size/align symbols are emitted as local labels, so every TU that
> touches a given list contributes its own local copy.

That is deliberate.

> That is fine for the parser in patch 9, but it relies on strip/objcopy
> not throwing away local symbols on the way to the final image.

No, it does not.

> A note about this - or about pointing the script at u-boot (the
> unstripped ELF) rather than u-boot.bin - would help anyone trying this
> themselves.

The script cannot be used on u-boot.bin at all: that has no symbol
information and is not an ELF file. The whole point is that I emit a lot
of symbols, and metadata associated with them, that _are_ thrown away by
objcopy, so u-boot.bin remains unchanged.

>> diff --git a/include/linker_lists.h b/include/linker_lists.h
>> @@ -128,6 +167,7 @@
>>  #define ll_entry_start(_type, _list)                                 \
>>  ({                                                                   \
>> +     ll_emit_start_symbol(_list, _type);                             \
>>       static char start[0] __aligned(CONFIG_LINKER_LIST_ALIGN)        \
>>               __section(ll_start_section_name(_list));                \
>
> Just to check: is there any reason ll_entry_start() emits the metadata
> at every call site rather than only where the start/end markers are
> first defined?

Yes.

> Within a given TU every call expands to the same
> .size/.type for the same symbol, which the assembler tolerates but is
> wasted work.

No, because this is part of the sanity checking I want to do. If you
ever use ll_something(ctrl, struct foo) and also ll_something(ctrl,
struct bar), I'd like to catch at build time that the 'ctrl' list is
used with two different types. 

Sure, the sizes could (and often will) match, so it's not foolproof, but
it can catch some cases. I'm trying to come up with some way to make
this part even better, without adding anything that will cause
u-boot.bin to grow (I don't really care about the size of u-boot the ELF
file).
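
To make the kind of check concrete, here is a minimal sketch in C. It
assumes the (list, size, align) triples have already been extracted from
the ELF symbol table; struct ll_meta and ll_count_mismatches() are
hypothetical names and this is not the script from the next patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one triple per size/align symbol pair found. */
struct ll_meta {
	const char *list;
	unsigned long size;
	unsigned long align;
};

/*
 * Count the records whose size or alignment disagrees with the first
 * record seen for the same list name.
 */
static size_t ll_count_mismatches(const struct ll_meta *m, size_t n)
{
	size_t bad = 0;

	assert(m != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (strcmp(m[i].list, m[j].list))
				continue;
			if (m[i].size != m[j].size || m[i].align != m[j].align)
				bad++;
			break;	/* compare only against the first record */
		}
	}
	return bad;
}
```

Two types with identical size and alignment still pass this check, which
is the limitation mentioned above.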

Rasmus
