On Sun, May 24, 2026 at 4:43 AM Paulo Duarte <[email protected]> wrote:
>
> The imported boot.S places the boot stack inside the .bss segment:
>
>         .bss
> .boot_stack:
>         .space 4096
> .boot_stack_end:
>
> c_boot_entry() is the first C function called from _start, with sp
> already pointing at .boot_stack_end.  Its first action is to call
> zero_out_bss(), which memsets [__bss_start, __bss_end) — the whole
> .bss range, including the very boot stack the kernel is *currently
> running on*.  That wipes the saved x29/x30 and any locals the
> compiler spilled on entry, so the next return / function call
> branches to 0 and the kernel hangs in EL1.
>
> Move the boot stack into its own `.boot_stack` nobits section and
> place that section after `__bss_end` in the linker script so
> zero_out_bss() leaves it alone:
>
>         .section .boot_stack, "aw", %nobits
> boot_stack:
>         .space 4096
> .boot_stack_end:
>
> Under qemu-system-aarch64 -M virt, the bug fires
> immediately; wip-aarch64 likely never exercised the
> zero_out_bss-from-_start path because its testing was on a
> different boot route.

Could you expand? What different boot route?

The patch makes sense, but it is really interesting that this was not
causing issues for us at the time.

> ---
>  aarch64/aarch64/boot.S | 10 ++++++++--
>  aarch64/ldscript       |  3 +++
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/aarch64/aarch64/boot.S b/aarch64/aarch64/boot.S
> index 85d3b944..ab736489 100644
> --- a/aarch64/aarch64/boot.S
> +++ b/aarch64/aarch64/boot.S
> @@ -92,8 +92,14 @@ ENTRY(_start)
>         b       EXT(c_boot_entry)
>  END(_start)
>
> -       .bss
> -.boot_stack:
> +       /*
> +        * Put the boot stack in its own nobits section so it lives outside
> +        * [__bss_start, __bss_end). Otherwise c_boot_entry's call to
> +        * zero_out_bss() (which memsets the whole BSS region) would clobber
> +        * its own saved x29/x30, sending us to PC=0 on ret.
> +        */
> +       .section .boot_stack, "aw", %nobits
> +boot_stack:
>         .space  4096
>  .boot_stack_end:
>
> diff --git a/aarch64/ldscript b/aarch64/ldscript
> index 236fc6f8..a5aec69d 100644
> --- a/aarch64/ldscript
> +++ b/aarch64/ldscript
> @@ -27,6 +27,9 @@ SECTIONS
>          __bss_start = .;
>          *(.bss);
>          __bss_end = .;
> +        /* Boot stack lives in its own nobits region after __bss_end so
> +           it survives zero_out_bss() running from within itself. */
> +        *(.boot_stack);
>      }
>      _image_end = .;
>  }
> --
> 2.54.0
>
