Re: [Mingw-w64-public] [PATCH] aarch64: Add runtime relocations

Martin Storsjö Fri, 14 Nov 2025 01:45:40 -0800

On Thu, 13 Nov 2025, Evgeny Karpov wrote:

From: Evgeny Karpov <[email protected]>

Is this the intended email address to use for these contributions - no more @microsoft.com?

Subject: [PATCH] aarch64: Add runtime relocations

The patch implements the required changes to support runtime relocations.
For 26-bit relocation, the linker generates a jump stub, as a single
opcode is not sufficient for relocation.

The supporting binutils patch is being upstreamed.
https://sourceware.org/pipermail/binutils/2025-November/145651.html

A similar change has been upstreamed to Cygwin.
https://cygwin.com/pipermail/cygwin-patches/2025q4/014332.html

Signed-off-by: Evgeny Karpov <[email protected]>
---
 mingw-w64-crt/crt/pseudo-reloc.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/mingw-w64-crt/crt/pseudo-reloc.c b/mingw-w64-crt/crt/pseudo-reloc.c
index dd08e718a..20c04a1f0 100644
--- a/mingw-w64-crt/crt/pseudo-reloc.c
+++ b/mingw-w64-crt/crt/pseudo-reloc.c
@@ -464,6 +464,31 @@ do_pseudo_reloc (void * start, void * end, void * base)
         case 16:
            __write_memory ((void *) reloc_target, &reldata, 2);
           break;
+#ifdef __aarch64__
+        case 12:
+          /* Replace add Xn, Xn, :lo12:label with ldr Xn, [Xn, :lo12:__imp__func].
+             That loads the address of _func into Xn.  */
+          opcode = 0xf9400000 | (opcode & 0x3ff); // ldr
+          reldata = ((ptrdiff_t) base + r->sym) & ((1 << 12) - 1);
+          reldata >>= 3;
+          opcode |= reldata << 10;
+           __write_memory ((void *) reloc_target, &opcode, 4);
+          break;
+        case 21:
+          /* Replace adrp Xn, label with adrp Xn, __imp__func.  */
+          opcode &= 0x9f00001f;
+          reldata = (((ptrdiff_t) base + r->sym) >> 12)
+                    - (((ptrdiff_t) base + r->target) >> 12);
+          reldata &= (1 << 21) - 1;
+          opcode |= (reldata & 3) << 29;
+          reldata >>= 2;
+          opcode |= reldata << 5;
+           __write_memory ((void *) reloc_target, &opcode, 4);
+          break;
+        /* A note regarding 26 bits relocation.
+           A single opcode is not sufficient for 26 bits relocation in dynamic linking.
+           The linker generates a jump stub instead.  */
+#endif

First off - I would point out that I did consider doing something like this for the case with LLVM/Clang as well, but I decided not to.

Instead, in LLVM/Clang, we generate .refptr indirection - just like GCC also does on x86_64. Doing that avoids a number of problems:

- It avoids having to add support for these new relocations here (including the nitpicky details I'll follow up with below)

- It avoids having to do these relocations in the .text section. Doing that requires changing the permission of the code section to write+execute, which generally is undesirable. (Plus, in special environments such as UWP, it is entirely forbidden to have regions of memory being both writable and executable at the same time.)

- It avoids the issue with how far away the target symbol can be. E.g. on x86_64, a 32 bit relative address isn't big enough if the target is too far away. When this issue does show up, it produces extremely confusing issues, so to help diagnose it better, we added a check here in the pseudo relocation code (see commit ca35236d9799af8a3d2f9baa35b60e6c11abeb24) to error out if the target is too far away to express in the given number of bits.
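
For readers unfamiliar with it, the .refptr indirection can be modelled in plain C roughly as below. This is only a sketch: refptr_variable and the two accessor functions are made-up names for illustration, and in reality the compiler emits the .refptr slot itself rather than anything visible in the source.

```c
/* Stand-in for a variable that would really live in another DLL. */
int variable = 42;

/* Hypothetical stand-in for the compiler-generated .refptr slot:
   a pointer-sized data word holding the address of the variable.
   Only this data word needs fixing up at load time, not the code. */
int *refptr_variable = &variable;

/* What "return variable;" effectively becomes: load the pointer
   from the data slot, then load through it. */
int get_var_via_refptr(void) {
    return *refptr_variable;
}

/* What "return &variable;" effectively becomes: a single load of
   the pointer from the data slot. */
int *get_var_addr_via_refptr(void) {
    return refptr_variable;
}
```

The point of the model is that the only thing that ever needs patching is a full pointer in a data section, so .text stays read-only and there is no range limit.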

That said, I see that you've worked around the range issue by rewriting "add Xn, Xn, :lo12:label" into "ldr Xn, [Xn, :lo12:__imp_label]" - which makes the range problem a non-issue. That's neat!

Regarding range, did you actually test this in a mingw setting? The range check code, currently on lines 442-456, should trigger on these relocations (with bits == 12 or 21) and error out, even if you actually do handle a larger range. If you want to go the way of this patch, I'm pretty sure you need to patch the range check as well, to make it not trigger on these relocations.
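
To illustrate the kind of check meant here, a minimal sketch follows. It is not the code from pseudo-reloc.c; fits_in_signed_bits is a hypothetical helper and the exact bounds used by the real check may differ.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if value is representable as a
   signed two's complement field that is `bits` wide
   (assumes 1 <= bits <= 63). */
int fits_in_signed_bits(int64_t value, int bits) {
    int64_t max = ((int64_t) 1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}
```

With bits == 12 or 21, the distance to a far-away import would generally not pass a test of this shape, even though the rewritten adrp+ldr pair could reach it.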

However - there are two potential flaws with this approach which I don't see how you are solving. (Do you have a full prebuilt toolchain with these patches integrated where I could try it out? I tried https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build/releases/download/2025-07-15/aarch64-w64-mingw32-msvcrt-toolchain.tar.gz but that doesn't seem to have these bits enabled yet.)


What you have now works fine for code like this:

    extern int variable;
    int *get_var_addr(void) {
        return &variable;
    }

Where GCC generates code like this:

    get_var_addr:
        adrp    x0, variable
        add     x0, x0, :lo12:variable
        ret

If the linker part of these relocations works in the same way as for the other existing pseudo relocations on x86, then the linker replaces the symbol references to the undefined "variable" with "__imp_variable" at link time, like this:


    get_var_addr:
        adrp    x0, __imp_variable
        add     x0, x0, :lo12:__imp_variable
        ret

Now this works fine with your pseudo relocation handling, which at runtime turns it into this:


    get_var_addr:
        adrp    x0, __imp_variable
        ldr     x0, [x0, :lo12:__imp_variable]
        ret
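
The two instruction rewrites from the patch can be restated as standalone C functions, which makes the bit manipulation easier to check in isolation. This is a sketch that mirrors my reading of the patch's "case 12" and "case 21"; the function names and the use of plain 64 bit addresses as parameters are assumptions for illustration.

```c
#include <stdint.h>

/* Mirrors "case 12": keep Rn/Rd (the low 10 bits) of the add, switch
   to a 64 bit "ldr Xt, [Xn, #imm]" and encode the low 12 bits of the
   __imp_ slot address, scaled by 8 as that encoding requires. */
uint32_t rewrite_add_to_ldr(uint32_t opcode, uint64_t imp_addr) {
    uint32_t imm = (uint32_t) (imp_addr & 0xfff) >> 3;
    return 0xf9400000u | (opcode & 0x3ff) | (imm << 10);
}

/* Mirrors "case 21": keep the adrp opcode bits and Rd, then encode
   the page delta between the __imp_ slot and the instruction as
   immlo (bits 30:29) and immhi (bits 23:5). */
uint32_t rewrite_adrp(uint32_t opcode, uint64_t imp_addr,
                      uint64_t insn_addr) {
    uint32_t delta =
        (uint32_t) (((imp_addr >> 12) - (insn_addr >> 12)) & 0x1fffff);
    opcode &= 0x9f00001fu;
    opcode |= (delta & 3) << 29;
    opcode |= (delta >> 2) << 5;
    return opcode;
}
```

For example, feeding in "add x1, x1, #0" (0x91000021) with a slot address ending in 0x238 yields the encoding of "ldr x1, [x1, #0x238]".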

However, what does it do about this case?

    extern int variable;
    int get_var(void) {
        return variable;
    }

With the current versions of GCC, this generates the following code:

    get_var:
        adrp    x0, variable
        ldr     w0, [x0, #:lo12:variable]
        ret

Now in this case, we already have the :lo12: relocation in an ldr instruction, so the pseudo relocation trick no longer works as intended. To fix this case, the pseudo relocation handling code would need to insert an extra ldr instruction after this one.
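
The runtime could at least tell the two cases apart by looking at which instruction the 12 bit relocation lands on. A minimal sketch, with hypothetical helper names, based on the A64 encodings of add (immediate) and ldr (immediate, unsigned offset):

```c
#include <stdint.h>

/* 64 bit "add Xd, Xn, #imm12" with no shift:
   bits 31..22 are 1001000100. */
int is_add_x_imm(uint32_t opcode) {
    return (opcode & 0xffc00000u) == 0x91000000u;
}

/* Integer "ldr Rt, [Xn, #imm12]" with unsigned offset, any access
   size: bits 29..22 are 11100101, bits 31..30 select the size. */
int is_ldr_uimm(uint32_t opcode) {
    return (opcode & 0x3fc00000u) == 0x39400000u;
}
```

Detecting the ldr case only allows erroring out, though; it doesn't make the single instruction slot big enough for the two loads that are needed.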

Secondly, the current mechanism for the pseudo relocations works by _adding_ the difference between the __imp_variable and the actual imported address to the relocation. Not overwriting, but adding. This makes it also work transparently for relocations with a PIC-relative address (although that's range limited). It also makes it work for cases where the symbol reference has a built-in offset.
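
As a simplified model of that additive behaviour (the real code handles several field widths and writes through __write_memory; the function name and the 64 bit field here are assumptions for illustration):

```c
#include <stdint.h>

/* Additive fixup: the value already stored at the target, which may
   include an offset such as +12, is kept, and only the distance
   between the __imp_ slot and the real symbol is added on top. */
void apply_additive_fixup(uint64_t *target, uint64_t imp_slot_addr,
                          uint64_t real_addr) {
    *target += real_addr - imp_slot_addr;
}
```

If the field initially holds the slot address plus 12, it ends up holding the real address plus 12, so the offset survives without the runtime ever knowing about it.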


As for concrete examples to show the issue:

    struct S {
        int a, b, c, d;
    };
    extern struct S s;
    int get_field(void) {
        return s.d;
    }
    int *get_field_addr(void) {
        return &s.d;
    }

Currently with GCC, this produces the following code:

    get_field:
        adrp    x0, s+12
        ldr     w0, [x0, #:lo12:s+12]
        ret

    get_field_addr:
        adrp    x0, s+12
        add     x0, x0, :lo12:s+12
        ret

Now if I understand the code you're proposing correctly, this would lose and drop the +12 offset entirely, and just end up addressing the start of the struct instead.
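
This can be seen from the "case 12" computation alone: it masks the original opcode with 0x3ff, so whatever immediate the add carried is discarded. A small sketch, where patch_case12 restates the patch's arithmetic and make_add_x_imm is a hypothetical helper for building test input:

```c
#include <stdint.h>

/* Same arithmetic as the patch's "case 12": only Rn/Rd survive from
   the original instruction. */
uint32_t patch_case12(uint32_t opcode, uint64_t imp_addr) {
    uint32_t imm = (uint32_t) (imp_addr & 0xfff) >> 3;
    return 0xf9400000u | (opcode & 0x3ff) | (imm << 10);
}

/* Hypothetical helper: encode "add Xd, Xn, #imm12". */
uint32_t make_add_x_imm(unsigned rd, unsigned rn, unsigned imm12) {
    return 0x91000000u | ((imm12 & 0xfff) << 10)
           | ((rn & 31) << 5) | (rd & 31);
}
```

Two add instructions that differ only in their immediate, for example by 12, produce exactly the same rewritten ldr.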

I guess it's possible to somehow try to work around these issues by making GCC not emit this kind of code at all - to never bake in an offset like this, and never generate a direct "ldr" like in the get_var and get_field cases. (Then you should also amend the linker to check that the symbol offset, when creating such pseudo relocations, has to be zero, as it won't work in the end otherwise.)

So it may be possible to hack around all these issues somehow, but I would suggest instead going the same way as GCC did for x86_64, and as we've done in LLVM/Clang for all architectures, by doing indirection via a .refptr pointer instead. That way, you don't need _any_ code changes to the linker or runtime. (Other than adding support for pseudo relocations for autoimport of full 64 bit addresses, if that needs architecture specific code in binutils.)


// Martin



_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
