https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123524
--- Comment #9 from mikulas at artax dot karlin.mff.cuni.cz ---
I bisected the other problem (the scaled addressing modes not being used); it
is caused by commit 24bc02b1eda3795163616c02725ee14bac9d975c ("gimple-fold:
Remove assume_aligned folding").
The source code uses __builtin_assume_aligned when accessing the variables on
the frame:
#define frame_var(fp, idx) \
    (cast_ptr(unsigned char *, \
        __builtin_assume_aligned(frame_char_(fp) + ((size_t)(idx) << slot_bits), \
                                 slot_size)))
When I delete __builtin_assume_aligned from the source code, the scaled
addressing is properly generated.
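For anyone who wants to compare the two variants in isolation, here is a
minimal sketch. It is not the real source: slot_bits is assumed to be 3 (to
match the "lsl #3" scaling in the disassembly), plain casts stand in for
cast_ptr() and frame_char_(), and the names frame_var_hint, frame_var_plain,
store_slot_hint, store_slot_plain and load_slot_plain are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 8-byte slots, chosen to match the "lsl #3" scaling in the
   disassembly; the report does not show the real definitions. */
#define slot_bits 3
#define slot_size ((size_t)1 << slot_bits)

/* Same shape as the reported macro, with plain casts standing in for the
   project's cast_ptr() and frame_char_() helpers. */
#define frame_var_hint(fp, idx) \
    ((unsigned char *)__builtin_assume_aligned( \
        (unsigned char *)(fp) + ((size_t)(idx) << slot_bits), slot_size))

/* Identical address computation without the alignment hint. */
#define frame_var_plain(fp, idx) \
    ((unsigned char *)(fp) + ((size_t)(idx) << slot_bits))

/* Hypothetical helper: store one 64-bit value into slot idx, with the hint. */
void store_slot_hint(void *fp, uint32_t idx, int64_t v)
{
    /* The hint is only valid if the frame base really is slot-aligned. */
    assert(((uintptr_t)fp & (slot_size - 1)) == 0);
    memcpy(frame_var_hint(fp, idx), &v, sizeof v);
}

/* Hypothetical helper: the same store without the hint. */
void store_slot_plain(void *fp, uint32_t idx, int64_t v)
{
    memcpy(frame_var_plain(fp, idx), &v, sizeof v);
}

/* Hypothetical helper: read a slot back, without the hint. */
int64_t load_slot_plain(void *fp, uint32_t idx)
{
    int64_t v;
    memcpy(&v, frame_var_plain(fp, idx), sizeof v);
    return v;
}
```

Building this with -O2 -DNDEBUG and comparing the objdump -d output of
store_slot_hint and store_slot_plain should show whether the hint changes the
addressing mode. I have not checked that this reduced form triggers the same
difference as the full interpreter loop.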
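BTW, gcc-16 on arm64 also generates slightly worse code because of
__builtin_assume_aligned: the store address is computed with a separate add
(add x2, x20, w2, uxtw #3; str x0, [x2]) instead of being folded into the
store's scaled addressing mode (str x0, [x20, w2, uxtw #3]).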
With __builtin_assume_aligned:
fe6c: b8402260 ldur w0, [x19, #2]
fe70: b8406261 ldur w1, [x19, #6]
fe74: b840a262 ldur w2, [x19, #10]
fe78: 38606a83 ldrb w3, [x20, x0]
fe7c: 38616a84 ldrb w4, [x20, x1]
fe80: 2b04007f cmn w3, w4
fe84: 54ff29e1 b.ne e3c0 <u_run+0x9aa0> // b.any
fe88: 8b224e82 add x2, x20, w2, uxtw #3
fe8c: f8607a80 ldr x0, [x20, x0, lsl #3]
fe90: f8617a81 ldr x1, [x20, x1, lsl #3]
fe94: ab010000 adds x0, x0, x1
fe98: 54ff2946 b.vs e3c0 <u_run+0x9aa0>
fe9c: f9000040 str x0, [x2]
fea0: 78412e61 ldrh w1, [x19, #18]!
fea4: 90000000 adrp x0, 0 <FIXED_binary_divide_int8_t>
fea8: 91000000 add x0, x0, #0x0
feac: f861d800 ldr x0, [x0, w1, sxtw #3]
feb0: d61f0000 br x0
Without __builtin_assume_aligned:
fd30: b8402260 ldur w0, [x19, #2]
fd34: b8406261 ldur w1, [x19, #6]
fd38: b840a262 ldur w2, [x19, #10]
fd3c: 38606a83 ldrb w3, [x20, x0]
fd40: 38616a84 ldrb w4, [x20, x1]
fd44: 2b04007f cmn w3, w4
fd48: 54ff29e1 b.ne e284 <u_run+0x9964> // b.any
fd4c: f8607a80 ldr x0, [x20, x0, lsl #3]
fd50: f8617a81 ldr x1, [x20, x1, lsl #3]
fd54: ab010000 adds x0, x0, x1
fd58: 54ff2966 b.vs e284 <u_run+0x9964>
fd5c: f8225a80 str x0, [x20, w2, uxtw #3]
fd60: 78412e61 ldrh w1, [x19, #18]!
fd64: 90000000 adrp x0, 0 <FIXED_binary_divide_int8_t>
fd68: 91000000 add x0, x0, #0x0
fd6c: f861d800 ldr x0, [x0, w1, sxtw #3]
fd70: d61f0000 br x0