https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86487
--- Comment #2 from avieira at gcc dot gnu.org ---
I am having quite a lot of trouble understanding what is going wrong, or maybe
I should say, what parts are going right.
I believe it tries to match the fifth alternative of anddi3_insn here, which
is:
'&r' 'r' 'De'
This fails because of the early clobber, and rightly so, because:
(insn 13 11 14 2 (set (reg:DI 0 r0 [125])
(and:DI (reg:DI 1 r1 [+-4 ])
(const_int 1 [0x1]))) "../t.c":3 79 {*anddi3_insn}
(nil))
DI r0 overlaps with DI r1, since two consecutive GPRs are needed to hold a
DImode value.
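
To make the overlap concrete, here is a minimal sketch in C. It assumes a
DImode value occupies two consecutive hard registers starting at the named
one; 'hard_reg_ranges_overlap' is a hypothetical helper for illustration, not
a GCC function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: return true if the hard register
   range [a, a + na) shares at least one register with [b, b + nb).  */
bool
hard_reg_ranges_overlap (int a, int na, int b, int nb)
{
  assert (na > 0 && nb > 0);
  return a < b + nb && b < a + na;
}
```

With (DI r0) as registers r0-r1 and (DI r1) as registers r1-r2, calling
hard_reg_ranges_overlap (0, 2, 1, 2) reports an overlap on r1, which is what
the early clobber forbids.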
I decided to debug reload to find out why it had picked r1, and I found that
'get_hard_regno' first picks r2 for (subreg:DI (SI 122)) in the same
instruction. Looking further up, we see:
(insn 10 9 11 2 (set (reg:SI 2 r2 [122])
(xor:SI (reg:SI 0 r0 [orig:123 a ] [123])
(const_int 1 [0x1]))) "../t.c":3 111 {*arm_xorsi3}
(nil))
'get_hard_regno' then invokes 'subreg_regno_offset', which, in big endian and
for paradoxical subregs with offset 0, returns 'nregs_xmode - nregs_ymode' as
the offset, where xmode is the inner mode and ymode is the outer mode. That is
'-1' in our case (and always negative). So I believe reload now sees 'r1-r2'
as the register pair for that first 'and' operand and 'r0-r1' as the
destination operand.
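
The arithmetic, as I understand it, can be sketched as follows. This is only a
model of the case described here (big endian, paradoxical subreg, offset 0),
not the actual 'subreg_regno_offset' code, and both function names are
hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the big-endian paradoxical-subreg case only:
   the offset is the inner register count minus the outer register count,
   which is negative because the outer mode needs more registers.  */
int
paradoxical_be_regno_offset (int nregs_inner, int nregs_outer)
{
  assert (nregs_inner > 0 && nregs_outer > nregs_inner);
  return nregs_inner - nregs_outer;
}

/* First hard register of the wider value, given the hard register that
   was assigned to the inner (narrower) register.  */
int
first_hard_reg_of_subreg (int inner_regno, int nregs_inner, int nregs_outer)
{
  return inner_regno
	 + paradoxical_be_regno_offset (nregs_inner, nregs_outer);
}
```

For (subreg:DI (SI 122)) with SI 122 in r2, the inner mode takes one register
and the outer mode two, so the offset is 1 - 2 = -1 and the DI value is taken
to start at r1, i.e. the pair 'r1-r2'.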
At first I thought this was a middle-end issue, specifically with paradoxical
subregs. However, I also saw a bit of AArch64 big-endian assembly that used
'odd' registers to represent DI register pairs (V2DI).
Given the comment in 'subreg_regno_offset':
/* If this is a big endian paradoxical subreg, which uses more
actual hard registers than the original register, we must
return a negative offset so that we find the proper highpart
of the register.
We assume that the ordering of registers within a multi-register
value has a consistent endianness: if bytes and register words
have different endianness, the hard registers that make up a
multi-register value must be at least word-sized. */
This made me start to think that GCC expects register pairs in big endian to
be "called" by their Least Significant Register (LSR) and to be counted back
from there, so that '[r1, r0]' would be called (DI r1). I am not entirely sure
about this though...
I tried changing the Arm back end to only accept DImode register pairs if the
register is odd. That fixed this case but broke a lot of other things. Another
possible fix would be to change Arm's 's_register_operand' so that it does not
accept paradoxical subregs in big endian, but I would first like to understand
how the middle end expects/sees/generates register pairs if
'REG_WORDS_BIG_ENDIAN' is true.