On 10/10/2023 20:09, Segher Boessenkool wrote:
Hi Andrew,
On Tue, Oct 10, 2023 at 04:11:18PM +0100, Andrew Stubbs wrote:
I'm also seeing wrong-code bugs when I allow more than 32 new registers,
but that might be an unrelated problem. Or the allocation is broken? I'm
still analyzing this.
It could be connected. Neither of those things should happen.
If it matters, ... the new registers can't be used for general purposes,
What does this mean? I think you mean they *can* be used for anything,
you just don't want to (maybe it is slow)? If you make them allocatable
registers, they *will* be allocated for anything the compiler deems a
good idea.
Nope, the "Accelerator VGPR" registers are exclusively for the use of
the new matrix multiply instructions that we don't support (yet).
The compiler is free to use them for storing data, but there are no real
instructions to do anything there.
so I'm trying to set them up as a temporary spill destination. This
means they're typically not busy. It feels like it shouldn't be this
hard... :(
So what did you do, put them later in the allocation order? Make their
register_move_cost higher than for normal registers (but still below
memory_move_cost)? Or what? TARGET_SPILL_CLASS maybe?
We put them in a new register class, with a new constraint, and
implemented the move instructions (only) with new alternatives for the
new class. Then implemented TARGET_SPILL_CLASS in the obvious way.
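
For readers following along, here is a minimal sketch of what "the obvious way" might look like. It is a standalone model, not the actual patch: the enum and function names are hypothetical, and the real TARGET_SPILL_CLASS hook also receives a machine_mode argument, which is omitted here. The only behaviour it relies on is that the hook returns the class to spill into, or NO_REGS when only memory should be used.

```cpp
// Hypothetical, self-contained model of a TARGET_SPILL_CLASS-style hook.
// All names below are illustrative; none are taken from the GCC sources.

enum reg_class_sketch {
  SK_NO_REGS,     // stands in for NO_REGS: spill to memory as usual
  SK_SGPR_REGS,   // assumed scalar register class
  SK_VGPR_REGS,   // assumed ordinary vector register class
  SK_AVGPR_REGS   // assumed class holding the new accelerator VGPRs
};

// Pseudos that want an ordinary VGPR may be spilled into the accelerator
// VGPR class; every other class falls back to memory.
constexpr reg_class_sketch
spill_class_sketch (reg_class_sketch rclass)
{
  return rclass == SK_VGPR_REGS ? SK_AVGPR_REGS : SK_NO_REGS;
}
```

In this model, spill_class_sketch (SK_VGPR_REGS) yields SK_AVGPR_REGS, while any other input yields SK_NO_REGS.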
All this is working just fine as long as there are only 32 new registers
unfixed (a0-a31); the code even runs correctly and I can see the
spilling happening correctly.
If I enable register a32 then it prefers that, and I get wrong code.
Using that register ought to be logically correct, albeit suboptimal, so
I don't understand that either.
Andrew