Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> writes:
> Hi all,
>
> The recent changes to aarch64_expand_vector_init cause an ICE in the
> attached testcase.  The register allocator ICEs with "Max. number of
> generated reload insns per insn is achieved (90)".
>
> That is because aarch64_expand_vector_init creates a paradoxical subreg
> to move a DImode value into a V2DI vector:
> (insn 74 72 76 8 (set (reg:V2DI 287 [ _166 ])
>          (subreg:V2DI (reg/v/f:DI 112 [ d ]) 0)) 1050 {*aarch64_simd_movv2di}
>
> This is done because we want to express that the whole of the V2DI
> vector will be written so that init-regs doesn't try to
> zero-initialise it before we overwrite each lane individually anyway.
>
> This can go wrong because if the DImode value is allocated in, say,
> x30 (the last register in that register class), the V2DI subreg of it
> isn't valid or meaningful, and that seems to cause the trouble.
>
> It's kinda hard to know what the right solution for this is.
> We could emit a duplicate of the value into all the lanes of the vector,
> but we have tests that test against that (we're trying to avoid
> unnecessary duplicates).
>
> What this patch does is it defines a pattern for moving a scalar into
> lane 0 of a vector using a simple FMOV or LDR and represents that as a
> merging with a vector of zeroes.  That way, the instruction represents
> a full write of the destination vector but doesn't "read" more bits
> from the source than necessary. The zeroing effect is also a more
> accurate representation of the semantics of FMOV.

This feels like a hack.  Either the paradoxical subreg of the pseudo
is invalid for some reason (in which case we should ICE when it's formed,
not just in the case of x30 being allocated) or the subreg is valid,
in which case the RA should handle it correctly (and the backend should
give it the information it needs to do that).

I could see the argument for ignoring the problem for expediency if the
patch was a clean-up in its own right, but I think it's wrong to add so
much code to paper over a bug.

Thanks,
Richard
