https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98891

--- Comment #4 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #1)
> Reduced testcase:
> extern unsigned long long a, b, c;
> 
> void
> foo (void)
> {
>   a = b | ~c;
> }
> 
> Seems this is the usual dilemma between split double-word operations early
> vs. split it late, each has its advantages and serious disadvantages.
> By splitting early, combiner can't really do much with it, it is split into
> loads, not, or and store of the halves separately and combiner doesn't see
> the two halves together, one would need essentially vectorization on RTL to
> match that.

Splitting early is required since it results in much more efficient code.
However the real underlying problem is the concept that a type can map to
different register files. Generally a compiler must decide the register file
for each operand before register allocation, but GCC does this during register
allocation, and it does it badly, with incomplete knowledge and far too many
costing hacks. To get decent code for AArch64 we had to add special hooks to
force the allocator to strongly prefer allocating integer types to integer
registers and FP/SIMD types to FP/SIMD registers.
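
For illustration only, here is a minimal C sketch of what splitting the
double-word operation from the reduced testcase into word-sized halves amounts
to on a 32-bit target. It is not GCC's RTL or its actual output; the names
u64_halves, split64, join64, orn_dword and orn_split are hypothetical and exist
only for this example. After the split, each half is an independent OR-NOT on a
32-bit value, which is why later passes no longer see a single 64-bit operation.

```c
#include <stdint.h>

/* Hypothetical representation of a 64-bit value as two 32-bit words. */
typedef struct { uint32_t lo, hi; } u64_halves;

/* Break a 64-bit value into its low and high words. */
static u64_halves split64(uint64_t x)
{
  u64_halves h = { (uint32_t) x, (uint32_t) (x >> 32) };
  return h;
}

/* Reassemble a 64-bit value from its two words. */
static uint64_t join64(u64_halves h)
{
  return ((uint64_t) h.hi << 32) | h.lo;
}

/* Double-word form: the single 64-bit operation from the testcase. */
static uint64_t orn_dword(uint64_t b, uint64_t c)
{
  return b | ~c;
}

/* Split form: the same computation as two independent 32-bit operations. */
static uint64_t orn_split(uint64_t b, uint64_t c)
{
  u64_halves hb = split64(b), hc = split64(c), r;
  r.lo = hb.lo | (uint32_t) ~hc.lo;  /* low word:  OR with NOT */
  r.hi = hb.hi | (uint32_t) ~hc.hi;  /* high word: OR with NOT */
  return join64(r);
}
```

Both functions compute the same result for every input; the difference is only
in how many word-sized operations are visible to the rest of the compiler.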
