RE: [PATCH PR95254] aarch64: gcc generate inefficient code with fixed sve vector length

Yangfei (Felix) Mon, 01 Jun 2020 19:45:15 -0700

Hi,

> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Monday, June 1, 2020 4:47 PM
> To: Yangfei (Felix) <felix.y...@huawei.com>
> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Jakub
> Jelinek <ja...@redhat.com>; Hongtao Liu <crazy...@gmail.com>; H.J. Lu
> <hjl.to...@gmail.com>
> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
> fixed sve vector length


Snip...
 
> Sounds good.  Maybe at this point the x_inner and y_inner code is getting
> complicated enough to put into a lambda too:
> 
>   x_inner = ... (x);
>   y_inner = ... (y);
> 
> Just a suggestion though.

Yes, that's a good suggestion.  The code becomes cleaner with 
another lambda.
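
For readers without the patch at hand, here is a rough sketch of the idea, not the code from the attached diff.  It assumes a made-up Operand type standing in for an RTL operand; the names Operand, inner_bits and can_use_inner_p are hypothetical and do not exist in GCC.  It only shows one lambda replacing two copies of the same "look through the wrapper" logic for x and y:

```cpp
#include <cassert>

// Hypothetical stand-in for an RTL operand; this is NOT GCC's rtx type.
struct Operand
{
  bool is_subreg;   // true if the operand wraps an inner value
  int inner_bits;   // width of the wrapped value (used only if is_subreg)
  int outer_bits;   // width of the operand itself
};

// Hypothetical predicate: can x and y be handled through their inner values?
inline bool
can_use_inner_p (const Operand &x, const Operand &y)
{
  // One lambda computes the "inner" view, so the logic is written once
  // and applied to both operands.
  auto inner_bits = [] (const Operand &op) -> int
    {
      return op.is_subreg ? op.inner_bits : op.outer_bits;
    };

  int x_inner = inner_bits (x);
  int y_inner = inner_bits (y);

  // A multi-line condition, wrapped in parentheses.
  return (x_inner == y_inner
          && x_inner > 0
          && x.outer_bits == y.outer_bits);
}

// Small usage check for the sketch above.
inline void
example_usage ()
{
  Operand a = { true, 128, 256 };
  Operand b = { true, 128, 256 };
  Operand c = { false, 0, 256 };
  assert (can_use_inner_p (a, b));
  assert (!can_use_inner_p (a, c));
}
```

The conditions checked here are placeholders; the point is only the shape of the refactoring.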
 
> Yeah, looks good.
> 
> Formatting nit though: multi-line conditions should be wrapped in (...),
> i.e.:
> 
>     return (...
>             && ...
>             && ...);
> 

Done.  v6 patch is based on trunk 20200601.
Bootstrapped and tested on aarch64-linux-gnu. 
Also bootstrapped on x86-64-linux-gnu with --enable-multilib (for building -m32 
x86 libgcc).
Regression testing on x86-64-linux-gnu looks good except for the following 
failures, which have been confirmed by the x86 devs: 

> FAIL: gcc.target/i386/avx512f-vcvtps2ph-2.c (test for excess errors)
> UNRESOLVED: gcc.target/i386/avx512f-vcvtps2ph-2.c compilation failed to 
> produce executable

Thanks,
Felix

pr95254-v6.diff
