> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Thursday, August 27, 2020 4:08 PM
> To: xiezhiheng <xiezhih...@huawei.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
>
> xiezhiheng <xiezhih...@huawei.com> writes:
> > I made two separate patches for these two groups for review purposes.
> >
> > Note: Patch for min/max intrinsics should be applied before the patch for
> > rounding intrinsics
> >
> > Bootstrapped and tested on aarch64 Linux platform.
>
> Thanks, LGTM.  Pushed to master.
>
> Richard

I have made the patch for the multiply and multiply-accumulate intrinsics.

Note that the bfmmlaq intrinsic is special because the underlying instruction
ignores the FPCR and does not update the FPSR exception status.
https://developer.arm.com/docs/ddi0596/h/simd-and-floating-point-instructions-alphabetic-order/bfmmla-bfloat16-floating-point-matrix-multiply-accumulate-into-2x2-matrix

So I gave it the AUTO_FP flag.

Bootstrapped and tested on aarch64 Linux platform.

Thanks,
Xie Zhiheng


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 75b62b590e2..8ca9746189a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2020-10-09 Zhiheng Xie <xiezhih...@huawei.com>
+ Nannan Zheng <zhengnan...@huawei.com>
+
+ * config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
+ for mul/mla/mls intrinsics.
+

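For readers unfamiliar with the flags, here is a minimal sketch of why AUTO_FP matters for bfmmlaq. It is a simplified stand-alone model, not the GCC source: the bit values, the helper names effective_flags and depends_on_fp_state, and the trapping_math parameter are all assumptions made for illustration. The idea is that a floating-point builtin is normally treated as reading the FPCR and raising FP exceptions, and AUTO_FP suppresses that default.

```c
#include <stdbool.h>

/* Hypothetical flag bits.  The names mirror the FLAG_* names mentioned in
   this thread, but the numeric values are assumptions for illustration.  */
enum {
  FLAG_NONE = 0u,
  FLAG_READ_FPCR = 1u << 0,            /* result depends on FPCR */
  FLAG_RAISE_FP_EXCEPTIONS = 1u << 1,  /* may update FPSR status bits */
  FLAG_AUTO_FP = 1u << 2,              /* suppress the implicit FP default */
  FLAG_FP = FLAG_READ_FPCR | FLAG_RAISE_FP_EXCEPTIONS
};

/* Assumed model: a builtin on a floating-point mode gets FLAG_FP added
   unless FLAG_AUTO_FP is set.  When trapping math is disabled, raised
   exceptions are treated as not user-visible and that bit is dropped.  */
static unsigned int
effective_flags (unsigned int flags, bool is_float_mode, bool trapping_math)
{
  if (!(flags & FLAG_AUTO_FP) && is_float_mode)
    flags |= FLAG_FP;
  if (!trapping_math)
    flags &= ~(unsigned int) FLAG_RAISE_FP_EXCEPTIONS;
  return flags;
}

/* True if the builtin must be treated as touching FP control or status.  */
static bool
depends_on_fp_state (unsigned int flags)
{
  return (flags & FLAG_FP) != 0;
}
```

Under this model, a floating-point builtin declared with FLAG_NONE still ends up depending on FP state, while one declared with FLAG_AUTO_FP (as proposed for bfmmlaq) does not, so the optimizers are free to treat it as having no FPCR/FPSR side effects.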
pr94442-v1.patch