Hi all,

This patch reimplements most of the vpadal intrinsics to use RTL builtins in the normal way. The ones that aren't converted are the int32x2_t -> int64x1_t ones, as the RTL pattern doesn't currently handle these modes. We don't have a V1DI mode, so the pattern would need to return either a DImode value or a V2DI one with the result in the first lane. That's not hard to do, but it would require a bit more refactoring, so we can do it separately later.
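For anyone not familiar with the operation, here is a rough scalar model of what vpadal_s8 computes (pairwise add of adjacent lanes, widened and accumulated). It is only a sketch for illustration: the name ref_vpadal_s8 is made up, it is plain portable C rather than anything in arm_neon.h or the patch, and it assumes the usual two's-complement wrap when converting back to int16_t (as GCC does). Note that eight 8-bit input lanes produce four 16-bit result lanes.

```c
#include <stdint.h>

/* Hypothetical scalar reference model of vpadal_s8 (SADALP Vd.4H, Vn.8B).
   Each 16-bit accumulator lane receives the sum of two adjacent
   sign-extended 8-bit input lanes.  This is NOT the arm_neon.h
   implementation, only an illustration of the lane arithmetic.  */
static void
ref_vpadal_s8 (int16_t acc[4], const int8_t in[8])
{
  for (int i = 0; i < 4; i++)
    {
      /* Widen before adding so the pair sum cannot overflow 8 bits.  */
      int32_t pair = (int32_t) in[2 * i] + (int32_t) in[2 * i + 1];

      /* Accumulate modulo 2^16, as the instruction does not saturate.
	 The final conversion to int16_t assumes two's-complement wrap.  */
      acc[i] = (int16_t) (uint16_t) ((uint16_t) acc[i] + (uint16_t) pair);
    }
}
```

With acc = {10, -5, 0, 100} and in = {127, 127, -128, -128, 1, -1, 3, 4}, the accumulator ends up as {264, -261, 0, 107}.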
This patch hopefully improves the status quo.

The new Vwhalf mode attribute is created because the existing Vwtype attribute maps V8QI wrongly (for this pattern) to "8h" as the suffix rather than "4h" as needed.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.

Thanks,
Kyrill

gcc/
	* config/aarch64/iterators.md (Vwhalf): New iterator.
	* config/aarch64/aarch64-simd.md (aarch64_<sur>adalp<mode>_3):
	Rename to...
	(aarch64_<sur>adalp<mode>): ... This.  Make more builtin-friendly.
	(<sur>sadv16qi): Adjust callsite of the above.
	* config/aarch64/aarch64-simd-builtins.def (sadalp, uadalp): New
	builtins.
	* config/aarch64/arm_neon.h (vpadal_s8): Reimplement using builtins.
	(vpadal_s16): Likewise.
	(vpadal_u8): Likewise.
	(vpadal_u16): Likewise.
	(vpadalq_s8): Likewise.
	(vpadalq_s16): Likewise.
	(vpadalq_s32): Likewise.
	(vpadalq_u8): Likewise.
	(vpadalq_u16): Likewise.
	(vpadalq_u32): Likewise.
vpadal-int.patch