Hi all,

This patch reimplements most of the vpadal intrinsics to use RTL builtins in 
the normal way.
The ones that aren't converted are the int32x2_t -> int64x1_t ones, as the RTL
pattern doesn't currently handle these modes.  We don't have a V1DI mode, so
the pattern would need to return either a DImode value or a V2DI one with the
result in the first lane.  It's not hard to do, but it would require a bit more
refactoring, so we can do it separately later.
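
For reference, here is a minimal scalar sketch of what the unconverted signed
case computes, to show why the result ends up as a single 64-bit element.  The
helper name model_sadalp_2s is hypothetical and only for illustration; it is
not part of the patch or of any GCC API.

```c
#include <stdint.h>

/* Scalar model of the unconverted signed case: the two 32-bit input lanes
   are widened, summed, and accumulated into one 64-bit lane.
   Hypothetical helper name, not a GCC API.  */
static int64_t
model_sadalp_2s (int64_t acc, const int32_t in[2])
{
  int64_t pair = (int64_t) in[0] + (int64_t) in[1];
  /* Add in unsigned arithmetic so overflow wraps instead of being
     undefined behaviour.  */
  return (int64_t) ((uint64_t) acc + (uint64_t) pair);
}
```

The single int64_t return value corresponds to the one-element result that
would have to be represented as DImode or as lane 0 of a V2DI.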

This patch hopefully improves the status quo.

The new Vwhalf mode attribute is needed because the existing Vwtype attribute
maps V8QI to "8h" as the suffix, which is wrong for this pattern: it needs "4h".
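
As an illustration of why "4h" is the right suffix for a V8QI input, here is a
small scalar sketch of the signed pairwise add-and-accumulate on eight byte
lanes: each adjacent pair collapses into one halfword lane, so eight inputs
give four outputs.  The name model_sadalp_8b is hypothetical and not part of
the patch.

```c
#include <stdint.h>

/* Scalar model of a signed pairwise add-and-accumulate-long on 8 x int8:
   each adjacent pair of bytes is widened, summed, and added to the matching
   16-bit accumulator lane.  8 byte lanes in, 4 halfword lanes out.
   Hypothetical helper name, not a GCC API.  */
static void
model_sadalp_8b (int16_t acc[4], const int8_t in[8])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t pair = (int32_t) in[2 * i] + (int32_t) in[2 * i + 1];
      /* Wrap to 16 bits (non-saturating accumulate).  */
      acc[i] = (int16_t) (uint16_t) ((uint16_t) acc[i] + (uint16_t) pair);
    }
}
```

The accumulator has half as many lanes as the input, each twice as wide, which
is the relationship Vwhalf is meant to express.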

Bootstrapped and tested on aarch64-none-linux-gnu.

Pushing to trunk.
Thanks,
Kyrill

gcc/
        * config/aarch64/iterators.md (Vwhalf): New mode attribute.
        * config/aarch64/aarch64-simd.md (aarch64_<sur>adalp<mode>_3): Rename
        to...
        (aarch64_<sur>adalp<mode>): ... This.  Make more builtin-friendly.
        (<sur>sadv16qi): Adjust callsite of the above.
        * config/aarch64/aarch64-simd-builtins.def (sadalp, uadalp): New
        builtins.
        * config/aarch64/arm_neon.h (vpadal_s8): Reimplement using builtins.
        (vpadal_s16): Likewise.
        (vpadal_u8): Likewise.
        (vpadal_u16): Likewise.
        (vpadalq_s8): Likewise.
        (vpadalq_s16): Likewise.
        (vpadalq_s32): Likewise.
        (vpadalq_u8): Likewise.
        (vpadalq_u16): Likewise.
        (vpadalq_u32): Likewise.

Attachment: vpadal-int.patch
