On Wed, Oct 26, 2016 at 12:11:44PM +0000, Wilco Dijkstra wrote:
> Add a SHA1H pattern with a V2SI input.  This avoids unnecessary
> DUPs when using intrinsics like vsha1h_u32 (vgetq_lane_u32 (x, 0)).

I think this is incorrect for big-endian. Element 0 of a vec_select on a
V4SImode value in big-endian is the high 32 bits (i.e. bits 96-127 of the
architected register), whereas SHA1H reads the low 32 bits (%s1). I think
you'd need two patterns, one as below for !BYTES_BIG_ENDIAN, and one
selecting element 3 for BYTES_BIG_ENDIAN.
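
A minimal, untested sketch of that split: the little-endian pattern is the
one from the patch with !BYTES_BIG_ENDIAN added to its condition, and a
big-endian twin selects element 3. The name aarch64_be_crypto_sha1hv4si is
only a placeholder chosen for illustration.

```
;; Little-endian: RTL element 0 is bits 0-31 of the register.
(define_insn "aarch64_crypto_sha1hv4si"
  [(set (match_operand:SI 0 "register_operand" "=w")
	(unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
		     (parallel [(const_int 0)]))]
	 UNSPEC_SHA1H))]
  "TARGET_SIMD && TARGET_CRYPTO && !BYTES_BIG_ENDIAN"
  "sha1h\\t%s0, %s1"
  [(set_attr "type" "crypto_sha1_fast")]
)

;; Big-endian: RTL element 3 is bits 0-31 of the register.
;; (Pattern name is hypothetical.)
(define_insn "aarch64_be_crypto_sha1hv4si"
  [(set (match_operand:SI 0 "register_operand" "=w")
	(unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
		     (parallel [(const_int 3)]))]
	 UNSPEC_SHA1H))]
  "TARGET_SIMD && TARGET_CRYPTO && BYTES_BIG_ENDIAN"
  "sha1h\\t%s0, %s1"
  [(set_attr "type" "crypto_sha1_fast")]
)
```

Both variants keep the same output template; only the selected element and
the condition differ.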

Thanks,
James

> ChangeLog:
> 2016-10-26  Wilco Dijkstra  <wdijk...@arm.com>
> 
>       * config/aarch64/aarch64-simd.md (aarch64_crypto_sha1hv4si): New pattern.
> --
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 9ce7f00050913aebd9f83ae9c4ce4ad469dd0d98..47f1740aa8bcab948607e00c2503a34aafb5ba0e 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -5705,6 +5705,16 @@
>    [(set_attr "type" "crypto_sha1_fast")]
>  )
>  
> +(define_insn "aarch64_crypto_sha1hv4si"
> +  [(set (match_operand:SI 0 "register_operand" "=w")
> +     (unspec:SI [(vec_select:SI (match_operand:V4SI 1 "register_operand" "w")
> +                  (parallel [(const_int 0)]))]
> +      UNSPEC_SHA1H))]
> +  "TARGET_SIMD && TARGET_CRYPTO"
> +  "sha1h\\t%s0, %s1"
> +  [(set_attr "type" "crypto_sha1_fast")]
> +)
> +
>  (define_insn "aarch64_crypto_sha1su1v4si"
>    [(set (match_operand:V4SI 0 "register_operand" "=w")
>          (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0")
> 
