Re: [PATCH] rs6000: Builtins for AES acceleration instructions [RFC02657]

Surya Kumari Jangala Tue, 17 Mar 2026 00:26:52 -0700

Hi Avinash,

On 05/03/26 4:24 pm, Avinash Jayakar wrote:
> From: Avinash Jayakar <[email protected]>
> 
> Hi,
> 
> Following patch depends on these 2 patches in the following order:
> 1. mcpu=future: https://gcc.gnu.org/pipermail/gcc-patches/2025-December/703739.html
> 2. future builtin infra: https://gcc.gnu.org/pipermail/gcc-patches/2026-March/709782.html
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu with no regressions.
> 
> Thanks and regards,
> Avinash Jayakar
> 
> rs6000: Builtins for AES acceleration instructions [RFC02657]
> 
> This patch adds new builtins for AES acceleration instructions which
> may or may not be supported in a future processor. Note, the names of
> the builtins may change in future.
> 
> The following new builtins for AES acceleration can be used with
> -mcpu=future option:
> __vector_pair  __builtin_aes_encrypt_paired (__vector_pair,
>                                            __vector_pair, uint2);
> __vector_pair  __builtin_aes128_encrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes192_encrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes256_encrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes_decrypt_paired (__vector_pair,
>                                            __vector_pair, uint2);
> __vector_pair  __builtin_aes128_decrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes192_decrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes256_decrypt_paired (__vector_pair,
>                                               __vector_pair);
> __vector_pair  __builtin_aes_genlastkey_paired (__vector_pair, uint2);
> __vector_pair  __builtin_aes128_genlastkey_paired (__vector_pair);
> __vector_pair  __builtin_aes192_genlastkey_paired (__vector_pair);
> __vector_pair  __builtin_aes256_genlastkey_paired (__vector_pair);
> vec_t __builtin_galois_field_mult (vec_t, vec_t, uint1);
> vec_t __builtin_galois_field_mult_gcm (vec_t, vec_t);
> vec_t __builtin_galois_field_mult_xts (vec_t, vec_t);
> 
> 2026-03-05  Avinash Jayakar  <[email protected]>
> 
> gcc/ChangeLog:


Empty line not needed here.

> 
>       * config/rs6000/crypto.md (unspec): Add unspec entries for all
>       AES acceleration instructions.

Please list each new UNSPEC entry by name in the ChangeLog.

>       (AESACC_base_code): New iterator for xxaesencp and xxaesdecp base
>       mnemonics.
>       (AESACC_code): New iterator for xxaesencp and xxaesdecp extended
>       mnemonics.
>       (AESGENLKP_code): New iterator for xxaesgenlkp extended mnemonics.
>       (AESGF_code): New iterator for xxgfmul128 extended mnemonics.
>       (AESACC_base_insn): New attribute iterator for xxaesencp and xxaesdecp
>       base mnemonics.
>       (AESACC_insn): New attribute iterator for xxaesencp and xxaesdecp
>       extended mnemonics.
>       (AESGENLKP_insn): New attribute iterator for xxaesgenlkp extended
>       mnemonics.
>       (AESGF_insn): New attribute iterator for xxgfmul128 extended mnemonics.
>       (<AESACC_base_insn>): New define_insn for xxaesencp and xxaesdecp base
>       mnemonics.
>       (<AESACC_insn>): New define_insn for xxaesencp and xxaesdecp extended
>       mnemonics.
>       (<AESGENLKP_insn>): New define_insn for xxaesgenlkp extended mnemonics.
>       (xxaesgenlkp): New define_insn for genlkp base mnemonic.
>       (<AESGF_insn>): New define_insn for xxgfmul128 extended mnemonics.
>       (xxgfmul128): New define_insn for xxgfmul128 base mnemonic.
>       * config/rs6000/rs6000-builtins.def: Added new builtin definitions for
>       AES acceleration.
> 
> gcc/testsuite/ChangeLog:
> 

Empty line not needed here.

>       * gcc.target/powerpc/aes-builtin-1.c: New test.
>       * gcc.target/powerpc/aes-builtin-2.c: New test.
> ---
>  gcc/config/rs6000/crypto.md                   | 102 +++++++++++++++++-
>  gcc/config/rs6000/rs6000-builtins.def         |  46 ++++++++
>  .../gcc.target/powerpc/aes-builtin-1.c        |  90 ++++++++++++++++
>  .../gcc.target/powerpc/aes-builtin-2.c        |  34 ++++++
>  4 files changed, 271 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/aes-builtin-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/aes-builtin-2.c
> 
> diff --git a/gcc/config/rs6000/crypto.md b/gcc/config/rs6000/crypto.md
> index f91791673c9..dcca2288f50 100644
> --- a/gcc/config/rs6000/crypto.md
> +++ b/gcc/config/rs6000/crypto.md
> @@ -35,7 +35,22 @@ (define_c_enum "unspec"
>     UNSPEC_VSBOX
>     UNSPEC_VSHASIGMA
>     UNSPEC_VPERMXOR
> -   UNSPEC_VPMSUM])
> +   UNSPEC_VPMSUM
> +   UNSPEC_XXAESENCP
> +   UNSPEC_XXAES128ENCP
> +   UNSPEC_XXAES192ENCP
> +   UNSPEC_XXAES256ENCP
> +   UNSPEC_XXAESDECP
> +   UNSPEC_XXAES128DECP
> +   UNSPEC_XXAES192DECP
> +   UNSPEC_XXAES256DECP
> +   UNSPEC_XXAESGENLKP
> +   UNSPEC_XXAES128GENLKP
> +   UNSPEC_XXAES192GENLKP
> +   UNSPEC_XXAES256GENLKP
> +   UNSPEC_XXGFMUL128
> +   UNSPEC_XXGFMUL128GCM
> +   UNSPEC_XXGFMUL128XTS])
>  
>  ;; Iterator for VPMSUM/VPERMXOR
>  (define_mode_iterator CR_mode [V16QI V8HI V4SI V2DI])
> @@ -62,6 +77,40 @@ (define_int_attr CR_insn [(UNSPEC_VCIPHER      "vcipher")
>                         (UNSPEC_VCIPHERLAST  "vcipherlast")
>                         (UNSPEC_VNCIPHERLAST "vncipherlast")])
>  
> +(define_int_iterator AESACC_base_code [UNSPEC_XXAESENCP

Please add some comments for the iterators.

> +     UNSPEC_XXAESDECP])
> +
> +(define_int_iterator AESACC_code [UNSPEC_XXAES128ENCP
> +     UNSPEC_XXAES192ENCP
> +     UNSPEC_XXAES256ENCP
> +     UNSPEC_XXAES128DECP
> +     UNSPEC_XXAES192DECP
> +     UNSPEC_XXAES256DECP])

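The formatting is incorrect. All the unspecs should have the same indentation:
each continuation line should line up under the first entry after the opening
bracket. Please see CR_code and CR_insn for reference.
Ditto for all the other int_iterator and int_attr definitions.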

> +
> +(define_int_attr AESACC_base_insn [(UNSPEC_XXAESENCP  "xxaesencp")
> +     (UNSPEC_XXAESDECP  "xxaesdecp")])
> +
> +(define_int_attr AESACC_insn [(UNSPEC_XXAES128ENCP  "xxaes128encp")
> +     (UNSPEC_XXAES192ENCP  "xxaes192encp")
> +     (UNSPEC_XXAES256ENCP  "xxaes256encp")
> +     (UNSPEC_XXAES128DECP  "xxaes128decp")
> +     (UNSPEC_XXAES192DECP  "xxaes192decp")
> +     (UNSPEC_XXAES256DECP  "xxaes256decp")])
> +
> +(define_int_iterator AESGENLKP_code [UNSPEC_XXAES128GENLKP
> +     UNSPEC_XXAES192GENLKP
> +     UNSPEC_XXAES256GENLKP])
> +
> +(define_int_attr AESGENLKP_insn [(UNSPEC_XXAES128GENLKP  "xxaes128genlkp")
> +     (UNSPEC_XXAES192GENLKP  "xxaes192genlkp")
> +     (UNSPEC_XXAES256GENLKP  "xxaes256genlkp")])
> +
> +(define_int_iterator AESGF_code [UNSPEC_XXGFMUL128GCM
> +     UNSPEC_XXGFMUL128XTS])
> +
> +(define_int_attr AESGF_insn [(UNSPEC_XXGFMUL128GCM  "xxgfmul128gcm")
> +     (UNSPEC_XXGFMUL128XTS  "xxgfmul128xts")])
> +
>  ;; 2 operand crypto instructions
>  (define_insn "crypto_<CR_insn>_<mode>"
>    [(set (match_operand:CR_vqdi 0 "register_operand" "=v")
> @@ -111,3 +160,54 @@ (define_insn "crypto_vshasigma<CR_char>"
>    "TARGET_CRYPTO"
>    "vshasigma<CR_char> %0,%1,%2,%3"
>    [(set_attr "type" "vecsimple")])
> +
> +;; AES acceleration instructions
> +
> +(define_insn "<AESACC_base_insn>"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> +    (unspec:OO [(match_operand:OO 1 "vsx_register_operand" "wa")

Incorrect indentation. The 'unspec' should align with the 'match_operand'
on the previous line.

> +             (match_operand:OO 2 "vsx_register_operand" "wa")
> +             (match_operand:SI 3 "const_0_to_3_operand" "n")]
> +            AESACC_base_code))]
> +  "TARGET_FUTURE"
> +  "<AESACC_base_insn> %x0,%x1,%x2,%3")
> +
> +(define_insn "<AESACC_insn>"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> +    (unspec:OO [(match_operand:OO 1 "vsx_register_operand" "wa")
> +             (match_operand:OO 2 "vsx_register_operand" "wa")]
> +            AESACC_code))]
> +  "TARGET_FUTURE"
> +  "<AESACC_insn> %x0,%x1,%x2")
> +
> +(define_insn "xxaesgenlkp"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> +    (unspec:OO [(match_operand:OO 1 "vsx_register_operand" "wa")
> +             (match_operand:SI 2 "const_0_to_3_operand" "n")]
> +            UNSPEC_XXAESGENLKP))]
> +  "TARGET_FUTURE"
> +  "xxaesgenlkp %x0,%x1,%2")
> +
> +(define_insn "<AESGENLKP_insn>"
> +  [(set (match_operand:OO 0 "vsx_register_operand" "=wa")
> +    (unspec:OO [(match_operand:OO 1 "vsx_register_operand" "wa")]
> +            AESGENLKP_code))]
> +  "TARGET_FUTURE"
> +  "<AESGENLKP_insn> %x0,%x1")
> +
> +(define_insn "xxgfmul128"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
> +    (unspec:V16QI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +                (match_operand:V16QI 2 "vsx_register_operand" "wa")
> +                (match_operand:SI 3 "const_0_to_1_operand" "n")]
> +               UNSPEC_XXGFMUL128))]
> +  "TARGET_FUTURE"
> +  "xxgfmul128 %x0,%x1,%x2,%3")
> +
> +(define_insn "<AESGF_insn>"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
> +    (unspec:V16QI [(match_operand:V16QI 1 "vsx_register_operand" "wa")
> +                (match_operand:V16QI 2 "vsx_register_operand" "wa")]
> +               AESGF_code))]
> +  "TARGET_FUTURE"
> +  "<AESGF_insn> %x0,%x1,%x2")
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 7e5a4fb96e7..bde6f2fe9ca 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3924,3 +3924,49 @@
>  
>    void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
>      STXVP nothing {mma,pair}
> +
> +[future]
> +  const v256 __builtin_aes_encrypt_paired (v256, v256, const int<2>);
> +    XXAESENCP xxaesencp {mma}

Why do we need the 'mma' attribute for these builtins?

> +
> +  const v256 __builtin_aes128_encrypt_paired (v256, v256);
> +    XXAES128ENCP xxaes128encp {mma}
> +
> +  const v256 __builtin_aes192_encrypt_paired (v256, v256);
> +    XXAES192ENCP xxaes192encp {mma}
> +
> +  const v256 __builtin_aes256_encrypt_paired (v256, v256);
> +    XXAES256ENCP xxaes256encp {mma}
> +
> +  const v256 __builtin_aes_decrypt_paired (v256, v256, const int<2>);
> +    XXAESDECP xxaesdecp {mma}
> +
> +  const v256 __builtin_aes128_decrypt_paired (v256, v256);
> +    XXAES128DECP xxaes128decp {mma}
> +
> +  const v256 __builtin_aes192_decrypt_paired (v256, v256);
> +    XXAES192DECP xxaes192decp {mma}
> +
> +  const v256 __builtin_aes256_decrypt_paired (v256, v256);
> +    XXAES256DECP xxaes256decp {mma}
> +
> +  const v256 __builtin_aes_genlastkey_paired (v256, const int<2>);
> +    XXAESGENLKP xxaesgenlkp {mma}
> +
> +  const v256 __builtin_aes128_genlastkey_paired (v256);
> +    XXAES128GENLKP xxaes128genlkp {mma}
> +
> +  const v256 __builtin_aes192_genlastkey_paired (v256);
> +    XXAES192GENLKP xxaes192genlkp {mma}
> +
> +  const v256 __builtin_aes256_genlastkey_paired (v256);
> +    XXAES256GENLKP xxaes256genlkp {mma}
> +
> +  const vuc __builtin_galois_field_mult (vuc, vuc, const int<1>);
> +    XXGFMUL128 xxgfmul128 {}
> +
> +  const vuc __builtin_galois_field_mult_gcm (vuc, vuc);
> +    XXGFMUL128GCM xxgfmul128gcm {}
> +
> +  const vuc __builtin_galois_field_mult_xts (vuc, vuc);
> +    XXGFMUL128XTS xxgfmul128xts {}
> diff --git a/gcc/testsuite/gcc.target/powerpc/aes-builtin-1.c b/gcc/testsuite/gcc.target/powerpc/aes-builtin-1.c
> new file mode 100644
> index 00000000000..aa5d61693b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/aes-builtin-1.c
> @@ -0,0 +1,90 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +
> +void
> +aes (__vector_pair *text, __vector_pair *key, __vector_pair *res)

Please rename the encryption test functions from aes* to aes*_enc_pair,
and rename aes*_dec to aes*_dec_pair.

Also, it would be good to have some tests where each test calls
only one builtin. This way, we can check whether each builtin generates
the expected code. For example, the test aes_dec() is good, but please
add another test which only calls __builtin_aes_genlastkey_paired().
Please put all these new tests in a separate file.
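
For instance, a standalone test for the genlastkey builtin could look roughly
like the sketch below. The function name and the expected count of 1 are my
assumptions, not something taken from the patch. The __has_builtin guard and
the HAVE_AES_GENLASTKEY macro are hypothetical and exist only so that the
sketch also compiles on toolchains without the future builtins; a real
testcase would drop the guard.

```c
/* { dg-do compile } */
/* { dg-options "-mdejagnu-cpu=future -O2" } */

/* Sketch only: HAVE_AES_GENLASTKEY is a hypothetical guard so that this
   file still compiles where the builtin is unavailable.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_aes_genlastkey_paired)
#  define HAVE_AES_GENLASTKEY 1
# endif
#endif
#ifndef HAVE_AES_GENLASTKEY
# define HAVE_AES_GENLASTKEY 0
#endif

#if HAVE_AES_GENLASTKEY
/* Calls exactly one builtin, so the scan below checks only its codegen.  */
void
aes_genlastkey_pair (__vector_pair *key, __vector_pair *res)
{
  __vector_pair k = *key;
  *res = __builtin_aes_genlastkey_paired (k, 0);
}
#endif

/* Assumed expectation: a single call should give a single xxaesgenlkp.  */
/* { dg-final { scan-assembler-times {\mxxaesgenlkp\M} 1 } } */
```

The same shape (one function, one builtin call, one scan-assembler-times
directive) should carry over to the other builtins.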

-Surya

> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair c = __builtin_aes_encrypt_paired (t, k, 0);
> +  c = __builtin_aes_encrypt_paired (c, k, 1);
> +  c = __builtin_aes_encrypt_paired (c, k, 2);
> +  *res = c;
> +}
> +void
> +aes128 (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair c = __builtin_aes128_encrypt_paired (t, k);
> +  *res = c;
> +}
> +void
> +aes192 (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair c = __builtin_aes192_encrypt_paired (t, k);
> +  *res = c;
> +}
> +void
> +aes256 (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair c = __builtin_aes256_encrypt_paired (t, k);
> +  *res = c;
> +}
> +void
> +aes_dec (__vector_pair *text, __vector_pair *key, __vector_pair *res, int a)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair lk = __builtin_aes_genlastkey_paired (k, 0);
> +  __vector_pair c = __builtin_aes_decrypt_paired (t, lk, 0);
> +  lk = __builtin_aes_genlastkey_paired (k, 1);
> +  c = __builtin_aes_decrypt_paired (c, lk, 1);
> +  lk = __builtin_aes_genlastkey_paired (k, 2);
> +  c = __builtin_aes_decrypt_paired (c, lk, 2);
> +  *res = c;
> +}
> +void
> +aes128_dec (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair lk = __builtin_aes128_genlastkey_paired (k);
> +  __vector_pair c = __builtin_aes128_decrypt_paired (t, lk);
> +  *res = c;
> +}
> +void
> +aes192_dec (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair lk = __builtin_aes192_genlastkey_paired (k);
> +  __vector_pair c = __builtin_aes192_decrypt_paired (t, lk);
> +  *res = c;
> +}
> +void
> +aes256_dec (__vector_pair *text, __vector_pair *key, __vector_pair *res)
> +{
> +  __vector_pair t = *text;
> +  __vector_pair k = *key;
> +  __vector_pair lk = __builtin_aes256_genlastkey_paired (k);
> +  __vector_pair c = __builtin_aes256_decrypt_paired (t, lk);
> +  *res = c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxxaesencp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes128encp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes192encp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes256encp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaesdecp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes128decp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes192decp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes256decp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaesgenlkp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes128genlkp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes192genlkp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxaes256genlkp\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/aes-builtin-2.c b/gcc/testsuite/gcc.target/powerpc/aes-builtin-2.c
> new file mode 100644
> index 00000000000..98f16ad0fe8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/aes-builtin-2.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
> +
> +typedef unsigned char vec_t __attribute__((vector_size(16)));
> +
> +void
> +gfmul (vec_t *a, vec_t *b, vec_t *res)
> +{
> +  vec_t A = *a;
> +  vec_t B = *b;
> +  vec_t R = __builtin_galois_field_mult (A, B, 0);
> +  R = __builtin_galois_field_mult (R, B, 1);
> +  *res = R;
> +}
> +void
> +gfmul_gcm (vec_t *a, vec_t *b, vec_t *res)
> +{
> +  vec_t A = *a;
> +  vec_t B = *b;
> +  vec_t R = __builtin_galois_field_mult_gcm (A, B);
> +  *res = R;
> +}
> +void
> +gfmul_xts (vec_t *a, vec_t *b, vec_t *res)
> +{
> +  vec_t A = *a;
> +  vec_t B = *b;
> +  vec_t R = __builtin_galois_field_mult_xts (A, B);
> +  *res = R;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxxgfmul128\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxxgfmul128gcm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxgfmul128xts\M} 1 } } */
