Hi,
On 13/05/26 1:47 PM, Manjunath S Matti wrote:
> The changes have been bootstrapped and regression tested on
> powerpc64le-linux.
>
> Changes from V1:
> * Commit message formatted to 72 character length.
> * Changed to 'Future ISA' instead of 'ISA 3.2'.
> * Listed down all the new UNSPEC, instruction patterns and builtin names
> in the Changelog entries.
> * Removed the type and size attribute for all the 21 new instruction
> patterns.
> * Removed detailed description for the builtins in extend.texi.
>
> This patch implements builtin support for 21 new ECC (Elliptic Curve
> Cryptography) acceleration instructions defined in RFC02669 for Power
> future ISA. These instructions are designed to accelerate P-256 and
> P-384 elliptic curve operations on POWER future processors. These
> instructions may or may not be supported in a future processor. Note,
> the names of the builtins may change in future.
>
> The instructions are organized into five categories:
>
> 1. Multiply-Multiply operations (3 instructions):
> - xxmulmul: Multiply-multiply with scaling (scale values 0-6)
> - xxmulmulhiadd: Multiply-multiply with high add and accumulator
> - xxmulmulloadd: Multiply-multiply low add with accumulator
>
> 2. Scaled Multiply-Sum operations (3 instructions):
> - xxssumudm: Scaled sum unsigned doubleword modulo
> - xxssumudmc: Scaled sum unsigned doubleword modulo carry
> - xxssumudmcext: Extended version with separate accumulator
> (prefixed)
>
> 3. Quadword Add/Subtract operations (4 instructions):
> - xsaddadduqm: Add add unsigned quadword modulo
> - xsaddaddsuqm: Add add scaled unsigned quadword modulo
> - xsaddsubuqm: Add subtract unsigned quadword modulo
> - xsaddsubsuqm: Add subtract scaled unsigned quadword modulo
>
> 4. Merge operations (4 instructions):
> - xsmerge2t1uqm, xsmerge2t2uqm, xsmerge2t3uqm: 2-operand merge
> - xsmerge3t1uqm: 3-operand merge with accumulator
>
> 5. Rebase operations (7 instructions):
> - xsrebase2t1uqm through xsrebase2t4uqm: 2-operand rebase
> - xsrebase3t1uqm through xsrebase3t3uqm: 3-operand rebase with
> accumulator
>
> All instructions operate on 128-bit unsigned integers
> (vector unsigned __int128) and use VSX registers.
> The xxssumudmcext instruction is a prefixed instruction (8 bytes),
> while all others use the standard XX3 form (4 bytes).
>
> 2026-05-13 Manjunath Matti <[email protected]>
>
> gcc/ChangeLog:
> * config/rs6000/predicates.md (const_0_to_6_operand): New predicate.
> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmulmul): Add new
> builtin definitions under [future] stanza.
There are 9 characters before the text on the line above. There should be
only a single tab (8 columns wide). Ditto for the rest of the ChangeLog.
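As a rough illustration of the rule, the check can be sketched in C. The
helper name changelog_indent_ok is hypothetical and not part of any GCC
tooling; it only tests that a line starts with exactly one tab and carries
no further leading whitespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if LINE is indented by exactly one
   tab character, i.e. a tab followed by a non-blank character.  */
static bool
changelog_indent_ok (const char *line)
{
  if (line == NULL || line[0] != '\t')
    return false;
  /* Anything after the tab must be entry text, not more indentation.  */
  return line[1] != '\0' && line[1] != ' ' && line[1] != '\t';
}
```

A line that starts with a tab followed by a space, or with spaces only,
is rejected by this sketch.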
> (__builtin_vsx_xxmulmulhiadd): Likewise.
> (__builtin_vsx_xxmulmulloadd): Likewise.
> (__builtin_vsx_xxssumudm): Likewise.
> (__builtin_vsx_xxssumudmc): Likewise.
> (__builtin_vsx_xxssumudmcext): Likewise.
> (__builtin_vsx_xsaddadduqm): Likewise.
> (__builtin_vsx_xsaddaddsuqm): Likewise.
> (__builtin_vsx_xsaddsubuqm): Likewise.
> (__builtin_vsx_xsaddsubsuqm): Likewise.
> (__builtin_vsx_xsmerge2t1uqm): Likewise.
> (__builtin_vsx_xsmerge2t2uqm): Likewise.
> (__builtin_vsx_xsmerge2t3uqm): Likewise.
> (__builtin_vsx_xsmerge3t1uqm): Likewise.
> (__builtin_vsx_xsrebase2t1uqm): Likewise.
> (__builtin_vsx_xsrebase2t2uqm): Likewise.
> (__builtin_vsx_xsrebase2t3uqm): Likewise.
> (__builtin_vsx_xsrebase2t4uqm): Likewise.
> (__builtin_vsx_xsrebase3t1uqm): Likewise.
> (__builtin_vsx_xsrebase3t2uqm): Likewise.
> (__builtin_vsx_xsrebase3t3uqm): Likewise.
> * config/rs6000/vsx.md (UNSPEC_XXMULMUL): Add UNSPEC entry.
> (UNSPEC_XXMULMULHIADD): Likewise.
> (UNSPEC_XXMULMULLOADD): Likewise.
> (UNSPEC_XXSSUMUDM): Likewise.
> (UNSPEC_XXSSUMUDMC): Likewise.
> (UNSPEC_XXSSUMUDMCEXT): Likewise.
> (UNSPEC_XSADDADDUQM): Likewise.
> (UNSPEC_XSADDADDSUQM): Likewise.
> (UNSPEC_XSADDSUBUQM): Likewise.
> (UNSPEC_XSADDSUBSUQM): Likewise.
> (UNSPEC_XSMERGE2T1UQM): Likewise.
> (UNSPEC_XSMERGE2T2UQM): Likewise.
> (UNSPEC_XSMERGE2T3UQM): Likewise.
> (UNSPEC_XSMERGE3T1UQM): Likewise.
> (UNSPEC_XSREBASE2T1UQM): Likewise.
> (UNSPEC_XSREBASE2T2UQM): Likewise.
> (UNSPEC_XSREBASE2T3UQM): Likewise.
> (UNSPEC_XSREBASE2T4UQM): Likewise.
> (UNSPEC_XSREBASE3T1UQM): Likewise.
> (UNSPEC_XSREBASE3T2UQM): Likewise.
> (UNSPEC_XSREBASE3T3UQM): Likewise.
> (<vsx_xxmulmul>): New define_insn for Multiply-multiply with scaling.
Just 'New define_insn.' is enough.
> (<vsx_xxmulmulhiadd>): New define_insn for Multiply-Multiply with high
> add and accumulator.
> (<vsx_xxmulmulloadd>): New define_insn for Multiply-Multiply low add
> with accumulator.
> (<vsx_xxssumudm>): New define_insn for Scaled sum unsigned doubleword
> modulo.
> (<vsx_xxssumudmc>): New define_insn for Scaled sum unsigned doubleword
> modulo carry.
> (<vsx_xxssumudmcext>): New define_insn for Scaled sum unsigned
> doubleword modulo carry extended.
> (<vsx_xsaddadduqm>): New define_insn for Add add unsigned quadword
> modulo.
> (<vsx_xsaddaddsuqm>): New define_insn for Add add scaled unsigned
> quadword modulo.
> (<vsx_xsaddsubuqm>): New define_insn for Add subtract unsigned
> quadword modulo.
> (<vsx_xsaddsubsuqm>): New define_insn for Add subtract scaled unsigned
> quadword modulo.
> (<vsx_xsmerge2t1uqm>): New define_insn for Merge type 1, 2-operand.
> (<vsx_xsmerge2t2uqm>): New define_insn for Merge type 2, 2-operand.
> (<vsx_xsmerge2t3uqm>): New define_insn for Merge type 3, 2-operand.
> (<vsx_xsmerge3t1uqm>): New define_insn for Merge type 1, 3-operand.
> (<vsx_xsrebase2t1uqm>): New define_insn for Rebase type 1, 2-operand.
> (<vsx_xsrebase2t2uqm>): New define_insn for Rebase type 2, 2-operand.
> (<vsx_xsrebase2t3uqm>): New define_insn for Rebase type 3, 2-operand.
> (<vsx_xsrebase2t4uqm>): New define_insn for Rebase type 4, 2-operand.
> (<vsx_xsrebase3t1uqm>): New define_insn for Rebase type 1, 3-operand.
> (<vsx_xsrebase3t2uqm>): New define_insn for Rebase type 2, 3-operand.
> (<vsx_xsrebase3t3uqm>): New define_insn for Rebase type 3, 3-operand.
> * doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions): Add
> documentation for ECC cryptography builtins available on
> future ISA.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/powerpc/ecc-builtin-1.c: New test for ECC builtins.
>
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 54dbc8bcc95..4162c22f8f6 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -312,6 +312,11 @@
> (and (match_code "const_int")
> (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
>
> +;; Match op = 0..6.
> +(define_predicate "const_0_to_6_operand"
> + (and (match_code "const_int")
> + (match_test "IN_RANGE (INTVAL (op), 0, 6)")))
> +
> ;; Match op = 0..7.
> (define_predicate "const_0_to_7_operand"
> (and (match_code "const_int")
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 0d1529b71d4..24c0bc37a4d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3970,3 +3970,68 @@
>
> const vuc __builtin_galois_field_mult_xts (vuc, vuc);
> XXGFMUL128XTS xxgfmul128xts {}
> +
> + const vuq __builtin_vsx_xxmulmul (vull, vull, const int<0,6>);
> + XXMULMUL vsx_xxmulmul {}
> +
> + const vuq __builtin_vsx_xxmulmulhiadd (vuq, vull, vull, const int<1>, \
> + const int<1>, const int<1>);
> + XXMULMULHIADD vsx_xxmulmulhiadd {}
> +
> + const vuq __builtin_vsx_xxmulmulloadd (vuq, vull, vull, const int<1>, \
> + const int<1>);
> + XXMULMULLOADD vsx_xxmulmulloadd {}
> +
> + const vuq __builtin_vsx_xxssumudm (vuq, vull, vull, const int<1>);
> + XXSSUMUDM vsx_xxssumudm {}
> +
> + const vuq __builtin_vsx_xxssumudmc (vuq, vull, vull, const int<1>);
> + XXSSUMUDMC vsx_xxssumudmc {}
> +
> + const vuq __builtin_vsx_xxssumudmcext (vull, vull, vuq, const int<1>);
> + XXSSUMUDMCEXT vsx_xxssumudmcext {}
> +
> + const vuq __builtin_vsx_xsaddadduqm (vuq, vuq, vuq);
> + XSADDADDUQM vsx_xsaddadduqm {}
> +
> + const vuq __builtin_vsx_xsaddaddsuqm (vuq, vuq, vuq);
> + XSADDADDSUQM vsx_xsaddaddsuqm {}
> +
> + const vuq __builtin_vsx_xsaddsubuqm (vuq, vuq, vuq);
> + XSADDSUBUQM vsx_xsaddsubuqm {}
> +
> + const vuq __builtin_vsx_xsaddsubsuqm (vuq, vuq, vuq);
> + XSADDSUBSUQM vsx_xsaddsubsuqm {}
> +
> + const vuq __builtin_vsx_xsmerge2t1uqm (vuq, vuq);
> + XSMERGE2T1UQM vsx_xsmerge2t1uqm {}
> +
> + const vuq __builtin_vsx_xsmerge2t2uqm (vuq, vuq);
> + XSMERGE2T2UQM vsx_xsmerge2t2uqm {}
> +
> + const vuq __builtin_vsx_xsmerge2t3uqm (vuq, vuq);
> + XSMERGE2T3UQM vsx_xsmerge2t3uqm {}
> +
> + const vuq __builtin_vsx_xsmerge3t1uqm (vuq, vuq, vuq);
> + XSMERGE3T1UQM vsx_xsmerge3t1uqm {}
> +
> + const vuq __builtin_vsx_xsrebase2t1uqm (vuq, vuq);
> + XSREBASE2T1UQM vsx_xsrebase2t1uqm {}
> +
> + const vuq __builtin_vsx_xsrebase2t2uqm (vuq, vuq);
> + XSREBASE2T2UQM vsx_xsrebase2t2uqm {}
> +
> + const vuq __builtin_vsx_xsrebase2t3uqm (vuq, vuq);
> + XSREBASE2T3UQM vsx_xsrebase2t3uqm {}
> +
> + const vuq __builtin_vsx_xsrebase2t4uqm (vuq, vuq);
> + XSREBASE2T4UQM vsx_xsrebase2t4uqm {}
> +
> + const vuq __builtin_vsx_xsrebase3t1uqm (vuq, vuq, vuq);
> + XSREBASE3T1UQM vsx_xsrebase3t1uqm {}
> +
> + const vuq __builtin_vsx_xsrebase3t2uqm (vuq, vuq, vuq);
> + XSREBASE3T2UQM vsx_xsrebase3t2uqm {}
> +
> + const vuq __builtin_vsx_xsrebase3t3uqm (vuq, vuq, vuq);
> + XSREBASE3T3UQM vsx_xsrebase3t3uqm {}
The above built-in functions should be placed in the [future_vsx] stanza.
Jeevitha has a patch for [future_vsx].
Also, why is the return value declared as 'const'?
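For context, one possible reading, stated as an assumption rather than
something this thread confirms: the leading 'const' in a builtin stanza may
be a function-kind marker, comparable to __attribute__((const)), rather
than a qualifier on the return type. The C sketch below only illustrates
that distinction; add_const_fn and add_const_ret are hypothetical names.

```c
#include <assert.h>

/* Function attribute: the result depends only on the arguments and the
   function has no side effects, so repeated calls may be merged.  */
static int add_const_fn (int a, int b) __attribute__ ((const));

static int
add_const_fn (int a, int b)
{
  return a + b;
}

/* Qualifier on the return type: it has no effect on a by-value result
   and compilers may warn that it is ignored.  */
static const int
add_const_ret (int a, int b)
{
  return a + b;
}
```

Both functions compute the same value; only the first makes a promise the
optimizer can use.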
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index cfad9b8c6d5..af02a1979e8 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -369,6 +369,27 @@
> UNSPEC_XXSPLTI32DX
> UNSPEC_XXBLEND
> UNSPEC_XXPERMX
> + UNSPEC_XXMULMUL
> + UNSPEC_XXMULMULHIADD
> + UNSPEC_XXMULMULLOADD
> + UNSPEC_XXSSUMUDM
> + UNSPEC_XXSSUMUDMC
> + UNSPEC_XXSSUMUDMCEXT
> + UNSPEC_XSADDADDUQM
> + UNSPEC_XSADDADDSUQM
> + UNSPEC_XSADDSUBUQM
> + UNSPEC_XSADDSUBSUQM
> + UNSPEC_XSMERGE2T1UQM
> + UNSPEC_XSMERGE2T2UQM
> + UNSPEC_XSMERGE2T3UQM
> + UNSPEC_XSMERGE3T1UQM
> + UNSPEC_XSREBASE2T1UQM
> + UNSPEC_XSREBASE2T2UQM
> + UNSPEC_XSREBASE2T3UQM
> + UNSPEC_XSREBASE2T4UQM
> + UNSPEC_XSREBASE3T1UQM
> + UNSPEC_XSREBASE3T2UQM
> + UNSPEC_XSREBASE3T3UQM
> ])
>
> (define_int_iterator XVCVBF16 [UNSPEC_VSX_XVCVSPBF16
> @@ -6807,3 +6828,218 @@
> emit_insn (gen_vsx_extract_v2di (dest_op1, src_op, const1_rtx));
> DONE;
> })
> +
> +
> +;; ECC (Elliptic Curve Cryptography) acceleration instructions for Power future
> +;; These instructions support P-256 and P-384 elliptic curve operations
> +
> +;; xxmulmul - Multiply-Multiply with scaling
> +(define_insn "vsx_xxmulmul"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "wa")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:SI 3 "const_0_to_6_operand" "n")]
This should be match_operand:QI.
The same applies everywhere else a const_0_* predicate is used.
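For reference, the range tests behind the const_0_to_* predicates can be
modelled in plain C. This is only a sketch: in_range, const_0_to_6_ok,
const_0_to_1_ok and fits_in_8_bits are hypothetical stand-ins, not the real
predicates. The last helper merely shows that every accepted value fits in
an 8-bit quantity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range test using a single unsigned comparison: values below
   LOWER wrap around to a large unsigned number and fail the test.  */
static bool
in_range (long value, long lower, long upper)
{
  return (unsigned long) value - (unsigned long) lower
         <= (unsigned long) upper - (unsigned long) lower;
}

/* Stand-ins for the 0..6 and 0..1 immediate checks used by the patch.  */
static bool const_0_to_6_ok (long v) { return in_range (v, 0, 6); }
static bool const_0_to_1_ok (long v) { return in_range (v, 0, 1); }

/* True if V is representable in a signed 8-bit integer.  */
static bool fits_in_8_bits (long v) { return v >= INT8_MIN && v <= INT8_MAX; }
```

Under these assumptions, 0 through 6 pass the first check, 7 and -1 do
not, and all accepted values are representable in 8 bits.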
> + UNSPEC_XXMULMUL))]
> + "TARGET_FUTURE"
> + "xxmulmul %x0,%x1,%x2,%3")
> +
> +;; xxmulmulhiadd - Multiply-Multiply with high add and accumulator
> +(define_insn "vsx_xxmulmulhiadd"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:V2DI 3 "vsx_register_operand" "wa")
> + (match_operand:SI 4 "const_0_to_1_operand" "n")
> + (match_operand:SI 5 "const_0_to_1_operand" "n")
> + (match_operand:SI 6 "const_0_to_1_operand" "n")]
> + UNSPEC_XXMULMULHIADD))]
> + "TARGET_FUTURE"
> + "xxmulmulhiadd %x0,%x2,%x3,%4,%5,%6")
> +
> +;; xxmulmulloadd - Multiply-Multiply low add with accumulator
> +(define_insn "vsx_xxmulmulloadd"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:V2DI 3 "vsx_register_operand" "wa")
> + (match_operand:SI 4 "const_0_to_1_operand" "n")
> + (match_operand:SI 5 "const_0_to_1_operand" "n")]
> + UNSPEC_XXMULMULLOADD))]
> + "TARGET_FUTURE"
> + "xxmulmulloadd %x0,%x2,%x3,%4,%5")
> +
> +;; xxssumudm - Scaled sum unsigned doubleword modulo
> +(define_insn "vsx_xxssumudm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:V2DI 3 "vsx_register_operand" "wa")
> + (match_operand:SI 4 "const_0_to_1_operand" "n")]
> + UNSPEC_XXSSUMUDM))]
> + "TARGET_FUTURE"
> + "xxssumudm %x0,%x2,%x3,%4")
> +
> +;; xxssumudmc - Scaled sum unsigned doubleword modulo carry
> +(define_insn "vsx_xxssumudmc"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:V2DI 3 "vsx_register_operand" "wa")
> + (match_operand:SI 4 "const_0_to_1_operand" "n")]
> + UNSPEC_XXSSUMUDMC))]
> + "TARGET_FUTURE"
> + "xxssumudmc %x0,%x2,%x3,%4")
> +
> +;; xxssumudmcext - Scaled sum unsigned doubleword modulo carry extended (prefixed)
> +(define_insn "vsx_xxssumudmcext"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "wa")
> + (match_operand:V2DI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")
> + (match_operand:SI 4 "const_0_to_1_operand" "n")]
> + UNSPEC_XXSSUMUDMCEXT))]
> + "TARGET_FUTURE"
> + "xxssumudmcext %x0,%x1,%x2,%x3,%4")
> +
> +;; xsaddadduqm - Add add unsigned quadword modulo
> +(define_insn "vsx_xsaddadduqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSADDADDUQM))]
> + "TARGET_FUTURE"
> + "xsaddadduqm %x0,%x2,%x3")
> +
> +;; xsaddaddsuqm - Add add scaled unsigned quadword modulo
> +(define_insn "vsx_xsaddaddsuqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSADDADDSUQM))]
> + "TARGET_FUTURE"
> + "xsaddaddsuqm %x0,%x2,%x3")
> +
> +;; xsaddsubuqm - Add subtract unsigned quadword modulo
> +(define_insn "vsx_xsaddsubuqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSADDSUBUQM))]
> + "TARGET_FUTURE"
> + "xsaddsubuqm %x0,%x2,%x3")
> +
> +;; xsaddsubsuqm - Add subtract scaled unsigned quadword modulo
> +(define_insn "vsx_xsaddsubsuqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSADDSUBSUQM))]
> + "TARGET_FUTURE"
> + "xsaddsubsuqm %x0,%x2,%x3")
> +
> +;; xsmerge2t1uqm - Merge type 1 (2-operand)
> +(define_insn "vsx_xsmerge2t1uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSMERGE2T1UQM))]
> + "TARGET_FUTURE"
> + "xsmerge2t1uqm %x0,%x1,%x2")
> +
> +;; xsmerge2t2uqm - Merge type 2 (2-operand)
> +(define_insn "vsx_xsmerge2t2uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSMERGE2T2UQM))]
> + "TARGET_FUTURE"
> + "xsmerge2t2uqm %x0,%x1,%x2")
> +
> +;; xsmerge2t3uqm - Merge type 3 (2-operand)
> +(define_insn "vsx_xsmerge2t3uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSMERGE2T3UQM))]
> + "TARGET_FUTURE"
> + "xsmerge2t3uqm %x0,%x1,%x2")
> +
> +;; xsmerge3t1uqm - Merge type 1 (3-operand with accumulator)
> +(define_insn "vsx_xsmerge3t1uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSMERGE3T1UQM))]
> + "TARGET_FUTURE"
> + "xsmerge3t1uqm %x0,%x2,%x3")
> +
> +;; xsrebase2t1uqm - Rebase type 1 (2-operand)
> +(define_insn "vsx_xsrebase2t1uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE2T1UQM))]
> + "TARGET_FUTURE"
> + "xsrebase2t1uqm %x0,%x1,%x2")
> +
> +;; xsrebase2t2uqm - Rebase type 2 (2-operand)
> +(define_insn "vsx_xsrebase2t2uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE2T2UQM))]
> + "TARGET_FUTURE"
> + "xsrebase2t2uqm %x0,%x1,%x2")
> +
> +;; xsrebase2t3uqm - Rebase type 3 (2-operand)
> +(define_insn "vsx_xsrebase2t3uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE2T3UQM))]
> + "TARGET_FUTURE"
> + "xsrebase2t3uqm %x0,%x1,%x2")
> +
> +;; xsrebase2t4uqm - Rebase type 4 (2-operand)
> +(define_insn "vsx_xsrebase2t4uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "wa")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE2T4UQM))]
> + "TARGET_FUTURE"
> + "xsrebase2t4uqm %x0,%x1,%x2")
> +
> +;; xsrebase3t1uqm - Rebase type 1 (3-operand with accumulator)
> +(define_insn "vsx_xsrebase3t1uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE3T1UQM))]
> + "TARGET_FUTURE"
> + "xsrebase3t1uqm %x0,%x2,%x3")
> +
> +;; xsrebase3t2uqm - Rebase type 2 (3-operand with accumulator)
> +(define_insn "vsx_xsrebase3t2uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE3T2UQM))]
> + "TARGET_FUTURE"
> + "xsrebase3t2uqm %x0,%x2,%x3")
> +
> +;; xsrebase3t3uqm - Rebase type 3 (3-operand with accumulator)
> +(define_insn "vsx_xsrebase3t3uqm"
> + [(set (match_operand:V1TI 0 "vsx_register_operand" "=wa")
> + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "0")
> + (match_operand:V1TI 2 "vsx_register_operand" "wa")
> + (match_operand:V1TI 3 "vsx_register_operand" "wa")]
> + UNSPEC_XSREBASE3T3UQM))]
> + "TARGET_FUTURE"
> + "xsrebase3t3uqm %x0,%x2,%x3")
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 42f83b98a05..8cf4ed5d9f8 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -26737,6 +26737,107 @@ vec_t __builtin_galois_field_mult_gcm (vec_t, vec_t);
> vec_t __builtin_galois_field_mult_xts (vec_t, vec_t);
> @end smallexample
>
> +@subsubheading PowerPC Elliptic Curve Cryptography Assist Built-in Functions
> +
> +The following additional built-in functions are available for the
Please say 'may be available' rather than 'are available'.
> +PowerPC family of processors, on Future ISA (@option{-mcpu=future}).
Please remove -mcpu=future.
> +These instructions provide hardware acceleration for Elliptic Curve
> +Cryptography (ECC) operations, specifically optimized for P-256 and P-384
> +elliptic curves.
> +
> +All ECC built-in functions operate on 128-bit unsigned integers
This is not entirely true: instructions such as xxmulmul operate on 64-bit
inputs. It is better to just remove this line.
> +(@code{vector unsigned __int128}) and use VSX registers. The functions
> +are organized into five categories: multiply-multiply operations, scaled
> +multiply-sum operations, quadword add/subtract operations, merge operations,
> +and rebase operations.
> +
> +@smallexample
> +vector unsigned __int128
> +__builtin_vsx_xxmulmul (vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + const int @var{scale});
The other function prototypes specify only the argument types, not the
variable names. Please follow the same convention here.
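For example, the first prototype would then read roughly as follows. The
types are taken from the patch as posted; only the @var names are dropped.

```c
vector unsigned __int128
__builtin_vsx_xxmulmul (vector unsigned long long,
                        vector unsigned long long,
                        const int);
```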
> +vector unsigned __int128
> +__builtin_vsx_xxmulmulhiadd (vector unsigned __int128 @var{acc},
> + vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + const int @var{m1},
> + const int @var{m2},
> + const int @var{m3});
> +vector unsigned __int128
> +__builtin_vsx_xxmulmulloadd (vector unsigned __int128 @var{acc},
> + vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + const int @var{m1},
> + const int @var{m2});
> +vector unsigned __int128
> +__builtin_vsx_xxssumudm (vector unsigned __int128 @var{acc},
> + vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + const int @var{scale});
> +vector unsigned __int128
> +__builtin_vsx_xxssumudmc (vector unsigned __int128 @var{acc},
> + vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + const int @var{scale});
> +vector unsigned __int128
> +__builtin_vsx_xxssumudmcext (vector unsigned long long @var{a},
> + vector unsigned long long @var{b},
> + vector unsigned __int128 @var{c},
> + const int @var{scale});
> +vector unsigned __int128
> +__builtin_vsx_xsaddadduqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsaddaddsuqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsaddsubuqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsaddsubsuqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsmerge2t1uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsmerge2t2uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsmerge2t3uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsmerge3t1uqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase2t1uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase2t2uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase2t3uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase2t4uqm (vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase3t1uqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase3t2uqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +vector unsigned __int128
> +__builtin_vsx_xsrebase3t3uqm (vector unsigned __int128 @var{acc},
> + vector unsigned __int128 @var{a},
> + vector unsigned __int128 @var{b});
> +@end smallexample
>
> @node PowerPC Hardware Transactional Memory Built-in Functions
> @subsection PowerPC Hardware Transactional Memory Built-in Functions
> diff --git a/gcc/testsuite/gcc.target/powerpc/ecc-builtin-1.c b/gcc/testsuite/gcc.target/powerpc/ecc-builtin-1.c
> new file mode 100644
> index 00000000000..e0b7dd63ae7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/ecc-builtin-1.c
> @@ -0,0 +1,202 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=future -O2" } */
A {dg-require-effective-target powerpc_future_ok} directive is needed here.
Jeevitha has a patch for this.
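Assuming the powerpc_future_ok effective-target keyword lands with that
patch, the test header might then look something like this sketch. The
placement of the new directive relative to dg-options is illustrative only.

```c
/* { dg-do compile } */
/* { dg-require-effective-target powerpc_future_ok } */
/* { dg-options "-mdejagnu-cpu=future -O2" } */
```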
-Surya
> +
> +/* Test the ECC (Elliptic Curve Cryptography) acceleration builtins for Power future.
> + These instructions support P-256 and P-384 elliptic curve operations. */
> +
> +#include <altivec.h>
> +
> +/* Test xxmulmul - Multiply-Multiply with scaling */
> +vector unsigned __int128
> +test_xxmulmul (vector unsigned long long a, vector unsigned long long b)
> +{
> + return __builtin_vsx_xxmulmul (a, b, 3);
> +}
> +
> +/* Test xxmulmulhiadd - Multiply-Multiply with high add and accumulator */
> +vector unsigned __int128
> +test_xxmulmulhiadd (vector unsigned __int128 acc,
> + vector unsigned long long a,
> + vector unsigned long long b)
> +{
> + return __builtin_vsx_xxmulmulhiadd (acc, a, b, 1, 0, 1);
> +}
> +
> +/* Test xxmulmulloadd - Multiply-Multiply low add with accumulator */
> +vector unsigned __int128
> +test_xxmulmulloadd (vector unsigned __int128 acc,
> + vector unsigned long long a,
> + vector unsigned long long b)
> +{
> + return __builtin_vsx_xxmulmulloadd (acc, a, b, 1, 0);
> +}
> +
> +/* Test xxssumudm - Scaled sum unsigned doubleword modulo */
> +vector unsigned __int128
> +test_xxssumudm (vector unsigned __int128 acc,
> + vector unsigned long long a,
> + vector unsigned long long b)
> +{
> + return __builtin_vsx_xxssumudm (acc, a, b, 1);
> +}
> +
> +/* Test xxssumudmc - Scaled sum unsigned doubleword modulo carry */
> +vector unsigned __int128
> +test_xxssumudmc (vector unsigned __int128 acc,
> + vector unsigned long long a,
> + vector unsigned long long b)
> +{
> + return __builtin_vsx_xxssumudmc (acc, a, b, 0);
> +}
> +
> +/* Test xxssumudmcext - Scaled sum unsigned doubleword modulo carry extended */
> +vector unsigned __int128
> +test_xxssumudmcext (vector unsigned long long a,
> + vector unsigned long long b,
> + vector unsigned __int128 c)
> +{
> + return __builtin_vsx_xxssumudmcext (a, b, c, 1);
> +}
> +
> +/* Test xsaddadduqm - Add add unsigned quadword modulo */
> +vector unsigned __int128
> +test_xsaddadduqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsaddadduqm (acc, a, b);
> +}
> +
> +/* Test xsaddaddsuqm - Add add scaled unsigned quadword modulo */
> +vector unsigned __int128
> +test_xsaddaddsuqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsaddaddsuqm (acc, a, b);
> +}
> +
> +/* Test xsaddsubuqm - Add subtract unsigned quadword modulo */
> +vector unsigned __int128
> +test_xsaddsubuqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsaddsubuqm (acc, a, b);
> +}
> +
> +/* Test xsaddsubsuqm - Add subtract scaled unsigned quadword modulo */
> +vector unsigned __int128
> +test_xsaddsubsuqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsaddsubsuqm (acc, a, b);
> +}
> +
> +/* Test xsmerge2t1uqm - Merge type 1 (2-operand) */
> +vector unsigned __int128
> +test_xsmerge2t1uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsmerge2t1uqm (a, b);
> +}
> +
> +/* Test xsmerge2t2uqm - Merge type 2 (2-operand) */
> +vector unsigned __int128
> +test_xsmerge2t2uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsmerge2t2uqm (a, b);
> +}
> +
> +/* Test xsmerge2t3uqm - Merge type 3 (2-operand) */
> +vector unsigned __int128
> +test_xsmerge2t3uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsmerge2t3uqm (a, b);
> +}
> +
> +/* Test xsmerge3t1uqm - Merge type 1 (3-operand with accumulator) */
> +vector unsigned __int128
> +test_xsmerge3t1uqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsmerge3t1uqm (acc, a, b);
> +}
> +
> +/* Test xsrebase2t1uqm - Rebase type 1 (2-operand) */
> +vector unsigned __int128
> +test_xsrebase2t1uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase2t1uqm (a, b);
> +}
> +
> +/* Test xsrebase2t2uqm - Rebase type 2 (2-operand) */
> +vector unsigned __int128
> +test_xsrebase2t2uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase2t2uqm (a, b);
> +}
> +
> +/* Test xsrebase2t3uqm - Rebase type 3 (2-operand) */
> +vector unsigned __int128
> +test_xsrebase2t3uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase2t3uqm (a, b);
> +}
> +
> +/* Test xsrebase2t4uqm - Rebase type 4 (2-operand) */
> +vector unsigned __int128
> +test_xsrebase2t4uqm (vector unsigned __int128 a, vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase2t4uqm (a, b);
> +}
> +
> +/* Test xsrebase3t1uqm - Rebase type 1 (3-operand with accumulator) */
> +vector unsigned __int128
> +test_xsrebase3t1uqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase3t1uqm (acc, a, b);
> +}
> +
> +/* Test xsrebase3t2uqm - Rebase type 2 (3-operand with accumulator) */
> +vector unsigned __int128
> +test_xsrebase3t2uqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase3t2uqm (acc, a, b);
> +}
> +
> +/* Test xsrebase3t3uqm - Rebase type 3 (3-operand with accumulator) */
> +vector unsigned __int128
> +test_xsrebase3t3uqm (vector unsigned __int128 acc,
> + vector unsigned __int128 a,
> + vector unsigned __int128 b)
> +{
> + return __builtin_vsx_xsrebase3t3uqm (acc, a, b);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxxmulmul\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxmulmulhiadd\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxmulmulloadd\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxssumudm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxssumudmc\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxxssumudmcext\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsaddadduqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsaddaddsuqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsaddsubuqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsaddsubsuqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsmerge2t1uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsmerge2t2uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsmerge2t3uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsmerge3t1uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase2t1uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase2t2uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase2t3uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase2t4uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase3t1uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase3t2uqm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxsrebase3t3uqm\M} 1 } } */