https://gcc.gnu.org/g:2f68d69e47b5b627c6bb71a6bb3d7b2e0e641b2f

commit r15-3961-g2f68d69e47b5b627c6bb71a6bb3d7b2e0e641b2f
Author: Victor Do Nascimento <victor.donascime...@arm.com>
Date:   Tue May 21 11:17:45 2024 +0100

    optabs: Make all `*dot_prod_optab's modeled as conversions
    
    The GCC internals manual defines the {u|s}dot_prod<m> standard name
    as taking "two signed elements of the same mode, adding them to a
    third operand of wider mode".  This leaves the relationship between
    the mode of the first two arguments and that of the third ambiguous.
    
    This vagueness means that, in theory, different modes may be
    supportable in the third argument.  This flexibility would allow for a
    given backend to add to the accumulator a different number of
    vectorized products, e.g. a backend may provide instructions for both:
    
      accum += a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
    
    and
    
      accum += a[0] * b[0] + a[1] * b[1],
    
    as is now seen in the SVE2.1 extension to AArch64.  In spite of the
    aforementioned flexibility, modeling the dot-product operation as a
    direct optab means that we have no way to encode both the input and the
    accumulator data modes into the backend pattern name, which prevents
    us from harnessing this flexibility.
    
    We therefore make all dot_prod optabs conversions, allowing, for
    example, for the encoding of both 2-way and 4-way dot product backend
    patterns.
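
To make the 2-way versus 4-way distinction concrete, here is a minimal scalar sketch of the two accumulation shapes described above. It is an illustration only, not GCC code: the function names, the four-lane layout, and the element widths (8-bit inputs for the 4-way form, 16-bit inputs for the 2-way form, 32-bit accumulators in both) are assumptions chosen for the example.

```c
#include <stdint.h>

/* Illustrative scalar model (hypothetical, not GCC code) of a 4-way
   signed dot product: each 32-bit accumulator lane absorbs four
   widened 8-bit products.  Overflow of the accumulator is ignored.  */
void
sdot_4way_model (int32_t acc[4], const int8_t a[16], const int8_t b[16])
{
  for (int lane = 0; lane < 4; lane++)
    for (int k = 0; k < 4; k++)
      acc[lane] += (int32_t) a[4 * lane + k] * (int32_t) b[4 * lane + k];
}

/* Illustrative scalar model (hypothetical, not GCC code) of a 2-way
   signed dot product: each 32-bit accumulator lane absorbs two
   widened 16-bit products.  Overflow of the accumulator is ignored.  */
void
sdot_2way_model (int32_t acc[4], const int16_t a[8], const int16_t b[8])
{
  for (int lane = 0; lane < 4; lane++)
    for (int k = 0; k < 2; k++)
      acc[lane] += (int32_t) a[2 * lane + k] * (int32_t) b[2 * lane + k];
}
```

Both models share the same accumulator type and differ only in the input element type, which is why a pattern name that encodes a single mode cannot tell them apart.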
    
    gcc/ChangeLog:
    
            * optabs.def (sdot_prod_optab): Convert from OPTAB_D to
            OPTAB_CD.
            (udot_prod_optab): Likewise.
            (usdot_prod_optab): Likewise.
            * doc/md.texi (Standard Names): Update entries for u, s and us
            dot_prod names.

Diff:
---
 gcc/doc/md.texi | 46 ++++++++++++++++++++++------------------------
 gcc/optabs.def  |  6 +++---
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index a92591122517..7001dafdc9e1 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5760,15 +5760,14 @@ for (i = 0; i < LEN + BIAS; i++)
     operand0 += operand2[i];
 @end smallexample
 
-@cindex @code{sdot_prod@var{m}} instruction pattern
-@item @samp{sdot_prod@var{m}}
-
-Compute the sum of the products of two signed elements.
-Operand 1 and operand 2 are of the same mode. Their
-product, which is of a wider mode, is computed and added to operand 3.
-Operand 3 is of a mode equal or wider than the mode of the product. The
-result is placed in operand 0, which is of the same mode as operand 3.
-@var{m} is the mode of operand 1 and operand 2.
+@cindex @code{sdot_prod@var{m}@var{n}} instruction pattern
+@item @samp{sdot_prod@var{m}@var{n}}
+
+Multiply operand 1 by operand 2 without loss of precision, given that
+both operands contain signed elements.  Add each product to the overlapping
+element of operand 3 and store the result in operand 0.  Operands 0 and 3
+have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}
+having narrower elements than @var{m}.
 
 Semantically the expressions perform the multiplication in the following signs
 
@@ -5778,15 +5777,14 @@ sdot<signed op0, signed op1, signed op2, signed op3> ==
 @dots{}
 @end smallexample
 
-@cindex @code{udot_prod@var{m}} instruction pattern
-@item @samp{udot_prod@var{m}}
+@cindex @code{udot_prod@var{m}@var{n}} instruction pattern
+@item @samp{udot_prod@var{m}@var{n}}
 
-Compute the sum of the products of two unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their
-product, which is of a wider mode, is computed and added to operand 3.
-Operand 3 is of a mode equal or wider than the mode of the product. The
-result is placed in operand 0, which is of the same mode as operand 3.
-@var{m} is the mode of operand 1 and operand 2.
+Multiply operand 1 by operand 2 without loss of precision, given that
+both operands contain unsigned elements.  Add each product to the overlapping
+element of operand 3 and store the result in operand 0.  Operands 0 and 3
+have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}
+having narrower elements than @var{m}.
 
 Semantically the expressions perform the multiplication in the following signs
 
@@ -5796,14 +5794,14 @@ udot<unsigned op0, unsigned op1, unsigned op2, unsigned op3> ==
 @dots{}
 @end smallexample
 
-@cindex @code{usdot_prod@var{m}} instruction pattern
-@item @samp{usdot_prod@var{m}}
+@cindex @code{usdot_prod@var{m}@var{n}} instruction pattern
+@item @samp{usdot_prod@var{m}@var{n}}
 Compute the sum of the products of elements of different signs.
-Operand 1 must be unsigned and operand 2 signed. Their
-product, which is of a wider mode, is computed and added to operand 3.
-Operand 3 is of a mode equal or wider than the mode of the product. The
-result is placed in operand 0, which is of the same mode as operand 3.
-@var{m} is the mode of operand 1 and operand 2.
+Multiply operand 1 by operand 2 without loss of precision, given that operand 1
+is unsigned and operand 2 is signed.  Add each product to the overlapping
+element of operand 3 and store the result in operand 0.  Operands 0 and 3 have
+mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n} having
+narrower elements than @var{m}.
 
 Semantically the expressions perform the multiplication in the following signs
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 58a939442bd4..ba860144d8be 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -110,6 +110,9 @@ OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
 OPTAB_CD(mask_len_scatter_store_optab, "mask_len_scatter_store$a$b")
 OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
 OPTAB_CD(vec_init_optab, "vec_init$a$b")
+OPTAB_CD (sdot_prod_optab, "sdot_prod$I$a$b")
+OPTAB_CD (udot_prod_optab, "udot_prod$I$a$b")
+OPTAB_CD (usdot_prod_optab, "usdot_prod$I$a$b")
 
 OPTAB_CD (while_ult_optab, "while_ult$a$b")
 
@@ -413,10 +416,7 @@ OPTAB_D (savg_floor_optab, "avg$a3_floor")
 OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
 OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
 OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
-OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
 OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
-OPTAB_D (udot_prod_optab, "udot_prod$I$a")
-OPTAB_D (usdot_prod_optab, "usdot_prod$I$a")
 OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
 OPTAB_D (usad_optab, "usad$I$a")
 OPTAB_D (ssad_optab, "ssad$I$a")

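As a rough illustration of what the switch from "$a" to "$a$b" means for backend pattern names, the sketch below concatenates a prefix with two mode names, following the documented order sdot_prod<m><n> (accumulator mode first, input mode second). The helper is hypothetical and is not how GCC itself generates names; the mode names used in the note after it are examples only.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GCC: build a dot-product pattern
   name from a prefix, the accumulator mode (operands 0 and 3) and the
   input mode (operands 1 and 2), in that order.  Returns the value
   returned by snprintf.  */
int
dot_prod_pattern_name (char *buf, size_t size, const char *prefix,
                       const char *acc_mode, const char *in_mode)
{
  return snprintf (buf, size, "%s%s%s", prefix, acc_mode, in_mode);
}
```

Under these assumptions, the prefix "sdot_prod" with modes "v4si" and "v16qi" would give a 4-way name, while "v4si" with "v8hi" would give a distinct 2-way name for the same accumulator mode.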