[committed] First two patches from Mariam's CRC work

Jeff Law Thu, 28 Nov 2024 06:56:24 -0800

So these are updated versions of the first two of Mariam's patches for CRC optimization. They introduce the basic building blocks that are used by subsequent patches as well as CRC builtin support.

The biggest conceptual change from Mariam's patch is to drop the assumption that we're going to be using word_mode in the table based expansion. That in turn means we can support the table based CRC expansion independent of the target's word size and what modes are supported for basic ALU operations.
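
For readers following along, here is a minimal C sketch of the table-based, bit-forward expansion the patch emits, written for any CRC width from 8 to 64 bits rather than a fixed word size. It is an illustration only, not the code GCC generates: the names crc_table_entry, crc_table_init and crc_update are made up for this example, and the table is built at run time here, whereas the patch emits it as read-only static data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Table entry for one byte value, mirroring calculate_crc in the patch:
   place the byte in the top 8 bits of a BITS-wide register, then run
   8 shift/xor steps.  POLY omits the leading 1.  (Hypothetical name.)  */
static uint64_t
crc_table_entry (uint64_t byte, uint64_t poly, unsigned bits)
{
  assert (bits >= 8 && bits <= 64);
  uint64_t msb = UINT64_C (1) << (bits - 1);
  uint64_t crc = byte << (bits - 8);
  for (int i = 0; i < 8; i++)
    crc = (crc & msb) ? (crc << 1) ^ poly : crc << 1;
  /* Drop anything shifted beyond the CRC width.  */
  if (bits < 64)
    crc &= (UINT64_C (1) << bits) - 1;
  return crc;
}

/* Fill a 256-entry table for the given polynomial and CRC width.  */
static void
crc_table_init (uint64_t table[256], uint64_t poly, unsigned bits)
{
  for (unsigned i = 0; i < 256; i++)
    table[i] = crc_table_entry (i, poly, bits);
}

/* Consume LEN bytes, one table lookup per byte:
   crc = (crc << 8) ^ table[(crc >> (bits - 8)) ^ byte].
   For an 8-bit CRC the (crc << 8) term is masked away to zero.  */
static uint64_t
crc_update (const uint64_t table[256], uint64_t crc, unsigned bits,
            const unsigned char *data, size_t len)
{
  uint64_t mask = bits < 64 ? (UINT64_C (1) << bits) - 1 : ~UINT64_C (0);
  for (size_t i = 0; i < len; i++)
    {
      unsigned idx = (unsigned) ((crc >> (bits - 8)) ^ data[i]) & 0xFF;
      crc = ((crc << 8) ^ table[idx]) & mask;
    }
  return crc;
}
```

Nothing in the loop depends on the machine word size; only the CRC width matters, which is the point of dropping the word_mode assumption.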

This has been tested on every cross target in my tester and it has been bootstrapped and regression tested on x86_64. The full series has also been bootstrapped and regression tested on a variety of targets including, but not limited to, aarch64, riscv64, and ppc64le.


Attaching committed patches #1 and #2 for the archivers.

Jeff

commit bb46d05ad64e4e0acb3307e76bab340aa8587d3e
Author: Mariam Arutunian <mariamarutun...@gmail.com>
Date:   Mon Nov 11 12:48:34 2024 -0700

    [PATCH v6 01/12] Implement internal functions for efficient CRC computation.
    
    Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster
    CRC generation.
    One performs bit-forward and the other bit-reversed CRC computation.
    If CRC optabs are supported, they are used for the CRC computation.
    Otherwise, table-based CRC is generated.
    The supported data and CRC sizes are 8, 16, 32, and 64 bits.
    The polynomial is without the leading 1.
    A table with 256 elements is used to store precomputed CRCs.
    For the reflection of inputs and the output, a simple algorithm involving
    SHIFT, AND, and OR operations is used.
    
    gcc/
    
            * doc/md.texi (crc@var{m}@var{n}4, crc_rev@var{m}@var{n}4): Document.
            * expr.cc (calculate_crc): New function.
            (assemble_crc_table): Likewise.
            (generate_crc_table): Likewise.
            (calculate_table_based_CRC): Likewise.
            (expand_crc_table_based): Likewise.
            (gen_common_operation_to_reflect): Likewise.
            (reflect_64_bit_value): Likewise.
            (reflect_32_bit_value): Likewise.
            (reflect_16_bit_value): Likewise.
            (reflect_8_bit_value): Likewise.
            (generate_reflecting_code_standard): Likewise.
            (expand_reversed_crc_table_based): Likewise.
            * expr.h (generate_reflecting_code_standard): New function declaration.
            (expand_crc_table_based): Likewise.
            (expand_reversed_crc_table_based): Likewise.
            * internal-fn.cc: (crc_direct): Define.
            (direct_crc_optab_supported_p): Likewise.
            (expand_crc_optab_fn): New function
            * internal-fn.def (CRC, CRC_REV): New internal functions.
            * optabs.def (crc_optab, crc_rev_optab): New optabs.
    
            Signed-off-by: Mariam Arutunian <mariamarutun...@gmail.com>
            Co-authored-by: Joern Rennecke <joern.renne...@embecosm.com>
            Co-authored-by: Jeff Law <j...@ventanamicro.com>
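
As an aside on the reflection step described in the commit message above, the following C sketch shows bit reversal with SHIFT, AND and OR, using the same masks as reflect_32_bit_value and reflect_8_bit_value in the patch, plus a bit-by-bit model of the reversed CRC (reflect inputs, run the forward CRC, reflect the result). The function names are invented for this example, and the bitwise loop stands in for the table lookup; it is a sketch of the idea, not the expander's output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bits of a 32-bit value by swapping halves, bytes,
   nibbles, bit pairs and finally single bits.  */
static uint32_t
reflect32 (uint32_t x)
{
  x = (x & 0x0000FFFFu) << 16 | (x & 0xFFFF0000u) >> 16;
  x = (x & 0x00FF00FFu) << 8 | (x & 0xFF00FF00u) >> 8;
  x = (x & 0x0F0F0F0Fu) << 4 | (x & 0xF0F0F0F0u) >> 4;
  x = (x & 0x33333333u) << 2 | (x & 0xCCCCCCCCu) >> 2;
  x = (x & 0x55555555u) << 1 | (x & 0xAAAAAAAAu) >> 1;
  return x;
}

/* Same idea for an 8-bit value: nibbles, bit pairs, single bits.  */
static uint8_t
reflect8 (uint8_t x)
{
  x = (uint8_t) ((x & 0x0F) << 4 | (x & 0xF0) >> 4);
  x = (uint8_t) ((x & 0x33) << 2 | (x & 0xCC) >> 2);
  x = (uint8_t) ((x & 0x55) << 1 | (x & 0xAA) >> 1);
  return x;
}

/* Bit-forward CRC32 over one byte, one bit at a time.
   POLY omits the leading 1.  */
static uint32_t
crc32_forward_byte (uint32_t crc, uint8_t data, uint32_t poly)
{
  crc ^= (uint32_t) data << 24;
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x80000000u) ? (crc << 1) ^ poly : crc << 1;
  return crc;
}

/* Bit-reversed CRC32 over one byte: reflect the CRC and the data,
   run the forward step, reflect the result back.  */
static uint32_t
crc32_reversed_byte (uint32_t crc, uint8_t data, uint32_t poly)
{
  /* Sanity check: reflecting twice gives the original value back.  */
  assert (reflect32 (reflect32 (crc)) == crc);
  uint32_t r = crc32_forward_byte (reflect32 (crc), reflect8 (data), poly);
  return reflect32 (r);
}
```

Chaining crc32_reversed_byte over a buffer with polynomial 0x04C11DB7, an initial value of 0xFFFFFFFF and a final XOR with 0xFFFFFFFF yields the familiar reflected CRC-32.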

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index c4c37053833..69605bf75c0 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8578,6 +8578,20 @@ Return 1 if operand 1 is a normal floating point number and 0
 otherwise.  @var{m} is a scalar floating point mode.  Operand 0
 has mode @code{SImode}, and operand 1 has mode @var{m}.
 
+@cindex @code{crc@var{m}@var{n}4} instruction pattern
+@item @samp{crc@var{m}@var{n}4}
+Calculate a bit-forward CRC using operands 1, 2 and 3,
+then store the result in operand 0.
+Operands 1 is the initial CRC, operands 2 is the data and operands 3 is the
+polynomial without leading 1.
+Operands 0, 1 and 3 have mode @var{n} and operand 2 has mode @var{m}, where
+both modes are integers.  The size of CRC to be calculated is determined by the
+mode; for example, if @var{n} is @code{HImode}, a CRC16 is calculated.
+
+@cindex @code{crc_rev@var{m}@var{n}4} instruction pattern
+@item @samp{crc_rev@var{m}@var{n}4}
+Similar to @samp{crc@var{m}@var{n}4}, but calculates a bit-reversed CRC.
+
 @end table
 
 @end ifset
diff --git a/gcc/expr.cc b/gcc/expr.cc
index f4939140bb5..de25437660e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -14177,3 +14177,350 @@ int_expr_size (const_tree exp)
 
   return tree_to_shwi (size);
 }
+
+/* Calculate CRC for the initial CRC and given POLYNOMIAL.
+   CRC_BITS is CRC size.  */
+
+static unsigned HOST_WIDE_INT
+calculate_crc (unsigned HOST_WIDE_INT crc,
+              unsigned HOST_WIDE_INT polynomial,
+              unsigned short crc_bits)
+{
+  unsigned HOST_WIDE_INT msb = HOST_WIDE_INT_1U << (crc_bits - 1);
+  crc = crc << (crc_bits - 8);
+  for (short i = 8; i > 0; --i)
+    {
+      if (crc & msb)
+       crc = (crc << 1) ^ polynomial;
+      else
+       crc <<= 1;
+    }
+  /* Zero out bits in crc beyond the specified number of crc_bits.  */
+  if (crc_bits < sizeof (crc) * CHAR_BIT)
+    crc &= (HOST_WIDE_INT_1U << crc_bits) - 1;
+  return crc;
+}
+
+/* Assemble CRC table with 256 elements for the given POLYNOM and CRC_BITS with
+   given ID.
+   ID is the identifier of the table, the name of the table is unique,
+   contains CRC size and the polynomial.
+   POLYNOM is the polynomial used to calculate the CRC table's elements.
+   CRC_BITS is the size of CRC, may be 8, 16, ... . */
+
+rtx
+assemble_crc_table (tree id, unsigned HOST_WIDE_INT polynom,
+                   unsigned short crc_bits)
+{
+  unsigned table_el_n = 0x100;
+  tree ar = build_array_type (make_unsigned_type (crc_bits),
+                             build_index_type (size_int (table_el_n - 1)));
+  tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, id, ar);
+  SET_DECL_ASSEMBLER_NAME (decl, id);
+  DECL_ARTIFICIAL (decl) = 1;
+  rtx tab = gen_rtx_SYMBOL_REF (Pmode, IDENTIFIER_POINTER (id));
+  TREE_ASM_WRITTEN (decl) = 0;
+
+  /* Initialize the table.  */
+  vec<tree, va_gc> *initial_values;
+  vec_alloc (initial_values, table_el_n);
+  for (size_t i = 0; i < table_el_n; ++i)
+    {
+      unsigned HOST_WIDE_INT crc = calculate_crc (i, polynom, crc_bits);
+      tree element = build_int_cstu (make_unsigned_type (crc_bits), crc);
+      vec_safe_push (initial_values, element);
+    }
+  DECL_INITIAL (decl) = build_constructor_from_vec (ar, initial_values);
+
+  TREE_READONLY (decl) = 1;
+  TREE_STATIC (decl) = 1;
+
+  if (TREE_PUBLIC (id))
+    {
+      TREE_PUBLIC (decl) = 1;
+      make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
+    }
+
+  mark_decl_referenced (decl);
+  varpool_node::finalize_decl (decl);
+
+  return tab;
+}
+
+/* Generate CRC lookup table by calculating CRC for all possible
+   8-bit data values.  The table is stored with a specific name in the read-only
+   static data section.
+   POLYNOM is the polynomial used to calculate the CRC table's elements.
+   CRC_BITS is the size of CRC, may be 8, 16, ... .  */
+
+rtx
+generate_crc_table (unsigned HOST_WIDE_INT polynom, unsigned short crc_bits)
+{
+  gcc_assert (crc_bits <= 64);
+
+  /* Buf size - 24 letters + 6 '_'
+     + 20 numbers (2 for crc bit size + 2 for 0x + 16 for 64-bit polynomial)
+     + 1 for \0.  */
+  char buf[51];
+  sprintf (buf, "crc_table_for_crc_%u_polynomial_" HOST_WIDE_INT_PRINT_HEX,
+          crc_bits, polynom);
+
+  tree id = maybe_get_identifier (buf);
+  if (id)
+    return gen_rtx_SYMBOL_REF (Pmode, IDENTIFIER_POINTER (id));
+
+  id = get_identifier (buf);
+  return assemble_crc_table (id, polynom, crc_bits);
+}
+
+/* Generate table-based CRC code for the given CRC, INPUT_DATA and the
+   POLYNOMIAL (without leading 1).
+
+   First, using POLYNOMIAL's value generates CRC table of 256 elements,
+   then generates the assembly for the following code,
+   where crc_bit_size and data_bit_size may be 8, 16, 32, 64, depending on CRC:
+
+     for (int i = 0; i < data_bit_size / 8; i++)
+       crc = (crc << 8) ^ crc_table[(crc >> (crc_bit_size - 8))
+                                   ^ (data >> (data_bit_size - (i + 1) * 8)
+                                   & 0xFF))];
+
+   So to take values from the table, we need 8-bit data.
+   If input data size is not 8, then first we extract upper 8 bits,
+   then the other 8 bits, and so on.  */
+
+void
+calculate_table_based_CRC (rtx *crc, const rtx &input_data,
+                          const rtx &polynomial,
+                          machine_mode crc_mode, machine_mode data_mode)
+{
+  unsigned short crc_bit_size = GET_MODE_BITSIZE (crc_mode).to_constant ();
+  unsigned short data_size = GET_MODE_SIZE (data_mode).to_constant ();
+  machine_mode mode = GET_MODE (*crc);
+  rtx tab = generate_crc_table (UINTVAL (polynomial), crc_bit_size);
+
+  for (unsigned short i = 0; i < data_size; i++)
+    {
+      /* crc >> (crc_bit_size - 8).  */
+      *crc = force_reg (crc_mode, *crc);
+      rtx op1 = expand_shift (RSHIFT_EXPR, mode, *crc, crc_bit_size - 8,
+                             NULL_RTX, 1);
+
+      /* data >> (8 * (GET_MODE_SIZE (data_mode).to_constant () - i - 1)).  */
+      unsigned range_8 = 8 * (data_size - i - 1);
+      rtx data = force_reg (data_mode, input_data);
+      data = expand_shift (RSHIFT_EXPR, mode, data, range_8, NULL_RTX, 1);
+
+      /* data >> (8 * (GET_MODE_SIZE (data_mode)
+                                       .to_constant () - i - 1)) & 0xFF.  */
+      rtx data_final = expand_and (mode, data,
+                                  gen_int_mode (255, data_mode), NULL_RTX);
+
+      /* (crc >> (crc_bit_size - 8)) ^ data_8bit.  */
+      rtx in = expand_binop (mode, xor_optab, op1, data_final,
+                            NULL_RTX, 1, OPTAB_WIDEN);
+
+      /* ((crc >> (crc_bit_size - 8)) ^ data_8bit) & 0xFF.  */
+      rtx index = expand_and (mode, in, gen_int_mode (255, mode),
+                             NULL_RTX);
+      int log_crc_size = exact_log2 (GET_MODE_SIZE (crc_mode).to_constant ());
+      index = expand_shift (LSHIFT_EXPR, mode, index,
+                           log_crc_size, NULL_RTX, 0);
+
+      rtx addr = gen_reg_rtx (Pmode);
+      convert_move (addr, index, 1);
+      addr = expand_binop (Pmode, add_optab, addr, tab, NULL_RTX,
+                           0, OPTAB_DIRECT);
+
+      /* crc_table[(crc >> (crc_bit_size - 8)) ^ data_8bit]  */
+      rtx tab_el = validize_mem (gen_rtx_MEM (crc_mode, addr));
+
+      /* (crc << 8) if CRC is larger than 8, otherwise crc = 0.  */
+      rtx high = NULL_RTX;
+      if (crc_bit_size != 8)
+       high = expand_shift (LSHIFT_EXPR, mode, *crc, 8, NULL_RTX, 0);
+      else
+       high = gen_int_mode (0, mode);
+
+      /* crc = (crc << 8)
+              ^ crc_table[(crc >> (crc_bit_size - 8)) ^ data_8bit];  */
+      *crc = expand_binop (mode, xor_optab, tab_el, high, NULL_RTX, 1,
+                          OPTAB_WIDEN);
+    }
+}
+
+/* Generate table-based CRC code for the given CRC, INPUT_DATA and the
+   POLYNOMIAL (without leading 1).
+
+   CRC is OP1, data is OP2 and the polynomial is OP3.
+   This must generate a CRC table and an assembly for the following code,
+   where crc_bit_size and data_bit_size may be 8, 16, 32, 64:
+   uint_crc_bit_size_t
+   crc_crc_bit_size (uint_crc_bit_size_t crc_init,
+                    uint_data_bit_size_t data, size_t size)
+   {
+     uint_crc_bit_size_t crc = crc_init;
+     for (int i = 0; i < data_bit_size / 8; i++)
+       crc = (crc << 8) ^ crc_table[(crc >> (crc_bit_size - 8))
+                                   ^ (data >> (data_bit_size - (i + 1) * 8)
+                                   & 0xFF))];
+     return crc;
+   }  */
+
+void
+expand_crc_table_based (rtx op0, rtx op1, rtx op2, rtx op3,
+                       machine_mode data_mode)
+{
+  gcc_assert (!CONST_INT_P (op0));
+  gcc_assert (CONST_INT_P (op3));
+  machine_mode crc_mode = GET_MODE (op0);
+  rtx crc = gen_reg_rtx (crc_mode);
+  convert_move (crc, op1, 0);
+  calculate_table_based_CRC (&crc, op2, op3, crc_mode, data_mode);
+  convert_move (op0, crc, 0);
+}
+
+/* Generate the common operation for reflecting values:
+   *OP = (*OP & AND1_VALUE) << SHIFT_VAL | (*OP & AND2_VALUE) >> SHIFT_VAL;  */
+
+void
+gen_common_operation_to_reflect (rtx *op,
+                                unsigned HOST_WIDE_INT and1_value,
+                                unsigned HOST_WIDE_INT and2_value,
+                                unsigned shift_val)
+{
+  rtx op1 = expand_and (GET_MODE (*op), *op,
+                       gen_int_mode (and1_value, GET_MODE (*op)), NULL_RTX);
+  op1 = expand_shift (LSHIFT_EXPR, GET_MODE (*op), op1, shift_val, op1, 0);
+  rtx op2 = expand_and (GET_MODE (*op), *op,
+                       gen_int_mode (and2_value, GET_MODE (*op)), NULL_RTX);
+  op2 = expand_shift (RSHIFT_EXPR, GET_MODE (*op), op2, shift_val, op2, 1);
+  *op = expand_binop (GET_MODE (*op), ior_optab, op1,
+                     op2, *op, 0, OPTAB_LIB_WIDEN);
+}
+
+/* Reflect 64-bit value for the 64-bit target.  */
+
+void
+reflect_64_bit_value (rtx *op)
+{
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x00000000FFFFFFFF),
+                                  HOST_WIDE_INT_C (0xFFFFFFFF00000000), 32);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0000FFFF0000FFFF),
+                                  HOST_WIDE_INT_C (0xFFFF0000FFFF0000), 16);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x00FF00FF00FF00FF),
+                                  HOST_WIDE_INT_C (0xFF00FF00FF00FF00), 8);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0F0F0F0F0F0F0F0F),
+                                  HOST_WIDE_INT_C (0xF0F0F0F0F0F0F0F0), 4);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x3333333333333333),
+                                  HOST_WIDE_INT_C (0xCCCCCCCCCCCCCCCC), 2);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x5555555555555555),
+                                  HOST_WIDE_INT_C (0xAAAAAAAAAAAAAAAA), 1);
+}
+
+/* Reflect 32-bit value for the 32-bit target.  */
+
+void
+reflect_32_bit_value (rtx *op)
+{
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0000FFFF),
+                                 HOST_WIDE_INT_C (0xFFFF0000), 16);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x00FF00FF),
+                                 HOST_WIDE_INT_C (0xFF00FF00), 8);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0F0F0F0F),
+                                  HOST_WIDE_INT_C (0xF0F0F0F0), 4);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x33333333),
+                                  HOST_WIDE_INT_C (0xCCCCCCCC), 2);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x55555555),
+                                  HOST_WIDE_INT_C (0xAAAAAAAA), 1);
+}
+
+/* Reflect 16-bit value for the 16-bit target.  */
+
+void
+reflect_16_bit_value (rtx *op)
+{
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x00FF),
+                                  HOST_WIDE_INT_C (0xFF00), 8);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0F0F),
+                                  HOST_WIDE_INT_C (0xF0F0), 4);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x3333),
+                                  HOST_WIDE_INT_C (0xCCCC), 2);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x5555),
+                                  HOST_WIDE_INT_C (0xAAAA), 1);
+}
+
+/* Reflect 8-bit value for the 8-bit target.  */
+
+void
+reflect_8_bit_value (rtx *op)
+{
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x0F),
+                                  HOST_WIDE_INT_C (0xF0), 4);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x33),
+                                  HOST_WIDE_INT_C (0xCC), 2);
+  gen_common_operation_to_reflect (op, HOST_WIDE_INT_C (0x55),
+                                  HOST_WIDE_INT_C (0xAA), 1);
+}
+
+/* Generate instruction sequence which reflects the value of the OP
+   using shift, and, or operations.  OP's mode may be less than word_mode.  */
+
+void
+generate_reflecting_code_standard (rtx *op)
+{
+  gcc_assert (GET_MODE_BITSIZE (GET_MODE (*op)).to_constant ()  >= 8
+             && GET_MODE_BITSIZE (GET_MODE (*op)).to_constant () <= 64);
+
+  if (GET_MODE_BITSIZE (GET_MODE (*op)).to_constant () == 64)
+    reflect_64_bit_value (op);
+  else if (GET_MODE_BITSIZE (GET_MODE (*op)).to_constant () == 32)
+    reflect_32_bit_value (op);
+  else if (GET_MODE_BITSIZE (GET_MODE (*op)).to_constant () == 16)
+    reflect_16_bit_value (op);
+  else
+    reflect_8_bit_value (op);
+}
+
+/* Generate table-based reversed CRC code for the given CRC, INPUT_DATA and
+   the POLYNOMIAL (without leading 1).
+
+   CRC is OP1, data is OP2 and the polynomial is OP3.
+   This must generate CRC table and assembly for the following code,
+   where crc_bit_size and data_bit_size may be 8, 16, 32, 64:
+   uint_crc_bit_size_t
+   crc_crc_bit_size (uint_crc_bit_size_t crc_init,
+                          uint_data_bit_size_t data, size_t size)
+   {
+     reflect (crc_init)
+     uint_crc_bit_size_t crc = crc_init;
+     reflect (data);
+     for (int i = 0; i < data_bit_size / 8; i++)
+       crc = (crc << 8) ^ crc_table[(crc >> (crc_bit_size - 8))
+                         ^ (data >> (data_bit_size - (i + 1) * 8) & 0xFF))];
+     reflect (crc);
+     return crc;
+   }  */
+
+void
+expand_reversed_crc_table_based (rtx op0, rtx op1, rtx op2, rtx op3,
+                                machine_mode data_mode,
+                                void (*gen_reflecting_code) (rtx *op))
+{
+  gcc_assert (!CONST_INT_P (op0));
+  gcc_assert (CONST_INT_P (op3));
+  machine_mode crc_mode = GET_MODE (op0);
+
+  rtx crc = gen_reg_rtx (crc_mode);
+  convert_move (crc, op1, 0);
+  gen_reflecting_code (&crc);
+
+  rtx data = gen_reg_rtx (data_mode);
+  convert_move (data, op2, 0);
+  gen_reflecting_code (&data);
+
+  calculate_table_based_CRC (&crc, data, op3, crc_mode, data_mode);
+
+  gen_reflecting_code (&crc);
+  convert_move (op0, crc, 0);
+}
diff --git a/gcc/expr.h b/gcc/expr.h
index 04782b15f19..5dd059de801 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -377,4 +377,10 @@ extern rtx expr_size (tree);
 extern bool mem_ref_refers_to_non_mem_p (tree);
 extern bool non_mem_decl_p (tree);
 
+/* Generate table-based CRC.  */
+extern void generate_reflecting_code_standard (rtx *);
+extern void expand_crc_table_based (rtx, rtx, rtx, rtx, machine_mode);
+extern void expand_reversed_crc_table_based (rtx, rtx, rtx, rtx, machine_mode,
+                                            void (*) (rtx *));
+
 #endif /* GCC_EXPR_H */
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index a45b3291800..c7c3f1c34ba 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -191,6 +191,7 @@ init_internal_fns ()
 #define mask_fold_left_direct { 1, 1, false }
 #define mask_len_fold_left_direct { 1, 1, false }
 #define check_ptrs_direct { 0, 0, false }
+#define crc_direct { 1, -1, true }
 
 const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
@@ -4054,6 +4055,79 @@ expand_convert_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab,
   expand_fn_using_insn (stmt, icode, 1, nargs);
 }
 
+/* Expand CRC call STMT.  */
+
+static void
+expand_crc_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree rhs1 = gimple_call_arg (stmt, 0); // crc
+  tree rhs2 = gimple_call_arg (stmt, 1); // data
+  tree rhs3 = gimple_call_arg (stmt, 2); // polynomial
+
+  tree result_type = TREE_TYPE (lhs);
+  tree data_type = TREE_TYPE (rhs2);
+
+  gcc_assert (TYPE_MODE (result_type) >= TYPE_MODE (data_type));
+
+  rtx dest = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx crc = expand_normal (rhs1);
+  rtx data = expand_normal (rhs2);
+  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
+  rtx polynomial = gen_rtx_CONST_INT (TYPE_MODE (result_type),
+  TREE_INT_CST_LOW (rhs3));
+
+  /* Use target specific expansion if it exists.
+     Otherwise, generate table-based CRC.  */
+  if (direct_internal_fn_supported_p (fn, tree_pair (data_type, result_type),
+                                     OPTIMIZE_FOR_SPEED))
+    {
+      class expand_operand ops[4];
+      create_call_lhs_operand (&ops[0], dest, TYPE_MODE (result_type));
+      create_input_operand (&ops[1], crc, TYPE_MODE (result_type));
+      create_input_operand (&ops[2], data, TYPE_MODE (data_type));
+      create_input_operand (&ops[3], polynomial, TYPE_MODE (result_type));
+      insn_code icode = convert_optab_handler (optab, TYPE_MODE (data_type),
+                                              TYPE_MODE (result_type));
+      expand_insn (icode, 4, ops);
+      assign_call_lhs (lhs, dest, &ops[0]);
+    }
+  else
+    {
+      /* We're bypassing all the operand conversions that are done in the
+        case when we get an icode, operands and pass that off to expand_insn.
+
+        That path has special case handling for promoted return values which
+        we must emulate here (is the same kind of special treatment ever
+        needed for input arguments here?).
+
+        In particular we do not want to store directly into a promoted
+        SUBREG destination, instead store into a suitably sized pseudo.  */
+      rtx orig_dest = dest;
+      if (SUBREG_P (dest) && SUBREG_PROMOTED_VAR_P (dest))
+       dest = gen_reg_rtx (GET_MODE (dest));
+
+      /* If it's IFN_CRC generate bit-forward CRC.  */
+      if (fn == IFN_CRC)
+       expand_crc_table_based (dest, crc, data, polynomial,
+                               TYPE_MODE (data_type));
+      else
+       /* If it's IFN_CRC_REV generate bit-reversed CRC.  */
+       expand_reversed_crc_table_based (dest, crc, data, polynomial,
+                                        TYPE_MODE (data_type),
+                                        generate_reflecting_code_standard);
+
+      /* Now get the return value where it needs to be, taking care to
+        ensure it's promoted appropriately if the ABI demands it.
+
+        Re-use assign_call_lhs to handle the details.  */
+      class expand_operand ops[4];
+      create_call_lhs_operand (&ops[0], dest, TYPE_MODE (result_type));
+      ops[0].value = dest;
+      assign_call_lhs (lhs, orig_dest, &ops[0]);
+    }
+}
+
 /* Expanders for optabs that can use expand_direct_optab_fn.  */
 
 #define expand_unary_optab_fn(FN, STMT, OPTAB) \
@@ -4190,6 +4264,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_cond_len_unary_optab_supported_p direct_optab_supported_p
 #define direct_cond_len_binary_optab_supported_p direct_optab_supported_p
 #define direct_cond_len_ternary_optab_supported_p direct_optab_supported_p
+#define direct_crc_optab_supported_p convert_optab_supported_p
 #define direct_mask_load_optab_supported_p convert_optab_supported_p
 #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p
 #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e993c99c558..6e84c693697 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -202,6 +202,8 @@ along with GCC; see the file COPYING3.  If not see
                                cond_len_##UNSIGNED_OPTAB, cond_len_##TYPE)
 #endif
 
+DEF_INTERNAL_OPTAB_FN (CRC, ECF_CONST | ECF_NOTHROW, crc, crc)
+DEF_INTERNAL_OPTAB_FN (CRC_REV, ECF_CONST | ECF_NOTHROW, crc_rev, crc)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 75f39d85ada..5d75b1379ac 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -85,6 +85,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(crc_optab, "crc$a$b4")
+OPTAB_CD(crc_rev_optab, "crc_rev$a$b4")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")

commit c5126f0a004c27b180ac48f9e874e3744c088a09
Author: Mariam Arutunian <mariamarutun...@gmail.com>
Date:   Mon Nov 11 12:51:18 2024 -0700

    [PATCH v6 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.
    
    This patch introduces new built-in functions to GCC for computing
    bit-forward and bit-reversed CRCs.
    These builtins aim to provide efficient CRC calculation capabilities.
    When the target architecture supports CRC operations (as indicated by the
    presence of a CRC optab),
    the builtins will utilize the expander to generate CRC code.
    In the absence of hardware support, the builtins default to generating code
    for a table-based CRC calculation.
    
    The built-ins are defined as follows:
    __builtin_rev_crc16_data8,
    __builtin_rev_crc32_data8, __builtin_rev_crc32_data16,
    __builtin_rev_crc32_data32
    __builtin_rev_crc64_data8, __builtin_rev_crc64_data16,
     __builtin_rev_crc64_data32, __builtin_rev_crc64_data64,
    __builtin_crc8_data8,
    __builtin_crc16_data16, __builtin_crc16_data8,
    __builtin_crc32_data8, __builtin_crc32_data16, __builtin_crc32_data32,
    __builtin_crc64_data8, __builtin_crc64_data16,  __builtin_crc64_data32,
    __builtin_crc64_data64
    
    Each built-in takes three parameters:
    crc: The initial CRC value.
    data: The data to be processed.
    polynomial: The CRC polynomial without the leading 1.
    
    To validate the correctness of these built-ins, this patch also includes
    additions to the GCC testsuite.
    This enhancement allows GCC to offer developers high-performance CRC
    computation options
    that automatically adapt to the capabilities of the target hardware.
    
    gcc/
    
            * builtin-types.def (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE): Define.
            (BT_FN_UINT16_UINT16_UINT8_CONST_SIZE): Likewise.
            (BT_FN_UINT16_UINT16_UINT16_CONST_SIZE): Likewise.
            (BT_FN_UINT32_UINT32_UINT8_CONST_SIZE): Likewise.
            (BT_FN_UINT32_UINT32_UINT16_CONST_SIZE): Likewise.
            (BT_FN_UINT32_UINT32_UINT32_CONST_SIZE): Likewise.
            (BT_FN_UINT64_UINT64_UINT8_CONST_SIZE): Likewise.
            (BT_FN_UINT64_UINT64_UINT16_CONST_SIZE): Likewise.
            (BT_FN_UINT64_UINT64_UINT32_CONST_SIZE): Likewise.
            (BT_FN_UINT64_UINT64_UINT64_CONST_SIZE): Likewise.
            * builtins.cc (associated_internal_fn): Handle CRC related builtins.
            (expand_builtin_crc_table_based): New function.
            (expand_builtin): Handle CRC related builtins.
            * builtins.def (BUILT_IN_CRC8_DATA8): New builtin.
            (BUILT_IN_CRC16_DATA8): Likewise.
            (BUILT_IN_CRC16_DATA16): Likewise.
            (BUILT_IN_CRC32_DATA8): Likewise.
            (BUILT_IN_CRC32_DATA16): Likewise.
            (BUILT_IN_CRC32_DATA32): Likewise.
            (BUILT_IN_CRC64_DATA8): Likewise.
            (BUILT_IN_CRC64_DATA16): Likewise.
            (BUILT_IN_CRC64_DATA32): Likewise.
            (BUILT_IN_CRC64_DATA64): Likewise.
            (BUILT_IN_REV_CRC8_DATA8): New builtin.
            (BUILT_IN_REV_CRC16_DATA8): Likewise.
            (BUILT_IN_REV_CRC16_DATA16): Likewise.
            (BUILT_IN_REV_CRC32_DATA8): Likewise.
            (BUILT_IN_REV_CRC32_DATA16): Likewise.
            (BUILT_IN_REV_CRC32_DATA32): Likewise.
            (BUILT_IN_REV_CRC64_DATA8): Likewise.
            (BUILT_IN_REV_CRC64_DATA16): Likewise.
            (BUILT_IN_REV_CRC64_DATA32): Likewise.
            (BUILT_IN_REV_CRC64_DATA64): Likewise.
            * builtins.h (expand_builtin_crc_table_based): New function
            declaration.
            * doc/extend.texi: Add documentation for new CRC builtins.
    
    gcc/testsuite/
    
            * gcc.dg/crc-builtin-rev-target32.c: New test.
            * gcc.dg/crc-builtin-rev-target64.c: New test.
            * gcc.dg/crc-builtin-target32.c: New test.
            * gcc.dg/crc-builtin-target64.c: New test.
    
            Signed-off-by: Mariam Arutunian <mariamarutun...@gmail.com>
            Co-authored-by: Joern Rennecke <joern.renne...@embecosm.com>
            Co-authored-by: Jeff Law <j...@ventanamicro.com>
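
To illustrate how the built-ins described in the commit message above might be called, here is a hedged C sketch that feeds a buffer through __builtin_crc32_data8 one byte at a time, passing the CRC, the data byte and a constant polynomial without the leading 1. The macro HAVE_CRC32_DATA8_BUILTIN and the helper names are assumptions made for this example; on compilers that lack the built-in, the sketch falls back to a plain bitwise loop that is intended to compute the same bit-forward CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Detect the built-in where the compiler can tell us about it.
   HAVE_CRC32_DATA8_BUILTIN is a hypothetical macro for this sketch.  */
#ifdef __has_builtin
# if __has_builtin (__builtin_crc32_data8)
#  define HAVE_CRC32_DATA8_BUILTIN 1
# endif
#endif

/* CRC-32 polynomial without the leading 1.  The built-in expects the
   polynomial argument to be an integer constant.  */
#define CRC32_POLY 0x04C11DB7u

/* Fold one data byte into a bit-forward CRC32.  */
static uint32_t
crc32_step (uint32_t crc, uint8_t data)
{
#ifdef HAVE_CRC32_DATA8_BUILTIN
  return __builtin_crc32_data8 (crc, data, CRC32_POLY);
#else
  /* Portable fallback: xor the byte into the top bits, then 8
     shift/xor steps.  */
  crc ^= (uint32_t) data << 24;
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x80000000u) ? (crc << 1) ^ CRC32_POLY : crc << 1;
  return crc;
#endif
}

/* Run the step over a whole buffer, starting from CRC.  */
static uint32_t
crc32_buffer (uint32_t crc, const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    crc = crc32_step (crc, buf[i]);
  return crc;
}
```

Whether the call ends up as a target instruction sequence or as the table-based fallback is left to the compiler, as the commit message describes.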

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 427af741c6b..fa988d35064 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -840,6 +840,26 @@ DEF_FUNCTION_TYPE_3 (BT_FN_PTR_SIZE_SIZE_PTRMODE,
                     BT_PTR, BT_SIZE, BT_SIZE, BT_PTRMODE)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
                     BT_PTRMODE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT8_UINT8_UINT8_CONST_SIZE, BT_UINT8, BT_UINT8,
+                    BT_UINT8, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT16_UINT16_UINT8_CONST_SIZE, BT_UINT16, BT_UINT16,
+                    BT_UINT8, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT16_UINT16_UINT16_CONST_SIZE, BT_UINT16,
+                    BT_UINT16, BT_UINT16, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT8_CONST_SIZE, BT_UINT32, BT_UINT32,
+                    BT_UINT8, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT16_CONST_SIZE, BT_UINT32,
+                    BT_UINT32, BT_UINT16, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT32_UINT32_UINT32_CONST_SIZE, BT_UINT32,
+                    BT_UINT32, BT_UINT32, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT8_CONST_SIZE, BT_UINT64, BT_UINT64,
+                    BT_UINT8, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT16_CONST_SIZE, BT_UINT64,
+                    BT_UINT64, BT_UINT16, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT32_CONST_SIZE, BT_UINT64,
+                    BT_UINT64, BT_UINT32, BT_CONST_SIZE)
+DEF_FUNCTION_TYPE_3 (BT_FN_UINT64_UINT64_UINT64_CONST_SIZE, BT_UINT64,
+                    BT_UINT64, BT_UINT64, BT_CONST_SIZE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
                     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index fd7acdfc915..9d106405f79 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -2225,7 +2225,28 @@ associated_internal_fn (built_in_function fn, tree return_type)
       if (REAL_MODE_FORMAT (TYPE_MODE (return_type))->b == 2)
        return IFN_LDEXP;
       return IFN_LAST;
-
+    case BUILT_IN_CRC8_DATA8:
+    case BUILT_IN_CRC16_DATA8:
+    case BUILT_IN_CRC16_DATA16:
+    case BUILT_IN_CRC32_DATA8:
+    case BUILT_IN_CRC32_DATA16:
+    case BUILT_IN_CRC32_DATA32:
+    case BUILT_IN_CRC64_DATA8:
+    case BUILT_IN_CRC64_DATA16:
+    case BUILT_IN_CRC64_DATA32:
+    case BUILT_IN_CRC64_DATA64:
+      return IFN_CRC;
+    case BUILT_IN_REV_CRC8_DATA8:
+    case BUILT_IN_REV_CRC16_DATA8:
+    case BUILT_IN_REV_CRC16_DATA16:
+    case BUILT_IN_REV_CRC32_DATA8:
+    case BUILT_IN_REV_CRC32_DATA16:
+    case BUILT_IN_REV_CRC32_DATA32:
+    case BUILT_IN_REV_CRC64_DATA8:
+    case BUILT_IN_REV_CRC64_DATA16:
+    case BUILT_IN_REV_CRC64_DATA32:
+    case BUILT_IN_REV_CRC64_DATA64:
+      return IFN_CRC_REV;
     default:
       return IFN_LAST;
     }
@@ -7763,6 +7784,35 @@ expand_speculation_safe_value (machine_mode mode, tree exp, rtx target,
   return targetm.speculation_safe_value (mode, target, val, failsafe);
 }
 
+/* Expand CRC* or REV_CRC* built-ins.  */
+
+rtx
+expand_builtin_crc_table_based (internal_fn fn, scalar_mode crc_mode,
+                               scalar_mode data_mode, machine_mode mode,
+                               tree exp, rtx target)
+{
+  tree rhs1 = CALL_EXPR_ARG (exp, 0); // crc
+  tree rhs2 = CALL_EXPR_ARG (exp, 1); // data
+  tree rhs3 = CALL_EXPR_ARG (exp, 2); // polynomial
+
+  if (!target || mode == VOIDmode)
+    target = gen_reg_rtx (crc_mode);
+
+  rtx op1 = expand_normal (rhs1);
+  rtx op2 = expand_normal (rhs2);
+  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
+  rtx op3 = gen_int_mode (TREE_INT_CST_LOW (rhs3), crc_mode);
+
+  if (fn == IFN_CRC)
+    expand_crc_table_based (target, op1, op2, op3, data_mode);
+  else
+    /* If it's IFN_CRC_REV generate bit-reversed CRC.  */
+    expand_reversed_crc_table_based (target, op1, op2, op3,
+                                    data_mode,
+                                    generate_reflecting_code_standard);
+  return target;
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
@@ -8942,6 +8992,66 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
       mode = get_builtin_sync_mode (fcode - BUILT_IN_SPECULATION_SAFE_VALUE_1);
       return expand_speculation_safe_value (mode, exp, target, ignore);
 
+    case BUILT_IN_CRC8_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC, QImode, QImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC16_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC, HImode, QImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC16_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC, HImode, HImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC32_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC, SImode, QImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC32_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC, SImode, HImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC32_DATA32:
+      return expand_builtin_crc_table_based (IFN_CRC, SImode, SImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC64_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC, DImode, QImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC64_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC, DImode, HImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC64_DATA32:
+      return expand_builtin_crc_table_based (IFN_CRC, DImode, SImode, mode,
+                                              exp, target);
+    case BUILT_IN_CRC64_DATA64:
+      return expand_builtin_crc_table_based (IFN_CRC, DImode, DImode, mode,
+                                              exp, target);
+    case BUILT_IN_REV_CRC8_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, QImode, QImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC16_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, HImode, QImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC16_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, HImode, HImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC32_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, SImode, QImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC32_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, SImode, HImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC32_DATA32:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, SImode, SImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC64_DATA8:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, DImode, QImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC64_DATA16:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, DImode, HImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC64_DATA32:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, DImode, SImode,
+                                              mode, exp, target);
+    case BUILT_IN_REV_CRC64_DATA64:
+      return expand_builtin_crc_table_based (IFN_CRC_REV, DImode, DImode,
+                                              mode, exp, target);
     default:   /* just do library call, if unknown builtin */
       break;
     }
diff --git a/gcc/builtins.def b/gcc/builtins.def
index fac6bc9ad16..d47ac281e99 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -720,7 +720,26 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_Y1L, "y1l", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_M
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_YN, "yn", BT_FN_DOUBLE_INT_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_YNF, "ynf", BT_FN_FLOAT_INT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_YNL, "ynl", BT_FN_LONGDOUBLE_INT_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
-
+DEF_GCC_BUILTIN               (BUILT_IN_CRC8_DATA8, "crc8_data8", BT_FN_UINT8_UINT8_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC16_DATA8, "crc16_data8", BT_FN_UINT16_UINT16_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC16_DATA16, "crc16_data16", BT_FN_UINT16_UINT16_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC32_DATA8, "crc32_data8", BT_FN_UINT32_UINT32_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC32_DATA16, "crc32_data16", BT_FN_UINT32_UINT32_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC32_DATA32, "crc32_data32", BT_FN_UINT32_UINT32_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC64_DATA8, "crc64_data8", BT_FN_UINT64_UINT64_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC64_DATA16, "crc64_data16", BT_FN_UINT64_UINT64_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC64_DATA32, "crc64_data32", BT_FN_UINT64_UINT64_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_CRC64_DATA64, "crc64_data64", BT_FN_UINT64_UINT64_UINT64_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC8_DATA8, "rev_crc8_data8", BT_FN_UINT8_UINT8_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC16_DATA8, "rev_crc16_data8", BT_FN_UINT16_UINT16_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC16_DATA16, "rev_crc16_data16", BT_FN_UINT16_UINT16_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC32_DATA8, "rev_crc32_data8", BT_FN_UINT32_UINT32_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC32_DATA16, "rev_crc32_data16", BT_FN_UINT32_UINT32_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC32_DATA32, "rev_crc32_data32", BT_FN_UINT32_UINT32_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC64_DATA8, "rev_crc64_data8", BT_FN_UINT64_UINT64_UINT8_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC64_DATA16, "rev_crc64_data16", BT_FN_UINT64_UINT64_UINT16_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC64_DATA32, "rev_crc64_data32", BT_FN_UINT64_UINT64_UINT32_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN               (BUILT_IN_REV_CRC64_DATA64, "rev_crc64_data64", BT_FN_UINT64_UINT64_UINT64_CONST_SIZE, ATTR_CONST_NOTHROW_LEAF_LIST)
 /* Category: _Complex math builtins.  */
 DEF_C99_COMPL_BUILTIN        (BUILT_IN_CABS, "cabs", BT_FN_DOUBLE_COMPLEX_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_COMPL_BUILTIN        (BUILT_IN_CABSF, "cabsf", BT_FN_FLOAT_COMPLEX_FLOAT, ATTR_MATHFN_FPROUNDING)
diff --git a/gcc/builtins.h b/gcc/builtins.h
index 8d93f75a9a4..0094b9dcc3f 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -133,6 +133,9 @@ extern void expand_builtin_trap (void);
 extern void expand_ifn_atomic_bit_test_and (gcall *);
 extern void expand_ifn_atomic_compare_exchange (gcall *);
 extern void expand_ifn_atomic_op_fetch_cmp_0 (gcall *);
+extern rtx expand_builtin_crc_table_based (internal_fn, scalar_mode,
+                                          scalar_mode, machine_mode,
+                                          tree, rtx);
 extern rtx expand_builtin (tree, rtx, rtx, machine_mode, int);
 extern enum built_in_function builtin_mathfn_code (const_tree);
 extern tree fold_builtin_expect (location_t, tree, tree, tree, tree);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 106fa57addf..f0f007bb746 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16280,6 +16280,115 @@ Returns the openacc gang, worker or vector size depending on whether @var{x} is
 0, 1 or 2.
 @enddefbuiltin
 
+@defbuiltin{uint8_t __builtin_rev_crc8_data8 (uint8_t @var{crc}, uint8_t @var{data}, uint8_t @var{poly})}
+Returns the calculated 8-bit bit-reversed CRC using the initial CRC (8-bit),
+data (8-bit) and the polynomial (8-bit).
+@var{crc} is the initial CRC, @var{data} is the data and
+@var{poly} is the polynomial without leading 1.
+Table-based or clmul-based CRC may be used for the
+calculation, depending on the target architecture.
+@enddefbuiltin
+
+@defbuiltin{uint16_t __builtin_rev_crc16_data16 (uint16_t @var{crc}, uint16_t @var{data}, uint16_t @var{poly})}
+Similar to @code{__builtin_rev_crc8_data8}, except the argument and return types
+are 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint16_t __builtin_rev_crc16_data8 (uint16_t @var{crc}, uint8_t @var{data}, uint16_t @var{poly})}
+Similar to @code{__builtin_rev_crc16_data16}, except the @var{data} argument
+type is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_rev_crc32_data32 (uint32_t @var{crc}, uint32_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_rev_crc8_data8}, except the argument and return types
+are 32-bit and for the CRC calculation may be also used crc* machine instruction
+depending on the target and the polynomial.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_rev_crc32_data8 (uint32_t @var{crc}, uint8_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_rev_crc32_data32}, except the @var{data} argument
+type is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_rev_crc32_data16 (uint32_t @var{crc}, uint16_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_rev_crc32_data32}, except the @var{data} argument
+type is 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_rev_crc64_data64 (uint64_t @var{crc}, uint64_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_rev_crc8_data8}, except the argument and return types
+are 64-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_rev_crc64_data8 (uint64_t @var{crc}, uint8_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_rev_crc64_data64}, except the @var{data} argument type
+is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_rev_crc64_data16 (uint64_t @var{crc}, uint16_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_rev_crc64_data64}, except the @var{data} argument type
+is 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_rev_crc64_data32 (uint64_t @var{crc}, uint32_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_rev_crc64_data64}, except the @var{data} argument type
+is 32-bit.
+@enddefbuiltin
+
+@defbuiltin{uint8_t __builtin_crc8_data8 (uint8_t @var{crc}, uint8_t @var{data}, uint8_t @var{poly})}
+Returns the calculated 8-bit bit-forward CRC using the initial CRC (8-bit),
+data (8-bit) and the polynomial (8-bit).
+@var{crc} is the initial CRC, @var{data} is the data and
+@var{poly} is the polynomial without leading 1.
+Table-based or clmul-based CRC may be used for the
+calculation, depending on the target architecture.
+@enddefbuiltin
+
+@defbuiltin{uint16_t __builtin_crc16_data16 (uint16_t @var{crc}, uint16_t @var{data}, uint16_t @var{poly})}
+Similar to @code{__builtin_crc8_data8}, except the argument and return types
+are 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint16_t __builtin_crc16_data8 (uint16_t @var{crc}, uint8_t @var{data}, uint16_t @var{poly})}
+Similar to @code{__builtin_crc16_data16}, except the @var{data} argument type
+is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_crc32_data32 (uint32_t @var{crc}, uint32_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_crc8_data8}, except the argument and return types
+are 32-bit.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_crc32_data8 (uint32_t @var{crc}, uint8_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_crc32_data32}, except the @var{data} argument type
+is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint32_t __builtin_crc32_data16 (uint32_t @var{crc}, uint16_t @var{data}, uint32_t @var{poly})}
+Similar to @code{__builtin_crc32_data32}, except the @var{data} argument type
+is 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_crc64_data64 (uint64_t @var{crc}, uint64_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_crc8_data8}, except the argument and return types
+are 64-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_crc64_data8 (uint64_t @var{crc}, uint8_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_crc64_data64}, except the @var{data} argument type
+is 8-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_crc64_data16 (uint64_t @var{crc}, uint16_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_crc64_data64}, except the @var{data} argument type
+is 16-bit.
+@enddefbuiltin
+
+@defbuiltin{uint64_t __builtin_crc64_data32 (uint64_t @var{crc}, uint32_t @var{data}, uint64_t @var{poly})}
+Similar to @code{__builtin_crc64_data64}, except the @var{data} argument type
+is 32-bit.
+@enddefbuiltin
+
 @node Target Builtins
 @section Built-in Functions Specific to Particular Target Machines
 
diff --git a/gcc/testsuite/gcc.dg/crc-builtin-rev-target32.c b/gcc/testsuite/gcc.dg/crc-builtin-rev-target32.c
new file mode 100644
index 00000000000..c95704450cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-builtin-rev-target32.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
+
+#include <stdint-gcc.h>
+
+int8_t rev_crc8_data8 ()
+{
+  return __builtin_rev_crc8_data8 (0x34, 'a', 0x12);
+}
+
+int16_t rev_crc16_data8 ()
+{
+  return __builtin_rev_crc16_data8 (0x1234, 'a', 0x1021);
+}
+
+int16_t rev_crc16_data16 ()
+{
+  return __builtin_rev_crc16_data16 (0x1234, 0x3214, 0x1021);
+}
+
+int32_t rev_crc32_data8 ()
+{
+  return __builtin_rev_crc32_data8 (0xffffffff, 0x32, 0x4002123);
+}
+
+int32_t rev_crc32_data16 ()
+{
+  return __builtin_rev_crc32_data16 (0xffffffff, 0x3232, 0x4002123);
+}
+
+int32_t rev_crc32_data32 ()
+{
+  return __builtin_rev_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123);
+}
+
+/* { dg-final { scan-assembler "crc_table_for_crc_8_polynomial_0x12|mul"} } */
+/* { dg-final { scan-assembler "crc_table_for_crc_16_polynomial_0x1021|mul"} } */
+/* { dg-final { scan-assembler "crc_table_for_crc_32_polynomial_0x4002123|mul"} } */
diff --git a/gcc/testsuite/gcc.dg/crc-builtin-rev-target64.c b/gcc/testsuite/gcc.dg/crc-builtin-rev-target64.c
new file mode 100644
index 00000000000..74b511ccfbe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-builtin-rev-target64.c
@@ -0,0 +1,62 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-require-effective-target int32plus } */
+
+#include <stdint-gcc.h>
+
+int8_t rev_crc8_data8 ()
+{
+  return __builtin_rev_crc8_data8 (0x34, 'a', 0x12);
+}
+
+int16_t rev_crc16_data8 ()
+{
+  return __builtin_rev_crc16_data8 (0x1234, 'a', 0x1021);
+}
+
+int16_t rev_crc16_data16 ()
+{
+  return __builtin_rev_crc16_data16 (0x1234, 0x3214, 0x1021);
+}
+
+int32_t rev_crc32_data8 ()
+{
+  return __builtin_rev_crc32_data8 (0xffffffff, 0x32, 0x4002123);
+}
+
+int32_t rev_crc32_data16 ()
+{
+  return __builtin_rev_crc32_data16 (0xffffffff, 0x3232, 0x4002123);
+}
+
+int32_t rev_crc32_data32 ()
+{
+  return __builtin_rev_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123);
+}
+
+int64_t rev_crc64_data8 ()
+{
+  return __builtin_rev_crc64_data8 (0xffffffffffffffff, 0x32,
+                                   0x40021234002123);
+}
+
+int64_t rev_crc64_data16 ()
+{
+  return __builtin_rev_crc64_data16 (0xffffffffffffffff, 0x3232,
+                                    0x40021234002123);
+}
+
+int64_t rev_crc64_data32 ()
+{
+  return __builtin_rev_crc64_data32 (0xffffffffffffffff, 0x123546ff,
+                                    0x40021234002123);
+}
+
+int64_t rev_crc64_data64 ()
+{
+  return __builtin_rev_crc64_data64 (0xffffffffffffffff, 0x123546ff123546ff,
+                                    0x40021234002123);
+}
+
+/* { dg-final { scan-assembler "crc_table_for_crc_8_polynomial_0x12|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_16_polynomial_0x1021|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_32_polynomial_0x4002123|mul" } } */
diff --git a/gcc/testsuite/gcc.dg/crc-builtin-target32.c b/gcc/testsuite/gcc.dg/crc-builtin-target32.c
new file mode 100644
index 00000000000..f19ee74a071
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-builtin-target32.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int32plus } */
+
+#include <stdint-gcc.h>
+
+int8_t crc8_data8 ()
+{
+  return __builtin_crc8_data8 (0x34, 'a', 0x12);
+}
+
+int16_t crc16_data8 ()
+{
+  return __builtin_crc16_data8 (0x1234, 'a', 0x1021);
+}
+
+int16_t crc16_data16 ()
+{
+  return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021);
+}
+
+int32_t crc32_data8 ()
+{
+  return __builtin_crc32_data8 (0xffffffff, 0x32, 0x4002123);
+}
+
+int32_t crc32_data16 ()
+{
+  return __builtin_crc32_data16 (0xffffffff, 0x3232, 0x4002123);
+}
+
+int32_t crc32_data32 ()
+{
+  return __builtin_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123);
+}
+
+/* { dg-final { scan-assembler "crc_table_for_crc_8_polynomial_0x12|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_16_polynomial_0x1021|mul"} } */
+/* { dg-final { scan-assembler "crc_table_for_crc_32_polynomial_0x4002123|mul"} } */
diff --git a/gcc/testsuite/gcc.dg/crc-builtin-target64.c b/gcc/testsuite/gcc.dg/crc-builtin-target64.c
new file mode 100644
index 00000000000..1af403b4328
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-builtin-target64.c
@@ -0,0 +1,61 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-require-effective-target int32plus } */
+
+#include <stdint-gcc.h>
+
+int8_t crc8_data8 ()
+{
+  return __builtin_crc8_data8 (0x34, 'a', 0x12);
+}
+
+int16_t crc16_data8 ()
+{
+  return __builtin_crc16_data8 (0x1234, 'a', 0x1021);
+}
+
+int16_t crc16_data16 ()
+{
+  return __builtin_crc16_data16 (0x1234, 0x3214, 0x1021);
+}
+
+int32_t crc32_data8 ()
+{
+  return __builtin_crc32_data8 (0xffffffff, 0x32, 0x4002123);
+}
+
+int32_t crc32_data16 ()
+{
+  return __builtin_crc32_data16 (0xffffffff, 0x3232, 0x4002123);
+}
+
+int32_t crc32_data32 ()
+{
+  return __builtin_crc32_data32 (0xffffffff, 0x123546ff, 0x4002123);
+}
+
+int64_t crc64_data8 ()
+{
+  return __builtin_crc64_data8 (0xffffffffffffffff, 0x32, 0x40021234002123);
+}
+
+int64_t crc64_data16 ()
+{
+  return __builtin_crc64_data16 (0xffffffffffffffff, 0x3232, 0x40021234002123);
+}
+
+int64_t crc64_data32 ()
+{
+  return __builtin_crc64_data32 (0xffffffffffffffff, 0x123546ff,
+                                0x40021234002123);
+}
+
+int64_t crc64_data64 ()
+{
+  return __builtin_crc64_data64 (0xffffffffffffffff, 0x123546ff123546ff,
+                                0x40021234002123);
+}
+
+/* { dg-final { scan-assembler "crc_table_for_crc_8_polynomial_0x12|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_16_polynomial_0x1021|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_32_polynomial_0x4002123|mul" } } */
+/* { dg-final { scan-assembler "crc_table_for_crc_64_polynomial_0x40021234002123|mul" } } */

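For readers unfamiliar with the table-based approach, here is a minimal C sketch of the general technique for the 8-bit, bit-forward case: a 256-entry table of precomputed CRCs, indexed by the current CRC XORed with the data byte, with the polynomial given without its leading 1. The function names (crc8_bitwise, crc8_make_table, crc8_table) are hypothetical and used only for illustration. This is the textbook algorithm; it is not taken from the patch, and it is an assumption that GCC's expansion computes the same values.

```c
#include <stdint.h>

/* Reference implementation: process one data byte bit by bit,
   most significant bit first.  POLY omits the leading 1.  */
static uint8_t crc8_bitwise(uint8_t crc, uint8_t data, uint8_t poly)
{
    crc ^= data;
    for (int i = 0; i < 8; i++) {
        if (crc & 0x80)
            crc = (uint8_t)((crc << 1) ^ poly);
        else
            crc = (uint8_t)(crc << 1);
    }
    return crc;
}

/* Precompute the CRC of every possible byte value for POLY.  */
static void crc8_make_table(uint8_t table[256], uint8_t poly)
{
    for (int i = 0; i < 256; i++)
        table[i] = crc8_bitwise(0, (uint8_t)i, poly);
}

/* Table-based step: one lookup replaces the eight-iteration loop.  */
static uint8_t crc8_table(const uint8_t table[256], uint8_t crc, uint8_t data)
{
    return table[(uint8_t)(crc ^ data)];
}
```

For an 8-bit CRC the lookup is the whole step because the CRC register is exactly one byte wide; wider CRCs additionally shift the register and index the table with its top byte.
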