[gcc(refs/vendors/ibm/heads/gcc-16-future-float16)] Update ChangeLog.*

Michael Meissner via Gcc-cvs Mon, 10 Nov 2025 10:41:07 -0800

https://gcc.gnu.org/g:f11cf14ed97cc38192cc5961805b5707404eb32d


commit f11cf14ed97cc38192cc5961805b5707404eb32d
Author: Michael Meissner <[email protected]>
Date:   Mon Nov 10 13:40:50 2025 -0500

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.ibm | 802 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/REVISION      |   1 +
 2 files changed, 803 insertions(+)

diff --git a/gcc/ChangeLog.ibm b/gcc/ChangeLog.ibm
new file mode 100644
index 000000000000..0069b22009c3
--- /dev/null
+++ b/gcc/ChangeLog.ibm
@@ -0,0 +1,802 @@
+==================== Branch ibm/gcc-16-future-float16, patch #109 ====================
+
+Tell user if we have hardware support for 16-bit floating point.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+       __BFLOAT16_HW__ if we have hardware support for __bfloat16 conversions.
+       Define __FLOAT16_HW__ if we have hardware support for _Float16
+       conversions.
+
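A minimal sketch of how user code might test the two macros named in the entry above. The helper name `fp16_hw_mask` and the bit values are hypothetical and chosen only for illustration; on compilers that do not define these macros the function simply returns 0.

```c
/* Bit values are illustrative only; they are not part of GCC.  */
#define FP16_HW_FLOAT16   0x1
#define FP16_HW_BFLOAT16  0x2

/* Return a mask describing which 16-bit floating point conversions the
   compiler says have hardware support, based on the predefined macros
   __FLOAT16_HW__ and __BFLOAT16_HW__ described in the ChangeLog entry.  */
static int fp16_hw_mask(void)
{
    int mask = 0;

#ifdef __FLOAT16_HW__
    mask |= FP16_HW_FLOAT16;    /* _Float16 conversions in hardware */
#endif

#ifdef __BFLOAT16_HW__
    mask |= FP16_HW_BFLOAT16;   /* __bfloat16 conversions in hardware */
#endif

    return mask;
}
```

A caller could use the mask to pick between a 16-bit code path and a float fallback at build time or at startup.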
+==================== Branch ibm/gcc-16-future-float16, patch #108 ====================
+
+Add --with-powerpc-float16 and --with-powerpc-float16-disable-warning.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config.gcc (powerpc*-*-*): Add support for the configuration option
+       --with-powerpc-float16 and --with-powerpc-float16-disable-warning.
+       * config/rs6000/rs6000-call.cc (init_cumulative_args): Likewise.
+       (rs6000_function_arg): Likewise.
+       * config/rs6000/rs6000-cpus.def (TARGET_16BIT_FLOATING_POINT): Likewise.
+       (ISA_2_7_MASKS_SERVER): Likewise.
+       (POWERPC_MASKS): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #107 ====================
+
+Add 16-bit floating point vectorization.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config.gcc (powerpc*-*-*): Add float16.o.
+       * config/rs6000/float16.cc: New file to add 16-bit floating point
+       vectorization.
+       * config/rs6000/float16.md: (FP16_BINARY_OP): New mode iterator.
+       (fp16_names): New mode attribute.
+       (UNSPEC_XVCVSPHP_V8HF): New unspec.
+       (UNSPEC_XVCVSPBF16_V8BF): Likewise.
+       (<fp16_names><mode>): New insns to support vectorization of 16-bit
+       floating point.
+       (fma<mode>4): Likewise.
+       (fms<mode>4): Likewise.
+       (nfma<mode>): Likewise.
+       (nfms<mode>4): Likewise.
+       (vec_pack_trunc_v4sf_v8hf): Likewise.
+       (vec_pack_trunc_v4sf_v8bf): Likewise.
+       (vec_pack_trunc_v4sf): Likewise.
+       (xvcvsphp_v8hf): Likewise.
+       (xvcvspbf16_v8bf): Likewise.
+       (vec_unpacks_hi_v8hf): Likewise.
+       (vec_unpacks_lo_v8hf): Likewise.
+       (xvcvhpsp_v8hf): Likewise.
+       (vec_unpacks_hi_v8bf): Likewise.
+       (vec_unpacks_lo_v8bf): Likewise.
+       (xvcvbf16spn_v8bf): Likewise.
+       * config/rs6000/rs6000-protos.h (enum fp16_operation): New enumeration
+       for vectorizing 16-bit floating point.
+       (fp16_vectorization): New declaration.
+       * config/rs6000/t-rs6000 (float16.o): Add build rules.
+
+==================== Branch ibm/gcc-16-future-float16, patch #106 ====================
+
+Add BF/HF neg, abs operands and logical insns.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/float16.md (neg<mode>2): Add BFmode/HFmode negate,
+       absolute value and negative absolute value operations.  Add logical
+       insns operating on BFmode/HFmode.
+       (abs<mode>2): Likewise.
+       (nabs<mode>2): Likewise.
+       (and<mode>3): Likewise.
+       (ior<mode>): Likewise.
+       (xor<mode>3): Likewise.
+       (nor<mode>3): Likewise.
+       (andn<mode>3): Likewise.
+       (eqv<mode>3): Likewise.
+       (nand<mode>3): Likewise.
+       (iorn<mode>3): Likewise.
+       (bool<mode>3): Likewise.
+       (boolc<mode>3): Likewise.
+       (boolcc<mode>): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #105 ====================
+
+Add conversions between 16-bit floating point and other scalar modes.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/float16.md (fp16_float_convert): New mode iterator.
+       (extend<FP16_HW:mode><fp16_float_convert:mode>2): New insns to convert
+       between the 2 16-bit floating point modes and other floating point
+       scalars other than SFmode/DFmode by converting first to DFmode.
+       (trunc<fp16_float_convert:mode><FP16_HW:mode>2): Likewise.
+       (float<GPR:mode><FP16_HW:mode>2): New insns to convert between the 2
+       16-bit floating point modes and signed/unsigned integers.
+       (floatuns<GPR:mode><FP16_HW:mode>2): Likewise.
+       (fix_trunc<FP16_HW:mode><GPR:mode>): Likewise.
+       (fixuns_trunc<FP16_HW:mode><GPR:mode>2): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #104 ====================
+
+Add conversions between __bfloat16 and float/double.
+
+This patch provides conversions between __bfloat16 and float/double scalars on
+power10 and power11 systems.
+
+Unlike the support for _Float16, there is not a single instruction to convert
+between a __bfloat16 and float/double scalar value on the power10.
+
+Instead we have to use the vector conversion instructions.
+
+To convert a __bfloat16 scalar to a float/double scalar, GCC will generate:
+
+       lxsihzx     0,0,4       Load value into vector register
+       xxsldwi     0,0,0,1     Get the value into the upper 32-bits
+       xvcvbf16spn 0,0         Convert vector __bfloat16 to vector float
+       xscvspdpn   0,0         Convert memory float format to scalar
+
+To convert a scalar float/double to __bfloat16, GCC will generate:
+
+       xscvdpsp   0,0          Convert float scalar to float memory format
+       xvcvspbf16 0,0          Convert vector float to vector __bfloat16
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/float16.md (FP16_HW): Add BFmode.
+       (VFP16_HW): New mode iterator.
+       (cvt_fp16_to_v4sf_insn): New mode attribute.
+       (FP16_VECTOR4): Likewise.
+       (UNSPEC_FP16_SHIFT_LEFT_32BIT): New unspec constant.
+       (UNSPEC_CVT_FP16_TO_V4SF): Likewise.
+       (UNSPEC_XXSPLTW_FP16): Likewise.
+       (UNSPEC_XVCVSPBF16_BF): Likewise.
+       (extendbf<mode>2): New insns to convert between BFmode and
+       SFmode/DFmode.
+       (xscvdpspn_sf): Likewise.
+       (xscvspdpn_sf): Likewise.
+       (<fp16_vector8>_shift_left_32bit): Likewise.
+       (trunc<mode>bf): Likewise.
+       (vsx_xscvdpspn_sf): Likewise.
+       (cvt_fp16_to_v4sf_<mode>): Likewise.
+       (cvt_fp16_to_v4sf_<mode>_le): Likewise.
+       (cvt_fp16_to_v4sf_<mode>_be): Likewise.
+       (dup_<mode>_to_v4s): Likewise.
+       (xxspltw_<mode>): Likewise.
+       (xvcvbf16spn_bf): Likewise.
+       (xvcvspbf16_bf): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #103 ====================
+
+Add conversions between _Float16 and float/double.
+
+This patch adds support to generate xscvhpdp and xscvdphp on Power9 systems and
+later, to convert between _Float16 and float scalar values.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/float16.md (FP16_HW): New mode iterator.
+       (extendhf<mode>2): Add support converting between HFmode and
+       SFmode/DFmode if we are on power9 or later.
+       (trunc<mode>hf2): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #102 ====================
+
+Add HF/BF emulation functions to libgcc.
+
+This patch adds the necessary support in libgcc to allow using the machine
+independent 16-bit floating point support.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+libgcc/
+
+       * config.host (powerpc*-*-linux*): Add HF/BF emulation functions to
+       PowerPC libgcc.
+       * config/rs6000/sfp-machine.h (_FP_NANFRAC_H): New macro.
+       (_FP_NANFRAC_B): Likewise.
+       (_FP_NANSIGN_H): Likewise.
+       (_FP_NANSIGN_B): Likewise.
+       (DFtype2): Add HF/BF emulation function declarations.
+       (SFtype2): Likewise.
+       (DItype2): Likewise.
+       (UDItype2): Likewise.
+       (SItype2): Likewise.
+       (USItype2): Likewise.
+       (HFtype2): Likewise.
+       (__eqhf2): Likewise.
+       (__extendhfdf2): Likewise.
+       (__extendhfsf2): Likewise.
+       (__fixhfdi): Likewise.
+       (__fixhfsi): Likewise.
+       (__fixunshfdi): Likewise.
+       (__fixunshfsi): Likewise.
+       (__floatdihf): Likewise.
+       (__floatsihf): Likewise.
+       (__floatundihf): Likewise.
+       (__floatunsihf): Likewise.
+       (__truncdfhf2): Likewise.
+       (__truncsfhf2): Likewise.
+       (BFtype2): Likewise.
+       (__extendbfsf2): Likewise.
+       (__floatdibf): Likewise.
+       (__floatsibf): Likewise.
+       (__floatundibf): Likewise.
+       (__floatunsibf): Likewise.
+       (__truncdfbf2): Likewise.
+       (__truncsfbf2): Likewise.
+       (__truncbfhf2): Likewise.
+       (__trunchfbf2): Likewise.
+       * config/rs6000/t-float16: New file.
+       * configure.ac (powerpc*-*-linux*): Check if the PowerPC compiler
+       supports _Float16 and __bfloat16 types.
+       * configure: Regenerate.
+
+==================== Branch ibm/gcc-16-future-float16, patch #101 ====================
+
+Add initial 16-bit floating point support.
+
+This patch adds the initial support for the 16-bit floating point formats.
+_Float16 is the IEEE 754 half precision format.  __bfloat16 is the Google Brain
+16-bit format.
+
+In order to use both _Float16 and __bfloat16, the user has to use the -mfloat16
+option to enable the support.
+
+In this patch only the machine independent support is used.  In order to be
+usable, the next patch will also need to be installed.  That patch will add
+support in libgcc for 16-bit floating point.
+
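As a hedged illustration of the paragraph above, the following sketch selects _Float16 only when the compiler defines __FLOAT16__ (the macro this patch's ChangeLog says is defined when 16-bit floating point is enabled) and otherwise falls back to float. The names `half_t` and `add_as_half` are hypothetical and exist only for this example.

```c
/* Use _Float16 when the PowerPC compiler reports that 16-bit floating
   point is enabled (for example via -mfloat16 on this branch); otherwise
   fall back to float so the example still builds elsewhere.  */
#ifdef __FLOAT16__
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* Narrow both inputs to half_t, add them, and widen the result back to
   float.  Without hardware support the conversions may be done by the
   libgcc emulation functions rather than by single instructions.  */
static float add_as_half(float a, float b)
{
    half_t x = (half_t) a;
    half_t y = (half_t) b;

    return (float) (x + y);
}
```

Small integer values such as 1.0 and 2.0 are exactly representable in either type, so the result does not depend on which branch of the #ifdef is taken.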
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/float16.md: New file to add basic 16-bit floating point
+       support.
+       * config/rs6000/predicates.md (easy_fp_constant): Add support for HFmode
+       and BFmode constants.
+       (fp16_xxspltiw_constant): New predicate.
+       * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add support for
+       16-bit floating point types.
+       (rs6000_init_builtins): Create the bfloat16_type_node.
+       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+       __FLOAT16__ and __BFLOAT16__ if 16-bit floating point is enabled.
+       * config/rs6000/rs6000-call.cc (init_cumulative_args): Warn if a
+       function returns a 16-bit floating point value unless -Wno-psabi is
+       used.
+       (rs6000_function_arg): Warn if a 16-bit floating point value is passed
+       to a function unless -Wno-psabi is used.
+       * config/rs6000/rs6000-protos.h (vec_const_128bit_type): Add mode field
+       to detect initializing 16-bit floating constants.
+       * config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached): Add
+       support for 16-bit floating point.
+       (rs6000_modes_tieable_p): Don't allow 16-bit floating point modes to tie
+       with other modes.
+       (rs6000_debug_reg_global): Add BFmode and HFmode.
+       (rs6000_setup_reg_addr_masks): Add support for 16-bit floating point
+       types.
+       (rs6000_setup_reg_addr_masks): Likewise.
+       (rs6000_init_hard_regno_mode_ok): Likewise.
+       (rs6000_option_override_internal): Add a check whether -mfloat16 can be
+       used.
+       (easy_altivec_constant): Add support for 16-bit floating point.
+       (xxspltib_constant_p): Likewise.
+       (rs6000_expand_vector_init): Likewise.
+       (reg_offset_addressing_ok_p): Likewise.
+       (rs6000_legitimate_offset_address_p): Likewise.
+       (legitimate_lo_sum_address_p): Likewise.
+       (rs6000_secondary_reload_simple_move): Likewise.
+       (rs6000_preferred_reload_class): Likewise.
+       (rs6000_can_change_mode_class): Likewise.
+       (rs6000_load_constant_and_splat): Likewise.
+       (rs6000_scalar_mode_supported_p): Likewise.
+       (rs6000_libgcc_floating_mode_supported_p): Return true for HFmode and
+       BFmode if -mfloat16.
+       (rs6000_floatn_mode): Enable _Float16 if -mfloat16.
+       (rs6000_opt_masks): Add -mfloat16.
+       (constant_fp_to_128bit_vector): Add support for 16-bit floating point.
+       (vec_const_128bit_to_bytes): Likewise.
+       (constant_generates_xxspltiw): Likewise.
+       * config/rs6000/rs6000.h (TARGET_BFLOAT16_HW): New macro.
+       (TARGET_FLOAT16_HW): Likewise.
+       (TARGET_BFLOAT16_HW_VECTOR): Likewise.
+       (TARGET_FLOAT16_HW_VECTOR): Likewise.
+       (FP16_SCALAR_MODE_P): Likewise.
+       (FP16_HW_SCALAR_MODE_P): Likewise.
+       (FP16_VECTOR_MODE_P): Likewise.
+       * config/rs6000/rs6000.md (wd): Add BFmode and HFmode.
+       * config/rs6000/rs6000.opt (-mfloat16): New option.
+       * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mfloat16.
+
+==================== Branch ibm/gcc-16-future-float16, patch #100 ====================
+
+Add infrastructure for _Float16 and __bfloat16 types.
+
+This patch adds the infrastructure for adding 16-bit floating point types in the
+next patch.  Two new types will be added:
+
+_Float16 (HFmode):
+==================
+
+This is the IEEE 754-2008 16-bit floating point.  It has 1 sign bit, 5
+exponent bits, 10 explicit mantissa bits (the 11th bit is implied with
+normalization).
+
+The PowerPC ISA 3.0 (power9) has instructions to convert between the
+scalar representations of _Float16 and float types.  The PowerPC ISA
+3.1 (power10 and power11) has instructions for converting between the
+even elements of _Float16 vectors and float vectors.  In addition, the
+MMA subsystem has support for _Float16 vector processing.
+
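The bit layout described above (1 sign bit, 5 exponent bits, 10 explicit mantissa bits) can be sketched as a portable software decoder. This is only an illustration of the format, not the code GCC or libgcc uses; the function names are hypothetical, and the exponent bias of 15 comes from the IEEE 754 binary16 definition rather than from this patch.

```c
#include <math.h>
#include <stdint.h>

/* Multiply x by 2**exp2 using exact power-of-two steps.  */
static float scale_by_pow2(float x, int exp2)
{
    for (; exp2 > 0; exp2--)
        x *= 2.0f;
    for (; exp2 < 0; exp2++)
        x *= 0.5f;
    return x;
}

/* Decode the raw bits of an IEEE 754 half precision value into a float.  */
static float half_bits_to_float(uint16_t h)
{
    int sign = (h >> 15) & 0x1;         /* 1 sign bit */
    int exponent = (h >> 10) & 0x1f;    /* 5 exponent bits, bias 15 */
    int mantissa = h & 0x3ff;           /* 10 explicit mantissa bits */
    float value;

    if (exponent == 0)
        /* Zero or subnormal: there is no implied leading bit.  */
        value = scale_by_pow2((float) mantissa, -24);
    else if (exponent == 31)
        /* All-ones exponent encodes infinity or NaN.  */
        value = mantissa ? NAN : INFINITY;
    else
        /* Normal: restore the implied 11th bit and remove the bias
           (15) together with the 10 fraction bits.  */
        value = scale_by_pow2((float) (mantissa | 0x400), exponent - 25);

    return sign ? -value : value;
}
```

Every finite half precision value is exactly representable as a float, so this decoder does not need to round.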
+
+__bfloat16 (BFmode):
+====================
+
+This is the brain 16-bit floating point created by the Google Brain
+project.  It has 1 sign bit, 8 exponent bits, 7 explicit mantissa bits
+(the 8th bit is implied with normalization).  The 16 bits in the
+__bfloat16 format are the same as the upper 16 bits in the normal IEEE
+754 32-bit floating point format.
+
+The PowerPC ISA 3.1 (power10 and power11) has instructions for
+converting between the even elements of __bfloat16 vectors and float
+vectors.  In addition, the MMA subsystem has support for __bfloat16
+vector processing.
+
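Because the __bfloat16 bits match the upper 16 bits of a 32-bit float, the relationship can be sketched with plain integer shifts. The function names below are hypothetical. The narrowing direction here simply truncates the low 16 bits, which is an assumption made for simplicity; the hardware conversion instructions and libgcc may round instead.

```c
#include <stdint.h>
#include <string.h>

/* Keep only the upper 16 bits of a float (sign, 8 exponent bits, and
   the top 7 mantissa bits).  This truncates rather than rounds.  */
static uint16_t float_to_bfloat16_bits_truncate(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);     /* well-defined type punning */
    return (uint16_t) (bits >> 16);
}

/* Widen bfloat16 bits to float by placing them in the upper half of a
   32-bit word and zero filling the lower half.  This step is exact.  */
static float bfloat16_bits_to_float(uint16_t b)
{
    uint32_t bits = (uint32_t) b << 16;
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The widening step mirrors the idea behind moving the 16-bit value into the upper half of a 32-bit element before treating it as a float.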
+
+This patch adds new modes that will be used in the future.  The
+V8HFmode and V8BFmode modes are treated as normal vector modes.
+
+This patch does not add loads and stores for BFmode and HFmode.  These
+will be added in the next patch.
+
+    BFmode   -- 16-bit mode for __bfloat16 support
+    HFmode   -- 16-bit mode for _Float16 support
+    V8BFmode -- 128-bit vector mode __bfloat16
+    V8HFmode -- 128-bit vector mode _Float16
+    V4BFmode -- 64-bit vector mode __bfloat16 used in some insns
+    V4HFmode -- 64-bit vector mode _Float16 used in some insns
+
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/altivec.md (VM): Add support for V8HFmode and
+       V8BFmode.
+       (VM2): Likewise.
+       (VI_char): Likewise.
+       (VI_scalar): Likewise.
+       (VI_unit): Likewise.
+       (VP_small): Likewise.
+       (VP_small_lc): Likewise.
+       (VU_char): Likewise.
+       * config/rs6000/rs6000-modes.def (HFmode): Add new mode.
+       (BFmode): Likewise.
+       (V8BFmode): Likewise.
+       (V8HFmode): Likewise.
+       * config/rs6000/rs6000-p8swap.cc (rs6000_gen_stvx): Remove #ifdef for
+       HAVE_V8HFmode.  Add support for V8BFmode.
+       (rs6000_gen_lvx): Likewise.
+       (replace_swapped_load_constant): Likewise.
+       * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Add support for
+       V8HFmode and V8BFmode.
+       (rs6000_init_hard_regno_mode_ok): Likewise.
+       (output_vec_const_move): Likewise.
+       (rs6000_expand_vector_init): Likewise.
+       (reg_offset_addressing_ok_p): Likewise.
+       (rs6000_const_vec): Likewise.
+       (rs6000_emit_move): Likewise.
+       * config/rs6000/rs6000.h (ALTIVEC_VECTOR_MODE): Likewise.
+       * config/rs6000/rs6000.md (FMOVE128_GPR): Likewise.
+       (wd): Likewise.
+       (du_or_d): Likewise.
+       (BOOL_128): Likewise.
+       (BOOL_REGS_OUTPUT): Likewise.
+       (BOOL_REGS_OP1): Likewise.
+       (BOOL_REGS_OP2): Likewise.
+       (BOOL_REGS_UNARY): Likewise.
+       (RELOAD): Likewise.
+       * config/rs6000/vector.md (VEC_L): Likewise.
+       (VEC_M): Likewise.
+       (VEC_E): Likewise.
+       (VEC_base): Likewise.
+       (VEC_base_l): Likewise.
+       * config/rs6000/vsx.md (VECTOR_16BIT): New mode iterator.
+       (VSX_L): Add support for V8HFmode and V8BFmode.
+       (VSX_M): Likewise.
+       (VSX_XXBR): Likewise.
+       (VSm): Likewise.
+       (VSr): Likewise.
+       (VSisa): Likewise.
+       (??r): Likewise.
+       (nW): Likewise.
+       (VSc): Likewise.
+       (VM3): Likewise.
+       (VM3_char): Likewise.
+       (vsx_le_perm_load_<mode>): Rename from vsx_le_perm_load_v8hi and add
+       V8HFmode and V8BFmode.
+       (vsx_le_perm_store_<mode>): Rename from vsx_le_perm_store_v8hi and add
+       V8HFmode and V8BFmode.
+       (splitter for vsx_le_perm_store_<mode>): Likewise.
+       (vsx_ld_elemrev_<mode>): Rename from vsx_ld_elemrev_v8hi and add
+       V8HFmode and V8BFmode support.
+       (vsx_ld_elemrev_<mode>_internal): Rename from
+       vsx_ld_elemrev_v8hi_internal and add V8HFmode and V8BFmode support.
+       (vsx_st_elemrev_<mode>): Rename from vsx_st_elemrev_v8hi and add
+       V8HFmode and V8BFmode support.
+       (vsx_st_elemrev_<mode>_internal): Rename from
+       vsx_st_elemrev_v8hi_internal and add V8HFmode and V8BFmode support.
+       (xxswapd_<mode>): Rename from xxswapd_v8hi and add V8HFmode and V8BFmode
+       support.
+       (vsx_lxvd2x8_le_<MODE>): Rename from vsx_lxvd2x8_le_V8HI and add
+       V8HFmode and V8BFmode support.
+       (vsx_stxvd2x8_le_<MODE>): Rename from vsx_stxvd2x8_le_V8HI and add
+       V8HFmode and V8BFmode support.
+       (vsx_extract_<mode>_store_p9): Add V8HFmode and V8BFmode.
+       (vsx_extract_<mode>_p8): Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, patch #5 ====================
+
+Use vector pair load/store for memcpy with -mcpu=future
+
+In the development for the power10 processor, GCC did not enable using the load
+vector pair and store vector pair instructions when optimizing things like
+memory copy.  This patch enables using those instructions if -mcpu=future is
+used.
+
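For illustration, a fixed-size copy like the one below is the kind of memory copy that a compiler may expand inline instead of calling memcpy. Whether GCC actually emits load vector pair and store vector pair instructions for it depends on the options in effect (for example -mcpu=future on this branch) and on the optimization level; the struct name, the function name, and the 64-byte size are assumptions chosen only for this sketch.

```c
#include <string.h>

/* A 64-byte block; the size is arbitrary and only meant to be large
   enough for an inline block move to be worthwhile.  */
struct block {
    unsigned char bytes[64];
};

/* Copy one block.  With a constant size, the compiler is free to expand
   the memcpy call into a short sequence of vector loads and stores.  */
static void copy_block(struct block *dst, const struct block *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

Inspecting the generated assembly (for example with -S) is the way to check which instructions were chosen for a given set of options.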
+This patch assumes that the following previously posted patches have been
+installed:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2025-November/699956.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2025-November/699977.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2025-November/699978.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2025-November/699979.html
+
+I have tested these patches on both big endian and little endian PowerPC
+servers, with no regressions.  Can I check these patches into the trunk?
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Enable using load
+       vector pair and store vector pair instructions for memory copy
+       operations.
+       (POWERPC_MASKS): Make the option for enabling using load vector pair and
+       store vector pair operations set and reset when the PowerPC processor is
+       changed.
+       * config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
+       -mblock-ops-vector-pair from influencing .machine selection.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/future-3.c: New test.
+
+==================== Branch ibm/gcc-16-future-float16, patch #4 ====================
+
+Add support for 1,024 bit DMF registers.
+
+This patch is a preliminary patch to add the full 1,024 bit dense math registers
+(DMRs) for -mcpu=future.  The MMA 512-bit accumulators map onto the top of the
+DMR register.
+
+This patch only adds the new 1,024 bit register support.  It does not add
+support for any instructions that need 1,024 bit registers instead of 512 bit
+registers.
+
+I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
+registers.  The 'wD' constraint added in previous patches is used for these
+registers.  I added support to do load and store of DMRs via the VSX registers,
+since there are no load/store dense math instructions.  I added the new keyword
+'__dmf' to create 1,024 bit types that can be loaded into DMRs.  At present, I
+don't have aliases for __dmf512 and __dmf1024 that we've discussed internally.
+
+The patches have been tested on both little and big endian systems.  Can I check
+it into the master branch?
+
+2025-11-10   Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec.
+       (UNSPEC_DM_INSERT512_LOWER): Likewise.
+       (UNSPEC_DM_EXTRACT512): Likewise.
+       (UNSPEC_DMF_RELOAD_FROM_MEMORY): Likewise.
+       (UNSPEC_DMF_RELOAD_TO_MEMORY): Likewise.
+       (movtdo): New define_expand and define_insn_and_split to implement 1,024
+       bit DMR registers.
+       (movtdo_insert512_upper): New insn.
+       (movtdo_insert512_lower): Likewise.
+       (movtdo_extract512): Likewise.
+       (reload_dmf_from_memory): Likewise.
+       (reload_dmf_to_memory): Likewise.
+       * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMF
+       support.
+       (rs6000_init_builtins): Add support for __dmf keyword.
+       * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
+       for TDOmode.
+       (rs6000_function_arg): Likewise.
+       * config/rs6000/rs6000-modes.def (TDOmode): New mode.
+       * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add
+       support for TDOmode.
+       (rs6000_hard_regno_mode_ok_uncached): Likewise.
+       (rs6000_hard_regno_mode_ok): Likewise.
+       (rs6000_modes_tieable_p): Likewise.
+       (rs6000_debug_reg_global): Likewise.
+       (rs6000_setup_reg_addr_masks): Likewise.
+       (rs6000_init_hard_regno_mode_ok): Add support for TDOmode.  Setup reload
+       hooks for DMF mode.
+       (reg_offset_addressing_ok_p): Add support for TDOmode.
+       (rs6000_emit_move): Likewise.
+       (rs6000_secondary_reload_simple_move): Likewise.
+       (rs6000_preferred_reload_class): Likewise.
+       (rs6000_secondary_reload_class): Likewise.
+       (rs6000_mangle_type): Add mangling for __dmf type.
+       (rs6000_dmf_register_move_cost): Add support for TDOmode.
+       (rs6000_split_multireg_move): Likewise.
+       (rs6000_invalid_conversion): Likewise.
+       * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
+       (enum rs6000_builtin_type_index): Add DMF type nodes.
+       (dmf_type_node): Likewise.
+       (ptr_dmf_type_node): Likewise.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/dm-1024bit.c: New test.
+       * lib/target-supports.exp (check_effective_target_ppc_dmf_ok): New
+       target test.
+
+==================== Branch ibm/gcc-16-future-float16, patch #3 ====================
+
+Add support for dense math registers.
+
+The MMA subsystem added the notion of accumulator registers as an optional
+feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
+the VSX registers 0..31, but logically the accumulator registers were separate
+from the FPR registers.  In ISA 3.1, it was anticipated that in future systems,
+the accumulator registers may not overlap with the FPR registers.  This patch
+adds the support for dense math registers as separate registers.
+
+This particular patch does not change the MMA support to use the accumulators
+within the dense math registers.  This patch just adds the basic support for
+having separate DMRs.  The next patch will switch the MMA support to use the
+accumulators if -mcpu=future is used.
+
+For testing purposes, I added an undocumented option '-mdense-math' to enable
+or disable the dense math support.
+
+This patch updates the wD constraint added in the previous patch.  If MMA is
+selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint
+will allow access to accumulators that overlap with VSX registers 0..31.  If
+both MMA and dense math are selected (i.e. -mcpu=future), the wD constraint
+will only allow dense math registers.
+
+This patch modifies the existing %A output modifier.  If MMA is selected but
+dense math is not selected, then %A output modifier converts the VSX register
+number to the accumulator number, by dividing it by 4.  If both MMA and dense
+math are selected, then %A will map the separate DMF registers into 0..7.
+
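The non-dense-math case of the %A modifier described above amounts to a small piece of arithmetic, sketched here outside of GCC. The function name is hypothetical, and the checks that the register number lies in 0..31 and is a multiple of 4 are assumptions of this sketch (based on each accumulator overlapping 4 adjacent registers), not a description of what print_operand does.

```c
/* Map the number of the first VSX register of an overlapping accumulator
   to the accumulator number, as the %A modifier does when MMA is
   selected without dense math.  Returns -1 for inputs this sketch
   treats as invalid.  */
static int vsx_regno_to_accumulator(int vsx_regno)
{
    /* Accumulators overlap VSX registers 0..31 in groups of 4.  */
    if (vsx_regno < 0 || vsx_regno > 31 || vsx_regno % 4 != 0)
        return -1;

    return vsx_regno / 4;   /* yields accumulator numbers 0..7 */
}
```

With dense math registers the mapping is different, since the DMF registers are separate and are numbered 0..7 directly.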
+The intention is that user code using extended asm can be modified to run on
+both MMA without dense math and MMA with dense math:
+
+    1) If possible, don't use extended asm, but instead use the MMA built-in
+       functions;
+
+    2) If you do need to write extended asm, change the d constraints
+       targeting accumulators to wD;
+
+    3) Only use the built-in zero, assemble and disassemble functions to
+       move data between vector quad types and dense math accumulators.
+       I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the
+       extended asm code.  The reason is these instructions assume there is a
+       1-to-1 correspondence between 4 adjacent FPR registers and an
+       accumulator that overlaps with those registers.  With accumulators
+       now being separate registers, there no longer is a 1-to-1
+       correspondence.
+
+It is possible that the mangling for DMFs and the GDB register numbers may
+produce other changes in the future.
+
+gcc/
+
+2025-11-10   Michael Meissner  <[email protected]>
+
+       * config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec.
+       (movxo): Add comments about dense math registers.
+       (movxo_nodm): Rename from movxo and restrict the usage to machines
+       without dense math registers.
+       (movxo_dm): New insn for movxo support for machines with dense math
+       registers.
+       (mma_<acc>): Restrict usage to machines without dense math registers.
+       (mma_xxsetaccz): Add a define_expand wrapper, and add support for dense
+       math registers.
+       (mma_dmsetaccz): New insn.
+       * config/rs6000/predicates.md (dmf_operand): New predicate.
+       (accumulator_operand): Add support for dense math registers.
+       * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
+       not issue a de-prime instruction when disassembling a vector quad on a
+       system with dense math registers.
+       * config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define
+       __DENSE_MATH__ if we have dense math registers.
+       * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Add -mdense-math.
+       (POWERPC_MASKS): Likewise.
+       * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMF_REG_TYPE.
+       (enum rs6000_reload_reg_type): Add RELOAD_REG_DMF.
+       (LAST_RELOAD_REG_CLASS): Add support for DMF registers and the wD
+       constraint.
+       (reload_reg_map): Likewise.
+       (rs6000_reg_names): Likewise.
+       (alt_reg_names): Likewise.
+       (rs6000_hard_regno_nregs_internal): Likewise.
+       (rs6000_hard_regno_mode_ok_uncached): Likewise.
+       (rs6000_debug_reg_global): Likewise.
+       (rs6000_setup_reg_addr_masks): Likewise.
+       (rs6000_init_hard_regno_mode_ok): Likewise.
+       (rs6000_option_override_internal): If -mdense-math, issue an error if
+       -mno-mma or not -mcpu=future.
+       (rs6000_secondary_reload_memory): Add support for DMF registers.
+       (rs6000_secondary_reload_simple_move): Likewise.
+       (rs6000_preferred_reload_class): Likewise.
+       (rs6000_secondary_reload_class): Likewise.
+       (print_operand): Make %A handle both FPRs and DMRs.
+       (rs6000_dmf_register_move_cost): New helper function.
+       (rs6000_register_move_cost): Add support for DMR registers.
+       (rs6000_memory_move_cost): Likewise.
+       (rs6000_compute_pressure_classes): Likewise.
+       (rs6000_debugger_regno): Likewise.
+       (rs6000_opt_masks): Add -mdense-math support.
+       (rs6000_split_multireg_move): Add support for DMRs.
+       * config/rs6000/rs6000.h (TARGET_MMA_NO_DENSE_MATH): New macro.
+       (UNITS_PER_DMF_WORD): Likewise.
+       (FIRST_PSEUDO_REGISTER): Update for DMRs.
+       (FIXED_REGISTERS): Add DMRs.
+       (CALL_REALLY_USED_REGISTERS): Likewise.
+       (REG_ALLOC_ORDER): Likewise.
+       (DMF_REGNO_P): New macro.
+       (enum reg_class): Add DM_REGS.
+       (REG_CLASS_NAMES): Likewise.
+       (REG_CLASS_CONTENTS): Likewise.
+       (enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD.
+       (REGISTER_NAMES): Add DMF registers.
+       (ADDITIONAL_REGISTER_NAMES): Likewise.
+       * config/rs6000/rs6000.md (FIRST_DMF_REGNO): New constant.
+       (LAST_DMF_REGNO): Likewise.
+       * config/rs6000/rs6000.opt (-mdense-math): New option.
+
+==================== Branch ibm/gcc-16-future-float16, patch #2 ====================
+
+Add wD constraint.
+
+This patch adds a new constraint ('wD') that matches the accumulator registers
+that overlap with VSX registers 0..31 on power10.  Future patches will add the
+support for a separate accumulator register class that will be used when the
+support for dense math registers is added.
+
+2025-11-10   Michael Meissner  <[email protected]>
+
+       * config/rs6000/constraints.md (wD): New constraint.
+       * config/rs6000/mma.md (mma_<acc>): Prepare for alternate accumulator
+       registers.  Use wD constraint instead of 'd' constraint.  Use
+       accumulator_operand instead of fpr_reg_operand.
+       (mma_<vv>): Likewise.
+       (mma_<avv>): Likewise.
+       (mma_<pv>): Likewise.
+       (mma_<apv>): Likewise.
+       (mma_<vvi4i4i8>): Likewise.
+       (mma_<avvi4i4i8>): Likewise.
+       (mma_<vvi4i4i2>): Likewise.
+       (mma_<avvi4i4i2>): Likewise.
+       (mma_<vvi4i4>): Likewise.
+       (mma_<avvi4i4>): Likewise.
+       (mma_<pvi4i2>): Likewise.
+       (mma_<apvi4i2>): Likewise.
+       (mma_<vvi4i4i4>): Likewise.
+       (mma_<avvi4i4i4>): Likewise.
+       * config/rs6000/predicates.md (accumulator_operand): New predicate.
+       * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register
+       class for the 'wD' constraint.
+       (rs6000_init_hard_regno_mode_ok): Set up the 'wD' register constraint
+       class.
+       * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for
+       the 'wD' constraint.
+       * doc/md.texi (PowerPC constraints): Document the 'wD' constraint.
+
+==================== Branch ibm/gcc-16-future-float16, patch #1 ====================
+
+Add -mcpu=future option to the PowerPC.
+
+I originally made a more complicated patch (V5) on September 22nd, 2025 that
+tried to do infrastructure cleanup as well as adding -mcpu=future.  This patch
+is a more limited patch in that it just adds the -mcpu=future option, and it
+does not do the other infrastructure work.
+
+I submitted the V6 patch on November 6th.  However, in that patch, I
+forgot to add code to set the .machine directive to "future" if the
+user did -mcpu=future.  This patch fixes this.
+
+This patch just adds support for -mcpu=future.  Thanks to a question
+from Surya Kumari Jangal, I figured out a new method to do this patch.
+
+In the past, we would always add a new ISA flag for the cpu
+(i.e. -mpower11).  But this means the user could potentially use
+-mfuture instead of -mcpu=future.  To discourage this, we would then
+add a warning not to use the -m<xxx> option.
+
+This patch now uses a separate variable (TARGET_FUTURE) that is set
+separately when the cpu type is set.  This way we don't have to create
+a new dummy ISA option.
+
+The changes in this patch include:
+
+  * The TARGET_FUTURE variable is set in the initial cpu setup.
+
+  * It is stored and restored as part of the target attribute and target
+    pragma support.
+
+  * The internal debug option -mdebug=reg now prints whether the TARGET_FUTURE
+    field is set.
+
+  * The macro _ARCH_FUTURE is defined if the user used -mcpu=future.
+
+  * I added 2 tests to make sure -mcpu=future works.
+
+  * If the user uses -mcpu=future, -mfuture is passed to the assembler.
+
+  * I added support so that if the configuration option --with-cpu=future is
+    used, it will set the default cpu type.
+
+Can I check this patch into the GCC trunk?  I have built bootstrap
+builds on both a little endian Power10 system and a big endian Power9
+system and there were no regressions.  On the little endian Power10
+system, I built the last run using the --with-cpu=future configuration
+option.
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * config.gcc (powerpc*-*-*): Add support for --with-cpu=future.
+       * config/rs6000/aix71.h (ASM_CPU_SPEC): Pass -mfuture to the assembler
+       if -mcpu=future is used.
+       * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
+       * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
+       * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
+       _ARCH_FUTURE if -mcpu=future was used.
+       (rs6000_cpu_cpp_builtins): Likewise.
+       * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
+       (future cpu): Add support for -mcpu=future.
+       * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Define to be power11.
+       * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros): Add
+       -mcpu=future support.
+       (rs6000_target_modify_macros_ptr): Likewise.
+       * config/rs6000/rs6000-tables.opt: Regenerate.
+       * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Add
+       -mcpu=future support.
+       (rs6000_debug_reg_global): Likewise.
+       (rs6000_option_override_internal): Likewise.
+       (rs6000_machine_from_flags): Likewise.
+       (rs6000_pragma_target_parse): Likewise.
+       (rs6000_function_specific_save): Likewise.
+       (rs6000_function_specific_restore): Likewise.
+       (rs6000_function_specific_print): Likewise.
+       (rs6000_print_options_internal): Likewise.
+       (rs6000_print_isa_options): Likewise.
+       * config/rs6000/rs6000.h (ASM_CPU_SPEC): Pass -mfuture to the assembler
+       if -mcpu=future is used.
+       (TARGET_FUTURE): New macro.
+       * config/rs6000/rs6000.opt (TARGET_FUTURE): New target variable.
+       (x_TARGET_FUTURE): Likewise.
+       * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=future.
+
+gcc/testsuite/
+
+       * gcc.target/powerpc/future-1.c: New test.
+       * gcc.target/powerpc/future-2.c: Likewise.
+
+==================== Branch ibm/gcc-16-future-float16, baseline ====================
+
+2025-11-10  Michael Meissner  <[email protected]>
+
+gcc/
+
+       * ChangeLog.test: New file for branch.
+       * REVISION: Update.
+
+       Clone branch
diff --git a/gcc/REVISION b/gcc/REVISION
new file mode 100644
index 000000000000..d784a9c017db
--- /dev/null
+++ b/gcc/REVISION
@@ -0,0 +1 @@
+ibm-gcc-16-future-float16 branch

[gcc(refs/vendors/ibm/heads/gcc-16-future-float16)] Update ChangeLog.*
