https://gcc.gnu.org/g:4a8535b80ade521daab4a2f87c690e5ed68239ec
commit 4a8535b80ade521daab4a2f87c690e5ed68239ec Author: Michael Meissner <[email protected]> Date: Wed Nov 12 17:51:57 2025 -0500 Update ChangeLog.* Diff: --- gcc/ChangeLog.ibm | 730 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 730 insertions(+) diff --git a/gcc/ChangeLog.ibm b/gcc/ChangeLog.ibm index 08975ad27752..f0462ec745f4 100644 --- a/gcc/ChangeLog.ibm +++ b/gcc/ChangeLog.ibm @@ -1,3 +1,733 @@ +==================== Branch ibm/gcc-16-future-float16, patch #208 ==================== + +Add --with-powerpc-float16 and --with-powerpc-float16-disable-warning. + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config.gcc (powerpc*-*-*): Add support for the configuration option + --with-powerpc-float16 and --with-powerpc-float16-disable-warning. + * config/rs6000/rs6000-call.cc (init_cumulative_args): Likewise. + (rs6000_function_arg): Likewise. + * config/rs6000/rs6000-cpus.def (TARGET_16BIT_FLOATING_POINT): Likewise. + (ISA_2_7_MASKS_SERVER): Likewise. + (POWERPC_MASKS): Likewise. + +==================== Branch ibm/gcc-16-future-float16, patch #407 ==================== + +Add 16-bit floating point vectorization. + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config.gcc (powerpc*-*-*): Add float16.o. + * config/rs6000/float16.cc: New file to add 16-bit floating point + vectorization. + * config/rs6000/float16.md: (FP16_BINARY_OP): New mode iterator. + (fp16_names): New mode attribute. + (UNSPEC_XVCVSPHP_V8HF): New unspec. + (UNSPEC_XVCVSPBF16_V8BF): Likewise. + (<fp16_names><mode>): New insns to support vectorization of 16-bit + floating point. + (fma<mode>4): Likewise. + (fms<mode>4): Likewise. + (nfma<mode>): Likewise. + (nfms<mode>4): Likewise. + (vec_pack_trunc_v4sf_v8hf): Likewise. + (vec_pack_trunc_v4sf_v8bf): Likewise. + (vec_pack_trunc_v4sf): Likewise. + (xvcvsphp_v8hf): Likewise. + (xvcvspbf16_v8bf): Likewise. + (vec_unpacks_hi_v8hf): Likewise. + (vec_unpacks_lo_v8hf): Likewise. 
+ (xvcvhpsp_v8hf): Likewise. + (vec_unpacks_hi_v8bf): Likewise. + (vec_unpacks_lo_v8bf): Likewise. + (xvcvbf16spn_v8bf): Likewise. + * config/rs6000/rs6000-protos.h (enum fp16_operation): New enumeration + for vectorizing 16-bit floating point. + (fp16_vectorization): New declaration. + * config/rs6000/t-rs6000 (float16.o): Add build rules. + +==================== Branch ibm/gcc-16-future-float16, patch #406 ==================== + +Add BF/HF neg, abs operands and logical insns. + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/float16.md (neg<mode>2): Add BFmode/HFmode negate, + absolute value and negative absolute value operations. Add logical + insns operating on BFmode/HFmode. + (abs<mode>2): Likewise. + (nabs<mode>2): Likewise. + (and<mode>3): Likewise. + (ior<mode>): Likewise. + (xor<mode>3): Likewise. + (nor<mode>3): Likewise. + (andn<mode>3): Likewise. + (eqv<mode>3): Likewise. + (nand<mode>3): Likewise. + (iorn<mode>3): Likewise. + (bool<mode>3): Likewise. + (boolc<mode>3): Likewise. + (boolcc<mode>): Likewise. + +==================== Branch ibm/gcc-16-future-float16, patch #405 ==================== + +Add conversions between 16-bit floating point and other scalar modes. + +2025-11-16 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/float16.md (fp16_float_convert): New mode iterator. + (extend<FP16_HW:mode><fp16_float_convert:mode>2): New insns to convert + between the 2 16-bit floating point modes and other floating point + scalars other than SFmode/DFmode by converting first to DFmode. + (trunc<fp16_float_convert:mode><FP16_HW:mode>2): Likewise. + (float<GPR:mode><FP16_HW:mode>2): New insns to convert between the 2 + 16-bit floating point modes and signed/unsigned integers. + (floatuns<GPR:mode><FP16_HW:mode>2): Likewise. + (fix_trunc<FP16_HW:mode><GPR:mode>): Likewise. + (fixuns_trunc<FP16_HW:mode><GPR:mode>2): Likewise. 
+ +==================== Branch ibm/gcc-16-future-float16, patch #404 ==================== + +Add conversions between __bfloat16 and float/double. + +This patch provides conversions between __bfloat16 and float/double scalars on +power10 and power11 systems. + +Unlike the support for _Float16, there is not a single instruction to convert +between a __bfloat16 and float/double scalar value on the power10. + +Instead we have to use the vector conversion instructions. + +To convert a __bfloat16 scalar to a float/double scalar, GCC will generate: + + lxsihzx 0,0,4 Load value into vector register + xxsldwi 0,0,0,1 Get the value into the upper 32-bits + xvcvbf16spn 0,0 Convert vector __bfloat16 to vector float + xscvspdpn 0,0 Convert memory float format to scalar + +To convert a scalar float/double to __bfloat16, GCC will generate: + + xscvdpsp 0,0 Convert float scalar to float memory format + xvcvspbf16 0,0 Convert vector float to vector __bfloat16 + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/float16.md (FP16_HW): Add BFmode. + (VFP16_HW): New mode iterator. + (cvt_fp16_to_v4sf_insn): New mode attribute. + (FP16_VECTOR4): Likewise. + (UNSPEC_FP16_SHIFT_LEFT_32BIT): New unspec constant. + (UNSPEC_CVT_FP16_TO_V4SF): Likewise. + (UNSPEC_XXSPLTW_FP16): Likewise. + (UNSPEC_XVCVSPBF16_BF): Likewise. + (extendbf<mode>2): New insns to convert between BFmode and + SFmode/DFmode. + (xscvdpspn_sf): Likewise. + (xscvspdpn_sf): Likewise. + (<fp16_vector8>_shift_left_32bit): Likewise. + (trunc<mode>bf): Likewise. + (vsx_xscvdpspn_sf): Likewise. + (cvt_fp16_to_v4sf_<mode>): Likewise. + (cvt_fp16_to_v4sf_<mode>_le): Likewise. + (cvt_fp16_to_v4sf_<mode>_be): Likewise. + (dup_<mode>_to_v4s): Likewise. + (xxspltw_<mode>): Likewise. + (xvcvbf16spn_bf): Likewise. + (xvcvspbf16_bf): Likewise. + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + __BFLOAT16_HW__ if we have hardware support for __bfloat16. 
+ * config/rs6000/rs6000.cc (rs6000_init_hard_regno_mode_ok): Mark that we + use VSX arithmetic support for V8BFmode if we are a power10 or later. + +==================== Branch ibm/gcc-16-future-float16, patch #403 ==================== + +Add conversions between _Float16 and float/double. + +This patch adds support to generate xscvhpdp and xscvdphp on Power9 systems and +later, to convert between _Float16 and float scalar values. + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/float16.md (FP16_HW): New mode iterator. + (extendhf<mode>2): Add support converting between HFmode and + SFmode/DFmode if we are on power9 or later. + (trunc<mode>hf2): Likewise. + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + __FLOAT16_HW__ if we have hardware support for _Float16. + * config/rs6000/rs6000.cc (rs6000_init_hard_regno_mode_ok): Mark that we + use VSX arithmetic support for V8HFmode if we are a power9 or later. + +==================== Branch ibm/gcc-16-future-float16, patch #402 ==================== + +Add HF/BF emulation functions to libgcc. + +This patch adds the necessary support in libgcc to allow using the machine +independent 16-bit floating point support. + +2025-11-12 Michael Meissner <[email protected]> + +libgcc/ + + * config.host (powerpc*-*-linux*): Add HF/BF emulation functions to + PowerPC libgcc. + * config/rs6000/sfp-machine.h (_FP_NANFRAC_H): New macro. + (_FP_NANFRAC_B): Likewise. + (_FP_NANSIGN_H): Likewise. + (_FP_NANSIGN_B): Likewise. + (DFtype2): Add HF/BF emulation function declarations. + (SFtype2): Likewise. + (DItype2): Likewise. + (UDItype2): Likewise. + (SItype2): Likewise. + (USItype2): Likewise. + (HFtype2): Likewise. + (__eqhf2): Likewise. + (__extendhfdf2): Likewise. + (__extendhfsf2): Likewise. + (__fixhfdi): Likewise. + (__fixhfsi): Likewise. + (__fixunshfdi): Likewise. + (__fixunshfsi): Likewise. + (__floatdihf): Likewise. + (__floatsihf): Likewise. + (__floatundihf): Likewise. 
+ (__floatunsihf): Likewise. + (__truncdfhf2): Likewise. + (__truncsfhf2): Likewise. + (BFtype2): Likewise. + (__extendbfsf2): Likewise. + (__floatdibf): Likewise. + (__floatsibf): Likewise. + (__floatundibf): Likewise. + (__floatunsibf): Likewise. + (__truncdfbf2): Likewise. + (__truncsfbf2): Likewise. + (__truncbfhf2): Likewise. + (__trunchfbf2): Likewise. + * config/rs6000/t-float16: New file. + * configure.ac (powerpc*-*-linux*): Check if the PowerPC compiler + supports _Float16 and __bfloat16 types. + * configure: Regenerate. + +==================== Branch ibm/gcc-16-future-float16, patch #401 ==================== + +Add initial 16-bit floating point support. + +This patch adds the initial support for the 16-bit floating point formats. +_Float16 is the IEEE 754 half precision format. __bfloat16 is the Google Brain +16-bit format. + +In order to use both _Float16 and __bfloat16, the user has to use the -mfloat16 +option to enable the support. + +In this patch only the machine independent support is used. In order to be +usable, the next patch will also need to be installed. That patch will add +support in libgcc for 16-bit floating point support. + + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/constraints.md (eZ): New constraint for -0.0. + * config/rs6000/float16.md: New file to add basic 16-bit floating point + support. + * config/rs6000/predicates.md (easy_fp_constant): Add support for HFmode + and BFmode constants. + (easy_vector_constant): Add support for V8HFmode and V8BFmode to load up + the vector -0.0 constant. + (minus_zero_constant): New predicate. + (fp16_xxspltiw_constant): Likewise. + * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add support for + 16-bit floating point types. + (rs6000_init_builtins): Create the bfloat16_type_node if needed. + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + __FLOAT16__ and __BFLOAT16__ if 16-bit floating point is enabled. 
+ * config/rs6000/rs6000-call.cc (init_cumulative_args): Warn if a + function returns a 16-bit floating point value unless -Wno-psabi is + used. + (rs6000_function_arg): Warn if a 16-bit floating point value is passed + to a function unless -Wno-psabi is used. + * config/rs6000/rs6000-protos.h (vec_const_128bit_type): Add mode field + to detect initializing 16-bit floating constants. + * config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached): Add + support for 16-bit floating point. + (rs6000_modes_tieable_p): Don't allow 16-bit floating point modes to tie + with other modes. + (rs6000_debug_reg_global): Add BFmode and HFmode. + (rs6000_setup_reg_addr_masks): Add support for 16-bit floating point + types. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Likewise. + (rs6000_option_override_internal): Add a check whether -mfloat16 can be + used. + (easy_altivec_constant): Add support for 16-bit floating point. + (xxspltib_constant_p): Likewise. + (rs6000_expand_vector_init): Likewise. + (rs6000_expand_vector_set): Likewise. + (rs6000_expand_vector_extract): Likewise. + (rs6000_split_vec_extract_var): Likewise. + (reg_offset_addressing_ok_p): Likewise. + (rs6000_legitimate_offset_address_p): Likewise. + (legitimate_lo_sum_address_p): Likewise. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_can_change_mode_class): Likewise. + (rs6000_output_move_128bit): Likewise. + (rs6000_load_constant_and_splat): Likewise. + (rs6000_scalar_mode_supported_p): Likewise. + (rs6000_libgcc_floating_mode_supported_p): Return true for HFmode and + BFmode if -mfloat16. + (rs6000_floatn_mode): Enable _Float16 if -mfloat16. + (rs6000_opt_masks): Add -mfloat16. + (constant_fp_to_128bit_vector): Add support for 16-bit floating point. + (vec_const_128bit_to_bytes): Likewise. + (constant_generates_xxspltiw): Likewise. + * config/rs6000/rs6000.h (FP16_SCALAR_MODE_P): New macro. 
+ (FP16_VECTOR_MODE_P): Likewise. + (TARGET_BFLOAT16_HW): New macro. + (TARGET_FLOAT16_HW): Likewise. + (TARGET_BFLOAT16_HW_VECTOR): Likewise. + (TARGET_FLOAT16_HW_VECTOR): Likewise. + * config/rs6000/rs6000.md (wd): Add BFmode and HFmode. + (toplevel): Include float16.md. + * config/rs6000/rs6000.opt (-mfloat16): New option. + * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mfloat16. + +==================== Branch ibm/gcc-16-future-float16, patch #400 ==================== + +Add infrastructure for _Float16 and __bfloat16 types. + +This patch adds the infrastructure for adding 16-bit floating point types in the +next patch. Two new types will be added: + +_Float16 (HFmode): +================== + +This is the IEEE 754-2008 16-bit floating point. It has 1 sign bit, 5 +exponent bits, 10 explicit mantissa bits (the 11th bit is implied with +normalization). + +The PowerPC ISA 3.0 (power9) has instructions to convert between the +scalar representations of _Float16 and float types. The PowerPC ISA +3.1 (power10 and power11) has instructions for converting between the +even elements of _Float16 vectors and float vectors. In addition, the +MMA subsystem has support for _Float16 vector processing. + + +__bfloat16 (BFmode): +==================== + +This is the brain 16-bit floating point created by the Google Brain +project. It has 1 sign bit, 8 exponent bits, 7 explicit mantissa bits +(the 8th bit is implied with normalization). The 16 bits in the +__bfloat16 format are the same as the upper 16 bits in the normal IEEE +754 32-bit floating point format. + +The PowerPC ISA 3.1 (power10 and power11) has instructions for +converting between the even elements of __bfloat16 vectors and float +vectors. In addition, the MMA subsystem has support for __bfloat16 +vector processing. + + +This patch adds new modes that will be used in the future. The +V8HFmode and V8BFmodes are treated as normal vector modes. + +This patch does not add loads and stores for BFmode and HFmode. 
These +will be added in the next patch. + + BFmode -- 16-bit mode for __bfloat16 support + HFmode -- 16-bit mode for _Float16 support + V8BFmode -- 128-bit vector mode __bfloat16 + V8HFmode -- 128-bit vector mode _Float16 + V4BFmode -- 64-bit vector mode __bfloat16 used in some insns + V4HFmode -- 64-bit vector mode _Float16 used in some insns + + +2025-11-12 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/altivec.md (VM): Add support for V8HFmode and + V8BFmode. + (VM2): Likewise. + (VI_char): Likewise. + (VI_scalar): Likewise. + (VI_unit): Likewise. + (VP_small): Likewise. + (VP_small_lc): Likewise. + (VU_char): Likewise. + * config/rs6000/rs6000-modes.def (HFmode): Add new mode. + (BFmode): Likewise. + (V8BFmode): Likewise. + (V8HFmode): Likewise. + * config/rs6000/rs6000-p8swap.cc (rs6000_gen_stvx): Remove #ifdef for + HAVE_V8HFmode. Add support for V8BFmode. + (rs6000_gen_lvx): Likewise. + (replace_swapped_load_constant): Likewise. + * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Add support for + V8HFmode and V8BFmode. + (rs6000_init_hard_regno_mode_ok): Likewise. + (output_vec_const_move): Likewise. + (rs6000_expand_vector_init): Likewise. + (reg_offset_addressing_ok_p): Likewise. + (rs6000_const_vec): Likewise. + (rs6000_emit_move): Likewise. + * config/rs6000/rs6000.h (ALTIVEC_VECTOR_MODE): Likewise. + * config/rs6000/rs6000.md (FMOVE128_GPR): Likewise. + (wd): Likewise. + (du_or_d): Likewise. + (BOOL_128): Likewise. + (BOOL_REGS_OUTPUT): Likewise. + (BOOL_REGS_OP1): Likewise. + (BOOL_REGS_OP2): Likewise. + (BOOL_REGS_UNARY): Likewise. + (RELOAD): Likewise. + * config/rs6000/vector.md (VEC_L): Likewise. + (VEC_M): Likewise. + (VEC_E): Likewise. + (VEC_base): Likewise. + (VEC_base_l): Likewise. + * config/rs6000/vsx.md (VECTOR_16BIT): New mode iterator. + (VSX_L): Add support for V8HFmode and V8BFmode. + (VSX_M): Likewise. + (VSX_XXBR): Likewise. + (VSm): Likewise. + (VSr): Likewise. + (VSisa): Likewise. + (??r): Likewise. 
+ (nW): Likewise. + (VSv): Likewise. + (VSX_EXTRACT_I): Likewise. + (VSX_EXTRACT_I2): Likewise. + (VSX_EXTRACT_I4): Likewise. + (VSX_EXTRACT_WIDTH): Likewise. + (VSX_EXTRACT_PREDICATE): Likewise. + (VSX_EX): Likewise. + (VM3): Likewise. + (VM3_char): Likewise. + (vsx_le_perm_load_<mode>): Rename from vsx_le_perm_load_v8hi and add + V8HFmode and V8BFmode. + (vsx_le_perm_store_<mode>): Rename from vsx_le_perm_store_v8hi and add + V8HFmode and V8BFmode. + (splitter for vsx_le_perm_store_<mode>): Likewise. + (vsx_ld_elemrev_<mode>): Rename from vsx_ld_elemrev_v8hi and add + V8HFmode and V8BFmode support. + (vsx_ld_elemrev_<mode>_internal): Rename from + vsx_ld_elemrev_v8hi_internal and add V8HFmode and V8BFmode support. + (vsx_st_elemrev_<mode>): Rename from vsx_st_elemrev_v8hi and add + V8HFmode and V8BFmode support. + (vsx_st_elemrev_<mode>_internal): Rename from + vsx_st_elemrev_v8hi_internal and add V8HFmode and V8BFmode support. + (xxswapd_<mode>): Rename from xxswapd_v8hi and add V8HFmode and V8BFmode + support. + (vsx_lxvd2x8_le_<MODE>): Rename from vsx_lxvd2x8_le_V8HI and add + V8HFmode and V8BFmode support. + (vsx_stxvd2x8_le_<MODE>): Rename from vsx_stxvd2x8_le_V8HI and add + V8HFmode and V8BFmode support. + (vsx_extract_<mode>_store_p9): Add V8HFmode and V8BFmode. + (vsx_extract_<mode>_p8): Likewise. + +==================== Branch ibm/gcc-16-future-float16, patch #302 ==================== + +Add support for 1,024 bit DMF registers. + +This patch is a preliminary patch to add the full 1,024 bit dense math registers +(DMRs) for -mcpu=future. The MMA 512-bit accumulators map onto the top of the +DMR register. + +This patch only adds the new 1,024 bit register support. It does not add +support for any instructions that need 1,024 bit registers instead of 512 bit +registers. + +I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit +registers. The 'wD' constraint added in previous patches is used for these +registers. 
I added support to do load and store of DMRs via the VSX registers, +since there are no load/store dense math instructions. I added the new keyword +'__dmf' to create 1,024 bit types that can be loaded into DMRs. At present, I +don't have aliases for __dmf512 and __dmf1024 that we've discussed internally. + +The patches have been tested on both little and big endian systems. Can I check +it into the master branch? + +2025-11-11 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/mma.md (UNSPEC_DM_INSERT512_UPPER): New unspec. + (UNSPEC_DM_INSERT512_LOWER): Likewise. + (UNSPEC_DM_EXTRACT512): Likewise. + (UNSPEC_DMF_RELOAD_FROM_MEMORY): Likewise. + (UNSPEC_DMF_RELOAD_TO_MEMORY): Likewise. + (movtdo): New define_expand and define_insn_and_split to implement 1,024 + bit DMR registers. + (movtdo_insert512_upper): New insn. + (movtdo_insert512_lower): Likewise. + (movtdo_extract512): Likewise. + (reload_dmf_from_memory): Likewise. + (reload_dmf_to_memory): Likewise. + * config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add DMF + support. + (rs6000_init_builtins): Add support for __dmf keyword. + * config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support + for TDOmode. + (rs6000_function_arg): Likewise. + * config/rs6000/rs6000-modes.def (TDOmode): New mode. + * config/rs6000/rs6000.cc (rs6000_hard_regno_nregs_internal): Add + support for TDOmode. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_hard_regno_mode_ok): Likewise. + (rs6000_modes_tieable_p): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Add support for TDOmode. Setup reload + hooks for DMF mode. + (reg_offset_addressing_ok_p): Add support for TDOmode. + (rs6000_emit_move): Likewise. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_secondary_reload_class): Likewise. + (rs6000_mangle_type): Add mangling for __dmf type. 
+ (rs6000_dmf_register_move_cost): Add support for TDOmode. + (rs6000_split_multireg_move): Likewise. + (rs6000_invalid_conversion): Likewise. + * config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode. + (enum rs6000_builtin_type_index): Add DMF type nodes. + (dmf_type_node): Likewise. + (ptr_dmf_type_node): Likewise. + +gcc/testsuite/ + + * gcc.target/powerpc/dm-1024bit.c: New test. + * lib/target-supports.exp (check_effective_target_ppc_dmf_ok): New + target test. + +==================== Branch ibm/gcc-16-future-float16, patch #301 ==================== + +Add support for dense math registers. + +The MMA subsystem added the notion of accumulator registers as an optional +feature of ISA 3.1 (power10). In ISA 3.1, these accumulators overlapped with +the VSX registers 0..31, but logically the accumulator registers were separate +from the FPR registers. In ISA 3.1, it was anticipated that in future systems, +the accumulator registers may not overlap with the FPR registers. This patch +adds the support for dense math registers as separate registers. + +This particular patch does not change the MMA support to use the accumulators +within the dense math registers. This patch just adds the basic support for +having separate DMRs. The next patch will switch the MMA support to use the +accumulators if -mcpu=future is used. + +For testing purposes, I added an undocumented option '-mdense-math' to enable +or disable the dense math support. + +This patch updates the wD constraint added in the previous patch. If MMA is +selected but dense math is not selected (i.e. -mcpu=power10), the wD constraint +will allow access to accumulators that overlap with VSX registers 0..31. If +both MMA and dense math are selected (i.e. -mcpu=future), the wD constraint +will only allow dense math registers. + +This patch modifies the existing %A output modifier. 
If MMA is selected but +dense math is not selected, then %A output modifier converts the VSX register +number to the accumulator number, by dividing it by 4. If both MMA and dense +math are selected, then %A will map the separate DMF registers into 0..7. + +The intention is that user code using extended asm can be modified to run on +both MMA without dense math and MMA with dense math: + + 1) If possible, don't use extended asm, but instead use the MMA built-in + functions; + + 2) If you do need to write extended asm, change the d constraints + targeting accumulators to use wD; + + 3) Only use the built-in zero, assemble and disassemble functions to + move data between vector quad types and dense math accumulators. + I.e. do not use the xxmfacc, xxmtacc, and xxsetaccz directly in the + extended asm code. The reason is these instructions assume there is a + 1-to-1 correspondence between 4 adjacent FPR registers and an + accumulator that overlaps with those instructions. With accumulators + now being separate registers, there no longer is a 1-to-1 + correspondence. + +It is possible that the mangling for DMFs and the GDB register numbers may +produce other changes in the future. + +gcc/ + +2025-11-11 Michael Meissner <[email protected]> + + * config/rs6000/mma.md (UNSPEC_MMA_DMSETDMRZ): New unspec. + (movxo): Add comments about dense math registers. + (movxo_nodm): Rename from movxo and restrict the usage to machines + without dense math registers. + (movxo_dm): New insn for movxo support for machines with dense math + registers. + (mma_<acc>): Restrict usage to machines without dense math registers. + (mma_xxsetaccz): Add a define_expand wrapper, and add support for dense + math registers. + (mma_dmsetaccz): New insn. + * config/rs6000/predicates.md (dmf_operand): New predicate. + (accumulator_operand): Add support for dense math registers. 
+ * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do + not issue a de-prime instruction when disassembling a vector quad on a + system with dense math registers. + * config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define + __DENSE_MATH__ if we have dense math registers. + * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Add -mdense-math. + (POWERPC_MASKS): Likewise. + * config/rs6000/rs6000.cc (enum rs6000_reg_type): Add DMF_REG_TYPE. + (enum rs6000_reload_reg_type): Add RELOAD_REG_DMF. + (LAST_RELOAD_REG_CLASS): Add support for DMF registers and the wD + constraint. + (reload_reg_map): Likewise. + (rs6000_reg_names): Likewise. + (alt_reg_names): Likewise. + (rs6000_hard_regno_nregs_internal): Likewise. + (rs6000_hard_regno_mode_ok_uncached): Likewise. + (rs6000_debug_reg_global): Likewise. + (rs6000_setup_reg_addr_masks): Likewise. + (rs6000_init_hard_regno_mode_ok): Likewise. + (rs6000_option_override_internal): If -mdense-math, issue an error if + -mno-mma or not -mcpu=future. + (rs6000_secondary_reload_memory): Add support for DMF registers. + (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Likewise. + (rs6000_secondary_reload_class): Likewise. + (print_operand): Make %A handle both FPRs and DMRs. + (rs6000_dmf_register_move_cost): New helper function. + (rs6000_register_move_cost): Add support for DMR registers. + (rs6000_memory_move_cost): Likewise. + (rs6000_compute_pressure_classes): Likewise. + (rs6000_debugger_regno): Likewise. + (rs6000_opt_masks): Add -mdense-math support. + (rs6000_split_multireg_move): Add support for DMRs. + * config/rs6000/rs6000.h (TARGET_MMA_NO_DENSE_MATH): New macro. + (UNITS_PER_DMF_WORD): Likewise. + (FIRST_PSEUDO_REGISTER): Update for DMRs. + (FIXED_REGISTERS): Add DMRs. + (CALL_REALLY_USED_REGISTERS): Likewise. + (REG_ALLOC_ORDER): Likewise. + (DMF_REGNO_P): New macro. + (enum reg_class): Add DM_REGS. + (REG_CLASS_NAMES): Likewise. 
+ (REG_CLASS_CONTENTS): Likewise. + (enum r6000_reg_class_enum): Add RS6000_CONSTRAINT_wD. + (REGISTER_NAMES): Add DMF registers. + (ADDITIONAL_REGISTER_NAMES): Likewise. + * config/rs6000/rs6000.md (FIRST_DMF_REGNO): New constant. + (LAST_DMF_REGNO): Likewise. + * config/rs6000/rs6000.opt (-mdense-math): New option. + +==================== Branch ibm/gcc-16-future-float16, patch #300 ==================== + +Add wD constraint. + +This patch adds a new constraint ('wD') that matches the accumulator registers +that overlap with VSX registers 0..31 on power10. Future patches will add the +support for a separate accumulator register class that will be used when the +support for dense math registers is added. + +2025-11-11 Michael Meissner <[email protected]> + + * config/rs6000/constraints.md (wD): New constraint. + * config/rs6000/mma.md (mma_<acc>): Prepare for alternate accumulator + registers. Use wD constraint instead of 'd' constraint. Use + accumulator_operand instead of fpr_reg_operand. + (mma_<vv>): Likewise. + (mma_<avv>): Likewise. + (mma_<pv>): Likewise. + (mma_<apv>): Likewise. + (mma_<vvi4i4i8>): Likewise. + (mma_<avvi4i4i8>): Likewise. + (mma_<vvi4i4i2>): Likewise. + (mma_<avvi4i4i2>): Likewise. + (mma_<vvi4i4>): Likewise. + (mma_<avvi4i4>): Likewise. + (mma_<pvi4i2>): Likewise. + (mma_<apvi4i2>): Likewise. + (mma_<vvi4i4i4>): Likewise. + (mma_<avvi4i4i4>): Likewise. + * config/rs6000/predicates.md (accumulator_operand): New predicate. + * config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register + class for the 'wD' constraint. + (rs6000_init_hard_regno_mode_ok): Set up the 'wD' register constraint + class. + * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for + the 'wD' constraint. + * doc/md.texi (PowerPC constraints): Document the 'wD' constraint. 
+ +==================== Branch ibm/gcc-16-future-float16, patch #201 ==================== + +Use vector pair load/store for memcpy with -mcpu=future + +In the development for the power10 processor, GCC did not enable using the load +vector pair and store vector pair instructions when optimizing things like +memory copy. This patch enables using those instructions if -mcpu=future is +used. + +I have tested these patches on both big endian and little endian PowerPC +servers, with no regressions. Can I check these patches into the trunk? + +2025-11-11 Michael Meissner <[email protected]> + +gcc/ + + * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Enable using load + vector pair and store vector pair instructions for memory copy + operations. + (POWERPC_MASKS): Make the option for enabling using load vector pair and + store vector pair operations set and reset when the PowerPC processor is + changed. + * gcc/config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable + -mblock-ops-vector-pair from influencing .machine selection. + +gcc/testsuite/ + + * gcc.target/powerpc/future-3.c: New test. + +==================== Branch ibm/gcc-16-future-float16, patch #200 ==================== + +Add -mcpu=future. + +2025-11-11 Michael Meissner <[email protected]> + +gcc/ + + * config.gcc (powerpc*-*-*): Add support for -mcpu=future. + * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future. + * config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise. + * config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise. + * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define + _ARCH_FUTURE if -mcpu=future. + * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macros. + (POWERPC_MASKS): Add OPTION_MASK_FUTURE. + * config/rs6000/rs6000-tables.opt: Regenerate. + (future processor): Add -mcpu=future. + * config/rs6000/rs6000-opts.h (PROCESSOR_FUTURE): Define as power11. + * config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mcpu=future. 
+ * config/rs6000/rs6000.opt (-mfuture): New option. + * doc/invoke.texi (IBM RS/6000 and PowerPC Options): Document + -mcpu=future. + +gcc/testsuite/ + + * gcc.target/powerpc/future-1.c: New test. + * gcc.target/powerpc/future-2.c: Likewise. + ==================== Branch ibm/gcc-16-future-float16, patch #109 was reverted ==================== ==================== Branch ibm/gcc-16-future-float16, patch #108 was reverted ==================== ==================== Branch ibm/gcc-16-future-float16, patch #107 was reverted ====================
