https://gcc.gnu.org/g:de07bcdb521211102f2083e6989ffeae066a5c36
commit de07bcdb521211102f2083e6989ffeae066a5c36
Author: Michael Meissner <[email protected]>
Date:   Tue May 12 20:40:30 2026 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.dmf | 707 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 707 insertions(+)

diff --git a/gcc/ChangeLog.dmf b/gcc/ChangeLog.dmf
index b5698d52af89..d0aeac838858 100644
--- a/gcc/ChangeLog.dmf
+++ b/gcc/ChangeLog.dmf
@@ -1,3 +1,710 @@
+==================== Branch work246-dmf, patch #114 ====================
+
+Add paddis support.
+
+This patch adds support for the paddis instruction that might be added to a
+future PowerPC processor.
+
+2026-05-12 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/constraints.md (eU): New constraint.
+	(eV): Likewise.
+	* config/rs6000/predicates.md (paddis_operand): New predicate.
+	(paddis_paddi_operand): Likewise.
+	(add_operand): Add paddis support.
+	* config/rs6000/rs6000.cc (num_insns_constant_gpr): Add paddis support.
+	(num_insns_constant_multi): Likewise.
+	(print_operand): Add %B<n> for paddis support.
+	* config/rs6000/rs6000.h (TARGET_PADDIS): New macro.
+	(SIGNED_INTEGER_32BIT_P): Likewise.
+	* config/rs6000/rs6000.md (isa attribute): Add paddis support.
+	(enabled attribute): Likewise.
+	(add<mode>3): Likewise.
+	(adddi3 splitter): New splitter for paddis.
+	(movdi_internal64): Add paddis support.
+	(movdi splitter): New splitter for paddis.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/prefixed-addis.c: New test.
+
+==================== Branch work246-dmf, patch #113 ====================
+
+Support load/store vector with right length.
+
+This patch adds support for new instructions that may be added to the PowerPC
+architecture in the future to enhance the load and store vector with length
+instructions.
+
+The current instructions (lxvl, lxvll, stxvl, and stxvll) are inconvenient to use
+since the count for the number of bytes must be in the top 8 bits of the GPR
+register, instead of the bottom 8 bits. This meant that code generating these
+instructions typically had to do a shift left by 56 bits to get the count into
+the right position. In a future version of the PowerPC architecture, new
+variants of these instructions might be added that expect the count to be in
+the bottom 8 bits of the GPR register. These patches add this support to GCC
+if the user uses the -mcpu=future option.
+
+I discovered that the code in rs6000-string.cc to generate ISA 3.1 lxvl/stxvl
+and future lxvll/stxvll instructions would generate these instructions on 32-bit.
+However, the patterns for these instructions exist only on 64-bit systems. So
+I added a check for 64-bit support before generating the instructions.
+
+The patches have been tested on both little and big endian systems. Can I check
+them into the master branch?
+
+2026-05-12 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/rs6000-string.cc (expand_block_move): Do not generate
+	lxvl and stxvl on 32-bit.
+	* config/rs6000/vsx.md (lxvl): If -mcpu=future, generate the lxvl with
+	the shift count automatically used in the insn.
+	(lxvrl): New insn for -mcpu=future.
+	(lxvrll): Likewise.
+	(stxvl): If -mcpu=future, generate the stxvl with the shift count
+	automatically used in the insn.
+	(stxvrl): New insn for -mcpu=future.
+	(stxvrll): Likewise.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/lxvrl.c: New test.
+
+==================== Branch work246-dmf, patch #112 ====================
+
+Add xvrlw support.
+
+This patch adds support for a possible new variant of the vector rotate left
+instruction that might be added to a future PowerPC. This variant (xvrlw) can
+use any VSX register instead of requiring only Altivec registers.
+
+2026-05-12 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/altivec.md (xvrlw): New insn.
+	* config/rs6000/rs6000.h (TARGET_XVRLW): New macro.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/vector-rotate-left.c: New test.
+
+==================== Branch work246-dmf, patch #111 ====================
+
+Add saturate subtract support
+
+This patch adds support for saturating subtract instructions that might be added
+to a future PowerPC. I think I had originally submitted patches that added a
+new built-in function to generate the subfus and subdus instructions. Segher
+suggested that instead of generating a built-in function, I should just
+have GCC automatically recognize cases where a saturating subtract could be
+generated. This patch generates the saturating subtract instructions in the
+appropriate context.
+
+2026-05-12 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/rs6000.md (gtu_geu): New code iterator.
+	(subfus<mode>3_<code>): New insns.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/saturate-subtract-1.c: New test.
+	* gcc.target/powerpc/saturate-subtract-2.c: Likewise.
+	* lib/target-supports.exp (check_effective_target_powerpc_future_ok):
+	New target test.
+
+==================== Branch work246-dmf, patch #110 ====================
+
+Use vector pair load/store for memcpy with -mcpu=future
+
+In the development for the power10 processor, GCC did not enable using the load
+vector pair and store vector pair instructions when optimizing things like
+memory copy. This patch enables using those instructions if -mcpu=future is
+used.
+
+I have tested these patches on both big endian and little endian PowerPC
+servers, with no regressions. Can I check these patches into the trunk?
+
+2026-05-12 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Enable using load
+	vector pair and store vector pair instructions for memory copy
+	operations.
+	(POWERPC_MASKS): Make the option for enabling using load vector pair and
+	store vector pair operations set and reset when the PowerPC processor is
+	changed.
+	* config/rs6000/rs6000.cc (rs6000_machine_from_flags): Disable
+	-mblock-ops-vector-pair from influencing .machine selection.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/future-3.c: New test.
+
+==================== Branch work246-dmf, patch #106 ====================
+
+On dense math systems use the 'dm' prefix on MMA instructions.
+
+This is part six of the dense math register patches for the PowerPC.
+
+This is an optional patch that on dense math systems changes the XV* MMA
+instructions to DMXV*. The assembler will generate the same object code for
+either instruction. This is to tell the user looking at assembly code that we are
+compiling MMA code to use dense math registers.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/mma.md (vvi4i4i8): Eliminate using the 'pm' prefix here,
+	so we can emit pmdm* on dense math systems.
+	(avvi4i4i8): Likewise.
+	(vvi4i4i2): Likewise.
+	(avvi4i4i2): Likewise.
+	(vvi4i4): Likewise.
+	(avvi4i4): Likewise.
+	(pvi4i2): Likewise.
+	(apvi4i2): Likewise.
+	(vvi4i4i4): Likewise.
+	(mma_<vv>): If -mdense-math, emit 'dmxv*' form of the instruction
+	instead of 'xv*'.
+	(mma_<avv>): Likewise.
+	(mma_<pv>): Likewise.
+	(mma_<apv>): Likewise.
+	(mma_pm<vvi4i4i8>): If -mdense-math, emit 'pmdm*' instead of 'pm*'.
+	(mma_pm<avvi4i4i8>): Likewise.
+	(mma_pm<vvi4i4i2>): Likewise.
+	(mma_pm<avvi4i4i2>): Likewise.
+	(mma_pm<vvi4i4>): Likewise.
+	(mma_pm<avvi4i4>): Likewise.
+	(mma_pm<pvi4i2>): Likewise.
+	(mma_pm<apvi4i2>): Likewise.
+	(mma_pm<vvi4i4i4>): Likewise.
+	(mma_pm<avvi4i4i4>): Likewise.
+	* config/rs6000/rs6000.cc (print_operand): For %!, print 'dm' if
+	-mdense-math.
+	* config/rs6000/rs6000.h (PRINT_OPERAND_PUNCT_VALID_P): Allow %!.
+
+==================== Branch work246-dmf, patch #105 ====================
+
+Add support for 1,024 bit dense math registers.
+
+This is part five of the dense math register patches for the PowerPC.
+This is the 7th version of the dense math patches.
+
+Version 6 of the dense math register patches were posted on April 21st,
+2026.
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713355.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+This patch needs the -mcpu=future patch posted on April 8th, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712532.html
+
+This patch is functionally the same as the version 6 patch, except I made the
+same name changes as I discussed in the previous patch.
+
+This patch (#5) is a preliminary patch to add the full 1,024 bit dense math
+registers (DMFs) for -mcpu=future. The MMA 512-bit accumulators map onto the top
+of the DMR register.
+
+This patch only adds the new 1,024 bit register support. It does not add
+support for any instructions that need 1,024 bit registers instead of 512 bit
+registers.
+
+I used the new mode 'TDOmode' to be the opaque mode used for 1,024 bit
+registers. The 'wD' constraint added in previous patches is used for these
+registers. I added support to do load and store of DMRs via the VSX registers,
+since there are no load/store dense math instructions. I added the new keyword
+'__dm1024' to create 1,024 bit types that can be loaded into dense math
+registers.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/mma.md (UNSPEC_DMF_INSERT512_UPPER): New unspec.
+	(UNSPEC_DMF_INSERT512_LOWER): Likewise.
+	(UNSPEC_DMF_EXTRACT512): Likewise.
+	(UNSPEC_DMF_RELOAD_FROM_MEMORY): Likewise.
+	(UNSPEC_DMF_RELOAD_TO_MEMORY): Likewise.
+	(movtdo): New define_expand and define_insn_and_split to implement 1,024
+	bit dense math registers.
+	(movtdo_insert512_upper): New insn.
+	(movtdo_insert512_lower): Likewise.
+	(movtdo_extract512): Likewise.
+	(reload_tdo_from_memory): Likewise.
+	(reload_tdo_to_memory): Likewise.
+	* config/rs6000/rs6000-builtin.cc (rs6000_type_string): Add dense math
+	register support.
+	(rs6000_init_builtins): Add support for __dm1024 keyword.
+	* config/rs6000/rs6000-call.cc (rs6000_return_in_memory): Add support
+	for TDOmode.
+	(rs6000_function_arg): Likewise.
+	* config/rs6000/rs6000-modes.def (TDOmode): New mode.
+	* config/rs6000/rs6000.cc (rs6000_hard_regno_mode_ok_uncached): Add
+	support for TDOmode.
+	(rs6000_hard_regno_mode_ok): Likewise.
+	(rs6000_modes_tieable_p): Likewise.
+	(rs6000_debug_reg_global): Likewise.
+	(rs6000_setup_reg_addr_masks): Likewise.
+	(rs6000_init_hard_regno_mode_ok): Add support for TDOmode. Setup reload
+	hooks for dense math TDO reload mode.
+	(reg_offset_addressing_ok_p): Add support for TDOmode.
+	(rs6000_emit_move): Likewise.
+	(rs6000_secondary_reload_simple_move): Likewise.
+	(rs6000_preferred_reload_class): Likewise.
+	(rs6000_mangle_type): Add mangling for __dm1024 type.
+	(rs6000_dmf_register_move_cost): Add support for TDOmode.
+	(rs6000_split_multireg_move): Likewise.
+	(rs6000_invalid_conversion): Likewise.
+	* config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P): Add TDOmode.
+	(enum rs6000_builtin_type_index): Add dense math register type nodes.
+	(dm1024_type_node): Likewise.
+	(ptr_dm1024_type_node): Likewise.
+
+gcc/testsuite/
+
+	* gcc.target/powerpc/dm-1024bit.c: New test.
+
+==================== Branch work246-dmf, patch #104 ====================
+
+Add support for dense math registers.
+
+This is part four of the dense math register patches for the PowerPC.
+This is the 7th version of the dense math patches.
+
+Version 6 of the dense math register patches were posted on April 21st,
+2026.
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713355.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+This patch needs the -mcpu=future patch posted on April 8th, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712532.html
+
+This patch (#4) combines version 6 patch #3 (which adds the basic dense math
+register support) and the parts of patch #4 (switch mma.md to use the new wD
+constraint, use accumulator_operand, and add dense math zero/move support) that
+weren't previously added in version 7 patch #1. Here are the changes from
+version 6 of the patches:
+
+    dense_math_operand to dmf_register_operand
+    FIRST_DM_REGNO to FIRST_DMF_REGNO
+    LAST_DM_REGNO to LAST_DMF_REGNO
+    UNITS_PER_DM_WORD to UNITS_PER_DMF_WORD
+    DM_REGNO_P to DMF_REGNO_P
+    DM_REGS to DMF_REGS
+    DM_REG_TYPE to DMF_REG_TYPE
+    rs6000_dense_math_register_move_cost to rs6000_dmf_register_move_cost
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/mma.md (movoo): Allow -mdense-math -mno-mma.
+	(movxo): Convert to being a define_expand that can handle both the
+	original MMA support without dense math registers, and adding dense math
+	support. Allow -mdense-math -mno-mma.
+	(movxo_nodm): Rename original movxo insn, and restrict this insn to when
+	we do not have dense math registers.
+	(movxo_dm): New define_insn_and_split for dense math registers.
+	(vsx_assemble_pair): Allow -mdense-math -mno-mma.
+	(vsx_disassemble_pair): Likewise.
+	(mma_assemble_acc): Likewise.
+	(mma_disassemble_acc): Likewise.
+	(mma_<acc>): Allow built-ins to be used if -mdense-math.
+	(mma_xxsetaccz): Convert into a define_expand to handle both non-dense
+	math and dense math registers.
+	(mma_xxsetaccz_nodm): Rename from mma_xxsetaccz and limit code to non
+	dense math systems.
+	(mma_xxsetaccz_dm): New insn for dense math register support.
+	* config/rs6000/predicates.md (dmf_register_operand): New predicate.
+	(accumulator_operand): Add support for dense math registers.
+	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Do
+	not issue xxmfacc (deprime) instruction if we have dense math registers.
+	* config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): Add -mdense-math.
+	(POWERPC_MASKS): Likewise.
+	* config/rs6000/rs6000.cc (enum rs6000_reg_type): Add dense math
+	register support.
+	(enum rs6000_reload_reg_type): Likewise.
+	(LAST_RELOAD_REG_CLASS): Likewise.
+	(reload_reg_map): Likewise.
+	(rs6000_reg_names): Likewise.
+	(alt_reg_names): Likewise.
+	(rs6000_hard_regno_nregs_internal): Likewise.
+	(rs6000_hard_regno_mode_ok_uncached): Likewise.
+	(rs6000_debug_reg_global): Likewise.
+	(rs6000_setup_reg_addr_masks): Likewise.
+	(rs6000_init_hard_regno_mode_ok): Likewise.
+	(rs6000_secondary_reload_memory): Likewise.
+	(rs6000_secondary_reload_simple_move): Likewise.
+	(rs6000_preferred_reload_class): Likewise.
+	(rs6000_secondary_reload_class): Likewise.
+	(print_operand): Likewise.
+	(rs6000_dmf_register_move_cost): New helper function.
+	(rs6000_register_move_cost): Add dense math register support.
+	(rs6000_memory_move_cost): Likewise.
+	(rs6000_compute_pressure_classes): Likewise.
+	(rs6000_debugger_regno): Likewise.
+	(rs6000_opt_masks): Likewise.
+	(rs6000_split_multireg_move): Likewise.
+	* config/rs6000/rs6000.h (UNITS_PER_DMF_WORD): New macro.
+	(FIRST_PSEUDO_REGISTER): Add dense math register support.
+	(FIXED_REGISTERS): Likewise.
+	(CALL_REALLY_USED_REGISTERS): Likewise.
+	(REG_ALLOC_ORDER): Likewise.
+	(DMF_REGNO_P): New macro.
+	(enum reg_class): Add dense math register support.
+	(REG_CLASS_NAMES): Likewise.
+	(REGISTER_NAMES): Likewise.
+	(ADDITIONAL_REGISTER_NAMES): Likewise.
+	* config/rs6000/rs6000.md (FIRST_DMF_REGNO): New constant.
+	(LAST_DMF_REGNO): Likewise.
+
+
+==================== Branch work246-dmf, patch #103 ====================
+
+Add the -mdense-math option.
+
+This is part three of the dense math register patches for the PowerPC.
+This is the 7th version of the dense math patches.
+
+Version 6 of the dense math register patches were posted on April 21st,
+2026.
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713355.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+This patch needs the -mcpu=future patch posted on April 8th, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712532.html
+
+This patch (patch #3) is the same as patch #2 in the V6 patches.
+
+This patch adds the -mdense-math option for -mcpu=future. The next set of
+patches will add support for using dense math registers with the MMA instructions.
+All this patch does is add the option. A future patch will implement support
+for dense math registers, and another patch will then switch the MMA
+instructions to use dense math registers.
+
+For users, the following macros are defined:
+
+    __MMA_NO_DENSE_MATH__       ISA 3.1 MMA support.
+    __MMA_DENSE_MATH__          MMA with dense math registers.
+
+Within the compiler, the following macros are defined:
+
+    TARGET_MMA_NO_DENSE_MATH    ISA 3.1 MMA support.
+    TARGET_MMA_DENSE_MATH       MMA with dense math registers.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/rs6000-c.cc (rs6000_define_or_undefine_macro): Define
+	__MMA_DENSE_MATH__ if we have MMA that uses dense math register
+	accumulators. Define __MMA_NO_DENSE_MATH__ if we have MMA but we are
+	using ISA 3.1 where the accumulators are overlaid over VSX registers
+	0..31. Define __DENSE_MATH__ if we have dense math registers.
+	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Do not
+	allow -mdense-math unless -mcpu=future is used.
+	(rs6000_opt_masks): Add -mdense-math support.
+	* config/rs6000/rs6000.h (TARGET_MMA_DENSE_MATH): New macro.
+	(TARGET_MMA_NO_DENSE_MATH): Likewise.
+	* config/rs6000/rs6000.opt (-mdense-math): New option.
+	* doc/invoke.texi (RS/6000 and PowerPC Options): Add -mdense-math.
+
+==================== Branch work246-dmf, patch #102 ====================
+
+Switch to use wD constraint in mma.md.
+
+This is part two of the dense math register patches for the PowerPC.
+This is the 7th version of the dense math patches.
+
+Version 6 of the dense math register patches were posted on April 21st,
+2026.
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713355.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+This patch needs the -mcpu=future patch posted on April 8th, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712532.html
+
+This patch changes mma.md to use the wD constraint and accumulator_operand
+predicate that were added in the previous patch instead of using the d constraint
+and vsx_register_operand predicate. This is in anticipation of adding dense
+math registers in a future patch.
+
+In addition, I added a comment in front of each insn to indicate which
+instructions are being generated.
+
+Originally, these changes were in patch #4 in the V6 patches. I have separated
+the changes that switch to wD from the other part of the patch adding dense
+math register support.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/mma.md (mma_<vv>): Use the wD constraint and
+	accumulator_operand predicate for all MMA instructions taking
+	accumulator operands.
+	(mma_<avv>): Likewise.
+	(mma_<pv>): Likewise.
+	(mma_<apv>): Likewise.
+	(mma_<vvi4i4i8>): Likewise.
+	(mma_<avvi4i4i8>): Likewise.
+	(mma_<vvi4i4i2>): Likewise.
+	(mma_<avvi4i4i2>): Likewise.
+	(mma_<vvi4i4>): Likewise.
+	(mma_<avvi4i4>): Likewise.
+	(mma_<pvi4i2>): Likewise.
+	(mma_<apvi4i2>): Likewise.
+	(mma_<vvi4i4i4>): Likewise.
+	(mma_<avvi4i4i4>): Likewise.
+
+==================== Branch work246-dmf, patch #101 ====================
+
+Add wD constraint.
+
+This is part one of the dense math register patches for the PowerPC.
+This is the 7th version of the dense math patches.
+
+Version 6 of the dense math register patches were posted on April 21st,
+2026.
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713355.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+This patch needs the -mcpu=future patch posted on April 8th, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712532.html
+
+This particular patch did not change from version 6.
+
+This patch adds a new constraint ('wD') that matches the accumulator registers
+used by the MMA instructions. Possible future PowerPC machines might have a
+new set of 8 dense math accumulators that will be 1,024 bits in size. The 'wD'
+constraint was chosen because the VSX constraints start with 'w'.
+The 'wd' constraint was already used, so I chose 'wD' to be similar.
+
+To change code to possibly use dense math registers, the 'd' constraint should
+be changed to 'wD', and the predicate 'fpr_reg_operand' should be changed to
+'accumulator_operand'.
+
+On current power10/power11 systems, the accumulators overlap with the 32
+traditional FPR registers (i.e. VSX vector registers 0..31). Each accumulator
+uses 4 adjacent FPR/VSX registers for a 512 bit logical register.
+
+Possible future PowerPC machines would have these 8 accumulator registers be
+separate registers, called dense math registers. It is anticipated that when in
+dense math register mode, the MMA instructions would use the accumulators
+instead of the adjacent VSX registers. I.e. in power10/power11 mode,
+accumulator 1 will overlap with vector registers 4-7, but in dense math register
+mode, accumulator 1 will be a separate register.
+
+Code compiled for power10/power11 systems will continue to work on the potential
+future machine with dense math register support but the compiler will have fewer
+vector registers available for allocation because it believes the accumulators
+are using vector registers. For example, the file mma-double-test.c in the
+gcc.target/powerpc testsuite directory has 8 more register spills to/from the
+stack for power10/power11 code than when compiled with dense math register
+support.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
+2026-05-11 Michael Meissner <[email protected]>
+
+gcc/
+
+	* config/rs6000/constraints.md (wD): New constraint.
+	* config/rs6000/predicates.md (accumulator_operand): New predicate.
+	* config/rs6000/rs6000.cc (rs6000_debug_reg_global): Print the register
+	class for the 'wD' constraint.
+	(rs6000_init_hard_regno_mode_ok): Set up the 'wD' register constraint
+	class.
+	* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add element for
+	the 'wD' constraint.
+	* doc/md.texi (PowerPC constraints): Document the 'wD' constraint.
+
+==================== Branch work246-dmf, patch #100 (info) ====================
+
+This patch is a modification of the V6 patches that I sent out on April
+21st, 2026.
+
+In particular, I made the changes in relation to the comments posted in
+February that I didn't fully address previously.
+
+Here is the comment from February:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-February/708071.html
+
+Here is my reply:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/715248.html
+
+Here are the V6 patches posted on April 21st, 2026:
+
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713352.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713353.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713354.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713356.html
+  * https://gcc.gnu.org/pipermail/gcc-patches/2026-April/713357.html
+
+There are 7 patches in this patch set:
+
+Patch #1 adds the wD constraint and the accumulator_operand predicate.
+
+Patch #2 switches mma.md to use the wD constraint and accumulator_operand
+predicate.
+
+Patch #3 adds the -mdense-math option, but in this patch, -mdense-math is not
+implemented.
+
+Patch #4 adds support for 512-bit dense math registers.
+
+Patch #5 adds support for 1,024-bit dense math registers.
+
+Patch #6 is an optional patch that changes the name of the MMA instructions from
+the original name used in the power10/power11 timeline to a new alternate name
+that has 'dm' (for dense math) in the instruction name. Note, this is a new
+patch for the V7 patch set.
+
+Patch #7 clones the mma builtin tests to test the code generation of MMA
+instructions if -mcpu=future is used. Note, this is a new patch for the V7
+patch set. If patch #6 is not applied, this patch will need to be modified.
+
+
+The following is the description of dense math registers from previous versions
+of the patches.
+
+The Dense Math Facility (dmf) is designed to be an extension to the ISA
+3.1 (i.e. power10/power11) MMA facility. Now, since these are future
+patches, the Dense Math Facility might appear in future PowerPC
+machines or maybe it won't be used in real hardware.
+
+One of the concepts of the DMF system is that the accumulators used in the
+MMA and the DMF extensions will become separate registers, rather
+than being overlaid over the traditional floating point registers
+(i.e. VSX registers 0..31).
+
+In addition to being separate registers, the dense math accumulators
+are now logically 1,024 bits instead of 512.
+
+The way the Dense Math registers and instructions are designed,
+existing power10/power11 MMA instructions that operate on 512 bits will
+work with Dense Math. In ISA 3.1, each of the 8 accumulators is
+overlaid over 4 adjacent FPR registers, and the compiler must not touch
+the 4 adjacent FPRs while the MMA accumulator is used.
+
+In the Dense Math system, the accumulator is a separate register. When
+-mcpu=power11 or -mcpu=power10 is used, the GCC compiler will not
+allocate the appropriate FPR (VSX) registers when generating MMA
+instructions.
+
+If a function compiled for Power10/Power11 is run on a system with
+Dense Math support enabled, the effect is that a bunch of the FPR registers
+will not be allocated because the compiler assumes the accumulators are
+there. After these patches are applied, if the user compiles the code
+with -mcpu=future, the compiler can allocate up to 32 more vector
+registers, because the Dense Math accumulators are separate registers.
+
+In fact two of the MMA tests (mma-double-test.c and mma-single-test.c)
+do about 20 fewer spills of floating point values to the stack, since
+the compiler can allocate those FPR vector registers for other
+purposes.
+
+These 5 patches will allow GCC to allocate these registers if the
+-mcpu=future option is used.
+
+    1: The first patch adds a new constraint (%wD) that can be used by
+       code generating MMA instructions. If the user used -mcpu=power10
+       or -mcpu=power11, %wD will act like %d and insist the register be
+       VSX registers 0..31. If the user used -mcpu=future, the new
+       separate dense math accumulators will be used.
+
+    2: This patch just adds the -mdense-math option, but it does not add
+       support for dense math registers until patch #3.
+
+    3: This patch adds the support for the current MMA 512-bit
+       instructions to use separate accumulators instead of overlaid VSX
+       registers.
+
+    4: This patch adds support for an extension to MMA where the
+       accumulators grow to 1,024 bits instead of 512 bits.
+
+    5: This patch is an optional patch that adds comments to the various
+       MMA insn that explain what MMA instructions are generated by the
+       particular insn.
+
+This patch is the foundation for the Dense Math support. It is
+expected other patches may be added to this to support potential new
+features added to the Dense Math Facility.
+
+I have built bootstrap little endian compilers on power10 systems, and
+big endian compilers on power9 systems. There were no regressions in the
+tests. Can I add the patches to the GCC trunk after the -mcpu=future
+patch is applied and GCC 17 has opened up?
+
 ==================== Branch work246-dmf, baseline ====================
 
 2026-05-11 Michael Meissner <[email protected]>
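Note on patch #111: the log says GCC should recognize saturating subtracts
without a built-in function, but it does not show the source form. Below is a
minimal sketch in C of what such source might look like. It assumes that the
unsigned "a > b ? a - b : 0" idiom, and its ">=" twin suggested by the gtu_geu
iterator name, is the kind of expression the subfus<mode>3_<code> patterns are
meant to match. The function names are hypothetical and are not taken from the
new saturate-subtract tests.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned saturating subtract written with '>'.
   Assumption: this is the shape of expression that patch #111 targets.  */
static uint64_t
sat_sub_gtu (uint64_t a, uint64_t b)
{
  /* Clamp to zero instead of wrapping around when b is larger.  */
  return (a > b) ? a - b : 0;
}

/* Hypothetical helper: the same operation written with '>='.  The two
   forms compute the same value, because a == b yields 0 either way.  */
static uint64_t
sat_sub_geu (uint64_t a, uint64_t b)
{
  return (a >= b) ? a - b : 0;
}
```

Whether a given compiler emits subfus or subdus for these functions depends on
-mcpu=future support being present. On other targets they compile to ordinary
compare-and-subtract code and return the same values.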

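Note on patch #103: the two user-visible macros named in that patch description
can be used to select code paths at compile time. Below is a small sketch in C
that assumes only the macro names listed there (__MMA_DENSE_MATH__ and
__MMA_NO_DENSE_MATH__). The function name mma_flavor and the returned strings
are hypothetical.

```c
/* Hypothetical helper: report which MMA flavor this translation unit was
   compiled for, based on the macros described in patch #103.  */
static const char *
mma_flavor (void)
{
#if defined (__MMA_DENSE_MATH__)
  /* MMA with separate dense math accumulator registers.  */
  return "MMA with dense math registers";
#elif defined (__MMA_NO_DENSE_MATH__)
  /* ISA 3.1 MMA, accumulators overlaid on the VSX registers.  */
  return "ISA 3.1 MMA";
#else
  /* Neither macro is defined, for example on a non-PowerPC target.  */
  return "no MMA";
#endif
}
```

Checking the dense math macro first keeps the sketch independent of whether a
compiler ever defines both macros together. The patch description suggests they
are alternatives, but that is not spelled out.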