On Mon, 2022-06-06 at 20:55 -0400, Michael Meissner wrote:
> [PATCH 1/3] Disable generating store vector pair.
>
> Testing has revealed that the power10 has some slowdowns if the store
> vector pair instruction is generated in some cases.  This patch disables
> generating the store vector pair instructions (stxvp, pstxvp, and stxvpx)
> unless an undocumented switch is used.  It is anticipated that perhaps
> with future machines we can generate the store vector pair instruction.
>
> This patch does a split after reload to convert a store vector pair
> instruction into a pair of store vector instructions.
>
> We do continue to generate the load vector pair instructions (lxvp, plxvp,
> and lxvpx), since we have found that in code that heavily uses MMA, it is
> still a win to generate the load vector pair instructions.
>
> There are two future patches planned:
>
>    1) Disable block moves from generating load/store vector pair
>       instructions unless the store vector pair instructions are
>       being generated.
>
>    2) Make the built-in functions for generating store vector pair
>       always generate those instructions even if store vector pair
>       instructions are disabled.
>
> I have built bootstrap compilers and run the regression tests on three
> different systems:
>
>    1) Little endian power10 using the --with-cpu=power10 option.
>
>    2) Little endian power9 using the --with-cpu=power9 option.
>
>    3) Big endian power8 using the --with-cpu=power8 option.  On this
>       system, both 64-bit and 32-bit code generation was tested.
>
> There were no regressions in the runs except for the tests that are
> modified in patch #3 in this series of patches.  Can I check this patch
> into the trunk?  If there are no changes needed for the backports, can I
> check this code into the active branches after a burn-in period?
>
> 2022-06-06   Michael Meissner  <meiss...@linux.ibm.com>
>
> gcc/
>
> 	* config/rs6000/mma.md (movoo): Disable generating store vector
> 	pair instructions unless these are enabled by the user.
> 	(movxo): Likewise.
> 	* config/rs6000/rs6000.cc (rs6000_setup_reg_addr_masks): If store
> 	vector pair instructions are disabled, do not allow vector pair
> 	addresses to be indexed.
> 	(rs6000_split_multireg_move): Do not split XOmode stores into two
> 	store vector pair instructions unless store vector pair
> 	instructions are enabled.
> 	* config/rs6000/rs6000.md (isa attribute): Add stxvp attribute.
> 	(enabled attribute): Disable alternative using store vector pair
> 	instructions unless they are enabled.
> 	* config/rs6000/rs6000.opt (-mstore-vector-pair): New option.
>
> gcc/testsuite/
>
> 	* gcc.target/powerpc/p10-store-vector-pair-1.c: New test.
> 	* gcc.target/powerpc/p10-store-vector-pair-2.c: New test.
> ---
>  gcc/config/rs6000/mma.md                      | 41 ++++++----
>  gcc/config/rs6000/rs6000.cc                   |  9 +-
>  gcc/config/rs6000/rs6000.md                   |  8 +-
>  gcc/config/rs6000/rs6000.opt                  |  4 +
>  .../powerpc/p10-store-vector-pair-1.c         | 82 +++++++++++++++++++
>  .../powerpc/p10-store-vector-pair-2.c         | 81 ++++++++++++++++++
>  6 files changed, 206 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-store-vector-pair-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/p10-store-vector-pair-2.c
>
> diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> index a183b6a168a..9b5f243b88d 100644
> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -274,26 +274,35 @@ (define_expand "movoo"
>    DONE;
>  })
>
> +;; By default for power10, do not generate the stxvp/pstxvp/stxvpx
> +;; instructions.  Instead, split these instructions into two separate store
> +;; vector instructions.  We do always generate a lxvp/plxvp/lxvpx instruction.
> +;; We leave in the support for generating stxvp/pstxvp/stxvpx in future
> +;; machines.

... and if the (undocumented) STORE_VECTOR_PAIR option is indicated?

Nothing else jumps out at me.

Thanks
-Will