[gcc(refs/vendors/ibm/heads/gcc-12-branch)] ibm: Merge up to top of releases/gcc-12

2024-06-22 Thread Peter Bergner via Libstdc++-cvs
https://gcc.gnu.org/g:3409c8aae4b0e2a73fa34aec8c58c261384283dc

commit 3409c8aae4b0e2a73fa34aec8c58c261384283dc
Merge: 92786addfe0 218adac0fce
Author: Peter Bergner 
Date:   Sat Jun 22 08:54:14 2024 -0500

ibm: Merge up to top of releases/gcc-12

2024-06-22  Peter Bergner  

Merge up to releases/gcc-12 218adac0fce6135fcb5c0c56911272687f05872b

Diff:

 ChangeLog  |   4 +
 c++tools/ChangeLog |   4 +
 config/ChangeLog   |   4 +
 contrib/ChangeLog  |   4 +
 contrib/header-tools/ChangeLog |   4 +
 contrib/reghunt/ChangeLog  |   4 +
 contrib/regression/ChangeLog   |   4 +
 fixincludes/ChangeLog  |   4 +
 gcc/BASE-VER   |   2 +-
 gcc/ChangeLog  | 411 +
 gcc/ChangeLog.ibm  |   4 +
 gcc/DATESTAMP  |   2 +-
 gcc/ada/ChangeLog  |   9 +
 gcc/ada/exp_util.adb   |   6 +
 gcc/analyzer/ChangeLog |   4 +
 gcc/asan.cc|  26 +-
 gcc/attribs.cc |  17 +-
 gcc/bb-reorder.cc  |   3 +-
 gcc/bitmap.cc  |   2 +-
 gcc/builtins.cc|  16 +-
 gcc/c-family/ChangeLog |  34 ++
 gcc/c-family/c-attribs.cc  |  32 +-
 gcc/c-family/c-lex.cc  |  32 +-
 gcc/c-family/c-warn.cc |  13 +-
 gcc/c/ChangeLog|  14 +
 gcc/c/c-decl.cc|  15 +
 gcc/cfgexpand.cc   |  30 +-
 gcc/cfgrtl.cc  |   3 +-
 gcc/combine.cc |   6 +-
 gcc/config/aarch64/aarch64.cc  |   2 +
 gcc/config/alpha/alpha.md  |  21 +-
 gcc/config/alpha/constraints.md|   2 +-
 gcc/config/arm/arm.cc  |  76 +++-
 gcc/config/i386/i386-expand.cc |  17 +
 gcc/config/i386/i386.cc|  62 +++-
 gcc/config/i386/x86-tune.def   |   2 +-
 gcc/config/mips/mips.cc|  11 +-
 gcc/config/rs6000/rs6000-builtin.cc|   2 +-
 gcc/config/rs6000/rs6000-c.cc  |  62 ++--
 gcc/config/rs6000/rs6000-gen-builtins.cc   |  72 ++--
 gcc/cp/ChangeLog   |  35 ++
 gcc/cp/cp-gimplify.cc  |   4 +
 gcc/cp/semantics.cc|  10 +-
 gcc/d/ChangeLog|   4 +
 gcc/doc/generic.texi   |   2 +-
 gcc/doc/rtl.texi   |   2 +-
 gcc/fold-const.cc  |  20 +-
 gcc/fortran/ChangeLog  |   4 +
 gcc/ggc-common.cc  |   2 +-
 gcc/gimple-ssa-sprintf.cc  |  20 +-
 gcc/go/ChangeLog   |   4 +
 gcc/internal-fn.cc |  19 +
 gcc/ipa-icf.cc |  32 +-
 gcc/jit/ChangeLog  |   4 +
 gcc/lra-constraints.cc |   5 +
 gcc/lra.cc |   5 +-
 gcc/lto/ChangeLog  |   4 +
 gcc/objc/ChangeLog |   4 +
 gcc/objcp/ChangeLog|   4 +
 gcc/opts-common.cc |   6 +-
 gcc/po/ChangeLog   |   4 +
 gcc/rtl-ssa/blocks.cc  |   7 +-
 gcc/rtlanal.cc |  11 +-
 gcc/system.h   |  39 +-
 gcc/testsuite/ChangeLog| 317 
 gcc/testsuite/c-c++-common/Warray-compare-3.c  |  13 +
 gcc/testsuite/c-c++-common/Wattributes-3.c |  13 +
 gcc/testsuite/g++.dg/cpp1z/pr115440.C  |   8 +
 gcc/testsuite/g++.dg/cpp2a/bit-cast16.C|  16 +
 .../g++.dg/cpp2a/is-constant-evaluated15.C |  28 ++
 gcc/testsuite/g++.dg/ext/attrib68.C|   8 +
 gcc/testsuite/g++.dg/ext/pr114691.C|  22 ++
 gcc/testsuite/g++.dg/torture/vector-struct-1.C |  18 +
 gcc/testsuite/g++.target/i386/pr111497.C   |  22 ++
 gcc/testsuite/gcc.c-torture/compile/pr113603.c |  40 ++
 gcc/testsuite/gcc.c-torture

[gcc/ibm/heads/gcc-12-branch] (99 commits) ibm: Merge up to top of releases/gcc-12

2024-06-22 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-12-branch' was updated to point to:

 3409c8aae4b... ibm: Merge up to top of releases/gcc-12

It previously pointed to:

 92786addfe0... ibm: Merge up to top of releases/gcc-12

Diff:

Summary of changes (added commits):
---

  3409c8a... ibm: Merge up to top of releases/gcc-12
  218adac... Daily bump. (*)
  169d4d1... libstdc++: Fix test on x86_64 and non-simd targets (*)
  8b5bdeb... libstdc++: Fix find_last_set(simd_mask) to ignore padding b (*)
  cdbff5f... Daily bump. (*)
  b9569e7... libstdc++: Fix simd conversion for -fno-signed-char f (*)
  f79b273... libstdc++: Avoid MMX return types from __builtin_shufflevec (*)
  fb06754... libstdc++: Use __builtin_shufflevector for simd split and c (*)
  c60dd0e... diagnostics: Fix add_misspelling_candidates [PR115440] (*)
  8f612e6... c-family: Fix -Warray-compare warning ICE [PR115290] (*)
  98794d9... Bump BASE-VER. (*)
  2bada4b... Update ChangeLog and version files for release (*)
  fac4fbd... Daily bump. (*)
  9c0c31d... Daily bump. (*)
  b87418f... Daily bump. (*)
  ea0aa97... libstdc++: Fix declaration of posix_memalign for freestandi (*)
  caa4702... Daily bump. (*)
  26640a5... Daily bump. (*)
  573a5f3... Daily bump. (*)
  5319283... Daily bump. (*)
  6693b1f... Daily bump. (*)
  448dd00... arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR11 (*)
  33663c0... cfgrtl: Fix MEM_EXPR update in duplicate_insn_chain [PR1149 (*)
  959cef9... [PR111497][LRA]: Copy substituted equivalence (*)
  844ff32... middle-end/40635 - SSA update losing PHI arg loations (*)
  1edc6a7... rtl-optimization/54052 - RTL SSA PHI insertion compile-time (*)
  3d9e4ee... testsuite: Fix expand-return CMSE test for Armv8.1-M [PR115 (*)
  55c1687... arm: Zero/Sign extends for CMSE security on Armv8-M.baselin (*)
  f38ffe3... Daily bump. (*)
  a995fde... Include safe-ctype.h after C++ standard headers, to avoid o (*)
  8f11ed1... libcc1: fix  include (*)
  d30afaa... PHIOPT: Don't transform minmax if middle bb contains a phi  (*)
  870e389... libstdc++: Fix fwrite error parameter (*)
  3837f95... libstdc++: Define __cpp_lib_constexpr_algorithms in  [ (*)
  80d0f82... libstdc++: Reverse arguments in constraint for std::optiona (*)
  c394e29... libstdc++: Destroy allocators in re-inserted container node (*)
  d4126b3... c: Fix up pointer types to may_alias structures [PR114493] (*)
  b065824... fold-const: Fix up CLZ handling in tree_call_nonnegative_wa (*)
  91a3712... builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overf (*)
  bda8c28... rs6000: Fix up PCH in --enable-host-pie builds [PR115324] (*)
  840bc67... combine: Fix up simplify_compare_const [PR115092] (*)
  25bd98d... tree-inline: Remove .ASAN_MARK calls when inlining function (*)
  bf13440... gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t)  (*)
  cc96dc5... openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tre (*)
  7d06735... rtlanal: Fix set_noop_p for volatile loads or stores [PR114 (*)
  b3ef00f... internal-fn: Temporarily disable flag_trapv during .{ADD,SU (*)
  bb21a7d... attribs: Don't crash on NULL TREE_TYPE in diag_attr_exclusi (*)
  e9b960e... c++: Fix bogus warnings about ignored annotations [PR114691 (*)
  082fe43... asan, v3: Fix up handling of > 32 byte aligned variables wi (*)
  b3b7176... c++: Fix up maybe_warn_for_constant_evaluated calls [PR1145 (*)
  f8a3279... vect: Don't clear base_misaligned in update_epilogue_loop_v (*)
  f33e8ee... c++: Fix ICE with weird copy assignment operator [PR114572] (*)
  42afabb... fold-const: Handle NON_LVALUE_EXPR in native_encode_initial (*)
  9987fe6... libquadmath: Don't assume the storage for __float128 argume (*)
  81c300b... icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged (*)
  9f48459... aarch64: Fix TImode __sync_*_compare_and_exchange expansion (*)
  b294d46... bb-reorder: Fix -freorder-blocks-and-partition ICEs on aarc (*)
  9299722... i386: Fix ICEs with SUBREGs from vector etc. constants to X (*)
  c2cd5ee... c: Handle scoped attributes in __has*attribute and scoped a (*)
  fda7a89... attribs: Don't canonicalize lookup_scoped_attribute_spec ar (*)
  e697601... ggc-common: Fix save PCH assertion (*)
  f5758e8... tree-ssa-strlen: Fix up handle_store [PR113603] (*)
  ba38543... docs: Fix 2 typos (*)
  bc51282... i386: Add -masm=intel profiling support [PR113122] (*)
  170c2bb... cfgexpand: Workaround CSE of ADDR_EXPRs in VAR_DECL partiti (*)
  3f0d1e5... libgomp: Fix up FLOCK fallback handling [PR113192] (*)
  ca8ad80... c-family: copy attribute diagnostic fixes [PR113262] (*)
  d73137a... tree-optimization/111070 - fix ICE with recent ifcombine fi (*)
  cc835f4... Daily bump. (*)
  12a3ba2... Fix crash on access-to-incomplete type (*)
  481a766... Daily bump. (*)
  6e35fb3... Daily bump. (*)
  4745c29... Daily bump. (*)
  5d52558... Disable FMADD in chains for Zen4 and generic (*)
  208c8dc... Daily bump. (*)
  a741bb3... Daily bump. (*)
  a7edd18... Daily bump. 

[PATCH] rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759]

2024-06-19 Thread Peter Bergner
We currently only emit the ROP-protect hash* insns for Power10, where the
insns were added to the architecture.  We want to emit them for earlier
cpus (where they operate as NOPs), so that if those older binaries are
ever executed on a Power10, then they'll be protected from ROP attacks.
Binutils accepts hashst and hashchk back to Power8, so change GCC to emit
them for Power8 and later.  This matches clang's behavior.

This patch is independent of the ROP shrink-wrap fix submitted earlier.
This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk?  

Peter



2024-06-19  Peter Bergner  

gcc/
PR target/114759
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Use TARGET_POWER8.
(rs6000_emit_prologue): Likewise.
* config/rs6000/rs6000.md (hashchk): Likewise.
(hashst): Likewise.
Fix whitespace.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-2.c: New test.
* lib/target-supports.exp (rop_ok): Use
check_effective_target_has_arch_pwr8.
---
 gcc/config/rs6000/rs6000-logue.cc |  6 +++---
 gcc/config/rs6000/rs6000.md   |  6 +++---
 gcc/testsuite/gcc.target/powerpc/pr114759-2.c | 17 +
 gcc/testsuite/lib/target-supports.exp |  2 +-
 4 files changed, 24 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-2.c

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index c384e48e378..bd363b625a4 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -716,7 +716,7 @@ rs6000_stack_info (void)
   info->calls_p = (!crtl->is_leaf || cfun->machine->ra_needs_full_frame);
   info->rop_hash_size = 0;
 
-  if (TARGET_POWER10
+  if (TARGET_POWER8
   && info->calls_p
   && DEFAULT_ABI == ABI_ELFv2
   && rs6000_rop_protect)
@@ -3277,7 +3277,7 @@ rs6000_emit_prologue (void)
   /* NOTE: The hashst isn't needed if we're going to do a sibcall,
  but there's no way to know that here.  Harmless except for
  performance, of course.  */
-  if (TARGET_POWER10 && rs6000_rop_protect && info->rop_hash_size != 0)
+  if (TARGET_POWER8 && rs6000_rop_protect && info->rop_hash_size != 0)
 {
   gcc_assert (DEFAULT_ABI == ABI_ELFv2);
   rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
@@ -5056,7 +5056,7 @@ rs6000_emit_epilogue (enum epilogue_type epilogue_type)
 
   /* The ROP hash check must occur after the stack pointer is restored
  (since the hash involves r1), and is not performed for a sibcall.  */
-  if (TARGET_POWER10
+  if (TARGET_POWER8
   && rs6000_rop_protect
   && info->rop_hash_size != 0
   && epilogue_type != EPILOGUE_TYPE_SIBCALL)
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a5d20594789..694076e311f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -15808,9 +15808,9 @@ (define_insn "*cmpeqb_internal"
 
 (define_insn "hashst"
   [(set (match_operand:DI 0 "simple_offsettable_mem_operand" "=m")
-(unspec_volatile:DI [(match_operand:DI 1 "int_reg_operand" "r")]
+   (unspec_volatile:DI [(match_operand:DI 1 "int_reg_operand" "r")]
UNSPEC_HASHST))]
-  "TARGET_POWER10 && rs6000_rop_protect"
+  "TARGET_POWER8 && rs6000_rop_protect"
 {
   static char templ[32];
   const char *p = rs6000_privileged ? "p" : "";
@@ -15823,7 +15823,7 @@ (define_insn "hashchk"
   [(unspec_volatile [(match_operand:DI 0 "int_reg_operand" "r")
 (match_operand:DI 1 "simple_offsettable_mem_operand" "m")]
UNSPEC_HASHCHK)]
-  "TARGET_POWER10 && rs6000_rop_protect"
+  "TARGET_POWER8 && rs6000_rop_protect"
 {
   static char templ[32];
   const char *p = rs6000_privileged ? "p" : "";
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr114759-2.c
new file mode 100644
index 000..3881ebd416e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114759-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mrop-protect" } */
+/* { dg-require-effective-target rop_ok } Only enable on supported ABIs.  */
+
+/* Verify we generate ROP-protect hash insns when compiling for Power8.  */
+
+extern void foo (void);
+
+int
+bar (void)
+{
+  foo ();
+  return 5;
+}
+
+/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-suppo

[PATCH v2] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-18 Thread Peter Bergner
Updated patch.  This passed bootstrap and regtesting on powerpc64le-linux
with no regressions.  Ok for trunk?

Changes from v1:
1. Moved the disabling of shrink-wrapping to rs6000_emit_prologue
   and beefed up comment.  Used a more accurate test.
2. Added comment to the test case on why rop_ok is needed.

Peter


rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

Only disable shrink-wrapping when using -mrop-protect when we know we
will be emitting the ROP-protect hash instructions (ie, non-leaf functions).

2024-06-17  Peter Bergner  

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
the disabling of shrink-wrapping from here
* config/rs6000/rs6000-logue.cc (rs6000_emit_prologue): ...to here.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-1.c: New test.
---
 gcc/config/rs6000/rs6000-logue.cc |  5 +
 gcc/config/rs6000/rs6000.cc   |  4 
 gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 
 3 files changed, 21 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 193e2122c0f..c384e48e378 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -3018,6 +3018,11 @@ rs6000_emit_prologue (void)
&& (lookup_attribute ("no_split_stack",
  DECL_ATTRIBUTES (cfun->decl))
== NULL));
+  /* If we are inserting ROP-protect hash instructions, disable shrink-wrap
+ until the bug where the hashst insn is emitted in the wrong location
+ is fixed.  See PR101324 for details.  */
+  if (info->rop_hash_size)
+flag_shrink_wrap = 0;
 
   frame_pointer_needed_indeed
 = frame_pointer_needed && df_regs_ever_live_p (HARD_FRAME_POINTER_REGNUM);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e4dc629ddcc..fd6e013c346 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void)
 }
   else if (!OPTION_SET_P (flag_cunroll_grow_size))
 flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
-
-  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
-  if (rs6000_rop_protect)
-flag_shrink_wrap = 0;
 }
 
 #ifdef TARGET_USES_LINUX64_OPT
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
new file mode 100644
index 000..579e08e920f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect 
-fdump-rtl-pro_and_epilogue" } */
+/* { dg-require-effective-target rop_ok } Only enable on supported ABIs. */
+
+/* Verify we still attempt shrink-wrapping when using -mrop-protect
+   and there are no function calls.  */
+
+long
+foo (long arg)
+{
+  if (arg)
+asm ("" ::: "r20");
+  return 0;
+}
+
+/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 
"pro_and_epilogue" } } */
-- 
2.43.0



Re: [PATCH] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-18 Thread Peter Bergner
On 6/18/24 3:38 PM, Segher Boessenkool wrote:
> From my viewpoint, -mrop-protect should not change code generation at
> all, except of course it has to emit some hash* insns :-)

Ideally, I agree with that.  That said, the hash* insns only accept negative
offsets and the allowed range is rather limited, so from a practical standpoint,
we might have to modify the prologue code slightly to satisfy the restrictions
those insns have.

Peter



Re: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2024-06-18 Thread Peter Bergner
On 6/12/24 2:50 AM, Kewen.Lin wrote:
> As the recent PR115355 shows, this issue can also affect the
> behavior when users are adopting vectorization optimization,
> IMHO we should get this landed as soon as possible.

I agree we want this fixed ASAP.




> As all said above, I believe this patch is a correct fix and
> considering the impact of the issue, I'd like to get this
> pushed next week if no objections.

The only complaint I have on the patch, and I know this existed before
the patch, is we're using register_operand for the predicate for these
patterns when we probably should be using altivec_register_operand or
vsx_register_operand depending on the specific pattern.

Yes, other pre-existing patterns use that, but those should probably be
fixed too.  Maybe we go with register_operand for now with this patch
and then have a follow-on patch (from us) that cleans those all up???

Otherwise, LGTM (although I can't approve it).

Peter




Re: [PATCH] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-18 Thread Peter Bergner
On 6/18/24 8:20 AM, Segher Boessenkool wrote:
> On Mon, Jun 17, 2024 at 08:54:46PM -0500, Peter Bergner wrote:
>> So we should be able to shrink-wrap in the presence of the ROP protection.
[snip]
> But do we want to?  And, how far, in what cases not?

My answer to the above would be "yes", "as far as we do today without
-mrop-protect" and "none". :-)  I don't think -mrop-protect should affect
whether we shrink-wrap or not.  I don't think shrink-wrapping call free
paths makes the compiled code less secure by not emitting the hashst/hashchk
insns on those paths, so why would we do anything different wrt shrink-wrapping?


Peter



Re: [PATCH] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-17 Thread Peter Bergner
On 6/17/24 7:57 PM, Segher Boessenkool wrote:
> On Mon, Jun 17, 2024 at 06:49:18PM -0500, Peter Bergner wrote:
>> On 6/17/24 6:11 PM, Segher Boessenkool wrote:
>> Yeah, I didn't write that, I only moved it, but I can try to come up with
>> an explanation of why we need to disable it now.  That said, my hope is to
>> not have to disable shrink-wrapping even when we emit the ROP protect hash
>> insns in the future, but that will take some extra work.  If I can manage
>> that, then this should all just go away. :-)  Until then, we can stick
>> with this patch's micro-optimization.
> 
> If you inline one function into another, there is no ROP protection on
> their boundary anymore (since there is no such boundary anymore!)  This
> is not necessarily a problem, but you do want some noipa or similar
> markup where without ROP protection you have no incentive to do that.
> 
> Shrink-wrapping allows more inlining, and more inlining allows more
> shrink-wrapping, but there is no direct relation between shrink-wrapping
> and our ROP protect stuff?  We just need to make sure the hashst and
> hashchk things are done at the very start and the very end of the
> functions, but we need to make sure of that anyway!
> 
> So yeah, please investigate a bit more :-)

So we should be able to shrink-wrap in the presence of the ROP protection.
The ROP attacks work by buffer overrun type issues, clobbering the return
address that was saved on the stack causing us to return to somewhere else.
If we don't need to save the return address on the stack like for leaf
functions, or shrink-wrapped sections that are call free, those codes
are not really susceptible to ROP attacks.  It's the call paths where we
save the return address on the stack that we have to protect.  If inlining
or shrink wrapping increases the amount of code that is call free (ie, we
don't need to save the return address), then that code is not less safe
than before but as safe or safer than before.  It seems the reason we
disabled shrink-wrapping now, was that we were emitting the hashst in the
wrong location (PR101324) causing us to store a bad hash value.  I think
that was just a "bug" that probably should have been fixed rather than
worked around by disabling shrink-wrapping.  It's on my TODO to take a
look at fixing that correctly.





>> At the moment, yes, since the rop_ok test not only checks for the -mcpu= 
>> level,
>> it also verifies that the ABI is ok.
> 
> Ah right!  Add a short comment?

Can do.


>> Currently, rop_ok makes sure we have
>> Power10 and ELFv2 ABI being used.  So currently, if we were to run this test
>> on BE, we'd get an UNSUPPORTED using the rop_ok check, but if we removed it,
>> we'd see a FAIL.  
> 
> Yup.


Peter




[PATCH] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-17 Thread Peter Bergner
While auditing our ROP code generation for some test cases I wrote, I noticed
a few issues which I'm tracking in PR114759.  The first issue I noticed is we
disable shrink-wrapping when using -mrop-protect, even in the cases where we
never emit the ROP instructions because they're not needed.  The problem is
we disable shrink-wrapping too early, before we know whether we will need to
emit the ROP instructions or not.  The fix is to delay disabling shrink
wrapping until we've decided whether we will or won't be emitting the ROP
instructions.

This patch passed bootstrap and regtesting on powerpc64le-linux with no
regressions, with the unpatched build FAILing the new test case and the
patched build PASSing the new test case.
Ok for trunk?

Peter



rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

Only disable shrink-wrapping when using -mrop-protect when we know we
will be emitting the ROP instructions (ie, non-leaf functions).

2024-06-17  Peter Bergner  

gcc/
PR target/114759
* config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
the disabling of shrink-wrapping from here
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): ...to here.

gcc/testsuite/
PR target/114759
* gcc.target/powerpc/pr114759-1.c: New test.
---
 gcc/config/rs6000/rs6000-logue.cc |  6 +-
 gcc/config/rs6000/rs6000.cc   |  4 
 gcc/testsuite/gcc.target/powerpc/pr114759-1.c | 16 
 3 files changed, 21 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr114759-1.c

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 193e2122c0f..659da0bd53f 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -720,7 +720,11 @@ rs6000_stack_info (void)
   && info->calls_p
   && DEFAULT_ABI == ABI_ELFv2
   && rs6000_rop_protect)
-info->rop_hash_size = 8;
+{
+  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
+  flag_shrink_wrap = 0;
+  info->rop_hash_size = 8;
+}
   else if (rs6000_rop_protect && DEFAULT_ABI != ABI_ELFv2)
 {
   /* We can't check this in rs6000_option_override_internal since
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e4dc629ddcc..fd6e013c346 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void)
 }
   else if (!OPTION_SET_P (flag_cunroll_grow_size))
 flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
-
-  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
-  if (rs6000_rop_protect)
-flag_shrink_wrap = 0;
 }
 
 #ifdef TARGET_USES_LINUX64_OPT
diff --git a/gcc/testsuite/gcc.target/powerpc/pr114759-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
new file mode 100644
index 000..b4ba366402f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect 
-fdump-rtl-pro_and_epilogue" } */
+/* { dg-require-effective-target rop_ok } */
+
+/* Verify we still attempt shrink-wrapping when using -mrop-protect
+   and there are no function calls.  */
+
+long
+foo (long arg)
+{
+  if (arg)
+asm ("" ::: "r20");
+  return 0;
+}
+
+/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 
"pro_and_epilogue" } } */
-- 
2.43.0



Re: [PATCH] rs6000: ROP - Do not disable shrink-wrapping for leaf functions [PR114759]

2024-06-17 Thread Peter Bergner
On 6/17/24 6:11 PM, Segher Boessenkool wrote:
> "ROP insns" are the instructions used in such exploits, not what you
> mean here :-)
> 
> The instructions are called "hash*"C, so maybe call tbem "hash insns"
> or "ROP protect hash insns"?.

Ok, that bad verbiage was in the extra commentary not part of the git
log entry.  That said, I'll reword that to the following:

 Only disable shrink-wrapping when using -mrop-protect when we know we
-will be emitting the ROP instructions (ie, non-leaf functions).
+will be emitting the ROP protect hash instructions (ie, non-leaf functions).




>>  * config/rs6000/rs6000.cc (rs6000_override_options_after_change): Move
>>  the disabling of shrink-wrapping from here
>>  * config/rs6000/rs6000-logue.cc (rs6000_stack_info): ...to here.
> 
> Hrm.  Can you do it in some particular caller of rs6000_stack_info,
> instead?  The rs6000_stack_info function itself is not suppposed to
> change any state whatsoever.

Sure, I can look at maybe moving that to the caller or maybe somewhere
better.  I'll repost the patch once I find a better location.



> The comment should say *why*!  The fact that we do is clear from the
> code itself already.  But why do we want this?
> 
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -3427,10 +3427,6 @@ rs6000_override_options_after_change (void)
>>  }
>>else if (!OPTION_SET_P (flag_cunroll_grow_size))
>>  flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>> -
>> -  /* If we are inserting ROP-protect instructions, disable shrink wrap.  */
>> -  if (rs6000_rop_protect)
>> -flag_shrink_wrap = 0;
>>  }
> 
> (Yes, I know the original code didn't say either, but let's try to make
> things better :-) )

Yeah, I didn't write that, I only moved it, but I can try to come up with
an explanation of why we need to disable it now.  That said, my hope is to
not have to disable shrink-wrapping even when we emit the ROP protect hash
insns in the future, but that will take some extra work.  If I can manage
that, then this should all just go away. :-)  Until then, we can stick
with this patch's micro-optimization.




>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr114759-1.c
>> @@ -0,0 +1,16 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect 
>> -fdump-rtl-pro_and_epilogue" } */
>> +/* { dg-require-effective-target rop_ok } */
> 
> Do you want rop_ok while you are *forcing* it to be okay anyway?  Why?

At the moment, yes, since the rop_ok test not only checks for the -mcpu= level,
it also verifies that the ABI is ok.  Currently, rop_ok makes sure we have
Power10 and ELFv2 ABI being used.  So currently, if we were to run this test
on BE, we'd get an UNSUPPORTED using the rop_ok check, but if we removed it,
we'd see a FAIL.  

As we discussed offline, the plan is to eventually enable emitting the ROP 
protect
hash insns on other ABIs, but until then, I think we want to keep the rop_ok 
check
so as to keep Bill's CI builder from flagging it as a FAIL.

Peter




Re: [PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-17 Thread Peter Bergner
On 6/16/24 9:40 PM, Kewen.Lin wrote:
> on 2024/6/17 10:31, Peter Bergner wrote:
>> On 6/16/24 9:10 PM, Kewen.Lin wrote:
>>> on 2024/6/15 01:05, Peter Bergner wrote:
>>>> That said, the --with-cpu=power5 build without fortran did bootstrap and
>>>> regtest with no regressions, so the build did test that code path and
>>>> exposed no problems.
>>>
>>> OK, nice!  Thanks!
>>
>> I assume this means you're "OK" with the updated patch, correct?
> 
> Yes, OK for trunk, thanks!

Thanks.  We will need backports to GCC 11, as it is broken back to when
ROP was first added then.  I'll let things burn-in on trunk for a couple
of days so Bill's CI builders have a chance to test it on all of our
configs.  





>> Do you want to take a stab at writing that or do you want me to do that?
> 
> Either is fine for me, then let me give it a shot.

Sounds good, thanks.  That will allow me to handle the other ROP issues
I came across, which are reported in PR114759.

Peter




Re: [PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-17 Thread Peter Bergner
On 6/16/24 9:10 PM, Kewen.Lin wrote:
> on 2024/6/15 01:05, Peter Bergner wrote:
>> That said, the --with-cpu=power5 build without fortran did bootstrap and
>> regtest with no regressions, so the build did test that code path and
>> exposed no problems.
> 
> OK, nice!  Thanks!

I assume this means you're "OK" with the updated patch, correct?




>> Currently, TARGET_ALTIVEC_ABI is defined as:
>>
>>   #define TARGET_ALTIVEC_ABI rs6000_altivec_abi
>>
>> Would it make sense to redine it to:
>>
>>   #define TARGET_ALTIVEC_ABI (TARGET_ALTIVEC && rs6000_altivec_abi)
>>
>> ...or add some code in rs6000 option handling to disable rs6000_altivec_abi
>> when TARGET_ALTIVEC is false?  or do we care enough to even change it? 
>> :-)
> 
> Assuming the current code is robust enough (perfectly guarded by some altivec 
> related
> condition like this altivec register saving slot), there may not any actual 
> errors,
> but considering not surprising people, I'm inclined to add some option 
> handlings for
> it, like unsetting rs6000_altivec_abi if !TARGET_ALTIVEC and give some 
> warning if it's
> explicitly specified, what do you think?

I like it, since if Altivec is disabled, having TARGET_ALTIVEC_ABI enabled 
makes no
sense to me.  That is orthogonal to this bug though, so should be a separate 
patch.
Do you want to take a stab at writing that or do you want me to do that?


Peter




Re: [PATCH] rs6000, altivec-2-runnable.c update the require-effective-target

2024-06-17 Thread Peter Bergner
On 6/14/24 1:37 PM, Carl Love wrote:
> Per the additional feedback after patch: 
> 
>   commit c892525813c94b018464d5a4edc17f79186606b7
>   Author: Carl Love 
>   Date:   Tue Jun 11 14:01:16 2024 -0400
> 
>   rs6000, altivec-2-runnable.c should be a runnable test
> 
>   The test case has "dg-do compile" set not "dg-do run" for a runnable
>   test.  This patch changes the dg-do command argument to run.
> 
>   gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-2-runnable.c: Change dg-do
>   argument to run.

Test case altivec-1-runnable.c seems to have the same issue, in that it
is currently a dg-do compile test case rather than the intended dg-do run.
Can you have a look at changing that to dg-do run too?  My guess it that
this one will want something similar to some other altivec test cases, ala:

/* { dg-do run { target vmx_hw } } */
/* { dg-do compile { target { ! vmx_hw } } } */
/* { dg-require-effective-target powerpc_altivec_ok } */
/* { dg-options "-O2 -maltivec -mabi=altivec" } */


That said, I don't like not having a -mdejagnu-cpu=... here.
I think for our server cpus, this is fine, but on an embedded system
with a old ISA default for -mcpu=... (so we be doing a dg-do compile),
just adding -maltivec to that default may not make much sense for that
default and probably should be an error.  Maybe something like:

/* { dg-do run { target vmx_hw } } */
/* { dg-do compile { target { ! vmx_hw } } } */
/* { dg-require-effective-target powerpc_altivec_ok } */
/* { dg-options "-O2 -mdejagnu=power7" } */

...makes more sense?   Ke Wen & Segher, thoughts on that?
Ke Wen, should powerpc_altivec_ok be powerpc_altivec here???

Peter




[gcc r15-1377] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-17 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:c70eea0dba5f223d49c80cfb3e80e87b74330aac

commit r15-1377-gc70eea0dba5f223d49c80cfb3e80e87b74330aac
Author: Peter Bergner 
Date:   Fri Jun 14 14:36:20 2024 -0500

rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

We currently only compute the offset for the ROP hash save location in
the stack frame for Altivec compiles.  For non-Altivec compiles when we
emit ROP mitigation instructions, we use a default offset of zero which
corresponds to the backchain save location which will get clobbered on
any call.  The fix is to compute the ROP hash save location for all
compiles.

2024-06-14  Peter Bergner  

gcc/
PR target/115389
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Compute
rop_hash_save_offset for non-Altivec compiles.

gcc/testsuite
PR target/115389
* gcc.target/powerpc/pr115389.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc   |  9 -
 gcc/testsuite/gcc.target/powerpc/pr115389.c | 17 +
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index d61a25a51264..193e2122c0f9 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -817,17 +817,16 @@ rs6000_stack_info (void)
  gcc_assert (info->altivec_size == 0
  || info->altivec_save_offset % 16 == 0);
 
- /* Adjust for AltiVec case.  */
- info->ehrd_offset = info->altivec_save_offset - ehrd_size;
-
  /* Adjust for ROP protection.  */
  info->rop_hash_save_offset
= info->altivec_save_offset - info->rop_hash_size;
- info->ehrd_offset -= info->rop_hash_size;
}
   else
-   info->ehrd_offset = info->gp_save_offset - ehrd_size;
+ /* Adjust for ROP protection.  */
+ info->rop_hash_save_offset
+   = info->gp_save_offset - info->rop_hash_size;
 
+  info->ehrd_offset = info->rop_hash_save_offset - ehrd_size;
   info->ehcr_offset = info->ehrd_offset - ehcr_size;
   info->cr_save_offset = reg_size; /* first word when 64-bit.  */
   info->lr_save_offset = 2*reg_size;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr115389.c 
b/gcc/testsuite/gcc.target/powerpc/pr115389.c
new file mode 100644
index ..a091ee8a1be0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr115389.c
@@ -0,0 +1,17 @@
+/* PR target/115389 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mno-vsx -mno-altivec 
-mabi=no-altivec -save-temps" } */
+/* { dg-require-effective-target rop_ok } */
+
+/* Verify we do not emit invalid offsets for our ROP insns.  */
+
+extern void foo (void);
+long
+bar (void)
+{
+  foo ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */


Re: [PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-14 Thread Peter Bergner
On 6/13/24 10:26 PM, Peter Bergner wrote:
> On 6/13/24 9:26 PM, Kewen.Lin wrote:
>>>> I understand this is just copied from the if arm, but if I read this 
>>>> right, it can be
>>>> simplified as:
>>>
>>> Ok, I'll retest with that simplification.
> 
> So I retested a normal powerpc64le-linux build (ie, we default to Power8
> with Altivec) and it bootstrapped and regtested with no regressions.
> I then attempted a --with-cpu=power5 build to test the non-altivec path,
> but both the unpatched and patched builds died building libgfortran with
> the following error: "error: ‘_Float128’ is not supported on this target".
> I believe that is related to PR113652.  I'll kick off the build again,
> this time disabling Fortran and seeing if the build completes.

My bad for calling the --with-cpu=power5 bootstrap build on ELFv2 a "bug".
It's not, since ELFv2 mandates a cpu with at least ISA 2.07 (eg. Power8)
support and some of the libgfortran code was written assuming that, so what
I was trying to do was really not supported (ie, luser error).

That said, the --with-cpu=power5 build without fortran did bootstrap and
regtest with no regressions, so the build did test that code path and
exposed no problems.



>>> That's what I expected too! :-)  However, I was surprised to learn that 
>>> -mno-altivec
>>> does *not* disable TARGET_ALTIVEC_ABI.  I had to explicitly use the -mabi= 
>>> option to
>>> expose the bug.
>>
>> oh, it's surprising, I learn something today! :) I guess it's not 
>> intentional but just no
>> one noticed it, as it seems nonsense to have altivec ABI extension but not 
>> using any altivec
>> features.

Currently, TARGET_ALTIVEC_ABI is defined as:

  #define TARGET_ALTIVEC_ABI rs6000_altivec_abi

Would it make sense to redine it to:

  #define TARGET_ALTIVEC_ABI (TARGET_ALTIVEC && rs6000_altivec_abi)

...or add some code in rs6000 option handling to disable rs6000_altivec_abi
when TARGET_ALTIVEC is false?  or do we care enough to even change it? :-)

Peter




Re: [PATCH] rs6000, altivec-2-runnable.c should be a runnable test

2024-06-13 Thread Peter Bergner
On 6/13/24 9:34 PM, Kewen.Lin wrote:
> on 2024/6/14 05:16, Carl Love wrote:

>>  /* { dg-options "-mvsx" } */
>>  /* { dg-additional-options "-mdejagnu-cpu=power8" { target { ! 
>> has_arch_pwr8 } } } */

With the above, we're going to compile and run this test case with -mcpu=power8
or higher, which means we could have P8, P9 or even P10 instructions emitted.



>>  /* { dg-require-effective-target powerpc_vsx } */
> 
> Since you changed this for "run", I think you also want s/powerpc_vsx/vsx_hw/ 
> .

...which means we'd need p8vector_hw, p9vector_hw or ... here.


Should we just always compile with -mcpu=power8 and then check for p8vector_hw
to make our lives easier?  Ala...


   /* { dg-options "-mdejagnu-cpu=power8" } */
   ...
   /* { dg-require-effective-target p8vector_hw } */


Note I've removed -mvsx, since that is implied by -mcpu=power8 and no
need for dg-additional-options.   Maybe we want to add -O2 as well?
Thoughts?

Peter




Re: [PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-13 Thread Peter Bergner
On 6/13/24 9:26 PM, Kewen.Lin wrote:
> on 2024/6/13 21:24, Peter Bergner wrote:
>> On 6/13/24 12:35 AM, Kewen.Lin wrote:
>>>> @@ -826,7 +826,14 @@ rs6000_stack_info (void)
>>>>  info->ehrd_offset -= info->rop_hash_size;
>>>>}
>>>>else
>>>> -  info->ehrd_offset = info->gp_save_offset - ehrd_size;
>>>> +  {
>>>> +info->ehrd_offset = info->gp_save_offset - ehrd_size;
>>>> +
>>>> +/* Adjust for ROP protection.  */
>>>> +info->rop_hash_save_offset
>>>> +  = info->gp_save_offset - info->rop_hash_size;
>>>> +info->ehrd_offset -= info->rop_hash_size;
>>>> +  }
>>>
>>> I understand this is just copied from the if arm, but if I read this right, 
>>> it can be
>>> simplified as:
>>
>> Ok, I'll retest with that simplification.

So I retested a normal powerpc64le-linux build (ie, we default to Power8
with Altivec) and it bootstrapped and regtested with no regressions.
I then attempted a --with-cpu=power5 build to test the non-altivec path,
but both the unpatched and patched builds died building libgfortran with
the following error: "error: ‘_Float128’ is not supported on this target".
I believe that is related to PR113652.  I'll kick off the build again,
this time disabling Fortran and seeing if the build completes.



>>>> +/* { dg-do assemble } */
>>>> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mno-vsx 
>>>> -mno-altivec -mabi=no-altivec -save-temps" } */
>>>
>>> I'd expect -mabi=no-altivec is default for -mno-altivec, but specifying it 
>>> explicitly
>>> looks fine to me. :)
>>
>> That's what I expected too! :-)  However, I was surprised to learn that 
>> -mno-altivec
>> does *not* disable TARGET_ALTIVEC_ABI.  I had to explicitly use the -mabi= 
>> option to
>> expose the bug.
> 
> oh, it's surprising, I learn something today! :) I guess it's not intentional 
> but just no
> one noticed it, as it seems nonsense to have altivec ABI extension but not 
> using any altivec
> features.

Agreed!

Peter




Re: [PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-13 Thread Peter Bergner
On 6/13/24 12:35 AM, Kewen.Lin wrote:
>> @@ -826,7 +826,14 @@ rs6000_stack_info (void)
>>info->ehrd_offset -= info->rop_hash_size;
>>  }
>>else
>> -info->ehrd_offset = info->gp_save_offset - ehrd_size;
>> +{
>> +  info->ehrd_offset = info->gp_save_offset - ehrd_size;
>> +
>> +  /* Adjust for ROP protection.  */
>> +  info->rop_hash_save_offset
>> += info->gp_save_offset - info->rop_hash_size;
>> +  info->ehrd_offset -= info->rop_hash_size;
>> +}
> 
> I understand this is just copied from the if arm, but if I read this right, 
> it can be
> simplified as:

Ok, I'll retest with that simplification.





>> +/* { dg-do assemble } */
>> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mno-vsx 
>> -mno-altivec -mabi=no-altivec -save-temps" } */
> 
> I'd expect -mabi=no-altivec is default for -mno-altivec, but specifying it 
> explicitly
> looks fine to me. :)

That's what I expected too! :-)  However, I was surprised to learn that 
-mno-altivec
does *not* disable TARGET_ALTIVEC_ABI.  I had to explicitly use the -mabi= 
option to
expose the bug.



Peter



[PING][PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-12 Thread Peter Bergner
Ping.

On 6/7/24 11:06 PM, Peter Bergner wrote:
> We currently only compute the offset for the ROP hash save location in
> the stack frame for Altivec compiles.  For non-Altivec compiles when we
> emit ROP mitigation instructions, we use a default offset of zero which
> corresponds to the backchain save location which will get clobbered on
> any call.  The fix is to compute the ROP hash save location for all
> compiles.
> 
> This passed bootstrap and regtesting on powerpc64le-linux.
> Ok for trunk and backports after some burn-in time?
> 
> Peter
> 
> 
> gcc/
>   PR target/115389
>   * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Compute
>   rop_hash_save_offset for non-Altivec compiles.
> 
> gcc/testsuite/
>   PR target/115389
>   * gcc.target/powerpc/pr115389.c: New test.
> 
> diff --git a/gcc/config/rs6000/rs6000-logue.cc 
> b/gcc/config/rs6000/rs6000-logue.cc
> index d61a25a5126..cfa8a67a5f3 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -826,7 +826,14 @@ rs6000_stack_info (void)
> info->ehrd_offset -= info->rop_hash_size;
>   }
>else
> - info->ehrd_offset = info->gp_save_offset - ehrd_size;
> + {
> +   info->ehrd_offset = info->gp_save_offset - ehrd_size;
> +
> +   /* Adjust for ROP protection.  */
> +   info->rop_hash_save_offset
> + = info->gp_save_offset - info->rop_hash_size;
> +   info->ehrd_offset -= info->rop_hash_size;
> + }
>  
>info->ehcr_offset = info->ehrd_offset - ehcr_size;
>info->cr_save_offset = reg_size; /* first word when 64-bit.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr115389.c 
> b/gcc/testsuite/gcc.target/powerpc/pr115389.c
> new file mode 100644
> index 000..a091ee8a1be
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr115389.c
> @@ -0,0 +1,17 @@
> +/* PR target/115389 */
> +/* { dg-do assemble } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mno-vsx 
> -mno-altivec -mabi=no-altivec -save-temps" } */
> +/* { dg-require-effective-target rop_ok } */
> +
> +/* Verify we do not emit invalid offsets for our ROP insns.  */
> +
> +extern void foo (void);
> +long
> +bar (void)
> +{
> +  foo ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */



[gcc r15-1232] rs6000: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]

2024-06-12 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:ae8103a3a13ac412b9ca33222594cb507ceac9f7

commit r15-1232-gae8103a3a13ac412b9ca33222594cb507ceac9f7
Author: Peter Bergner 
Date:   Wed Jun 12 21:05:34 2024 -0500

rs6000: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]

Jeff's commit r15-831-g05daf617ea22e1 changed the instruction we expected
for this test case into an equivalent instruction.  Modify the test case
so it will accept any of three instructions we could get depending on the
options used.

2024-06-12  Peter Bergner  

gcc/testsuite/
PR testsuite/115262
* gcc.target/powerpc/pr66144-3.c (dg-do): Compile for all targets.
(dg-options): Add -fno-unroll-loops and remove -mvsx.
(scan-assembler): Change from this...
(scan-assembler-times): ...to this.  Tweak regex to accept multiple
allowable instructions.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/pr66144-3.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c 
b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
index 4c93b2a7a3da..14ecb809edc2 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
@@ -1,5 +1,5 @@
-/* { dg-do compile { target { powerpc64*-*-* } } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize" } */
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -ftree-vectorize -fno-unroll-loops" 
} */
 /* { dg-require-effective-target powerpc_vsx } */
 
 /* Verify that we can optimize a vector conditional move, where one of the arms
@@ -20,7 +20,7 @@ test (void)
 a[i] = (b[i] == c[i]) ? -1 : a[i];
 }
 
-/* { dg-final { scan-assembler {\mvcmpequw\M} } } */
-/* { dg-final { scan-assembler {\mxxsel\M}} } */
+/* { dg-final { scan-assembler-times {\mvcmpequw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\m(?:xxsel|xxlor|vor)\M} 1 } } */
 /* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
 /* { dg-final { scan-assembler-not {\mxxlorc\M}   } } */


Re: [PATCH] testsuite: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]

2024-06-12 Thread Peter Bergner
On 6/12/24 3:00 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jun 12, 2024 at 02:49:40PM -0500, Peter Bergner wrote:
>> testsuite: Fix pr66144-3.c test to accept multiple equivalent insns. 
>> [PR115262]
> 
> ("rs6000:", not "testsuite")

Done.



>>  /* { dg-do compile { target { powerpc64*-*-* } } } */
> 
> Probably should be an "lp64" instead?

Actually, there is nothing inherently 64-bit about the test case.
I removed the target test altogether and it executes just fine on
our BE system in both 32-bit and 64-bit modes, so I'll just drop
the target test as part of the patch.



>> -/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize" } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize 
>> -fno-unroll-loops" } */
> 
> The "-mvsx" is superfluous btw (implied by -mcpu=power8).

Ok, I removed -mvsx since I'm touching the line already.


>>  /* { dg-require-effective-target powerpc_vsx } */
> 
> This isn't needed either.

Maybe not strictly needed, but it shields us from users who force
some options to be used via RUNTESTFLAGS env var that can cause the
test case to FAIL.  I'm going to leave this for someone else to
clean up.



>> -/* { dg-final { scan-assembler {\mvcmpequw\M} } } */
>> -/* { dg-final { scan-assembler {\mxxsel\M}} } */
>> +/* { dg-final { scan-assembler-times {\mvcmpequw\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\m(?:xxsel|xxlor|vor)\M} 1 } } */
>>  /* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
>>  /* { dg-final { scan-assembler-not {\mxxlorc\M}   } } */
> 
> Okido, thanks!  The three trivial tweaks I suggest here are okay as
> well if you want, after testing of course ;-)

Thanks.  For completeness, the final patch looks like the following.

Peter



gcc/testsuite/
PR testsuite/115262
* gcc.target/powerpc/pr66144-3.c (dg-do): Compile for all targets.
(dg-options): Add -fno-unroll-loops and remove -mvsx.
(scan-assembler): Change from this...
(scan-assembler-times): ...to this.  Tweak regex to accept multiple
allowable instructions.
---
 gcc/testsuite/gcc.target/powerpc/pr66144-3.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c 
b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
index 4c93b2a7a3d..14ecb809edc 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
@@ -1,5 +1,5 @@
-/* { dg-do compile { target { powerpc64*-*-* } } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize" } */
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -ftree-vectorize -fno-unroll-loops" 
} */
 /* { dg-require-effective-target powerpc_vsx } */
 
 /* Verify that we can optimize a vector conditional move, where one of the arms
@@ -20,7 +20,7 @@ test (void)
 a[i] = (b[i] == c[i]) ? -1 : a[i];
 }
 
-/* { dg-final { scan-assembler {\mvcmpequw\M} } } */
-/* { dg-final { scan-assembler {\mxxsel\M}} } */
+/* { dg-final { scan-assembler-times {\mvcmpequw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\m(?:xxsel|xxlor|vor)\M} 1 } } */
 /* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
 /* { dg-final { scan-assembler-not {\mxxlorc\M}   } } */


[PATCH] testsuite: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]

2024-06-12 Thread Peter Bergner
testsuite: Fix pr66144-3.c test to accept multiple equivalent insns. [PR115262]

Jeff's commit r15-831-g05daf617ea22e1 changed the instruction we expected
for this test case into an equivalent instruction.  Modify the test case
so it will accept any of three equivalent instructions we could get depending
on the options used.

I've verified this test case PASSes on all scenarios where the three possible
instructions are generated.  Ok for trunk?

Peter


2024-06-12  Peter Bergner  

gcc/testsuite/
PR testsuite/115262
* gcc.target/powerpc/pr66144-3.c: Add -fno-unroll-loops to options.
(scan-assembler): Change from this...
(scan-assembler-times): ...to this.  Tweak regex to accept multiple
equivalent instructions.


diff --git a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c 
b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
index 4c93b2a7a3d..dbd746c5489 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr66144-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc64*-*-* } } } */
-/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize" } */
+/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2 -ftree-vectorize 
-fno-unroll-loops" } */
 /* { dg-require-effective-target powerpc_vsx } */
 
 /* Verify that we can optimize a vector conditional move, where one of the arms
@@ -20,7 +20,7 @@ test (void)
 a[i] = (b[i] == c[i]) ? -1 : a[i];
 }
 
-/* { dg-final { scan-assembler {\mvcmpequw\M} } } */
-/* { dg-final { scan-assembler {\mxxsel\M}} } */
+/* { dg-final { scan-assembler-times {\mvcmpequw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\m(?:xxsel|xxlor|vor)\M} 1 } } */
 /* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
 /* { dg-final { scan-assembler-not {\mxxlorc\M}   } } */


[gcc(refs/vendors/ibm/heads/gcc-14-branch)] ibm: Create the ibm/gcc-14-branch

2024-06-12 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:96b284e64a7f1c3bfce4e5434c407799bbfbcd98

commit 96b284e64a7f1c3bfce4e5434c407799bbfbcd98
Author: Peter Bergner 
Date:   Wed Jun 12 11:19:31 2024 -0500

ibm: Create the ibm/gcc-14-branch

2024-06-12  Peter Bergner  

Create ibm/gcc-14-branch which follows the releases/gcc-14 branch.

Diff:
---
 gcc/ChangeLog.ibm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/ChangeLog.ibm b/gcc/ChangeLog.ibm
new file mode 100644
index ..3ebc8710b3b7
--- /dev/null
+++ b/gcc/ChangeLog.ibm
@@ -0,0 +1,3 @@
+2024-06-12  Peter Bergner  
+
+   Create ibm/gcc-14-branch which follows the releases/gcc-14 branch.


[gcc] Created branch 'ibm/heads/gcc-14-branch' in namespace 'refs/vendors'

2024-06-12 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-14-branch' was created in namespace 'refs/vendors' 
pointing to:

 7593dae69ba0... arm: Add .type and .size to __gnu_cmse_nonsecure_call [PR11


[gcc r15-1115] rs6000: Update ELFv2 stack frame comment showing the correct ROP save location

2024-06-08 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:e91cf26a954a5c1bf431e36f3a1e69f94e9fa4fe

commit r15-1115-ge91cf26a954a5c1bf431e36f3a1e69f94e9fa4fe
Author: Peter Bergner 
Date:   Fri Jun 7 16:03:08 2024 -0500

rs6000: Update ELFv2 stack frame comment showing the correct ROP save 
location

The ELFv2 stack frame layout comment in rs6000-logue.cc shows the ROP
hash save slot in the wrong location.  Update the comment to show the
correct ROP hash save location in the frame.

2024-06-07  Peter Bergner  

gcc/
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Update comment.

Diff:
---
 gcc/config/rs6000/rs6000-logue.cc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index bd5d56ba002..d61a25a5126 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -591,21 +591,21 @@ rs6000_savres_strategy (rs6000_stack_t *info,
+---+
| Parameter save area (+padding*) (P)   |  32
+---+
-   | Optional ROP hash slot (R)|  32+P
+   | Alloca space (A)  |  32+P
+---+
-   | Alloca space (A)  |  32+P+R
+   | Local variable space (L)  |  32+P+A
+---+
-   | Local variable space (L)  |  32+P+R+A
+   | Optional ROP hash slot (R)|  32+P+A+L
+---+
-   | Save area for AltiVec registers (W)   |  32+P+R+A+L
+   | Save area for AltiVec registers (W)   |  32+P+A+L+R
+---+
-   | AltiVec alignment padding (Y) |  32+P+R+A+L+W
+   | AltiVec alignment padding (Y) |  32+P+A+L+R+W
+---+
-   | Save area for GP registers (G)|  32+P+R+A+L+W+Y
+   | Save area for GP registers (G)|  32+P+A+L+R+W+Y
+---+
-   | Save area for FP registers (F)|  32+P+R+A+L+W+Y+G
+   | Save area for FP registers (F)|  32+P+A+L+R+W+Y+G
+---+
-   old SP->| back chain to caller's caller |  32+P+R+A+L+W+Y+G+F
+   old SP->| back chain to caller's caller |  32+P+A+L+R+W+Y+G+F
+---+
 
  * If the alloca area is present, the parameter save area is


[PATCH] rs6000: Compute rop_hash_save_offset for non-Altivec compiles [PR115389]

2024-06-07 Thread Peter Bergner
We currently only compute the offset for the ROP hash save location in
the stack frame for Altivec compiles.  For non-Altivec compiles when we
emit ROP mitigation instructions, we use a default offset of zero which
corresponds to the backchain save location which will get clobbered on
any call.  The fix is to compute the ROP hash save location for all
compiles.

This passed bootstrap and regtesting on powerpc64le-linux.
Ok for trunk and backports after some burn-in time?

Peter


gcc/
PR target/115389
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Compute
rop_hash_save_offset for non-Altivec compiles.

gcc/testsuite/
PR target/115389
* gcc.target/powerpc/pr115389.c: New test.

diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index d61a25a5126..cfa8a67a5f3 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -826,7 +826,14 @@ rs6000_stack_info (void)
  info->ehrd_offset -= info->rop_hash_size;
}
   else
-   info->ehrd_offset = info->gp_save_offset - ehrd_size;
+   {
+ info->ehrd_offset = info->gp_save_offset - ehrd_size;
+
+ /* Adjust for ROP protection.  */
+ info->rop_hash_save_offset
+   = info->gp_save_offset - info->rop_hash_size;
+ info->ehrd_offset -= info->rop_hash_size;
+   }
 
   info->ehcr_offset = info->ehrd_offset - ehcr_size;
   info->cr_save_offset = reg_size; /* first word when 64-bit.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr115389.c 
b/gcc/testsuite/gcc.target/powerpc/pr115389.c
new file mode 100644
index 000..a091ee8a1be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr115389.c
@@ -0,0 +1,17 @@
+/* PR target/115389 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mrop-protect -mno-vsx -mno-altivec 
-mabi=no-altivec -save-temps" } */
+/* { dg-require-effective-target rop_ok } */
+
+/* Verify we do not emit invalid offsets for our ROP insns.  */
+
+extern void foo (void);
+long
+bar (void)
+{
+  foo ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mhashst\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mhashchk\M} 1 } } */


[PATCH, obvious] rs6000: Update ELFv2 stack frame comment showing the correct ROP save location

2024-06-07 Thread Peter Bergner
I consider this one obvious, so I plan on pushing this soonish.

Peter


The ELFv2 stack frame layout comment in rs6000-logue.cc shows the ROP
hash save slot in the wrong location.  Update the comment to show the
correct ROP hash save location in the frame.

gcc/
* config/rs6000/rs6000-logue.cc (rs6000_stack_info): Update comment.


diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index bd5d56ba002..0ef309e043b 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -591,21 +591,21 @@ rs6000_savres_strategy (rs6000_stack_t *info,
+---+
| Parameter save area (+padding*) (P)   |  32
+---+
-   | Optional ROP hash slot (R)|  32+P
+   | Alloca space (A)  |  32+P
+---+
-   | Alloca space (A)  |  32+P+R
+   | Local variable space (L)  |  32+P+A
+---+
-   | Local variable space (L)  |  32+P+R+A
+   | Optional ROP hash slot (R)|  32+P+A+L
+---+
-   | Save area for AltiVec registers (W)   |  32+P+R+A+L
+   | Save area for AltiVec registers (W)   |  32+P+A+L+R
+---+
-   | AltiVec alignment padding (Y) |  32+P+R+A+L+W
+   | AltiVec alignment padding (Y) |  32+P+A+L+R+W
+---+
-   | Save area for GP registers (G)|  32+P+R+A+L+W+Y
+   | Save area for GP registers (G)|  32+P+A+L+R+W+Y
+---+
-   | Save area for FP registers (F)|  32+P+R+A+L+W+Y+G
+   | Save area for FP registers (F)|  32+P+A+L+R+W+Y+G
+---+
-   old SP->| back chain to caller's caller |  32+P+R+A+L+W+Y+G+F
+   old SP->| back chain to caller's caller |  32+P+A+L+R+W+Y+G+F
+---+
 
  * If the alloca area is present, the parameter save area is


[gcc(refs/vendors/ibm/heads/gcc-12-branch)] ibm: Merge up to top of releases/gcc-12

2024-05-29 Thread Peter Bergner via Libstdc++-cvs
https://gcc.gnu.org/g:92786addfe0797790a97ddc50f7709a1bf4791a9

commit 92786addfe0797790a97ddc50f7709a1bf4791a9
Merge: 9f2e51a88fb 342f577d8ea
Author: Peter Bergner 
Date:   Wed May 29 14:42:14 2024 -0500

ibm: Merge up to top of releases/gcc-12

2024-05-29  Peter Bergner  

Merge up to releases/gcc-12 342f577d8ea60c3473a6c1e66ef038b96f99f9d2

Diff:

 ChangeLog  |8 +
 configure  |2 +-
 configure.ac   |2 +-
 fixincludes/ChangeLog  |   20 +
 fixincludes/fixincl.x  |  109 +-
 fixincludes/inclhack.def   |   47 +
 fixincludes/tests/base/objc/runtime.h  |   24 +
 fixincludes/tests/base/stdio.h |7 +
 gcc/ChangeLog  |  954 +++
 gcc/ChangeLog.ibm  |4 +
 gcc/DATESTAMP  |2 +-
 gcc/ada/ChangeLog  |   18 +
 gcc/ada/exp_ch4.adb|2 -
 gcc/ada/exp_ch7.adb|   13 +
 gcc/ada/exp_util.adb   |   15 +-
 gcc/ada/sem_res.adb|   14 +-
 gcc/asan.cc|   15 +-
 gcc/c-family/ChangeLog |   16 +
 gcc/c-family/c-common.cc   |7 +-
 gcc/c-family/c-pch.cc  |5 +-
 gcc/cfgexpand.cc   |2 +-
 gcc/cfgrtl.cc  |   24 +-
 gcc/cfgrtl.h   |1 +
 gcc/cgraph.cc  |   10 +-
 gcc/cgraph.h   |   18 +-
 gcc/cgraphunit.cc  |2 +
 gcc/config.in  |   24 +
 gcc/config/aarch64/aarch64-cores.def   |2 +-
 gcc/config/aarch64/aarch64.cc  |   29 +-
 gcc/config/aarch64/aarch64.h   |2 +-
 gcc/config/aarch64/aarch64.md  |   35 +-
 gcc/config/aarch64/iterators.md|3 +
 gcc/config/arm/arm.cc  |   69 ++
 gcc/config/arm/neon.md |4 +-
 gcc/config/avr/avr-mcus.def|   83 +-
 gcc/config/avr/avr.cc  |   10 +
 gcc/config/darwin-protos.h |   11 +
 gcc/config/darwin-sections.def |4 +-
 gcc/config/darwin.cc   |  224 +++-
 gcc/config/darwin.h|   92 +-
 gcc/config/darwin.opt  |4 +
 gcc/config/i386/amxtileintrin.h|4 +-
 gcc/config/i386/darwin.h   |4 +-
 gcc/config/i386/i386-builtin.def   |4 +
 gcc/config/i386/i386-expand.cc |   19 +
 gcc/config/i386/i386-features.cc   |   50 +-
 gcc/config/i386/i386-features.h|1 +
 gcc/config/i386/i386.md|   24 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |   31 +-
 gcc/config/loongarch/gnu-user.h|4 +-
 gcc/config/loongarch/loongarch-opts.cc |   22 +
 gcc/config/loongarch/loongarch-opts.h  |   18 +
 gcc/config/loongarch/loongarch-protos.h|2 +-
 gcc/config/loongarch/loongarch.cc  |   69 +-
 gcc/config/loongarch/loongarch.h   |   22 +-
 gcc/config/loongarch/loongarch.md  |   23 +-
 gcc/config/loongarch/loongarch.opt |   31 +-
 gcc/config/loongarch/sync.md   |   46 +-
 gcc/config/mips/mips-msa.md|   18 +-
 gcc/config/pa/pa.md|6 +-
 gcc/config/riscv/sync.md   |9 +
 gcc/config/rs6000/darwin.h |6 +-
 gcc/config/rs6000/mma.md   |8 +-
 gcc/config/rs6000/predicates.md|2 +-
 gcc/config/rs6000/rs6000-builtin.cc|6 +-
 gcc/config/rs6000/rs6000-c.cc  |   14 +-
 gcc/config/rs6000/rs6000-cpus.def  |5 +-
 gcc/config/rs6000/rs6000.cc|   19 +-
 gcc/config/rs6000/rs6000.h |4 +-
 gcc/config/rs6000/rs6000.md|8 +-
 gcc/config/rs6000/rs6000.opt   |6 +-
 gcc/config/rs6000/vsx.md   |4 +-
 gcc/config/sh/sh.cc|3 +-
 gcc/configure  |  149 ++-
 gcc/configure.ac

[gcc/ibm/heads/gcc-12-branch] (363 commits) ibm: Merge up to top of releases/gcc-12

2024-05-29 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-12-branch' was updated to point to:

 92786addfe0... ibm: Merge up to top of releases/gcc-12

It previously pointed to:

 9f2e51a88fb... ibm: Merge up to top of releases/gcc-12

Diff:

Summary of changes (added commits):
---

  92786ad... ibm: Merge up to top of releases/gcc-12
  342f577... Daily bump. (*)
  da9b7a5... ubsan: Use right address space for MEM_REF created for bool (*)
  e0b2c4f... Fortran: Fix SHAPE for zero-size arrays (*)
  72f6b7e... ipa: Compare jump functions in ICF (PR 113907) (*)
  3bb534d... Daily bump. (*)
  4507501... Daily bump. (*)
  0bd259a... Daily bump. (*)
  e11d3dd... Daily bump. (*)
  ba57a52... c++: __is_constructible ref binding [PR100667] (*)
  6a5dcdb... c++: fix PR111529 backport (*)
  1982783... c++: unroll pragma in templates [PR111529] (*)
  419b5e1... c++: array of PMF [PR113598] (*)
  7076c56... c++: binding reference to comma expr [PR114561] (*)
  a1ff317... Daily bump. (*)
  df19155... Daily bump. (*)
  d9c8940... testsuite: Verify r0-r3 are extended with CMSE (*)
  13ced60... Daily bump. (*)
  113ddbe... Daily bump. (*)
  2f0c2cc... Daily bump. (*)
  1ba6e8b... Daily bump. (*)
  65e5547... middle-end/110176 - wrong zext (bool) <= (int) 4294967295u  (*)
  47e6bff... tree-optimization/111039 - abnormals and bit test merging (*)
  5db4b54... tree-optimization/112281 - loop distribution and zero depen (*)
  dbb5273... tree-optimization/112495 - alias versioning and address spa (*)
  4a71557... tree-optimization/112505 - bit-precision induction vectoriz (*)
  1f41e8e... debug/112718 - reset all type units with -ffat-lto-objects (*)
  9bad5cf... tree-optimization/112793 - SLP of constant/external code-ge (*)
  2d650c0... tree-optimization/114027 - fix testcase (*)
  6661a7c... tree-optimization/114027 - conditional reduction chain (*)
  c1b2185... tree-optimization/114375 - disallow SLP discovery of permut (*)
  a7b1d81... tree-optimization/114231 - use patterns for BB SLP discover (*)
  46b2e98... middle-end/114734 - wrong code with expand_call_mem_ref (*)
  42a0393... lto/114655 - -flto=4 at link time doesn't override -flto=au (*)
  56415e3... gcov-profile/114715 - missing coverage for switch (*)
  b656e65... Daily bump. (*)
  2183e5b... ipa: Self-DCE of uses of removed call LHSs (PR 108007) (*)
  4419198... ipa: Force args obtined through pass-through maps to the ex (*)
  de66146... Daily bump. (*)
  2beef72... Daily bump. (*)
  c5c3a4a... Fix range-ops operator_addr. (*)
  f7db003... Daily bump. (*)
  587596d... Objective-C, NeXT, v2: Correct a regression in code-gen. (*)
  3349a6c... Daily bump. (*)
  ffa41c6... testsuite: Fix up vector-subaccess-1.C test for ia32 [PR892 (*)
  f5c7306... Fix PR 110386: backprop vs ABSU_EXPR (*)
  58d11bf... testsuite: fix Wmismatched-new-delete-8.C with -m32 (*)
  16319f8... warn-access: Fix handling of unnamed types [PR109804] (*)
  39d56b9... Fix PR 111331: wrong code for `a > 28 ? MIN : 29` (*)
  d88fe82... Fold: Fix up merge_truthop_with_opposite_arm for NaNs [PR95 (*)
  0ab30fb... libstdc++: Fix conversion of simd to vector builtin (*)
  79aa696... libstdc++: Silence irrelevant warnings in  (*)
  7abc861... libstdc++: Fix -Wsystem-headers warnings in tests (*)
  c0c1207... libstdc++: Update  synopsis test for C++11 and late (*)
  2d174d4... libstdc++: Fix -Wsystem-headers warnings (*)
  14876f3... libstdc++: Improve doxygen docs for  (*)
  0a9cfae... libstdc++: Improve doxygen docs for some of  (*)
  0d128f5... libstdc++: Improve doxygen docs for algorithms and more (*)
  54de91d... libstdc++: Improve doxygen docs for std::allocator (*)
  e1800b8... libstdc++: Improve doxygen docs for  (*)
  f0db5df... libstdc++: Improve doxygen docs for  (*)
  914a226... libstdc++: Stop defining C++0x compat symbols for versioned (*)
  f8ab9b7... libstdc++: Add macros for the inline namespace std::_V2 (*)
  d9f006d... libstdc++: Disable Doxygen GROUP_NESTED_COMPOUNDS config op (*)
  f3d4e25... libstdc++: Simplify fs::path construction using variable te (*)
  57eb035... libstdc++: Update std::pointer_traits to match new LWG 3545 (*)
  1bb467f... libstdc++: Simplify detection idiom using concepts (*)
  51e9dcc... libstdc++: Improve doxygen docs for std::pointer_traits (*)
  c6f80dc... libstdc++: use grep -E instead of egrep in scripts (*)
  5c156f5... libstdc++: Fix allocator propagation in regex algorithms [P (*)
  e35b26c... libstdc++: Define std::basic_stringbuf::view() for old std: (*)
  0135f93... libstdc++: Add autoconf checks for mkdir, chmod, chdir, and (*)
  a389921... libstdc++: Explicitly default some copy ctors and assignmen (*)
  dc0964f... libstdc++: Add static_assert to std::integer_sequence [PR11 (*)
  15c5170... libstdc++: Remove non-void static assertions in variant's s (*)
  c285c1b... libstdc++: Fix exception thrown by std::shared_lock::unlock (*)
  6f5dcea... libstdc++: Fix conditions for using memcmp in std::lexicogr (*)
  8ec265c... libstdc++: Do not use memmove 

[gcc(refs/vendors/ibm/heads/gcc-13-branch)] ibm: Merge up to top of releases/gcc-13

2024-05-29 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:c3db5f495a1543fb22f725be910dc46249a15e57

commit c3db5f495a1543fb22f725be910dc46249a15e57
Merge: efb4bfb219d ebca6006f44
Author: Peter Bergner 
Date:   Wed May 29 10:48:31 2024 -0500

ibm: Merge up to top of releases/gcc-13

2024-05-29  Peter Bergner  

Merge up to releases/gcc-13 ebca6006f44408b8084868da6613f185b810db74

Diff:

 ChangeLog  |   15 +
 Makefile.in|   30 +
 Makefile.tpl   |   24 +
 c++tools/ChangeLog |4 +
 config/ChangeLog   |4 +
 contrib/ChangeLog  |   13 +
 contrib/dg-extract-results.sh  |   17 +-
 contrib/header-tools/ChangeLog |4 +
 contrib/reghunt/ChangeLog  |4 +
 contrib/regression/ChangeLog   |4 +
 fixincludes/ChangeLog  |4 +
 gcc/BASE-VER   |2 +-
 gcc/ChangeLog  | 1964 ++
 gcc/ChangeLog.ibm  |4 +
 gcc/DATESTAMP  |2 +-
 gcc/ada/ChangeLog  |   50 +
 gcc/ada/exp_attr.adb   |   63 +-
 gcc/ada/exp_ch4.adb|2 -
 gcc/ada/exp_ch7.adb|   13 +
 gcc/ada/exp_util.adb   |   15 +-
 gcc/ada/sem_aggr.adb   |9 +-
 gcc/ada/sem_ch13.adb   |   12 +-
 gcc/ada/sem_res.adb|   14 +-
 gcc/analyzer/ChangeLog |  148 +
 gcc/analyzer/call-summary.cc   |   12 +
 gcc/analyzer/checker-event.cc  |   40 -
 gcc/analyzer/constraint-manager.cc |  131 +
 gcc/analyzer/constraint-manager.h  |1 +
 gcc/analyzer/engine.cc |7 +
 gcc/analyzer/inlining-iterator.h   |   40 +
 gcc/analyzer/kf.cc |   22 +
 gcc/analyzer/region-model-manager.cc   |9 +-
 gcc/analyzer/region-model.cc   |  110 +-
 gcc/analyzer/region.cc |   77 +-
 gcc/analyzer/region.h  |   14 +-
 gcc/analyzer/sm-malloc.cc  |   40 +
 gcc/analyzer/sm-taint.cc   |6 +
 gcc/analyzer/state-purge.cc|9 +
 gcc/analyzer/store.cc  |   11 +-
 gcc/analyzer/store.h   |   10 +-
 gcc/analyzer/supergraph.cc |4 +
 gcc/analyzer/varargs.cc|   38 +-
 gcc/asan.cc|   52 +-
 gcc/attribs.cc |   17 +-
 gcc/bb-reorder.cc  |3 +-
 gcc/bitmap.cc  |2 +-
 gcc/c-family/ChangeLog |   49 +
 gcc/c-family/c-attribs.cc  |   32 +-
 gcc/c-family/c-common.cc   |8 +-
 gcc/c-family/c-lex.cc  |   32 +-
 gcc/c-family/c-pch.cc  |5 +-
 gcc/c/ChangeLog|   14 +
 gcc/c/c-decl.cc|7 +-
 gcc/calls.cc   |7 +-
 gcc/cfgexpand.cc   |   32 +-
 gcc/cfgrtl.cc  |   27 +-
 gcc/cfgrtl.h   |1 +
 gcc/cgraph.cc  |   10 +-
 gcc/cgraph.h   |   15 +-
 gcc/cgraphunit.cc  |2 +
 gcc/combine.cc |   12 +-
 gcc/common.opt |2 +-
 gcc/common/config/avr/avr-common.cc|6 -
 gcc/common/config/i386/i386-common.cc  |2 +-
 gcc/config.gcc |1 +
 gcc/config.in  |   21 +-
 gcc/config/aarch64/aarch64-arches.def  |2 +-
 gcc/config/aarch64/aarch64-builtins.cc |2 +-
 gcc/config/aarch64/aarch64-cores.def   |2 +-
 gcc/config/aarch64/aarch64.cc  |   31 +-
 gcc/config/aarch64/aarch64.md  |   35 +-
 gcc/config/aarch64/iterators.md|3 +
 gcc/config/aarch64/t-aarch64-rtems |   42 +
 gcc/config/alpha/alpha.cc  |3 +-
 gcc/config/arc/arc.cc

[gcc/ibm/heads/gcc-13-branch] (553 commits) ibm: Merge up to top of releases/gcc-13

2024-05-29 Thread Peter Bergner via Gcc-cvs
The branch 'ibm/heads/gcc-13-branch' was updated to point to:

 c3db5f495a1... ibm: Merge up to top of releases/gcc-13

It previously pointed to:

 efb4bfb219d... ibm: Merge up to top of releases/gcc-13

Diff:

Summary of changes (added commits):
---

  c3db5f4... ibm: Merge up to top of releases/gcc-13
  ebca600... Daily bump. (*)
  fd91953... libstdc++: Fix up 19_diagnostics/stacktrace/hash.cc on 13 b (*)
  3185cfe... Fortran: Fix SHAPE for zero-size arrays (*)
  67434fe... libstdc++: Guard use of sized deallocation [PR114940] (*)
  d7f9f23... Daily bump. (*)
  b954f15... Daily bump. (*)
  513d050... Daily bump. (*)
  91c7ec5... Daily bump. (*)
  53cdaa7... c++: unroll pragma in templates [PR111529] (*)
  5f14578... c++: array of PMF [PR113598] (*)
  cf76815... Daily bump. (*)
  6f8933c... Daily bump. (*)
  75d394c... testsuite: Verify r0-r3 are extended with CMSE (*)
  f0b88ec... Fortran: fix issues with class(*) assignment [PR114827] (*)
  2ebf3af... Fortran: fix reallocation on assignment of polymorphic vari (*)
  53bc98f... strlen: Fix up !si->full_string_p handling in count_nonzero (*)
  35ac28b... ubsan: Use right address space for MEM_REF created for bool (*)
  a841964... Daily bump. (*)
  9433e30... libstdc++: testsuite: Enhance codecvt_unicode with tests fo (*)
  bd5e672... libstdc++: Fix handling of surrogate CP in codecvt [PR10897 (*)
  0a9df2c... c++: Fix std dialect hint for std::to_address [PR107800] (*)
  5ed32d0... Fortran: fix dependency checks for inquiry refs [PR115039] (*)
  c827f46... testsuite: Adjust pr113359-2_*.c with unsigned long long [P (*)
  3f6a425... PHIOPT: Don't transform minmax if middle bb contains a phi  (*)
  d6cf49e... match: Disable `(type)zero_one_valuep*CST` for 1bit signed  (*)
  bde5894... Bump BASE-VER. (*)
  b71f1de... Update ChangeLog and version files for release (*)
  a021b58... Daily bump. (*)
  4416023... Daily bump. (*)
  94509b6... Daily bump. (*)
  162c441... [committed] Fix RISC-V missing stack tie (*)
  5b5342e... Daily bump. (*)
  851aa3b... Daily bump. (*)
  1db45e8... ipa: Compare jump functions in ICF (PR 113907) (*)
  10bf53a... ICF: Make ICF and SRA agree on padding (*)
  7dca716... libstdc++: Fix typo in std::stacktrace::max_size [PR115063] (*)
  71e941b... libstdc++: Fix infinite loop in std::binomial_distribution  (*)
  b9e2a32... libstdc++: Adjust expected locale-dependent date formats in (*)
  ebc61a9... libstdc++: Fix typo in Doxygen comment (*)
  bce15a5... libstdc++: Fix run_doxygen for Doxygen 1.10 man page format (*)
  47cac09... c++: build_extra_args recapturing local specs [PR114303] (*)
  12ee04d... Daily bump. (*)
  d3659e2... c++: constexpr union member access folding [PR114709] (*)
  2e353c6... Manually add ChangeLog entries for various commits from 202 (*)
  d629308... rtl-optimization/54052 - RTL SSA PHI insertion compile-time (*)
  6d1801f... Daily bump. (*)
  b7a2697... diagnostics: fix corrupt json/SARIF on stderr [PR114348] (*)
  2a6f99a... Fix ICE in -fdiagnostics-generate-patch [PR112684] (*)
  230f672... diagnostics: fix ICE on sarif output when source file is un (*)
  96f7a36... analyzer: fix ICE and false positive with -Wanalyzer-deref- (*)
  810d35a... analyzer: fix ICE due to type mismatch when replaying call  (*)
  ed02610... analyzer: fix -Wanalyzer-deref-before-check false positive  (*)
  67d104f... analyzer: fix -Wanalyzer-va-arg-type-mismatch false +ve on  (*)
  2c688f6... analyzer: fix skipping of debug stmts [PR113253] (*)
  0593151... analyzer: fix defaults in compound assignments from non-zer (*)
  132eb1a... analyzer: casting all zeroes should give all zeroes [PR1133 (*)
  994477c... analyzer: fix deref-before-check false positives due to inl (*)
  a1cb188... analyzer: fix ICE for 2 bits before the start of base regio (*)
  b8c772c... jit: dump string literal initializers correctly (*)
  44968a0... testsuite, analyzer: add test case [PR108171] (*)
  a0b13d0... analyzer: fix ICE on zero-sized arrays [PR110882] (*)
  0df1ee0... analyzer: fix ICE on division of tainted floating-point val (*)
  60dcb71... jit.exp: handle dwarf version mismatch in jit-check-debug-i (*)
  b38472f... jit: avoid using __vector in testcase [PR110466] (*)
  e0c5290... testsuite: Add more allocation size tests for conjured sval (*)
  ccf8d3e... analyzer: Fix allocation size false positive on conjured sv (*)
  89feb35... analyzer: add caching to globals with initializers [PR11011 (*)
  e30211c... [PR114415][scheduler]: Fixing wrong code generation (*)
  421311a... Fix range-ops operator_addr. (*)
  fefdb9f... Daily bump. (*)
  6f7674a... testsuite: Fix up vector-subaccess-1.C test for ia32 [PR892 (*)
  adba85b... AVR: target/114981 - Support __builtin_powi[l] / __powidf2. (*)
  44d84db... reassoc: Fix up optimize_range_tests_to_bit_test [PR114965] (*)
  cad27df... expansion: Use __trunchfbf2 calls rather than __extendhfbf2 (*)
  d1ec7bc... tree-inline: Remove .ASAN_MARK calls when inlining function (*)

Re: [PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-22 Thread Peter Bergner
On 5/19/24 10:28 PM, HAO CHEN GUI wrote:
> +(define_expand "isinf2"
> +  [(use (match_operand:SI 0 "gpc_reg_operand"))
> +   (use (match_operand:SFDF 1 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT && TARGET_P9_VECTOR"
> +{
> +  emit_insn (gen_xststdcp (operands[0], operands[1], GEN_INT (0x30)));
> +  DONE;
> +})

Is there a reason not to use the vsx_register_operand predicate for op1
which matches the predicate for the operand of the xststdcp pattern
we're passing op1 to?

Ditto for the other optab patches you've submitted.

Peter



Re: [PATCH] rs6000: Don't pass -many to the assembler [PR112868]

2024-05-22 Thread Peter Bergner
On 5/21/24 8:27 AM, jeevitha wrote:
> The following patch has been bootstrapped and regtested with default 
> configuration
> [--enable-checking=yes] and with --enable-checking=release on 
> powerpc64le-linux.
> 
> This patch removes passing the -many assembler option for release builds. Now,
> GCC no longer passes -many under any conditions to the assembler.
> 
> 2024-05-15  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/112868
>   * config/rs6000/rs6000.h (ASM_OPT_ANY): Removed Define.
>   (ASM_CPU_SPEC): Remove ASM_OPT_ANY usage.

You are missing a ChangeLog entry for the target-supports.exp change plus
there is no mention of why it's needed in the git log entry.

Otherwise, the rest LGTM.

Peter




Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-05-03 Thread Peter Bergner
On 4/12/24 3:36 PM, Peter Bergner wrote:
> Testing was clean on both LE and BE, so I pushed the changes.
> I'll let things bake on trunk for a bit before pushing the backports.

The backports all tested clean, so I pushed them.  Fixed everywhere.
Thanks everyone!

Peter



[gcc r11-11413] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-05-02 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:f8f02fd0bfeeb733a044a120b394eeac48de318a

commit r11-11413-gf8f02fd0bfeeb733a044a120b394eeac48de318a
Author: Peter Bergner 
Date:   Thu May 2 18:07:05 2024 -0500

rs6000: Add OPTION_MASK_POWER8 [PR101865]

The bug in PR101865 is the _ARCH_PWR8 predefine macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  
Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add 
OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.

(cherry picked from commit aa57af93ba22865be747f926e4e5f219e7f8758a)

Diff:
---
 gcc/config/rs6000/rs6000-c.c   |   2 +-
 gcc/config/rs6000/rs6000-cpus.def  |   2 +
 gcc/config/rs6000/rs6000.c |   7 +-
 gcc/config/rs6000/rs6000.h |   2 +-
 gcc/config/rs6000/rs6000.md|   8 +-
 gcc/config/rs6000/rs6000.opt   |   4 +
 .../gcc.target/powerpc/predefine-p7-novsx.c|  22 +
 .../powerpc/predefine-p8-noaltivec-novsx.c |  26 ++
 .../gcc.target/powerpc/predefine-p8-noaltivec.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-novsx.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-pragma-vsx.c   | 101 +
 .../gcc.target/powerpc/predefine-p9-novsx.c|  26 ++
 12 files changed, 244 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 1e3117899bb..60cbb1118ec 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -432,7 +432,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 518897df935..6a1acbd3136 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.c if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
@@ -137,6 +138,7 @@
 | OPTION_MASK_MODULO   \
 | OPTION_MASK_MULHW\
 | OPTION_MASK_NO_UPDATE\
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_FUSION\
 | OPTION_MA

[gcc r11-11412] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-05-02 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:26d48b6d3e2d07583f25f0769d0c005864760aee

commit r11-11412-g26d48b6d3e2d07583f25f0769d0c005864760aee
Author: Peter Bergner 
Date:   Tue Apr 9 15:24:39 2024 -0500

rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR 
[PR101865]

This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with 
OPTION_MASK_P8_VECTOR
and delete the now dead mask.

2024-04-09  Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete 
redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_P8_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.

(cherry picked from commit 7924e352523b37155ed9d76dc426701de9d11a22)

Diff:
---
 gcc/config/rs6000/rs6000-c.c  | 14 +-
 gcc/config/rs6000/rs6000-cpus.def |  3 ---
 gcc/config/rs6000/rs6000.c| 14 +++---
 gcc/config/rs6000/rs6000.h|  2 ++
 gcc/config/rs6000/rs6000.opt  |  2 +-
 5 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index afcb5bb6e39..1e3117899bb 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -432,19 +432,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, the OPTION_MASK_DIRECT_MOVE flag is considered to
-have been turned on explicitly.
- Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned off in any of the following conditions:
- 1. TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX is explicitly
-   disabled and OPTION_MASK_DIRECT_MOVE was not explicitly
-   enabled.
- 2. TARGET_VSX is off.  */
-  if ((flags & OPTION_MASK_DIRECT_MOVE) != 0)
+  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 907e1469736..518897df935 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -49,7 +49,6 @@
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_QUAD_MEMORY  \
 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
@@ -94,7 +93,6 @@
 /* Flags that need to be turned off if -mno-power8-vector.  */
 #define OTHER_P8_VECTOR_MASKS  (OTHER_P9_VECTOR_MASKS  \
 | OPTION_MASK_P9_VECTOR\
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_CRYPTO)
 
 /* Flags that need to be turned off if -mno-vsx.  */
@@ -125,7 +123,6 @@
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_DLMZB\
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_FLOAT128_HW  \
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 3e5281c0

[gcc r12-10409] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-05-02 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:04ca18ff5e2592ac88a5b72248332f519a17184b

commit r12-10409-g04ca18ff5e2592ac88a5b72248332f519a17184b
Author: Will Schmidt 
Date:   Fri Apr 12 14:55:16 2024 -0500

rs6000: Add OPTION_MASK_POWER8 [PR101865]

The bug in PR101865 is the _ARCH_PWR8 predefine macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  
Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add 
OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.

(cherry picked from commit aa57af93ba22865be747f926e4e5f219e7f8758a)

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|   2 +-
 gcc/config/rs6000/rs6000-c.cc  |   2 +-
 gcc/config/rs6000/rs6000-cpus.def  |   2 +
 gcc/config/rs6000/rs6000.cc|   7 +-
 gcc/config/rs6000/rs6000.h |   2 +-
 gcc/config/rs6000/rs6000.md|   8 +-
 gcc/config/rs6000/rs6000.opt   |   4 +
 .../gcc.target/powerpc/predefine-p7-novsx.c|  22 +
 .../powerpc/predefine-p8-noaltivec-novsx.c |  26 ++
 .../gcc.target/powerpc/predefine-p8-noaltivec.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-novsx.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-pragma-vsx.c   | 101 +
 .../gcc.target/powerpc/predefine-p9-novsx.c|  26 ++
 13 files changed, 245 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 39a07a27c86..ff5830532d2 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -168,7 +168,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P7_64:
   return TARGET_POPCNTD && TARGET_POWERPC64;
 case ENB_P8:
-  return TARGET_DIRECT_MOVE;
+  return TARGET_POWER8;
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index cc848478a20..77d8de70e7a 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -432,7 +432,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 7dc8679ac9d..a052914b246 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.cc if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| O

[gcc r12-10408] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-05-02 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:135402288a1b1b082d2e71ff2ee5c63b7dafed9f

commit r12-10408-g135402288a1b1b082d2e71ff2ee5c63b7dafed9f
Author: Peter Bergner 
Date:   Tue Apr 9 15:24:39 2024 -0500

rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR 
[PR101865]

This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with 
OPTION_MASK_P8_VECTOR
and delete the now dead mask.

2024-04-09  Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete 
redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_P8_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.

(cherry picked from commit 7924e352523b37155ed9d76dc426701de9d11a22)

Diff:
---
 gcc/config/rs6000/rs6000-c.cc | 14 +-
 gcc/config/rs6000/rs6000-cpus.def |  3 ---
 gcc/config/rs6000/rs6000.cc   | 14 +++---
 gcc/config/rs6000/rs6000.h|  2 ++
 gcc/config/rs6000/rs6000.opt  |  2 +-
 5 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index fa0c93e1841..cc848478a20 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -432,19 +432,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, the OPTION_MASK_DIRECT_MOVE flag is considered to
-have been turned on explicitly.
- Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned off in any of the following conditions:
- 1. TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX is explicitly
-   disabled and OPTION_MASK_DIRECT_MOVE was not explicitly
-   enabled.
- 2. TARGET_VSX is off.  */
-  if ((flags & OPTION_MASK_DIRECT_MOVE) != 0)
+  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 963947f6939..7dc8679ac9d 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -49,7 +49,6 @@
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_QUAD_MEMORY  \
 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
@@ -94,7 +93,6 @@
 /* Flags that need to be turned off if -mno-power8-vector.  */
 #define OTHER_P8_VECTOR_MASKS  (OTHER_P9_VECTOR_MASKS  \
 | OPTION_MASK_P9_VECTOR\
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_CRYPTO)
 
 /* Flags that need to be turned off if -mno-vsx.  */
@@ -125,7 +123,6 @@
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_DLMZB\
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_FLOAT128_HW  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index

[gcc r13-8673] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-05-01 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:0ae9252f7b52151209b36d8a1cefc49f1b23fa46

commit r13-8673-g0ae9252f7b52151209b36d8a1cefc49f1b23fa46
Author: Will Schmidt 
Date:   Fri Apr 12 14:55:16 2024 -0500

rs6000: Add OPTION_MASK_POWER8 [PR101865]

The bug in PR101865 is the _ARCH_PWR8 predefine macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  
Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add 
OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.

(cherry picked from commit aa57af93ba22865be747f926e4e5f219e7f8758a)

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|   2 +-
 gcc/config/rs6000/rs6000-c.cc  |   2 +-
 gcc/config/rs6000/rs6000-cpus.def  |   2 +
 gcc/config/rs6000/rs6000.cc|   7 +-
 gcc/config/rs6000/rs6000.h |   2 +-
 gcc/config/rs6000/rs6000.md|   8 +-
 gcc/config/rs6000/rs6000.opt   |   4 +
 .../gcc.target/powerpc/predefine-p7-novsx.c|  22 +
 .../powerpc/predefine-p8-noaltivec-novsx.c |  26 ++
 .../gcc.target/powerpc/predefine-p8-noaltivec.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-novsx.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-pragma-vsx.c   | 101 +
 .../gcc.target/powerpc/predefine-p9-novsx.c|  26 ++
 13 files changed, 245 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 2b4412e0403..5b17132a101 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -165,7 +165,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P7_64:
   return TARGET_POPCNTD && TARGET_POWERPC64;
 case ENB_P8:
-  return TARGET_DIRECT_MOVE;
+  return TARGET_POWER8;
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 2bedc0fc938..a931efd2409 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,7 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f8d07b9a0d..641ad09a3ba 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.cc if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| O

[gcc r13-8672] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-05-01 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:d42105742841e73ca867b6da0c5ca6ad4d86fed6

commit r13-8672-gd42105742841e73ca867b6da0c5ca6ad4d86fed6
Author: Peter Bergner 
Date:   Tue Apr 9 15:24:39 2024 -0500

rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR 
[PR101865]

This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with 
OPTION_MASK_P8_VECTOR
and delete the now dead mask.

2024-04-09  Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete 
redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_P8_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.

(cherry picked from commit 7924e352523b37155ed9d76dc426701de9d11a22)

Diff:
---
 gcc/config/rs6000/rs6000-c.cc | 14 +-
 gcc/config/rs6000/rs6000-cpus.def |  3 ---
 gcc/config/rs6000/rs6000.cc   | 14 +++---
 gcc/config/rs6000/rs6000.h|  2 ++
 gcc/config/rs6000/rs6000.opt  |  2 +-
 5 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 8555174d36e..2bedc0fc938 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, the OPTION_MASK_DIRECT_MOVE flag is considered to
-have been turned on explicitly.
- Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned off in any of the following conditions:
- 1. TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX is explicitly
-   disabled and OPTION_MASK_DIRECT_MOVE was not explicitly
-   enabled.
- 2. TARGET_VSX is off.  */
-  if ((flags & OPTION_MASK_DIRECT_MOVE) != 0)
+  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..4f8d07b9a0d 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -49,7 +49,6 @@
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_QUAD_MEMORY  \
 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
@@ -93,7 +92,6 @@
 /* Flags that need to be turned off if -mno-power8-vector.  */
 #define OTHER_P8_VECTOR_MASKS  (OTHER_P9_VECTOR_MASKS  \
 | OPTION_MASK_P9_VECTOR\
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_CRYPTO)
 
 /* Flags that need to be turned off if -mno-vsx.  */
@@ -124,7 +122,6 @@
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_DLMZB\
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_FLOAT128_HW  \
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index

Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-12 Thread Peter Bergner
On 4/11/24 11:23 PM, Peter Bergner wrote:
> I'll make the changes above, modulo leaving the option name unchanged until
> we hear from Segher on that and report back on the LE and BE testing.

I made all of the requested changes and went with -mpower8-internal since
Segher was fine with that (offline) along with the helpful Warn message.

Testing was clean on both LE and BE, so I pushed the changes.
I'll let things bake on trunk for a bit before pushing the backports.

Thanks!

Peter



[gcc r14-9949] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-12 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:aa57af93ba22865be747f926e4e5f219e7f8758a

commit r14-9949-gaa57af93ba22865be747f926e4e5f219e7f8758a
Author: Will Schmidt 
Date:   Fri Apr 12 14:55:16 2024 -0500

rs6000: Add OPTION_MASK_POWER8 [PR101865]

The bug in PR101865 is the _ARCH_PWR8 predefine macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  
Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add 
OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000-builtin.cc|   2 +-
 gcc/config/rs6000/rs6000-c.cc  |   2 +-
 gcc/config/rs6000/rs6000-cpus.def  |   2 +
 gcc/config/rs6000/rs6000.cc|   7 +-
 gcc/config/rs6000/rs6000.h |   2 +-
 gcc/config/rs6000/rs6000.md|   8 +-
 gcc/config/rs6000/rs6000.opt   |   4 +
 .../gcc.target/powerpc/predefine-p7-novsx.c|  22 +
 .../powerpc/predefine-p8-noaltivec-novsx.c |  26 ++
 .../gcc.target/powerpc/predefine-p8-noaltivec.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-novsx.c|  26 ++
 .../gcc.target/powerpc/predefine-p8-pragma-vsx.c   | 101 +
 .../gcc.target/powerpc/predefine-p9-novsx.c|  26 ++
 13 files changed, 245 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index e7d6204074c..320affd79e3 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -165,7 +165,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P7_64:
   return TARGET_POPCNTD && TARGET_POWERPC64;
 case ENB_P8:
-  return TARGET_DIRECT_MOVE;
+  return TARGET_POWER8;
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 647f20de7f2..bd493ab87c5 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,7 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 45dd5a85901..6ee678e69c3 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.cc if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| OPTION_MASK_POWER8   \
 

Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-11 Thread Peter Bergner
On 4/11/24 10:31 PM, Kewen.Lin wrote:
>> The passed bootstrap and regtest on powerpc64le-linux.  Ok for trunk?
> 
> Thanks for fixing this.  I guess it should go well on powerpc64-linux too,
> but since it's very late stage4 now, could you also test this on BE machine?

Will do, after making the changes suggested below.



>> +;; This option exists only to create its MASK.  It is not intended for 
>> users.
>> +mdo-not-use-this-option
>> +Target RejectNegative Mask(POWER8) Var(rs6000_isa_flags) WarnRemoved
>> +
> 
> I can understand the given name is to avoid users to use it, but it looks 
> odd, personally
> I'm inclined to mpower8 (or even mpower8-internal) even if it's more likely 
> to be used but
> it's a bit more meaningful (especially we already have mpower10), 
> theoretically speaking
> it's undocumented users shouldn't use it at all.

Sorry, I should have mentioned this, but I originally had it -mpower8, but given
it was an option we don't want users to use, Segher mentioned offline to give it
a name something like the above and not -mpower8.  I kind of like 
-mpower8-internal
now that you mention it, but I'd like Segher's input here whether he prefers
-mdo-not-use-this-option or -mpower8-internal or something else???



> And I think we want explicit "Undocumented" here, and WarnRemoved seems not 
> suitable here
> since it's for some option which worked before but then wasn't supported any 
> longer, but
> this one is new, may be "Warn(Don't use %qs)" instead?

Oops, yes, we want Undocumented here.  Thanks for catching that!
Good idea on the Warn versus WarnRemoved.


>> +/* { dg-options "-mdejagnu-cpu=power8 -mvsx -O2" } */
> 
> Nit: -O2 looks useless and can be dropped?

Ok, I'll drop it.


I'll make the changes above, modulo leaving the option name unchanged until
we hear from Segher on that and report back on the LE and BE testing.
Thanks!

Peter




[PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-11 Thread Peter Bergner
FYI: This patch is an update to Will Schmidt's patches to fix PR101865:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html
  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601823.html

...taking into consideration patch reviews received than.  I also found
a few more locations that needed patching, as well as simplifying the
testsuite test cases by removing the need to scan for the predefined macros.



The bug in PR101865 is the _ARCH_PWR8 predefined macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

The passed bootstrap and regtest on powerpc64le-linux.  Ok for trunk?

This is also broken on the release branches, so ok for backports after
some burn-in time on trunk?

Peter


2024-04-11  Will Schmidt  
Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mdo-not-use-this-option): New.

gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefined-p7-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefined-p8-novsx.c: New test.
* gcc.target/powerpc/predefined-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefined-p9-novsx.c: New test.

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index e7d6204074c..320affd79e3 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -165,7 +165,7 @@ rs6000_builtin_is_supported (enum rs6000_gen_builtins 
fncode)
 case ENB_P7_64:
   return TARGET_POPCNTD && TARGET_POWERPC64;
 case ENB_P8:
-  return TARGET_DIRECT_MOVE;
+  return TARGET_POWER8;
 case ENB_P8V:
   return TARGET_P8_VECTOR;
 case ENB_P9:
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 647f20de7f2..bd493ab87c5 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,7 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+  if ((flags & OPTION_MASK_POWER8) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 45dd5a85901..6ee678e69c3 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -47,6 +47,7 @@
fusion here, instead set it in rs6000.cc if we are tuning for a power8
system.  */
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
@@ -130,6 +131,7 @@
 | OPTION_MASK_MODULO   \
 | OPTION_MASK_MULHW\
 | OPTION_MASK_NO_UPDATE\
+| OPTION_MASK_POWER8   \
 | OPTION_MASK_P8_FUSION\
 | OPTION_MASK_P8_VECTOR\

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-09 Thread Peter Bergner
On 4/9/24 3:19 PM, Peter Bergner wrote:
> Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to
> keep the same behavior for GCC 14 (before removing in stage1), we want just:
> 
> mdirect-move
> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> +Target Undocumented WarnRemoved
> 
> which is what I'll commit and push after one last round of testing.

Testing was clean as expected, so I pushed the commit.  Thanks.

Peter



[gcc r14-9884] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-09 Thread Peter Bergner via Gcc-cvs
https://gcc.gnu.org/g:7924e352523b37155ed9d76dc426701de9d11a22

commit r14-9884-g7924e352523b37155ed9d76dc426701de9d11a22
Author: Peter Bergner 
Date:   Tue Apr 9 15:24:39 2024 -0500

rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR 
[PR101865]

This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with 
OPTION_MASK_P8_VECTOR
and delete the now dead mask.

2024-04-09  Peter Bergner  

gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete 
redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_VSX_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.

Diff:
---
 gcc/config/rs6000/rs6000-c.cc | 14 +-
 gcc/config/rs6000/rs6000-cpus.def |  3 ---
 gcc/config/rs6000/rs6000.cc   | 14 +++---
 gcc/config/rs6000/rs6000.h|  2 ++
 gcc/config/rs6000/rs6000.opt  |  2 +-
 5 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index ce0b14a8d37..647f20de7f2 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, the OPTION_MASK_DIRECT_MOVE flag is considered to
-have been turned on explicitly.
- Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned off in any of the following conditions:
- 1. TARGET_HARD_FLOAT, TARGET_ALTIVEC, or TARGET_VSX is explicitly
-   disabled and OPTION_MASK_DIRECT_MOVE was not explicitly
-   enabled.
- 2. TARGET_VSX is off.  */
-  if ((flags & OPTION_MASK_DIRECT_MOVE) != 0)
+  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 28249600318..45dd5a85901 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -49,7 +49,6 @@
 #define ISA_2_7_MASKS_SERVER   (ISA_2_6_MASKS_SERVER   \
 | OPTION_MASK_P8_VECTOR\
 | OPTION_MASK_CRYPTO   \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_QUAD_MEMORY  \
 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
@@ -90,7 +89,6 @@
 #define OTHER_VSX_VECTOR_MASKS (OPTION_MASK_EFFICIENT_UNALIGNED_VSX\
 | OPTION_MASK_FLOAT128_KEYWORD \
 | OPTION_MASK_P8_VECTOR\
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_P9_VECTOR\
 | OPTION_MASK_FLOAT128_HW  \
@@ -118,7 +116,6 @@
 | OPTION_MASK_CMPB \
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DFP  \
-| OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_DLMZB\
 | OPTION_MASK_EFFICIENT_UNALIGNED_VSX  \
 | OPTION_MASK_FLOAT128_HW  \
diff --git a/gcc/config/rs6000/rs60

Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-09 Thread Peter Bergner
On 4/9/24 12:37 AM, Kewen.Lin wrote:
> Since removing it completely is a stage1 thing, I prefer to keep mdirect-move
> and -mno-direct-move handlings as before, WarnRemoved and Ignore separately.

Ok, current trunk ignores -mno-direct-move and warns on -mdirect-move, so to
keep the same behavior for GCC 14 (before removing in stage1), we want just:

mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented WarnRemoved

which is what I'll commit and push after one last round of testing.
Thanks.

Peter




Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 9:37 PM, Kewen.Lin wrote:
> on 2024/4/8 21:21, Peter Bergner wrote:
> I prefer to remove it completely, that is:
> 
>> -mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
> 
> The reason why you still kept it is to keep a historical record here?

I believe we've never completely removed an option before.  I think the
thought was, if some software package explicitly used the option, then
they shouldn't see an 'unrecognized command-line option' error, but
rather either a warning that the option was removed or just silently
ignore it.  Ie, we don't want to make a package that used to build with
an old compiler now break its build because the option doesn't exist
anymore.



> Segher pointed out to me that this kind of option complete removal should be
> stage 1 stuff, so let's defer to make it in a separated patch next release
> (including some other options like mfpgpr you showed below etc.). :)

If we're going to completely remove it, then for sure, it's a stage1 thing.
I'd like to hear Segher's thoughts on whether we should completely remove
it or just silently ignore it.



> For the original patch,
> 
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
> 
> s/WarnRemoved/Ignore/ to match some other existing practice, there is no
> warning now if specifying -mno-direct-move and it would be good to keep
> the same behavior for users.

If we want to silently ignore -mdirect-move and -mno-direct-move, then we
just need to do:

mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore

There's no need to mention -mno-direct-move at all then.  It was only in the
case I thought we wanted to warn against it's use that I added -mno-direct-move.



>> That said, it's not what we've done with
>> other options, but maybe those just need to be changed too?
> 
> Yes, I think they need to be changed too (next release).

If that's the consensus with Segher, sure, we can plan on that in stage1.

Peter




Re: [PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-08 Thread Peter Bergner
On 4/8/24 3:55 AM, Kewen.Lin wrote:
> on 2024/4/6 06:28, Peter Bergner wrote:
>> +mno-direct-move
>> +Target Undocumented WarnRemoved
>> +
>>  mdirect-move
>> -Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
>> +Target Undocumented WarnRemoved
> 
> When reviewing my previous patch to "neuter option -mpower{8,9}-vector",
> Segher mentioned that we don't need to keep such option warning all the
> time and can drop it like in a release later as users should be aware of
> this information then, I agreed and considering that patch disabling
> -m[no-]direct-move was r8-7845-g57f108f5a1e1b2, I think we can just remove
> m[no-]direct-move here?  What do you think?


I'm fine with that if that is what we want.  So something like the following?

+;; This option existed in the past, but now is always silently ignored.
mdirect-move
-Target Undocumented Mask(DIRECT_MOVE) Var(rs6000_isa_flags) WarnRemoved
+Target Undocumented Ignore


The above seems to silently ignore both -mdirect-move and -mno-direct-move
which I think is what we want.  That said, it's not what we've done with
other options, but maybe those just need to be changed too?


Peter


;; This option existed in the past, but now is always off.
mno-mfpgpr
Target RejectNegative Undocumented Ignore

mmfpgpr
Target RejectNegative Undocumented WarnRemoved

[snip other examples]


Re: [PATCH, rs6000] Split TARGET_POWER8 from TARGET_DIRECT_MOVE [PR101865] (2/2)

2024-04-07 Thread Peter Bergner
I'm picking up Will's patches for this bug.  As an FYI, this is the bug where
_ARCH_PWR8 is conditional on TARGET_DIRECT_MOVE which can be disabled with
-mno-vsx which is bad.

I already posted the cleanup patch that the updated patch for this bug will rely
on, that removed the OPTION_MASK_DIRECT_MOVE because it is fully redundant with
OPTION_MASK_P8_VECTOR.  I've also incorporated some of Ke Wen's review comments
on Will's original patch.  I have a couple of comments on your review though...


On 10/17/22 1:08 PM, Segher Boessenkool wrote:
> On Mon, Sep 19, 2022 at 11:13:20AM -0500, will schmidt wrote:
>> @@ -24046,10 +24045,11 @@ static struct rs6000_opt_mask const 
>> rs6000_opt_masks[] =
>>{ "block-ops-vector-pair",OPTION_MASK_BLOCK_OPS_VECTOR_PAIR,
>>  false, true  },
>>{ "cmpb", OPTION_MASK_CMPB,   false, true  },
>>{ "crypto",   OPTION_MASK_CRYPTO, false, 
>> true  },
>>{ "direct-move",  OPTION_MASK_DIRECT_MOVE,false, true  },
>> +  { "power8",   OPTION_MASK_POWER8, false, 
>> true  },
> 
> Why would we want a #pragma power8 ?

Agreed, we don't want that.  We have target attribute cpu=power8 for that.



>> +mpower8
>> +Target Mask(POWER8) Var(rs6000_isa_flags)
>> +Use instructions added in ISA 2.07 (power8).
> 
> There should not be such an option.  It is set by -mcpu=power8 and
> later, but can never be enabled or disabled direfctly by the user.

So we need an OPTION_MASK_POWER8 to be created for use in rs6000_isa_flags, but
the only way I see that we can do that is to create an option in rs6000.opt.
Did I miss that there is another way?  Otherwise, I was thinking of creating a
dummy option that is WarnRemoved from the start ala:

+;; This option exists only for its MASK.  It is not intended for users.
+mpower8
+Target Mask(POWER8) Var(rs6000_isa_flags) WarnRemoved
+

Is there a better way?  The problem is P8 created lots of new instructions, but
they were basically all vector and htm instructions.  There were no general
GPR or FPR instructions (ie, what we'd think of as base architecture) added,
so there's no other OPTION_MASK_*/TARGET_* we can use as a P8 base architecture
test.

I'll note I tried just a bare "Target Mask(POWER8) Var(rs6000_isa_flags)" with 
no
option name mentioned at all, but that didn't work, as no OPTION_MASK_POWER8 was
created.

Peter




[PATCH] rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]

2024-04-05 Thread Peter Bergner
This is a cleanup patch in preparation to fixing the real bug in PR101865.
TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that.
Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR
and delete the now dead mask.

This passed bootstrap and retesting on powerpc64le-linux with no regressions.
Ok for trunk?

Eventually we'll want to backport this along with the follow-on patch that
actually fixes PR101865.

Peter


gcc/
PR target/101865
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete redundant
OPTION_MASK_DIRECT_MOVE usage.  Delete TARGET_DIRECT_MOVE dead code.
(rs6000_opt_masks): Neuter the "direct-move" option.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace
OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR.  Delete useless
comment.
* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete
OPTION_MASK_DIRECT_MOVE.
(OTHER_VSX_VECTOR_MASKS): Likewise.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.opt (mno-direct-move): New.
(mdirect-move): Remove Mask and Var.


diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 68bc45d65ba..77d045c9f6e 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -471,6 +471,8 @@ extern int rs6000_vector_align[];
 #define TARGET_EXTSWSLI(TARGET_MODULO && TARGET_POWERPC64)
 #define TARGET_MADDLD  TARGET_MODULO
 
+/* TARGET_DIRECT_MOVE is redundant to TARGET_P8_VECTOR, so alias it to that.  
*/
+#define TARGET_DIRECT_MOVE TARGET_P8_VECTOR
 #define TARGET_XSCVDPSPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_XSCVSPDPN   (TARGET_DIRECT_MOVE || TARGET_P8_VECTOR)
 #define TARGET_VADDUQM (TARGET_P8_VECTOR && TARGET_POWERPC64)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 6ba9df4f02e..c241371147c 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3811,7 +3811,7 @@ rs6000_option_override_internal (bool global_init_p)
  Testing for direct_move matches power8 and later.  */
   if (!BYTES_BIG_ENDIAN
   && !(processor_target_table[tune_index].target_enable
-  & OPTION_MASK_DIRECT_MOVE))
+  & OPTION_MASK_P8_VECTOR))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
   /* Add some warnings for VSX.  */
@@ -3853,8 +3853,7 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT
   | OPTION_MASK_ALTIVEC
   | OPTION_MASK_VSX)) != 0)
-rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO
-  | OPTION_MASK_DIRECT_MOVE)
+rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO)
 & ~rs6000_isa_flags_explicit);
 
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
@@ -3939,13 +3938,6 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FPRND;
 }
 
-  if (TARGET_DIRECT_MOVE && !TARGET_VSX)
-{
-  if (rs6000_isa_flags_explicit & OPTION_MASK_DIRECT_MOVE)
-   error ("%qs requires %qs", "-mdirect-move", "-mvsx");
-  rs6000_isa_flags &= ~OPTION_MASK_DIRECT_MOVE;
-}
-
   if (TARGET_P8_VECTOR && !TARGET_ALTIVEC)
 rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
 
@@ -24429,7 +24421,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
false, true  },
   { "cmpb",OPTION_MASK_CMPB,   false, true  },
   { "crypto",  OPTION_MASK_CRYPTO, false, true  },
-  { "direct-move", OPTION_MASK_DIRECT_MOVE,false, true  },
+  { "direct-move", 0,  false, true  },
   { "dlmzb",   OPTION_MASK_DLMZB,  false, true  },
   { "efficient-unaligned-vsx", OPTION_MASK_EFFICIENT_UNALIGNED_VSX,
false, true  },
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index ce0b14a8d37..647f20de7f2 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -429,19 +429,7 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
 rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
-  /* Note that the OPTION_MASK_DIRECT_MOVE flag is automatically
- turned on in the following condition:
- 1. TARGET_P8_VECTOR is enabled and OPTION_MASK_DIRECT_MOVE is not
-explicitly disabled.
-Hereafter, 

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread Peter Bergner
On 4/3/24 10:45 AM, Andrew Pinski wrote:
> On Wed, Apr 3, 2024 at 8:32 AM Peter Bergner  wrote:
> I think you misunderstood the patch/situtation. Most ifunc resolves
> don't use TLS at all; what is happening here is that the profiler
> (-fprofile-generate) is adding TLS usage to the ifunc resolver which
> then causes issues. And the use of TLS causes a PLT call to be inside
> the ifun which causes all the fun stuff.
> 
> This is not about ifunc resolves using TLS directly in code but rather
> indirectly via -fprofile-generate.

Ah, ok.  Thanks to you and H.J. for clarifying.

Peter



Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread Peter Bergner
On 4/3/24 7:40 AM, H.J. Lu wrote:
> We can't profile indirect calls to IFUNC resolvers nor their callees as
> it requires TLS which hasn't been set up yet when the dynamic linker is
> resolving IFUNC symbols.
> 
> Add an IFUNC resolver caller marker to cgraph_node and set it if the
> function is called by an IFUNC resolver.  Disable indirect call profiling
> for IFUNC resolvers and their callees.

The IFUNC resolvers on Power do not use TLS, so isn't this a little too
conservative?  Should this be triggered via a target hook so architectures
that don't use TLS in their IFUNC resolvers could still profile them?

Peter




Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-03-23 Thread Peter Bergner
On 3/23/24 4:33 AM, Ajit Agarwal wrote:
>>> -  else if (align_words < GP_ARG_NUM_REG)
>>> +  else if (align_words < GP_ARG_NUM_REG
>>> +  || (cum->hidden_string_length
>>> +  && cum->actual_parm_length <= GP_ARG_NUM_REG))
>> {
>>   if (TARGET_32BIT && TARGET_POWERPC64)
>> return rs6000_mixed_function_arg (mode, type, align_words);
>>
>>   return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
>> }
>>   else
>> return NULL_RTX;
>>
>> The old code for the unused hidden parameter (which was the 9th param) would
>> fall thru to the "return NULL_RTX;" which would make the callee assume there
>> was a parameter save area allocated.  Now instead, we'll return a reg rtx,
>> probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
>> see a copy of r11 into a pseudo like we do for the other param regs.
>> Is that a problem? Given it's an unused parameter, it'll probably get deleted
>> as dead code, but could it cause any issues?  What if we have more than one
>> unused hidden parameter and we return r12 and r13 which have specific uses
>> in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
>> Have you verified what the callee RTL looks like after expand for these
>> unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
>> which triggers the assumption of a parameter save area, but isn't a reg rtx
>> which might lead to some rtl being generated?  Would a (const_int 0) or
>> something else work?
>>
>>
> For the above use case it will return 
> 
> (reg:DI 5 %r5) and below check entry_parm = 
> (reg:DI 5 %r5) and the following check will not return TRUE and hence
>parameter save area will not be allocated.

Why r5?!?!   The 8th (integer) param would return r10, so I'd assume if
the next param was a hidden param, then it'd get the next gpr, so r11.
How does it jump back to r5 which may have been used by the 3rd param?





> It will not generate any rtx in the callee rtl code but it just used to
> check whether to allocate parameter save area or not when number of args > 8.
> 
> /* If there is no incoming register, we need a stack.  */
>   entry_parm = rs6000_function_arg (args_so_far, arg);
>   if (entry_parm == NULL)
> return true;
> 
>   /* Likewise if we need to pass both in registers and on the stack.  */
>   if (GET_CODE (entry_parm) == PARALLEL
>   && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX)
> return true;

Yes, this code in rs6000_parm_needs_stack() uses the rs6000_function_arg()
return value as a boolean to tell us whether a parameter save area is required
so what we return is unimportant other than to know it's not NULL_RTX.

I'm more concerned about the use of the target hook targetm.calls.function_arg
used in the generic parts of the compiler.  What will that code do differently
now that we return a reg rtx rather than NULL_RTX?  Might that code use
the reg rtx to emit something?  I'd feel better if you could verify what
happens in that code when we return a reg rtx for that 9th hidden param which
isn't really being passed in a register.


Peter




Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-03-22 Thread Peter Bergner
On 3/22/24 5:15 AM, Ajit Agarwal wrote:
> When using FlexiBLAS with OpenBLAS we noticed corruption of
> the parameters passed to OpenBLAS functions. FlexiBLAS
> basically provides a BLAS interface where each function
> is a stub that forwards the arguments to a real BLAS lib,
> like OpenBLAS.
> 
> Fixes the corruption of caller frame checking number of
> arguments is less than equal to GP_ARG_NUM_REG (8)
> excluding hidden unused DECLS.

I think the git log entry commentary could be a little more descriptive
of what the problem is. How about something like the following?

  When using FlexiBLAS with OpenBLAS, we noticed corruption of the caller
  stack frame when calling OpenBLAS functions.  This was caused by the
  FlexiBLAS C/C++ caller and OpenBLAS Fortran callee disagreeing on the
  number of function parameters in the callee due to hidden Fortran
  parameters. This can cause problems when the callee believes the caller
  has allocated a parameter save area when the caller has not done so.
  That means any writes by the callee into the non-existent parameter save
  area will corrupt the caller stack frame.

  The workaround implemented here, is for the callee to determine whether
  the caller has allocated a parameter save area or not, by ignoring any
  unused hidden parameters when counting the number of parameters.



>   PR rtk-optimization/100799

s/rtk/rtl/



>   * config/rs6000/rs6000-calls.cc (rs6000_function_arg): Don't
>   generate parameter save area if number of arguments passed
>   less than equal to GP_ARG_NUM_REG (8) excluding hidden
>   parameter.

The callee doesn't generate or allocate the parameter save area, the
caller does.  The code here is for the callee trying to determine
whether the caller has done so.  How about saying the following instead?

  Don't assume a parameter save area has been allocated if the number of
  formal parameters, excluding unused hidden parameters, is less than or
  equal to GP_ARG_NUM_REG (8).




>   (init_cumulative_args): Check for hidden parameter in fortran
>   routine and set the flag hidden_string_length and actual
>   parameter passed excluding hidden unused DECLS.

Check for unused hidden Fortran parameters and set hidden_string_length
and actual_parm_length.


> +  /* When the buggy C/C++ wrappers call the function with fewer arguments
> + than it actually has and doesn't expect the parameter save area on the
> + caller side because of that while the callee expects it and the callee
> + actually stores something in the parameter save area, it corrupts
> + whatever is in the caller stack frame at that location.  */

The wrapper/caller is the one that allocates the parameter save area, so
saying "...doesn't expect the parameter save area on the caller side..."
doesn't make sense, since it knows whether it allocated it or not.
How about saying something like the following instead?

  Check whether this function contains any unused hidden parameters and
  record how many there are for use in rs6000_function_arg() to determine
  whether its callers have allocated a parameter save area or not.
  See PR100799 for details.



> +  unsigned int num_args = 0;
> +  unsigned int hidden_length = 0;
> +
> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +   arg; arg = DECL_CHAIN (arg))
> +{
> +  num_args++;
> +  if (DECL_HIDDEN_STRING_LENGTH (arg))
> + {
> +   tree parmdef = ssa_default_def (cfun, arg);
> +   if (parmdef == NULL || has_zero_uses (parmdef))
> + {
> +   cum->hidden_string_length = 1;
> +   hidden_length++;
> + }
> + }
> +   }
> +
> +  cum->actual_parm_length = num_args - hidden_length;

This code looks fine, but do we really need two new fields in rs6000_args?
Can't we just get along with only cum->actual_parm_length by modifying
the rs6000_function_arg() change from:

> +  else if (align_words < GP_ARG_NUM_REG
> +|| (cum->hidden_string_length
> +&& cum->actual_parm_length <= GP_ARG_NUM_REG))

to:

+  else if (align_words < GP_ARG_NUM_REG
+  || cum->actual_parm_length <= GP_ARG_NUM_REG)

???

That said, I have a further comment below on what happens here when 
align_words >= GP_ARG_NUM_REG and cum->actual_parm_length <= GP_ARG_NUM_REG.




> + /* When the buggy C/C++ wrappers call the function with fewer arguments
> + than it actually has and doesn't expect the parameter save area on the
> + caller side because of that while the callee expects it and the callee
> + actually stores something in the parameter save area, it corrupts
> + whatever is in the caller stack frame at that location.  */

Same comment as before, so same problem with the comment, but the following
change...

> -  else if (align_words < GP_ARG_NUM_REG)
> +  else if (align_words < GP_ARG_NUM_REG
> +|| (cum->hidden_string_length
> +&& cum->actual_parm_length 

Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-03-18 Thread Peter Bergner
On 3/18/24 9:36 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 23, 2024 at 03:04:13PM +0530, jeevitha wrote:
>> PTImode attribute assists in generating even/odd register pairs on 128 bits.
> 
> It is a mode, not an attribute.  Attributes are on declarations, while
> modes are on a much more fundamental level (every value has a mode, in
> GCC!)
> 
>> When the user specifies PTImode as an attribute, it breaks because there is 
>> no
>> internal type to handle this mode . We have created a tree node with dummy 
>> type
>> to handle PTImode.
> 
> You are talking about the mode attribute.  Aha.

Correct.



>> We are not documenting this dummy type since users are not
>> allowed to use this type externally.
> 
> Not sure what this is meant to mean?  What does "allowed to" mean, even?
> We do not forbid people from doing anything (we can discourage them
> though).  Or dso you mean something just doesn't work?

The use case is the Linux kernel will use the mode attribute as shown in
the test case.  The type she created was needed for the code to "work"
internally, but we don't want to advertise it to users, so that's why we're
not documenting it.  Yes, pesky users who go digging through the GCC source
could find it and use it, but we don't want to encourage them. :-)


>>  PR target/106895
>>  * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
>>  to hold PTImode type.
> 
> An enum does not have fields.  What do you actually mean?

Yeah, as per your follow-on comment, I think a simple "Add RS6000_BTI_INTPTI."
should suffice.


Peter



Re: [PATCH V3] rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di built-in [PR113950]

2024-03-15 Thread Peter Bergner
On 3/6/24 3:27 AM, Kewen.Lin wrote:
> on 2024/3/4 02:55, jeevitha wrote:
>> The following patch has been bootstrapped and regtested on powerpc64le-linux.
>>  
>> When we expand the __builtin_vsx_splat_2di function, we were allowing 
>> immediate
>> value for second operand which causes an unrecognizable insn ICE. Even though
>> the immediate value was forced into a register, it wasn't correctly assigned
>> to the second operand. So corrected the assignment of op1 to operands[1].
[snip]
> As the discussions in the thread of the previous versions, I think
> Segher agreed this approach, so OK for trunk with the above nits
> tweaked, thanks!

The bogus vsx_splat_ code goes all the way back to GCC 8, so we
should backport this fix.  Segher and Ke Wen, can we get an approval
to backport this to all the open release branches (GCC 13, 12, 11)?
Thanks.

Jeevitha, once we get approval, please perform the backports.

Peter




Re: Stepping up as maintainer for ia64

2024-03-08 Thread Peter Bergner via Gcc
On 3/8/24 5:28 PM, Jonathan Wakely wrote:
> On Fri, 8 Mar 2024 at 22:35, Frank Scheiner via Gcc  wrote:
>>
>> On 08.03.24 23:00, Peter Bergner wrote:
>>> On 3/8/24 7:16 AM, Richard Biener via Gcc wrote:
>>>> I CCed Jeff who is on the commitee to forward the maintainer proposal
>>>> though I guess this will not go forward as a first step.  Instead
>>>> you are probably expected to show activity on the port, for example
>>>> post the patch series to make ia64 use LRA, get write access to the
>>>> git repository and then be promoted maintainer.
>>>
>>> One other method for showing activity is posting regular testsuite
>>> results on the gcc-testresults mailing list to show the community
>>> the port is "working".
>>
>> I don't want to spam this or the other list each and every week, but I
> 
> Sending test results to the gcc-testresults list is **not** spamming,
> that's what the list is for!

100% agree!  If you look at what we (IBM) post, we roughly post somewhere
around 7 testsuite results per day due to runs on different hardware,
endianness and OS (Linux versus AIX).  So spam ...err... post away!



> If you're testing uncommon targets (e.g. ia64-linux) then sending test
> results to the list is essential so we know the target builds, because
> nobody else is testing it.

Again, 100% agree!

Peter



Re: [PATCH] fix PowerPC < 7 w/ Altivec not to default to power7

2024-03-08 Thread Peter Bergner via Gcc
On 3/8/24 5:30 AM, Jonathan Wakely via Gcc wrote:
> Patches should be sent to the gcc-patches list instead of this one,
> and should be against trunk not an old gcc-11 RC. See
> https://gcc.gnu.org/contribute.html#patches for more details - thanks!

And you need to CC the rs6000/powerpc port maintainers which you can find
along with their preferred email addresses in the MAINTAINERS file.  If you
don't CC them, they may miss seeing the patch.

Peter




Re: Stepping up as maintainer for ia64

2024-03-08 Thread Peter Bergner via Gcc
On 3/8/24 7:16 AM, Richard Biener via Gcc wrote:
> I CCed Jeff who is on the commitee to forward the maintainer proposal
> though I guess this will not go forward as a first step.  Instead
> you are probably expected to show activity on the port, for example
> post the patch series to make ia64 use LRA, get write access to the
> git repository and then be promoted maintainer.

One other method for showing activity is posting regular testsuite
results on the gcc-testresults mailing list to show the community
the port is "working".

Peter



Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Peter Bergner
On 2/28/24 8:31 AM, Segher Boessenkool wrote:
> On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote:
>> So it seems you're not NAKing the use of splat_input_operand, but
>> just that it needs more explanation in the git log entry, correct?
> 
> I NAK the patch.  _Of course_ there needs to be *something* done, there
> is a bug after all, it needs to be fixed.
> 
> But no, there are big questions about if splat_input_operand is correct
> as well.  This needs to be justified in the patch submission.

Ok, then Jeevitha, repost the patch with the s/op1/operands[1]/ only change.
Jeevitha has already bootstrapped and regtested that change and it does
fix the bug.

Clearly, the splat_input_operand change needs more discussion and would
be a follow-on patch...if we decide to do it at all.

Peter



Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-27 Thread Peter Bergner
On 2/27/24 6:40 AM, Segher Boessenkool wrote:
> On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote:
>> There is no immediate value splatting instruction in Power. Currently, those
>> values need to be stored in a register or memory. To address this issue, I
>> have updated the predicate for the second operand in vsx_splat to
>> splat_input_operand and corrected the assignment of op1 to operands[1].
>> These changes ensure that operand1 is stored in a register.
> 
> input_operand allows a lot of things that splat_input_operand does not,
> not just immediate operands.  NAK.
> 
> (For example, *all* memory is okay for input_operand, always).
> 
> I'm not saying we do not want to restrict these things, but a commit
> that doesn't discuss this at all is not okay.  Sorry.

So it seems you're not NAKing the use of splat_input_operand, but
just that it needs more explanation in the git log entry, correct?

Yes, input_operand accepts a lot more things than splat_input_operand
does, but the multiple define_insns this define_expand feeds, uses
gpc_reg_operand, memory_operand and splat_input_operand for their
operands[1] operand (splat_input_operand accepts reg and mem too),
so it seems to match better what the patterns will be accepting and
I always thought that using predicates that more accurately reflect
what the define_insns expect/accept lead to better code gen.

Mike, was it just an oversight to not use splat_input_operand for the
vsx_splat_ expander or was input_operand a conscious decision?

If input_operand was used purposely, then we can just fall back to
the s/op1/operands[1]/ change which we already know fixes the bug.


Peter




Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-26 Thread Peter Bergner
On 2/26/24 7:55 PM, Kewen.Lin wrote:
> on 2024/2/26 23:07, Peter Bergner wrote:
>> so I think we should use both Jeevitha's predicate change and
>> your operands[1] change.
> 
> Since either the original predicate change or operands[1] change
> can fix this issue, I think it's implied that only either of them
> is enough, so we can remove "else if (!REG_P (op1))" arm (or even
> replaced with one else arm to assert REG_P (op1))?

splat_input_operand allows, mem, reg and subreg, so I don't think
we can just assert on REG_P (op1), since op1 could be a subreg.
I do agree we can remove the "if (!REG_P (op1))" test on the else
branch, since force_reg() has an early exit for regs, so a simple:

  ...
  else
operands[1] = force_reg (mode, op1);

..should work.




> Good point, or maybe just an explicit -mvsx like some existing ones, which
> can avoid to only test some fixed cpu type.

If a simple "-O1 -vsx" is enough to expose the ICE on an unpacthed
compiler and a PASS on a patched compiler, then I'm all for it.
Jeevitha, can you try confirming that?


Peter




Re: [PATCH] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-26 Thread Peter Bergner
On 2/26/24 4:49 AM, Kewen.Lin wrote:
> on 2024/2/26 14:18, jeevitha wrote:
>> Hi All,
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index 6111cc90eb7..e5688ff972a 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -4660,7 +4660,7 @@
>>  (define_expand "vsx_splat_"
>>[(set (match_operand:VSX_D 0 "vsx_register_operand")
>>  (vec_duplicate:VSX_D
>> - (match_operand: 1 "input_operand")))]
>> + (match_operand: 1 "splat_input_operand")))]
>>"VECTOR_MEM_VSX_P (mode)"
>>  {
>>rtx op1 = operands[1];
> 
> This hunk actually does force_reg already:
> 
> ...
>   else if (!REG_P (op1))
> op1 = force_reg (mode, op1);
> 
> but it's assigning to op1 unexpectedly (an omission IMHO), so just
> simply fix it with:
> 
>   else if (!REG_P (op1))
> -op1 = force_reg (mode, op1);
> +operands[1] = force_reg (mode, op1);

I agree op1 was an oversight and it should be operands[1].
That said, I think using more precise predicates is a good thing,
so I think we should use both Jeevitha's predicate change and
your operands[1] change.

I'll note that Jeevitha originally had the operands[1] change, but I
didn't look closely enough at the issue or the pattern and mentioned
that these kinds of bugs can be caused by too loose constraints and
predicates, which is when she found the updated predicate to use.
I believe she already even bootstrapped and regtested the operands[1]
only change.  Jeevitha???




>> +/* PR target/113950 */
>> +/* { dg-do compile } */
> 
> We need an effective target to ensure vsx support, for now it's 
> powerpc_vsx_ok.
> ie: /* { dg-require-effective-target powerpc_vsx_ok } */

Agreed.


>> +/* { dg-options "-O1" } */

I think we should also use a -mcpu=XXX option to ensure VSX is enabled
when compiling these VSX built-in functions.  I'm fine using any CPU
(power7 or later) where the ICE exists with an unpatched compiler.
Otherwise, testing will be limited to our server systems that have
VSX enabled by default.


Peter





Re: [PATCH] rs6000: Neuter option -mpower{8,9}-vector [PR109987]

2024-02-20 Thread Peter Bergner
On 2/20/24 3:27 AM, Kewen.Lin wrote:
> on 2024/2/20 02:45, Segher Boessenkool wrote:
>> On Tue, Jan 16, 2024 at 10:50:01AM +0800, Kewen.Lin wrote:
>>> it consists of some aspects:
>>>   - effective target powerpc_p{8,9}vector_ok are removed
>>> and replaced with powerpc_vsx_ok.
>>
>> So all such testcases already arrange to have p8 or p9 some other way?

Shouldn't that be replaced with powerpc_vsx instead of powerpc_vsx_ok?
That way we know VSX code gen is enabled for the options being used,
even those in RUNTESTFLAGS.

I thought we agreed that powerpc_vsx_ok was almost always useless and
we always want to use powerpc_vsx.  ...or did I miss that we removed
the old powerpc_vsx_ok and renamed powerpc_vsx to powerpc_vsx_ok?



>>>   - Some test cases are updated with explicit -mvsx.
>>>   - Some test cases with those two option mixed are adjusted
>>> to keep the test points, like -mpower8-vector
>>> -mno-power9-vector are updated with -mdejagnu-cpu=power8
>>> -mvsx etc.
>>
>> -mcpu=power8 implies -mvsx already.

Then we can omit the explicit -msx option, correct?  Ie, if the
user forces -mno-vsx in RUNTESTFLAGS, then we'll just skip the
test case as UNSUPPORTED rather than trying to compile some
vsx test case with vsx disabled via the options.



Peter


Re: [PATCH] rs6000: Update instruction counts due to combine changes [PR112103]

2024-02-20 Thread Peter Bergner
On 2/20/24 3:29 AM, Kewen.Lin wrote:
> on 2024/2/20 06:35, Peter Bergner wrote:
>> rs6000: Update instruction counts due to combine changes [PR112103]
>>
>> The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c.
>> Adjust expected instruction counts accordingly.
>>
>> This passed on both powerpc64le-linux and powerpc64-linux running the
>> testsuite in both 32-bit and 64-bit modes.  Ok for trunk?
> 
> OK for trunk, thanks for fixing!

Ok, pushed.  Thanks.


>> FYI, I will open a new bug to track the removing of the superfluous
>> insns detected in PR112103.
> 
> Hope this test case will become not fragile any more once this filed
> issue gets fixed. :)

I think this will become less fragile after we fix PR114004 which is
the bug I opened to track fixing the superfluous insn that was emitted
that we found in this bug.  The fragility was due to the superfluous
insn being different before and after Roger's patch.  Once we don't
emit it anymore, this test case should be less fragile.

Peter



[PATCH] rs6000: Update instruction counts due to combine changes [PR112103]

2024-02-19 Thread Peter Bergner
rs6000: Update instruction counts due to combine changes [PR112103]

The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c.
Adjust expected instruction counts accordingly.

This passed on both powerpc64le-linux and powerpc64-linux running the
testsuite in both 32-bit and 64-bit modes.  Ok for trunk?

FYI, I will open a new bug to track the removing of the superfluous
insns detected in PR112103.


Peter


gcc/testsuite/
PR target/112103
* gcc.target/powerpc/rlwinm-0.c: Adjust expected instruction counts.

diff --git a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c 
b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
index 4f4fca2d8ef..a10d9174306 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwinm-0.c
@@ -4,10 +4,10 @@
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 6739 { target ilp32 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9716 { target lp64 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 3375 } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3081 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 3090 { target lp64 } } } 
*/
 
 /* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3197 { target ilp32 } } 
} */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3093 { target lp64 } } } 
*/
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwinm} 3084 { target lp64 } } } 
*/
 /* { dg-final { scan-assembler-times {(?n)^\s+rotlwi} 154 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srwi} 13 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+srdi} 13 { target lp64 } } } */


Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-24 Thread Peter Bergner
On 1/24/24 12:04 AM, Kewen.Lin wrote:
> on 2024/1/24 11:11, Peter Bergner wrote:
>> But not with this.  The -mdejagnu-cpu=power10 option already enables -mvsx.
>> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
>> The options set in RUNTESTFLAGS come after the options in the dg-options
>> line, so even adding -mvsx like the above won't help the test case PASS
> 
> But this is NOT true, at least on one of internal Power10 machine 
> (ltcden2-lp1).
> 
> With the command below:
>   
>   make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx 
> powerpc.exp=pr112886.c"
> 
> this test case fails without the explicit -mvsx in dg-options.
> 
> From the verbose dumping, the compilation command looks like:
> 
> /home/linkw/gcc/build/gcc-test-debug/gcc/xgcc 
> -B/home/linkw/gcc/build/gcc-test-debug/gcc/
> /home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c  
> -mno-vsx 
> -fdiagnostics-plain-output  -mdejagnu-cpu=power10 -O2 -ffat-lto-objects 
> -fno-ident -S
> -o pr112886.s
> 
> "-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**.
> 
> I guess it might be due to different behaviors of different versions of 
> runtest framework?

That is confusing, unless as you say, the behavior changed.  The whole reason 
we added
-mdejagnu-cpu= (and the dg-skip usage before that) was due to encountering 
problems
when the test case wanted a specific -mcpu= value and the user overrode it in 
their
RUNTESTFLAGS and that can only happen when its options come last on the command 
line.

Then again, why didn't the powerpc_vsx_ok test not save us here?



> So there can be two cases with user explicitly specified -mno-vsx:
> 
> 1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in 
> powerpc_vsx_ok)
> 
>   powerpc_vsx_ok test failed, so UNSUPPORTED
> 
>   // with explicit -mvsx does nothing as you said.
> 
> 2) RUNTESTFLAGS comes before dg-options
> 
>   powerpc_vsx_ok test succeeds, but FAIL.
>   
>  // with suggested -mvsx, make it match the powerpc_vsx_ok checking and the 
> case not fail.
> 
> As above I think we still need to append the "-mvsx" explicitly.  As 
> tested/verified, it
> does help the case not to fail on ltcden2-lp1.

I'd like to verify that the behavior did change before we enforce adding that 
option.
The problem is, there are a HUGE number of older test cases that would need 
updating
to "fix" them too.  ...and not just adding -mnsx, but -maltivec and basically 
any
-mfoo option where the test case is expecting the feature foo to be used/tested.
It would be a huge mess.

Peter



Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-23 Thread Peter Bergner
On 1/23/24 8:30 PM, Kewen.Lin wrote:
>> -output_operand_lossage ("invalid %%x value");
>> +output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 
>> 'x'));
> 
> Nit: Seems simpler with
>   
>   output_operand_lossage ("invalid %%%c value", (char) code);

Agreed, good catch.




>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> new file mode 100644
>> index 000..4e59dcda6ea
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target power10_ok } */
> 
> I think this needs one more:
> 
> /* { dg-require-effective-target powerpc_vsx_ok } */

I agree with this...



>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> 
> ... and
> 
> /* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */
> 
> , otherwise with explicit -mno-vsx, this test case would fail.

But not with this.  The -mdejagnu-cpu=power10 option already enables -mvsx.
If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
The options set in RUNTESTFLAGS come after the options in the dg-options
line, so even adding -mvsx like the above won't help the test case PASS
if we didn't have the powerpc_vsx_ok test.  In other words, the -mvsx option
doesn't help with anything.

All we need is the new powerpc_vsx_ok check and that will guard against the FAIL
in the case the user forces -mno-vsx.  In that case, we'll just get an 
UNSUPPORTED
and that is fine.

Peter





Re: [PATCH, V2] PR target/112886, Add %S to print_operand for vector pair support.

2024-01-19 Thread Peter Bergner
On 1/11/24 11:29 AM, Michael Meissner wrote:
> This is version 2 of the patch.  The only difference is I made the test case
> simpler to read.
[snip]
> gcc/
> 
>   PR target/112886
>   * config/rs6000/rs6000.cc (print_operand): Add %S output modifier.
>   * doc/md.texi (Modifiers): Mention %S can be used like %x.
> 
> gcc/testsuite/
> 
>   PR target/112886
>   * /gcc.target/powerpc/pr112886.c: New test.

This resolves my issue with the first patch, so LGTM.

Peter




Re: [PATCH] PR target/112886, Add %S to print_operand for vector pair support

2024-01-09 Thread Peter Bergner
On 1/5/24 4:18 PM, Michael Meissner wrote:
> @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
>   print_operand (file, x, 0);
>return;
>  
> +case 'S':
>  case 'x':
> -  /* X is a FPR or Altivec register used in a VSX context.  */
> +  /* X is a FPR or Altivec register used in a VSX context.  %x prints
> +  the VSX register number, %S prints the 2nd register number for
> +  vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> +  values.  */
>if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> - output_operand_lossage ("invalid %%x value");
> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 
> 'x'));
>else
>   {
> -   int reg = REGNO (x);
> +   int reg = REGNO (x) + (code == 'S' ? 1 : 0);
> int vsx_reg = (FP_REGNO_P (reg)
>? reg - 32
>: reg - FIRST_ALTIVEC_REGNO + 32);

The above looks good to me.  However:


> +: "=v" (*p)
> +: "v" (*q), "v" (*r));

These really should use "wa" rather than "v", since these are
VSX instructions... or did you use those to ensure you got
Altivec registers numbers assigned?



> +/* { dg-final { scan-assembler-times {\mxvadddp 
> (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M}
>  2 } } */

...and this is really ugly and hard to read/understand.  Can't we use
register variables to make it simpler?  Something like the following
which tests having both FPR and Altivec reg numbers assigned?

...
void
test (__vector_pair *ptr)
{
  register __vector_pair p asm ("vs10");
  register __vector_pair q asm ("vs42");
  register __vector_pair r asm ("vs44");
  q = ptr[1];
  r = ptr[2];
  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
   : "=wa" (p)
   : "wa" (q), "wa" (r));
  ptr[2] = p;
}

/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
...

Peter



Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-12-14 Thread Peter Bergner
On 12/14/23 9:57 PM, Peter Bergner wrote:
> On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote:
>> +  /* For non PC-relative code, GPR2 is unavailable for register allocation. 
>>  */
>> +  if (FIXED_R2 && !rs6000_pcrel_p ())
>> +fixed_regs[2] = 1;
[snip]
> On a related note, Jeevitha's patch above allows using r2 for normal register
> allocation if r2 is not fixed and pcrel is enabled.  Given pcrel with this 
> patch
> enables pcrel on ELFv1, that means this patch can also enable using r2 for 
> normal
> register allocation on ELFv1.

Nevermind, I'm daft and r2 usage is not allowed on ELFv1.  The rs6000_pcrel_p()
call above is always false for non ELFv2 compiles, so we'll mark r2 as fixed
for ELFv1.  Move along, nothing to see. :-)

That said, I think we need a "dg-require-effective-target powerpc_elfv2" for
the first test case where we're checking that we do use r2 for normal RA.
That'll only be true on ELFv2 compiles, hence the need for the extra target
requirement.  I've asked Jeevitha to add that to the pr111045-1.c test
case and verify it fixes the failure of that test case on her BE run.

Peter







Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-12-14 Thread Peter Bergner
On 7/16/23 10:40 PM, P Jeevitha via Gcc-patches wrote:
> Normally, GPR2 is the TOC pointer and is defined as a fixed and non-volatile
> register. However, it can be used as volatile for PCREL addressing. Therefore,
> modified r2 to be non-fixed in FIXED_REGISTERS and set it to fixed if it is 
> not
> PCREL and also when the user explicitly requests TOC or fixed. If the register
> r2 is fixed, it is made as non-volatile. Changes in register preservation 
> roles
> can be accomplished with the help of available target hooks
> (TARGET_CONDITIONAL_REGISTER_USAGE).
> 
> 2023-07-12  Jeevitha Palanisamy  
> 
> gcc/
>   PR target/PR110320
>   * config/rs6000/rs6000.cc (rs6000_conditional_register_usage): Change
>   GPR2 to volatile and non-fixed register for PCREL.
> 
> gcc/testsuite/
>   PR target/PR110320
>   * gcc.target/powerpc/pr110320-1.c: New testcase.
>   * gcc.target/powerpc/pr110320-2.c: New testcase.
>   * gcc.target/powerpc/pr110320-3.c: New testcase.
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 44b448d2ba6..9aa04ec5d57 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10193,9 +10193,13 @@ rs6000_conditional_register_usage (void)
>  for (i = 32; i < 64; i++)
>fixed_regs[i] = call_used_regs[i] = 1;
>  
> +  /* For non PC-relative code, GPR2 is unavailable for register allocation.  
> */
> +  if (FIXED_R2 && !rs6000_pcrel_p ())
> +fixed_regs[2] = 1;
> +
>/* The TOC register is not killed across calls in a way that is
>   visible to the compiler.  */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (fixed_regs[2] && (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2))
>  call_used_regs[2] = 0;

Segher and Ke Wen,

As we discussed on my PR111045 patch that disabled PCREL on everything other
than ELFv2 and Segher said, well, it should work on ELFv1, modulo fixing bugs.
Segher said we should attempt to fix those bugs before we ship the next release
and if we miss that, we can push my PR111045 patch then to disable it.

On a related note, Jeevitha's patch above allows using r2 for normal register
allocation if r2 is not fixed and pcrel is enabled.  Given pcrel with this patch
enables pcrel on ELFv1, that means this patch can also enable using r2 for 
normal
register allocation on ELFv1.  Is that safe?  Should we add a check above where
we set fixed_regs[2] = 1, to also check for whether this is not an ELFv2 
compile?
...or Segher, should we leave this as is and add it to the things to check for
non-ELFv2 compiles before the next release and possible disable it then if we
know/aren't sure whether it legal?

So I guessing I'm wondering, should Jeevitha push the above approved patch as
is, or should we modify it so r2 is only available for RA on ELFv2 and pcrel?

Peter




Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-12-13 Thread Peter Bergner
On 11/24/23 3:28 AM, Kewen.Lin wrote:
>> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
> 
> Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16?
> I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more,
> we have to add one explicit .to_constant () here.  So I prefer this
> to use 16 directly, maybe one comment above to indicate what's for
> the value 16.

I normally don't like hard coding constants in the code, so used
GET_MODE_SIZE (V16QImode) as the number of bytes of a vector register,
but if that's going to cause an issue in the future, I'm fine using 16.
Changed.



>> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
> 
> Likewise.

Changed.




>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 5f56c3ed85b..f2efa46c147 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, 
>> machine_mode mode)
>>  static bool
>>  rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>>  {
>> -  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
>> -  || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
>> -return mode1 == mode2;
>> +   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
>> +   || mode2 == PTImode || mode2 == XOmode)
>> + return mode1 == mode2;
>> + 
>> +  if (mode2 == OOmode)
>> +return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);
> 
> I vaguely remembered that Segher mentioned it's unexpected for opaque
> modes to have tieable modes excepting for themselves, but if this is the
> only way to get rid of those extra moves, I guess we can special-case
> them here.  Looking forward to Segher's comments on this part.

To be honest, my original patch didn't have this.  I think it was Mike who
said we want or need this.  Mike, why do we want/need this again?

That said, the documentation for TARGET_MODES_TIEABLE_P says:

  This hook returns true if a value of mode mode1 is accessible in
  mode mode2 without copying.

Given OOmode (ie, __vector_pair) under the covers is two consecutive
vector registers, and we use them/initialize them with two vectors,
then mode1 being a vector mode could be accesible from an OOmode mode2
without copying, meaning we could access it directly from the registers
holding mode2.

Segher, your input to the above an the subreg portion of the patch in general?

Peter





Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Peter Bergner
On 12/13/23 2:05 AM, Jakub Jelinek wrote:
> On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote:
>> On Tue, 12 Dec 2023, Peter Bergner wrote:
>>
>>> On 12/12/23 8:36 PM, Jason Merrill wrote:
>>>> This test is failing for me below C++17, I think you need
>>>>
>>>> // { dg-do compile { target c++17 } }
>>>> or
>>>> // { dg-require-effective-target c++17 }
>>>
>>> Sorry about that.  Should we do the above or should we just add
>>> -std=c++17 to dg-options?  ...or do we need to do both?
>>
>> Just do the above, the C++ testsuite iterates over all standards,
>> adding -std=c++17 would just run that 5 times.  But the above
>> properly skips unsupported cases.
> 
> I believe if one uses explicit -std=gnu++17 or -std=c++17 in dg-options
> then it will not iterate:
> # If the testcase specifies a standard, use that one.
> # If not, run it under several standards, allowing GNU extensions
> # if there's a dg-options line.
> if ![search_for $test "-std=*++"] {
> and otherwise how many times exactly it iterates depends on what the user
> asked for or what effective target is there (normally the default is
> to iterate 4 times (98,14,17,20), one can use e.g.
> GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 to iterate 7 times, or the
> default also changes if c++23, c++26 or c++11_only effective targets
> are present somewhere in the test.
> 
> But sure, if the test is valid in C++17, 20, 23, 26, then
> // { dg-do compile { target c++17 } }
> is best (unless the test is mostly language version independent and
> very expensive to compile or run).

I confirmed the test case builds with C++17, 20, 23, 26 and errors out
with C++11, so I went with your solution.  Thanks for the input and
sorry for the breakage.  Pushed.

Peter


testsuite: Add dg-do compile target c++17 directive for testcase [PR112822]

Add dg-do compile target directive that limits the test case to being built
on c++17 compiles or greater.

2023-12-13  Peter Bergner  

gcc/testsuite/
PR tree-optimization/112822
* g++.dg/pr112822.C: Add dg-do compile target c++17 directive.

diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C
index d1490405493..a8557522467 100644
--- a/gcc/testsuite/g++.dg/pr112822.C
+++ b/gcc/testsuite/g++.dg/pr112822.C
@@ -1,4 +1,5 @@
 /* PR tree-optimization/112822 */
+/* { dg-do compile { target c++17 } } */
 /* { dg-options "-w -O2" } */
 
 /* Verify we do not ICE on the following noisy creduced test case.  */




Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 8:36 PM, Jason Merrill wrote:
> This test is failing for me below C++17, I think you need
> 
> // { dg-do compile { target c++17 } }
> or
> // { dg-require-effective-target c++17 }

Sorry about that.  Should we do the above or should we just add
-std=c++17 to dg-options?  ...or do we need to do both?

Peter





Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 1:26 PM, Richard Biener wrote:
>> Am 12.12.2023 um 19:51 schrieb Peter Bergner :
>>
>> On 12/12/23 12:45 PM, Peter Bergner wrote:
>>> +/* PR target/112822 */
>>
>> Oops, this should be:
>>
>> /* PR tree-optimization/112822 */
>>
>> It's fixed on my end.
> 
> Ok

Pushed now that Martin has pushed his fix.  Thanks!

Peter




Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 12:45 PM, Peter Bergner wrote:
> +/* PR target/112822 */

Oops, this should be:

/* PR tree-optimization/112822 */

It's fixed on my end.

Peter






Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-12 Thread Peter Bergner
On 12/12/23 10:50 AM, Martin Jambor wrote:
> The testcase has reasonable size but it is specific to ppc64le and its
> altivec vectors.  My plan is to ask the bug reporter to massage it into
> a target specific testcase in bugzilla.  Alternatively I can try to
> craft a testcase from scratch but that will take time.

I rewrote the Altivec specific part of the testcase to use generic C code
and it still ICEs for me on ppc64le using an unpatched compiler.  Therefore,
I think we can just add the updated testcase to the generic g++ tests. 

I'll note I was wrong in the bugzilla comments, -O3 -mcpu=power10 is not
required to hit the ICE.  A simple -O2 on ppc64le is enough to hit the ICE.

Ok for trunk?

Peter


testsuite: Add testcase for already fixed PR [PR112822]

gcc/testsuite/
PR tree-optimization/112822
* g++.dg/pr112822.C: New test.

diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822.C
new file mode 100644
index 000..3921d5c1bbe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr112822.C
@@ -0,0 +1,369 @@
+/* PR target/112822 */
+/* { dg-options "-w -O2" } */
+
+/* Verify we do not ICE on the following noisy creduced test case.  */
+
+namespace b {
+typedef int c;
+template  struct d;
+template  struct d { using f = e; };
+template  struct aa;
+template  struct aa { using f = h; };
+template  using ab = typename d::f;
+template  using n = typename aa::f;
+template  class af {
+public:
+  typedef __complex__ ah;
+  template  af operator+=(e) {
+ah o;
+x = o;
+return *this;
+  }
+  ah x;
+};
+} // namespace b
+namespace {
+enum { p };
+enum { ac, ad };
+struct ae;
+struct al;
+struct ag;
+typedef b::c an;
+namespace ai {
+template  struct ak { typedef aj f; };
+template  using ar = typename ak::f;
+template  struct am {
+  enum { at };
+};
+template  struct ao {
+  enum { at };
+};
+template  struct ap;
+template  struct aq {
+  enum { at };
+};
+} // namespace ai
+template  struct ay;
+template  class as;
+template  class ba;
+template  class aw;
+template  class be;
+template  class az;
+namespace ai {
+template  struct bg;
+template ::bd>
+struct bk;
+template  struct bf;
+template  struct bm;
+template  struct bh;
+template ::bj>::at> struct bp {
+  typedef bi f;
+};
+template  struct br {
+  typedef typename bp::f>::f f;
+};
+template  struct bn;
+template  struct bn {
+  typedef aw f;
+};
+template  struct bx {
+  typedef typename bn::bs, aj ::bo>::f f;
+};
+template  struct bt { typedef b::n<0, aj, aj> f; };
+template ::f> struct cb {
+  enum { bw };
+  typedef b::n::f> f;
+};
+template ::bs> struct by {
+  typedef be f;
+};
+template  struct bz {
+  typedef typename by::f f;
+};
+template  struct ch;
+template  struct ch { typedef ci bd; };
+} // namespace ai
+template > struct cg;
+template  struct cg { typedef aj cn; };
+namespace ai {
+template  cj cp;
+template  void cl(bu *cr, cj cs) { ct(cr, cs); }
+typedef __attribute__((altivec(vector__))) double co;
+void ct(double *cr, co cs) { *(co *)cr = cs; }
+struct cq {
+  co q;
+};
+template <> struct bm> { typedef cq f; };
+template <> struct bh { typedef cq bj; };
+void ct(b::af *cr, cq cs) { ct((double *)cr, cs.q); }
+template  struct cx {
+  template  void cu(cw *a, cj) {
+cl(a, cp);
+  }
+};
+} // namespace ai
+template  class ba : public ay {
+public:
+  typedef ai::ap bu;
+  typedef b::n::bo, bu, b::n::at, bu, bu>> cv;
+  typedef ay db;
+  db::dc;
+  cv coeff(an dd, an col) const { return dc().coeff(dd, col); }
+};
+template  class cz : public ba::at> {
+public:
+  ai::ap b;
+  enum { da, dg, dh, bv, bq, di = dg, bo };
+};
+template  class be : public cz {
+public:
+  typedef typename ai::ap::bu bu;
+  typedef cz db;
+  db::dc;
+  template  cd +=(const be &);
+  template  az df(de);
+};
+template  struct ay {
+  cd () { return *static_cast(this); }
+  cd dc() const;
+};
+template  class dl;
+namespace ai {
+template  struct ap> {
+  typedef bb dj;
+  typedef bc r;
+  typedef ap s;
+  typedef ap t;
+  typedef typename cg::cn bu;
+  typedef typename ch::bd>::bd cf;
+  enum { bo };
+};
+} // namespace ai
+template 
+class az : public dl, ai::ap, ai::bg::bd>> {
+public:
+  typedef dk bb;
+  typedef Rhs_ bc;
+  typedef typename ai::bt::f LhsNested;
+  typedef typename ai::bt::f dn;
+  typedef ai::ar u;
+  typedef ai::ar RhsNestedCleaned;
+  u lhs();
+  RhsNestedCleaned rhs();
+};
+template 
+class dl : public ai::bz, al>::f {};
+namespace ai {
+template  struct v { typedef ag w; };
+template  struct evaluator_traits_base {
+  typedef typename v::cf>::w w;
+};
+template  struct ax : evaluator_traits_base {};
+template  struct y { static const bool at = false; };
+template  class plainobjectbase_evaluator_data {
+public:
+  plainobjectbase_evaluator_data(bu *ptr, an) : data(ptr) {}
+  an outerStride() { return z; }
+  bu *data;
+};
+template  struct evaluator {
+  typedef cd PlainObjectType;
+  typedef typename PlainObjectType::bu bu;
+  enum { IsVectorAtCompileTime };
+  enum { 

[PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-11-15 Thread Peter Bergner
PR109116 exposes an issue where using unspecs to access each vector component
of an opaque mode variable leads to unneeded register copies, because our rtl
optimizers cannot handle unspecs.  Instead, use subregs to access each vector
component of the opaque mode variable, which our optimizers know how to handle.

I did not include a test case with the patch, since writing a test case that
attempts to ensure we don't emit unneeded register copies is nearly impossible
since those copies can still be generated for reasons other than the causes
in this patch.  I have verified that this patch does improve code generation
for some unit tests and our AI libraries team has confirmed that performance
of their tests improved when using this patch.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux
and powerpc64-linux.  Ok for trunk?

Peter


gcc/
PR target/109116
* config/rs6000/mma.md (vsx_disassemble_pair): Expand into a vector
register sized subreg.
* config/rs6000/mma.md (*vsx_disassemble_pair): Delete.
(mma_disassemble_acc): Expand into a vector register sized subreg.
(*mma_disassemble_acc): Delete.
* config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Allow vector modes
to tie with OOmode.

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 575751d477e..2ca405469e2 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -398,29 +398,8 @@ (define_expand "vsx_disassemble_pair"
(match_operand 2 "const_0_to_1_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*vsx_disassemble_pair"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
- (match_operand 2 "const_0_to_1_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && vsx_register_operand (operands[1], OOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
@@ -472,29 +451,8 @@ (define_expand "mma_disassemble_acc"
(match_operand 2 "const_0_to_3_operand")]
   "TARGET_MMA"
 {
-  rtx src;
-  int regoff = INTVAL (operands[2]);
-  src = gen_rtx_UNSPEC (V16QImode,
-   gen_rtvec (2, operands[1], GEN_INT (regoff)),
-   UNSPEC_MMA_EXTRACT);
-  emit_move_insn (operands[0], src);
-  DONE;
-})
-
-(define_insn_and_split "*mma_disassemble_acc"
-  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
-   (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d")
- (match_operand 2 "const_0_to_3_operand")]
- UNSPEC_MMA_EXTRACT))]
-  "TARGET_MMA
-   && fpr_reg_operand (operands[1], XOmode)"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
-{
-  int reg = REGNO (operands[1]);
-  int regoff = INTVAL (operands[2]);
-  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
+  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);
+  rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff);
   emit_move_insn (operands[0], src);
   DONE;
 })
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5f56c3ed85b..f2efa46c147 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
 static bool
 rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
 {
-  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
-  || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
-return mode1 == mode2;
+   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
+   || mode2 == PTImode || mode2 == XOmode)
+ return mode1 == mode2;
+ 
+  if (mode2 == OOmode)
+return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);
 
   if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1))
 return ALTIVEC_OR_VSX_VECTOR_MODE (mode2);


Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/14/23 9:12 PM, Lehua Ding wrote:
> I've applied for machine permissions on the compile farm, can you give
> me the way to compile and run tests on PPC64BE machine? I'll take a look
> at it too, thanks a lot.

That's an old system, with too old system libgmp, etc.  Let me attempt a
build there so I can give you correct build directions for that system.

That said, unfortunately, that system is currently almost out of available
disk space:

  [bergner@gcc1-power7 ~]$ df -h
  Filesystem  Size  Used Avail Use% Mounted on
  ...
  /dev/md41.6T  1.6T  9.0G 100% /home

Segher, can you please send out an admin note for people to clean up
unneeded space on cfarm110?  Thanks.

Peter




[PATCH] rs6000: Only enable PCREL on supported ABIs [PR111045]

2023-11-14 Thread Peter Bergner
PCREL data accesses are only officially supported on ELFv2.  We currently
incorrectly enable PCREL on all Power10 compiles in which prefix instructions
are also enabled.  Rework the option override code so we only enable PCREL
for those ABIs that actually support it.

Jeevitha has confirmed this patch fixes the testsuite fallout seen with her
PR110320 patch.

This has been bootstrapped and regtested with no regressions on the following
builds: powerpc64le-linux, powerpc64le-linux --with-cpu=power10 and
powerpc64-linux - testsuite run in both 32-bit and 64-bit modes.
Ok for trunk?

Ok for the release branches after some burn-in on trunk?

Peter


gcc/
PR target/111045
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
* config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
from power10.
* config/rs6000/predicates.md: Use TARGET_PCREL.
* config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
(rs6000_global_entry_point_prologue_needed_p): Likewise.
(rs6000_output_function_prologue): Likewise.
* config/rs6000/rs6000.md: Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
the logic for enabling PCREL by default.
(rs6000_legitimize_tls_address): Use TARGET_PCREL.
(rs6000_call_template_1): Likewise.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_longcall_ref): Likewise.
(rs6000_call_aix): Likewise.
(rs6000_sibcall_aix): Likewise.
(rs6000_pcrel_p): Remove.
* config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.

gcc/testsuite/
PR target/111045
* gcc.target/powerpc/pr111045.c: New test.
* gcc.target/powerpc/float128-constant.c: Add instruction counts for
non-pcrel compiles.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 98b7255c95f..5b77bd7fd51 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -563,8 +563,5 @@ extern int dot_symbols;
 #define TARGET_FLOAT128_ENABLE_TYPE 1
 
 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
-   supports it.  The ELF v2 ABI only supports PC-relative relocations for
-   the medium code model.  */
-#define PCREL_SUPPORTED_BY_OS  (TARGET_POWER10 && TARGET_PREFIXED  \
-&& ELFv2_ABI_CHECK \
-&& TARGET_CMODEL == CMODEL_MEDIUM)
+   supports it.  */
+#define PCREL_SUPPORTED_BY_OS  (ELFv2_ABI_CHECK)
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..fe01a2312ae 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | 
ISA_2_7_MASKS_SERVER
| OPTION_MASK_HTM)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
| OPTION_MASK_HTM)
-RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | 
ISA_3_1_MASKS_SERVER)
+RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
+   | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
| MASK_POWERPC64)
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..0b76541fc0a 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1216,7 +1216,7 @@
 && SYMBOL_REF_DECL (op) != NULL
 && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL
 && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op))
-!= rs6000_pcrel_p ()))")))
+!= TARGET_PCREL))")))
 
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 98846f781ec..9e08d9bb4d2 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -1106,7 +1106,7 @@ rs6000_decl_ok_for_sibcall (tree decl)
 r2 for its caller's TOC.  Such a function may make sibcalls to any
 function, whether local or external, without restriction based on
 TOC-save/restore rules.  */
-  if (rs6000_pcrel_p ())
+  if (TARGET_PCREL)
return true;
 
   /* Otherwise, under the AIX or ELFv2 ABIs we can't allow sibcalls
@@ -2583,7 +2583,7 @@ rs6000_global_entry_point_prologue_needed_p (void)
 return false;
 
   /* PC-relative functions never generate a global entry point prologue.  */
-  if (rs6000_pcrel_p ())
+  if (TARGET_PCREL)
 return false;
 
   /* Ensure we have a global entry point for thunks.   ??? We could
@@ -4045,7 +4045,7 @@ rs6000_output_function_prologue (FILE *file)
  

Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/13/23 11:37 PM, Lehua Ding wrote:
> On 2023/11/14 3:37, Vladimir Makarov wrote:
>> Also besides testing major targets I'd recommend testing at least one big
>> endian target (I'd recommend ppc64be. gcc110.fsfrance.org could be used
>> for this).  Plenty RA issues occur because BE targets are not tested.
> 
> You said the address looks a bit wrong, it should be this gcc110.fsffrance.org
> right? I looked for it and it looks like you have to go to portal.cfarm.net
> first to apply for an account on this site, I'll try that, thanks a lot.


The compile farm just went through with a domain name change, so the Power7 BE
gcc110.fsffrance.org system is now reachable via cfarm110.cfarm.net.
You are correct on the address for requesting a cfarm account.

That said, I posted results using your V3 patches for both LE and BE Power
in my other reply. 

Peter



Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-14 Thread Peter Bergner
On 11/12/23 6:08 AM, Lehua Ding wrote:
> V3 Changes:
>   1. fix three ICE.
>   2. rebase


I tested this on powerpc64le-linux and powerpc64-linux.  The LE build
bootstrapped fine and it looks like only one testsuite FAIL which I have
to look into why it's FAILing.

The BE build did bootstrap, but the 32-bit and 64-bit testsuite runs both
had lots of FAILs (over 100 between them both) which I have yet to look
into what is happening.

I'll also note I have done no performance testing yet until I have an
idea of what the testsuite failures are.  I think a patch like this that
can affect the performance of all architectures needs some performance
testing to ensure we don't have unintended performance degradations.
I'll have someone on my team kick off some builds once I have a handle
on the testsuite FAILs.

Peter




Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-14 Thread Peter Bergner
On 11/13/23 8:33 PM, Kewen.Lin wrote:
>> if (PCREL_SUPPORTED_BY_OS)
> 
>> +  else
>> +{
>> +  if (TARGET_PCREL
>> +  && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>> +error ("use of %qs is invalid for this target", "-mpcrel");
>>rs6000_isa_flags &= ~OPTION_MASK_PCREL;
>>  }
> 
> now, I think it should be fine not to explicitly mask it off Power10
> default flags?  Then leave Power10 default flags unchanged seems more
> consistent.  The others look good to me!  Thanks again.

Ah, good catch, yes, I'll remove the clearing of the bit, since we know
it's either already clear (because that is the default value) or if it's
set, then that's the error condition we're catching here.  That also means
that we can remove the test for rs6000_isa_flags_explicit & OPTION_MASK_PCREL) 
!= 0
too, since at this point, TARGET_PCREL can only be true if it was explicitly
enabled with -mpcrel by the user.  Therefor, the code can now look like:

  if (PCREL_SUPPORTED_BY_OS)

  else if (TARGET_PCREL)
error ("use of %qs is invalid for this target", "-mpcrel");

I'll make that change, redo the bootstrap and regtesting and then
officially submit the patch.  Thanks!

Peter



Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/25/23 6:20 AM, Kewen.Lin wrote:
> btw, I was also expecting that we don't implicitly set
> OPTION_MASK_PCREL any more for Power10, that is to remove
> OPTION_MASK_PCREL from OTHER_POWER10_MASKS.

So my patch removes the flag from the default power10 flags, like
you want.  However, it doesn't remove it from OTHER_POWER10_MASKS,
since that is used to set ISA_3_1_MASKS_SERVER and I didn't want
to change how rs6000_machine_from_flags() behaves, so instead, I
just explicitly mask it off when defining the power10 default flags.

Peter



Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-10 Thread Peter Bergner
On 8/27/23 9:06 PM, Kewen.Lin wrote:
> Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
> can have either TARGET_PCREL or !TARGET_PCREL after the checking.
> For the latter, it's fine and don't need any checks. For the former,
> if it's implicit, for !TARGET_PREFIXED we will clean it silently;
> while if it's explicit, for !TARGET_PREFIXED we will emit an error.
> TARGET_PREFIXED checking has considered Power10, so it's also
> concerned accordingly.
[snip]
> Yeah, looking forward to their opinions.  IMHO, with the current proposed
> change, pcrel doesn't look like a pure Power10 hardware feature, it also
> quite relies on ABIs, that's why I thought it seems good not to turn it
> on by default for Power10.

Ok, how about the patch below?  This removes OPTION_MASK_PCREL from the
power10 flags, so instead of our options override code needing to disable
PCREL on the systems that don't support it, we now enable it only on those
systems that do support it.

Jeevitha, can you test this patch to see whether it fixes the testsuite
issue caused by your earlier patch that was approved, but not yet pushed?
That was the use GPR2 for register allocation, correct?  Note, you'll need
to update the patch to replace the rs6000_pcrel_p() usage with just
TARGET_PCREL, since this patch removes rs6000_pcrel_p().

If testing is clean and everyone is OK with the patch, I'll officially
submit it for review with git log entry, etc.

Peter


gcc/
* config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
* config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
from power10.
* config/rs6000/predicates.md: Use TARGET_PCREL.
* config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
(rs6000_global_entry_point_prologue_needed_p): Likewise.
(rs6000_output_function_prologue): Likewise.
* config/rs6000/rs6000.md: Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
the logic for enabling PCREL by default.
(rs6000_legitimize_tls_address): Use TARGET_PCREL.
(rs6000_call_template_1): Likewise.
(rs6000_indirect_call_template_1): Likewise.
(rs6000_longcall_ref): Likewise.
(rs6000_call_aix): Likewise.
(rs6000_sibcall_aix): Likewise.
(rs6000_pcrel_p): Remove.
* config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 98b7255c95f..5b77bd7fd51 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -563,8 +563,5 @@ extern int dot_symbols;
 #define TARGET_FLOAT128_ENABLE_TYPE 1
 
 /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
-   supports it.  The ELF v2 ABI only supports PC-relative relocations for
-   the medium code model.  */
-#define PCREL_SUPPORTED_BY_OS  (TARGET_POWER10 && TARGET_PREFIXED  \
-&& ELFv2_ABI_CHECK \
-&& TARGET_CMODEL == CMODEL_MEDIUM)
+   supports it.  */
+#define PCREL_SUPPORTED_BY_OS  (ELFv2_ABI_CHECK)
diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 4f350da378c..fe01a2312ae 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | 
ISA_2_7_MASKS_SERVER
| OPTION_MASK_HTM)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
| OPTION_MASK_HTM)
-RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | 
ISA_3_1_MASKS_SERVER)
+RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
+   | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
| MASK_POWERPC64)
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ef7d3f214c4..0b76541fc0a 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1216,7 +1216,7 @@
 && SYMBOL_REF_DECL (op) != NULL
 && TREE_CODE (SYMBOL_REF_DECL (op)) == FUNCTION_DECL
 && (rs6000_fndecl_pcrel_p (SYMBOL_REF_DECL (op))
-!= rs6000_pcrel_p ()))")))
+!= TARGET_PCREL))")))
 
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
diff --git a/gcc/config/rs6000/rs6000-logue.cc 
b/gcc/config/rs6000/rs6000-logue.cc
index 98846f781ec..9e08d9bb4d2 100644
--- a/gcc/config/rs6000/rs6000-logue.cc
+++ b/gcc/config/rs6000/rs6000-logue.cc
@@ -1106,7 +1106,7 @@ rs6000_decl_ok_for_sibcall (tree decl)
 r2 for its caller's TOC.  Such a function may make sibcalls to any
 function, whether local or external, without restriction based 

[PATCH] rs6000: Update instruction counts to match vec_* calls [PR111228]

2023-08-30 Thread Peter Bergner via Gcc-patches
Commit  r14-3258-ge7a36e4715c716 increased the amount of folding we perform,
leading to better code.  Update the expected instruction counts to match the
the number of associated vec_* built-in calls.

Tested on powerpc64le-linux with no regressions.  Ok for mainline?

Peter

gcc/testsuite/
PR testsuite/111228
* gcc.target/powerpc/fold-vec-logical-ors-char.c: Update instruction
counts to match the number of associated vec_* built-in calls.
* gcc.target/powerpc/fold-vec-logical-ors-int.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-ors-short.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-char.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-int.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-logical-other-short.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-char.c
index 713fed7824a..7406039d054 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-char.c
@@ -120,6 +120,6 @@ test6_nor (vector unsigned char x, vector unsigned char y)
   return *foo;
 }
 
-/* { dg-final { scan-assembler-times {\mxxlor\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mxxlor\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mxxlxor\M} 6 } } */
-/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxlnor\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-int.c
index 4d1c78f40ec..a7c6366b938 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-int.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-int.c
@@ -119,6 +119,6 @@ test6_nor (vector unsigned int x, vector unsigned int y)
   return *foo;
 }
 
-/* { dg-final { scan-assembler-times {\mxxlor\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mxxlor\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mxxlxor\M} 6 } } */
-/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxlnor\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
index 27ef09ada80..10c69d3d87b 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-longlong.c
@@ -156,6 +156,6 @@ test6_nor (vector unsigned long long x, vector unsigned 
long long y)
 // For simplicity, this test now only targets "powerpc_p8vector_ok" 
environments
 // where the answer is expected to be 6.
 
-/* { dg-final { scan-assembler-times {\mxxlor\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mxxlor\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mxxlxor\M} 6 } } */
-/* { dg-final { scan-assembler-times {\mxxlnor\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxlnor\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-short.c
index f796c5b33a9..8352a7f4dc5 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-short.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-ors-short.c
@@ -119,6 +119,6 @@ test6_nor (vector unsigned short x, vector unsigned short y)
   return *foo;
 }
 
-/* { dg-final { scan-assembler-times {\mxxlor\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mxxlor\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mxxlxor\M} 6 } } */
-/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxlnor\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-char.c
index e74308ccda2..7fe3e0b8e0e 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-char.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-char.c
@@ -104,5 +104,5 @@ test6_nand (vector unsigned char x, vector unsigned char y)
   return *foo;
 }
 
-/* { dg-final { scan-assembler-times {\mxxlnand\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxxlnand\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mxxlorc\M} 6 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-int.c
index 57edaad52a8..61d34059b67 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-int.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-logical-other-int.c
@@ -104,5 +104,5 @@ test6_nand (vector unsigned int x, vector unsigned int y)
   return *foo;
 }
 
-/* { dg-final { scan-assembler-times {\mxxlnand\M} 3 } } 

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-25 Thread Peter Bergner via Gcc-patches
On 8/25/23 6:20 AM, Kewen.Lin wrote:
> Assuming the current PCREL_SUPPORTED_BY_OS unchanged, when
> PCREL_SUPPORTED_BY_OS is true, all its required conditions are
> satisfied, it should be safe.  while PCREL_SUPPORTED_BY_OS is
> false, it means the given subtarget doesn't support it, or one
> or more of conditions below don't hold:
> 
>  - TARGET_POWER10 
>  - TARGET_PREFIXED
>  - ELFv2_ABI_CHECK
>  - TARGET_CMODEL == CMODEL_MEDIUM

The pcrel instructions are 64-bit/prefix instructions, so I think
for PCREL_SUPPORTED_BY_OS, we want to keep the TARGET_POWER10 and
the TARGET_PREFIXED checks.  Then we should have the checks for
the OS that can support pcrel, in this case, only ELFv2_ABI_CHECK
for now.  I think we should remove the TARGET_CMODEL check, since
that isn't strictly required, it's a current code requirement for
ELFv2, but could change in the future.  In fact, Mike has talked
about adding pcrel support for ELFv2 and -mcmodel=large, so I think
that should move moved out of PCREL_SUPPORTED_BY_OS and into the
option override checks.



> I guess your proposal seems to drop CMODEL_MEDIUM (even PREFIX?)
> from PCREL_SUPPORTED_BY_OS, to leave it just ABI_ELFv2 & P10? (
> otherwise TARGET_CMODEL should be always CMODEL_MEDIUM for
> ABI_ELFv2 with the original PCREL_SUPPORTED_BY_OS).  And it
> seems to make the subtarget specific checking according to the
> ABI type further?

As I said above, we should also keep TARGET_PREFIX, since that is
required for the pcrel instructions, which are 64-bit/prefix insns.
I think the MCMODEL check should be removed from PCREL_SUPPORTED_BY_OS
and moved to the option override checks, since there's not technical
reason the different MCMODEL options cannot be made to work with
pcrel.


> I was expecting that when new subtargets need to be supported, we
> can move these subtarget specific checkings into macro/function
> SUB*TARGET_OVERRIDE_OPTIONS, IMHO the upside is each subtarget
> can take care of its own checkings separately.  Maybe we can
> just move them now (to rs6000_linux64_override_options).  And in
> the common code, for the cases with zero value PCREL_SUPPORTED_BY_OS,
> (assuming PCREL_SUPPORTED_BY_OS just simple as like ABI_ELFv2), we
> can emit an invalid error for explicit -mpcrel as you proposed
> below.  Thoughts?

Yeah, I think that would probably help simplify the tests, since
they're semi-target specific already.  Of course, we there is no
SUB*TARGET_OVERRIDE_OPTIONS and the other OSes, so those would
have to be added.



> btw, I was also expecting that we don't implicitly set
> OPTION_MASK_PCREL any more for Power10, that is to remove
> OPTION_MASK_PCREL from OTHER_POWER10_MASKS.

I'm on the fence about this one and would like to hear from Segher
and Mike on what they think.  In some respect, pcrel is a Power10
hardware feature, so that would seem to make sense to set the flag,
but yeah, we only have one OS that currently supports it, so maybe
not setting it makes sense?  Like I said, I think I need Segher and
Mike to chime in with their thoughts.

Peter




Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-24 Thread Peter Bergner via Gcc-patches
On 8/24/23 12:56 AM, Kewen.Lin wrote:
> By looking into the uses of function rs6000_pcrel_p, I think we can
> just replace it with TARGET_PCREL.  Previously we don't require PCREL
> unset for any unsupported target/OS, so we need rs6000_pcrel_p() to
> ensure it's really supported in those use places, now if we can guarantee
> TARGET_PCREL only hold for all places where it's supported, it looks
> we can just check TARGET_PCREL only?

I think you're correct on only needing TARGET_PCREL instead of rs6000_pcrel_p()
as long as we correctly disable PCREL on the targets that either don't support
it or those that due, but don't due to explicit options (ie, -mno-prel or
ELFv2 and -mcmodel != medium, etc.).



> Then the code structure can look like:
> 
> if (PCREL_SUPPORTED_BY_OS
> && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) == 0)
>// enable
> else if (TARGET_PCREL && DEFAULT_ABI != ABI_ELFv2)
>// disable
> else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
>// disable

I don't like that first conditional expression. The problem is,
PCREL_SUPPORTED_BY_OS could be true or false for the following
tests, because it's anded with the explicit option test, and
sometimes that won't make sense.  I think something safer is
something like:

if (PCREL_SUPPORTED_BY_OS)
  { 
/* PCREL on ELFv2 requires -mcmodel=medium.  */
if (DEFAULT_ABI == ABI_ELFv2 && TARGET_CMODEL != CMODEL_MEDIUM)
  error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");

if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) == 0)
  rs6000_isa_flags |= OPTION_MASK_PCREL;
  } 
else
  {
if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0
&& TARGET_PCREL)
  error ("use of %qs is invalid for this target", "-mpcrel");
rs6000_isa_flags &= ~OPTION_MASK_PCREL;
  }

...although, even the cmodel != medium test above should probably have
some extra tests to ensure TARGET_PCREL is true (and explicit?) and
mcmodel != medium before giving an error???  Ie, a ELFv2 P10 compile
with an implicit -mpcrel and explicit -mcmodel={small,large} probably
should not error and just silently disable pcrel???  Meaning only
when we explicitly say -mpcrel -mcmodel={small,large} should we give
that error.  Thoughts on that?

Peter



Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2023-08-24 Thread Peter Bergner via Gcc-patches
On 8/24/23 12:35 PM, Michael Meissner wrote:
> On Thu, Jul 20, 2023 at 10:05:28AM +0530, jeevitha wrote:
>> gcc/
>>  PR target/110411
>>  * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
>>  to hold PTImode type.
>>  * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node
>>  for PTImode type.
> 
> It is good as far as it goes, but I suspect we will eventually need to extend
> it.  In particular, the reason people need PTImode is they need the even/odd
> register layout.  What you've done enables users to declare this value.

Sure, it could be extended, but that is not what this patch is about.
It's purely to allow the kernel team access to the guaranteed even/odd
register layout for some inline asm code.  Any extension would be a
follow-on patch to this.



On 8/9/23 3:48 AM, Kewen.Lin wrote:
> IIUC, this builtin type registering makes this type expose to users, so
> I wonder if we want to actually expose this type for users' uses.
> If yes, we need to update the documentation (and not sure if the current
> name is good enough); otherwise, I wonder if there is some existing
> practice to declare a builtin type with a name which users can't actually
> use and is just for shadowing a mode.

Segher, Mike, Jeevitha and I talked about the patch and Segher mentioned
that under some conditions, it's fine to keep the type undocumented.
Hopefully he'll weigh in on whether this particular patch is one of
those cases or not.  


Peter


Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-23 Thread Peter Bergner via Gcc-patches
On 8/21/23 8:51 PM, Kewen.Lin wrote:
>> The following patch has been bootstrapped and regtested on powerpc64-linux.
> 
> I think we should test this on powerpc64le-linux P8 or P9 (no P10) as well.

That's a good idea!



> I think this should be moved to be with the hunk on PCREL:
> 
>   /* If the ABI has support for PC-relative relocations, enable it by default.
>  This test depends on the sub-target tests above setting the code model to
>  medium for ELF v2 systems.  */
>   if (PCREL_SUPPORTED_BY_OS
>   && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) == 0)
> rs6000_isa_flags |= OPTION_MASK_PCREL;
> 
>   /* -mpcrel requires -mcmodel=medium, but we can't check TARGET_CMODEL until
>   after the subtarget override options are done.  */
>   else if (TARGET_PCREL && TARGET_CMODEL != CMODEL_MEDIUM)
> {
>   if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>   error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
> 
>   rs6000_isa_flags &= ~OPTION_MASK_PCREL;
> }

Agreed on the location, but...

Looking at this closer, I don't think I'm happy with the current code.
We seem to have duplicated tests for whether the target supports pcrel
or not in both PCREL_SUPPORTED_BY_OS and rs6000_pcrel_p().  That means
if another target were to add support for pcrel, they'd have to update
multiple locations of the code, and that seems error prone.

I think we should standardize our tests for whether the target/OS
supports pcrel (irrespective of the -mpcrel or -mcmodel=medium options)
and have that in PCREL_SUPPORTED_BY_OS.  Ie, a one-stop-shop for testing
whether the current target/OS can support pcrel.  Then we should modify
rs6000_pcrel_p() use PCREL_SUPPORTED_BY_OS rather than its own
semi-duplicated target/OS tests, plus any other tests for options that
might disqualify the current target/OS from supporting pcrel, when it
normally can (ie, -mmodel != medium for ELFv2).

I think then, that should allow simplifying the code in
rs6000_option_override_internal.

Thoughts?


Peter




Re: [PING][PATCH] ira: update allocated_hardreg_p[] in improve_allocation() [PR110254]

2023-08-18 Thread Peter Bergner via Gcc-patches
On 8/2/23 8:23 AM, Vladimir Makarov wrote:
>>> gcc/
>>> PR rtl-optimization/PR110254
>>> * ira-color.cc (improve_allocation): Update array
> 
> I guess you missed the next line in the changelog.  I suspect it should be 
> "Update array allocated_hard_reg_p."
> 
> Please, fix it before committing the patch.

Is this a fix we want backported?

Peter




Re: [PATCH ver 2] rs6000, add overloaded DFP quantize support

2023-08-16 Thread Peter Bergner via Gcc-patches
On 8/16/23 7:19 PM, Carl Love wrote:
> +(define_insn "dfp_dquan_"
> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:DDTD 1 "gpc_reg_operand" "d")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:QI 3 "immediate_operand" "i")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dqua %0,%1,%2,%3"
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])

operand 3 refers to the RMC operand field of the insn we are emitting.
RMC is a two bit unsigned operand, so I think the predicate should be
const_0_to_3_operand rather than immediate_operand.  It's always best
to use a tighter predicate if we have one. Ditto for the other patterns
with an RMC operand.

I don't think we allow anything other than an integer for that operand
value, so I _think_ that "n" is probably a better constraint than "i"?
Ke Wen/Segher???


> +(define_insn "dfp_dquan_i"
> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:SI 1 "const_int_operand" "n")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:SI 3 "immediate_operand" "i")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dquai %1,%0,%2,%3"
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])

operand 1 refers to the TE operand field and that is a 5-bit signed operand.
For that, I think we should be using the s5bit_cint_operand predicate,
rather than const_int_operand.



Peter


Re: [PATCH V2] rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

2023-08-06 Thread Peter Bergner via Gcc-patches
On 7/19/23 11:46 AM, jeevitha via Gcc-patches wrote:
> gcc/
>   PR target/110411
>   * config/rs6000/mma.md (define_insn_and_split movoo): Disallow
>   AltiVec address in movoo and movxo pattern.

No need to mention movxo here, since the next change covers movxo.
And maybe better as "Disallow AltiVec address operands."?


>   (define_insn_and_split movxo): Likewise.

Fine.


>   *config/rs6000/predicates.md (vsx_quad_dform_memory_operand):Remove
 ^   ^
Need a space in the two spots above.

I cannot approve it, but it looks good to me with the above bits fixed.

Peter




Re: [PATCH] rs6000: Remove redundant initialization [PR106907]

2023-07-10 Thread Peter Bergner via Gcc-patches
On 6/29/23 4:31 AM, Kewen.Lin via Gcc-patches wrote:
> This is okay for trunk (no backports needed btw), this fix can even be
> taken as obvious, thanks!
> 
>>
>> 2023-06-07  Jeevitha Palanisamy  
>>
>> gcc/
>>  PR target/106907
> 
> One curious question is that this PR106907 seemed not to report this issue,
> is there another PR reporting this?  Or do I miss something?

I think Jeevitha just ran cppcheck by hand and noticed the "new" warnings
and added them to the list of things to fixup.  Yeah, it would be nice to
add the new warnings to the PR for historical reasons.

Peter





Re: [PATCH, OBVIOUS] rs6000: Remove redundant MEM_P predicate usage

2023-07-10 Thread Peter Bergner via Gcc-patches
On 7/10/23 11:47 AM, Peter Bergner wrote:
> While helping someone on the team debug an issue, I noticed some redundant
> tests in a couple of our predicates which can be removed.  I'm going to
> commit the following as obvious once bootstrap and regtesting come back
> clean.
> 
> Peter
> 
> 
> rs6000: Remove redundant MEM_P predicate usage
> 
> The quad_memory_operand and vsx_quad_dform_memory_operand predicates contain
> a (match_code "mem") test, making their MEM_P usage redundant.  Remove them.
> 
> gcc/
>   * config/rs6000/predicates.md (quad_memory_operand): Remove redundant
>   MEM_P usage.
>   (vsx_quad_dform_memory_operand): Likewise.

Testing was clean as expected.  Pushed to trunk.

Peter



Re: [PATCH ver3] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-10 Thread Peter Bergner via Gcc-patches
On 7/10/23 2:18 PM, Carl Love wrote:
> +  /* Get the current FPSCR fields, bits 29:31 (DRN) and bits 56:63 (VE, OE, 
> UE,
> +  ZE, XE, NI, RN) from the FPSCR and return them.  */

The 'Z' above should line up directly under the 'G' in Get.


> -  /* Insert new RN mode into FSCPR.  */
> -  emit_insn (gen_rs6000_mffs (tmp_df));
> -  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -  emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> +  /* Insert the new RN value from tmp_rn into FPSCR bit [62:63].  */
> +  emit_insn (gen_anddi3 (tmp_di1, tmp_di2, GEN_INT (-4)));
> +  emit_insn (gen_iordi3 (tmp_di1, tmp_di1, tmp_rn));

This is an expander, so you shouldn't reuse temporaries as multiple
destination pseudos, since that limits the register allocator's freedom.
I know the old code did it, but since you're changing the line, you
might as well use a new temp.


I cannot approve it, but it LGTM with those fixed.

Peter




[PATCH, OBVIOUS] rs6000: Remove redundant MEM_P predicate usage

2023-07-10 Thread Peter Bergner via Gcc-patches
While helping someone on the team debug an issue, I noticed some redundant
tests in a couple of our predicates which can be removed.  I'm going to
commit the following as obvious once bootstrap and regtesting come back
clean.

Peter


rs6000: Remove redundant MEM_P predicate usage

The quad_memory_operand and vsx_quad_dform_memory_operand predicates contain
a (match_code "mem") test, making their MEM_P usage redundant.  Remove them.

gcc/
* config/rs6000/predicates.md (quad_memory_operand): Remove redundant
MEM_P usage.
(vsx_quad_dform_memory_operand): Likewise.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8479331482e..3552d908e9d 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -912,7 +912,7 @@ (define_predicate "quad_memory_operand"
   if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
 return false;
 
-  if (GET_MODE_SIZE (mode) != 16 || !MEM_P (op) || MEM_ALIGN (op) < 128)
+  if (GET_MODE_SIZE (mode) != 16 || MEM_ALIGN (op) < 128)
 return false;
 
   return quad_address_p (XEXP (op, 0), mode, false);
@@ -924,7 +924,7 @@ (define_predicate "quad_memory_operand"
 (define_predicate "vsx_quad_dform_memory_operand"
   (match_code "mem")
 {
-  if (!TARGET_P9_VECTOR || !MEM_P (op) || GET_MODE_SIZE (mode) != 16)
+  if (!TARGET_P9_VECTOR || GET_MODE_SIZE (mode) != 16)
 return false;
 
   return quad_address_p (XEXP (op, 0), mode, false);


Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-07 Thread Peter Bergner via Gcc-patches
On 7/6/23 6:28 PM, Segher Boessenkool wrote:
> On Thu, Jul 06, 2023 at 02:48:19PM -0500, Peter Bergner wrote:
>> On 7/6/23 12:33 PM, Segher Boessenkool wrote:
>>> On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote:
>>>> --- a/gcc/config/rs6000/rs6000.cc
>>>> +++ b/gcc/config/rs6000/rs6000.cc
>>>> @@ -9894,6 +9894,8 @@ rs6000_legitimate_address_p (machine_mode mode, rtx 
>>>> x, bool reg_ok_strict)
>>>>  
>>>>/* Handle unaligned altivec lvx/stvx type addresses.  */
>>>>if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
>>>> +  && mode !=  OOmode
>>>> +  && mode !=  XOmode
>>>>&& GET_CODE (x) == AND
>>>>&& CONST_INT_P (XEXP (x, 1))
>>>>&& INTVAL (XEXP (x, 1)) == -16)
>>>
>>> Why do we need this for OOmode and XOmode here, but not for the other
>>> modes that are equally not allowed?  That makes no sense.
>>
>> VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) already filters those modes out
>> (eg, SImode, DFmode, etc.), just not OOmode and XOmode, since those both
>> are modes used in/with VSX registers.
> 
> It does not filter anything out, no.  That simply checks if a datum of
> that mode can be loaded into vector registers or not.  For example
> SImode could very well be loaded into vector registers!  (It just is not
> such a great idea).

I spent some time looking at how the compiler fixes this up in the
-mno-block-ops-vector-pair case and I see the constraints used in the
vsx_mov_64bit pattern for loads and stores disallows these types
of addresses, so LRA fixes them up for us.  Clearly movoo should do the
same and that is enough to fix the ICE.  I'll work with Jeevitha on
submitting a patch using that solution.

That said, I think it would be good to modify rs6000_legitimate_address_p
to disallow these altivec style addresses for OOmode and XOmode, since we
know early-on that they're not going to be valid, but that would be a
different patch.




> dg-do compile *does* invoke the assembler, btw.  As it should.

There is dg-do "preprocess", "compile", "assemble", "link" and "run"
(ignoring "precompile" and "repo").  Dg-do compile produces an assembly
file, but doesn't actually call the assembler, which we don't strictly
need for a test case that checks whether GCC ICEs or not.  If you want
to run the assembler too and then stop, then you'd want dg-do assemble.

Peter




  1   2   3   4   5   6   7   8   9   10   >