Re: [PATCH] s390: Extend two element float vector

2024-06-11 Thread Andreas Krebbel

On 6/11/24 10:26, Stefan Schulze Frielinghaus wrote:

This implements a V2SF -> V2DF extend.

gcc/ChangeLog:

* config/s390/vector.md (*vmrhf): New.
(extendv2sfv2df2): New.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-extend-3.c: New test.


Since we already have a *vmrhf pattern, should we perhaps add something 
to the name to make it easier to distinguish in the rtl dumps? You have 
added the mode already, but perhaps something like *vmrhf_half or 
something like this?


Ok with or without that change. Thanks!


Andreas
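
For readers unfamiliar with the operation: the extend discussed above widens each of the two single-precision elements to double precision. The following is a minimal scalar sketch in portable C, using plain arrays rather than the backend's vector modes. The function name is made up for illustration and does not appear in the patch.

```c
/* Scalar model of a V2SF -> V2DF extend: each of the two float
   elements is widened to double.  The name is hypothetical and not
   taken from the patch.  */
void
extend_v2sf_to_v2df (const float in[2], double out[2])
{
  for (int i = 0; i < 2; i++)
    out[i] = (double) in[i];  /* float -> double conversion is exact */
}
```

Whether a loop like this is actually vectorized into the new expander depends on the target flags and the vectorizer, which this sketch does not attempt to show.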




Re: [PATCH] s390: Extend two/four element integer vectors

2024-06-11 Thread Andreas Krebbel



On 6/11/24 10:24, Stefan Schulze Frielinghaus wrote:

For the moment I deliberately left out one-element QHS vectors since it
is unclear whether these are pathological cases or whether they are
really used.  If we ever get an extend for V1DI -> V1TI we should
reconsider this.

As a side-effect this fixes PR115261.

gcc/ChangeLog:

target/PR115261
* config/s390/s390.md (any_extend,extend_insn,zero_extend):
New code attributes and code iterator.
* config/s390/vector.md (V_EXTEND): New mode iterator.
(2): New insn.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-extend-1.c: New test.
* gcc.target/s390/vector/vec-extend-2.c: New test.
---
  Bootstrap and regtested on s390.  Ok for mainline?


Ok. Thanks!


Andreas
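
As an illustration of what the integer extends compute, here is a scalar sketch in portable C of sign- and zero-extending a two-element vector of 32-bit integers to 64-bit elements. The function names are hypothetical and the arrays only stand in for the vector modes handled by the patch.

```c
#include <stdint.h>

/* Scalar model of sign-extending two 32-bit elements to 64 bits.
   Hypothetical name, not taken from the patch.  */
void
sign_extend_v2si (const int32_t in[2], int64_t out[2])
{
  for (int i = 0; i < 2; i++)
    out[i] = (int64_t) in[i];   /* upper bits copy the sign bit */
}

/* Scalar model of zero-extending two 32-bit elements to 64 bits.  */
void
zero_extend_v2si (const uint32_t in[2], uint64_t out[2])
{
  for (int i = 0; i < 2; i++)
    out[i] = (uint64_t) in[i];  /* upper bits are cleared */
}
```

The two variants differ only for inputs with the top bit set, which is why the patch needs a code iterator covering both extend codes.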




[Committed] IBM Z: Fix ICE in expand_perm_as_replicate

2024-06-10 Thread Andreas Krebbel
The current implementation assumes it is always invoked with register
operands. For memory operands, however, we even have an instruction
(vlrep). With the patch we try this first and only if it fails
force the input into a register and continue.

vec_splats generation fails for single element 128bit types which are
allowed for vec_splat. This is something to sort out with another
patch I guess.

Bootstrapped and regtested on IBM Z. Committed to mainline. Needs to
be committed to GCC 14 branch as well.

gcc/ChangeLog:

* config/s390/s390.cc (expand_perm_as_replicate): Handle memory
operands.
* config/s390/vx-builtins.md (vec_splats): Turn into 
parameterized expander.
(@vec_splats): New expander.

gcc/testsuite/ChangeLog:

* g++.dg/torture/vshuf-mem.C: New test.
---
 gcc/config/s390/s390.cc  | 17 +--
 gcc/config/s390/vx-builtins.md   |  2 +-
 gcc/testsuite/g++.dg/torture/vshuf-mem.C | 27 
 3 files changed, 43 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/vshuf-mem.C

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index fa517bd3e77..ec836ec3cd4 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -17940,7 +17940,8 @@ expand_perm_as_replicate (const struct 
expand_vec_perm_d )
   unsigned char i;
   unsigned char elem;
   rtx base = d.op0;
-  rtx insn;
+  rtx insn = NULL_RTX;
+
   /* Needed to silence maybe-uninitialized warning.  */
   gcc_assert (d.nelt > 0);
   elem = d.perm[0];
@@ -17954,7 +17955,19 @@ expand_perm_as_replicate (const struct 
expand_vec_perm_d )
  base = d.op1;
  elem -= d.nelt;
}
-  insn = maybe_gen_vec_splat (d.vmode, d.target, base, GEN_INT (elem));
+  if (memory_operand (base, d.vmode))
+   {
+ /* Try to use vector load and replicate.  */
+ rtx new_base = adjust_address (base, GET_MODE_INNER (d.vmode),
+elem * GET_MODE_UNIT_SIZE (d.vmode));
+ insn = maybe_gen_vec_splats (d.vmode, d.target, new_base);
+   }
+  if (insn == NULL_RTX)
+   {
+ base = force_reg (d.vmode, base);
+ insn = maybe_gen_vec_splat (d.vmode, d.target, base, GEN_INT (elem));
+   }
+
   if (insn == NULL_RTX)
return false;
   emit_insn (insn);
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 93c0d408a43..bb271c09a7d 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -145,7 +145,7 @@
   DONE;
 })
 
-(define_expand "vec_splats"
+(define_expand "@vec_splats"
   [(set (match_operand:VEC_HW  0 "register_operand" "")
(vec_duplicate:VEC_HW (match_operand: 1 "general_operand"  
"")))]
   "TARGET_VX")
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-mem.C 
b/gcc/testsuite/g++.dg/torture/vshuf-mem.C
new file mode 100644
index 000..5f1ebf65665
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/vshuf-mem.C
@@ -0,0 +1,27 @@
+// { dg-options "-std=c++11" }
+// { dg-do run }
+// { dg-additional-options "-march=z14" { target s390*-*-* } }
+
+/* This used to trigger (2024-05-28) the vectorize_vec_perm_const
+   backend hook to be invoked with a MEM source operand.  Extracted
+   from onnxruntime's mlas library.  */
+
+typedef float V4SF __attribute__((vector_size (16)));
+typedef int V4SI __attribute__((vector_size (16)));
+
+template < unsigned I0, unsigned I1, unsigned I2, unsigned I3 > V4SF
+MlasShuffleFloat32x4 (V4SF Vector)
+{
+  return __builtin_shuffle (Vector, Vector, V4SI{I0, I1, I2, I3});
+}
+
+int
+main ()
+{
+  V4SF f = { 1.0f, 2.0f, 3.0f, 4.0f };
+  if (MlasShuffleFloat32x4 < 1, 1, 1, 1 > (f)[3] != 2.0f)
+__builtin_abort ();
+  if (MlasShuffleFloat32x4 < 3, 3, 3, 3 > (f)[1] != 4.0f)
+__builtin_abort ();
+  return 0;
+}
-- 
2.45.1



Re: [PATCH] s390: Implement TARGET_NOCE_CONVERSION_PROFITABLE_P [PR109549]

2024-05-16 Thread Andreas Krebbel
On 5/8/24 10:06, Stefan Schulze Frielinghaus wrote:
> Consider a NOCE conversion as profitable if there is at least one
> conditional move.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (TARGET_NOCE_CONVERSION_PROFITABLE_P):
>   Define.
>   (s390_noce_conversion_profitable_p): Implement.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ccor.c: Order of loads are reversed, now, as a
>   consequence the condition has to be reversed.
> ---
>  Bootstrapped and regtested on s390.  Ok for mainline?
> 
>  gcc/config/s390/s390.cc  | 32 
>  gcc/testsuite/gcc.target/s390/ccor.c |  4 ++--
>  2 files changed, 34 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index bf46eab2d63..23b18b5c506 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -78,6 +78,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-pass.h"
>  #include "context.h"
>  #include "builtins.h"
> +#include "ifcvt.h"
>  #include "rtl-iter.h"
>  #include "intl.h"
>  #include "tm-constrs.h"
> @@ -18037,6 +18038,37 @@ s390_vectorize_vec_perm_const (machine_mode vmode, 
> machine_mode op_mode,
>return vectorize_vec_perm_const_1 (d);
>  }
>  
> +/* Consider a NOCE conversion as profitable if there is at least one
> +   conditional move.  */
> +
> +#undef TARGET_NOCE_CONVERSION_PROFITABLE_P
> +#define TARGET_NOCE_CONVERSION_PROFITABLE_P s390_noce_conversion_profitable_p
We collect these definitions at the very end of s390.cc

> +
> +static bool
> +s390_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info 
> *if_info)
> +{
> +  if (if_info->speed_p)
> +{
> +  for (rtx_insn *insn = seq; insn; insn = NEXT_INSN (insn))
> + {
> +   rtx set = single_set (insn);
> +   if (set == NULL)
> + continue;
> +   if (GET_CODE (SET_SRC (set)) != IF_THEN_ELSE)
> + continue;
> +   rtx src = SET_SRC (set);
> +   machine_mode mode = GET_MODE (src);
> +   if (GET_MODE_CLASS (mode) != MODE_INT
> +   && GET_MODE_CLASS (mode) != MODE_FLOAT)
> + continue;
> +   if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (Pmode))
I guess GET_MODE_SIZE(Pmode) should be UNITS_PER_WORD here to enable the
conversion also for 64 bit modes with -m31 -mzarch.

Ok with these changes. Thanks!

Andreas

> + continue;
> +   return true;
> + }
> +}
> +  return default_noce_conversion_profitable_p (seq, if_info);
> +}
> +
>  /* Initialize GCC target structure.  */
>  
>  #undef  TARGET_ASM_ALIGNED_HI_OP
> diff --git a/gcc/testsuite/gcc.target/s390/ccor.c 
> b/gcc/testsuite/gcc.target/s390/ccor.c
> index 31f30f60314..36a3c3a999a 100644
> --- a/gcc/testsuite/gcc.target/s390/ccor.c
> +++ b/gcc/testsuite/gcc.target/s390/ccor.c
> @@ -42,7 +42,7 @@ GENFUN1(2)
>  
>  GENFUN1(3)
>  
> -/* { dg-final { scan-assembler {locrno} } } */
> +/* { dg-final { scan-assembler {locro} } } */
>  
>  GENFUN2(0,1)
>  
> @@ -58,7 +58,7 @@ GENFUN2(0,3)
>  
>  GENFUN2(1,2)
>  
> -/* { dg-final { scan-assembler {locrnlh} } } */
> +/* { dg-final { scan-assembler {locrlh} } } */
>  
>  GENFUN2(1,3)
>  
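
As background for this thread: if-conversion without conditional execution (NOCE) turns a small branch into straight-line code, typically a conditional move. A source-level sketch of the two shapes follows. The compiler performs this transformation on RTL, and the function names here are purely illustrative.

```c
/* Branchy form: the assignment is guarded by a conditional jump.  */
long
select_branch (long cond, long a, long b)
{
  long r = b;
  if (cond)
    r = a;
  return r;
}

/* Branch-free form: the shape if-conversion aims for, where the
   guarded assignment becomes a single conditional select.  */
long
select_cmov (long cond, long a, long b)
{
  return cond ? a : b;
}
```

Both functions compute the same value, and an optimizing compiler may well emit identical code for them. The hook discussed above only influences whether the converted sequence is considered cheap enough to use.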



Re: [PATCH] s390: testsuite: Fix risbg-ll-2.c

2024-04-30 Thread Andreas Krebbel
On 4/30/24 10:34, Stefan Schulze Frielinghaus wrote:
> Starting with r14-2047-gd0e891406b16dc we see through subregs which
> means for f10 in risbg-ll-2.c we do not end up with rosbg_si_noshift but
> rather rosbg_di_noshift which materializes in slightly different start
> index.  This saves us an extend.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/risbg-ll-2.c: Fix start offset for rosbg of
>   f10.

Ok. Thanks!

Andreas

> ---
>  Ok for mainline?
> 
>  gcc/testsuite/gcc.target/s390/risbg-ll-2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/risbg-ll-2.c 
> b/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> index 8bf1a0ff88b..ca80602a83f 100644
> --- a/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> +++ b/gcc/testsuite/gcc.target/s390/risbg-ll-2.c
> @@ -113,7 +113,7 @@ i32 f9 (i64 v_x, i32 v_y)
>  // ands with incompatible masks.
>  i32 f10 (i64 v_x, i32 v_y)
>  {
> -  /* { dg-final { scan-assembler 
> "f10:\n\tsrlg\t%r2,%r2,48\n\trosbg\t%r2,%r3,32,39,0" { target { lp64 } } } } 
> */
> +  /* { dg-final { scan-assembler 
> "f10:\n\tsrlg\t%r2,%r2,48\n\trosbg\t%r2,%r3,0,39,0" { target { lp64 } } } } */
>/* { dg-final { scan-assembler 
> "f10:\n\tnilf\t%r4,4278190080\n\trosbg\t%r4,%r2,48,63,48" { target { ! lp64 } 
> } } } */
>i64 v_shr6 = ((ui64)v_x) >> 48;
>i32 v_conv = (ui32)v_shr6;



Re: [PATCH] s390: testsuite: Fix zero_bits_compound-1.c

2024-04-30 Thread Andreas Krebbel
On 4/30/24 10:32, Stefan Schulze Frielinghaus wrote:
> Starting with r12-2731-g96146e61cd7aee we do not generate code like
> 
> _5 = (unsigned int) c_2(D);
> i_6 = _5 << 8;
> _7 = _5 << 20;
> i_8 = i_6 | _7;
> 
> anymore but instead
> 
> _5 = (unsigned int) c_2(D);
> _3 = _5 * 1048832;
> 
> which leads finally to slightly different assembly code where we
> previously ended up for z10 or newer with
> 
> lr  %r1,%r2
> sll %r1,8
> rosbg   %r1,%r2,32,43,20
> llgfr   %r2,%r1
> br  %r14
> 
> and now
> 
> lr  %r1,%r2
> sll %r1,12
> ar  %r2,%r1
> risbg   %r2,%r2,35,128+55,8
> br  %r14
> 
> The zero-extend materializes via risbg for which the pattern contains an
> "and" which is why the test fails.  Thus, instead of scanning for RTL
> expressions rather scan for assembler instructions for s390.
> ---
>  Ok for mainline?

Ok. Thanks!

Andreas

> 
>  gcc/testsuite/gcc.dg/zero_bits_compound-1.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c 
> b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> index e71594911b2..f1e267e0fb0 100644
> --- a/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> +++ b/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
> @@ -39,4 +39,5 @@ unsigned long bar (unsigned char c)
>  }
>  
>  /* Check that no pattern containing an AND expression was used.  */
> -/* { dg-final { scan-assembler-not "\\(and:" } } */
> +/* { dg-final { scan-assembler-not "\\(and:" { target { ! { s390*-*-* } } } 
> } } */
> +/* { dg-final { scan-assembler-not "\\tng?rk?\\t" { target { s390*-*-* } } } 
> } */
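
The rewrite of the two shifts into one multiplication can be checked by hand: for an 8-bit `c`, the two shifted copies occupy bits 8..15 and 20..27 and never overlap, so OR equals ADD and the expression is `c * (2^8 + 2^20) = c * 1048832`. A small sketch in C with made-up function names:

```c
/* Form generated before r12-2731: two shifts combined with OR.  */
unsigned int
shift_or_form (unsigned char c)
{
  unsigned int x = c;
  return (x << 8) | (x << 20);
}

/* Form generated now: 1048832 = 256 + 1048576 = 2^8 + 2^20.
   Because c < 256 the shifted copies do not overlap, so the OR
   above is the same as an addition.  */
unsigned int
multiply_form (unsigned char c)
{
  unsigned int x = c;
  return x * 1048832u;
}
```

Both functions agree for every value of `c`, and the largest product (255 * 1048832) still fits in 32 bits.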



[Committed] s390x: Fix vec_xl/vec_xst type aliasing [PR114676]

2024-04-23 Thread Andreas Krebbel
The requirements of the vec_xl/vec_xst intrinsics wrt aliasing of the
pointer argument are not really documented.  As it turns out, users
are likely to get it wrong.  With this patch we let the pointer
argument alias everything in order to make it more robust for users.

Committed to mainline. Will be cherry-picked for stable branches as well.

gcc/ChangeLog:

PR target/114676
* config/s390/s390-c.cc (s390_expand_overloaded_builtin): Use a
MEM_REF with an addend of type ptr_type_node.

gcc/testsuite/ChangeLog:

PR target/114676
* gcc.target/s390/zvector/pr114676.c: New test.

Suggested-by: Jakub Jelinek 
---
 gcc/config/s390/s390-c.cc | 16 +---
 .../gcc.target/s390/zvector/pr114676.c| 19 +++
 2 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/pr114676.c

diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc
index 8d3d1a467a8..1bb6e810766 100644
--- a/gcc/config/s390/s390-c.cc
+++ b/gcc/config/s390/s390-c.cc
@@ -498,11 +498,11 @@ s390_expand_overloaded_builtin (location_t loc,
/* Build a vector type with the alignment of the source
   location in order to enable correct alignment hints to be
   generated for vl.  */
-   tree mem_type = build_aligned_type (return_type,
-   TYPE_ALIGN (TREE_TYPE (TREE_TYPE 
((*arglist)[1];
+   unsigned align = TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[1])));
+   tree mem_type = build_aligned_type (return_type, align);
return build2 (MEM_REF, mem_type,
   fold_build_pointer_plus ((*arglist)[1], (*arglist)[0]),
-  build_int_cst (TREE_TYPE ((*arglist)[1]), 0));
+  build_int_cst (ptr_type_node, 0));
   }
 case S390_OVERLOADED_BUILTIN_s390_vec_xst:
 case S390_OVERLOADED_BUILTIN_s390_vec_xstd2:
@@ -511,11 +511,13 @@ s390_expand_overloaded_builtin (location_t loc,
/* Build a vector type with the alignment of the target
   location in order to enable correct alignment hints to be
   generated for vst.  */
-   tree mem_type = build_aligned_type (TREE_TYPE((*arglist)[0]),
-   TYPE_ALIGN (TREE_TYPE (TREE_TYPE 
((*arglist)[2];
+   unsigned align = TYPE_ALIGN (TREE_TYPE (TREE_TYPE ((*arglist)[2])));
+   tree mem_type = build_aligned_type (TREE_TYPE ((*arglist)[0]), align);
return build2 (MODIFY_EXPR, mem_type,
-  build1 (INDIRECT_REF, mem_type,
-  fold_build_pointer_plus ((*arglist)[2], 
(*arglist)[1])),
+  build2 (MEM_REF, mem_type,
+  fold_build_pointer_plus ((*arglist)[2],
+   (*arglist)[1]),
+  build_int_cst (ptr_type_node, 0)),
   (*arglist)[0]);
   }
 case S390_OVERLOADED_BUILTIN_s390_vec_load_pair:
diff --git a/gcc/testsuite/gcc.target/s390/zvector/pr114676.c 
b/gcc/testsuite/gcc.target/s390/zvector/pr114676.c
new file mode 100644
index 000..bdc66b2920a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/pr114676.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z14 -mzvector" } */
+
+#include 
+
+void __attribute__((noinline)) foo (int *mem)
+{
+  vec_xst ((vector float){ 1.0f, 2.0f, 3.0f, 4.0f }, 0, (float*)mem);
+}
+
+int
+main ()
+{
+  int m[4] = { 0 };
+  foo (m);
+  if (m[3] == 0)
+__builtin_abort ();
+  return 0;
+}
-- 
2.44.0
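
The test above stores floats through a pointer that was derived from an `int *`. Under C's type-based aliasing rules that is exactly the kind of access an optimizer may otherwise assume cannot modify `m`. Outside the intrinsics, an aliasing-safe store in portable C is usually written with `memcpy`. The sketch below shows that alternative technique, not what the patch does internally, and the helper name is hypothetical.

```c
#include <string.h>

/* Store four floats into memory that is declared with another type.
   memcpy may legally write any object, so the compiler has to assume
   the destination changed.  Hypothetical helper, not part of the
   patch.  */
void
store_floats (void *mem, const float values[4])
{
  memcpy (mem, values, 4 * sizeof (float));
}
```

The patch achieves a comparable effect for vec_xl/vec_xst by giving the MEM_REF an offset operand of type `ptr_type_node`, so that the access may alias everything.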



[Committed] s390x: Do not default to -mvx for -mesa

2024-04-22 Thread Andreas Krebbel
We currently enable the vector extensions also for -march=z13 -m31
-mesa which is very wrong.

Not a regression but an obvious fix, so I've committed it to mainline
now. Will have to cherry-pick it for stable branches as well.

gcc/ChangeLog:

* config/s390/s390.cc (s390_option_override_internal): Check zarch
flag before enabling -mvx.
---
 gcc/config/s390/s390.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index bf46eab2d63..5968808fcb6 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -16104,7 +16104,7 @@ s390_option_override_internal (struct gcc_options *opts,
 }
   else
 {
-  if (TARGET_CPU_VX_P (opts))
+  if (TARGET_CPU_VX_P (opts) && TARGET_ZARCH_P (opts->x_target_flags))
/* Enable vector support if available and not explicitly disabled
   by user.  E.g. with -m31 -march=z13 -mzarch */
opts->x_target_flags |= MASK_OPT_VX;
-- 
2.44.0



Re: [PATCH] s390: testsuite: Fix forwprop-4{0,1}.c

2024-04-22 Thread Andreas Krebbel
Hi Stefan,

due to that missed optimization we currently generate silly code for these
two tests and should really fix this (after gcc entering stage1). So just
skipping it on s390x would definitely be the wrong choice I think.

I think our vectorize_vec_perm_const correctly rejects this permute pattern,
since it would require a load from literal pool. Question is why we do have
to rely on this being turned into a permute first to get rid of the obviously
redundant assignments. Shouldn't fwprop be able to handle this without it?

I'm ok with your patch, but please also open a BZ for it and perhaps mention
it in the comment close to the xfail.

Thanks!

Andreas

On 4/22/24 08:23, Stefan Schulze Frielinghaus wrote:
> The tests fail on s390 since can_vec_perm_const_p fails and therefore
> the bit insert/ref survive which r14-3381-g27de9aa152141e aims for.
> Strictly speaking, the tests only fail in case the target supports
> vectors, i.e., for targets prior z13 or in case of -mesa the emulated
> vector operations are optimized out.
> 
> Easiest would be to skip the entire test for s390.  Another solution
> would be to xfail in case of vector support hoping that eventually we
> end up with an xpass for a future machine generation or if gcc advances.
> That is implemented by this patch.  In order to do so I implemented a
> new target test s390_mvx which tests whether vector support is available
> or not.  Maybe this is already over-engineered for a simple test?  Any
> thoughts?
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c |  4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c |  4 ++--
>  gcc/testsuite/lib/target-supports.exp   | 14 ++
>  3 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
> index 7513497f552..b67e3e93a7f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
> @@ -10,5 +10,5 @@ vector int g(vector int a)
>return a;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" { xfail 
> s390_mvx } } } */
> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail 
> s390_mvx } } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
> index b1e75797a90..0f119675207 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
> @@ -11,6 +11,6 @@ vector int g(vector int a, int c)
>return a;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" { xfail 
> s390_mvx } } } */
> +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail 
> s390_mvx } } } */
>  /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "optimized" } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index edce672c0e2..5a692baa8ef 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -12380,6 +12380,20 @@ proc check_effective_target_profile_update_atomic {} 
> {
>  } "-fprofile-update=atomic -fprofile-generate"]
>  }
>  
> +# Return 1 if the target has a vector facility.
> +proc check_effective_target_s390_mvx { } {
> +if ![istarget s390*-*-*] then {
> + return 0;
> +}
> +
> +return [check_no_compiler_messages_nocache s390_mvx assembly {
> + #if !defined __VX__
> + #error no vector facility.
> + #endif
> + int dummy;
> +} [current_compiler_flags]]
> +}
> +
>  # Return 1 if vector (va - vector add) instructions are understood by
>  # the assembler and can be executed.  This also covers checking for
>  # the VX kernel feature.  A kernel without that feature does not



Re: [PATCH] s390: testsuite: Remove xfail for vpopct{b,h}

2024-04-22 Thread Andreas Krebbel
On 4/22/24 08:01, Stefan Schulze Frielinghaus wrote:
> Starting with r14-9316-g7890836de20912 patterns for vpopct{b,h} are also
> detected.  Thus, remove xfails.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vxe/popcount-1.c: Remove xfail.

Ok. Thanks!

Andreas

> ---
>  Ok for mainline?
> 
>  gcc/testsuite/gcc.target/s390/vxe/popcount-1.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c 
> b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> index 9ea835a1cf0..25ef354f963 100644
> --- a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> +++ b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
> @@ -21,7 +21,7 @@ vpopctb (uv16qi a)
>  
>return r;
>  }
> -/* { dg-final { scan-assembler "vpopctb\t%v24,%v24" { xfail *-*-* } } } */
> +/* { dg-final { scan-assembler "vpopctb\t%v24,%v24" } } */
>  
>  uv8hi __attribute__((noinline))
>  vpopcth (uv8hi a)
> @@ -34,7 +34,7 @@ vpopcth (uv8hi a)
>  
>return r;
>  }
> -/* { dg-final { scan-assembler "vpopcth\t%v24,%v24" { xfail *-*-* } } } */
> +/* { dg-final { scan-assembler "vpopcth\t%v24,%v24" } } */
>  
>  uv4si __attribute__((noinline))
>  vpopctf (uv4si a)



Re: [PATCH] s390: avoid peeking eof after __vector

2024-04-17 Thread Andreas Krebbel
On 4/17/24 03:52, Jiufu Guo wrote:
> 
> Hi,
> 
> I would like to ping this patch.
> 
> 
> Jeff (Jiufu Guo)
> 
> Jiufu Guo  writes:
> 
>> Hi,
>>
>> As with PR101168, this patch is needed for s390 to
>> avoid peeking eof after the vector keyword.
>> A similar test case is also ok for s390.
>>
>> Is this ok for trunk?
>>
>> Jeff (Jiufu Guo)
>>
>>  PR target/95782
>>
>> gcc/ChangeLog:
>>
>>  * config/s390/s390-c.cc (s390_macro_to_expand): Avoid empty identifier.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * g++.target/s390/pr95782.C: New test.

Sorry for the delay. This is ok. Thanks!

Andreas

>>
>> ---
>>  gcc/config/s390/s390-c.cc   | 4 +++-
>>  gcc/testsuite/g++.target/s390/pr95782.C | 5 +
>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/g++.target/s390/pr95782.C
>>
>> diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc
>> index 8d3d1a467a8..45f164d978b 100644
>> --- a/gcc/config/s390/s390-c.cc
>> +++ b/gcc/config/s390/s390-c.cc
>> @@ -275,7 +275,9 @@ s390_macro_to_expand (cpp_reader *pfile, const cpp_token 
>> *tok)
>>/* __vector long __bool a; */
>>if (ident == C_CPP_HASHNODE (__bool_keyword))
>>  expand_bool_p = true;
>> -  else
>> +
>> +  /* If there are more tokens to check.  */
>> +  else if (ident)
>>  {
>>/* Triggered with: __vector long long __bool a; */
>>do
>> diff --git a/gcc/testsuite/g++.target/s390/pr95782.C 
>> b/gcc/testsuite/g++.target/s390/pr95782.C
>> new file mode 100644
>> index 000..daf887fc6fe
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.target/s390/pr95782.C
>> @@ -0,0 +1,5 @@
>> +// { dg-do compile }
>> +// { dg-options "-march=z14 -mzvector" }
>> +
>> +using vdbl =  __vector double;
>> +#define BREAK 1



Re: [PATCH] s390: testsuite: Xfail range-sincos.c and vrp-float-abs-1.c

2024-04-12 Thread Andreas Krebbel
On 4/12/24 10:16, Stefan Schulze Frielinghaus wrote:
> As mentioned in PR114678 those failures will be fixed by
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648303.html
> For GCC 14 just xfail them which should be reverted once the patch is
> applied.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/range-sincos.c: Xfail for s390.
>   * gcc.dg/tree-ssa/vrp-float-abs-1.c: Ditto.
> ---
>  Ok for mainline?

Ok, thanks!

Andreas

> 
>  gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c| 2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c
> index 337f9cda02f..35b38c3c914 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/range-sincos.c
> @@ -40,4 +40,4 @@ stool (double x)
>  link_error ();
>  }
>  
> -// { dg-final { scan-tree-dump-not "link_error" "evrp" { target { { 
> *-*-linux* } && { glibc } } } } }
> +// { dg-final { scan-tree-dump-not "link_error" "evrp" { target { { 
> *-*-linux* } && { glibc } } xfail s390*-*-* } } } xfail: PR114678
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c
> index 4b7b75833e0..a814a973963 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-abs-1.c
> @@ -14,4 +14,4 @@ foo (double x, double y)
>  }
>  }
>  
> -// { dg-final { scan-tree-dump-not "link_error" "evrp" } }
> +// { dg-final { scan-tree-dump-not "link_error" "evrp" { xfail s390*-*-* } } 
> } xfail: PR114678



Re: [PATCH v2] s390x: Optimize vector permute with constant indexes

2024-04-09 Thread Andreas Krebbel
On 4/9/24 16:31, Juergen Christ wrote:
> Loop vectorizer can generate vector permutes with constant indexes
> where all indexes are equal.  Optimize this case to use vector
> replicate instead of vector permute.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (expand_perm_as_replicate): Implement.
>   (vectorize_vec_perm_const_1): Call new function.
>   * config/s390/vx-builtins.md (vec_splat): Change to...
>   (@vec_splat): ...this.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-expand-replicate.c: New test.
> 
> Bootstrapped and regtested on s390x.  Ok for trunk?

Does this also work when using the vec_perm intrinsic or would we need to
define a matching RTX for that?

Ok. Thanks!

Andreas
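
The idea behind the optimization can be modelled in scalar C: a permute whose selector indexes are all equal just replicates one element of one of the two inputs. The sketch below is a simplified model with hypothetical names. It is not the backend code, which works on RTL and `expand_vec_perm_d`.

```c
#include <stdbool.h>

/* Scalar model of the check: if all NELT selector indexes are equal,
   the permute is a replicate.  Indexes 0..nelt-1 select from op0,
   nelt..2*nelt-1 select from op1.  Returns false when the selector
   is not a replicate, so the caller can try another strategy.
   All names are hypothetical.  */
bool
perm_as_replicate (const int *op0, const int *op1,
                   const unsigned char *perm, unsigned nelt, int *target)
{
  if (nelt == 0)
    return false;

  unsigned elem = perm[0];
  for (unsigned i = 1; i < nelt; i++)
    if (perm[i] != elem)
      return false;

  const int *base = op0;
  if (elem >= nelt)
    {
      /* The element comes from the second input.  */
      base = op1;
      elem -= nelt;
    }
  if (elem >= nelt)
    return false;   /* selector out of range */

  for (unsigned i = 0; i < nelt; i++)
    target[i] = base[elem];
  return true;
}
```

For two four-element inputs, a selector of `{ 5, 5, 5, 5 }` replicates element 1 of the second input, while `{ 0, 1, 0, 0 }` is rejected and left to the general permute expansion.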



Re: [PATCH] s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605]

2024-04-08 Thread Andreas Krebbel
On 4/8/24 13:43, Ilya Leoshkevich wrote:
> On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote:
>> Hi!
>>
>> The following testcase is miscompiled, because we have initially
>> a movti which loads the 0x3f803f80ULL TImode constant
>> from constant pool.  Later on we split it into a pair of DImode
>> loads.  Now, for the first load (why just that?, though not stage4
>> material) we trigger the peephole2 which uses
>> s390_const_int_pool_entry_p.
>> That function doesn't check at all the constant pool mode though,
>> sees
>> the constant pool at that address has a CONST_INT value and just
>> assumes
>> that is the value to return, which is especially wrong for big-
>> endian,
>> if it is a DImode load from offset 0, it should be loading 0 rather
>> than
>> 0x3f803f80ULL.
>> The following patch adds checks if we are extracting a MODE_INT mode,
>> if the constant pool has MODE_INT mode as well, punts if constant
>> pool
>> has smaller mode size than the extraction one (then it would be UB),
>> if it has the same mode as before keeps using what it did before,
>> if constant pool has a larger mode than the one being extracted, uses
>> simplify_subreg.  I'd have used avoid_constant_pool_reference
>> instead which can handle also offsets into the constant pool
>> constants,
>> but it can't handle UNSPEC_LTREF.
>>
>> Another thing is that once that is fixed, we ICE when we extract
>> constant
>> like 0, ior insn predicate require non-0 constant.  So, the patch
>> also
>> fixes the peephole2 so that if either 32-bit half is zero, it uses a
>> mere
>> load of the constant into register rather than a pair of such load
>> and ior.
>>
>> Bootstrapped/regtested on s390x-linux, ok for trunk?
> 
> Hi Jakub, thanks for the patch, it looks good to me.
> Since I'm not a maintainer, we need to wait for Andreas' opinion.

Ok. Thank you very much Jakub for fixing this!

Andreas

> 
>>
>> 2024-04-06  Jakub Jelinek  
>>
>>  PR target/114605
>>  * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt
>>  if mem doesn't have MODE_INT mode, or pool constant doesn't
>>  have MODE_INT mode, or if pool constant mode is smaller than
>>  mem mode.  If mem mode is different from pool constant mode,
>>  try to simplify subreg.  If that doesn't work, punt, if it
>>  does, use the simplified constant instead of the constant
>> pool
>>  constant.
>>  * config/s390/s390.md (movdi from const pool peephole): If
>>  either low or high 32-bit part is zero, just emit move insn
>>  instead of move + ior.
>>
>>  * gcc.dg/pr114605.c: New test.
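
To see why the pool constant cannot simply be reused for a narrower load, consider a scalar model of a 128-bit constant laid out big-endian. A 64-bit load at offset 0 sees the high half, which is zero for a small constant. The struct, the helper name, and the sample value used in the usage note are all hypothetical and only illustrate the layout argument from the description above.

```c
#include <stdint.h>

/* Model of a 128-bit pool constant stored big-endian as two 64-bit
   halves: bytes 0..7 hold the high half, bytes 8..15 the low half.  */
struct ti_const
{
  uint64_t high;
  uint64_t low;
};

/* Value seen by a 64-bit load at BYTE_OFFSET (0 or 8) on a
   big-endian target.  Hypothetical helper for illustration.  */
uint64_t
di_load_from_ti (struct ti_const c, unsigned byte_offset)
{
  return byte_offset == 0 ? c.high : c.low;
}
```

For a constant whose value fits in the low 64 bits, for example `{ 0, 0x1234 }`, the load at offset 0 yields 0 and only the load at offset 8 yields the constant itself.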



Re: [PATCH] libsanitizer: Do not mention MSan and DFSan in an error message

2024-04-04 Thread Andreas Krebbel
On 4/4/24 14:22, Jakub Jelinek wrote:
> On Thu, Apr 04, 2024 at 02:19:08PM +0200, Andreas Krebbel wrote:
>> On 4/4/24 13:38, Ilya Leoshkevich wrote:
>>> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
>>>
>>>
>>> libsanitizer/ChangeLog:
>>>
>>> * sanitizer_common/sanitizer_linux_s390.cpp (AvoidCVE_2016_2143):
>>> Do not mention MSan and DFSan, which are not supported by GCC.
>>
>> Ok, Thanks!
> 
> This then needs to be added to libsanitizer/LOCAL_PATCHES , otherwise
> it will disappear on the next merge from upstream.
> 
> Though, I must say I'm not entirely convinced the change is worth the
> hassle on every libsanitizer merge.

You are right. We will leave the message as is.

Thanks!

Andreas

> 
>>> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp 
>>> b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
>>> index 74db831b0aa..65ba825fa97 100644
>>> --- a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
>>> +++ b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
>>> @@ -212,7 +212,7 @@ void AvoidCVE_2016_2143() {
>>>  return;
>>>Report(
>>>  "ERROR: Your kernel seems to be vulnerable to CVE-2016-2143.  Using 
>>> ASan,\n"
>>> -"MSan, TSan, DFSan or LSan with such kernel can and will crash your\n"
>>> +"TSan or LSan with such kernel can and will crash your\n"
>>>  "machine, or worse.\n"
>>>  "\n"
>>>  "If you are certain your kernel is not vulnerable (you have compiled 
>>> it\n"
> 
>   Jakub
> 



Re: [PATCH] libsanitizer: Do not mention MSan and DFSan in an error message

2024-04-04 Thread Andreas Krebbel
On 4/4/24 13:38, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> libsanitizer/ChangeLog:
> 
>   * sanitizer_common/sanitizer_linux_s390.cpp (AvoidCVE_2016_2143):
>   Do not mention MSan and DFSan, which are not supported by GCC.

Ok, Thanks!

Andreas

> ---
>  libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp 
> b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
> index 74db831b0aa..65ba825fa97 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_linux_s390.cpp
> @@ -212,7 +212,7 @@ void AvoidCVE_2016_2143() {
>  return;
>Report(
>  "ERROR: Your kernel seems to be vulnerable to CVE-2016-2143.  Using 
> ASan,\n"
> -"MSan, TSan, DFSan or LSan with such kernel can and will crash your\n"
> +"TSan or LSan with such kernel can and will crash your\n"
>  "machine, or worse.\n"
>  "\n"
>  "If you are certain your kernel is not vulnerable (you have compiled 
> it\n"



Re: [PATCH] s390: testsuite: Fix backprop-6.c

2024-03-22 Thread Andreas Krebbel
On 3/22/24 10:49, Stefan Schulze Frielinghaus wrote:
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/backprop-6.c: On s390 we also have a copysign
>   optab for long double.  Thus, scan 3 instead of 2 times for it.
> ---
>  OK for mainline?

Ok. Thanks!

Andreas



Re: [PATCH] s390: testsuite: Fix abs-4.c

2024-03-21 Thread Andreas Krebbel
On 3/21/24 15:41, Stefan Schulze Frielinghaus wrote:
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/abs-4.c: On s390 we also have a copysign optab
>   for long double.  Thus, scan 3 instead of 2 times for it.
> ---
>  Ok for mainline?

Ok. Thanks!

Andreas


[Committed] IBM Z: Fix -munaligned-symbols

2024-03-14 Thread Andreas Krebbel
With this fix we make sure that only symbols with a natural alignment
smaller than 2 are considered misaligned with
-munaligned-symbols. The background is that -munaligned-symbols is only
supposed to affect symbols whose natural alignment wouldn't be enough
to fulfill our ABI requirement of having all symbols at even
addresses, because these are the only cases where we differ from other
architectures.

This fixes the unaligned-1 testcase, no regressions. Committed to mainline.
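
As a rough illustration of the check described above, here is a minimal
standalone C model. The struct, enum and function names are made up for
this sketch and are not GCC internals; alignment and size are given in
bits, as with DECL_ALIGN and TYPE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the decl properties the check looks at.
   Alignment and size are in bits.  */
struct sym_model
{
  bool user_align;       /* an explicit aligned attribute was given */
  unsigned align_bits;   /* models DECL_ALIGN */
  unsigned size_bits;    /* models TYPE_SIZE, 0 if unknown (e.g. void) */
  bool binds_locally;    /* models decl_binds_to_current_def_p */
};

enum align_flag { ALIGN_OK, NOTALIGN2, NOTALIGN4 };

/* Model of the decision: which "not aligned" flag a symbol would get.  */
static enum align_flag
classify (const struct sym_model *s, bool unaligned_symbols_p)
{
  assert (s != NULL);

  if ((s->user_align && s->align_bits % 16 != 0)
      || (unaligned_symbols_p
          && !s->binds_locally
          && (s->user_align
              ? s->align_bits % 16 != 0
              : s->size_bits < 16)))
    return NOTALIGN2;
  else if (s->align_bits % 32 != 0)
    return NOTALIGN4;
  return ALIGN_OK;
}
```

In this model an external unsigned char is only treated as potentially
misaligned when the option is given, while an external int is never
affected by the option.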

gcc/ChangeLog:

* config/s390/s390.cc (s390_encode_section_info): Adjust the check
for misaligned symbols.
* config/s390/s390.opt: Improve documentation.

gcc/testsuite/ChangeLog:

* gcc.target/s390/aligned-1.c: Add weak and void variables
incorporating the cases from unaligned-2.c.
* gcc.target/s390/unaligned-1.c: Likewise.
* gcc.target/s390/unaligned-2.c: Removed.
---
 gcc/config/s390/s390.cc |  15 ++-
 gcc/config/s390/s390.opt|   7 +-
 gcc/testsuite/gcc.target/s390/aligned-1.c   | 101 +--
 gcc/testsuite/gcc.target/s390/unaligned-1.c | 103 ++--
 gcc/testsuite/gcc.target/s390/unaligned-2.c |  16 ---
 5 files changed, 201 insertions(+), 41 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/s390/unaligned-2.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index e63965578f1..372a2324403 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -13802,10 +13802,19 @@ s390_encode_section_info (tree decl, rtx rtl, int 
first)
 that can go wrong (i.e. no FUNC_DECLs).
 All symbols without an explicit alignment are assumed to be 2
 byte aligned as mandated by our ABI.  This behavior can be
-overridden for external symbols with the -munaligned-symbols
-switch.  */
+overridden for external and weak symbols with the
+-munaligned-symbols switch.
+For all external symbols without explicit alignment
+DECL_ALIGN is already trimmed down to 8, however for weak
+symbols this does not happen.  These cases are catched by the
+type size check.  */
+  const_tree size = TYPE_SIZE (TREE_TYPE (decl));
+  unsigned HOST_WIDE_INT size_num = (tree_fits_uhwi_p (size)
+? tree_to_uhwi (size) : 0);
   if ((DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16)
- || (s390_unaligned_symbols_p && !decl_binds_to_current_def_p (decl)))
+ || (s390_unaligned_symbols_p
+ && !decl_binds_to_current_def_p (decl)
+ && (DECL_USER_ALIGN (decl) ? DECL_ALIGN (decl) % 16 : size_num < 
16)))
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (DECL_ALIGN (decl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt
index 901ae4beb01..a5b5aa95a12 100644
--- a/gcc/config/s390/s390.opt
+++ b/gcc/config/s390/s390.opt
@@ -332,7 +332,8 @@ Store all argument registers on the stack.
 
 munaligned-symbols
 Target Var(s390_unaligned_symbols_p) Init(0)
-Assume external symbols to be potentially unaligned.  By default all
-symbols without explicit alignment are assumed to reside on a 2 byte
-boundary as mandated by the IBM Z ABI.
+Assume external symbols, whose natural alignment would be 1, to be
+potentially unaligned.  By default all symbols without explicit
+alignment are assumed to reside on a 2 byte boundary as mandated by
+the IBM Z ABI.
 
diff --git a/gcc/testsuite/gcc.target/s390/aligned-1.c 
b/gcc/testsuite/gcc.target/s390/aligned-1.c
index 2dc99cf66bd..3f5a2611ef1 100644
--- a/gcc/testsuite/gcc.target/s390/aligned-1.c
+++ b/gcc/testsuite/gcc.target/s390/aligned-1.c
@@ -1,20 +1,103 @@
-/* Even symbols without explicite alignment are assumed to reside on a
+/* Even symbols without explicit alignment are assumed to reside on a
2 byte boundary, as mandated by the IBM Z ELF ABI, and therefore
can be accessed using the larl instruction.  */
 
 /* { dg-do compile } */
 /* { dg-options "-O3 -march=z900 -fno-section-anchors" } */
 
-extern unsigned char extern_implicitly_aligned;
-extern unsigned char extern_explicitly_aligned __attribute__((aligned(2)));
-unsigned char aligned;
+extern unsigned char extern_char;
+extern unsigned char extern_explicitly_aligned_char 
__attribute__((aligned(2)));
+extern unsigned char extern_explicitly_unaligned_char 
__attribute__((aligned(1)));
+extern unsigned char __attribute__((weak)) extern_weak_char;
+extern unsigned char extern_explicitly_aligned_weak_char 
__attribute__((weak,aligned(2)));
+extern unsigned char extern_explicitly_unaligned_weak_char 
__attribute__((weak,aligned(1)));
 
-unsigned char
+unsigned char normal_char;
+unsigned char explicitly_unaligned_char __attribute__((aligned(1)));
+unsigned char __attribute__((weak)) weak_char = 0;
+unsigned char explicitly_aligned_weak_char __attribute__((weak,aligned(2)));
+unsigned 

Re: [PATCH] s390: Deprecate some vector builtins

2024-03-11 Thread Andreas Krebbel
On 3/1/24 16:57, Stefan Schulze Frielinghaus wrote:
> According to IBM Open XL C/C++ for z/OS version 1.1 builtins
> 
> - vec_permi
> - vec_ctd
> - vec_ctsl
> - vec_ctul
> - vec_ld2f
> - vec_st2f
> 
> are deprecated.  Also deprecate helper builtins vec_ctd_s64 and
> vec_ctd_u64.
> 
> Furthermore, the overloads of vec_insert which make use of a bool vector
> are deprecated, too.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtins.def (vec_permi): Deprecate.
>   (vec_ctd): Deprecate.
>   (vec_ctd_s64): Deprecate.
>   (vec_ctd_u64): Deprecate.
>   (vec_ctsl): Deprecate.
>   (vec_ctul): Deprecate.
>   (vec_ld2f): Deprecate.
>   (vec_st2f): Deprecate.
>   (vec_insert): Deprecate overloads with bool vectors.

Ok. Thanks!

Andreas


Re: [PATCH] s390: Streamline vector builtins with LLVM

2024-03-11 Thread Andreas Krebbel
On 3/1/24 10:29, Stefan Schulze Frielinghaus wrote:
> Similar to s390_lcbb, s390_vll, s390_vstl, et al., make use of a
> signed vector type for vlbb.  Furthermore, a const void pointer and an
> integer for the mask seem more common.
> 
> For s390_vfi(s,d)b make use of integers for masks, too.
> 
> Use unsigned integers for all s390_vlbr/vstbr variants.
> 
> Make use of type UV16QI for the length operand of s390_vstrs(,z)(h,f).
> 
> Following the Principles of Operation, change from signed to unsigned
> type for s390_va(c,cc,ccc)q and s390_vs(,c,bc)biq and s390_vmslg.
> 
> Make use of scalar type UINT128 instead of UV16QI for s390_vgfm(,a)g,
> and s390_vsumq(f,g).
> 
> Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtin-types.def: Update to reflect latest
>   changes.
>   * config/s390/s390-builtins.def: Streamline vector builtins with
>   LLVM.

Ok. Thanks!

Andreas



Re: [PATCH] s390: Fix test vector/long-double-to-i64.c

2024-03-11 Thread Andreas Krebbel
On 2/29/24 13:15, Stefan Schulze Frielinghaus wrote:
> Starting with r14-8319-g86de9b66480b71 fwprop improved so that vpdi is
> no longer required.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/long-double-to-i64.c: Fix scan
>   assembler directive.

Should we perhaps rather turn the scan-assembler directives into something
which checks for the absence of vpdi then, in order to get notified once
this really useful optimization breaks?

Andreas

> ---
>  .../gcc.target/s390/vector/long-double-to-i64.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c 
> b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> index 2dbbb5d1c03..ed89878e6ee 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> @@ -1,19 +1,24 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  /* { dg-do run { target { s390_z14_hw } } } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { lp64 } } } } */
> +
>  #include 
>  #include 
>  
> +/*
> +** long_double_to_i64:
> +**   ld  %f0,0\(%r2\)
> +**   ld  %f2,8\(%r2\)
> +**   cgxbr   %r2,5,%f0
> +**   br  %r14
> +*/
>  __attribute__ ((noipa)) static int64_t
>  long_double_to_i64 (long double x)
>  {
>return x;
>  }
>  
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } 
> */
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } 
> */
> -/* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */
> -
>  int
>  main (void)
>  {



Re: [PATCH] s390: Fix tests rosbg_si_srl and rxsbg_si_srl

2024-03-11 Thread Andreas Krebbel
On 2/29/24 13:14, Stefan Schulze Frielinghaus wrote:
> Starting with r14-2047-gd0e891406b16dc two SI mode tests are optimized
> into DI mode.  Thus, the scan-assembler directives fail.  For example
> RTL expression
> 
> (ior:SI (subreg:SI (lshiftrt:DI (reg:DI 69)
> (const_int 2 [0x2])) 4)
> (subreg:SI (reg:DI 68) 4))
> 
> is optimized into
> 
> (ior:DI (lshiftrt:DI (reg:DI 69)
> (const_int 2 [0x2]))
> (reg:DI 68))
> 
> Fixed by moving operands into memory in order to enforce SI mode
> computation.
> 
> Furthermore, in r9-6056-g290dfd9bc7bea2 the starting bit position of the
> scan-assembler directive for rosbg was incorrectly set to 32 which
> actually should be 32+SHIFT_AMOUNT, i.e., in this particular case 34.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/md/rXsbg_mode_sXl.c: Fix tests rosbg_si_srl
>   and rxsbg_si_srl.

Ok, thanks!

Andreas
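
To make the bit-position argument from the patch description concrete,
here is a small C sketch. It assumes the big-endian bit numbering used
by the Principles of Operation (bit 0 is the most significant bit of the
64-bit register); the helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting bits START..END (inclusive) of a 64-bit register in
   big-endian bit numbering, i.e. bit 0 is the MSB and bit 63 the LSB.
   Hypothetical helper; requires start <= end <= 63.  */
static uint64_t
bit_range_mask (unsigned start, unsigned end)
{
  assert (start <= end && end <= 63);

  uint64_t from_start = ~UINT64_C (0) >> start;      /* bits start..63 */
  uint64_t upto_end = ~UINT64_C (0) << (63 - end);   /* bits 0..end */
  return from_start & upto_end;
}

/* The SI mode computation the tests are after: a | (b >> shift).  */
static uint32_t
or_srl_si (uint32_t a, uint32_t b, unsigned shift)
{
  return a | (b >> shift);
}
```

For a shift amount of 2, the shifted 32-bit value can only occupy bits
34..63 of the register, which is the 32+SHIFT_AMOUNT start position
mentioned in the description.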



Re: [PATCH] s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs

2024-03-11 Thread Andreas Krebbel
On 2/29/24 13:13, Stefan Schulze Frielinghaus wrote:
> RTX X need not be a SYMBOL_REF and may e.g. be an
> UNSPEC_GOTENT, for which SYMBOL_FLAG_NOTALIGN2_P fails.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_secondary_reload): Guard
>   SYMBOL_FLAG_NOTALIGN2_P.
Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 943fc9bfd72..12430d77786 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -4778,7 +4778,7 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t 
> rclass_i,
>if (in_p
> && s390_loadrelative_operand_p (x, &symref, &offset)
> && mode == Pmode
> -   && !SYMBOL_FLAG_NOTALIGN2_P (symref)
> +   && (!SYMBOL_REF_P (symref) || !SYMBOL_FLAG_NOTALIGN2_P (symref))
> && (offset & 1) == 1)
>   sri->icode = ((mode == DImode) ? CODE_FOR_reloaddi_larl_odd_addend_z10
> : CODE_FOR_reloadsi_larl_odd_addend_z10);



Re: [PATCH] IBM Z: Preserve exceptions in autovec-*-signaling-eq.c tests

2024-02-19 Thread Andreas Krebbel
On 2/19/24 13:39, Ilya Leoshkevich wrote:
> DSE, DCE, and other passes are removing redundant signaling comparisons
> from these tests, but the whole point is to check that GCC knows how to
> emit them.  Use -fno-delete-dead-exceptions to prevent that.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/s390/zvector/autovec-double-signaling-eq.c:
>   Preserve exceptions.
> * gcc.target/s390/zvector/autovec-float-signaling-eq.c:
>   Likewise.

Ok. Thanks!

Andreas

> ---
>  .../gcc.target/s390/zvector/autovec-double-signaling-eq.c   | 2 +-
>  .../gcc.target/s390/zvector/autovec-float-signaling-eq.c| 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git 
> a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c 
> b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c
> index 3645d3cc393..b23568e06b4 100644
> --- a/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c
> +++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions 
> -fnon-call-exceptions" } */
> +/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions 
> -fnon-call-exceptions -fno-delete-dead-exceptions" } */
>  
>  #include "autovec.h"
>  
> diff --git 
> a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c 
> b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c
> index d98aa0c494e..cd25d10c577 100644
> --- a/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c
> +++ b/gcc/testsuite/gcc.target/s390/zvector/autovec-float-signaling-eq.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions 
> -fnon-call-exceptions" } */
> +/* { dg-options "-O3 -march=z14 -mzvector -mzarch -fexceptions 
> -fnon-call-exceptions -fno-delete-dead-exceptions" } */
>  
>  #include "autovec.h"
>  



Re: [PATCH] [s390] target/112280 - properly guard permute query

2024-01-11 Thread Andreas Krebbel
On 1/11/24 14:58, Richard Biener wrote:
> The following adds guards avoiding code generation to
> expand_perm_as_a_vlbr_vstbr_candidate when d.testing_p.
> 
> Built and tested on the testcase in the PR.
> 
> OK to push as obvious?  Otherwise please pick up, test and push.

Ok to commit now. Thanks for the fix!

I've just started a regression test and will take care of any fallout.

Bye,

Andreas
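
For readers unfamiliar with the testing_p convention: a permutation
expander may be called as a pure query (d.testing_p set), in which case
it must only answer whether the permutation is supported and must not
emit instructions. A minimal sketch of that pattern in plain C follows;
the struct, the counter and the function are hypothetical stand-ins,
not the s390 backend's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_VECT_LEN 4

/* Hypothetical stand-in for expand_vec_perm_d.  */
struct perm_model
{
  unsigned char perm[MODEL_VECT_LEN];
  bool testing_p;   /* true: only answer "can this be done?" */
};

/* Models the side effect of emit_insn.  */
static int insns_emitted;

static bool
expand_as_reverse (const struct perm_model *d)
{
  static const unsigned char reverse[MODEL_VECT_LEN] = { 3, 2, 1, 0 };

  assert (d != NULL);

  if (memcmp (d->perm, reverse, MODEL_VECT_LEN) != 0)
    return false;

  /* Generate code only for a real expansion, never for a query.  */
  if (!d->testing_p)
    insns_emitted++;
  return true;
}
```

The return value is the same in both modes; only the side effect is
guarded.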

> 
> Thanks,
> Richard.
> 
>   PR target/112280
>   * config/s390/s390.cc (expand_perm_as_a_vlbr_vstbr_candidate):
>   Do not generate code when d.testing_p.
> ---
>  gcc/config/s390/s390.cc | 36 
>  1 file changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 748ad9cd932..f182c26e78b 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -17867,33 +17867,45 @@ expand_perm_as_a_vlbr_vstbr_candidate (const struct 
> expand_vec_perm_d &d)
>  
>if (memcmp (d.perm, perm[0], MAX_VECT_LEN) == 0)
>  {
> -  rtx target = gen_rtx_SUBREG (V8HImode, d.target, 0);
> -  rtx op0 = gen_rtx_SUBREG (V8HImode, d.op0, 0);
> -  emit_insn (gen_bswapv8hi (target, op0));
> +  if (!d.testing_p)
> + {
> +   rtx target = gen_rtx_SUBREG (V8HImode, d.target, 0);
> +   rtx op0 = gen_rtx_SUBREG (V8HImode, d.op0, 0);
> +   emit_insn (gen_bswapv8hi (target, op0));
> + }
>return true;
>  }
>  
>if (memcmp (d.perm, perm[1], MAX_VECT_LEN) == 0)
>  {
> -  rtx target = gen_rtx_SUBREG (V4SImode, d.target, 0);
> -  rtx op0 = gen_rtx_SUBREG (V4SImode, d.op0, 0);
> -  emit_insn (gen_bswapv4si (target, op0));
> +  if (!d.testing_p)
> + {
> +   rtx target = gen_rtx_SUBREG (V4SImode, d.target, 0);
> +   rtx op0 = gen_rtx_SUBREG (V4SImode, d.op0, 0);
> +   emit_insn (gen_bswapv4si (target, op0));
> + }
>return true;
>  }
>  
>if (memcmp (d.perm, perm[2], MAX_VECT_LEN) == 0)
>  {
> -  rtx target = gen_rtx_SUBREG (V2DImode, d.target, 0);
> -  rtx op0 = gen_rtx_SUBREG (V2DImode, d.op0, 0);
> -  emit_insn (gen_bswapv2di (target, op0));
> +  if (!d.testing_p)
> + {
> +   rtx target = gen_rtx_SUBREG (V2DImode, d.target, 0);
> +   rtx op0 = gen_rtx_SUBREG (V2DImode, d.op0, 0);
> +   emit_insn (gen_bswapv2di (target, op0));
> + }
>return true;
>  }
>  
>if (memcmp (d.perm, perm[3], MAX_VECT_LEN) == 0)
>  {
> -  rtx target = gen_rtx_SUBREG (V1TImode, d.target, 0);
> -  rtx op0 = gen_rtx_SUBREG (V1TImode, d.op0, 0);
> -  emit_insn (gen_bswapv1ti (target, op0));
> +  if (!d.testing_p)
> + {
> +   rtx target = gen_rtx_SUBREG (V1TImode, d.target, 0);
> +   rtx op0 = gen_rtx_SUBREG (V1TImode, d.op0, 0);
> +   emit_insn (gen_bswapv1ti (target, op0));
> + }
>return true;
>  }
>  



[Committed] IBM Z: Cover weak symbols with -munaligned-symbols

2023-12-18 Thread Andreas Krebbel
With the recently introduced -munaligned-symbols option, byte-sized
variables which are resolved externally are considered to be
potentially misaligned.
However, this should also be applied to symbols which resolve
locally if they are weak. Done with this patch.

Committed to mainline.

gcc/ChangeLog:

* config/s390/s390.cc (s390_encode_section_info): Replace
SYMBOL_REF_LOCAL_P with decl_binds_to_current_def_p.

gcc/testsuite/ChangeLog:

* gcc.target/s390/unaligned-2.c: New test.
---
 gcc/config/s390/s390.cc |  6 ++
 gcc/testsuite/gcc.target/s390/unaligned-2.c | 16 
 2 files changed, 18 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/unaligned-2.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 044de874590..a5c36b43972 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -13802,10 +13802,8 @@ s390_encode_section_info (tree decl, rtx rtl, int 
first)
 byte aligned as mandated by our ABI.  This behavior can be
 overridden for external symbols with the -munaligned-symbols
 switch.  */
-  if (DECL_ALIGN (decl) % 16
- && (DECL_USER_ALIGN (decl)
- || (!SYMBOL_REF_LOCAL_P (XEXP (rtl, 0))
- && s390_unaligned_symbols_p)))
+  if ((DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16)
+ || (s390_unaligned_symbols_p && !decl_binds_to_current_def_p (decl)))
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (DECL_ALIGN (decl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
diff --git a/gcc/testsuite/gcc.target/s390/unaligned-2.c 
b/gcc/testsuite/gcc.target/s390/unaligned-2.c
new file mode 100644
index 000..c1ece6d5935
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/unaligned-2.c
@@ -0,0 +1,16 @@
+/* weak symbols might get overridden in another module by symbols
+   which are not aligned on a 2-byte boundary.  Although this violates
+   the zABI we try to handle this gracefully by not using larl on
+   these symbols if -munaligned-symbols has been specified.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z900 -fno-section-anchors -munaligned-symbols" } */
+unsigned char __attribute__((weak)) weaksym = 0;
+
+unsigned char
+foo ()
+{
+  return weaksym;
+}
+
+/* { dg-final { scan-assembler-times "larl\t%r\[0-9\]*,weaksym\n" 0 } } */
-- 
2.43.0



Re: [PATCH] s390: Fix expansion of vec_step

2023-12-06 Thread Andreas Krebbel
On 12/4/23 11:14, Stefan Schulze Frielinghaus wrote:
> Add missing "s390" while expanding vec_step to __builtin_s390_vec_step.
> 
> gcc/ChangeLog:
> 
>   * config/s390/vecintrin.h (vec_step): Expand vec_step to
>   __builtin_s390_vec_step.

Ok, Thanks!

Andreas



Re: [PATCH] testsuite: Fix up gcc.target/s390/pr96127.c test for modern C [PR96127]

2023-12-06 Thread Andreas Krebbel
On 12/3/23 19:36, Jakub Jelinek wrote:
> Hi!
> 
> I've noticed this test regressed on s390x-linux with the addition of the
> switch to modern C patchset.  Haven't tried to reproduce the ICE, but as it
> was a backend ICE and FE after warning used to add such casts before (now
> errors), I think this ought to keep the testcase testing what was intended
> before.
> 
> Ok for trunk?

Ok, thanks!

Andreas



Re: [PATCH] s390x: Fix PR112753

2023-11-30 Thread Andreas Krebbel
On 11/30/23 16:45, Juergen Christ wrote:
> Commit 466b100e5fee808d77598e0f294654deec281150 introduced a bug in
> s390_md_asm_adjust if vector extensions are not available.  Fix the control
> flow of this function to not adjust long double values.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_md_asm_adjust): Fix.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/pr112753.c: New test.
> 
> Bootstrapped and tested on s390x.

Committed to mainline with a slightly more verbose changelog which also refers 
to the BZ. Thanks!

Andreas

> 
> Signed-off-by: Juergen Christ 
> ---
>  gcc/config/s390/s390.cc  | 4 
>  gcc/testsuite/gcc.target/s390/pr112753.c | 8 
>  2 files changed, 12 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/s390/pr112753.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 29b5dc979207..3a4d2d346f0c 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -17604,6 +17604,10 @@ s390_md_asm_adjust (vec<rtx> &outputs, vec<rtx> 
> &inputs,
>outputs[i] = fprx2;
>  }
>  
> +  if (!TARGET_VXE)
> +/* Long doubles are stored in FPR pairs - nothing left to do.  */
> +return after_md_seq;
> +
>for (unsigned i = 0; i < ninputs; i++)
>  {
>if (GET_MODE (inputs[i]) != TFmode)
> diff --git a/gcc/testsuite/gcc.target/s390/pr112753.c 
> b/gcc/testsuite/gcc.target/s390/pr112753.c
> new file mode 100644
> index ..7183b3f12bed
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/pr112753.c
> @@ -0,0 +1,8 @@
> +/* This caused an ICE on s390x due to a bug in s390_md_asm_adjust when no
> +   vector extension is available.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=zEC12" } */
> +
> +long double strtold_l_internal___x;
> +void strtold_l_internal() { __asm__("" : : 
> "fm"(strtold_l_internal___x)); }



Re: [PATCH] s390: Fix builtin-classify-type-1.c on s390 too [PR112725]

2023-11-30 Thread Andreas Krebbel
On 11/30/23 17:34, Jakub Jelinek wrote:
> On Wed, Nov 29, 2023 at 07:27:20PM +0100, Jakub Jelinek wrote:
>> Given that the s390 backend defines pretty much the same target hook
>> as rs6000, I believe it suffers (at least when using -mvx?) the same
>> problem as rs6000, though admittedly this is so far completely
>> untested.
>>
>> Ok for trunk if it passes bootstrap/regtest there?
> 
> Now successfully bootstrapped/regtested on s390x-linux and indeed it
> fixes
> -FAIL: c-c++-common/builtin-classify-type-1.c  -Wc++-compat  (test for excess 
> errors)
> -UNRESOLVED: c-c++-common/builtin-classify-type-1.c  -Wc++-compat  
> compilation failed to produce executable
> there as well.
> 
>> 2023-11-29  Jakub Jelinek  
>>
>>  PR target/112725
>>  * config/s390/s390.cc (s390_invalid_arg_for_unprototyped_fn): Return
>>  NULL for __builtin_classify_type calls with vector arguments.

Ok. Thank you, Jakub!

Andreas



Re: [PATCH] s390: Add missing builtin type

2023-11-27 Thread Andreas Krebbel
On 11/27/23 13:38, Stefan Schulze Frielinghaus wrote:
> One builtin type slipped through the cracks of the last commits.
> 
> Bootstrapped on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtin-types.def (BT_FN_UV8HI_UV8HI_UINT):
>   Add missing builtin type.

Ok

Andreas



Re: [PATCH] s390: Fixup builtins vec_rli and verll

2023-11-27 Thread Andreas Krebbel
On 11/27/23 10:53, Stefan Schulze Frielinghaus wrote:
> Commit 248df13b966f46649e16dc3c8c92b263790ef503 restricted the rotate
> count to immediates.  Although the documentation of vec_rli (Vector
> Element Rotate Left Immediate) can be read as if it were restricted to
> immediates, this is not the case.  Thus, revert this commit.
> 
> In order to finally allow register operands, the rotate count must be of
> type unsigned char since the expander expects it to be of mode QI.  The
> previously used type unsigned integer worked out for immediates since
> those are of VOID mode anyway.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtin-types.def: Remove types.
>   * config/s390/s390-builtins.def (O_U64): Remove 64-bit literal support.
>   Don't restrict s390_vec_rli and s390_verll[bhfg] to immediates.
>   * config/s390/s390.cc (s390_const_operand_ok): Remove 64-bit
>   literal support.

Ok, Thanks!

Andreas
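
As an illustration of what a non-immediate rotate count means for the
user, here is a portable C model of an element-wise rotate left on four
32-bit elements. This is only a sketch: the function name is made up and
is not the builtin, and it assumes that only the count modulo the
element width matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate each of the four 32-bit elements left by the same count.
   The count is an ordinary unsigned char variable, not a compile-time
   constant.  Assumption of this sketch: the count is taken modulo the
   element width.  */
static void
rotl_v4si_model (uint32_t v[4], unsigned char count)
{
  assert (v != NULL);

  unsigned n = count % 32u;
  for (int i = 0; i < 4; i++)
    /* Avoid the undefined shift by 32 when n is 0.  */
    v[i] = n ? (v[i] << n) | (v[i] >> (32u - n)) : v[i];
}
```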



Re: [PATCH] s390: Streamline NNPA builtins with their LLVM counterparts

2023-11-27 Thread Andreas Krebbel
Ok, thanks!

Andreas

On 11/27/23 10:12, Stefan Schulze Frielinghaus wrote:
> Ping.
> 
> On Thu, Nov 16, 2023 at 01:07:30PM +0100, Stefan Schulze Frielinghaus wrote:
>> For the opaque NNP-data type prefer unsigned over signed integer types.
>>
>> gcc/ChangeLog:
>>
>>  * config/s390/s390-builtin-types.def: Add/remove types.
>>  * config/s390/s390-builtins.def
>>  (s390_vclfnhs,s390_vclfnls,s390_vcrnfs,s390_vcfn,s390_vcnf):
>>  Replace type V8HI with UV8HI.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/s390/zvector/vec-nnpa-fp16-convert.c: Replace V8HI
>>  types with UV8HI.
>>  * gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c: Dito.
>>  * gcc.target/s390/zvector/vec_convert_from_fp16.c: Dito.
>>  * gcc.target/s390/zvector/vec_convert_to_fp16.c: Dito.
>>  * gcc.target/s390/zvector/vec_extend_to_fp32_hi.c: Dito.
>>  * gcc.target/s390/zvector/vec_extend_to_fp32_lo.c: Dito.
>>  * gcc.target/s390/zvector/vec_round_from_fp32.c: Dito.
>> ---
>>  gcc/config/s390/s390-builtin-types.def |  5 ++---
>>  gcc/config/s390/s390-builtins.def  | 10 +-
>>  .../gcc.target/s390/zvector/vec-nnpa-fp16-convert.c|  6 +++---
>>  .../gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c  |  2 +-
>>  .../gcc.target/s390/zvector/vec_convert_from_fp16.c|  4 ++--
>>  .../gcc.target/s390/zvector/vec_convert_to_fp16.c  |  4 ++--
>>  .../gcc.target/s390/zvector/vec_extend_to_fp32_hi.c|  2 +-
>>  .../gcc.target/s390/zvector/vec_extend_to_fp32_lo.c|  2 +-
>>  .../gcc.target/s390/zvector/vec_round_from_fp32.c  |  2 +-
>>  9 files changed, 18 insertions(+), 19 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390-builtin-types.def 
>> b/gcc/config/s390/s390-builtin-types.def
>> index 3d8b30cdcc8..0bf759bd77a 100644
>> --- a/gcc/config/s390/s390-builtin-types.def
>> +++ b/gcc/config/s390/s390-builtin-types.def
>> @@ -265,9 +265,9 @@ DEF_FN_TYPE_2 (BT_FN_V2DI_V2DF_V2DF, BT_V2DI, BT_V2DF, 
>> BT_V2DF)
>>  DEF_FN_TYPE_2 (BT_FN_V2DI_V2DI_V2DI, BT_V2DI, BT_V2DI, BT_V2DI)
>>  DEF_FN_TYPE_2 (BT_FN_V2DI_V4SI_V4SI, BT_V2DI, BT_V4SI, BT_V4SI)
>>  DEF_FN_TYPE_2 (BT_FN_V4SF_FLT_INT, BT_V4SF, BT_FLT, BT_INT)
>> +DEF_FN_TYPE_2 (BT_FN_V4SF_UV8HI_UINT, BT_V4SF, BT_UV8HI, BT_UINT)
>>  DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_UCHAR, BT_V4SF, BT_V4SF, BT_UCHAR)
>>  DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_V4SF, BT_V4SF, BT_V4SF, BT_V4SF)
>> -DEF_FN_TYPE_2 (BT_FN_V4SF_V8HI_UINT, BT_V4SF, BT_V8HI, BT_UINT)
>>  DEF_FN_TYPE_2 (BT_FN_V4SI_BV4SI_V4SI, BT_V4SI, BT_BV4SI, BT_V4SI)
>>  DEF_FN_TYPE_2 (BT_FN_V4SI_INT_VOIDCONSTPTR, BT_V4SI, BT_INT, 
>> BT_VOIDCONSTPTR)
>>  DEF_FN_TYPE_2 (BT_FN_V4SI_UV4SI_UV4SI, BT_V4SI, BT_UV4SI, BT_UV4SI)
>> @@ -279,7 +279,6 @@ DEF_FN_TYPE_2 (BT_FN_V8HI_BV8HI_V8HI, BT_V8HI, BT_BV8HI, 
>> BT_V8HI)
>>  DEF_FN_TYPE_2 (BT_FN_V8HI_UV8HI_UV8HI, BT_V8HI, BT_UV8HI, BT_UV8HI)
>>  DEF_FN_TYPE_2 (BT_FN_V8HI_V16QI_V16QI, BT_V8HI, BT_V16QI, BT_V16QI)
>>  DEF_FN_TYPE_2 (BT_FN_V8HI_V4SI_V4SI, BT_V8HI, BT_V4SI, BT_V4SI)
>> -DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_UINT, BT_V8HI, BT_V8HI, BT_UINT)
>>  DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI)
>>  DEF_FN_TYPE_2 (BT_FN_VOID_UINT64PTR_UINT64, BT_VOID, BT_UINT64PTR, 
>> BT_UINT64)
>>  DEF_FN_TYPE_2 (BT_FN_VOID_V2DF_FLTPTR, BT_VOID, BT_V2DF, BT_FLTPTR)
>> @@ -317,6 +316,7 @@ DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_USHORT_INT, BT_UV8HI, 
>> BT_UV8HI, BT_USHORT, BT_I
>>  DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_INT, BT_UV8HI, BT_UV8HI, BT_UV8HI, 
>> BT_INT)
>>  DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_INTPTR, BT_UV8HI, BT_UV8HI, 
>> BT_UV8HI, BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_UV8HI_UV8HI_UV8HI_UV8HI, BT_UV8HI, BT_UV8HI, BT_UV8HI, 
>> BT_UV8HI)
>> +DEF_FN_TYPE_3 (BT_FN_UV8HI_V4SF_V4SF_UINT, BT_UV8HI, BT_V4SF, BT_V4SF, 
>> BT_UINT)
>>  DEF_FN_TYPE_3 (BT_FN_V16QI_UV16QI_UV16QI_INTPTR, BT_V16QI, BT_UV16QI, 
>> BT_UV16QI, BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_V16QI_V16QI_V16QI_INTPTR, BT_V16QI, BT_V16QI, 
>> BT_V16QI, BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_V16QI_V16QI_V16QI_V16QI, BT_V16QI, BT_V16QI, BT_V16QI, 
>> BT_V16QI)
>> @@ -347,7 +347,6 @@ DEF_FN_TYPE_3 (BT_FN_V4SI_V4SI_V4SI_V4SI, BT_V4SI, 
>> BT_V4SI, BT_V4SI, BT_V4SI)
>>  DEF_FN_TYPE_3 (BT_FN_V4SI_V8HI_V8HI_V4SI, BT_V4SI, BT_V8HI, BT_V8HI, 
>> BT_V4SI)
>>  DEF_FN_TYPE_3 (BT_FN_V8HI_UV8HI_UV8HI_INTPTR, BT_V8HI, BT_UV8HI, BT_UV8HI, 
>> BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_V8HI_V16QI_V16QI_V8HI, BT_V8HI, BT_V16QI, BT_V16QI, 
>> BT_V8HI)
>> -DEF_FN_TYPE_3 (BT_FN_V8HI_V4SF_V4SF_UINT, BT_V8HI, BT_V4SF, BT_V4SF, 
>> BT_UINT)
>>  DEF_FN_TYPE_3 (BT_FN_V8HI_V4SI_V4SI_INTPTR, BT_V8HI, BT_V4SI, BT_V4SI, 
>> BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_V8HI_V8HI_V8HI_INTPTR, BT_V8HI, BT_V8HI, BT_V8HI, 
>> BT_INTPTR)
>>  DEF_FN_TYPE_3 (BT_FN_V8HI_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI, 
>> BT_V8HI)
>> diff --git a/gcc/config/s390/s390-builtins.def 
>> b/gcc/config/s390/s390-builtins.def
>> index 964d86c74a0..f331eba100a 100644
>> --- a/gcc/config/s390/s390-builtins.def
>> +++ 

Re: [PATCH] s390: Fix constraint for insn *cmphi_ccu

2023-11-27 Thread Andreas Krebbel
Ok, thanks!

Andreas

On 11/27/23 10:12, Stefan Schulze Frielinghaus wrote:
> Ping.
> 
> On Wed, Oct 25, 2023 at 11:27:33AM +0200, Stefan Schulze Frielinghaus wrote:
>> Currently for an unsigned 16-bit comparison between memory and an
>> immediate where the high bit is set, a clc is emitted.  This is because
>> the constant is created for mode HI and therefore sign extended.  This
>> means constraint D does not hold anymore.  Since the mode already
>> restricts the immediate to 16 bit, it is enough to make use of
>> constraint n and chop off the high bits in the output template.
>>
>> Bootstrapped and regtested on s390.  Ok for mainline?
>>
>> gcc/ChangeLog:
>>
>>  * config/s390/s390.md (*cmphi_ccu): For immediate operand 1 make
>>  use of constraint n instead of D and chop off high bits in the
>>  output template.
>> ---
>>  gcc/config/s390/s390.md | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
>> index 3f29ba21442..777a20f8e77 100644
>> --- a/gcc/config/s390/s390.md
>> +++ b/gcc/config/s390/s390.md
>> @@ -1355,13 +1355,13 @@
>>  (define_insn "*cmphi_ccu"
>>[(set (reg CC_REGNUM)
>>  (compare (match_operand:HI 0 "nonimmediate_operand" "d,d,Q,Q,BQ")
>> - (match_operand:HI 1 "general_operand"  "Q,S,D,BQ,Q")))]
>> + (match_operand:HI 1 "general_operand"  "Q,S,n,BQ,Q")))]
>>"s390_match_ccmode (insn, CCUmode)
>> && !register_operand (operands[1], HImode)"
>>"@
>> clm\t%0,3,%S1
>> clmy\t%0,3,%S1
>> -   clhhsi\t%0,%1
>> +   clhhsi\t%0,%x1
>> #
>> #"
>>[(set_attr "op_type" "RS,RSY,SIL,SS,SS")
>> -- 
>> 2.41.0
>>
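
The sign-extension issue from the description can be reproduced in
plain C. The sketch below is only a model: the helper names are made up,
and it assumes that constraint D accepts only values in the range
0..65535.

```c
#include <stdint.h>

/* An HImode constant is kept sign-extended, so an immediate with the
   high bit set (e.g. 0x8000) is represented as a negative value.
   The narrowing conversion to int16_t is implementation-defined in
   ISO C; GCC wraps modulo 2^16.  */
static int64_t
himode_const (uint16_t imm)
{
  return (int64_t) (int16_t) imm;
}

/* Would a constraint limited to 0..65535 accept this value?  */
static int
fits_unsigned_16 (int64_t value)
{
  return value >= 0 && value <= 65535;
}

/* What chopping off the high bits in the output template amounts to:
   keep only the low 16 bits.  */
static unsigned
low16 (int64_t value)
{
  return (unsigned) (value & 0xffff);
}
```

With these helpers, 0x8000 no longer fits the unsigned 16-bit range once
it has been sign-extended, but masking recovers the original immediate.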



Re: [PATCH] s390: Fix builtins floating-point convert to/from fixed

2023-11-27 Thread Andreas Krebbel
Ok, thanks!

Andreas

On 11/27/23 10:11, Stefan Schulze Frielinghaus wrote:
> Ping.
> 
> On Tue, Nov 14, 2023 at 04:19:59PM +0100, Stefan Schulze Frielinghaus wrote:
>> Remove flags for non-existing operands 2 and 3.
>>
>> Bootstrapped on s390.  Ok for mainline?
>>
>> gcc/ChangeLog:
>>
>>  * config/s390/s390-builtins.def
>>  (s390_vcefb,s390_vcdgb,s390_vcelfb,s390_vcdlgb,s390_vcfeb,s390_vcgdb,
>>  s390_vclfeb,s390_vclgdb): Remove flags for non-existing operands
>>  2 and 3.
>> ---
>>  gcc/config/s390/s390-builtins.def | 16 
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390-builtins.def 
>> b/gcc/config/s390/s390-builtins.def
>> index 964d86c74a0..5bcf0d16ba3 100644
>> --- a/gcc/config/s390/s390-builtins.def
>> +++ b/gcc/config/s390/s390-builtins.def
>> @@ -2840,10 +2840,10 @@ OB_DEF (s390_vec_double,
>> s390_vec_double_s64,s390_vec_double_u64,
>>  OB_DEF_VAR (s390_vec_double_s64,s390_vcdgb, 0,  
>> 0,  BT_OV_V2DF_V2DI)
>>  OB_DEF_VAR (s390_vec_double_u64,s390_vcdlgb,0,  
>> 0,  BT_OV_V2DF_UV2DI)
>>  
>> -B_DEF  (s390_vcefb, floatv4siv4sf2, 0,  
>> B_VXE2, O2_U4 | O3_U3,  BT_FN_V4SF_V4SI)
>> -B_DEF  (s390_vcdgb, floatv2div2df2, 0,  
>> B_VX,   O2_U4 | O3_U3,  BT_FN_V2DF_V2DI)
>> -B_DEF  (s390_vcelfb,floatunsv4siv4sf2,  0,  
>> B_VXE2, O2_U4 | O3_U3,  BT_FN_V4SF_UV4SI)
>> -B_DEF  (s390_vcdlgb,floatunsv2div2df2,  0,  
>> B_VX,   O2_U4 | O3_U3,  BT_FN_V2DF_UV2DI)
>> +B_DEF  (s390_vcefb, floatv4siv4sf2, 0,  
>> B_VXE2, 0,  BT_FN_V4SF_V4SI)
>> +B_DEF  (s390_vcdgb, floatv2div2df2, 0,  
>> B_VX,   0,  BT_FN_V2DF_V2DI)
>> +B_DEF  (s390_vcelfb,floatunsv4siv4sf2,  0,  
>> B_VXE2, 0,  BT_FN_V4SF_UV4SI)
>> +B_DEF  (s390_vcdlgb,floatunsv2div2df2,  0,  
>> B_VX,   0,  BT_FN_V2DF_UV2DI)
>>  
>>  OB_DEF (s390_vec_signed,
>> s390_vec_signed_flt,s390_vec_signed_dbl,B_VX,   
>> BT_FN_OV4SI_OV4SI)
>>  OB_DEF_VAR (s390_vec_signed_flt,s390_vcfeb, B_VXE2, 
>> 0,  BT_OV_V4SI_V4SF)
>> @@ -2853,10 +2853,10 @@ OB_DEF (s390_vec_unsigned,  
>> s390_vec_unsigned_flt,s390_vec_unsigned_
>>  OB_DEF_VAR (s390_vec_unsigned_flt,  s390_vclfeb,B_VXE2, 
>> 0,  BT_OV_UV4SI_V4SF)
>>  OB_DEF_VAR (s390_vec_unsigned_dbl,  s390_vclgdb,0,  
>> 0,  BT_OV_UV2DI_V2DF)
>>  
>> -B_DEF  (s390_vcfeb, fix_truncv4sfv4si2, 0,  
>> B_VXE2, O2_U4 | O3_U3,  BT_FN_V4SI_V4SF)
>> -B_DEF  (s390_vcgdb, fix_truncv2dfv2di2, 0,  
>> B_VX,   O2_U4 | O3_U3,  BT_FN_V2DI_V2DF)
>> -B_DEF  (s390_vclfeb,fixuns_truncv4sfv4si2, 0,   
>> B_VXE2, O2_U4 | O3_U3,  BT_FN_UV4SI_V4SF)
>> -B_DEF  (s390_vclgdb,fixuns_truncv2dfv2di2, 0,   
>> B_VX,   O2_U4 | O3_U3,  BT_FN_UV2DI_V2DF)
>> +B_DEF  (s390_vcfeb, fix_truncv4sfv4si2, 0,  
>> B_VXE2, 0,  BT_FN_V4SI_V4SF)
>> +B_DEF  (s390_vcgdb, fix_truncv2dfv2di2, 0,  
>> B_VX,   0,  BT_FN_V2DI_V2DF)
>> +B_DEF  (s390_vclfeb,fixuns_truncv4sfv4si2, 0,   
>> B_VXE2, 0,  BT_FN_UV4SI_V4SF)
>> +B_DEF  (s390_vclgdb,fixuns_truncv2dfv2di2, 0,   
>> B_VX,   0,  BT_FN_UV2DI_V2DF)
>>  
>>  B_DEF  (s390_vfisb, vec_fpintv4sf,  0,  
>> B_VXE,  O2_U4 | O3_U3,  BT_FN_V4SF_V4SF_UCHAR_UCHAR)
>>  B_DEF  (s390_vfidb, vec_fpintv2df,  0,  
>> B_VX,   O2_U4 | O3_U3,  BT_FN_V2DF_V2DF_UCHAR_UCHAR)
>> -- 
>> 2.41.0
>>



Re: [PATCH] s390: implement flags output

2023-11-23 Thread Andreas Krebbel
On 11/15/23 14:15, Juergen Christ wrote:
> Implement flags output for inline assemblies.  Only use one output constraint
> that captures the whole condition code.  No breakout into different condition
> codes is allowed.  Also, only one condition code variable is allowed.
> 
> Add further logic to canonicalize various cases where we combine different
> cases of possible condition codes.
> 
> Bootstrapped and tested on s390.  OK for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-c.cc (s390_cpu_cpp_builtins): Define
>   __GCC_ASM_FLAG_OUTPUTS__.
>   * config/s390/s390.cc (s390_canonicalize_comparison): More
>   UNSPEC_CC_TO_INT cases.
>   (s390_md_asm_adjust): Implement flags output.
>   * config/s390/s390.md (ccstore4): Allow mask operands.
>   * doc/extend.texi: Document flags output.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ccor.c: New test.
> 
> Signed-off-by: Juergen Christ 

Committed to mainline with a few minor formatting fixes. Thanks!

Andreas



Re: [PATCH] s390: split int128 load

2023-11-23 Thread Andreas Krebbel
On 11/15/23 14:15, Juergen Christ wrote:
> Issue two loads when using GPRs instead of one load-multiple.
> 
> Bootstrapped and tested on s390.  OK for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Split TImode loads.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/int128load.c: New test.
> 
> Signed-off-by: Juergen Christ 

Since the testcase is using __int128 it needs to be gated like this to prevent
it from being tested with -m31:

/* { dg-do compile { target int128 } } */

Committed to mainline with that change. Thanks!

Andreas
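
For reference, a minimal sketch of what a testcase gated this way could look
like. The function name and body are illustrative only and are not taken from
the committed int128load.c; the __SIZEOF_INT128__ guard is an addition so that
the sketch also compiles on targets without __int128.

```c
/* { dg-do compile { target int128 } } */

#include <assert.h>
#include <stdint.h>

#ifdef __SIZEOF_INT128__
/* Hypothetical example: load a 128-bit value through a pointer and
   fold its two 64-bit halves together.  */
static uint64_t
fold_int128 (const unsigned __int128 *p)
{
  assert (p != 0);
  unsigned __int128 v = *p;
  return (uint64_t) (v >> 64) ^ (uint64_t) v;
}
#endif
```

With the dg-do selector in place the test is simply skipped where the int128
effective target is not available, as with -m31.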



Re: [PATCH] s390: Fix ICE in testcase pr89233

2023-11-23 Thread Andreas Krebbel
On 11/15/23 14:12, Juergen Christ wrote:
> When using GNU vector extensions, an access outside of the vector size
> caused an ICE on s390.  Fix this by aligning with the vec_extract
> builtin, i.e., computing constant index modulo number of lanes.
> 
> Fixes testcase gcc.target/s390/pr89233.c.
> 
> Bootstrapped and tested on s390.  OK for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/vector.md: (*vec_extract) Fix.

Committed to mainline. Thanks!

Andreas
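
To illustrate the "constant index modulo number of lanes" behaviour described
in the patch, here is a small portable model. It is only a sketch: NLANES and
extract_lane are made-up names, and the real fix lives in the vector.md
pattern, not in C code like this.

```c
#include <assert.h>
#include <stddef.h>

#define NLANES 4

/* Hypothetical helper: return lane IDX of V, wrapping an out-of-range
   index around instead of reading past the end of the vector.  */
static int
extract_lane (const int v[NLANES], size_t idx)
{
  assert (v != 0);
  return v[idx % NLANES];
}
```

With four lanes, index 5 therefore selects the same element as index 1.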



Re: [PATCH] s390: Fix generation of s390-gen-builtins.h

2023-11-15 Thread Andreas Krebbel
On 11/15/23 14:29, Stefan Schulze Frielinghaus wrote:
> By default the preprocessed output includes linemarkers.  This leads to
> an error if -pedantic is used as e.g. during bootstrap:
> 
> s390-gen-builtins.h:1:3: error: style of line directive is a GCC extension 
> [-Werror]
> 
> Fixed by omitting linemarkers while generating s390-gen-builtins.h.
> 
> gcc/ChangeLog:
> 
>   * config/s390/t-s390: Generate s390-gen-builtins.h without
>   linemarkers.

Ok, Thanks!

Andreas


> ---
>  gcc/config/s390/t-s390 | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/s390/t-s390 b/gcc/config/s390/t-s390
> index 4ab9718f6e2..2e884c367de 100644
> --- a/gcc/config/s390/t-s390
> +++ b/gcc/config/s390/t-s390
> @@ -33,4 +33,4 @@ s390-d.o: $(srcdir)/config/s390/s390-d.cc
>   $(POSTCOMPILE)
>  
>  s390-gen-builtins.h: $(srcdir)/config/s390/s390-builtins.h
> - $(COMPILER) -E $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > $@
> + $(COMPILER) -E -P $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > $@



Re: [PATCH] s390: Fix vec_scatter_element for vectors of floats

2023-11-14 Thread Andreas Krebbel
On 11/14/23 12:44, Stefan Schulze Frielinghaus wrote:
> The offset for vec_scatter_element of floats should be a vector of type
> UV4SI instead of V4SF.  Note, this is an incompatible change.
> 
> Bootstrapped on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtin-types.def: Add/remove types.
>   * config/s390/s390-builtins.def (s390_vec_scatter_element_flt):
>   The type for the offset should be UV4SI instead of V4SF.

Ok, Thanks!

Andreas

> ---
>  gcc/config/s390/s390-builtin-types.def | 2 +-
>  gcc/config/s390/s390-builtins.def  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/s390/s390-builtin-types.def 
> b/gcc/config/s390/s390-builtin-types.def
> index 3d8b30cdcc8..22ee348dbbb 100644
> --- a/gcc/config/s390/s390-builtin-types.def
> +++ b/gcc/config/s390/s390-builtin-types.def
> @@ -856,7 +856,7 @@ DEF_OV_TYPE (BT_OV_VOID_V2DI_LONG_LONGLONGPTR, BT_VOID, 
> BT_V2DI, BT_LONG, BT_LON
>  DEF_OV_TYPE (BT_OV_VOID_V2DI_UV2DI_LONGLONGPTR_ULONGLONG, BT_VOID, BT_V2DI, 
> BT_UV2DI, BT_LONGLONGPTR, BT_ULONGLONG)
>  DEF_OV_TYPE (BT_OV_VOID_V4SF_FLTPTR_UINT, BT_VOID, BT_V4SF, BT_FLTPTR, 
> BT_UINT)
>  DEF_OV_TYPE (BT_OV_VOID_V4SF_LONG_FLTPTR, BT_VOID, BT_V4SF, BT_LONG, 
> BT_FLTPTR)
> -DEF_OV_TYPE (BT_OV_VOID_V4SF_V4SF_FLTPTR_ULONGLONG, BT_VOID, BT_V4SF, 
> BT_V4SF, BT_FLTPTR, BT_ULONGLONG)
> +DEF_OV_TYPE (BT_OV_VOID_V4SF_UV4SI_FLTPTR_ULONGLONG, BT_VOID, BT_V4SF, 
> BT_UV4SI, BT_FLTPTR, BT_ULONGLONG)
>  DEF_OV_TYPE (BT_OV_VOID_V4SI_INTPTR_UINT, BT_VOID, BT_V4SI, BT_INTPTR, 
> BT_UINT)
>  DEF_OV_TYPE (BT_OV_VOID_V4SI_LONG_INTPTR, BT_VOID, BT_V4SI, BT_LONG, 
> BT_INTPTR)
>  DEF_OV_TYPE (BT_OV_VOID_V4SI_UV4SI_INTPTR_ULONGLONG, BT_VOID, BT_V4SI, 
> BT_UV4SI, BT_INTPTR, BT_ULONGLONG)
> diff --git a/gcc/config/s390/s390-builtins.def 
> b/gcc/config/s390/s390-builtins.def
> index 964d86c74a0..b59fa09fe07 100644
> --- a/gcc/config/s390/s390-builtins.def
> +++ b/gcc/config/s390/s390-builtins.def
> @@ -708,7 +708,7 @@ OB_DEF_VAR (s390_vec_scatter_element_u32,s390_vscef,  
>   0,
>  OB_DEF_VAR (s390_vec_scatter_element_s64,s390_vsceg,0,   
>O4_U1,  BT_OV_VOID_V2DI_UV2DI_LONGLONGPTR_ULONGLONG)
>  OB_DEF_VAR (s390_vec_scatter_element_b64,s390_vsceg,0,   
>O4_U1,  BT_OV_VOID_BV2DI_UV2DI_ULONGLONGPTR_ULONGLONG)
>  OB_DEF_VAR (s390_vec_scatter_element_u64,s390_vsceg,0,   
>O4_U1,  BT_OV_VOID_UV2DI_UV2DI_ULONGLONGPTR_ULONGLONG)
> -OB_DEF_VAR (s390_vec_scatter_element_flt,s390_vscef,B_VXE,   
>O4_U2,  BT_OV_VOID_V4SF_V4SF_FLTPTR_ULONGLONG)
> +OB_DEF_VAR (s390_vec_scatter_element_flt,s390_vscef,B_VXE,   
>O4_U2,  BT_OV_VOID_V4SF_UV4SI_FLTPTR_ULONGLONG)
>  OB_DEF_VAR (s390_vec_scatter_element_dbl,s390_vsceg,0,   
>O4_U1,  BT_OV_VOID_V2DF_UV2DI_DBLPTR_ULONGLONG)
>  
>  B_DEF  (s390_vscef, vec_scatter_elementv4si,0,   
>B_VX,   O4_U2,  
> BT_FN_VOID_UV4SI_UV4SI_UINTPTR_ULONGLONG)
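
A portable model of why the offset operand has to be an unsigned integer
vector. This is a sketch under the assumption that the builtin stores element
IDX of the data vector at BASE plus the byte offset found in element IDX of
the offset vector; scatter_element_flt is a made-up name, not the builtin
itself.

```c
#include <assert.h>
#include <string.h>

/* Sketch: store element IDX of VEC at BASE plus the byte offset
   OFFSETS[IDX].  The offsets are unsigned integers, not floats.  */
static void
scatter_element_flt (const float vec[4], const unsigned int offsets[4],
                     float *base, unsigned long long idx)
{
  assert (idx < 4);
  memcpy ((char *) base + offsets[idx], &vec[idx], sizeof (float));
}
```

Float offsets would make no sense as addresses here, which matches the change
from V4SF to UV4SI for the second argument.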



[Committed] IBM Z: Add GTY marker to builtin data structures

2023-11-14 Thread Andreas Krebbel
This adds GTY markers to s390_builtin_types, s390_builtin_fn_types,
and s390_builtin_decls. These were missing, causing problems in
particular when using builtins after including a precompiled header.

Unfortunately the declarations of these data structures use enum values
from s390-builtins.h.  This file however is not included everywhere
and is rather large.  In order to include it only for the purpose of
gtype-desc.cc we place a preprocessed copy of it in the build
directory and include only this.

This is going to be backported to GCC 12 and 13.

Bootstrapped and regression tested on IBM Z.

Committed to mainline.

gcc/ChangeLog:

* config.gcc: Add s390-gen-builtins.h to target_gtfiles.
* config/s390/s390-builtins.h (s390_builtin_types)
(s390_builtin_fn_types, s390_builtin_decls): Add GTY marker.
* config/s390/t-s390 (EXTRA_GTYPE_DEPS): Add s390-gen-builtins.h.
Add build rule for s390-gen-builtins.h.
---
 gcc/config.gcc  |  1 +
 gcc/config/s390/s390-builtins.h | 10 +-
 gcc/config/s390/t-s390  |  4 
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index ba6d63e33ac..c1460ca354e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -571,6 +571,7 @@ s390*-*-*)
d_target_objs="s390-d.o"
extra_options="${extra_options} fused-madd.opt"
extra_headers="s390intrin.h htmintrin.h htmxlintrin.h vecintrin.h"
+   target_gtfiles="./s390-gen-builtins.h"
;;
 # Note the 'l'; we need to be able to match e.g. "shle" or "shl".
 sh[123456789lbe]*-*-* | sh-*-*)
diff --git a/gcc/config/s390/s390-builtins.h b/gcc/config/s390/s390-builtins.h
index 45bba876828..84676fe5b3f 100644
--- a/gcc/config/s390/s390-builtins.h
+++ b/gcc/config/s390/s390-builtins.h
@@ -88,8 +88,8 @@ enum s390_builtin_ov_type_index
 
 #define MAX_OV_OPERANDS 6
 
-extern tree s390_builtin_types[BT_MAX];
-extern tree s390_builtin_fn_types[BT_FN_MAX];
+extern GTY(()) tree s390_builtin_types[BT_MAX];
+extern GTY(()) tree s390_builtin_fn_types[BT_FN_MAX];
 
   /* Builtins.  */
 
@@ -172,6 +172,6 @@ opflags_for_builtin (int fcode)
 return opflags_builtin[fcode];
 }
 
-extern tree s390_builtin_decls[S390_BUILTIN_MAX +
-  S390_OVERLOADED_BUILTIN_MAX +
-  S390_OVERLOADED_BUILTIN_VAR_MAX];
+extern GTY(()) tree s390_builtin_decls[S390_BUILTIN_MAX +
+  S390_OVERLOADED_BUILTIN_MAX +
+  S390_OVERLOADED_BUILTIN_VAR_MAX];
diff --git a/gcc/config/s390/t-s390 b/gcc/config/s390/t-s390
index 828818bed2d..4ab9718f6e2 100644
--- a/gcc/config/s390/t-s390
+++ b/gcc/config/s390/t-s390
@@ -19,6 +19,7 @@
 TM_H += $(srcdir)/config/s390/s390-builtins.def
 TM_H += $(srcdir)/config/s390/s390-builtin-types.def
 PASSES_EXTRA += $(srcdir)/config/s390/s390-passes.def
+EXTRA_GTYPE_DEPS += ./s390-gen-builtins.h
 
 s390-c.o: $(srcdir)/config/s390/s390-c.cc \
   $(srcdir)/config/s390/s390-protos.h $(CONFIG_H) $(SYSTEM_H) coretypes.h \
@@ -30,3 +31,6 @@ s390-c.o: $(srcdir)/config/s390/s390-c.cc \
 s390-d.o: $(srcdir)/config/s390/s390-d.cc
$(COMPILE) $<
$(POSTCOMPILE)
+
+s390-gen-builtins.h: $(srcdir)/config/s390/s390-builtins.h
+   $(COMPILER) -E $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< > $@
-- 
2.41.0



[Committed] IBM Z: Fix ICE with overloading and checking enabled

2023-11-14 Thread Andreas Krebbel
s390_resolve_overloaded_builtin, when called on a NON_DEPENDENT_EXPR,
ICEs when it uses the type of that expression, which ends up as
error_mark_node.

This particular instance of the problem does not occur anymore since
NON_DEPENDENT_EXPR has been removed.  Nevertheless that case needs to
be handled here.

Bootstrapped and regression tested on IBM Z.

Committed to mainline.

gcc/ChangeLog:

* config/s390/s390-c.cc (s390_fn_types_compatible): Add a check
for error_mark_node.

gcc/testsuite/ChangeLog:

* g++.target/s390/zvec-templ-1.C: New test.
---
 gcc/config/s390/s390-c.cc|  3 +++
 gcc/testsuite/g++.target/s390/zvec-templ-1.C | 24 
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/s390/zvec-templ-1.C

diff --git a/gcc/config/s390/s390-c.cc b/gcc/config/s390/s390-c.cc
index 269f4f8e978..fce569342f3 100644
--- a/gcc/config/s390/s390-c.cc
+++ b/gcc/config/s390/s390-c.cc
@@ -781,6 +781,9 @@ s390_fn_types_compatible (enum s390_builtin_ov_type_index 
typeindex,
   tree in_arg = (*arglist)[i];
   tree in_type = TREE_TYPE (in_arg);
 
+  if (in_type == error_mark_node)
+   goto mismatch;
+
   if (VECTOR_TYPE_P (b_arg_type))
{
  /* Vector types have to match precisely.  */
diff --git a/gcc/testsuite/g++.target/s390/zvec-templ-1.C 
b/gcc/testsuite/g++.target/s390/zvec-templ-1.C
new file mode 100644
index 000..07bb65f199b
--- /dev/null
+++ b/gcc/testsuite/g++.target/s390/zvec-templ-1.C
@@ -0,0 +1,24 @@
+// { dg-do compile }
+// { dg-options "-O0 -mzvector -march=arch14 -mzarch" }
+// { dg-bogus "internal compiler error" "ICE" { target s390*-*-* } 23 }
+// { dg-excess-errors "" }
+
+/* This used to ICE with checking enabled because
+   s390_resolve_overloaded_builtin gets called on NON_DEPENDENT_EXPR
+   arguments. We then try to determine the type of it, get an error
+   node and ICEd consequently when using this.
+
+   This particular instance of the problem disappeared when
+   NON_DEPENDENT_EXPRs got removed with:
+
+   commit dad311874ac3b3cf4eca1c04f67cae80c953f7b8
+   Author: Patrick Palka 
+   Date:   Fri Oct 20 10:45:00 2023 -0400
+
+c++: remove NON_DEPENDENT_EXPR, part 1
+
+   Nevertheless we should check for error mark nodes in that code.  */
+
+template  void foo() {
+  __builtin_s390_vec_perm( , , );
+}
-- 
2.41.0



Re: [PATCH] s390: Reduce number of patterns where the condition is false anyway

2023-11-09 Thread Andreas Krebbel
On 11/9/23 09:24, Stefan Schulze Frielinghaus wrote:
> For patterns which make use of two modes, do not build the cross product
> and then exclude illegal combinations via conditions but rather do not
> create those in the first place.  Here we are following the idea of the
> attribute TOINTVEC/tointvec and introduce TOINT/toint.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md (VX_CONV_INT): Remove iterator.
>   (gf): Add float mappings.
>   (TOINT, toint): New attribute.
>   (*fixuns_trunc2_z13):
>   Remove.
>   (*fixuns_trunc2_z13): Add.
>   (*fix_trunc2_bfp_z13):
>   Remove.
>   (*fix_trunc2_bfp_z13): Add.
>   (*floatuns2_z13): Remove.
>   (*floatuns2_z13): Add.
>   * config/s390/vector.md (VX_VEC_CONV_INT): Remove iterator.
>   (float2): Remove.
>   (float2): Add.
>   (floatuns2): Remove.
>   (floatuns2): Add.
>   (fix_trunc2):
>   Remove.
>   (fix_trunc2): Add.
>   (fixuns_trunc2):
>   Remove.
>   (fixuns_trunc2): Add.

Ok, thanks!

Andreas



Re: [PATCH 2/3] s390: Add expand_perm_reverse_elements

2023-11-09 Thread Andreas Krebbel
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote:
> Replace expand_perm_with_rot, expand_perm_with_vster, and
> expand_perm_with_vstbrq with a general implementation
> expand_perm_reverse_elements.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (expand_perm_with_rot): Remove.
>   (expand_perm_reverse_elements): New.
>   (expand_perm_with_vster): Remove.
>   (expand_perm_with_vstbrq): Remove.
>   (vectorize_vec_perm_const_1): Replace removed functions with new
>   one.

Ok, thanks!

Andreas



Re: [PATCH 3/3] s390: Revise vector reverse elements

2023-11-09 Thread Andreas Krebbel
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote:
> Replace UNSPEC_VEC_ELTSWAP with a vec_select implementation.
> 
> Furthermore, for a vector reverse elements operation between registers
> of mode V8HI perform three rotates instead of a vperm operation since
> the latter involves loading the permutation vector from the literal
> pool.
> 
> Prior to z15, instead of
>   larl + vl + vl + vperm
> prefer
>   vl + vpdi (+ verllg (+ verllf))
> for a load operation.
> 
> Likewise, prior to z15, instead of
>   larl + vl + vperm + vst
> prefer
>   vpdi (+ verllg (+ verllf)) + vst
> for a store operation.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Remove UNSPEC_VEC_ELTSWAP.
>   * config/s390/vector.md (eltswapv16qi): New expander.
>   (*eltswapv16qi): New insn and splitter.
>   (eltswapv8hi): New insn and splitter.
>   (eltswap): New insn and splitter for modes V_HW_4 as well
>   as V_HW_2.
>   * config/s390/vx-builtins.md (eltswap): Remove.
>   (*eltswapv16qi): Remove.
>   (*eltswap): Remove.
>   (*eltswap_emu): Remove.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zvector/vec-reve-load-halfword-z14.c: Remove
>   vperm and substitute by vpdi et al.
>   * gcc.target/s390/zvector/vec-reve-load-halfword.c: Likewise.
>   * gcc.target/s390/vector/reverse-elements-1.c: New test.
>   * gcc.target/s390/vector/reverse-elements-2.c: New test.
>   * gcc.target/s390/vector/reverse-elements-3.c: New test.
>   * gcc.target/s390/vector/reverse-elements-4.c: New test.
>   * gcc.target/s390/vector/reverse-elements-5.c: New test.
>   * gcc.target/s390/vector/reverse-elements-6.c: New test.
>   * gcc.target/s390/vector/reverse-elements-7.c: New test.

Ok, thanks!

Andreas
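
The vpdi + verllg + verllf sequence mentioned in the patch description can be
modelled in portable C as follows. This is only a sketch of the idea: the v128
type and all function names are invented for illustration, lanes are numbered
big-endian style, and the exact operands of the emitted instructions are not
shown.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit vector as two doublewords; d[0] holds halfwords 0..3 with
   halfword 0 in the most significant bits (big-endian lane order).  */
typedef struct { uint64_t d[2]; } v128;

static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n > 0 && n < 64);
  return (x << n) | (x >> (64 - n));
}

static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  assert (n > 0 && n < 32);
  return (x << n) | (x >> (32 - n));
}

/* Build a vector from eight halfwords.  */
static v128
pack_v8hi (const uint16_t h[8])
{
  v128 v = { { 0, 0 } };
  for (int i = 0; i < 8; i++)
    v.d[i / 4] |= (uint64_t) h[i] << (16 * (3 - i % 4));
  return v;
}

/* Read halfword lane I of V.  */
static uint16_t
lane_v8hi (v128 v, int i)
{
  assert (i >= 0 && i < 8);
  return (uint16_t) (v.d[i / 4] >> (16 * (3 - i % 4)));
}

static v128
reverse_v8hi (v128 v)
{
  v128 r;
  /* Step 1: swap the two doublewords (the vpdi step).  */
  r.d[0] = v.d[1];
  r.d[1] = v.d[0];
  for (int i = 0; i < 2; i++)
    {
      /* Step 2: rotate each doubleword by 32 bits (the verllg step).  */
      uint64_t x = rotl64 (r.d[i], 32);
      /* Step 3: rotate each word by 16 bits (the verllf step).  */
      uint32_t hi = rotl32 ((uint32_t) (x >> 32), 16);
      uint32_t lo = rotl32 ((uint32_t) x, 16);
      r.d[i] = ((uint64_t) hi << 32) | lo;
    }
  return r;
}
```

Each step reverses the order at one granularity (doublewords, then words, then
halfwords), so no permutation vector has to be loaded from the literal pool.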



Re: [PATCH 1/3] s390: Recognize further vpdi and vmr{l,h} pattern

2023-11-09 Thread Andreas Krebbel
On 11/9/23 09:22, Stefan Schulze Frielinghaus wrote:
> Deal with cases where vpdi and vmr{l,h} are still applicable if the
> operands of those instructions are swapped.  For example, currently for
> 
> V2DI foo (V2DI x)
> {
>   return (V2DI) {x[1], x[0]};
> }
> 
> the assembler sequence
> 
> vlgvg   %r1,%v24,1
> vzero   %v0
> vlvgg   %v0,%r1,0
> vmrhg   %v24,%v0,%v24
> 
> is emitted.  With this patch a single vpdi is emitted.
> 
> Extensive tests are included in a subsequent patch of this series where
> more cases are covered.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (expand_perm_with_merge): Deal with cases
>   where vmr{l,h} are still applicable if the operands are swapped.
>   (expand_perm_with_vpdi): Likewise for vpdi.

Ok, Thanks!

Andreas
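
As a rough model of why a single vpdi suffices for the foo example above,
the following sketch mimics a doubleword permute in plain C. The mask encoding
used here (value 4 selects the second doubleword of the first source, value 1
the second doubleword of the second source) is my reading of the instruction
and should be checked against the Principles of Operation; v2di_model and
vpdi_model are made-up names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t d[2]; } v2di_model;

/* Model of a doubleword permute: mask bit 4 picks which doubleword of A
   becomes element 0, mask bit 1 picks which doubleword of B becomes
   element 1.  */
static v2di_model
vpdi_model (v2di_model a, v2di_model b, unsigned mask)
{
  assert ((mask & ~5u) == 0);
  v2di_model r;
  r.d[0] = a.d[(mask >> 2) & 1];
  r.d[1] = b.d[mask & 1];
  return r;
}
```

Passing the same vector for both sources with mask 4 yields {x[1], x[0]},
i.e. the swap from the example.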



Re: [PATCH] s390: fix htm-builtins test cases

2023-10-25 Thread Andreas Krebbel
On 10/25/23 16:50, Juergen Christ wrote:
> Transactional and non-transactional stores to the same cache line cause
> transactions to abort on newer generations.  Add sufficient padding to make
> sure another cache line is used.
> 
> Tested on s390.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/htm-builtins-1.c: Fix.
>   * gcc.target/s390/htm-builtins-2.c: Fix.

Ok. Thanks!

Andreas

> 
> Signed-off-by: Juergen Christ 
> ---
>  gcc/testsuite/gcc.target/s390/htm-builtins-1.c | 4 +++-
>  gcc/testsuite/gcc.target/s390/htm-builtins-2.c | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c 
> b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
> index ff43be9fe736..4f95bf3accaa 100644
> --- a/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
> +++ b/gcc/testsuite/gcc.target/s390/htm-builtins-1.c
> @@ -53,9 +53,11 @@ __attribute__ ((aligned(256))) struct
>  __attribute__ ((aligned(256))) struct
>  {
>volatile uint64_t c1;
> +  char pad1[256 - sizeof(uint64_t)];
>volatile uint64_t c2;
> +  char pad2[256 - sizeof(uint64_t)];
>volatile uint64_t c3;
> -} counters = { 0, 0, 0 };
> +} counters = { 0 };
>  
>  /*  local helper functions - 
> */
>  
> diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c 
> b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
> index bb9d346ea560..2e838caacc8c 100644
> --- a/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
> +++ b/gcc/testsuite/gcc.target/s390/htm-builtins-2.c
> @@ -94,9 +94,11 @@ float global_float_3 = 0.0;
>  __attribute__ ((aligned(256))) struct
>  {
>volatile uint64_t c1;
> +  char pad1[256 - sizeof(uint64_t)];
>volatile uint64_t c2;
> +  char pad2[256 - sizeof(uint64_t)];
>volatile uint64_t c3;
> -} counters = { 0, 0, 0 };
> +} counters = { 0 };
>  
>  /*  local helper functions - 
> */
>  
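
A standalone sketch of the padding idea from the diff above. The 256 byte line
size is taken from the aligned(256) attribute and the pad sizes in the test;
padded_counters and line_of are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 256  /* assumed cache line size, as used in the test */

/* Each counter gets its own line, provided the struct itself is
   aligned to LINE_SIZE as in the testcase.  */
struct padded_counters
{
  volatile uint64_t c1;
  char pad1[LINE_SIZE - sizeof (uint64_t)];
  volatile uint64_t c2;
  char pad2[LINE_SIZE - sizeof (uint64_t)];
  volatile uint64_t c3;
};

/* Index of the line that byte OFFSET falls into.  */
static size_t
line_of (size_t offset, size_t line_size)
{
  assert (line_size != 0);
  return offset / line_size;
}
```

Note that the initializer changes from { 0, 0, 0 } to { 0 } in the patch: with
the pad arrays in between, the second and third initializer would no longer
refer to c2 and c3, while { 0 } zero-initializes every member.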



Re: [PATCH] s390: Fix expander popcountv8hi2_vx

2023-10-16 Thread Andreas Krebbel
On 10/16/23 13:20, Stefan Schulze Frielinghaus wrote:
> The normal form of a CONST_INT which represents an integer of a mode
> with fewer bits than in HOST_WIDE_INT is sign extended.  This even holds
> for unsigned integers.
> 
> This fixes an ICE during cse1 where we bail out at rtl.h:2297 since
> INTVAL (x.first) == sext_hwi (INTVAL (x.first), precision) does not hold.
> 
> gcc/ChangeLog:
> 
>   * config/s390/vector.md (popcountv8hi2_vx): Sign extend each
>   unsigned vector element.

Ok. Thanks!

Bye,

Andreas
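
The normal form mentioned in the patch description can be illustrated with a
small portable model. This is a sketch only: sext_sketch and is_canonical are
made-up stand-ins and not GCC's sext_hwi, and a 64-bit HOST_WIDE_INT is
assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Sign extend the low PREC bits of VAL to 64 bits, mimicking the value
   a canonical CONST_INT of a narrow mode holds.  */
static int64_t
sext_sketch (uint64_t val, unsigned prec)
{
  assert (prec >= 1 && prec <= 63);
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t x = val & mask;
  uint64_t sign = UINT64_C (1) << (prec - 1);
  if (x & sign)
    return -(int64_t) (mask - x) - 1;
  return (int64_t) x;
}

/* Nonzero if VAL is already in sign-extended form for PREC bits.  */
static int
is_canonical (int64_t val, unsigned prec)
{
  return val == sext_sketch ((uint64_t) val, prec);
}
```

For a 16 bit element, the unsigned value 0xff00 is not canonical as is; its
canonical representation is -256, which has the same low 16 bits.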



Re: [PATCH] s390: Make use of new copysign RTL

2023-10-06 Thread Andreas Krebbel
On 10/5/23 08:46, Stefan Schulze Frielinghaus wrote:
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Make use of new copysign RTL.

Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.md | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 9631b2a8c60..3f29ba21442 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -124,7 +124,6 @@
>  
> ; Byte-wise Population Count
> UNSPEC_POPCNT
> -   UNSPEC_COPYSIGN
>  
> ; Load FP Integer
> UNSPEC_FPINT_FLOOR
> @@ -11918,9 +11917,8 @@
>  
>  (define_insn "copysign3"
>[(set (match_operand:FP 0 "register_operand" "=f")
> -  (unspec:FP [(match_operand:FP 1 "register_operand" "")
> -  (match_operand:FP 2 "register_operand" "f")]
> -  UNSPEC_COPYSIGN))]
> + (copysign:FP (match_operand:FP 1 "register_operand" "")
> +  (match_operand:FP 2 "register_operand" "f")))]
>"TARGET_Z196"
>"cpsdr\t%0,%2,%1"
>[(set_attr "op_type"  "RRF")



Re: [PATCH] s390: Fix builtins vec_rli and verll

2023-09-11 Thread Andreas Krebbel via Gcc-patches
On 9/11/23 08:56, Stefan Schulze Frielinghaus wrote:
> On Mon, Aug 28, 2023 at 11:33:37AM +0200, Andreas Krebbel wrote:
>> Hi Stefan,
>>
>> do you really need to introduce a new flag for U64 given that the type of 
>> the builtin is unsigned long?
> 
> In function s390_const_operand_ok the immediate is checked for validity
> w.r.t. the flag:
> 
>   tree_to_uhwi (arg) > ((HOST_WIDE_INT_1U << (bitwidth - 1) << 1) - 1)
> 
> Here bitwidth is derived from the flag.

I see, it is about enabling the constant check at all.

Ok, thanks!

Andreas
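
For readers wondering about the two-step shift in the quoted check, here is a
portable model of it. It is a sketch with an invented function name and
assumes a 64-bit unsigned type in place of the unsigned HOST_WIDE_INT.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if VALUE fits into an unsigned literal of BITWIDTH bits.  */
static int
unsigned_literal_fits (uint64_t value, unsigned bitwidth)
{
  assert (bitwidth >= 1 && bitwidth <= 64);
  /* Shifting in two steps keeps every shift count below 64 even for
     bitwidth == 64, where the intermediate result wraps to 0 and max
     becomes UINT64_MAX.  */
  uint64_t max = ((UINT64_C (1) << (bitwidth - 1)) << 1) - 1;
  return value <= max;
}
```

A single shift by bitwidth would be undefined for a 64 bit operand, which is
the case the new U64 flag introduces.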

> 
> Cheers,
> Stefan
> 
>>
>> Andreas
>>
>> On 8/21/23 17:56, Stefan Schulze Frielinghaus wrote:
>>> The second argument of these builtins is an unsigned immediate.  For
>>> vec_rli the API allows immediates up to 64 bits whereas the instruction
>>> verll only allows immediates up to 32 bits.  Since the shift count
>>> equals the immediate modulo vector element size, truncating those
>>> immediates is fine.
>>>
>>> Bootstrapped and regtested on s390.  Ok for mainline?
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/s390/s390-builtins.def (O_U64): New.
>>> (O1_U64): Ditto.
>>> (O2_U64): Ditto.
>>> (O3_U64): Ditto.
>>> (O4_U64): Ditto.
>>> (O_M12): Change bit position.
>>> (O_S2): Ditto.
>>> (O_S3): Ditto.
>>> (O_S4): Ditto.
>>> (O_S5): Ditto.
>>> (O_S8): Ditto.
>>> (O_S12): Ditto.
>>> (O_S16): Ditto.
>>> (O_S32): Ditto.
>>> (O_ELEM): Ditto.
>>> (O_LIT): Ditto.
>>> (OB_DEF_VAR): Add operand constraints.
>>> (B_DEF): Ditto.
>>> * config/s390/s390.cc (s390_const_operand_ok): Honour 64 bit
>>> operands.
>>> ---
>>>  gcc/config/s390/s390-builtins.def | 60 ++-
>>>  gcc/config/s390/s390.cc   |  6 ++--
>>>  2 files changed, 39 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/gcc/config/s390/s390-builtins.def 
>>> b/gcc/config/s390/s390-builtins.def
>>> index a16983b18bd..c829f445a11 100644
>>> --- a/gcc/config/s390/s390-builtins.def
>>> +++ b/gcc/config/s390/s390-builtins.def
>>> @@ -28,6 +28,7 @@
>>>  #undef O_U12
>>>  #undef O_U16
>>>  #undef O_U32
>>> +#undef O_U64
>>>  
>>>  #undef O_M12
>>>  
>>> @@ -88,6 +89,11 @@
>>>  #undef O3_U32
>>>  #undef O4_U32
>>>  
>>> +#undef O1_U64
>>> +#undef O2_U64
>>> +#undef O3_U64
>>> +#undef O4_U64
>>> +
>>>  #undef O1_M12
>>>  #undef O2_M12
>>>  #undef O3_M12
>>> @@ -157,20 +163,21 @@
>>>  #define O_U12   7 /* unsigned 16 bit literal */
>>>  #define O_U16   8 /* unsigned 16 bit literal */
>>>  #define O_U32   9 /* unsigned 32 bit literal */
>>> +#define O_U64   10 /* unsigned 64 bit literal */
>>>  
>>> -#define O_M12   10 /* matches bitmask of 12 */
>>> +#define O_M12   11 /* matches bitmask of 12 */
>>>  
>>> -#define O_S2    11 /* signed  2 bit literal */
>>> -#define O_S3    12 /* signed  3 bit literal */
>>> -#define O_S4    13 /* signed  4 bit literal */
>>> -#define O_S5    14 /* signed  5 bit literal */
>>> -#define O_S8    15 /* signed  8 bit literal */
>>> -#define O_S12   16 /* signed 12 bit literal */
>>> -#define O_S16   17 /* signed 16 bit literal */
>>> -#define O_S32   18 /* signed 32 bit literal */
>>> +#define O_S2    12 /* signed  2 bit literal */
>>> +#define O_S3    13 /* signed  3 bit literal */
>>> +#define O_S4    14 /* signed  4 bit literal */
>>> +#define O_S5    15 /* signed  5 bit literal */
>>> +#define O_S8    16 /* signed  8 bit literal */
>>> +#define O_S12   17 /* signed 12 bit literal */
>>> +#define O_S16   18 /* signed 16 bit literal */
>>> +#define O_S32   19 /* signed 32 bit literal */
>>>  
>>> -#define O_ELEM  19 /* Element selector requiring modulo arithmetic. */
>>> -#define O_LIT   20 /* Operand must be a literal fitting the target type.  
>>> */
>>> +#define O_ELEM  20 /* Element selector requiring modulo arithmetic. */
>>> +#define O_LIT   21 /* Operand must be a literal fitting the target type.  
>>> */
>>>  
>>>  #define O_SHIFT 5
>>>  
>>> @@ -223,6 +230,11 @@
>>>  #define O3_U32 (O_U32 << (2 * O_SHIFT))
>&g

Re: [PATCH] s390: Fix builtins vec_rli and verll

2023-08-28 Thread Andreas Krebbel via Gcc-patches
Hi Stefan,

do you really need to introduce a new flag for U64 given that the type of the 
builtin is unsigned long?

Andreas
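
The argument in the patch description quoted below, that truncating the
immediate is fine because only the count modulo the element size matters, can
be checked with a small portable model. This is a sketch; rotl_elem is a
made-up helper and not one of the builtins.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate the low BITS bits of X left by COUNT, where only COUNT modulo
   BITS matters.  BITS must be 8, 16, 32 or 64.  */
static uint64_t
rotl_elem (uint64_t x, unsigned bits, uint64_t count)
{
  assert (bits == 8 || bits == 16 || bits == 32 || bits == 64);
  uint64_t mask = bits == 64 ? UINT64_MAX : (UINT64_C (1) << bits) - 1;
  unsigned n = (unsigned) (count % bits);
  x &= mask;
  if (n == 0)
    return x;
  return ((x << n) | (x >> (bits - n))) & mask;
}
```

Since every element size divides 2^32, dropping the upper 32 bits of a 64 bit
count never changes count modulo the element size, so the rotate result stays
the same.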

On 8/21/23 17:56, Stefan Schulze Frielinghaus wrote:
> The second argument of these builtins is an unsigned immediate.  For
> vec_rli the API allows immediates up to 64 bits whereas the instruction
> verll only allows immediates up to 32 bits.  Since the shift count
> equals the immediate modulo vector element size, truncating those
> immediates is fine.
> 
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtins.def (O_U64): New.
>   (O1_U64): Ditto.
>   (O2_U64): Ditto.
>   (O3_U64): Ditto.
>   (O4_U64): Ditto.
>   (O_M12): Change bit position.
>   (O_S2): Ditto.
>   (O_S3): Ditto.
>   (O_S4): Ditto.
>   (O_S5): Ditto.
>   (O_S8): Ditto.
>   (O_S12): Ditto.
>   (O_S16): Ditto.
>   (O_S32): Ditto.
>   (O_ELEM): Ditto.
>   (O_LIT): Ditto.
>   (OB_DEF_VAR): Add operand constraints.
>   (B_DEF): Ditto.
>   * config/s390/s390.cc (s390_const_operand_ok): Honour 64 bit
>   operands.
> ---
>  gcc/config/s390/s390-builtins.def | 60 ++-
>  gcc/config/s390/s390.cc   |  6 ++--
>  2 files changed, 39 insertions(+), 27 deletions(-)
> 
> diff --git a/gcc/config/s390/s390-builtins.def 
> b/gcc/config/s390/s390-builtins.def
> index a16983b18bd..c829f445a11 100644
> --- a/gcc/config/s390/s390-builtins.def
> +++ b/gcc/config/s390/s390-builtins.def
> @@ -28,6 +28,7 @@
>  #undef O_U12
>  #undef O_U16
>  #undef O_U32
> +#undef O_U64
>  
>  #undef O_M12
>  
> @@ -88,6 +89,11 @@
>  #undef O3_U32
>  #undef O4_U32
>  
> +#undef O1_U64
> +#undef O2_U64
> +#undef O3_U64
> +#undef O4_U64
> +
>  #undef O1_M12
>  #undef O2_M12
>  #undef O3_M12
> @@ -157,20 +163,21 @@
>  #define O_U12   7 /* unsigned 16 bit literal */
>  #define O_U16   8 /* unsigned 16 bit literal */
>  #define O_U32   9 /* unsigned 32 bit literal */
> +#define O_U64   10 /* unsigned 64 bit literal */
>  
> -#define O_M12   10 /* matches bitmask of 12 */
> +#define O_M12   11 /* matches bitmask of 12 */
>  
> -#define O_S2    11 /* signed  2 bit literal */
> -#define O_S3    12 /* signed  3 bit literal */
> -#define O_S4    13 /* signed  4 bit literal */
> -#define O_S5    14 /* signed  5 bit literal */
> -#define O_S8    15 /* signed  8 bit literal */
> -#define O_S12   16 /* signed 12 bit literal */
> -#define O_S16   17 /* signed 16 bit literal */
> -#define O_S32   18 /* signed 32 bit literal */
> +#define O_S2    12 /* signed  2 bit literal */
> +#define O_S3    13 /* signed  3 bit literal */
> +#define O_S4    14 /* signed  4 bit literal */
> +#define O_S5    15 /* signed  5 bit literal */
> +#define O_S8    16 /* signed  8 bit literal */
> +#define O_S12   17 /* signed 12 bit literal */
> +#define O_S16   18 /* signed 16 bit literal */
> +#define O_S32   19 /* signed 32 bit literal */
>  
> -#define O_ELEM  19 /* Element selector requiring modulo arithmetic. */
> -#define O_LIT   20 /* Operand must be a literal fitting the target type.  */
> +#define O_ELEM  20 /* Element selector requiring modulo arithmetic. */
> +#define O_LIT   21 /* Operand must be a literal fitting the target type.  */
>  
>  #define O_SHIFT 5
>  
> @@ -223,6 +230,11 @@
>  #define O3_U32 (O_U32 << (2 * O_SHIFT))
>  #define O4_U32 (O_U32 << (3 * O_SHIFT))
>  
> +#define O1_U64 O_U64
> +#define O2_U64 (O_U64 << O_SHIFT)
> +#define O3_U64 (O_U64 << (2 * O_SHIFT))
> +#define O4_U64 (O_U64 << (3 * O_SHIFT))
> +
>  #define O1_M12 O_M12
>  #define O2_M12 (O_M12 << O_SHIFT)
>  #define O3_M12 (O_M12 << (2 * O_SHIFT))
> @@ -1989,19 +2001,19 @@ B_DEF  (s390_verllvf,   vrotlv4si3,   
>   0,
>  B_DEF  (s390_verllvg,   vrotlv2di3, 0,   
>B_VX,   0,  BT_FN_UV2DI_UV2DI_UV2DI)
>  
>  OB_DEF (s390_vec_rli,   s390_vec_rli_u8,
> s390_vec_rli_s64,   B_VX,   BT_FN_OV4SI_OV4SI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_u8,s390_verllb,0,   
>0,  BT_OV_UV16QI_UV16QI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_s8,s390_verllb,0,   
>0,  BT_OV_V16QI_V16QI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_u16,   s390_verllh,0,   
>0,  BT_OV_UV8HI_UV8HI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_s16,   s390_verllh,0,   
>0,  BT_OV_V8HI_V8HI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_u32,   s390_verllf,0,   
>0,  BT_OV_UV4SI_UV4SI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_s32,   s390_verllf,0,   
>0,  BT_OV_V4SI_V4SI_ULONG)
> -OB_DEF_VAR (s390_vec_rli_u64,   s390_verllg,0,   
>0,  BT_OV_UV2DI_UV2DI_ULONG)
> -OB_DEF_VAR 

Re: [PATCH] s390: Fix some builtin definitions

2023-08-28 Thread Andreas Krebbel via Gcc-patches
On 8/21/23 17:58, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on s390.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtins.def (s390_vec_signed_flt): Fix
>   builtin flag.
>   (s390_vec_unsigned_flt): Ditto.
>   (s390_vec_revb_flt): Ditto.
>   (s390_vec_reve_flt): Ditto.
>   (s390_vclfnhs): Fix operand flags.
>   (s390_vclfnls): Ditto.
>   (s390_vcrnfs): Ditto.
>   (s390_vcfn): Ditto.
>   (s390_vcnf): Ditto.

Ok. Thanks!

Andreas


> ---
>  gcc/config/s390/s390-builtins.def | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/config/s390/s390-builtins.def 
> b/gcc/config/s390/s390-builtins.def
> index c829f445a11..964d86c74a0 100644
> --- a/gcc/config/s390/s390-builtins.def
> +++ b/gcc/config/s390/s390-builtins.def
> @@ -2846,12 +2846,12 @@ B_DEF  (s390_vcelfb,
> floatunsv4siv4sf2,  0,
>  B_DEF  (s390_vcdlgb,floatunsv2div2df2,  0,   
>B_VX,   O2_U4 | O3_U3,  BT_FN_V2DF_UV2DI)
>  
>  OB_DEF (s390_vec_signed,
> s390_vec_signed_flt,s390_vec_signed_dbl,B_VX,   BT_FN_OV4SI_OV4SI)
> -OB_DEF_VAR (s390_vec_signed_flt,s390_vcfeb, 0,   
>B_VXE2, BT_OV_V4SI_V4SF)
> +OB_DEF_VAR (s390_vec_signed_flt,s390_vcfeb, B_VXE2,  
>0,  BT_OV_V4SI_V4SF)
>  OB_DEF_VAR (s390_vec_signed_dbl,s390_vcgdb, 0,   
>0,  BT_OV_V2DI_V2DF)
>  
>  OB_DEF (s390_vec_unsigned,  
> s390_vec_unsigned_flt,s390_vec_unsigned_dbl,B_VX,   BT_FN_OV4SI_OV4SI)
> -OB_DEF_VAR (s390_vec_unsigned_flt,  s390_vclfeb,0,   
>  B_VXE2, BT_OV_UV4SI_V4SF)
> -OB_DEF_VAR (s390_vec_unsigned_dbl,  s390_vclgdb,0,   
>  0,  BT_OV_UV2DI_V2DF)
> +OB_DEF_VAR (s390_vec_unsigned_flt,  s390_vclfeb,B_VXE2,  
>0,  BT_OV_UV4SI_V4SF)
> +OB_DEF_VAR (s390_vec_unsigned_dbl,  s390_vclgdb,0,   
>0,  BT_OV_UV2DI_V2DF)
>  
>  B_DEF  (s390_vcfeb, fix_truncv4sfv4si2, 0,   
>B_VXE2, O2_U4 | O3_U3,  BT_FN_V4SI_V4SF)
>  B_DEF  (s390_vcgdb, fix_truncv2dfv2di2, 0,   
>B_VX,   O2_U4 | O3_U3,  BT_FN_V2DI_V2DF)
> @@ -2929,7 +2929,7 @@ OB_DEF_VAR (s390_vec_revb_s32,  s390_vlbrf, 
> 0,
>  OB_DEF_VAR (s390_vec_revb_u32,  s390_vlbrf, 0,   
>0,  BT_OV_UV4SI_UV4SI)
>  OB_DEF_VAR (s390_vec_revb_s64,  s390_vlbrg, 0,   
>0,  BT_OV_V2DI_V2DI)
>  OB_DEF_VAR (s390_vec_revb_u64,  s390_vlbrg, 0,   
>0,  BT_OV_UV2DI_UV2DI)
> -OB_DEF_VAR (s390_vec_revb_flt,  s390_vlbrf_flt, 0,   
>B_VXE,  BT_OV_V4SF_V4SF)
> +OB_DEF_VAR (s390_vec_revb_flt,  s390_vlbrf_flt, B_VXE,   
>0,  BT_OV_V4SF_V4SF)
>  OB_DEF_VAR (s390_vec_revb_dbl,  s390_vlbrg_dbl, 0,   
>0,  BT_OV_V2DF_V2DF)
>  
>  B_DEF  (s390_vlbrh, bswapv8hi,  0,   
>B_VX,   0,   BT_FN_V8HI_V8HI)
> @@ -2960,7 +2960,7 @@ OB_DEF_VAR (s390_vec_reve_u32,  s390_vlerf, 
> 0,
>  OB_DEF_VAR (s390_vec_reve_b64,  s390_vlerg, 0,   
>0,  BT_OV_BV2DI_BV2DI)
>  OB_DEF_VAR (s390_vec_reve_s64,  s390_vlerg, 0,   
>0,  BT_OV_V2DI_V2DI)
>  OB_DEF_VAR (s390_vec_reve_u64,  s390_vlerg, 0,   
>0,  BT_OV_UV2DI_UV2DI)
> -OB_DEF_VAR (s390_vec_reve_flt,  s390_vlerf_flt, 0,   
>B_VXE,  BT_OV_V4SF_V4SF)
> +OB_DEF_VAR (s390_vec_reve_flt,  s390_vlerf_flt, B_VXE,   
>0,  BT_OV_V4SF_V4SF)
>  OB_DEF_VAR (s390_vec_reve_dbl,  s390_vlerg_dbl, 0,   
>0,  BT_OV_V2DF_V2DF)
>  
>  B_DEF  (s390_vlerb, eltswapv16qi,   0,   
>B_VX,   0,   BT_FN_V16QI_V16QI)
> @@ -3037,10 +3037,10 @@ B_DEF  (s390_vstrszf,vstrszv4si,  
>   0,
>  
>  /* arch 14 builtins */
>  
> -B_DEF  (s390_vclfnhs,vclfnhs_v8hi,  0,   
>B_NNPA, O3_U4,  BT_FN_V4SF_V8HI_UINT)
> -B_DEF  (s390_vclfnls,vclfnls_v8hi,  0,   
>B_NNPA, O3_U4,  BT_FN_V4SF_V8HI_UINT)
> +B_DEF  (s390_vclfnhs,vclfnhs_v8hi,  0,   
>B_NNPA, O2_U4,

Re: [PATCH] s390: Try to emit vlbr/vstbr instead of vperm et al.

2023-08-03 Thread Andreas Krebbel via Gcc-patches
On 8/3/23 08:51, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on s390x.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (expand_perm_as_a_vlbr_vstbr_candidate):
>   New function which handles bswap patterns for vec_perm_const.
>   (vectorize_vec_perm_const_1): Call new function.
>   * config/s390/vector.md (*bswap): Fix operands in output
>   template.
>   (*vstbr): New insn.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/s390.exp: Add subdirectory vxe2.
>   * gcc.target/s390/vxe2/vlbr-1.c: New test.
>   * gcc.target/s390/vxe2/vstbr-1.c: New test.
>   * gcc.target/s390/vxe2/vstbr-2.c: New test.

Ok. Thanks!

Andreas


> ---
>  gcc/config/s390/s390.cc  | 55 
>  gcc/config/s390/vector.md| 16 --
>  gcc/testsuite/gcc.target/s390/s390.exp   |  3 ++
>  gcc/testsuite/gcc.target/s390/vxe2/vlbr-1.c  | 29 +++
>  gcc/testsuite/gcc.target/s390/vxe2/vstbr-1.c | 29 +++
>  gcc/testsuite/gcc.target/s390/vxe2/vstbr-2.c | 42 +++
>  6 files changed, 170 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/vxe2/vlbr-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/vxe2/vstbr-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/vxe2/vstbr-2.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index d9f10542473..91eb9232b10 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -17698,6 +17698,58 @@ expand_perm_with_vstbrq (const struct expand_vec_perm_d &d)
>return false;
>  }
>  
> +/* Try to emit vlbr/vstbr.  Note, this is only a candidate insn since
> +   TARGET_VECTORIZE_VEC_PERM_CONST operates on vector registers only.  Thus,
> +   either fwprop, combine et al. "fixes" one of the input/output operands into
> +   a memory operand or a splitter has to reverse this into a general vperm
> +   operation.  */
> +
> +static bool
> +expand_perm_as_a_vlbr_vstbr_candidate (const struct expand_vec_perm_d &d)
> +{
> +  static const char perm[4][MAX_VECT_LEN]
> += { { 1,  0,  3,  2,  5,  4,  7, 6, 9,  8,  11, 10, 13, 12, 15, 14 },
> + { 3,  2,  1,  0,  7,  6,  5, 4, 11, 10, 9,  8,  15, 14, 13, 12 },
> + { 7,  6,  5,  4,  3,  2,  1, 0, 15, 14, 13, 12, 11, 10, 9,  8  },
> + { 15, 14, 13, 12, 11, 10, 9, 8, 7,  6,  5,  4,  3,  2,  1,  0  } };
> +
> +  if (!TARGET_VXE2 || d.vmode != V16QImode || d.op0 != d.op1)
> +return false;
> +
> +  if (memcmp (d.perm, perm[0], MAX_VECT_LEN) == 0)
> +{
> +  rtx target = gen_rtx_SUBREG (V8HImode, d.target, 0);
> +  rtx op0 = gen_rtx_SUBREG (V8HImode, d.op0, 0);
> +  emit_insn (gen_bswapv8hi (target, op0));
> +  return true;
> +}
> +
> +  if (memcmp (d.perm, perm[1], MAX_VECT_LEN) == 0)
> +{
> +  rtx target = gen_rtx_SUBREG (V4SImode, d.target, 0);
> +  rtx op0 = gen_rtx_SUBREG (V4SImode, d.op0, 0);
> +  emit_insn (gen_bswapv4si (target, op0));
> +  return true;
> +}
> +
> +  if (memcmp (d.perm, perm[2], MAX_VECT_LEN) == 0)
> +{
> +  rtx target = gen_rtx_SUBREG (V2DImode, d.target, 0);
> +  rtx op0 = gen_rtx_SUBREG (V2DImode, d.op0, 0);
> +  emit_insn (gen_bswapv2di (target, op0));
> +  return true;
> +}
> +
> +  if (memcmp (d.perm, perm[3], MAX_VECT_LEN) == 0)
> +{
> +  rtx target = gen_rtx_SUBREG (V1TImode, d.target, 0);
> +  rtx op0 = gen_rtx_SUBREG (V1TImode, d.op0, 0);
> +  emit_insn (gen_bswapv1ti (target, op0));
> +  return true;
> +}
> +
> +  return false;
> +}
>  
>  /* Try to find the best sequence for the vector permute operation
> described by D.  Return true if the operation could be
> @@ -17720,6 +17772,9 @@ vectorize_vec_perm_const_1 (const struct expand_vec_perm_d &d)
>if (expand_perm_with_rot (d))
>  return true;
>  
> +  if (expand_perm_as_a_vlbr_vstbr_candidate (d))
> +return true;
> +
>return false;
>  }
>  
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index 21bec729efa..f0e9ed3d263 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -47,6 +47,7 @@
>  (define_mode_iterator VI_HW [V16QI V8HI V4SI V2DI])
>  (define_mode_iterator VI_HW_QHS [V16QI V8HI V4SI])
>  (define_mode_iterator VI_HW_HSD [V8HI  V4SI V2DI])
> +(define_mode_iterator VI_HW_HSDT [V8HI V4SI V2DI V1TI TI])
>  (define_mode_iterator VI_HW_HS  [V8HI  V4SI])
>  (define_mode_iterator VI_HW_QH  [V16QI V8HI])
>  
> @@ -2876,12 +2877,12 @@
>   (use (match_dup 2))])]
>"TARGET_VX"
>  {
> -  static char p[4][16] =
> +  static const char p[4][16] =
>  { { 1,  0,  3,  2,  5,  4,  7, 6, 9,  8,  11, 10, 13, 12, 15, 14 },   /* H */
>    { 3,  2,  1,  0,  7,  6,  5, 4, 11, 10, 9,  8,  15, 14, 13, 12 },   /* S */
>    { 7,  6,  5,  4,  3,  2,  1, 0, 15, 14, 13, 12, 11, 10, 9,  8  },   /* D */
>{ 15, 14, 13, 12, 11, 10, 9, 8, 7,  6,  5,  4,  3,  2,  
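
The four selector tables used by this patch all follow one rule: inside every element of 2, 4, 8 or 16 bytes the byte order is reversed. A minimal sketch of that rule, not part of the patch (the helper name bswap_perm is made up for illustration):

```c
#define VEC_BYTES 16

/* Fill PERM with the byte selector that reverses the bytes of every
   ELT_SIZE-byte element of a 16-byte vector.  ELT_SIZE is assumed to
   be 2, 4, 8 or 16.  Hypothetical helper, not a GCC function.  */
void
bswap_perm (unsigned char perm[VEC_BYTES], int elt_size)
{
  for (int i = 0; i < VEC_BYTES; i++)
    {
      int base = i / elt_size * elt_size;  /* first byte of the element */
      int off = i % elt_size;              /* position inside the element */
      perm[i] = (unsigned char) (base + elt_size - 1 - off);
    }
}
```

For elt_size 2 this yields 1, 0, 3, 2, ..., 15, 14 (the H row), and for elt_size 16 it yields 15 down to 0 (the last row).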

Re: [PATCH] s390: Enable vect_bswap test cases

2023-08-03 Thread Andreas Krebbel via Gcc-patches
On 8/3/23 08:48, Stefan Schulze Frielinghaus wrote:
> This enables the following tests which rely on instruction vperm which
> is available since z13 with the initial vector support.
> 
> testsuite/gcc.dg/vect/vect-bswap16.c
> 42:/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_bswap || sse4_runtime } } } } */
> 
> testsuite/gcc.dg/vect/vect-bswap32.c
> 42:/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_bswap || sse4_runtime } } } } */
> 
> testsuite/gcc.dg/vect/vect-bswap64.c
> 42:/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_bswap || sse4_runtime } } } } */
> 
> Ok for mainline?

Ok. Thanks!

Andreas

> 
> gcc/testsuite/ChangeLog:
> 
>   * lib/target-supports.exp (check_effective_target_vect_bswap):
>   Add s390.
> ---
>  gcc/testsuite/lib/target-supports.exp | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 4d04df2a709..2ccc0291442 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7087,9 +7087,11 @@ proc check_effective_target_whole_vector_shift { } {
>  
>  proc check_effective_target_vect_bswap { } {
>  return [check_cached_effective_target_indexed vect_bswap {
> -  expr { [istarget aarch64*-*-*]
> -  || [is-effective-target arm_neon]
> -  || [istarget amdgcn-*-*] }}]
> +  expr { ([istarget aarch64*-*-*]
> +   || [is-effective-target arm_neon]
> +   || [istarget amdgcn-*-*])
> +  || ([istarget s390*-*-*]
> +  && [check_effective_target_s390_vx]) }}]
>  }
>  
>  # Return 1 if the target supports comparison of bool vectors for at



[Committed] IBM Z: Handle unaligned symbols

2023-08-01 Thread Andreas Krebbel via Gcc-patches
The IBM Z ELF ABI mandates every symbol to reside on a 2 byte boundary
in order to be able to use the larl instruction. However, in some
situations it is difficult to enforce this, e.g. for common linker
scripts as used in the Linux kernel. This patch introduces the
-munaligned-symbols option. When that option is used, external symbols
without an explicit alignment are considered unaligned and their
addresses will be pushed into the GOT or the literal pool.

If the symbol turns out to end up on a 2 byte boundary in the final
linker step, the linker is able to take this back and replace the indirect
reference with larl again. This should minimize the effect to symbols
which are actually unaligned in the end.
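
As a rough illustration of the decision described above, the following sketch models when a symbol is flagged as "not 2 byte aligned" and therefore kept away from larl. It is not the backend code; struct sym_info and not_align2_p are hypothetical stand-ins for what the patch reads via DECL_ALIGN, DECL_USER_ALIGN and SYMBOL_REF_LOCAL_P.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the properties the backend looks at.  */
struct sym_info
{
  unsigned align_bits;  /* models DECL_ALIGN, in bits */
  bool user_align;      /* models DECL_USER_ALIGN */
  bool local;           /* models SYMBOL_REF_LOCAL_P */
};

/* Return true if the symbol has to be treated as possibly residing on
   an odd address.  UNALIGNED_SYMBOLS models -munaligned-symbols.  */
bool
not_align2_p (const struct sym_info *s, bool unaligned_symbols)
{
  return s->align_bits % 16 != 0
	 && (s->user_align || (!s->local && unaligned_symbols));
}
```

Under this model an "extern unsigned char x;" is only flagged when the option is given, while a locally defined char is never flagged by the option.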

Bootstrapped and regression tested on s390x. Committed to mainline.

Backports to stable branches will follow.

gcc/ChangeLog:

* config/s390/s390.cc (s390_encode_section_info): Assume external
symbols without explicit alignment to be unaligned if
-munaligned-symbols has been specified.
* config/s390/s390.opt (-munaligned-symbols): New option.

gcc/testsuite/ChangeLog:

* gcc.target/s390/aligned-1.c: New test.
* gcc.target/s390/unaligned-1.c: New test.
---
 gcc/config/s390/s390.cc |  9 +++--
 gcc/config/s390/s390.opt|  7 +++
 gcc/testsuite/gcc.target/s390/aligned-1.c   | 20 
 gcc/testsuite/gcc.target/s390/unaligned-1.c | 20 
 4 files changed, 54 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/aligned-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/unaligned-1.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 13970edcb5e..89474fd487a 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -13709,8 +13709,13 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
 a larl/load-relative instruction.  We only handle the cases
 that can go wrong (i.e. no FUNC_DECLs).
 All symbols without an explicit alignment are assumed to be 2
-byte aligned as mandated by our ABI.  */
-  if (DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16)
+byte aligned as mandated by our ABI.  This behavior can be
+overridden for external symbols with the -munaligned-symbols
+switch.  */
+  if (DECL_ALIGN (decl) % 16
+ && (DECL_USER_ALIGN (decl)
+ || (!SYMBOL_REF_LOCAL_P (XEXP (rtl, 0))
+ && s390_unaligned_symbols_p)))
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (DECL_ALIGN (decl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt
index 344aa551f44..496572046f7 100644
--- a/gcc/config/s390/s390.opt
+++ b/gcc/config/s390/s390.opt
@@ -329,3 +329,10 @@ Target Undocumented Var(unroll_only_small_loops) Init(0) Save
 mpreserve-args
 Target Var(s390_preserve_args_p) Init(0)
 Store all argument registers on the stack.
+
+munaligned-symbols
+Target Var(s390_unaligned_symbols_p) Init(0)
+Assume external symbols to be potentially unaligned.  By default all
+symbols without explicit alignment are assumed to reside on a 2 byte
+boundary as mandated by the IBM Z ABI.
+
diff --git a/gcc/testsuite/gcc.target/s390/aligned-1.c b/gcc/testsuite/gcc.target/s390/aligned-1.c
new file mode 100644
index 000..2dc99cf66bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/aligned-1.c
@@ -0,0 +1,20 @@
+/* Even symbols without explicit alignment are assumed to reside on a
+   2 byte boundary, as mandated by the IBM Z ELF ABI, and therefore
+   can be accessed using the larl instruction.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z900 -fno-section-anchors" } */
+
+extern unsigned char extern_implicitly_aligned;
+extern unsigned char extern_explicitly_aligned __attribute__((aligned(2)));
+unsigned char aligned;
+
+unsigned char
+foo ()
+{
+  return extern_implicitly_aligned + extern_explicitly_aligned + aligned;
+}
+
+/* { dg-final { scan-assembler-times "larl\t%r\[0-9\]*,extern_implicitly_aligned\n" 1 } } */
+/* { dg-final { scan-assembler-times "larl\t%r\[0-9\]*,extern_explicitly_aligned\n" 1 } } */
+/* { dg-final { scan-assembler-times "larl\t%r\[0-9\]*,aligned\n" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/unaligned-1.c b/gcc/testsuite/gcc.target/s390/unaligned-1.c
new file mode 100644
index 000..421330aded1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/unaligned-1.c
@@ -0,0 +1,20 @@
+/* With the -munaligned-symbols option all external symbols without
+   explicit alignment are assumed to be potentially unaligned and
+   therefore cannot be accessed with larl.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z900 -fno-section-anchors -munaligned-symbols" } */
+
+extern unsigned char extern_unaligned;
+extern unsigned char extern_explicitly_aligned __attribute__((aligned(2)));
+unsigned char aligned;
+
+unsigned 

Re: [PATCH] s390: Optimize vec_cmpge followed by vec_sel

2023-07-18 Thread Andreas Krebbel via Gcc-patches
On 7/17/23 17:09, Juergen Christ wrote:
> A vec_cmpge produces a negation.  Replace this negation by swapping the two
> selection choices of a vec_sel based on the result of the vec_cmpge.
> 
> Bootstrapped and regression tested on s390x.
> 
> gcc/ChangeLog:
> 
>   * config/s390/vx-builtins.md: New vsel pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-cmpge.c: New test.
> 
> Signed-off-by: Juergen Christ 

Committed to mainline. Thanks!

Bye,

Andreas



Re: [PATCH] s390: Fix vec_init default expander

2023-07-07 Thread Andreas Krebbel via Gcc-patches
On 7/7/23 15:51, Juergen Christ wrote:
> Do not reinitialize vector lanes to zero since they are already initialized to
> zero.
> 
> Bootstrapped and regression tested on s390x.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (vec_init): Fix default case
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-init-3.c: New test.

Ok. Pushed to mainline. Thanks!

Andreas



[Committed] IBM zSystems: Assume symbols without explicit alignment to be ok

2023-06-26 Thread Andreas Krebbel via Gcc-patches
A change we committed back in 2015 relies on the backend-requested ABI
alignment being applied to ALL symbols by the middle-end. However, this
does not appear to be the case for external symbols. With this commit
we assume all symbols without explicit alignment to be aligned
according to the ABI. That is the behavior we had before.

This fixes a performance regression caused by the 2015 patch. Since
then the addresses of external char type symbols have been pushed to
the literal pool, although it is safe to access them with larl (which
requires symbols to reside at even addresses).
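
To make the regression concrete, here is a small sketch comparing the condition before and after this commit. The two helper functions are made up for illustration; they only restate the two "if" conditions from the patch, with align_bits modelling DECL_ALIGN and user_align modelling DECL_USER_ALIGN.

```c
#include <stdbool.h>

/* Condition before this commit: any alignment below 16 bits marks the
   symbol as not 2 byte aligned, whether or not the user asked for it.  */
bool
old_notalign2_p (unsigned align_bits, bool user_align)
{
  (void) user_align;
  return align_bits == 0 || align_bits % 16 != 0;
}

/* Condition after this commit: only an explicit user alignment below
   16 bits marks the symbol.  */
bool
new_notalign2_p (unsigned align_bits, bool user_align)
{
  return user_align && align_bits % 16 != 0;
}
```

For an "extern char c;" (8 bit alignment, no user alignment) the old condition is true, which forces the literal pool, and the new one is false, which allows larl again.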

Bootstrapped and regression tested on s390x.

gcc/
* config/s390/s390.cc (s390_encode_section_info): Set
SYMBOL_FLAG_SET_NOTALIGN2 only if the symbol has explicitly been
misaligned.

gcc/testsuite/
* gcc.target/s390/larl-1.c: New test.
---
 gcc/config/s390/s390.cc|  6 +++--
 gcc/testsuite/gcc.target/s390/larl-1.c | 32 ++
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/larl-1.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 9284477396d..d9f10542473 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -13706,8 +13706,10 @@ s390_encode_section_info (tree decl, rtx rtl, int first)
 {
   /* Store the alignment to be able to check if we can use
 a larl/load-relative instruction.  We only handle the cases
-that can go wrong (i.e. no FUNC_DECLs).  */
-  if (DECL_ALIGN (decl) == 0 || DECL_ALIGN (decl) % 16)
+that can go wrong (i.e. no FUNC_DECLs).
+All symbols without an explicit alignment are assumed to be 2
+byte aligned as mandated by our ABI.  */
+  if (DECL_USER_ALIGN (decl) && DECL_ALIGN (decl) % 16)
SYMBOL_FLAG_SET_NOTALIGN2 (XEXP (rtl, 0));
   else if (DECL_ALIGN (decl) % 32)
SYMBOL_FLAG_SET_NOTALIGN4 (XEXP (rtl, 0));
diff --git a/gcc/testsuite/gcc.target/s390/larl-1.c b/gcc/testsuite/gcc.target/s390/larl-1.c
new file mode 100644
index 000..5ef2ef63f82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/larl-1.c
@@ -0,0 +1,32 @@
+/* Check if load-address-relative instructions are created */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O2 -march=z10 -mzarch -fno-section-anchors" } */
+
+/* An explicitly misaligned symbol.  This symbol is NOT aligned as
+   mandated by our ABI.  However, the back-end needs to handle that in
+   order to make things like __attribute__((packed)) work.  The symbol
+   address is expected to be loaded from literal pool.  */
+/* { dg-final { scan-assembler "lgrl\t%r2," { target { lp64 } } } } */
+/* { dg-final { scan-assembler "lrl\t%r2," { target { ! lp64 } } } } */
+extern char align1 __attribute__((aligned(1)));
+
+/* { dg-final { scan-assembler "larl\t%r2,align2" } } */
+extern char align2 __attribute__((aligned(2)));
+
+/* { dg-final { scan-assembler "larl\t%r2,align4" } } */
+extern char align4 __attribute__((aligned(4)));
+
+/* An external char symbol without explicit alignment has a DECL_ALIGN
+   of just 8. In contrast to local definitions DATA_ABI_ALIGNMENT is
+   NOT applied to DECL_ALIGN in that case.  Make sure the backend
+   still assumes this symbol to be aligned according to ABI
+   requirements.  */
+/* { dg-final { scan-assembler "larl\t%r2,align_default" } } */
+extern char align_default;
+
+char * foo1 () { return &align1; }
+char * foo2 () { return &align2; }
+char * foo3 () { return &align4; }
+char * foo4 () { return &align_default; }
+
-- 
2.41.0



Re: [PATCH] libgcc: Use initarray section type for .init_stack

2023-05-25 Thread Andreas Krebbel via Gcc-patches
On 3/20/23 07:33, Kewen.Lin wrote:
> Hi,
> 
> One of my workmates found there is a warning like:
> 
>   libgcc/config/rs6000/morestack.S:402: Warning: ignoring
> incorrect section type for .init_array.0
> 
> when compiling libgcc/config/rs6000/morestack.S.
> 
> Since commit r13-6545 touched that file recently, which was
> suspected to be responsible for this warning, I did some
> investigation and found that this warning has been around
> for a long time.  For section .init_stack*, it's preferred
> to use section type SHT_INIT_ARRAY.  So this patch uses
> "@init_array" to replace "@progbits".
> 
> Although the warning is trivial, Segher suggested that I
> post this fix, in order to avoid any possible
> misunderstanding/confusion about the warning.
> 
> As Alan confirmed, this doesn't require a premise check
> on if the existing binutils supports "@init_array" or not,
> "because if you want split-stack to work, you must link
> with gold, any version of binutils that has gold has an
> assembler that understands @init_array". (Thanks Alan!)
> 
> Bootstrapped and regtested on x86_64-redhat-linux
> and powerpc64{,le}-linux-gnu.
> 
> Is it ok for trunk when next stage 1 comes?
> 
> BR,
> Kewen
> -
> libgcc/ChangeLog:
> 
>   * config/i386/morestack.S: Use @init_array rather than
>   @progbits for section type of section .init_array.
>   * config/rs6000/morestack.S: Likewise.
>   * config/s390/morestack.S: Likewise.

s390 parts are ok. I ran a bootstrap and regression test. All looks good. Thanks!

Andreas



Re: [PATCH] s390: Implement TARGET_ATOMIC_ALIGN_FOR_MODE

2023-05-16 Thread Andreas Krebbel via Gcc-patches
On 5/16/23 08:43, Stefan Schulze Frielinghaus wrote:
> So far atomic objects are aligned according to their default alignment.
> For 128 bit scalar types like int128 or long double this results in an
> 8 byte alignment, which is wrong; it must be 16 bytes.
> 
> libstdc++ already computes the correct alignment.  Still, this adds a
> test case in order to make sure that both implementations are
> compatible.
> 
> Bootstrapped and regtested.  Ok for mainline?  Since this is an ABI
> break, is a backport to GCC 13 reasonable?

Ok for mainline.

I would also like to have it in GCC 13. It is an ABI breakage, but on the
other hand it also fixes an ABI inconsistency between C and C++ which we
should fix asap, I think.

Andreas
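
For readers who want to check the effect on their own compiler, here is a minimal sketch. The first helper only models the rule the new hook implements (alignment in bits equals size in bits); the others ask the compiler in use what it actually picks. All function names are made up for illustration, and the values returned by the query helpers depend on target and compiler version.

```c
#include <stddef.h>

/* Model of the rule implemented by the new hook: an atomic object is
   aligned to its own size.  Takes a size in bytes, returns bits.  */
unsigned int
model_atomic_align_bits (size_t size_bytes)
{
  return (unsigned int) (size_bytes * 8);
}

/* Alignment in bytes that the compiler in use picks for an atomic
   long long.  */
size_t
atomic_long_long_align (void)
{
  return _Alignof (_Atomic long long);
}

#ifdef __SIZEOF_INT128__
/* Same query for a 16 byte atomic, on targets that provide __int128.
   With the patch applied this is expected to be 16 on s390x.  */
size_t
atomic_int128_align (void)
{
  return _Alignof (_Atomic __int128);
}
#endif
```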


> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (TARGET_ATOMIC_ALIGN_FOR_MODE):
>   New.
>   (s390_atomic_align_for_mode): New.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.target/s390/atomic-align-1.C: New test.
>   * gcc.target/s390/atomic-align-1.c: New test.
>   * gcc.target/s390/atomic-align-2.c: New test.
> ---
>  gcc/config/s390/s390.cc   |  8 ++
>  .../g++.target/s390/atomic-align-1.C  | 25 +++
>  .../gcc.target/s390/atomic-align-1.c  | 23 +
>  .../gcc.target/s390/atomic-align-2.c  | 18 +
>  4 files changed, 74 insertions(+)
>  create mode 100644 gcc/testsuite/g++.target/s390/atomic-align-1.C
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-2.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 505de995da8..4813bf91dc4 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -450,6 +450,14 @@ s390_preserve_fpr_arg_p (int regno)
> && regno >= FPR0_REGNUM);
>  }
>  
> +#undef TARGET_ATOMIC_ALIGN_FOR_MODE
> +#define TARGET_ATOMIC_ALIGN_FOR_MODE s390_atomic_align_for_mode
> +static unsigned int
> +s390_atomic_align_for_mode (machine_mode mode)
> +{
> +  return GET_MODE_BITSIZE (mode);
> +}
> +
>  /* A couple of shortcuts.  */
>  #define CONST_OK_FOR_J(x) \
>   CONST_OK_FOR_CONSTRAINT_P((x), 'J', "J")
> diff --git a/gcc/testsuite/g++.target/s390/atomic-align-1.C b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> new file mode 100644
> index 000..43aa0bc39ed
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c++11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +#include <atomic>
> +
> +// 2
> +std::atomic<char> var_char;
> +std::atomic<short> var_short;
> +// 4
> +std::atomic<int> var_int;
> +// 8
> +std::atomic<long> var_long;
> +std::atomic<long long> var_long_long;
> +// 16
> +std::atomic<__int128> var_int128;
> +// 4
> +std::atomic<float> var_float;
> +// 8
> +std::atomic<double> var_double;
> +// 16
> +std::atomic<long double> var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-1.c b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> new file mode 100644
> index 000..b2e1233e3ee
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +// 2
> +_Atomic char var_char;
> +_Atomic short var_short;
> +// 4
> +_Atomic int var_int;
> +// 8
> +_Atomic long var_long;
> +_Atomic long long var_long_long;
> +// 16
> +_Atomic __int128 var_int128;
> +// 4
> +_Atomic float var_float;
> +// 8
> +_Atomic double var_double;
> +// 16
> +_Atomic long double var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-2.c b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> new file mode 100644
> index 000..0bf17341bf8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O -std=c11" } */
> +/* { dg-final { scan-assembler-not {abort} } } */
> +
> +/* The stack is 8 byte aligned which means GCC has to manually align a 16 byte
> +   aligned object.  This is done by allocating not 16 but rather 24 bytes for
> +   variable X and then manually aligning a pointer inside the memory block.
> +   Validate this by ensuring that the if-statement is optimized out.  */
> +
> +void bar (_Atomic unsigned __int128 *ptr);
> +
> +void foo (void) {
> +  _Atomic unsigned __int128 x;
> +  unsigned long n = (unsigned long)&x;
> +  if (n % 16 != 0)
> +    __builtin_abort ();
> +  bar (&x);
> +}
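
The comment in atomic-align-2.c relies on a small piece of arithmetic: any 24 byte block that starts on an 8 byte boundary contains a 16 byte window that starts on a 16 byte boundary. A minimal sketch of that reasoning, with helper names made up for illustration (this is not how GCC implements it internally):

```c
#include <stdbool.h>
#include <stdint.h>

/* Round ADDR up to the next multiple of ALIGN.  ALIGN is assumed to be
   a power of two.  */
uintptr_t
align_up (uintptr_t addr, uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

/* Return true if a 16 byte aligned object of 16 bytes fits into a
   24 byte block starting at BASE.  */
bool
fits_in_24_byte_block (uintptr_t base)
{
  uintptr_t start = align_up (base, 16);
  return start + 16 <= base + 24;
}
```

For an 8 byte aligned base the rounded address is either the base itself or base + 8, so the object ends at most 24 bytes after the base.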



Re: [PATCH 0/3] Refactor memory block operations

2023-05-15 Thread Andreas Krebbel via Gcc-patches
On 5/15/23 09:17, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested.  Ok for mainline?
> 
> Stefan Schulze Frielinghaus (3):
>   s390: Refactor block operation cpymem
>   s390: Add block operation movmem
>   s390: Refactor block operation setmem
> 
>  gcc/config/s390/s390-protos.h|   5 +-
>  gcc/config/s390/s390.cc  | 301 ---
>  gcc/config/s390/s390.md  |  61 -
>  gcc/testsuite/gcc.target/s390/memset-1.c |   7 +-
>  4 files changed, 331 insertions(+), 43 deletions(-)
> 

Ok. Thanks!

Andreas



Re: [PATCH] s390: Fix ifcvt test cases

2023-03-03 Thread Andreas Krebbel via Gcc-patches
On 3/2/23 19:13, Robin Dapp wrote:
> Hi,
> 
> we seem to flip flop between the "high" and "not low" variants of load on
> condition.  Accept both in the affected test cases.
> 
> Going to commit this as obvious.
> 
> Regards
>  Robin
> 
> --
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ifcvt-two-insns-bool.c: Allow "high" and
>   "not low or equal" load on condition variant.
>   * gcc.target/s390/ifcvt-two-insns-int.c: Ditto.
>   * gcc.target/s390/ifcvt-two-insns-long.c: Ditto.

Ok. Thanks!

Andreas

> ---
>  gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c | 4 ++--
>  gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c  | 4 ++--
>  gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
> index 1027ddceb935..a56bc4676143 100644
> --- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
> +++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-bool.c
> @@ -3,8 +3,8 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -march=z13 -mzarch --save-temps" } */
>  
> -/* { dg-final { scan-assembler "lochih\t%r.?,1" } } */
> -/* { dg-final { scan-assembler "locrh\t.*" } } */
> +/* { dg-final { scan-assembler "lochi(?:h|nle)\t%r.?,1" } } */
> +/* { dg-final { scan-assembler "locr(?:h|nle)\t.*" } } */
>  #include 
>  #include 
>  #include 
> diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
> index fc6946f2466d..64b8a732290e 100644
> --- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
> +++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-int.c
> @@ -3,8 +3,8 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -march=z13 -mzarch --save-temps" } */
>  
> -/* { dg-final { scan-assembler "lochih\t%r.?,1" } } */
> -/* { dg-final { scan-assembler "locrh\t.*" } } */
> +/* { dg-final { scan-assembler "lochi(h|nle)\t%r.?,1" } } */
> +/* { dg-final { scan-assembler "locr(?:h|nle)\t.*" } } */
>  #include 
>  #include 
>  #include 
> diff --git a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
> index 51af4985247a..f2d784e762a8 100644
> --- a/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
> +++ b/gcc/testsuite/gcc.target/s390/ifcvt-two-insns-long.c
> @@ -3,8 +3,8 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -march=z13 -mzarch --save-temps" } */
>  
> -/* { dg-final { scan-assembler "locghih\t%r.?,1" } } */
> -/* { dg-final { scan-assembler "locgrh\t.*" } } */
> +/* { dg-final { scan-assembler "locghi(?:h|nle)\t%r.?,1" } } */
> +/* { dg-final { scan-assembler "locgr(?:h|nle)\t.*" } } */
>  
>  #include 
>  #include 



Re: [PATCH] s390: libatomic: Fix 16 byte atomic {cas,load,store}

2023-03-03 Thread Andreas Krebbel via Gcc-patches
On 3/2/23 16:24, Stefan Schulze Frielinghaus wrote:
> This is a follow-up to commit a4c6bd0821099f6b8c0f64a96ffd9d01a025c413
> introducing a runtime check for alignment for 16 byte atomic
> compare-exchange, load, and store.
> 
> Bootstrapped and regtested on s390.
> Ok for mainline and gcc-{12,11,10}?
> 
> libatomic/ChangeLog:
> 
>   * config/s390/cas_n.c: New file.
>   * config/s390/load_n.c: New file.
>   * config/s390/store_n.c: New file.

Ok. Thanks!

Andreas

> ---
>  libatomic/config/s390/cas_n.c   | 65 +
>  libatomic/config/s390/load_n.c  | 57 +
>  libatomic/config/s390/store_n.c | 54 +++
>  3 files changed, 176 insertions(+)
>  create mode 100644 libatomic/config/s390/cas_n.c
>  create mode 100644 libatomic/config/s390/load_n.c
>  create mode 100644 libatomic/config/s390/store_n.c
> 
> diff --git a/libatomic/config/s390/cas_n.c b/libatomic/config/s390/cas_n.c
> new file mode 100644
> index 000..44b7152ca5d
> --- /dev/null
> +++ b/libatomic/config/s390/cas_n.c
> @@ -0,0 +1,65 @@
> +/* Copyright (C) 2018-2023 Free Software Foundation, Inc.
> +
> +   This file is part of the GNU Atomic Library (libatomic).
> +
> +   Libatomic is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   Libatomic is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#include 
> +
> +
> +/* Analog to config/s390/exch_n.c.  */
> +
> +#if !DONE && N == 16
> +bool
> +SIZE(libat_compare_exchange) (UTYPE *mptr, UTYPE *eptr, UTYPE newval,
> +   int smodel, int fmodel UNUSED)
> +{
> +  if (!((uintptr_t)mptr & 0xf))
> +{
> +  return __atomic_compare_exchange_n (
> + (UTYPE *)__builtin_assume_aligned (mptr, 16), eptr, newval, false,
> + __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
> +}
> +  else
> +{
> +  UTYPE oldval;
> +  UWORD magic;
> +  bool ret;
> +
> +  pre_seq_barrier (smodel);
> +  magic = protect_start (mptr);
> +
> +  oldval = *mptr;
> +  ret = (oldval == *eptr);
> +  if (ret)
> + *mptr = newval;
> +  else
> + *eptr = oldval;
> +
> +  protect_end (mptr, magic);
> +  post_seq_barrier (smodel);
> +
> +  return ret;
> +}
> +}
> +#define DONE 1
> +#endif /* N == 16 */
> +
> +#include "../../cas_n.c"
> diff --git a/libatomic/config/s390/load_n.c b/libatomic/config/s390/load_n.c
> new file mode 100644
> index 000..335d2f8b2c3
> --- /dev/null
> +++ b/libatomic/config/s390/load_n.c
> @@ -0,0 +1,57 @@
> +/* Copyright (C) 2018-2023 Free Software Foundation, Inc.
> +
> +   This file is part of the GNU Atomic Library (libatomic).
> +
> +   Libatomic is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   Libatomic is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#include 
> +
> +
> +/* Analog to config/s390/exch_n.c.  */
> +
> +#if !DONE && N == 16
> +UTYPE
> +SIZE(libat_load) (UTYPE *mptr, int smodel)
> +{
> +  if (!((uintptr_t)mptr & 0xf))
> +{
> +  return __atomic_load_n ((UTYPE *)__builtin_assume_aligned (mptr, 16),
> +   __ATOMIC_SEQ_CST);
> +}
> +  else
> +{
> +  UTYPE ret;
> +  UWORD magic;
> +
> +  pre_seq_barrier (smodel);
> +  magic = protect_start 
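
The pattern in these new files is a runtime dispatch on the pointer's alignment: use the native atomic operation when the address is suitably aligned, otherwise fall back to a lock. The sketch below shows the same idea for an 8 byte load so that it builds without libatomic on a typical 64 bit host. It is not the libatomic code: load_u64 and fallback_lock are hypothetical names, and a simple spinlock stands in for libatomic's protect_start/protect_end.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the lock libatomic takes around misaligned accesses.  */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Load 8 bytes from P.  Assumes a host with native 8 byte atomics.  */
uint64_t
load_u64 (void *p)
{
  uint64_t ret;

  if (((uintptr_t) p & 0x7) == 0)
    /* Naturally aligned: the native atomic load can be used.  */
    return __atomic_load_n ((uint64_t *) p, __ATOMIC_SEQ_CST);

  /* Misaligned: serialize against other users of the same lock.  */
  while (atomic_flag_test_and_set_explicit (&fallback_lock,
					    memory_order_acquire))
    ;
  memcpy (&ret, p, sizeof ret);
  atomic_flag_clear_explicit (&fallback_lock, memory_order_release);
  return ret;
}
```

Note that the locked path is only atomic with respect to other accesses that take the same lock, which is why all operations on a misaligned object have to go through the fallback.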

Re: [PATCH] s390: Use arch14 instead of z16 for -march=native.

2023-03-03 Thread Andreas Krebbel via Gcc-patches
On 3/2/23 19:17, Robin Dapp wrote:
> Hi,
> 
> When compiling on a system where binutils do not yet support the 'z16'
> name, assembling fails with -march=native, which we currently interpret
> as -march=z16 (on a z16 machine).  This patch uses -march=arch14
> instead.
> 
> Is it OK?

Ok. Thanks!

Andreas


> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
>   * config/s390/driver-native.cc (s390_host_detect_local_cpu): Use
>   arch14 instead of z16.
> ---
>  gcc/config/s390/driver-native.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/s390/driver-native.cc b/gcc/config/s390/driver-native.cc
> index 563da45c7f6e..3b9c1e1ca5df 100644
> --- a/gcc/config/s390/driver-native.cc
> +++ b/gcc/config/s390/driver-native.cc
> @@ -125,10 +125,10 @@ s390_host_detect_local_cpu (int argc, const char **argv)
> break;
>   case 0x3931:
>   case 0x3932:
> -   cpu = "z16";
> +   cpu = "arch14";
> break;
>   default:
> -   cpu = "z16";
> +   cpu = "arch14";
> break;
>   }
>   }



Re: [PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-27 Thread Andreas Krebbel via Gcc-patches
On 2/27/23 11:13, Robin Dapp wrote:
>> Do you really need a copy of the address register? Couldn't you just do a
>> src = adjust_address (operands[1], BLKmode, 0);
>> You create a paradoxical subreg of the QImode input but vll actually
>> uses the whole 32 bit value. Couldn't we end up with uninitialized
>> bytes being used as part of the length then? Do we need a zero-extend
>> here?
> 
> v2 attached with these problems addressed.
> 
> Testsuite and bootstrap as before.

Ok. Thanks!

Andreas




Re: [PATCH] IBM zSystems: Do not propagate scheduler state across basic blocks [PR108102]

2023-02-13 Thread Andreas Krebbel via Gcc-patches
On 2/11/23 16:59, Stefan Schulze Frielinghaus wrote:
> So far we propagate scheduler state across basic blocks within EBBs and
> reset the state otherwise.  In certain circumstances the entry block of
> an EBB might be empty, i.e., no_real_insns_p is true.  In those cases
> scheduler state is not reset and subsequently wrong state is propagated
> to following blocks of the same EBB.
> 
> Since the performance benefit of tracking state across basic blocks is
> questionable on modern hardware, simply reset the state for each basic
> block.
> 
> Fix also resetting f{p,x}d_longrunning.
> 
> Bootstrapped and regtested on IBM zSystems.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_bb_fallthru_entry_likely): Remove.
>   (struct s390_sched_state): Initialise to zero.
>   (s390_sched_variable_issue): For better debuggability also emit
>   the current side.
>   (s390_sched_init): Unconditionally reset scheduler state.

Ok. Thanks!

Andreas




Re: [PATCH] IBM zSystems: Fix predicate execute_operation

2023-02-13 Thread Andreas Krebbel via Gcc-patches
On 2/11/23 17:10, Stefan Schulze Frielinghaus wrote:
> Use constrain_operands in order to check whether there exists a valid
> alternative instead of extract_constrain_insn which ICEs in case no
> alternative is found.
> 
> Bootstrapped and regtested on IBM zSystems.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/predicates.md (execute_operation): Use
>   constrain_operands instead of extract_constrain_insn in order to
>   determine whether there exists a valid alternative.

Ok. Thanks!

Andreas



Re: [PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-13 Thread Andreas Krebbel via Gcc-patches
On 2/2/23 09:43, Robin Dapp wrote:
> Hi,
> 
> this patch adds LEN_LOAD/LEN_STORE support for z14 and newer.
> It defines a bias value of -1 and implements the LEN_LOAD and LEN_STORE
> optabs.
> 
> It also includes various vll/vstl testcases adapted from Kewen Lin's patch
> for Power.
> 
> Bootstrapped and regtested on z13-z16.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * config/s390/predicates.md (vll_bias_operand): Add -1 bias.
>   * config/s390/s390.cc (s390_option_override_internal): Make
>   partial vector usage the default from z13 on.
>   * config/s390/vector.md (len_load_v16qi): Add.
>   (len_store_v16qi): Add.

...

> +;
> +; Implement len_load/len_store optabs with vll/vstl.
> +(define_expand "len_load_v16qi"
> +  [(match_operand:V16QI 0 "register_operand")
> +   (match_operand:V16QI 1 "memory_operand")
> +   (match_operand:QI 2 "register_operand")
> +   (match_operand:QI 3 "vll_bias_operand")
> +  ]
> +  "TARGET_VX && TARGET_64BIT"
> +{
> +  rtx src1 = XEXP (operands[1], 0);
> +  rtx src = gen_reg_rtx (Pmode);
> +  emit_move_insn (src, src1);
> +  rtx mem = gen_rtx_MEM (BLKmode, src);

Do you really need a copy of the address register? Couldn't you just do a
src = adjust_address (operands[1], BLKmode, 0);

> +
> +  rtx len = gen_lowpart (SImode, operands[2]);
> +  emit_insn (gen_vllv16qi (operands[0], len, mem));

You create a paradoxical subreg of the QImode input but vll actually uses the
whole 32 bit value. Couldn't we end up with uninitialized bytes being used as
part of the length then? Do we need a zero-extend here?

Bye,

Andreas
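
The zero-extend concern raised in the review above can be illustrated with a
small standalone model. This is only a sketch, not code from the patch: the
function names are hypothetical, and a plain 32-bit integer stands in for the
register whose low byte holds the QImode length.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: REG_CONTENTS is a 32-bit register whose low byte
   holds the QImode length; the upper 24 bits are whatever was left
   there by earlier code.  */

/* A paradoxical subreg merely reinterprets the register, so all 32
   bits are passed on, including the stale upper bytes.  */
static uint32_t
toy_len_paradoxical (uint32_t reg_contents)
{
  return reg_contents;
}

/* Zero-extending from 8 bits keeps only the defined low byte.  */
static uint32_t
toy_len_zero_extended (uint32_t reg_contents)
{
  return (uint32_t) (uint8_t) reg_contents;
}

/* Small self-check showing the difference for a length of 7 with
   garbage in the upper bytes.  */
static void
toy_len_selftest (void)
{
  uint32_t reg = 0xdeadbe07u;
  assert (toy_len_paradoxical (reg) == 0xdeadbe07u);
  assert (toy_len_zero_extended (reg) == 7u);
}
```

Calling toy_len_selftest () from a test driver exercises both variants; only
the zero-extended one yields the intended length.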



[PATCH 2/3] IBM zSystems: Make stack_tie to work with hard frame-pointer

2023-02-01 Thread Andreas Krebbel via Gcc-patches
With this patch a scheduling barrier is created to prevent the insn
setting up the frame-pointer and the instructions which save GPRs to
the stack from being swapped.  Otherwise broken CFI information would
be generated since the stack save insns would use a base register
which is not currently declared as holding the CFA.

Without -mpreserve-args this did not happen because the store multiple
we used for saving the GPRs would also cover the frame-pointer
register and therefore create a dependency on the frame-pointer
hardreg. However, with this patch the stack_tie is emitted regardless
of -mpreserve-args since this in general appears to be the safer
approach.

* config/s390/s390.cc (save_gprs): Use gen_frame_mem.
(restore_gprs): Likewise.
(s390_emit_stack_tie): Make the stack_tie to be dependent on the
frame pointer if a frame-pointer is used.
(s390_emit_prologue): Emit stack_tie when frame-pointer is needed.
* config/s390/s390.md (stack_tie): Add a register operand and
rename to ...
(@stack_tie): ... this.
---
 gcc/config/s390/s390.cc | 17 -
 gcc/config/s390/s390.md |  5 +++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index a9bb610385b..4db5677ce29 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -10898,9 +10898,7 @@ save_gprs (rtx base, int offset, int first, int last)
   int i;
 
   addr = plus_constant (Pmode, base, offset);
-  addr = gen_rtx_MEM (Pmode, addr);
-
-  set_mem_alias_set (addr, get_frame_alias_set ());
+  addr = gen_frame_mem (Pmode, addr);
 
   /* Special-case single register.  */
   if (first == last)
@@ -11012,8 +11010,7 @@ restore_gprs (rtx base, int offset, int first, int last)
   rtx addr, insn;
 
   addr = plus_constant (Pmode, base, offset);
-  addr = gen_rtx_MEM (Pmode, addr);
-  set_mem_alias_set (addr, get_frame_alias_set ());
+  addr = gen_frame_mem (Pmode, addr);
 
   /* Special-case single register.  */
   if (first == last)
@@ -11062,10 +11059,11 @@ s390_load_got (void)
 static void
 s390_emit_stack_tie (void)
 {
-  rtx mem = gen_frame_mem (BLKmode,
-  gen_rtx_REG (Pmode, STACK_POINTER_REGNUM));
-
-  emit_insn (gen_stack_tie (mem));
+  rtx mem = gen_frame_mem (BLKmode, stack_pointer_rtx);
+  if (frame_pointer_needed)
+emit_insn (gen_stack_tie (Pmode, mem, hard_frame_pointer_rtx));
+  else
+emit_insn (gen_stack_tie (Pmode, mem, stack_pointer_rtx));
 }
 
 /* Copy GPRS into FPR save slots.  */
@@ -11676,6 +11674,7 @@ s390_emit_prologue (void)
 
   if (frame_pointer_needed)
 {
+  s390_emit_stack_tie ();
   insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
   RTX_FRAME_RELATED_P (insn) = 1;
 }
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 4828aa08be6..00d39608e1d 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -11590,9 +11590,10 @@
 ; This is used in s390_emit_prologue in order to prevent insns
 ; adjusting the stack pointer to be moved over insns writing stack
 ; slots using a copy of the stack pointer in a different register.
-(define_insn "stack_tie"
+(define_insn "@stack_tie"
   [(set (match_operand:BLK 0 "memory_operand" "+m")
-(unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+(unspec:BLK [(match_dup 0)
+(match_operand:P 1 "register_operand" "r")] UNSPEC_TIE))]
   ""
   ""
   [(set_attr "length" "0")])
-- 
2.39.1



[PATCH 3/3] IBM zSystems: Save argument registers to the stack -mpreserve-args

2023-02-01 Thread Andreas Krebbel via Gcc-patches
This adds support for preserving the content of parameter registers on
the stack and emitting CFI for it. This is useful for applications which
want to implement their own stack unwinding and need access to function
arguments without having to rely on debug information.

With the -mpreserve-args option GPRs and FPRs are saved to the stack
slots which are reserved for stdargs in the register save area.

gcc/ChangeLog:

* config/s390/s390.cc (s390_restore_gpr_p): New function.
(s390_preserve_gpr_arg_in_range_p): New function.
(s390_preserve_gpr_arg_p): New function.
(s390_preserve_fpr_arg_p): New function.
(s390_register_info_stdarg_fpr): Rename to ...
(s390_register_info_arg_fpr): ... this. Add -mpreserve-args handling.
(s390_register_info_stdarg_gpr): Rename to ...
(s390_register_info_arg_gpr): ... this. Add -mpreserve-args handling.
(s390_register_info): Use the renamed functions above.
(s390_optimize_register_info): Likewise.
(save_fpr): Generate CFI for -mpreserve-args.
(save_gprs): Generate CFI for -mpreserve-args. Drop return value.
(s390_emit_prologue): Adjust to changed calling convention of save_gprs.
(s390_optimize_prologue): Likewise.
* config/s390/s390.opt: New option -mpreserve-args

gcc/testsuite/ChangeLog:

* gcc.target/s390/preserve-args-1.c: New test.
* gcc.target/s390/preserve-args-2.c: New test.
---
 gcc/config/s390/s390.cc   | 254 +-
 gcc/config/s390/s390.opt  |   4 +
 .../gcc.target/s390/preserve-args-1.c |  17 ++
 .../gcc.target/s390/preserve-args-2.c |  19 ++
 .../gcc.target/s390/preserve-args-3.c |  19 ++
 5 files changed, 239 insertions(+), 74 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-3.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 4db5677ce29..708b48b5ab6 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -411,6 +411,45 @@ struct s390_address
 #define FP_ARG_NUM_REG (TARGET_64BIT? 4 : 2)
 #define VEC_ARG_NUM_REG 8
 
+/* Return TRUE if GPR REGNO is supposed to be restored in the function
+   epilogue.  */
+static inline bool
+s390_restore_gpr_p (int regno)
+{
+  return (cfun_frame_layout.first_restore_gpr != -1
+ && regno >= cfun_frame_layout.first_restore_gpr
+ && regno <= cfun_frame_layout.last_restore_gpr);
+}
+
+/* Return TRUE if any of the registers in range [FIRST, LAST] is saved
+   because of -mpreserve-args.  */
+static inline bool
+s390_preserve_gpr_arg_in_range_p (int first, int last)
+{
+  int num_arg_regs = MIN (crtl->args.info.gprs + cfun->va_list_gpr_size,
+ GP_ARG_NUM_REG);
+  return (num_arg_regs
+ && s390_preserve_args_p
+ && first <= GPR2_REGNUM + num_arg_regs - 1
+ && last >= GPR2_REGNUM);
+}
+
+static inline bool
+s390_preserve_gpr_arg_p (int regno)
+{
+  return s390_preserve_gpr_arg_in_range_p (regno, regno);
+}
+
+static inline bool
+s390_preserve_fpr_arg_p (int regno)
+{
+  int num_arg_regs = MIN (crtl->args.info.fprs + cfun->va_list_fpr_size,
+ FP_ARG_NUM_REG);
+  return (s390_preserve_args_p
+ && regno <= FPR0_REGNUM + num_arg_regs - 1
+ && regno >= FPR0_REGNUM);
+}
+
 /* A couple of shortcuts.  */
 #define CONST_OK_FOR_J(x) \
CONST_OK_FOR_CONSTRAINT_P((x), 'J', "J")
@@ -9893,61 +9932,89 @@ s390_register_info_gprtofpr ()
 }
 
 /* Set the bits in fpr_bitmap for FPRs which need to be saved due to
-   stdarg.
+   stdarg or -mpreserve-args.
This is a helper routine for s390_register_info.  */
-
 static void
-s390_register_info_stdarg_fpr ()
+s390_register_info_arg_fpr ()
 {
   int i;
-  int min_fpr;
-  int max_fpr;
+  int min_stdarg_fpr = INT_MAX, max_stdarg_fpr = -1;
+  int min_preserve_fpr = INT_MAX, max_preserve_fpr = -1;
+  int min_fpr, max_fpr;
 
   /* Save the FP argument regs for stdarg. f0, f2 for 31 bit and
  f0-f4 for 64 bit.  */
-  if (!cfun->stdarg
-  || !TARGET_HARD_FLOAT
-  || !cfun->va_list_fpr_size
-  || crtl->args.info.fprs >= FP_ARG_NUM_REG)
-return;
+  if (cfun->stdarg
+  && TARGET_HARD_FLOAT
+  && cfun->va_list_fpr_size
+  && crtl->args.info.fprs < FP_ARG_NUM_REG)
+{
+  min_stdarg_fpr = crtl->args.info.fprs;
+  max_stdarg_fpr = min_stdarg_fpr + cfun->va_list_fpr_size - 1;
+  if (max_stdarg_fpr >= FP_ARG_NUM_REG)
+   max_stdarg_fpr = FP_ARG_NUM_REG - 1;
+
+  /* FPR argument regs start at f0.  */
+  min_stdarg_fpr += FPR0_REGNUM;
+  max_stdarg_fpr += FPR0_REGNUM;
+}
 
-  min_fpr = crtl->args.info.fprs;
-  max_fpr = min_fpr + cfun->va_list_fpr_size - 1;
-  if (max_fpr >= FP_ARG_NUM_REG)
-max_fpr = FP_ARG_NUM_REG - 1;
+  if 

[PATCH 1/3] New reg note REG_CFA_NORESTORE

2023-02-01 Thread Andreas Krebbel via Gcc-patches
This patch introduces a new reg note which can be used to tell the CFI
verification in dwarf2cfi that a register is stored without intending
to restore from it.

This is useful e.g. when storing register contents to the stack and
generating CFI for it although the register is not really supposed to
be restored.

gcc/ChangeLog:

* dwarf2cfi.cc (dwarf2out_frame_debug_cfa_restore): Add
EMIT_CFI parameter.
(dwarf2out_frame_debug): Add case for REG_CFA_NORESTORE.
* reg-notes.def (REG_CFA_NOTE): New reg note definition.
---
 gcc/dwarf2cfi.cc  | 15 ++-
 gcc/reg-notes.def |  5 +
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/dwarf2cfi.cc b/gcc/dwarf2cfi.cc
index 1c70bd83f28..57283c10a29 100644
--- a/gcc/dwarf2cfi.cc
+++ b/gcc/dwarf2cfi.cc
@@ -1496,10 +1496,12 @@ dwarf2out_frame_debug_cfa_val_expression (rtx set)
   update_row_reg_save (cur_row, dwf_regno (dest), cfi);
 }
 
-/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note.  */
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE
+   note. When called with EMIT_CFI set to false emitting a CFI
+   statement is suppressed.  */
 
 static void
-dwarf2out_frame_debug_cfa_restore (rtx reg)
+dwarf2out_frame_debug_cfa_restore (rtx reg, bool emit_cfi)
 {
   gcc_assert (REG_P (reg));
 
@@ -1507,7 +1509,8 @@ dwarf2out_frame_debug_cfa_restore (rtx reg)
   if (!span)
 {
   unsigned int regno = dwf_regno (reg);
-  add_cfi_restore (regno);
+  if (emit_cfi)
+   add_cfi_restore (regno);
   update_row_reg_save (cur_row, regno, NULL);
 }
   else
@@ -1522,7 +1525,8 @@ dwarf2out_frame_debug_cfa_restore (rtx reg)
  reg = XVECEXP (span, 0, par_index);
  gcc_assert (REG_P (reg));
  unsigned int regno = dwf_regno (reg);
- add_cfi_restore (regno);
+ if (emit_cfi)
+   add_cfi_restore (regno);
  update_row_reg_save (cur_row, regno, NULL);
}
 }
@@ -2309,6 +2313,7 @@ dwarf2out_frame_debug (rtx_insn *insn)
break;
 
   case REG_CFA_RESTORE:
+  case REG_CFA_NO_RESTORE:
n = XEXP (note, 0);
if (n == NULL)
  {
@@ -2317,7 +2322,7 @@ dwarf2out_frame_debug (rtx_insn *insn)
  n = XVECEXP (n, 0, 0);
n = XEXP (n, 0);
  }
-   dwarf2out_frame_debug_cfa_restore (n);
+   dwarf2out_frame_debug_cfa_restore (n, REG_NOTE_KIND (note) == REG_CFA_RESTORE);
handled_one = true;
break;
 
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index 23de1f13ee9..1f74a605b3e 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -157,6 +157,11 @@ REG_CFA_NOTE (CFA_VAL_EXPRESSION)
first pattern is the register to be restored.  */
 REG_CFA_NOTE (CFA_RESTORE)
 
+/* Like CFA_RESTORE but without actually emitting CFI.  This can be
+   used to tell the verification infrastructure that a register is
+   saved without intending to restore it.  */
+REG_CFA_NOTE (CFA_NO_RESTORE)
+
 /* Attached to insns that are RTX_FRAME_RELATED_P, marks insn that sets
vDRAP from DRAP.  If vDRAP is a register, vdrap_reg is initalized
to the argument, if it is a MEM, it is ignored.  */
-- 
2.39.1
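
The effect of the EMIT_CFI flag described in this patch can be sketched with a
toy model. This is an illustration under stated assumptions, not GCC code: the
struct, the register count and all toy_* names are hypothetical. The point is
that the bookkeeping of saved registers is updated in both cases, while a
restore directive is only produced when the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_REGS 16

/* Toy stand-in for the per-row CFI state; hypothetical, not the
   structure used in dwarf2cfi.cc.  */
struct toy_row
{
  bool saved[TOY_NUM_REGS];   /* Is the register recorded as saved?  */
  size_t restores_emitted;    /* Number of restore directives emitted.  */
};

/* Forget the save of REGNO.  A restore directive is only counted as
   emitted when EMIT_CFI is true, mirroring the flag in the patch.  */
static void
toy_cfa_restore (struct toy_row *row, unsigned regno, bool emit_cfi)
{
  assert (regno < TOY_NUM_REGS);
  if (emit_cfi)
    row->restores_emitted++;
  row->saved[regno] = false;
}

static void
toy_cfa_restore_selftest (void)
{
  struct toy_row row = { { false }, 0 };
  row.saved[2] = true;
  row.saved[6] = true;

  toy_cfa_restore (&row, 6, true);   /* Behaves like a regular restore.  */
  toy_cfa_restore (&row, 2, false);  /* Behaves like the new note.  */

  /* Both saves are forgotten, but only one directive was emitted.  */
  assert (!row.saved[2] && !row.saved[6]);
  assert (row.restores_emitted == 1);
}
```

In this model the verification step would see no outstanding saves for either
register, which is the property the new note is meant to provide.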



[Committed 0/3] IBM zSystems: Add -mpreserve-args option

2023-02-01 Thread Andreas Krebbel via Gcc-patches
This adds support for preserving the content of parameter registers on
the stack and emitting CFI for it. This is useful for applications which
want to implement their own stack unwinding and need access to function
arguments without having to rely on debug information.

With the -mpreserve-args option GPRs and FPRs are saved to the stack
slots which are reserved for stdargs in the register save area.

The introduction of REG_CFA_NORESTORE is a common code change which
was already approved last year.

Bootstrapped and regtested on s390x. Committed to mainline. 

Andreas Krebbel (3):
  New reg note REG_CFA_NORESTORE
  IBM zSystems: Make stack_tie to work with hard frame pointer
  IBM zSystems: Save argument registers to the stack -mpreserve-args

 gcc/config/s390/s390.cc   | 271 --
 gcc/config/s390/s390.md   |   5 +-
 gcc/config/s390/s390.opt  |   4 +
 gcc/dwarf2cfi.cc  |  15 +-
 gcc/reg-notes.def |   5 +
 .../gcc.target/s390/preserve-args-1.c |  17 ++
 .../gcc.target/s390/preserve-args-2.c |  19 ++
 .../gcc.target/s390/preserve-args-3.c |  19 ++
 8 files changed, 265 insertions(+), 90 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-3.c

-- 
2.39.1
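
The range test performed by s390_preserve_gpr_arg_in_range_p in patch 3/3 is
an interval-overlap check between [FIRST, LAST] and the argument registers in
use. A standalone sketch follows; it assumes that argument GPRs start at r2
(as GPR2_REGNUM in the patch suggests) and replaces the compiler state with
plain parameters, so every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: the first GPR used for argument
   passing is r2.  */
#define TOY_FIRST_ARG_GPR 2

/* Return true if any register in [FIRST, LAST] is one of the
   NUM_ARG_REGS argument registers and PRESERVE_ARGS is enabled.  */
static bool
toy_arg_gpr_in_range_p (int first, int last, int num_arg_regs,
                        bool preserve_args)
{
  return (num_arg_regs > 0
          && preserve_args
          && first <= TOY_FIRST_ARG_GPR + num_arg_regs - 1
          && last >= TOY_FIRST_ARG_GPR);
}

static void
toy_arg_gpr_selftest (void)
{
  /* Three argument registers in use: r2, r3 and r4.  */
  assert (toy_arg_gpr_in_range_p (4, 6, 3, true));    /* Overlaps at r4.  */
  assert (toy_arg_gpr_in_range_p (2, 2, 3, true));    /* Single register.  */
  assert (!toy_arg_gpr_in_range_p (6, 15, 3, true));  /* Above the args.  */
  assert (!toy_arg_gpr_in_range_p (0, 1, 3, true));   /* Below the args.  */
  assert (!toy_arg_gpr_in_range_p (2, 4, 3, false));  /* Option disabled.  */
  assert (!toy_arg_gpr_in_range_p (2, 4, 0, true));   /* No arg registers.  */
}
```

The single-register helper in the patch is then just the special case where
FIRST equals LAST.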



Re: [PATCH v2] IBM zSystems: Fix TARGET_D_CPU_VERSIONS

2023-01-24 Thread Andreas Krebbel via Gcc-patches
On 1/24/23 09:47, Stefan Schulze Frielinghaus wrote:
> In the context of D the interpretation of S390, S390X, and SystemZ is a
> bit fuzzy.  The wording S390X was wrongly deprecated in favour of
> SystemZ by commit
> https://github.com/dlang/dlang.org/commit/3b50a4c3faf01c32234d0ef8be5f82915a61c23f
> Thus, SystemZ is used for 64-bit targets, now, and S390 for 31-bit
> targets.  However, in TARGET_D_CPU_VERSIONS depending on TARGET_ZARCH we
> set the CPU version to SystemZ.  This is also the case if compiled for
> 31-bit targets leading to the following error:
> 
> libphobos/libdruntime/core/sys/posix/sys/stat.d:967:13: error: static assert:  '96u == 144u' is false
>   967 | static assert(stat_t.sizeof == 144);
>   | ^
> 
> Thus in order to keep this patch simple I went for keeping SystemZ for
> 64-bit targets and S390, as usual, for 31-bit targets and dropped the
> distinction between ESA and z/Architecture.
> 
> Bootstrapped and regtested on IBM zSystems.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-d.cc (s390_d_target_versions): Fix detection
>   of CPU version.

Ok, thanks!

Andreas



Re: PING: New reg note REG_CFA_NORESTORE

2023-01-11 Thread Andreas Krebbel via Gcc-patches
On 12/27/22 19:23, Jeff Law wrote:
> 
> 
> On 12/13/22 01:55, Andreas Krebbel via Gcc-patches wrote:
>> Hi,
>>
>> I need a way to save registers on the stack and generate proper CFI for it.
>> Since I do not intend to restore them I needed a way to tell the CFI
>> generation step about it:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606128.html
>>
>> Is this ok for mainline?
> Presumably there's validation bits that want to validate that everything 
> saved eventually gets restored?
> 
> There's only one call to dwarf2out_frame_debug_cfa_restore, so ISTM that 
> providing an initializer for the argument isn't needed and just creates 
> an overload (and associated code) that isn't needed.  Why not just 
> remove the default initializer?
> 
> Ok with that change or a good reason why you need to keep the initializer.

Right. I'll remove it. Thanks for having a look!

Bye,

Andreas



[Committed] IBM zSystems: Use NAND instruction to implement bit not

2023-01-11 Thread Andreas Krebbel via Gcc-patches
Bootstrapped and regression tested on s390x.

Committed to mainline.

gcc/ChangeLog:

* config/s390/s390.md (*not): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/s390/not.c: New test.
---
 gcc/config/s390/s390.md |  8 
 gcc/testsuite/gcc.target/s390/not.c | 11 +++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/not.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 0e56fbad44d..4828aa08be6 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -8302,6 +8302,14 @@
   "nrk\t%0,%1,%2"
   [(set_attr "op_type" "RRF")])
 
+; Use NAND for bit inversion
+(define_insn "*not"
+  [(set (match_operand:GPR  0 "register_operand" "=d")
+   (not:GPR (match_operand:GPR 1 "register_operand"  "d")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z15"
+  "nnrk\t%0,%1,%1"
+  [(set_attr "op_type" "RRF")])
 
 ;
 ; Block inclusive or (OC) patterns.
diff --git a/gcc/testsuite/gcc.target/s390/not.c b/gcc/testsuite/gcc.target/s390/not.c
new file mode 100644
index 000..dae95f7d8a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/not.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z15 -mzarch" } */
+
+unsigned long
+foo (unsigned long a)
+{
+  return ~a;
+}
+
+/* { dg-final { scan-assembler-times "\tnngrk\t" 1 { target { lp64 } } } } */
+/* { dg-final { scan-assembler-times "\tnnrk\t" 1 { target { ! lp64 } } } } */
-- 
2.39.0
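
The pattern above relies on the identity that a NAND of a value with itself is
its bitwise complement. A short sketch in plain C makes this explicit; the
helper names are hypothetical and the code only models the arithmetic, not the
instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise NAND of two 64-bit values.  */
static uint64_t
toy_nand64 (uint64_t a, uint64_t b)
{
  return ~(a & b);
}

/* Bit inversion expressed as NAND with both inputs equal, the same
   trick the pattern uses by passing operand 1 twice.  */
static uint64_t
toy_not_via_nand (uint64_t a)
{
  return toy_nand64 (a, a);
}

static void
toy_not_selftest (void)
{
  assert (toy_not_via_nand (0) == UINT64_MAX);
  assert (toy_not_via_nand (UINT64_MAX) == 0);
  assert (toy_not_via_nand (0x00ff00ff00ff00ffULL) == 0xff00ff00ff00ff00ULL);
}
```

Since a & a equals a for every input, the result always matches ~a.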



[Committed] IBM zSystems: Make -fcall-saved-... work.

2023-01-10 Thread Andreas Krebbel via Gcc-patches
Committed to mainline. Bootstrap and regression tests are clean.

gcc/ChangeLog:

* config/s390/s390.cc (s390_register_info): Check call_used_regs
instead of hard-coding the register numbers for call saved
registers.
(s390_optimize_register_info): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/fcall-saved.c: New test.
---
 gcc/config/s390/s390.cc | 10 --
 gcc/testsuite/gcc.target/s390/fcall-saved.c | 11 +++
 2 files changed, 15 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/fcall-saved.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 42177c204f6..a9bb610385b 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -10075,8 +10075,8 @@ s390_register_info ()
 
   memset (cfun_frame_layout.gpr_save_slots, SAVE_SLOT_NONE, 16);
 
-  for (i = 6; i < 16; i++)
-if (clobbered_regs[i])
+  for (i = 0; i < 16; i++)
+if (clobbered_regs[i] && !call_used_regs[i])
   cfun_gpr_save_slot (i) = SAVE_SLOT_STACK;
 
   s390_register_info_stdarg_fpr ();
@@ -10136,10 +10136,8 @@ s390_optimize_register_info ()
|| cfun_frame_layout.save_return_addr_p
|| crtl->calls_eh_return);
 
-  memset (cfun_frame_layout.gpr_save_slots, SAVE_SLOT_NONE, 6);
-
-  for (i = 6; i < 16; i++)
-if (!clobbered_regs[i])
+  for (i = 0; i < 16; i++)
+if (!clobbered_regs[i] || call_used_regs[i])
   cfun_gpr_save_slot (i) = SAVE_SLOT_NONE;
 
   s390_register_info_set_ranges ();
diff --git a/gcc/testsuite/gcc.target/s390/fcall-saved.c b/gcc/testsuite/gcc.target/s390/fcall-saved.c
new file mode 100644
index 000..a08155372f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/fcall-saved.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -fcall-saved-r4" } */
+
+void test(void) {
+asm volatile("nop" ::: "r4");
+}
+
+/* { dg-final { scan-assembler-times "\tstg\t" 1 { target { lp64 } } } } */
+/* { dg-final { scan-assembler-times "\tlg\t" 1 { target { lp64 } } } } */
+/* { dg-final { scan-assembler-times "\tst\t" 1 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "\tl\t" 1 { target { ! lp64 } } } } */
-- 
2.39.0
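
The changed condition in s390_register_info can be modelled outside the
compiler. The sketch below is a toy, not GCC code: the arrays are plain
parameters and all toy_* names are hypothetical. It assumes, as the old loop
bound of 6 implies, that r0 to r5 are call-clobbered by default.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_GPRS 16

/* Mark every GPR which is clobbered in the function and not
   call-clobbered as needing a stack save slot.  Returns the number
   of registers marked.  */
static int
toy_mark_save_slots (const bool clobbered[TOY_NUM_GPRS],
                     const bool call_used[TOY_NUM_GPRS],
                     bool needs_save[TOY_NUM_GPRS])
{
  int count = 0;
  for (int i = 0; i < TOY_NUM_GPRS; i++)
    {
      needs_save[i] = clobbered[i] && !call_used[i];
      count += needs_save[i];
    }
  return count;
}

static void
toy_save_slots_selftest (void)
{
  bool clobbered[TOY_NUM_GPRS] = { false };
  bool call_used[TOY_NUM_GPRS] = { false };
  bool needs_save[TOY_NUM_GPRS];

  /* Assumed default: r0-r5 are call-clobbered.  */
  for (int i = 0; i < 6; i++)
    call_used[i] = true;
  clobbered[4] = true;

  /* By default a clobbered r4 needs no save slot.  */
  assert (toy_mark_save_slots (clobbered, call_used, needs_save) == 0);

  /* Model of -fcall-saved-r4: r4 is no longer call-clobbered.  */
  call_used[4] = false;
  assert (toy_mark_save_slots (clobbered, call_used, needs_save) == 1);
  assert (needs_save[4]);
}
```

With a hard-coded start index of 6 the second case could not be detected,
which is what the new test in the patch checks for.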



PING: New reg note REG_CFA_NORESTORE

2022-12-13 Thread Andreas Krebbel via Gcc-patches
Hi,

I need a way to save registers on the stack and generate proper CFI for it.
Since I do not intend to restore them I needed a way to tell the CFI
generation step about it:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606128.html

Is this ok for mainline?

Bye,

Andreas


[PATCH 2/2] IBM zSystems: Save argument registers to the stack -mpreserve-args

2022-11-14 Thread Andreas Krebbel via Gcc-patches
This adds support for preserving the content of parameter registers on
the stack and emitting CFI for it. This is useful for applications which
want to implement their own stack unwinding and need access to function
arguments.

With the -mpreserve-args option GPRs and FPRs are saved to the stack
slots which are reserved for stdargs in the register save area.

gcc/ChangeLog:

* config/s390/s390.cc (s390_restore_gpr_p): New function.
(s390_preserve_gpr_arg_in_range_p): New function.
(s390_preserve_gpr_arg_p): New function.
(s390_preserve_fpr_args_p): New function.
(s390_preserve_fpr_arg_p): New function.
(s390_register_info_stdarg_fpr): Rename to ...
(s390_register_info_arg_fpr): ... this. Add -mpreserve-args handling.
(s390_register_info_stdarg_gpr): Rename to ...
(s390_register_info_arg_gpr): ... this. Add -mpreserve-args handling.
(s390_register_info): Use the renamed functions above.
(s390_optimize_register_info): Likewise.
(save_fpr): Generate CFI for -mpreserve-args.
(save_gprs): Generate CFI for -mpreserve-args. Drop return value.
(s390_emit_prologue): Adjust to changed calling convention of save_gprs.
(s390_optimize_prologue): Likewise.
* config/s390/s390.opt: New option -mpreserve-args

gcc/testsuite/ChangeLog:

* gcc.target/s390/preserve-args-1.c: New test.
* gcc.target/s390/preserve-args-2.c: New test.
---
 gcc/config/s390/s390.cc   | 263 +-
 gcc/config/s390/s390.opt  |   4 +
 .../gcc.target/s390/preserve-args-1.c |  17 ++
 .../gcc.target/s390/preserve-args-2.c |  19 ++
 4 files changed, 229 insertions(+), 74 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-2.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index f5c75395cf3..5e197b5314b 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -411,6 +411,53 @@ struct s390_address
 #define FP_ARG_NUM_REG (TARGET_64BIT? 4 : 2)
 #define VEC_ARG_NUM_REG 8
 
+/* Return TRUE if GPR REGNO is supposed to be restored in the function
+   epilogue.  */
+static inline bool
+s390_restore_gpr_p (int regno)
+{
+  return (cfun_frame_layout.first_restore_gpr != -1
+ && regno >= cfun_frame_layout.first_restore_gpr
+ && regno <= cfun_frame_layout.last_restore_gpr);
+}
+
+/* Return TRUE if any of the registers in range [FIRST, LAST] is saved
+   because of -mpreserve-args.  */
+static inline bool
+s390_preserve_gpr_arg_in_range_p (int first, int last)
+{
+  int num_arg_regs = MIN (crtl->args.info.gprs + cfun->va_list_gpr_size,
+ GP_ARG_NUM_REG);
+  return (num_arg_regs
+ && s390_preserve_args_p
+ && first <= GPR2_REGNUM + num_arg_regs - 1
+ && last >= GPR2_REGNUM);
+}
+
+static inline bool
+s390_preserve_gpr_arg_p (int regno)
+{
+  return s390_preserve_gpr_arg_in_range_p (regno, regno);
+}
+
+/* Return TRUE if FPR arguments need to be saved onto the stack due to -mpreserve-args.  */
+static inline bool
+s390_preserve_fpr_args_p (void)
+{
+  return (s390_preserve_args_p
+ && (crtl->args.info.fprs + cfun->va_list_fpr_size));
+}
+
+static inline bool
+s390_preserve_fpr_arg_p (int regno)
+{
+  int num_arg_regs = MIN (crtl->args.info.fprs + cfun->va_list_fpr_size,
+ FP_ARG_NUM_REG);
+  return (s390_preserve_args_p
+ && regno <= FPR0_REGNUM + num_arg_regs - 1
+ && regno >= FPR0_REGNUM);
+}
+
 /* A couple of shortcuts.  */
 #define CONST_OK_FOR_J(x) \
CONST_OK_FOR_CONSTRAINT_P((x), 'J', "J")
@@ -9893,61 +9940,90 @@ s390_register_info_gprtofpr ()
 }
 
 /* Set the bits in fpr_bitmap for FPRs which need to be saved due to
-   stdarg.
+   stdarg or -mpreserve-args.
This is a helper routine for s390_register_info.  */
-
 static void
-s390_register_info_stdarg_fpr ()
+s390_register_info_arg_fpr ()
 {
   int i;
-  int min_fpr;
-  int max_fpr;
+  int min_stdarg_fpr = INT_MAX, max_stdarg_fpr = -1;
+  int min_preserve_fpr = INT_MAX, max_preserve_fpr = -1;
+  int min_fpr, max_fpr;
 
   /* Save the FP argument regs for stdarg. f0, f2 for 31 bit and
  f0-f4 for 64 bit.  */
-  if (!cfun->stdarg
-  || !TARGET_HARD_FLOAT
-  || !cfun->va_list_fpr_size
-  || crtl->args.info.fprs >= FP_ARG_NUM_REG)
-return;
+  if (cfun->stdarg
+  && TARGET_HARD_FLOAT
+  && cfun->va_list_fpr_size
+  && crtl->args.info.fprs < FP_ARG_NUM_REG)
+{
+  min_stdarg_fpr = crtl->args.info.fprs;
+  max_stdarg_fpr = min_stdarg_fpr + cfun->va_list_fpr_size - 1;
+  if (max_stdarg_fpr >= FP_ARG_NUM_REG)
+   max_stdarg_fpr = FP_ARG_NUM_REG - 1;
+
+  /* FPR argument regs start at f0.  */
+  min_stdarg_fpr += FPR0_REGNUM;
+  max_stdarg_fpr += FPR0_REGNUM;
+}
+
+  if (s390_preserve_fpr_args_p ())
+{
+  

[PATCH 1/2] New reg note REG_CFA_NORESTORE

2022-11-14 Thread Andreas Krebbel via Gcc-patches
This patch introduces a new reg note which can be used to tell the CFI
verification in dwarf2cfi that a register is stored without intending
to restore from it.

This is useful e.g. when storing register contents to the stack and
generating CFI for it although the register is not really supposed to
be restored.

gcc/ChangeLog:

* dwarf2cfi.cc (dwarf2out_frame_debug_cfa_restore): Add
EMIT_CFI parameter.
(dwarf2out_frame_debug): Add case for REG_CFA_NORESTORE.
* reg-notes.def (REG_CFA_NOTE): New reg note definition.
---
 gcc/dwarf2cfi.cc  | 15 ++-
 gcc/reg-notes.def |  5 +
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/dwarf2cfi.cc b/gcc/dwarf2cfi.cc
index bef3165e691..6686498d7cc 100644
--- a/gcc/dwarf2cfi.cc
+++ b/gcc/dwarf2cfi.cc
@@ -1496,10 +1496,12 @@ dwarf2out_frame_debug_cfa_val_expression (rtx set)
   update_row_reg_save (cur_row, dwf_regno (dest), cfi);
 }
 
-/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE note.  */
+/* A subroutine of dwarf2out_frame_debug, process a REG_CFA_RESTORE
+   note. When called with EMIT_CFI set to false emitting a CFI
+   statement is suppressed.  */
 
 static void
-dwarf2out_frame_debug_cfa_restore (rtx reg)
+dwarf2out_frame_debug_cfa_restore (rtx reg, bool emit_cfi = true)
 {
   gcc_assert (REG_P (reg));
 
@@ -1507,7 +1509,8 @@ dwarf2out_frame_debug_cfa_restore (rtx reg)
   if (!span)
 {
   unsigned int regno = dwf_regno (reg);
-  add_cfi_restore (regno);
+  if (emit_cfi)
+   add_cfi_restore (regno);
   update_row_reg_save (cur_row, regno, NULL);
 }
   else
@@ -1522,7 +1525,8 @@ dwarf2out_frame_debug_cfa_restore (rtx reg)
  reg = XVECEXP (span, 0, par_index);
  gcc_assert (REG_P (reg));
  unsigned int regno = dwf_regno (reg);
- add_cfi_restore (regno);
+ if (emit_cfi)
+   add_cfi_restore (regno);
  update_row_reg_save (cur_row, regno, NULL);
}
 }
@@ -2309,6 +2313,7 @@ dwarf2out_frame_debug (rtx_insn *insn)
break;
 
   case REG_CFA_RESTORE:
+  case REG_CFA_NORESTORE:
n = XEXP (note, 0);
if (n == NULL)
  {
@@ -2317,7 +2322,7 @@ dwarf2out_frame_debug (rtx_insn *insn)
  n = XVECEXP (n, 0, 0);
n = XEXP (n, 0);
  }
-   dwarf2out_frame_debug_cfa_restore (n);
+   dwarf2out_frame_debug_cfa_restore (n, REG_NOTE_KIND (note) == REG_CFA_RESTORE);
handled_one = true;
break;
 
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index 704bc75b0e7..ab08e65eedc 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -157,6 +157,11 @@ REG_CFA_NOTE (CFA_VAL_EXPRESSION)
first pattern is the register to be restored.  */
 REG_CFA_NOTE (CFA_RESTORE)
 
+/* Like CFA_RESTORE but without actually emitting CFI.  This can be
+   used to tell the verification infrastructure that a register is
+   saved without intending to restore it.  */
+REG_CFA_NOTE (CFA_NORESTORE)
+
 /* Attached to insns that are RTX_FRAME_RELATED_P, marks insn that sets
vDRAP from DRAP.  If vDRAP is a register, vdrap_reg is initalized
to the argument, if it is a MEM, it is ignored.  */
-- 
2.38.1



[PATCH 0/2] Preserve argument registers

2022-11-14 Thread Andreas Krebbel via Gcc-patches
This adds support for preserving the content of parameter registers on
the stack and emitting CFI for it. This is useful for applications which
want to implement their own stack unwinding and need access to function
arguments.

A small common code patch was needed to prevent the CFI verification
in dwarf2cfi from complaining about the register saves without restores.

Andreas Krebbel (2):
  New reg note REG_CFA_NORESTORE
  IBM zSystems: Save argument registers to the stack -mpreserve-args

 gcc/config/s390/s390.cc   | 263 +-
 gcc/config/s390/s390.opt  |   4 +
 gcc/dwarf2cfi.cc  |  15 +-
 gcc/reg-notes.def |   5 +
 .../gcc.target/s390/preserve-args-1.c |  17 ++
 .../gcc.target/s390/preserve-args-2.c |  19 ++
 6 files changed, 244 insertions(+), 79 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/preserve-args-2.c

-- 
2.38.1



Re: [PATCH] IBM zSystems: Fix function_ok_for_sibcall [PR106355]

2022-10-19 Thread Andreas Krebbel via Gcc-patches
On 8/17/22 13:50, Stefan Schulze Frielinghaus wrote:
> For a parameter with BLKmode we cannot use REG_NREGS in order to
> determine the number of consecutive registers.  Streamlined this with
> the implementation of s390_function_arg.
> 
> Fix some indentation whitespace, too.
> 
> Assuming bootstrap and regtest are ok for mainline and gcc-{10,11,12},
> ok to install for all of those?
> 
> PR target/106355
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_call_saved_register_used): For a
>   parameter with BLKmode fix determining number of consecutive
>   registers.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/pr106355.h: Common code for new tests.
>   * gcc.target/s390/pr106355-1.c: New test.
>   * gcc.target/s390/pr106355-2.c: New test.
>   * gcc.target/s390/pr106355-3.c: New test.

Ok for all those branches. Please check if the branches are currently open
before committing. GCC 11 and 12 appear to be but I'm not sure if GCC 10 has
been re-opened again. There should be a final 10.5 release some day though.

Thanks!

Andreas
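
The patch body is not quoted in this thread, so the following is only
an illustration of the general idea behind the fix: a BLKmode value
has no register count implied by its mode, so the count has to be
derived from the size in bytes. The helper name and its parameters are
hypothetical.

```c
/* Hypothetical helper: number of registers needed to hold
   SIZE_IN_BYTES bytes when each register holds WORD_BYTES bytes.
   Rounds up so that a partially filled register still counts.  */
static unsigned
regs_for_size (unsigned size_in_bytes, unsigned word_bytes)
{
  return (size_in_bytes + word_bytes - 1) / word_bytes;
}
```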


Re: [PATCH] s390: Fix bootstrap error with checking and -m31

2022-10-19 Thread Andreas Krebbel via Gcc-patches
On 10/19/22 08:22, Robin Dapp wrote:
> Hi,
> 
> since r13-2746 we hit an ICE when bootstrapping with -m31 and
> --enable-checking=all.
> 
> ../../../../libgfortran/ieee/ieee_helper.c: In function
> 'ieee_class_helper_16':
> ../../../../libgfortran/ieee/ieee_helper.c:77:3: internal compiler
> error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at
> rtl.h:1932
>77 |   }
>   |   ^
> ../../../../libgfortran/ieee/ieee_helper.c:87:1: note: in expansion of
> macro 'CLASSMACRO'
>87 | CLASSMACRO(16)
>   | ^~
> 
> This patch fixes the problem by first checking for reload_completed
> and also ensuring that REGNO is only called on reg operands rather
> than subregs.
> 
> Bootstrapped and regtested --with-arch=arch14 and --enable-checking=all.
> 
> Is it OK?
Ok. Thanks!

Andreas



Re: [PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-22 Thread Andreas Krebbel via Gcc-patches
On 8/22/22 17:10, Robin Dapp wrote:
> Hi,
> 
> after discussing off-list, here is v2 of the patch.  We now recognize if
> the permutation mask only refers to the first or the second operand and
> use this later when emitting vpdi.
> 
> Regtested and bootstrapped, no regressions.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> From 1f11a6b89c9b0ad64b480229cd4db06e887a Mon Sep 17 00:00:00 2001
> From: Robin Dapp 
> Date: Fri, 24 Jun 2022 15:17:08 +0200
> Subject: [PATCH v2] s390: Recognize reverse/element swap permute patterns.
> 
> This adds functions to recognize reverse/element swap permute patterns
> for vler, vster as well as vpdi and rotate.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (expand_perm_with_vpdi): Recognize swap pattern.
>   (is_reverse_perm_mask): New function.
>   (expand_perm_with_rot): Recognize reverse pattern.
>   (expand_perm_with_vstbrq): New function.
>   (expand_perm_with_vster): Use vler/vster for element reversal on z15.
>   (vectorize_vec_perm_const_1): Use.
>   (s390_vectorize_vec_perm_const): Add expand functions.
>   * config/s390/vx-builtins.md: Prefer vster over vler.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vperm-rev-z14.c: New test.
>   * gcc.target/s390/vector/vperm-rev-z15.c: New test.
>   * gcc.target/s390/zvector/vec-reve-store-byte.c: Adjust test
>   expectation.

Ok, thanks!

Andreas
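
For illustration, a minimal sketch of what recognizing a reverse
permutation could look like for the single-operand case. This is not
the is_reverse_perm_mask from the patch (which is not quoted here);
the name is_reverse_mask and the byte-array mask representation are
assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if MASK[0..N-1] selects the elements of a single
   N-element vector in reverse order, i.e. MASK[i] == N - 1 - i.  */
static bool
is_reverse_mask (const unsigned char *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if ((size_t) mask[i] != n - 1 - i)
      return false;
  return true;
}
```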


Re: [PATCH] s390: Implement vec_set with vec_merge and, vec_duplicate.

2022-08-16 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 16:48, Robin Dapp wrote:
> Hi,
> 
> similar to other backends this patch implements vec_set via
> vec_merge and vec_duplicate instead of an unspec.  This opens up
> more possibilities to combine instructions.
> 
> Bootstrapped and regtested. No regressions.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Implement vec_set with vec_merge and
>   vec_duplicate.
>   * config/s390/vector.md: Likewise.
>   * config/s390/vx-builtins.md: Likewise.
>   * config/s390/s390.cc (s390_expand_vec_init): Emit new pattern.
>   (print_operand_address): New output modifier.
>   (print_operand): New output modifier.

The way you handle the element selector doesn't look right to me. It appears to
be an index if it is a CONST_INT and a bitmask otherwise. I don't think it is
legal to change operand semantics like this depending on the operand type. This
would break e.g. if LRA would decide to load the immediate index in a register.

Couldn't you make the shift part of the RTX instead and have the parameter
always as an index?

Bye,

Andreas
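
To make the two encodings under discussion concrete, here is a small
standalone sketch (plain C, not GCC internals) of converting an element
index to a one-bit mask and back. The function names are hypothetical;
mask_to_index only models the "N such that 2^N == X" behaviour that
the patch describes for the 'p' modifier.

```c
/* Bitmask with only bit INDEX set, in the style of GEN_INT (1 << i).
   INDEX must be smaller than the width of unsigned long.  */
static unsigned long
index_to_mask (unsigned index)
{
  return 1UL << index;
}

/* Return N such that 2^N == MASK, or -1 if MASK is not a power
   of two.  */
static int
mask_to_index (unsigned long mask)
{
  if (mask == 0 || (mask & (mask - 1)) != 0)
    return -1;
  int n = 0;
  while ((mask >>= 1) != 0)
    n++;
  return n;
}
```

The conversion back only works for constants, which is the core of the
concern above: a value that lives in a register cannot be reinterpreted
at output time.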

> ---
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index c86b26933d7a..ff89fb83360a 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -7073,11 +7073,10 @@ s390_expand_vec_init (rtx target, rtx vals)
>if (!general_operand (elem, GET_MODE (elem)))
>   elem = force_reg (inner_mode, elem);
> 
> -  emit_insn (gen_rtx_SET (target,
> -   gen_rtx_UNSPEC (mode,
> -   gen_rtvec (3, elem,
> -  GEN_INT (i), target),
> -   UNSPEC_VEC_SET)));
> +  emit_insn
> + (gen_rtx_SET
> +  (target, gen_rtx_VEC_MERGE
> +   (mode, gen_rtx_VEC_DUPLICATE (mode, elem), target, GEN_INT (1 << i))));
>  }
>  }
> 
> @@ -8057,6 +8056,8 @@ print_operand_address (FILE *file, rtx addr)
>  'S': print S-type memory reference (base+displacement).
>  'Y': print address style operand without index (e.g. shift count or setmem
>    operand).
> +'P': print address-style operand without index but with the offset as
> +  if it were specified by a 'p' format flag.
> 
>  'b': print integer X as if it's an unsigned byte.
>  'c': print integer X as if it's an signed byte.
> @@ -8068,6 +8069,7 @@ print_operand_address (FILE *file, rtx addr)
>  'k': print the first nonzero SImode part of X.
>  'm': print the first SImode part unequal to -1 of X.
>  'o': print integer X as if it's an unsigned 32bit word.
> +'p': print N such that 2^N == X (X must be a power of 2 and const int).
>  's': "start" of contiguous bitmask X in either DImode or vector inner mode.
>  't': CONST_INT: "start" of contiguous bitmask X in SImode.
>CONST_VECTOR: Generate a bitmask for vgbm instruction.
> @@ -8237,6 +8239,16 @@ print_operand (FILE *file, rtx x, int code)
>print_shift_count_operand (file, x);
>return;
> 
> +case 'P':
> +  if (CONST_INT_P (x))
> + {
> +   ival = exact_log2 (INTVAL (x));
> +   fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
> + }
> +  else
> + print_shift_count_operand (file, x);
> +  return;
> +
>  case 'K':
>    /* Append @PLT to both local and non-local symbols in order to support
>Linux Kernel livepatching: patches contain individual functions and
> @@ -8321,6 +8333,9 @@ print_operand (FILE *file, rtx x, int code)
>   case 'o':
>     ival &= 0xffffffff;
> break;
> + case 'p':
> +   ival = exact_log2 (INTVAL (x));
> +   break;
>   case 'e': case 'f':
>   case 's': case 't':
> {
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index f37d8fd33a15..a82db4c624fa 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -183,7 +183,6 @@ (define_c_enum "unspec" [
> UNSPEC_VEC_GFMSUM_128
> UNSPEC_VEC_GFMSUM_ACCUM
> UNSPEC_VEC_GFMSUM_ACCUM_128
> -   UNSPEC_VEC_SET
> 
> UNSPEC_VEC_VSUMG
> UNSPEC_VEC_VSUMQ
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index c50451a8326c..bde3a39db3d4 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -467,12 +467,17 @@ (define_insn "mov"
>  ; vec_set is supposed to *modify* an existing vector so operand 0 is
>  ; duplicated as input operand.
>  (define_expand "vec_set"
> -  [(set (match_operand:V0 "register_operand"  "")
> - (unspec:V [(match_operand: 1 "general_operand"   "")
> -(match_operand:SI2 "nonmemory_operand" "")
> -(match_dup 0)]
> -UNSPEC_VEC_SET))]
> -  "TARGET_VX")
> +  [(set (match_operand:V  0 "register_operand" "")
> + (vec_merge:V
> +   (vec_duplicate:V
> + (match_operand: 1 "general_operand" ""))

Re: [PATCH] s390: Implement vec_extract via vec_select.

2022-08-16 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 16:19, Robin Dapp wrote:
> Hi,
> 
> vec_select can handle dynamic/runtime masks nowadays.  Therefore we can
> get rid of the UNSPEC_VEC_EXTRACT that was preventing further
> optimizations like combining instructions with vec_extract patterns.
> 
> Bootstrapped and regtested. No regressions.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Remove UNSPEC_VEC_EXTRACT.
>   * config/s390/vector.md: Rewrite patterns to use vec_select.
>   * config/s390/vx-builtins.md (vec_scatter_element_SI):
>   Likewise.

Ok. Thanks!

Andreas


Re: [PATCH] s390: Use vpdi and verllg in vec_reve.

2022-08-15 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 12:13, Robin Dapp wrote:
> Hi,
> 
> swapping the two elements of a V2DImode or V2DFmode vector can be done
> with vpdi instead of using the generic way of loading a permutation mask
> from the literal pool and vperm.
> 
> Analogous to the V2DI/V2DF case, reversing the elements of a four-element
> vector can be done by first swapping the elements of the first
> doubleword as well as the ones of the second one and subsequently rotating
> the doublewords by 32 bits.
> 
> Bootstrapped and regtested, no regressions.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   PR target/100869
>   * config/s390/vector.md (@vpdi4_2): New pattern.
>   (rotl3_di): New pattern.
>   * config/s390/vx-builtins.md: Use vpdi and verll for reversing
>   elements.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zvector/vec-reve-int-long.c: New test.

Ok. Thanks!

Andreas
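
A small model of the four-element case in plain C may help. It treats
the vector as two 64-bit doublewords with two 32-bit elements each and
reverses the element order by exchanging the doublewords and rotating
each one by 32 bits. The struct and function names are hypothetical,
and the mapping of the two steps to vpdi and verllg is an assumption
based on the patch description, not something taken from the patch.

```c
#include <stdint.h>

/* A 128-bit vector modelled as two 64-bit doublewords; each doubleword
   holds two 32-bit elements, the first one in the upper half.  */
struct v128 { uint64_t dw[2]; };

/* Rotate a doubleword by 32 bits, which swaps its two elements.  */
static uint64_t
rotl64_by_32 (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* Reverse the four 32-bit elements: exchange the doublewords and
   swap the elements inside each of them.  */
static struct v128
reverse_v4si (struct v128 v)
{
  struct v128 r;
  r.dw[0] = rotl64_by_32 (v.dw[1]);
  r.dw[1] = rotl64_by_32 (v.dw[0]);
  return r;
}
```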


Re: [PATCH] s390: Add z15 to s390_issue_rate.

2022-08-15 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 12:02, Robin Dapp wrote:
> Hi,
> 
> this patch tries to be more explicit by mentioning z15 in s390_issue_rate.
> 
> No changes in testsuite, bootstrap or SPEC obviously.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_issue_rate): Add z15.
> ---
>  gcc/config/s390/s390.cc | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index ef38fbe68c84..528cd8c7f0f6 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -8582,6 +8582,7 @@ s390_issue_rate (void)
>  case PROCESSOR_2827_ZEC12:
>  case PROCESSOR_2964_Z13:
>  case PROCESSOR_3906_Z14:
> +case PROCESSOR_8561_Z15:
>  case PROCESSOR_3931_Z16:
>  default:
>return 1;

Ok. Thanks!

Andreas



Re: [PATCH] s390: Add -munroll-only-small-loops.

2022-08-15 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 12:00, Robin Dapp wrote:
> Hi,
> 
> inspired by Power we also introduce -munroll-only-small-loops.  This
> implies activating -funroll-loops and -munroll-only-small-loops at -O2
> and above.
> 
> Bootstrapped and regtested.
> 
> This introduces one regression in gcc.dg/sms-compare-debug-1.c but
> currently dumps for sms are broken as well.  The difference is in the
> location of some INSN_DELETED notes so I would consider this a minor issue.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * common/config/s390/s390-common.cc: Enable -funroll-loops and
>   -munroll-only-small-loops for OPT_LEVELS_2_PLUS_SPEED_ONLY.
>   * config/s390/s390.cc (s390_loop_unroll_adjust): Do not unroll
>   loops larger than 12 instructions.
>   (s390_override_options_after_change): Set unroll options.
>   (s390_option_override_internal): Likewise.
>   * config/s390/s390.opt: Document munroll-only-small-loops.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-copysign.c: Do not unroll.
>   * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Ditto.
>   * gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: Ditto.
>   * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Ditto.
>   * gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: Ditto.

Ok. Thanks!

Andreas
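
The ChangeLog entry for s390_loop_unroll_adjust states the rule as "do
not unroll loops larger than 12 instructions". A minimal sketch of
such a size gate, with a hypothetical function name and signature (the
real hook is not quoted in this thread):

```c
/* Hypothetical model: keep the requested unroll factor only for loops
   of at most 12 instructions, otherwise do not unroll (factor 1).  */
static unsigned
small_loop_unroll_factor (unsigned requested, unsigned loop_insns)
{
  const unsigned max_insns = 12;
  return loop_insns <= max_insns ? requested : 1;
}
```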


Re: [PATCH] PR106342 - IBM zSystems: Provide vsel for all vector modes

2022-08-10 Thread Andreas Krebbel via Gcc-patches
On 8/10/22 13:42, Ilya Leoshkevich wrote:
> On Wed, 2022-08-03 at 12:20 +0200, Ilya Leoshkevich wrote:
>> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
>>
>>
>>
>> dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
>> produces an insn that vsel is supposed to recognize, but can't,
>> because it's not defined for V2SF.  Fix by defining it for all vector
>> modes supported by copysign3.
>>
>> gcc/ChangeLog:
>>
>> * config/s390/vector.md (V_HW_FT): New iterator.
>> * config/s390/vx-builtins.md (vsel): Use V instead of
>> V_HW.
>> ---
>>  gcc/config/s390/vector.md  |  6 ++
>>  gcc/config/s390/vx-builtins.md | 12 ++--
>>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> Jakub pointed out that this is broken in gcc-12 as well.
> The patch applies cleanly, and I started a bootstrap/regtest.
> Ok for gcc-12?

Yes. Thanks!

Andreas


Re: [PATCH] PR106342 - IBM zSystems: Provide vsel for all vector modes

2022-08-03 Thread Andreas Krebbel via Gcc-patches
On 8/3/22 12:20, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
> produces an insn that vsel is supposed to recognize, but can't,
> because it's not defined for V2SF.  Fix by defining it for all vector
> modes supported by copysign3.
> 
> gcc/ChangeLog:
> 
>   * config/s390/vector.md (V_HW_FT): New iterator.
>   * config/s390/vx-builtins.md (vsel): Use V instead of
>   V_HW.

Ok. There is a typo in the changelog:
"Use *V* instead ..." should probably read "Use V_HW_FT instead ..."

Thanks,

Andreas
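
For readers not familiar with the operation: the vsel pattern in the
quoted patch is a bitwise select, (op1 & op3) | (~op3 & op2). The
scalar sketch below shows the same computation on 32-bit values. The
function name is made up for this example, and the copysign use in
the usage note is an illustration rather than the expander's code.

```c
#include <stdint.h>

/* Bitwise select: take bits from A where SEL is 1 and from B where
   SEL is 0, i.e. (A & SEL) | (~SEL & B).  */
static uint32_t
bit_select (uint32_t a, uint32_t b, uint32_t sel)
{
  return (a & sel) | (~sel & b);
}
```

With sel set to the sign bit only, bit_select (sign_source, magnitude,
0x80000000) yields the bit pattern of copysign for IEEE single
precision, which is why copysign depends on vsel being available for
the mode in question.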

> ---
>  gcc/config/s390/vector.md  |  6 ++
>  gcc/config/s390/vx-builtins.md | 12 ++--
>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index a6c4b4eb974..624729814af 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -63,6 +63,12 @@
>  V1DF V2DF
>  (V1TF "TARGET_VXE") (TF "TARGET_VXE")])
>  
> +; All modes present in V_HW and VFT.
> +(define_mode_iterator V_HW_FT [V16QI V8HI V4SI V2DI (V1TI "TARGET_VXE") V1DF
> +V2DF (V1SF "TARGET_VXE") (V2SF "TARGET_VXE")
> +(V4SF "TARGET_VXE") (V1TF "TARGET_VXE")
> +(TF "TARGET_VXE")])
> +
>  ; FP vector modes directly supported by the HW.  This does not include
>  ; vector modes using only part of a vector register and should be used
>  ; for instructions which might trigger IEEE exceptions.
> diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
> index d5130799804..98ee08b2683 100644
> --- a/gcc/config/s390/vx-builtins.md
> +++ b/gcc/config/s390/vx-builtins.md
> @@ -517,12 +517,12 @@
>  ; swapped in s390-c.cc when we get here.
>  
>  (define_insn "vsel"
> -  [(set (match_operand:V_HW  0 "register_operand" "=v")
> - (ior:V_HW
> -  (and:V_HW (match_operand:V_HW   1 "register_operand"  "v")
> -(match_operand:V_HW   3 "register_operand"  "v"))
> -  (and:V_HW (not:V_HW (match_dup 3))
> -    (match_operand:V_HW   2 "register_operand"  "v"))))]
> +  [(set (match_operand:V_HW_FT   0 "register_operand" "=v")
> + (ior:V_HW_FT
> +  (and:V_HW_FT (match_operand:V_HW_FT 1 "register_operand"  "v")
> +   (match_operand:V_HW_FT 3 "register_operand"  "v"))
> +  (and:V_HW_FT (not:V_HW_FT (match_dup 3))
> +   (match_operand:V_HW_FT 2 "register_operand"  "v"))))]
>"TARGET_VX"
>"vsel\t%v0,%1,%2,%3"
>[(set_attr "op_type" "VRR")])



[PATCH 1/1] PR 106101: IBM zSystems: Fix strict_low_part problem

2022-07-29 Thread Andreas Krebbel via Gcc-patches
This avoids generating illegal (strict_low_part (reg ...)) RTXs. This
required two changes:

1. Do not use gen_lowpart to generate the inner expression of a
STRICT_LOW_PART.  gen_lowpart might fold the SUBREG either because
there is already a paradoxical subreg or because it can directly be
applied to the register. A new wrapper function makes sure that we
always end up having an actual SUBREG.

2. Change the movstrict patterns to enforce a SUBREG as inner operand
of the STRICT_LOW_PARTs.  The new predicate introduced for the
destination operand requires a SUBREG expression with a
register_operand as inner operand.  However, since reload strips away
the majority of the SUBREGs we have to accept single registers as well
once we reach reload.

Bootstrapped and regression tested on IBM zSystems 64 bit.

gcc/ChangeLog:

PR target/106101
* config/s390/predicates.md (subreg_register_operand): New
predicate.
* config/s390/s390-protos.h (s390_gen_lowpart_subreg): New
function prototype.
* config/s390/s390.cc (s390_gen_lowpart_subreg): New function.
(s390_expand_insv): Use s390_gen_lowpart_subreg instead of
gen_lowpart.
* config/s390/s390.md ("*get_tp_64", "*zero_extendhisi2_31")
("*zero_extendqisi2_31", "*zero_extendqihi2_31"): Likewise.
("movstrictqi", "movstricthi", "movstrictsi"): Use the
subreg_register_operand predicate instead of register_operand.

gcc/testsuite/ChangeLog:

PR target/106101
* gcc.c-torture/compile/pr106101.c: New test.
---
 gcc/config/s390/predicates.md | 12 
 gcc/config/s390/s390-protos.h |  1 +
 gcc/config/s390/s390.cc   | 27 +++-
 gcc/config/s390/s390.md   | 36 +--
 .../gcc.c-torture/compile/pr106101.c  | 62 +++
 5 files changed, 116 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr106101.c

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 33194d3f3d6..430cf6edfd6 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -594,3 +594,15 @@
 (define_predicate "addv_const_operand"
   (and (match_code "const_int")
(match_test "INTVAL (op) >= -32768 && INTVAL (op) <= 32767")))
+
+; Match (subreg (reg ...)) operands.
+; Used for movstrict destination operands
+; When replacing pseudos with hard regs reload strips away the
+; subregs. Accept also plain registers then to prevent the insn from
+; becoming unrecognizable.
+(define_predicate "subreg_register_operand"
+  (ior (and (match_code "subreg")
+   (match_test "register_operand (SUBREG_REG (op), GET_MODE (SUBREG_REG (op)))"))
+   (and (match_code "reg")
+   (match_test "reload_completed || reload_in_progress")
+   (match_test "register_operand (op, GET_MODE (op))"))))
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index fd4acaae44a..765d843a418 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -50,6 +50,7 @@ extern void s390_set_has_landing_pad_p (bool);
 extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int s390_class_max_nregs (enum reg_class, machine_mode);
 extern bool s390_return_addr_from_memory(void);
+extern rtx s390_gen_lowpart_subreg (machine_mode, rtx);
 extern bool s390_fma_allowed_p (machine_mode);
 #if S390_USE_TARGET_ATTRIBUTE
 extern tree s390_valid_target_attribute_tree (tree args,
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 5aaf76a9490..5e06bf9350c 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -458,6 +458,31 @@ s390_return_addr_from_memory ()
   return cfun_gpr_save_slot(RETURN_REGNUM) == SAVE_SLOT_STACK;
 }
 
+/* Generate a SUBREG for the MODE lowpart of EXPR.
+
+   In contrast to gen_lowpart it will always return a SUBREG
+   expression.  This is useful to generate STRICT_LOW_PART
+   expressions.  */
+rtx
+s390_gen_lowpart_subreg (machine_mode mode, rtx expr)
+{
+  rtx lowpart = gen_lowpart (mode, expr);
+
+  /* There might be no SUBREG in case it could be applied to the hard
+ REG rtx or it could be folded with a paradoxical subreg.  Bring
+ it back.  */
+  if (!SUBREG_P (lowpart))
+{
+  machine_mode reg_mode = TARGET_ZARCH ? DImode : SImode;
+  gcc_assert (REG_P (lowpart));
+  lowpart = gen_lowpart_SUBREG (mode,
+   gen_rtx_REG (reg_mode,
+REGNO (lowpart)));
+}
+
+  return lowpart;
+}
+
 /* Return nonzero if it's OK to use fused multiply-add for MODE.  */
 bool
 s390_fma_allowed_p (machine_mode mode)
@@ -6520,7 +6545,7 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
   /* Emit a strict_low_part pattern if possible.  */
   if (smode_bsize == bitsize && bitpos == mode_bsize - smode_bsize)
{
- rtx 

Re: GCC 11.2.1 Status Report (2022-04-13), branch frozen for release

2022-04-14 Thread Andreas Krebbel via Gcc-patches
On 4/13/22 09:30, Richard Biener via Gcc wrote:
> 
> Status
> ==
> 
> The gcc-11 branch is now frozen in preparation for a GCC 11.3 release
> candidate and the GCC 11.3 release next week.  All changes now require
> release manager approval.

Hi,

I would like to push:

https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593103.html

to GCC 11 branch before 11.3 release. Ok?

Bye,

Andreas


Re: [PATCH] s390: Add scheduler description for z16

2022-04-14 Thread Andreas Krebbel via Gcc-patches
On 4/13/22 12:23, Robin Dapp wrote:
> Hi,
> 
> this patch adds the scheduler description for z16.  Bootstrapped and
> regtested with --with-arch=z16.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_get_sched_attrmask): Add z16.
>   (s390_get_unit_mask): Likewise.
>   (s390_is_fpd): Likewise.
>   (s390_is_fxd): Likewise.
>   * config/s390/s390.md (z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13,z14,z15):
>   Add z16.
>   (z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13,z14,z15,z16):
>   Likewise.
>   * config/s390/3931.md: New file.

Ok. Thanks!

Andreas




Re: [PATCH] testsuite/s390: Silence warning in pr80725.c

2022-04-14 Thread Andreas Krebbel via Gcc-patches
On 4/13/22 09:35, Robin Dapp wrote:
> Hi,
> 
> this test case checks that we do not ICE but FAILs because of
> -Wint-to-pointer-cast.  Silence this warning.
> 
> Is it OK?

Ok. Thanks!

Andreas



Re: [PATCH] testsuite: Skip pr105250.c for powerpc and s390 [PR105266]

2022-04-14 Thread Andreas Krebbel via Gcc-patches
On 4/14/22 05:10, Kewen.Lin wrote:
> Hi,
> 
> The test case pr105250.c is like its related pr105140.c, which
> suffers the error with message like "{AltiVec,vector} argument
> passed to unprototyped" on powerpc and s390.  So like commits
> r12-8025 and r12-8039, this fix is to add the dg-skip-if for
> powerpc*-*-* and s390*-*-*.
> 
> Tested on powerpc64le-linux-gnu P9 and it should work on s390
> as its similar PR105147.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> 
> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/105266
>   * gcc.dg/pr105250.c: Skip for powerpc*-*-* and s390*-*-*.

Ok for s390. Thanks!

Andreas


[Committed] IBM zSystems: Add support for z16 as CPU name.

2022-04-12 Thread Andreas Krebbel via Gcc-patches
So far z16 was identified as arch14. After the machine has been
announced we can now add the real name.
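
As a sketch of the -march=native part, the function below maps a
machine type number to a CPU name using only the cases visible in the
driver-native.cc hunk of this patch. The function name and its shape
are hypothetical; the real code lives inside
s390_host_detect_local_cpu and handles many more machine types.

```c
/* Map a machine type number to a CPU name, covering only the cases
   visible in the quoted hunk.  */
static const char *
cpu_name_for_machine (unsigned machine_id)
{
  switch (machine_id)
    {
    case 0x8562:
      return "z15";
    case 0x3931:
    case 0x3932:
      return "z16";
    default:
      /* Unknown machines fall back to the newest known name.  */
      return "z16";
    }
}
```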

gcc/ChangeLog:

* common/config/s390/s390-common.cc: Rename PF_ARCH14 to PF_Z16.
* config.gcc: Add z16 as march/mtune switch.
* config/s390/driver-native.cc (s390_host_detect_local_cpu):
Recognize z16 with -march=native.
* config/s390/s390-opts.h (enum processor_type): Rename
PROCESSOR_ARCH14 to PROCESSOR_3931_Z16.
* config/s390/s390.cc (PROCESSOR_ARCH14): Rename to ...
(PROCESSOR_3931_Z16): ... throughout the file.
(s390_processor processor_table): Add z16 as cpu string.
* config/s390/s390.h (enum processor_flags): Rename PF_ARCH14 to
PF_Z16.
(TARGET_CPU_ARCH14): Rename to ...
(TARGET_CPU_Z16): ... this.
(TARGET_CPU_ARCH14_P): Rename to ...
(TARGET_CPU_Z16_P): ... this.
(TARGET_ARCH14): Rename to ...
(TARGET_Z16): ... this.
(TARGET_ARCH14_P): Rename to ...
(TARGET_Z16_P): ... this.
* config/s390/s390.md (cpu_facility): Rename arch14 to z16 and
check TARGET_Z16 instead of TARGET_ARCH14.
* config/s390/s390.opt: Add z16 to processor_type.
* doc/invoke.texi: Document z16 and arch14.
---
 gcc/common/config/s390/s390-common.cc |  4 ++--
 gcc/config.gcc|  2 +-
 gcc/config/s390/driver-native.cc  |  6 +-
 gcc/config/s390/s390-opts.h   |  2 +-
 gcc/config/s390/s390.cc   | 14 --
 gcc/config/s390/s390.h| 16 
 gcc/config/s390/s390.md   |  6 +++---
 gcc/config/s390/s390.opt  |  5 -
 gcc/doc/invoke.texi   |  3 ++-
 9 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.cc b/gcc/common/config/s390/s390-common.cc
index caec2f14c6c..72a5ef47eaa 100644
--- a/gcc/common/config/s390/s390-common.cc
+++ b/gcc/common/config/s390/s390-common.cc
@@ -50,10 +50,10 @@ EXPORTED_CONST int processor_flags_table[] =
 /* z15 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15,
-/* arch14 */ PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+/* z16 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15
-| PF_NNPA | PF_ARCH14
+| PF_NNPA | PF_Z16
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 48a5bbcf787..c5064dd3766 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5532,7 +5532,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11 | arch12 | arch13 | arch14 )
+   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | z14 | z15 | z16 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11 | arch12 | arch13 | arch14 )
# OK
;;
*)
diff --git a/gcc/config/s390/driver-native.cc b/gcc/config/s390/driver-native.cc
index 48524c49251..b5eb222872d 100644
--- a/gcc/config/s390/driver-native.cc
+++ b/gcc/config/s390/driver-native.cc
@@ -123,8 +123,12 @@ s390_host_detect_local_cpu (int argc, const char **argv)
case 0x8562:
  cpu = "z15";
  break;
+   case 0x3931:
+   case 0x3932:
+ cpu = "z16";
+ break;
default:
- cpu = "arch14";
+ cpu = "z16";
  break;
}
}
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index 1ec84631a5f..4ef82ac5d34 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,7 +38,7 @@ enum processor_type
   PROCESSOR_2964_Z13,
   PROCESSOR_3906_Z14,
   PROCESSOR_8561_Z15,
-  PROCESSOR_ARCH14,
+  PROCESSOR_3931_Z16,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..1342a2e7db0 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -337,7 +337,7 @@ const struct s390_processor processor_table[] =
   { "z13","z13",PROCESSOR_2964_Z13,_cost,  11 },
   { "z14","arch12", PROCESSOR_3906_Z14,_cost,  12 },
   { "z15","arch13", PROCESSOR_8561_Z15,_cost,  13 },
-  { "arch14", "arch14", PROCESSOR_ARCH14,  _cost,  14 },
+  { "z16","arch14", PROCESSOR_3931_Z16,_cost,  14 },
   { 

[PATCH] v2 PR102024 - IBM Z: Add psabi diagnostics

2022-04-11 Thread Andreas Krebbel via Gcc-patches
v2:

- Remove redundant num_zero_width_bf_seen and num_fields_seen
  tracking. (Thanks Stefan Schulze-Frielinghaus)

Re-tested with testsuite and ABI tests.



For IBM Z in particular there is a problem with structs like:

struct A { float a; int :0; };

Our ABI document allows passing a struct in an FPR only if it has
exactly one member. On the other hand it says that structs of 1,2,4,8
bytes are passed in a GPR. So this struct is expected to be passed in
a GPR. Since we don't return structs in registers (regardless of the
number of members) it is always returned in memory.

Situation is as follows:

All compiler versions tested return it in memory - as expected.

gcc 11, gcc 12, g++ 12, and clang 13 pass it in a GPR - as expected.

g++ 11 as well as clang++ 13 pass it in an FPR.

For IBM Z we stick to the current GCC 12 behavior, i.e. zero-width
bitfields are NOT ignored.  A struct as above will be passed in a
GPR. The rationale behind this is that not affecting the C ABI is more
important here.
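
To see why this struct sits between the two ABI rules, note that the
zero-width bit-field typically adds no storage. The sketch below
assumes a target where float and int are both 4 bytes; the helper
function exists only to make the size observable and is not part of
the patch.

```c
#include <stddef.h>

/* The struct from the description: one float member followed by a
   zero-width bit-field.  */
struct A { float a; int :0; };

/* Size of the struct in bytes.  */
static size_t
size_of_A (void)
{
  return sizeof (struct A);
}
```

A 4-byte struct matches the "1,2,4,8 bytes go in a GPR" rule, while
whether it counts as having exactly one member depends on whether the
zero-width bit-field is ignored.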

A patch for clang is in progress: https://reviews.llvm.org/D122388

In addition to the usual regression test I ran the compat and
struct-layout-1 testsuites comparing the compiler before and after the
patch.

gcc/ChangeLog:
PR target/102024
* config/s390/s390-protos.h (s390_function_arg_vector): Remove
prototype.
* config/s390/s390.cc (s390_single_field_struct_p): New function.
(s390_function_arg_vector): Invoke s390_single_field_struct_p.
(s390_function_arg_float): Likewise.

gcc/testsuite/ChangeLog:
PR target/102024
* g++.target/s390/pr102024-1.C: New test.
* g++.target/s390/pr102024-2.C: New test.
* g++.target/s390/pr102024-3.C: New test.
* g++.target/s390/pr102024-4.C: New test.
* g++.target/s390/pr102024-5.C: New test.
* g++.target/s390/pr102024-6.C: New test.
---
 gcc/config/s390/s390-protos.h  |   1 -
 gcc/config/s390/s390.cc| 208 +++--
 gcc/testsuite/g++.target/s390/pr102024-1.C |  12 ++
 gcc/testsuite/g++.target/s390/pr102024-2.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-3.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-4.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-5.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-6.C |  12 ++
 8 files changed, 187 insertions(+), 104 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-1.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-2.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-3.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-4.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-5.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-6.C

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index e6251595870..fd4acaae44a 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -49,7 +49,6 @@ extern void s390_function_profiler (FILE *, int);
 extern void s390_set_has_landing_pad_p (bool);
 extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int s390_class_max_nregs (enum reg_class, machine_mode);
-extern bool s390_function_arg_vector (machine_mode, const_tree);
 extern bool s390_return_addr_from_memory(void);
 extern bool s390_fma_allowed_p (machine_mode);
 #if S390_USE_TARGET_ATTRIBUTE
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..c091d2a692a 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -12148,29 +12148,26 @@ s390_function_arg_size (machine_mode mode, const_tree type)
   gcc_unreachable ();
 }
 
-/* Return true if a function argument of type TYPE and mode MODE
-   is to be passed in a vector register, if available.  */
-
-bool
-s390_function_arg_vector (machine_mode mode, const_tree type)
+/* Return true if a variable of TYPE should be passed as single value
+   with type CODE. If STRICT_SIZE_CHECK_P is true the sizes of the
+   record type and the field type must match.
+
+   The ABI says that record types with a single member are treated
+   just like that member would be.  This function is a helper to
+   detect such cases.  The function also produces the proper
+   diagnostics for cases where the outcome might be different
+   depending on the GCC version.  */
+static bool
+s390_single_field_struct_p (enum tree_code code, const_tree type,
+   bool strict_size_check_p)
 {
-  if (!TARGET_VX_ABI)
-    return false;
-
-  if (s390_function_arg_size (mode, type) > 16)
-    return false;
-
-  /* No type info available for some library calls ...  */
-  if (!type)
-    return VECTOR_MODE_P (mode);
-
-  /* The ABI says that record types with a single member are treated
-     just like that member would be.  */
   int empty_base_seen = 0;
+  bool zero_width_bf_skipped_p = false;
   const_tree orig_type = type;
+
   while (TREE_CODE (type) == RECORD_TYPE)
     {
-  tree field, 
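
(The quoted patch is cut off at this point in the archive.)

As a rough illustration of the ABI rule the new comment describes, a record with a single member being treated like that member, here is a minimal sketch using the generic GCC vector extension. The names v4si, wrapped_v4si and first_lane are made up for this example and do not come from the patch. The static assertion only mirrors the idea behind STRICT_SIZE_CHECK_P (record size equals field size); it is not the check s390.cc performs.

```c
/* Generic GCC vector extension type: four 32-bit ints in 16 bytes.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical wrapper: a record whose only member is a vector.  */
struct wrapped_v4si
{
  v4si v;
};

/* The wrapper adds no padding, so record and field sizes match.  */
_Static_assert (sizeof (struct wrapped_v4si) == sizeof (v4si),
		"single-member record has the size of its member");

/* Takes the wrapper by value and returns the first vector lane.  */
int
first_lane (struct wrapped_v4si w)
{
  return w.v[0];
}
```

Whether such an argument actually ends up in a vector register depends on the target ABI and compiler version; the sketch only shows the kind of type the helper is meant to detect.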

Re: [PATCH] rs6000/testsuite: Skip pr105140.c

2022-04-06 Thread Andreas Krebbel via Gcc-patches
On 4/6/22 17:32, Segher Boessenkool wrote:
> This test fails with error "AltiVec argument passed to unprototyped
> function", but the code (in rs6000.c:invalid_arg_for_unprototyped_fn,
> from 2005) actually tests for any vector type argument.  It also does
> not fail on Darwin, not reflected here though.
> 
> Andreas, s390 has this same hook code, you may need to do the same?

Yes, thanks for the pointer. I've just committed the following:

IBM zSystems/testsuite: PR105147: Skip pr105140.c

pr105140.c fails on IBM zSystems with "vector argument passed to
unprototyped function".  s390_invalid_arg_for_unprototyped_fn in
s390.cc is triggered by that.

gcc/testsuite/ChangeLog:

PR target/105147
* gcc.dg/pr105140.c: Skip for s390*-*-*.
---
 gcc/testsuite/gcc.dg/pr105140.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr105140.c b/gcc/testsuite/gcc.dg/pr105140.c
index da34e7ad656..7d30985e850 100644
--- a/gcc/testsuite/gcc.dg/pr105140.c
+++ b/gcc/testsuite/gcc.dg/pr105140.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-Os -w -Wno-psabi" } */
-/* { dg-skip-if "PR105147" { powerpc*-*-* } } */
+/* { dg-skip-if "PR105147" { powerpc*-*-* s390*-*-* } } */

 typedef char __attribute__((__vector_size__ (16 * sizeof (char)))) U;
 typedef int __attribute__((__vector_size__ (16 * sizeof (int)))) V;
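
For readers unfamiliar with the diagnostic, the sketch below shows the prototyped form that avoids it, using the generic GCC vector extension. The names v4si and sum_lanes are hypothetical and are not taken from pr105140.c.

```c
/* Generic GCC vector extension type: four 32-bit ints in 16 bytes.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical callee with a full prototype, so the compiler knows
   the argument is a vector and can apply the vector calling
   convention.  */
int sum_lanes (v4si v);

int
sum_lanes (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}

/* By contrast, a pre-C23 declaration such as
     int sum_lanes_unprototyped ();
   carries no parameter types.  Passing a vector to a function
   declared that way is what the s390 and rs6000 hooks reject.  */
```

The test itself is only skipped on the affected targets; nothing in the sketch changes how the hook behaves.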


Re: [PATCH] testsuite/s390: Adapt test expectations.

2022-04-04 Thread Andreas Krebbel via Gcc-patches
On 4/4/22 13:52, Robin Dapp wrote:
> Hi,
> 
> some tests expect a convert instruction but nowadays the conversion is
> already done at compile time.  This results in a literal-pool load.
> Change the tests accordingly.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zvector/vec-double-compile.c: Expect vl
> instead of vc*.
>   * gcc.target/s390/zvector/vec-float-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-signed-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-unsigned-compile.c: Ditto.

I've seen Mike's comment but I'm not opposed to checking it in that way. These
kinds of comments have probably saved me a few hours of bisecting already. Next
time you might consider moving it to the commit message instead.

Ok. Thanks!

Bye,

Andreas

