[committed] hppa: Fix handling of large arguments passed by value

2023-04-15 Thread John David Anglin
This change revises pa_function_arg_size to return values that fit
in an int.  For a number of reasons, updating pa_function_arg,
pa_arg_partial_bytes, etc, to handle HOST_WIDE_INT values
didn't seem useful.  Currently, gcc limits the size of arguments
passed by value to 1 GB.  The PA prologue/epilogue code only handles
32-bit frame offsets and 1 GB is the maximum frame size that can be
recorded in the HPUX unwind descriptor.  Thus, limiting argument
sizes to 1 GB is enough.
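
For illustration only, here is a minimal standalone sketch of the clamping
described above.  It is not the code in pa.cc: UNITS_PER_WORD_SKETCH,
CEIL_SKETCH, ARG_SIZE_LIMIT and arg_size_in_words are hypothetical stand-ins
for the GCC macros and for pa_function_arg_size, and a 4-byte word and a
32-bit int are assumed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for GCC's UNITS_PER_WORD and CEIL; a 4-byte
   word is assumed here purely for the example.  */
#define UNITS_PER_WORD_SKETCH 4
#define CEIL_SKETCH(x, y) (((x) + (y) - 1) / (y))

/* Same shape as the limit in the patch, 1 << (bits in int - 2), which
   is 1 GB when int is 32 bits wide (assumed).  */
#define ARG_SIZE_LIMIT (1LL << (sizeof (int) * CHAR_BIT - 2))

/* Return the argument size in words, or zero when the size in bytes
   reaches the limit, so that the result always fits in an int.  */
int
arg_size_in_words (long long size_in_bytes)
{
  if (size_in_bytes >= ARG_SIZE_LIMIT)
    size_in_bytes = 0;
  return (int) CEIL_SKETCH (size_in_bytes, UNITS_PER_WORD_SKETCH);
}

/* Small self-check showing the rounding and the clamping.  */
void
arg_size_self_check (void)
{
  assert (arg_size_in_words (5) == 2);          /* rounds up to 2 words */
  assert (arg_size_in_words (1LL << 30) == 0);  /* 1 GB is ignored */
}
```

A caller that receives zero then treats the argument as not passed in
registers, which is what the new early return in pa_function_arg does.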

Tested on hppa64-hp-hpux11.11 and hppa-unknown-linux-gnu.  Committed
to trunk.

Dave
---

Fix handling of large arguments passed by value.

2023-04-15  John David Anglin  

gcc/ChangeLog:

PR target/109478
* config/pa/pa-protos.h (pa_function_arg_size): Update prototype.
* config/pa/pa.cc (pa_function_arg): Return NULL_RTX if argument
size is zero.
(pa_arg_partial_bytes): Don't call pa_function_arg_size twice.
(pa_function_arg_size): Change return type to int.  Return zero
for arguments larger than 1 GB.  Update comments.

diff --git a/gcc/config/pa/pa-protos.h b/gcc/config/pa/pa-protos.h
index c0a61ea89c3..b4b1310a52d 100644
--- a/gcc/config/pa/pa-protos.h
+++ b/gcc/config/pa/pa-protos.h
@@ -106,7 +106,7 @@ extern void pa_asm_output_aligned_local (FILE *, const char *,
 unsigned int);
 extern void pa_hpux_asm_output_external (FILE *, tree, const char *);
 extern HOST_WIDE_INT pa_initial_elimination_offset (int, int);
-extern HOST_WIDE_INT pa_function_arg_size (machine_mode, const_tree);
+extern int pa_function_arg_size (machine_mode, const_tree);
 extern void pa_output_function_label (FILE *);
 extern void hppa_profile_hook (int);
 
diff --git a/gcc/config/pa/pa.cc b/gcc/config/pa/pa.cc
index 3f91ebce603..db633b275e5 100644
--- a/gcc/config/pa/pa.cc
+++ b/gcc/config/pa/pa.cc
@@ -9784,6 +9784,8 @@ pa_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
 return NULL_RTX;
 
   arg_size = pa_function_arg_size (mode, type);
+  if (!arg_size)
+return NULL_RTX;
 
   /* If this arg would be passed partially or totally on the stack, then
  this routine should return zero.  pa_arg_partial_bytes will
@@ -9985,15 +9987,16 @@ pa_arg_partial_bytes (cumulative_args_t cum_v, const function_arg_info &arg)
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
   unsigned int max_arg_words = 8;
   unsigned int offset = 0;
+  int arg_size;
 
   if (!TARGET_64BIT)
 return 0;
 
-  if (pa_function_arg_size (arg.mode, arg.type) > 1 && (cum->words & 1))
+  arg_size = pa_function_arg_size (arg.mode, arg.type);
+  if (arg_size > 1 && (cum->words & 1))
 offset = 1;
 
-  if (cum->words + offset + pa_function_arg_size (arg.mode, arg.type)
-  <= max_arg_words)
+  if (cum->words + offset + arg_size <= max_arg_words)
 /* Arg fits fully into registers.  */
 return 0;
   else if (cum->words + offset >= max_arg_words)
@@ -11067,17 +11070,25 @@ pa_starting_frame_offset (void)
   return 8;
 }
 
-/* Figure out the size in words of the function argument.  The size
-   returned by this function should always be greater than zero because
-   we pass variable and zero sized objects by reference.  */
+/* Figure out the size in words of the function argument.  */
 
-HOST_WIDE_INT
+int
 pa_function_arg_size (machine_mode mode, const_tree type)
 {
   HOST_WIDE_INT size;
 
   size = mode != BLKmode ? GET_MODE_SIZE (mode) : int_size_in_bytes (type); 
-  return CEIL (size, UNITS_PER_WORD);
+
+  /* The 64-bit runtime does not restrict the size of stack frames,
+ but the gcc calling conventions limit argument sizes to 1G.  Our
+ prologue/epilogue code limits frame sizes to just under 32 bits.
+ 1G is also the maximum frame size that can be handled by the HPUX
+ unwind descriptor.  Since very large TYPE_SIZE_UNIT values can
+ occur for (parallel:BLK []), we need to ignore large arguments
+ passed by value.  */
+  if (size >= (1 << (HOST_BITS_PER_INT - 2)))
+size = 0;
+  return (int) CEIL (size, UNITS_PER_WORD);
 }
 
 #include "gt-pa.h"





Re: [Ada] Fix PR bootstrap/109510

2023-04-15 Thread Eric Botcazou via Gcc-patches
> Tested on Aarch64/Linux by Richard S. (thanks!) and on x86-64/Linux by me,
> and applied on the mainline.

It turns out that it slightly broke the x86/Linux compiler, which is not yet
an acceptable trade-off.  Adjusted like this and tested on x86[_64]/Linux;
this should not change anything for Aarch64 in particular.


PR bootstrap/109510
* gcc-interface/decl.cc (gnat_to_gnu_entity) : Do not reset
align to zero in any case.  Set TYPE_USER_ALIGN on the type only if
it is an aggregate type, or else a type whose default alignment is
specifically capped on selected platforms.

-- 
Eric Botcazou
diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
index 851a6745f77..20f43de9ea9 100644
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -4371,10 +4371,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	  align = validate_alignment (Alignment (gnat_entity), gnat_entity,
   TYPE_ALIGN (gnu_type));
 
-	  /* Treat confirming clauses on scalar types like the default.  */
-	  if (align == TYPE_ALIGN (gnu_type) && !AGGREGATE_TYPE_P (gnu_type))
-	align = 0;
-
 	  /* Warn on suspiciously large alignments.  This should catch
 	 errors about the (alignment,byte)/(size,bit) discrepancy.  */
 	  if (align > BIGGEST_ALIGNMENT && Has_Alignment_Clause (gnat_entity))
@@ -4657,6 +4653,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
   /* If this is not an unconstrained array type, set some flags.  */
   if (TREE_CODE (gnu_type) != UNCONSTRAINED_ARRAY_TYPE)
 	{
+	  bool align_clause;
+
 	  /* Record the property that objects of tagged types are guaranteed to
 	 be properly aligned.  This is necessary because conversions to the
 	 class-wide type are translated into conversions to the root type,
@@ -4669,8 +4667,20 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 	  if (is_by_ref && !VOID_TYPE_P (gnu_type))
 	TYPE_BY_REFERENCE_P (gnu_type) = 1;
 
-	  /* Record whether an alignment clause was specified.  */
-	  if (align > 0 && Present (Alignment_Clause (gnat_entity)))
+	  /* Record whether an alignment clause was specified.  At this point
+	 scalar types with a non-confirming clause have been wrapped into
+	 a record type, so only scalar types with a confirming clause are
+	 left untouched; we do not set the flag on them except if they are
+	 types whose default alignment is specifically capped in order not
+	 to lose the specified alignment.  */
+	  if ((AGGREGATE_TYPE_P (gnu_type)
+	   && Present (Alignment_Clause (gnat_entity)))
+	  || (double_float_alignment > 0
+		  && is_double_float_or_array (gnat_entity, &align_clause)
+		  && align_clause)
+	  || (double_scalar_alignment > 0
+		  && is_double_scalar_or_array (gnat_entity, &align_clause)
+		  && align_clause))
 	TYPE_USER_ALIGN (gnu_type) = 1;
 
 	  /* Record whether a pragma Universal_Aliasing was specified.  */


[pushed] c++: constexpr aggregate destruction [PR109357]

2023-04-15 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

We were assuming that the result of evaluation of TARGET_EXPR_INITIAL would
always be the new value of the temporary, but that's not necessarily true
when the initializer is complex (i.e. target_expr_needs_replace).  In that
case evaluating the initializer initializes the temporary as a side-effect.

PR c++/109357

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression) [TARGET_EXPR]:
Check for complex initializer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-dtor15.C: New test.
---
 gcc/cp/constexpr.cc                           | 15 +++++++++++----
 gcc/testsuite/g++.dg/cpp2a/constexpr-dtor15.C | 19 +++++++++++++++++++
 2 files changed, 30 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-dtor15.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 3de60cfd0f8..d1097764b10 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -7230,16 +7230,23 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
  non_constant_p, overflow_p);
if (*non_constant_p)
  break;
-   /* Adjust the type of the result to the type of the temporary.  */
-   r = adjust_temp_type (type, r);
+   /* If the initializer is complex, evaluate it to initialize slot.  */
+   bool is_complex = target_expr_needs_replace (t);
+   if (!is_complex)
+ {
+   r = unshare_constructor (r);
+   /* Adjust the type of the result to the type of the temporary.  */
+   r = adjust_temp_type (type, r);
+   ctx->global->put_value (slot, r);
+ }
if (TARGET_EXPR_CLEANUP (t) && !CLEANUP_EH_ONLY (t))
  ctx->global->cleanups->safe_push (TARGET_EXPR_CLEANUP (t));
-   r = unshare_constructor (r);
-   ctx->global->put_value (slot, r);
if (ctx->save_exprs)
  ctx->save_exprs->safe_push (slot);
if (lval)
  return slot;
+   if (is_complex)
+ r = ctx->global->get_value (slot);
   }
   break;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-dtor15.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-dtor15.C
new file mode 100644
index 00000000000..d34c27eee45
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-dtor15.C
@@ -0,0 +1,19 @@
+// PR c++/109357
+// { dg-do compile { target c++20 } }
+// { dg-prune-output "used but never defined" }
+
+struct basic_string {
+  char _M_local_buf;
+  basic_string();
+  constexpr basic_string(const char *) {}
+  constexpr ~basic_string();
+  constexpr basic_string& operator=(basic_string);
+};
+struct S1 {
+  basic_string x;
+  basic_string y;
+} s1;
+struct s2 {
+  ~s2();
+};
+s2::~s2() { s1 = {"", ""}; }

base-commit: 9964df74a9e99e850bf9b0b6ff5c47133f846db8
-- 
2.31.1



[PATCH] build: Use -nostdinc generating macro_list [PR109522]

2023-04-15 Thread Xi Ruoyao via Gcc-patches
This prevents a spurious message when building a cross-compiler while the
target libc is not installed yet:

cc1: error: no include path in which to search for stdc-predef.h

As stdc-predef.h was added to define __STDC_* macros by libc, it's
unlikely the header will ever contain some bad definitions w/o "__"
prefix so it should be safe.

gcc/ChangeLog:

PR other/109522
* Makefile.in (s-macro_list): Pass -nostdinc to
$(GCC_FOR_TARGET).
---
 gcc/Makefile.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ad9a5d94cd0..eb26d5c7be5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3215,7 +3215,7 @@ install-gcc-tooldir:
 
 macro_list: s-macro_list; @true
 s-macro_list : $(GCC_PASSES) cc1$(exeext)
-   echo | $(GCC_FOR_TARGET) -E -dM - | \
+   echo | $(GCC_FOR_TARGET) -nostdinc -E -dM - | \
  sed -n -e 's/^#define \([^_][a-zA-Z0-9_]*\).*/\1/p' \
 -e 's/^#define \(_[^_A-Z][a-zA-Z0-9_]*\).*/\1/p' | \
  sort -u > tmp-macro_list
-- 
2.40.0



[r13-7179 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 4 on Linux/x86_64

2023-04-15 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

040e64b09d4422c7d3c51bee098043782112b924 is the first bad commit
commit 040e64b09d4422c7d3c51bee098043782112b924
Author: Richard Biener 
Date:   Fri Apr 14 11:35:58 2023 +0200

Fix vect-simd-clone testcase dump scanning

caused

FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 4
FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 4
FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 4

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7179/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com)


Re: [PATCH] if-conv: Small improvement for expansion of complex PHIs [PR109154]

2023-04-15 Thread Richard Biener via Gcc-patches



> On 15.04.2023 at 10:30, Jakub Jelinek via Gcc-patches wrote:
> 
> Hi!
> 
> The following patch is just a dumb improvement, gets rid of 2 unnecessary
> instructions on both the PR's original testcase and on the two reduced ones,
> both on -mcpu=neoverse-v1 and -mavx512f.
> 
> The thing is, if we have args_len (args_len >= 2) unique PHI arguments,
> we need only args_len - 1 COND_EXPRs to expand the PHI, because first
> COND_EXPR can merge 2 unique arguments and all the following ones merge
> another unique argument with the previously merged arguments,
> while the code for mysterious reasons was always emitting args_len
> COND_EXPRs, where the first COND_EXPR merged the first and second unique
> arguments, the second COND_EXPR merged the second unique argument with
> result of merging the first and second unique arguments and the rest was
> already expectable, nth COND_EXPR for n > 2 merged the nth unique argument
> with result of merging the previous unique arguments.
> Now, in my understanding, the bb_predicate for bb's predecessor need to
> form a disjunct set which together creates the successor's bb_predicate,
> so I don't see why we'd need to check all the bb_predicates, if we check
> all but one then when all those other ones are false the last bb_predicate
> is necessarily true.  Given that the code attempts to sort argument with
> most occurrences (so likely most complex combined predicate) last, I chose
> not to test that last argument's predicate.
> So e.g. on the testcase from comment 47 in the PR:
> void
> foo (int *f, int d, int e)
> {
>  for (int i = 0; i < 1024; i++)
>{
>  int a = f[i];
>  int t;
>  if (a < 0)
>t = 1;
>  else if (a < e)
>t = 1 - a * d;
>  else
>t = 0;
>  f[i] = t;
>}
> }
> we used to emit:
>  _7 = a_10 < 0;
>  _21 = a_10 >= 0;
>  _22 = a_10 < e_11(D);
>  _23 = _21 & _22;
>  _26 = a_10 >= e_11(D);
>  _27 = _21 & _26;
>  _ifc__42 = _7 ? 1 : t_13;
>  _ifc__43 = _23 ? t_13 : _ifc__42;
>  t_6 = _27 ? 0 : _ifc__43;
> while the following patch changes it to:
>  _7 = a_10 < 0;
>  _21 = a_10 >= 0;
>  _22 = a_10 < e_11(D);
>  _23 = _21 & _22;
>  _ifc__42 = _23 ? t_13 : 0;
>  t_6 = _7 ? 1 : _ifc__42;
> which I believe should be sufficient for a PHI <1, t_13, 0>.
> 
> I've gathered some statistics and on x86_64-linux and i686-linux
> bootstraps/regtests, this code triggers:
> 92 4 4
>112 2 4
>141 3 4
>   4046 3 3
> (where 2nd number is args_len and 3rd argument EDGE_COUNT (bb->preds)
> and first argument count of those from sort | uniq -c | sort -n).
> In all these cases the patch should squeeze one extra COND_EXPR and
> its associated predicate (the latter only if it wasn't used elsewhere).
> 
> Incrementally, I think we should try to perform some analysis on which
> predicates depend on inverses of other predicates and if possible try
> to sort the arguments better and omit testing unnecessary predicates.
> So essentially for the above testcase deconstruct it back to:
>  _7 = a_10 < 0;
>  _22 = a_10 < e_11(D);
>  _ifc__42 = _22 ? t_13 : 0;
>  t_6 = _7 ? 1 : _ifc__42;
> which is like what this patch produces, but with the & a_10 >= 0 part
> removed, because the last predicate is a_10 < 0 and so testing a_10 >= 0
> on what appears on the false branch doesn't make sense.
> But I'm afraid that will take more work than is doable in stage4 right now.

Agreed.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Yes - thanks for spotting this obvious improvement.

Richard 

> 2023-04-15  Jakub Jelinek  
> 
>PR tree-optimization/109154
>* tree-if-conv.cc (predicate_scalar_phi): For complex PHIs, emit just
>args_len - 1 COND_EXPRs rather than args_len.  Formatting fix.
> 
> --- gcc/tree-if-conv.cc.jj2023-04-12 08:53:58.264496474 +0200
> +++ gcc/tree-if-conv.cc2023-04-14 21:02:42.403826690 +0200
> @@ -2071,7 +2071,7 @@ predicate_scalar_phi (gphi *phi, gimple_
> }
> 
>   /* Put element with max number of occurences to the end of ARGS.  */
> -  if (max_ind != -1 && max_ind +1 != (int) args_len)
> +  if (max_ind != -1 && max_ind + 1 != (int) args_len)
> std::swap (args[args_len - 1], args[max_ind]);
> 
>   /* Handle one special case when number of arguments with different values
> @@ -2116,12 +2116,12 @@ predicate_scalar_phi (gphi *phi, gimple_
>   vec *indexes;
>   tree type = TREE_TYPE (gimple_phi_result (phi));
>   tree lhs;
> -  arg1 = args[1];
> -  for (i = 0; i < args_len; i++)
> +  arg1 = args[args_len - 1];
> +  for (i = args_len - 1; i > 0; i--)
>{
> -  arg0 = args[i];
> -  indexes = phi_arg_map.get (args[i]);
> -  if (i != args_len - 1)
> +  arg0 = args[i - 1];
> +  indexes = phi_arg_map.get (args[i - 1]);
> +  if (i != 1)
>lhs = make_temp_ssa_name (type, NULL, "_ifc_");
>  else
>lhs = res;
> 
>Jakub
> 


[PATCH] if-conv: Small improvement for expansion of complex PHIs [PR109154]

2023-04-15 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch is just a dumb improvement; it gets rid of 2 unnecessary
instructions on both the PR's original testcase and on the two reduced ones,
both with -mcpu=neoverse-v1 and -mavx512f.

The thing is, if we have args_len (args_len >= 2) unique PHI arguments,
we need only args_len - 1 COND_EXPRs to expand the PHI, because the first
COND_EXPR can merge 2 unique arguments and each following one merges
another unique argument with the previously merged arguments.  The code,
for mysterious reasons, was always emitting args_len COND_EXPRs: the first
COND_EXPR merged the first and second unique arguments, the second
COND_EXPR merged the second unique argument with the result of merging the
first and second unique arguments, and the rest was as expected, i.e. the
nth COND_EXPR for n > 2 merged the nth unique argument with the result of
merging the previous unique arguments.
Now, in my understanding, the bb_predicates of bb's predecessors need to
form a disjoint set which together makes up the successor's bb_predicate,
so I don't see why we'd need to check all the bb_predicates; if we check
all but one, then when all those other ones are false the last bb_predicate
is necessarily true.  Given that the code attempts to sort the argument with
the most occurrences (so likely the most complex combined predicate) last,
I chose not to test that last argument's predicate.
So e.g. on the testcase from comment 47 in the PR:
void
foo (int *f, int d, int e)
{
  for (int i = 0; i < 1024; i++)
{
  int a = f[i];
  int t;
  if (a < 0)
t = 1;
  else if (a < e)
t = 1 - a * d;
  else
t = 0;
  f[i] = t;
}
}
we used to emit:
  _7 = a_10 < 0;
  _21 = a_10 >= 0;
  _22 = a_10 < e_11(D);
  _23 = _21 & _22;
  _26 = a_10 >= e_11(D);
  _27 = _21 & _26;
  _ifc__42 = _7 ? 1 : t_13;
  _ifc__43 = _23 ? t_13 : _ifc__42;
  t_6 = _27 ? 0 : _ifc__43;
while the following patch changes it to:
  _7 = a_10 < 0;
  _21 = a_10 >= 0;
  _22 = a_10 < e_11(D);
  _23 = _21 & _22;
  _ifc__42 = _23 ? t_13 : 0;
  t_6 = _7 ? 1 : _ifc__42;
which I believe should be sufficient for a PHI <1, t_13, 0>.
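
To make the equivalence of the two sequences above concrete, here is a small
standalone C sketch that transcribes both of them and compares the results.
It is only an illustration: phi_old, phi_new and check_phi_equivalence are
hypothetical names, and t_13 is passed in as a plain value rather than
computed as 1 - a * d.

```c
#include <assert.h>

/* Transcription of the sequence emitted before the patch:
   three COND_EXPRs for three unique PHI arguments.  */
int
phi_old (int a, int e, int t_13)
{
  int p7 = a < 0;
  int p21 = a >= 0;
  int p22 = a < e;
  int p23 = p21 & p22;
  int p26 = a >= e;
  int p27 = p21 & p26;
  int ifc42 = p7 ? 1 : t_13;
  int ifc43 = p23 ? t_13 : ifc42;
  return p27 ? 0 : ifc43;
}

/* Transcription of the sequence emitted with the patch:
   two COND_EXPRs, the predicate of the last argument is not tested.  */
int
phi_new (int a, int e, int t_13)
{
  int p7 = a < 0;
  int p21 = a >= 0;
  int p22 = a < e;
  int p23 = p21 & p22;
  int ifc42 = p23 ? t_13 : 0;
  return p7 ? 1 : ifc42;
}

/* Compare both forms for every (a, e) pair in [lo, hi]; 42 is used
   for t_13 so that it cannot be confused with the constants 0 and 1.  */
void
check_phi_equivalence (int lo, int hi)
{
  for (int a = lo; a <= hi; a++)
    for (int e = lo; e <= hi; e++)
      assert (phi_old (a, e, 42) == phi_new (a, e, 42));
}
```

Calling check_phi_equivalence (-16, 16) walks all three branches (a < 0,
0 <= a < e, and a >= e) for both signs of e.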

I've gathered some statistics and on x86_64-linux and i686-linux
bootstraps/regtests, this code triggers:
     92 4 4
    112 2 4
    141 3 4
   4046 3 3
(where the first number is the count of occurrences from
sort | uniq -c | sort -n, the second is args_len and the third is
EDGE_COUNT (bb->preds)).
In all these cases the patch should squeeze out one extra COND_EXPR and
its associated predicate (the latter only if it wasn't used elsewhere).

Incrementally, I think we should try to perform some analysis on which
predicates depend on inverses of other predicates and if possible try
to sort the arguments better and omit testing unnecessary predicates.
So essentially for the above testcase deconstruct it back to:
  _7 = a_10 < 0;
  _22 = a_10 < e_11(D);
  _ifc__42 = _22 ? t_13 : 0;
  t_6 = _7 ? 1 : _ifc__42;
which is like what this patch produces, but with the & a_10 >= 0 part
removed, because the last predicate is a_10 < 0 and so testing a_10 >= 0
on what appears on the false branch doesn't make sense.
But I'm afraid that will take more work than is doable in stage4 right now.
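
As a sanity check of the further simplified form above (the one without the
& a_10 >= 0 part), the following standalone C sketch compares it against the
source-level loop body.  The names t_source, t_simplified and
check_simplified are hypothetical, and the inputs are kept small so that
a * d cannot overflow.

```c
#include <assert.h>

/* The loop body of foo as written in the testcase.  */
int
t_source (int a, int d, int e)
{
  if (a < 0)
    return 1;
  else if (a < e)
    return 1 - a * d;
  else
    return 0;
}

/* The simplified select chain: a_10 >= 0 is not tested again because
   the inner result is only used when a_10 < 0 is false.  */
int
t_simplified (int a, int d, int e)
{
  int t_13 = 1 - a * d;   /* computed unconditionally, as after if-conversion */
  int p7 = a < 0;
  int p22 = a < e;
  int ifc42 = p22 ? t_13 : 0;
  return p7 ? 1 : ifc42;
}

/* Compare both forms for all a, d, e in [lo, hi].  */
void
check_simplified (int lo, int hi)
{
  for (int a = lo; a <= hi; a++)
    for (int d = lo; d <= hi; d++)
      for (int e = lo; e <= hi; e++)
        assert (t_source (a, d, e) == t_simplified (a, d, e));
}
```

The interesting inputs are those with a < 0 and a < e, where the inner select
picks t_13 but the outer select still returns 1.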

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-04-15  Jakub Jelinek  

PR tree-optimization/109154
* tree-if-conv.cc (predicate_scalar_phi): For complex PHIs, emit just
args_len - 1 COND_EXPRs rather than args_len.  Formatting fix.

--- gcc/tree-if-conv.cc.jj  2023-04-12 08:53:58.264496474 +0200
+++ gcc/tree-if-conv.cc 2023-04-14 21:02:42.403826690 +0200
@@ -2071,7 +2071,7 @@ predicate_scalar_phi (gphi *phi, gimple_
 }
 
   /* Put element with max number of occurences to the end of ARGS.  */
-  if (max_ind != -1 && max_ind +1 != (int) args_len)
+  if (max_ind != -1 && max_ind + 1 != (int) args_len)
 std::swap (args[args_len - 1], args[max_ind]);
 
   /* Handle one special case when number of arguments with different values
@@ -2116,12 +2116,12 @@ predicate_scalar_phi (gphi *phi, gimple_
   vec *indexes;
   tree type = TREE_TYPE (gimple_phi_result (phi));
   tree lhs;
-  arg1 = args[1];
-  for (i = 0; i < args_len; i++)
+  arg1 = args[args_len - 1];
+  for (i = args_len - 1; i > 0; i--)
{
- arg0 = args[i];
- indexes = phi_arg_map.get (args[i]);
- if (i != args_len - 1)
+ arg0 = args[i - 1];
+ indexes = phi_arg_map.get (args[i - 1]);
+ if (i != 1)
lhs = make_temp_ssa_name (type, NULL, "_ifc_");
  else
lhs = res;

Jakub