[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #49 from Jakub Jelinek  ---
Should be fixed now.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #48 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:172a72da368146e0fe34194020eb7a6636db4438

commit r14-5556-g172a72da368146e0fe34194020eb7a6636db4438
Author: Jakub Jelinek 
Date:   Fri Nov 17 15:09:44 2023 +0100

vect: Fix check_reduction_path [PR112374]

As mentioned in the PR, the intent of the r14-5076 changes was that
it doesn't count one of the uses on the use_stmt, but what actually
got implemented is that it does this processing on any op_use_stmt,
even if it is not the use_stmt statement, which means that it
can increase count even on debug stmts (-fcompare-debug failures),
or if there would be some other use stmt with 2+ uses it could count
that as a single use.  Though, because it fails whenever cnt != 1
and I believe use_stmt must be one of the uses, it would probably
fail in the latter case anyway.

The following patch fixes that by doing this extra processing only when
op_use_stmt is use_stmt, and using the normal processing otherwise
(so ignore debug stmts, and increase on any uses on the stmt).

2023-11-17  Jakub Jelinek  

PR tree-optimization/112374
* tree-vect-loop.cc (check_reduction_path): Perform the cond_fn_p
special case only if op_use_stmt == use_stmt, use as_a rather than
dyn_cast in that case.

* gcc.dg/pr112374-1.c: New test.
* gcc.dg/pr112374-2.c: New test.
* g++.dg/opt/pr112374.C: New test.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #47 from Robin Dapp  ---
And, just to confirm: Testsuite is unchanged on riscv with your patch.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #46 from Robin Dapp  ---
(In reply to Jakub Jelinek from comment #43)

> Now, the patch changed it to allow one extra use in certain cases (but I
> think only on use_stmt, because there should be one use on use_stmt and if
> there is some other in the loop, that would be already too much).  So I
> think my patch ought to work and the above would be more expensive (as it
> would need to check each op_use_stmt for being suitable internal call).
> the code before

Your patch looks good.  What I wanted to do, originally, is to just skip the
reduction variable's next use in the else value and this is what your patch
ensures.  I agree, I don't think we actually need to check other uses in that
manner (because even the use as a proper op would be too much) so what I posted
last will do more work than necessary - it might just make the whole thing a
bit clearer.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #45 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #44)
> (In reply to Andrew Pinski from comment #42)
> > Just one suggestion for the testcases, add:
> > /* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
> > 
> > So it at least it will be tested on another target.
> 
> Does aarch64*-*-* support -march=armv9-a regardless of whether e.g.
> assembler has support for it?  I mean, x86 does that, but I think e.g.
> powerpc didn't or doesn't.

Yes -march=armv9-a is supported even if the assembler does not support it.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #44 from Jakub Jelinek  ---
(In reply to Andrew Pinski from comment #42)
> Just one suggestion for the testcases, add:
> /* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
> 
> So it at least it will be tested on another target.

Does aarch64*-*-* support -march=armv9-a regardless of whether e.g. assembler
has support for it?  I mean, x86 does that, but I think e.g. powerpc didn't or
doesn't.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #43 from Jakub Jelinek  ---
(In reply to Robin Dapp from comment #40)
> (In reply to Jakub Jelinek from comment #37)
> [..]
> 
> > The above isn't complete, so one just has to guess what you mean outside of
> > that, but the above doesn't seem to be correct.  There are many internal
> > calls, and most of them shouldn't have the above handling.  Say if there is
> > .MUL_OVERFLOW (op.ops[opi], op.ops[opi]) call, it should count as 2 uses,
> > not one.
> 
> But we would count MUL_OVERFLOW as two uses because its else_pos = -1?
> For clarity we could maybe move the else pos check into the if condition,
> yes.
> 
> One other thing, though.  The else branch checks for is_gimple_debug and
> flow_bb_inside_loop_p.  We could handle this separately and continue.
> Maybe something like this is clearer, replacing the whole loop (and moved
> the else_idx check)?
> 
>   FOR_EACH_IMM_USE_STMT (op_use_stmt, imm_iter, op.ops[opi])
>   {
> if (is_gimple_debug (op_use_stmt)
> || (*code == ERROR_MARK
> && !flow_bb_inside_loop_p (loop,
>gimple_bb (op_use_stmt
>   continue;
> 
> /* In case of a COND_OP (mask, op1, op2, op1) reduction we might have
>op1 twice (once as definition, once as else) in the same operation.
>Allow this.  */
> gcall *call = dyn_cast (op_use_stmt);
> internal_fn ifn;
> if (call && gimple_call_internal_p (call)
> && (ifn = gimple_call_internal_fn (call))
> && internal_fn_else_index (ifn) > -1)
>   {
> unsigned else_idx = internal_fn_else_index (ifn);
> for (unsigned int j = 0; j < gimple_call_num_args (call); ++j)
>   {
> if (j == else_idx)
>   continue;
> if (gimple_call_arg (call, j) == op.ops[opi])
>   cnt++;
>   }
>   }
> else
>   FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> cnt++;
>   }

Then
  bool cond_fn_p = op.code.is_internal_fn ()
&& (conditional_internal_fn_code (internal_fn (op.code))
!= ERROR_MARK);
would be unused and would cause bootstrap error.
Anyway, it really depends on what you want to achieve with this code.
My (admittedly limited) understanding of this is that use_stmt ought to be
always
one of the statements among the op_use_stmt statements (I believe it must be a
use statement of op.ops[opi]) and I think should be also inside of the loop and
because after this loop it fails if cnt != 1, I think before your changes it
basically made sure that there is exactly one non-debug use in the loop (the
use_stmt one) and not other, and that on use_stmt there is just one use.
Now, the patch changed it to allow one extra use in certain cases (but I think
only on use_stmt, because there should be one use on use_stmt and if there is
some other in the loop, that would be already too much).  So I think my patch
ought to work and the above would be more expensive (as it would need to check
each op_use_stmt for being suitable internal call).
the code before

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #42 from Andrew Pinski  ---
Just one suggestion for the testcases, add:
/* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */

So it at least it will be tested on another target.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #41 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #39)
> Created attachment 56601 [details]
> gcc14-pr112374.patch
> 
> So the following then?

LGTM.  Refactoring to more explicitly handle debug stmts like with

  if (is_gimple_debug (op_use_stmt))
continue;
  else if (cond_fn_p && ...)

would also be fine but it isn't technically necessary after this fix.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #40 from Robin Dapp  ---
(In reply to Jakub Jelinek from comment #37)
[..]

> The above isn't complete, so one just has to guess what you mean outside of
> that, but the above doesn't seem to be correct.  There are many internal
> calls, and most of them shouldn't have the above handling.  Say if there is
> .MUL_OVERFLOW (op.ops[opi], op.ops[opi]) call, it should count as 2 uses,
> not one.

But we would count MUL_OVERFLOW as two uses because its else_pos = -1?
For clarity we could maybe move the else pos check into the if condition, yes.

One other thing, though.  The else branch checks for is_gimple_debug and
flow_bb_inside_loop_p.  We could handle this separately and continue.
Maybe something like this is clearer, replacing the whole loop (and moved the
else_idx check)?

  FOR_EACH_IMM_USE_STMT (op_use_stmt, imm_iter, op.ops[opi])
{
  if (is_gimple_debug (op_use_stmt)
  || (*code == ERROR_MARK
  && !flow_bb_inside_loop_p (loop,
 gimple_bb (op_use_stmt
continue;

  /* In case of a COND_OP (mask, op1, op2, op1) reduction we might have
 op1 twice (once as definition, once as else) in the same
operation.
 Allow this.  */
  gcall *call = dyn_cast (op_use_stmt);
  internal_fn ifn;
  if (call && gimple_call_internal_p (call)
  && (ifn = gimple_call_internal_fn (call))
  && internal_fn_else_index (ifn) > -1)
{
  unsigned else_idx = internal_fn_else_index (ifn);
  for (unsigned int j = 0; j < gimple_call_num_args (call); ++j)
{
  if (j == else_idx)
continue;
  if (gimple_call_arg (call, j) == op.ops[opi])
cnt++;
}
}
  else
FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
  cnt++;
}

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #39 from Jakub Jelinek  ---
Created attachment 56601
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56601=edit
gcc14-pr112374.patch

So the following then?

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #38 from Jakub Jelinek  ---
And the #c29 change actually seems to fix all the testcases here with
-fcompare-debug -gno-statement-frontiers:
$ ~/src/gcc/obj48/gcc/xg++ -B ~/src/gcc/obj48/gcc/ -fcompare-debug
-gno-statement-frontiers pr112374-0.C -O2 -march=skylake-avx512 -S
$ ~/src/gcc/obj48/gcc/xgcc -B ~/src/gcc/obj48/gcc/ -fcompare-debug
-gno-statement-frontiers pr112374-1.c -O2 -march=skylake-avx512 -S
xgcc: error: pr112374-1.c: ‘-fcompare-debug’ failure (length)
$ ~/src/gcc/obj48/gcc/xgcc -B ~/src/gcc/obj48/gcc/ -fcompare-debug
-gno-statement-frontiers pr112374-2.c -O2 -march=skylake-avx512 -S
xgcc: error: pr112374-2.c: ‘-fcompare-debug’ failure (length)
$ ~/src/gcc/obj48/gcc/xg++ -B ~/src/gcc/obj48/gcc/ -fcompare-debug
-gno-statement-frontiers pr112374-3.C -O2 -march=skylake-avx512 -S
pr112374-3.C: In constructor ‘t::t(const long unsigned int&)’:
pr112374-3.C:4:36: warning: narrowing conversion of ‘(long unsigned int)a’ from
‘long unsigned int’ to ‘long int’ [-Wnarrowing]
4 |   t(const unsigned long ) : coef{a} {};
  |^
xg++: error: pr112374-3.C: ‘-fcompare-debug’ failure
$ ./xg++ -B ./ -fcompare-debug -gno-statement-frontiers pr112374-0.C -O2
-march=skylake-avx512 -S
$ ./xgcc -B ./ -fcompare-debug -gno-statement-frontiers pr112374-1.c -O2
-march=skylake-avx512 -S
$ ./xgcc -B ./ -fcompare-debug -gno-statement-frontiers pr112374-2.c -O2
-march=skylake-avx512 -S
$ ./xg++ -B ./ -fcompare-debug -gno-statement-frontiers pr112374-3.C -O2
-march=skylake-avx512 -S
pr112374-3.C: In constructor ‘t::t(const long unsigned int&)’:
pr112374-3.C:4:36: warning: narrowing conversion of ‘(long unsigned int)a’ from
‘long unsigned int’ to ‘long int’ [-Wnarrowing]
4 |   t(const unsigned long ) : coef{a} {};
  |^
where ~/src/gcc/obj48/gcc/ is build dir of last night's trunk and ./ is build
dir of trunk patched with #c29 patch.
pr112374-0.C is #c19, pr112374-1.c is #c21, pr112374-2.c is #c25 and
pr112374-3.C is #c30.
So, pr112374-0.C is not useful with -gno-statement-frontiers, guess I'll add
the rest into the testsuite for the patch after trying to fix the narrowing
warning in #c30.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #37 from Jakub Jelinek  ---
(In reply to Robin Dapp from comment #35)
> What does get rid of the comparison failures in the three last posted
> reduced examples is:
> 
> gcall *call = dyn_cast (op_use_stmt);
> internal_fn ifn;
> if (call && gimple_call_internal_p (call))
>   {
> ifn = gimple_call_internal_fn (call);
> unsigned else_pos = internal_fn_else_index (ifn);
> for (unsigned int j = 0; j < gimple_call_num_args (call); ++j)
>   {
> if (j == else_pos)
>   continue;
> if (gimple_call_arg (call, j) == op.ops[opi])
>   cnt++;
>   }
>   }
> 
> i.e. ignoring the else altogether.

The above isn't complete, so one just has to guess what you mean outside of
that, but the above doesn't seem to be correct.  There are many internal calls,
and most of them shouldn't have the above handling.  Say if there is
.MUL_OVERFLOW (op.ops[opi], op.ops[opi]) call, it should count as 2 uses, not
one.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #36 from Jakub Jelinek  ---
Actually, while i386-expand.o compilation fails with -fcompare-debug even after
the above patch, seems that is only because of the discriminator issue, the
*.gkd difference is then:
--- i386-expand.cc.gkd  2023-11-16 09:57:22.666238096 +0100
+++ i386-expand.gk.cc.gkd   2023-11-16 09:59:09.207704622 +0100
@@ -245675,14 +245675,14 @@ Declarations used by ix86_expand_vector_
 (note # 0 0 NOTE_INSN_DELETED)
 (insn:TI # 0 0 11 (set (reg:QI 0 ax [360])
 (mem:QI (plus:DI (reg/f:DI 0 ax [orig:192 rt ] [192])
-(const_int 3 [0x3])) [ MEM  [(struct rtx_def
*)tmp_295 + 3B]+0 S1 A8])) "../../gcc/config/i386/i386-expand.cc":17709:451
discrim 6# {*movqi_internal}
+(const_int 3 [0x3])) [ MEM  [(struct rtx_def
*)tmp_295 + 3B]+0 S1 A8])) "../../gcc/config/i386/i386-expand.cc":17709:451
discrim 5# {*movqi_internal}
  (nil))
 (insn:TI # 0 0 11 (parallel [
 (set (reg:SI 0 ax [360])
 (and:SI (reg:SI 0 ax [360])
 (const_int -29 [0xffe3])))
 (clobber (reg:CC 17 flags))
-]) "../../gcc/config/i386/i386-expand.cc":17709:451 discrim 6#
{*andsi_1}
+]) "../../gcc/config/i386/i386-expand.cc":17709:451 discrim 5#
{*andsi_1}
  (expr_list:REG_UNUSED (reg:CC 17 flags)
 (nil)))
 (insn:TI # 0 0 11 (parallel [
@@ -245690,12 +245690,12 @@ Declarations used by ix86_expand_vector_
 (ior:SI (reg:SI 0 ax [360])
 (const_int 24 [0x18])))
 (clobber (reg:CC 17 flags))
-]) "../../gcc/config/i386/i386-expand.cc":17709:451 discrim 6#
{*iorsi_1}
+]) "../../gcc/config/i386/i386-expand.cc":17709:451 discrim 5#
{*iorsi_1}
  (expr_list:REG_UNUSED (reg:CC 17 flags)
 (nil)))
 (insn:TI # 0 0 11 (set (mem:QI (plus:DI (reg/v/f:DI 3 bx [orig:192 rt ] [192])
 (const_int 3 [0x3])) [ MEM  [(struct rtx_def
*)tmp_295 + 3B]+0 S1 A8])
-(reg:QI 0 ax [362])) "../../gcc/config/i386/i386-expand.cc":17709:451
discrim 6# {*movqi_internal}
+(reg:QI 0 ax [362])) "../../gcc/config/i386/i386-expand.cc":17709:451
discrim 5# {*movqi_internal}
  (expr_list:REG_DEAD (reg:QI 0 ax [362])
 (nil)))
 (jump_insn # 0 0 11 (set (pc)

So, this should no longer fail normal bootstrap comparison (but of course fail
-fcompare-debug bootstrap).  And with -fcompare-debug -gno-statement-frontiers
it compiles fine.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #35 from Robin Dapp  ---
What does get rid of the comparison failures in the three last posted reduced
examples is:

  gcall *call = dyn_cast (op_use_stmt);
  internal_fn ifn;
  if (call && gimple_call_internal_p (call))
{
  ifn = gimple_call_internal_fn (call);
  unsigned else_pos = internal_fn_else_index (ifn);
  for (unsigned int j = 0; j < gimple_call_num_args (call); ++j)
{
  if (j == else_pos)
continue;
  if (gimple_call_arg (call, j) == op.ops[opi])
cnt++;
}
}

i.e. ignoring the else altogether.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #34 from Robin Dapp  ---
(In reply to Jakub Jelinek from comment #29)
> --- gcc/tree-vect-loop.cc.jj  2023-11-14 10:35:52.0 +0100
> +++ gcc/tree-vect-loop.cc 2023-11-15 22:42:32.782007408 +0100
> @@ -4105,9 +4105,9 @@ pop:
>   /* In case of a COND_OP (mask, op1, op2, op1) reduction we might have
>  op1 twice (once as definition, once as else) in the same operation.
>  Allow this.  */
> -   if (cond_fn_p)
> +   if (cond_fn_p && op_use_stmt == use_stmt)
>   {
> -   gcall *call = dyn_cast (use_stmt);
> +   gcall *call = as_a (use_stmt);
> unsigned else_pos
>   = internal_fn_else_index (internal_fn (op.code));
>  
> doesn't cure the -fcompare-debug failure though.

Yeah that's what I meant to mean in the first place ;)
Thanks for the debugging efforts.  I'm going to check with your reduced
example,   could reproduce it locally with it.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #33 from rguenther at suse dot de  ---
On Wed, 15 Nov 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374
> 
> --- Comment #29 from Jakub Jelinek  ---
> --- gcc/tree-vect-loop.cc.jj2023-11-14 10:35:52.0 +0100
> +++ gcc/tree-vect-loop.cc   2023-11-15 22:42:32.782007408 +0100
> @@ -4105,9 +4105,9 @@ pop:
> /* In case of a COND_OP (mask, op1, op2, op1) reduction we might have
>op1 twice (once as definition, once as else) in the same operation.
>Allow this.  */
> - if (cond_fn_p)
> + if (cond_fn_p && op_use_stmt == use_stmt)
> {
> - gcall *call = dyn_cast (use_stmt);
> + gcall *call = as_a (use_stmt);
>   unsigned else_pos
> = internal_fn_else_index (internal_fn (op.code));
> 
> doesn't cure the -fcompare-debug failure though.

Looks like a relevant fix tho.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

--- Comment #32 from Andrew Pinski  ---
I suspect we need a dump option to dump the freed ssa names after each pass.
This will allow proving if r14-5076 is at fault or not.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374

Andrew Pinski  changed:

   What|Removed |Added

  Build|x86_64-linux-gnu|
   Host|x86_64-linux-gnu|
 Target|x86_64-linux-gnu|x86_64-linux-gnu
   ||aarch64-linux-gnu
  Component|target  |tree-optimization

--- Comment #31 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #30)
> Here is another reduced testcase:

Note the testcase in comment #30, I get comparison failure also on aarch64
with:
 `-O2 -fno-exceptions -fno-rtti -fno-common   -Wno-narrowing 
-gno-statement-frontiers -march=armv9-a+sve2  -fcompare-debug` .