[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-10-19 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #24 from Kewen Lin  ---
(In reply to Richard Biener from comment #22)
> I see the mems properly get their base adjusted:
> 
> (insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> (const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
> S16 A128])
> (reg:V2DI 616)) -1
>  (nil))
> 
> vs.
> 
> (insn 389 388 390 (set (reg:HI 619)
> (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
> (const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
> *)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
>  (nil))
> 
> both are based off a fake _10.  But we get alias sets 7 and 4 used here
> which might be a problem.
> 
> See update_alias_info_with_stack_vars and uses of decls_to_pointers,
> in particular from set_mem_attributes_minus_bitpos where we preserve
> TBAA info with the rewrite.  I'm not sure why that should be OK ...
> (but I'm sure I must have thought of this problem back in time)
> 
> Does the following fix the testcase?
> 
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index 84b6833225e..81c0a63eddc 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -2128,7 +2128,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int
> objectp,
>   tree *orig_base = 
>   while (handled_component_p (*orig_base))
> orig_base = _OPERAND (*orig_base, 0);
> - tree aptrt = reference_alias_ptr_type (*orig_base);
> + tree aptrt = ptr_type_node;
>   *orig_base = build2 (MEM_REF, TREE_TYPE (*orig_base), *namep,
>build_int_cst (aptrt, 0));
> }

Sorry, this doesn't help.

I noticed that it changes insns 384 and 389 to:

(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [7 MEM  [(voidD.48
*)_10]+0 S16 A128])
(reg:V2DI 616)) -1
 (nil))

(insn 389 388 390 (set (reg:HI 619)
(mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [4 MEM  [(voidD.48
*)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
 (nil))

The alias sets are not changed. Hacking further and forcing attrs.alias = 0
makes the test pass. Can we create a new alias set for each partition? Then all
the decls involved in the same partition would alias each other. A particular
decl would then alias both what it aliased before and the other decls in its
own partition.
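
To make the per-partition idea above concrete, here is a toy model in C. It is only a sketch: the struct, the field names and the functions are hypothetical and are not GCC internals, and real alias sets also have subset relationships that this model ignores.

```c
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical, not GCC code. */
struct toy_decl {
    int tbaa_set;   /* type-based alias set; 0 means "conflicts with everything" */
    int partition;  /* stack-slot partition id, or -1 if not in any partition */
};

/* Plain type-based answer: two sets conflict if equal or if either is 0. */
static bool toy_tbaa_conflict(int a, int b)
{
    return a == 0 || b == 0 || a == b;
}

/* Proposed rule: decls that share a partition always alias each other;
   otherwise fall back to the type-based answer. */
bool toy_may_alias(const struct toy_decl *a, const struct toy_decl *b)
{
    if (a->partition >= 0 && a->partition == b->partition)
        return true;
    return toy_tbaa_conflict(a->tbaa_set, b->tbaa_set);
}
```

With this rule, two accesses with alias sets 7 and 4 (as in insns 384 and 389) conflict as soon as they are placed in the same partition, while decls outside the partition keep their type-based answer.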

[Bug tree-optimization/50856] ARM: suboptimal code for absolute difference calculation

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856

--- Comment #4 from Andrew Pinski  ---
Here is a full testcase (f3 is caught via fold_cond_expr_with_comparison):
```
int f(int a, int b)
{
  int t = a - b;
  if (t > 0) return t;
  return b - a;
}
int f1(int a, int b)
{
  if (a > b)  return a - b;
  return b - a;
}

int f2(int a, int b)
{
 return (a > b) ? a - b : b - a;
}

int f3(int a, int b)
{
 return (a - b) > 0 ? a - b : b - a;
}
```

I should note that match.pd currently has the following comment about not
folding `(a - b) > 0`, because doing so would interfere with catching f3:
/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
   ??? The transformation is valid for the other operators if overflow
   is undefined for the type, but performing it here badly interacts
   with the transformation in fold_cond_expr_with_comparison which
   attempts to synthetize ABS_EXPR.  */
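
The "if overflow is undefined" condition in that comment matters: on a type that wraps, `X - Y > 0` and `X > Y` are different predicates. A minimal sketch with unsigned operands (the helper names are made up for illustration):

```c
/* Hypothetical helper names, for illustration only. */

/* "X - Y CMP 0" form on a wrapping (unsigned) type. */
int sub_then_cmp(unsigned x, unsigned y)
{
    return (x - y) > 0;   /* true whenever x != y, because x - y wraps */
}

/* "X CMP Y" form. */
int direct_cmp(unsigned x, unsigned y)
{
    return x > y;
}
```

For x = 1 and y = 2 the first form is true and the second is false, so the rewrite is only safe where overflow is undefined, as with the signed `int` operands in the testcase.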

[Bug target/111725] Missed one vsetivli insn

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111725

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:29331e72d0ce9fe8aabdeb8c320b99943b9e067a

commit r14-4773-g29331e72d0ce9fe8aabdeb8c320b99943b9e067a
Author: Lehua Ding 
Date:   Fri Oct 20 10:22:43 2023 +0800

RISC-V: Refactor and cleanup vsetvl pass

This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain
   and modify this virtual CFG. Phase 4 performs insertion, modification and
   deletion of vsetvl insns based on the virtual CFG. The basic block in the
   virtual CFG is called vsetvl_block_info and the vsetvl information inside
   is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system;
   this phase only fuses local vsetvl info in the forward direction.
3. Refactor Phase 3: change the logic for determining whether to uplift vsetvl
   info to a pred basic block to a more unified method, which checks whether a
   vsetvl info in the reaching vsetvl definitions is compatible with it.
4. Place all modification operations on the RTL in Phase 4 and Phase 5.
   Phase 4 is responsible for inserting, modifying and deleting vsetvl
   instructions based on fully optimized vsetvl infos. Phase 5 removes the avl
   operand from the RVV instruction and removes the unused dest operand
   register from the vsetvl insns.

These modifications resulted in some testcases needing to be updated. The
reasons for updating are summarized below:

1. more optimized
   vlmax_back_prop-{25,26}.c
   vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/vsetvl-23.c/
   avl_single-{23,84,95}.c/pr109773-1.c
2. less unnecessary fusion
   avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
   scalar_move-1.c
4. add some bugfix testcases.
   pr111037-{3,4}.c/pr111037-4.c
   avl_single-{89,104,105,106,107,108,109}.c

PR target/111037
PR target/111234
PR target/111725

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry):
New.
(debug): Removed.
(compute_reaching_defintion): New.
(enum vsetvl_type): Moved.
(vlmax_avl_p): Moved.
(enum emit_type): Moved.
(vlmul_to_str): Moved.
(vlmax_avl_insn_p): Removed.
(policy_to_str): Moved.
(loop_basic_block_p): Removed.
(valid_sew_p): Removed.
(vsetvl_insn_p): Moved.
(vsetvl_vtype_change_only_p): Removed.
(after_or_same_p): Removed.
(before_p): Removed.
(anticipatable_occurrence_p): Removed.
(available_occurrence_p): Removed.
(insn_should_be_added_p): Removed.
(get_all_sets): Moved.
(get_same_bb_set): Moved.
(gen_vsetvl_pat): Removed.
(calculate_vlmul): Moved.
(get_max_int_sew): New.
(emit_vsetvl_insn): Removed.
(get_max_float_sew): New.
(eliminate_insn): Removed.
(insert_vsetvl): Removed.
(count_regno_occurrences): Moved.
(get_vl_vtype_info): Removed.
(enum def_type): Moved.
(validate_change_or_fail): Moved.
(change_insn): Removed.
(get_all_real_uses): Moved.
(get_forward_read_vl_insn): Removed.
(get_backward_fault_first_load_insn): Removed.
(change_vsetvl_insn): Removed.
(avl_source_has_vsetvl_p): Removed.
(source_equal_p): Moved.
(calculate_sew): Removed.
(same_equiv_note_p): Moved.
(get_expr_id): New.
(incompatible_avl_p): Removed.
(get_regno): New.
(different_sew_p): Removed.
(get_bb_index): New.
(different_lmul_p): Removed.
(has_no_uses): Moved.
(different_ratio_p): Removed.
(different_tail_policy_p): Removed.
(different_mask_policy_p): Removed.
(possible_zero_avl_p): Removed.
(enum demand_flags): New.
(second_ratio_invalid_for_first_sew_p): Removed.
(second_ratio_invalid_for_first_lmul_p): Removed.
(enum class): New.
(float_insn_valid_sew_p): Removed.
(second_sew_less_than_first_sew_p): Removed.
(first_sew_less_than_second_sew_p): Removed.
(class vsetvl_info): New.
(compare_lmul): Removed.
(second_lmul_less_than_first_lmul_p): Removed.
(second_ratio_less_than_first_ratio_p): Removed.
(DEF_INCOMPATIBLE_COND): Removed.

[Bug target/111234] RISC-V: ICE in vsetvl pass

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111234

--- Comment #3 from CVS Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:29331e72d0ce9fe8aabdeb8c320b99943b9e067a

commit r14-4773-g29331e72d0ce9fe8aabdeb8c320b99943b9e067a
Author: Lehua Ding 
Date:   Fri Oct 20 10:22:43 2023 +0800

RISC-V: Refactor and cleanup vsetvl pass


[Bug target/111848] RISC-V: RVV cost model pick unexpected big LMUL

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848

--- Comment #2 from CVS Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:f0e28d8c13713f509fde26fbe7dd13280b67fb87

commit r14-4774-gf0e28d8c13713f509fde26fbe7dd13280b67fb87
Author: Juzhe-Zhong 
Date:   Wed Oct 18 18:25:33 2023 +0800

RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848

But it generates horrible register spilling.

The root cause is that we didn't hoist the vmv.v.x outside the loop, which
increases the SLP loop register pressure.

So, change the CONST_VECTOR move into a vec_duplicate splitter so that we can
gain better optimizations:

1. Better LICM.
2. More opportunities for transforming 'vv' into 'vx' in the future.
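
As a rough scalar analogue of the hoisting issue (not code from the patch; the function names and the `step` parameter are hypothetical), the sketch below contrasts a loop that rebuilds an invariant mask on every iteration with one where the mask is computed once up front. The first form corresponds to a broadcast such as `vmv.v.x` staying inside the loop, the second to what LICM is expected to produce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar analogue; names are illustrative, not from the patch. */

/* The mask is rebuilt on every iteration, as when the broadcast is not hoisted. */
void and_mask_in_loop(uint16_t *dst, const uint16_t *src, size_t n, uint16_t step)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t mask = (uint16_t)(step - 1);   /* loop-invariant value */
        dst[i] = src[i] & mask;
    }
}

/* The same loop with the invariant computed once, before the loop. */
void and_mask_hoisted(uint16_t *dst, const uint16_t *src, size_t n, uint16_t step)
{
    const uint16_t mask = (uint16_t)(step - 1);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] & mask;
}
```

Both functions compute the same result; the difference is only where the invariant lives, which is what affects register pressure inside the loop.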

Before this patch:

f3:
ble a4,zero,.L8
csrr    t0,vlenb
slli    t1,t0,4
csrr    a6,vlenb
sub sp,sp,t1
csrr    a5,vlenb
slli    a6,a6,3
slli    a5,a5,2
add a6,a6,sp
vsetvli a7,zero,e16,m8,ta,ma
slli    a4,a4,3
vid.v   v8
addi    t6,a5,-1
vand.vi v8,v8,-2
neg t5,a5
vs8r.v  v8,0(sp)
vadd.vi v8,v8,1
vs8r.v  v8,0(a6)
j   .L4
.L12:
vsetvli a7,zero,e16,m8,ta,ma
.L4:
csrr    t0,vlenb
slli    t0,t0,3
vl8re16.v   v16,0(sp)
add t0,t0,sp
vmv.v.x v8,t6
mv  t1,a4
vand.vv v24,v16,v8
mv  a6,a4
vl8re16.v   v16,0(t0)
vand.vv v8,v16,v8
bleu    a4,a5,.L3
mv  a6,a5
.L3:
vsetvli zero,a6,e8,m4,ta,ma
vle8.v  v20,0(a2)
vle8.v  v16,0(a3)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v24
vadd.vv v4,v16,v4
vsetvli zero,a6,e8,m4,ta,ma
vse8.v  v4,0(a0)
vle8.v  v20,0(a2)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,a6,e8,m4,ta,ma
vse8.v  v4,0(a1)
add a4,a4,t5
add a0,a0,a5
add a3,a3,a5
add a1,a1,a5
add a2,a2,a5
bgtu    t1,a5,.L12
csrr    t0,vlenb
slli    t1,t0,4
add sp,sp,t1
jr  ra
.L8:
ret

After this patch:

f3:
ble a4,zero,.L6
csrr    a6,vlenb
csrr    a5,vlenb
slli    a6,a6,2
slli    a5,a5,2
addi    a6,a6,-1
slli    a4,a4,3
neg t5,a5
vsetvli t1,zero,e16,m8,ta,ma
vmv.v.x v24,a6
vid.v   v8
vand.vi v8,v8,-2
vadd.vi v16,v8,1
vand.vv v8,v8,v24
vand.vv v16,v16,v24
.L4:
mv  t1,a4
mv  a6,a4
bleu    a4,a5,.L3
mv  a6,a5
.L3:
vsetvli zero,a6,e8,m4,ta,ma
vle8.v  v28,0(a2)
vle8.v  v24,0(a3)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v28,v8
vadd.vv v4,v24,v4
vsetvli zero,a6,e8,m4,ta,ma
vse8.v  v4,0(a0)
vle8.v  v28,0(a2)
vsetvli a7,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v28,v16
vadd.vv v4,v4,v24
vsetvli zero,a6,e8,m4,ta,ma
vse8.v  v4,0(a1)
add a4,a4,t5
add a0,a0,a5
add a3,a3,a5
add a1,a1,a5
add a2,a2,a5
bgtu    t1,a5,.L4
.L6:
ret

Note that this patch triggers multiple FAILs:
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: 

[Bug target/111037] RISC-V: Invalid vsetvli fusion

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111037

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:29331e72d0ce9fe8aabdeb8c320b99943b9e067a

commit r14-4773-g29331e72d0ce9fe8aabdeb8c320b99943b9e067a
Author: Lehua Ding 
Date:   Fri Oct 20 10:22:43 2023 +0800

RISC-V: Refactor and cleanup vsetvl pass


[Bug c/111888] New: RISC-V: Horrible redundant number vsetvl instructions in vectorized codes

2023-10-19 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111888

Bug ID: 111888
   Summary: RISC-V: Horrible redundant number vsetvl instructions
in vectorized codes
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

https://godbolt.org/z/9G5MMa3Tq

#include <stdint.h>

void
foo (int32_t *__restrict a, int32_t *__restrict b,int32_t *__restrict c,
  int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
  int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
  int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
  int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
  int32_t *__restrict d,
  int32_t *__restrict d2,
  int32_t *__restrict d3,
  int32_t *__restrict d4,
  int32_t *__restrict d5,
  int n)
{
  for (int i = 0; i < n; i++)
{
  a[i] = b[i] + c[i];
  b5[i] = b[i] + c[i];
  a2[i] = b2[i] + c2[i];
  a3[i] = b3[i] + c3[i];
  a4[i] = b4[i] + c4[i];
  a5[i] = a[i] + a4[i];
  d2[i] = a2[i] + c2[i];
  d3[i] = a3[i] + c3[i];
  d4[i] = a4[i] + c4[i];
  d5[i] = a[i] + a4[i];
  a[i] = a5[i] + b5[i] + a[i];

  c2[i] = a[i] + c[i];
  c3[i] = b5[i] * a5[i];
  c4[i] = a2[i] * a3[i];
  c5[i] = b5[i] * a2[i];
  c[i] = a[i] + c3[i];
  c2[i] = a[i] + c4[i];
  a5[i] = a[i] + a4[i];
  a[i] = a[i] + b5[i] + a[i] * a2[i] * a3[i] * a4[i] 
  * a5[i] * c[i] * c2[i] * c3[i] * c4[i] * c5[i]
  * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
}
}


Loop body:

vsetvli t1,t4,e8,mf4,ta,ma
vle32.v v1,0(a1)
vle32.v v4,0(a2)
vle32.v v2,0(s10)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v4,v4,v1
vsetvli zero,t4,e32,m1,ta,ma
vle32.v v7,0(s9)
vle32.v v1,0(a4)
vse32.v v4,0(t0)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v2,v7,v2
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v2,0(t5)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v5,v2,v4
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v5,0(s3)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v3,v5,v4
vsetvli zero,t4,e32,m1,ta,ma
vle32.v v9,0(a5)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v3,v3,v4
vsetvli zero,t4,e32,m1,ta,ma
vle32.v v6,0(a7)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v1,v9,v1
vsetvli zero,t4,e32,m1,ta,ma
vle32.v v8,0(s8)
vse32.v v1,0(a3)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v6,v8,v6
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v6,0(a6)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v11,v5,v4
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v11,0(s4)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v13,v11,v3
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v13,0(s6)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v10,v6,v1
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v10,0(s5)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v12,v1,v4
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v12,0(t2)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v9,v1,v9
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v9,0(s0)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v8,v6,v8
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v8,0(s1)
vsetvli t3,zero,e32,m1,ta,ma
vadd.vv v7,v2,v7
vsetvli zero,t4,e32,m1,ta,ma
vse32.v v7,0(s2)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v1,v3,v1
vmul.vv v1,v1,v6
vadd.vv v6,v10,v3
vmul.vv v1,v1,v2
vadd.vv v2,v3,v2
vmul.vv v1,v1,v2
vmul.vv v1,v1,v13
vsetvli zero,t1,e32,m1,ta,ma
vse32.v v6,0(s7)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v1,v1,v6
vsetvli zero,t1,e32,m1,ta,ma
vse32.v v2,0(t6)
vsetvli t3,zero,e32,m1,ta,ma
vmul.vv v1,v1,v11
vsetvli zero,t1,e32,m1,ta,ma
vle32.v v2,0(s11)
vsetvli t3,zero,e32,m1,ta,ma
slli    t3,t1,2
vmul.vv v1,v1,v10
vadd.vv v3,v3,v4
vmul.vv v1,v1,v12
sub t4,t4,t1
vmul.vv v1,v1,v2
vmul.vv v1,v1,v9
vmul.vv v1,v1,v8
vmul.vv v1,v1,v7
vmadd.vv    v5,v1,v3
vsetvli zero,t1,e32,m1,ta,ma
vse32.v v5,0(a0)

So much redundant AVL toggling. Ideally, there should be only a single vsetvl
instruction in the header of the loop. All other vsetvls should be elided.

This has been a known issue for a long time.
I will be working on it soon, based on the refactored VSETVL pass.

[Bug c++/101631] gcc allows for the changing of an union active member to be changed via a reference

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101631

--- Comment #4 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:1d260ab0e39ea63644e3af3ab2e0db946026b5a6

commit r14-4771-g1d260ab0e39ea63644e3af3ab2e0db946026b5a6
Author: Nathaniel Shead 
Date:   Thu Oct 12 19:53:55 2023 +1100

c++: indirect change of active union member in constexpr
[PR101631,PR102286]

This patch adds checks for attempting to change the active member of a
union by methods other than a member access expression.

To be able to properly distinguish `*() = ` from `u.a = `, this
patch redoes the solution for c++/59950 to avoid extraneous *&; it
seems that the only case that needed the workaround was when copying
empty classes.

This patch also ensures that constructors for a union field mark that
field as the active member before entering the call itself; this ensures
that modifications of the field within the constructor's body don't
cause false positives (as these will not appear to be member access
expressions). This means that we no longer need to start the lifetime of
empty union members after the constructor body completes.

As a drive-by fix, this patch also ensures that value-initialised unions
are considered to have activated their initial member for the purpose of
checking stores and accesses, which catches some additional mistakes
pre-C++20.

PR c++/101631
PR c++/102286

gcc/cp/ChangeLog:

* call.cc (build_over_call): Fold more indirect refs for trivial
assignment op.
* class.cc (type_has_non_deleted_trivial_default_ctor): Create.
* constexpr.cc (cxx_eval_call_expression): Start lifetime of
union member before entering constructor.
(cxx_eval_component_reference): Check against first member of
value-initialised union.
(cxx_eval_store_expression): Activate member for
value-initialised union. Check for accessing inactive union
member indirectly.
* cp-tree.h (type_has_non_deleted_trivial_default_ctor):
Forward declare.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-89336-3.C: Fix union initialisation.
* g++.dg/cpp1y/constexpr-union6.C: New test.
* g++.dg/cpp1y/constexpr-union7.C: New test.
* g++.dg/cpp2a/constexpr-union2.C: New test.
* g++.dg/cpp2a/constexpr-union3.C: New test.
* g++.dg/cpp2a/constexpr-union4.C: New test.
* g++.dg/cpp2a/constexpr-union5.C: New test.
* g++.dg/cpp2a/constexpr-union6.C: New test.

Signed-off-by: Nathaniel Shead 
Reviewed-by: Jason Merrill 

[Bug c++/102286] [constexpr] construct_at incorrectly starts union array lifetime in some cases

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102286

--- Comment #5 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:1d260ab0e39ea63644e3af3ab2e0db946026b5a6

commit r14-4771-g1d260ab0e39ea63644e3af3ab2e0db946026b5a6
Author: Nathaniel Shead 
Date:   Thu Oct 12 19:53:55 2023 +1100

c++: indirect change of active union member in constexpr
[PR101631,PR102286]


[Bug c/111887] GCC: 14: A potential miscompilation with __builtin_inf

2023-10-19 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111887

--- Comment #2 from wierton <141242068 at smail dot nju.edu.cn> ---
Thanks for your reply, I got it!

[Bug c/111887] GCC: 14: A potential miscompilation with __builtin_inf

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111887

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski  ---
-Ofast enables -ffinite-math-only, which means the compiler assumes infinities
never show up, so comparisons against infinity will not work.
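
One way for a program to notice that it is being built under these assumptions is the `__FINITE_MATH_ONLY__` macro, which GCC predefines as 1 when -ffinite-math-only is in effect and as 0 otherwise. The sketch below is only an illustration; the two function names are hypothetical.

```c
#include <stdbool.h>

/* Reports whether this translation unit was compiled under finite-math-only
   assumptions (for example via -Ofast or -ffinite-math-only). */
bool finite_math_only_enabled(void)
{
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return true;
#else
    return false;
#endif
}

/* Hypothetical helper: the comparison is only meaningful when
   finite_math_only_enabled() returns false. */
bool is_positive_infinity(double x)
{
    return x == __builtin_inf();
}
```

Code that genuinely needs to compare against infinity can check `finite_math_only_enabled()` first, or be compiled without -Ofast.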

[Bug c/111887] New: GCC: 14: A potential miscompilation with __builtin_inf

2023-10-19 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111887

Bug ID: 111887
   Summary: GCC: 14: A potential miscompilation with __builtin_inf
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

I am uncertain whether this program contains undefined behavior. The testcase
is:
```
extern void abort (void);

void test(double f, double i)
{
  if (i != __builtin_inf())
    abort ();
}

int main()
{
  test (34.0, __builtin_inf());
  return 0;
}

```

When compiled with -O0, -O1 or -O2, the program exits normally, but when
compiled with -Ofast, the program aborts.

Here is the verification link: https://gcc.godbolt.org/z/7YKbvha84

[Bug middle-end/101195] ICE: in tree_to_uhwi, at tree.c:6324

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101195

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |141242068 at smail dot nju.edu.cn

--- Comment #3 from Andrew Pinski  ---
*** Bug 111886 has been marked as a duplicate of this bug. ***

[Bug c/111886] GCC: 14: internal compiler error: in tree_to_uhwi, at tree.cc:6467

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111886

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 101195 ***

[Bug c/111886] New: GCC: 14: internal compiler error: in tree_to_uhwi, at tree.cc:6467

2023-10-19 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111886

Bug ID: 111886
   Summary: GCC: 14: internal compiler error: in tree_to_uhwi, at
tree.cc:6467
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

The ICE trigger:
```
void f() {
  __builtin_eh_return_data_regno(-1);;
}
```

Verification link: https://gcc.godbolt.org/z/K9178a89W

The stack dump:
```
during RTL pass: expand
: In function 'f':
:2:3: internal compiler error: in tree_to_uhwi, at tree.cc:6467
2 |   __builtin_eh_return_data_regno(-1);;
  |   ^~
0x231788e internal_error(char const*, ...)
???:0
0xa0002a fancy_abort(char const*, int, char const*)
???:0
0xca7b96 expand_builtin_eh_return_data_regno(tree_node*)
???:0
0xb79cc6 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
???:0
0xccd48b expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
```
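
The assertion behind this ICE can be pictured with a small stand-alone sketch.
It assumes only that tree_to_uhwi requires its operand to fit in an unsigned
HOST_WIDE_INT, which -1 does not; fits_uhwi and to_uhwi_checked are
hypothetical stand-ins, not GCC's tree API:

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a "does this constant fit?" predicate:
// a signed constant fits an unsigned wide integer only if it is non-negative.
constexpr bool fits_uhwi(std::int64_t v)
{
  return v >= 0;
}

// Validate before converting, so that a bad argument such as -1 becomes a
// recoverable "no value" instead of a failed assertion.
constexpr std::optional<std::uint64_t> to_uhwi_checked(std::int64_t v)
{
  if (!fits_uhwi(v))
    return std::nullopt;
  return static_cast<std::uint64_t>(v);
}
```

A caller that gets no value back can then report an ordinary error about the
argument rather than crashing.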

[Bug c++/111885] [14 Regression] source code after "required from here" note sometimes printed strangely

2023-10-19 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111885

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |MOVED
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Patrick Palka  ---
On second thought I reckon I'll just note this observation in the patch email
thread instead, sorry for the noise...

[Bug tree-optimization/50856] ARM: suboptimal code for absolute difference calculation

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50856

--- Comment #3 from Andrew Pinski  ---
The second case will be solved by updating the patch at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574892.html

to address the review at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574948.html

[Bug c++/54367] [meta-bug] lambda expressions

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54367
Bug 54367 depends on bug 79021, which changed state.

Bug 79021 Summary: attribute noreturn on function template ignored in generic 
lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79021

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/79021] attribute noreturn on function template ignored in generic lambda

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79021

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |8.5

--- Comment #4 from Andrew Pinski  ---
Yes, this was the same issue as PR 94742. I will note that in the original
testcase, the call to f inside h was not a dependent call, so it would have
been resolved; if you make it a dependent call, then you run into the same
issue as the generic lambda. Conversely, if you change the call to f inside
the generic lambda to be non-dependent, you don't get the warning.

So this is all fixed with a testcase added already.

[Bug c++/79021] attribute noreturn on function template ignored in generic lambda

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79021

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection
  Known to work||10.1.0

--- Comment #3 from Andrew Pinski  ---
Looks to be fixed in GCC 10 though.

[Bug c++/108238] auto return type and some attributes don't get along

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108238

--- Comment #4 from Andrew Pinski  ---
Here is one which is a little more complex for templated function too:
```
template<typename T>
[[gnu::returns_nonnull]]
auto f() {
  return new T(42);
}

auto g(void)
{
  return f<int>();
}
```

[Bug target/111528] aarch64: Test gfortran.dg/pr80494.f90 fails with -fstack-protector-strong with gcc-13 since r13-7813-gb96e66fd4ef3e3

2023-10-19 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528

Richard Sandiford  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #7 from Richard Sandiford  ---
Fixed on GCC 12 and 13 branches.  I'm a bit nervous about backporting the fix
to GCC 11 given that the next release will be the last.

[Bug target/111876] bf16 complex mul/div does not work when the target has +fp16 support or when -fexcess-precision=16 is supplied

2023-10-19 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111876

--- Comment #5 from Iain Sandoe  ---
for the record an __fp16 implementation works as expected;

* when the target does not support +fp16, the code-gen promotes to float and
does the multiply with __mulsc3

* when the target supports +fp16, the code-gen uses __mulhc3 (which is
implemented in libgcc)

* adding "+bf16" does not affect this.

---

So the issue is that __bf16 works when there is no support for a 16-bit float
(because, like fp16, it gets promoted), but fails when the target declares
support for a different 16-bit float.
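
The promotion path described above can be sketched in portable code. This is
only an illustration under stated assumptions: bf16, complex_bf16 and
mul_promoted are hypothetical names, the type is emulated as the upper 16 bits
of a binary32 float, and it says nothing about what GCC's front end or libgcc
actually do:

```cpp
#include <complex>
#include <cstdint>
#include <cstring>

// Hypothetical software bf16: the upper 16 bits of an IEEE binary32 value.
struct bf16 { std::uint16_t bits; };

inline float to_float(bf16 h)
{
  std::uint32_t u = static_cast<std::uint32_t>(h.bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Truncating conversion (round toward zero), chosen to keep the sketch short.
inline bf16 from_float(float f)
{
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return bf16{static_cast<std::uint16_t>(u >> 16)};
}

struct complex_bf16 { bf16 re, im; };

// Promote both operands to float, multiply there, then narrow the result.
inline complex_bf16 mul_promoted(complex_bf16 a, complex_bf16 b)
{
  std::complex<float> x(to_float(a.re), to_float(a.im));
  std::complex<float> y(to_float(b.re), to_float(b.im));
  std::complex<float> r = x * y;
  return complex_bf16{from_float(r.real()), from_float(r.imag())};
}
```

With this shape no 16-bit complex multiply routine is needed at all, which is
the behaviour reported for targets without a native 16-bit float.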

[Bug c++/108238] returns_nonnull attribute with auto return type fails to compile

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108238

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #3 from Andrew Pinski  ---
The alloc_align, alloc_size, assume_aligned, and malloc attributes have a
similar issue.
const and pure do too.

```
[[gnu::const]]
auto f(){}

[[gnu::const]]
void f1(){}
```

warn_unused_result has a similar issue too:
```
[[gnu::warn_unused_result]]
auto f(){}

[[gnu::warn_unused_result]]
void f1(){}
```

There might be others too.

[Bug c++/111885] New: source code after "required from here" note sometimes printed strangely

2023-10-19 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111885

Bug ID: 111885
   Summary: source code after "required from here" note sometimes
printed strangely
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ppalka at gcc dot gnu.org
  Target Milestone: ---

$ cat required-from-here-bug.C
template<class T> void f(typename T::type);

int main() {
  f<int>(0);
}

$ g++ required-from-here-bug.C
required-from-here-bug.C: In function ‘int main()’:
required-from-here-bug.C:4:9: error: no matching function for call to
‘f<int>(int)’
4 |   f<int>(0);
  |   ~~^~~
required-from-here-bug.C:1:24: note: candidate: ‘template<class T> void
f(typename T::type)’
1 | template<class T> void f(typename T::type);
  |^
required-from-here-bug.C:1:24: note:   template argument deduction/substitution
failed:
required-from-here-bug.C: In substitution of ‘template<class T> void f(typename
T::type) [with T = int]’:
required-from-here-bug.C:4:9:   required from here
required-from-here-bug.C:1:24: note: 4 |   f<int>(0);
required-from-here-bug.C:1:24: note:   |   ~~^~~
required-from-here-bug.C:1:24: error: ‘int’ is not a class, struct, or union
type
1 | template<class T> void f(typename T::type);
  |^

The two notes following the "required from here" diagnostic seem to be
misprinted -- they refer to line 1, but show source code from line 4, which
seems like a bug?

[Bug c/111884] [13/14 Regression] unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread tom at honermann dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

--- Comment #4 from Tom Honermann  ---
(In reply to Marek Polacek from comment #3)
> Thanks, I can test

Thank you. That change looks right. My apologies for introducing the
regression.

[Bug testsuite/111883] Wstringop-overflow-6.C FAILs with -std=c++26

2023-10-19 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111883

--- Comment #3 from Marek Polacek  ---
Did you mean like the following?  I have no idea if that's correct but it
suppresses the warnings I see.

In C++23 I don't see the code in the .ii file at all, so it doesn't warn.

--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -684,7 +684,7 @@ namespace __detail
 from_chars_result __res
   = __from_chars_float16_t(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = _Float16(__val);
 return __res;
   }
 #endif
@@ -697,7 +697,7 @@ namespace __detail
 float __val;
 from_chars_result __res = from_chars(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = _Float32(__val);
 return __res;
   }
 #endif
@@ -710,7 +710,7 @@ namespace __detail
 double __val;
 from_chars_result __res = from_chars(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = _Float64(__val);
 return __res;
   }
 #endif
@@ -723,7 +723,7 @@ namespace __detail
 long double __val;
 from_chars_result __res = from_chars(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = _Float128(__val);
 return __res;
   }
 #elif defined(__STDCPP_FLOAT128_T__) && defined(_GLIBCXX_HAVE_FLOAT128_MATH)
@@ -739,7 +739,7 @@ namespace __detail
 __extension__ __ieee128 __val;
 from_chars_result __res = from_chars(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = _Float128(__val);
 return __res;
   }
 #else
@@ -760,7 +760,7 @@ namespace __detail
 from_chars_result __res
   = __from_chars_bfloat16_t(__first, __last, __val, __fmt);
 if (__res.ec == errc{})
-  __value = __val;
+  __value = __gnu_cxx::__bfloat16_t(__val);
 return __res;
   }
 #endif
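
The shape of the change is the same in every hunk: the implicit narrowing
assignment becomes an explicit conversion. A minimal sketch with standard
types standing in for the extended floating-point types (float and double do
not trigger the conversion-rank warning, so this only illustrates the
pattern); store_parsed is a hypothetical name:

```cpp
// Hypothetical helper: the value was parsed into a wider type and is then
// narrowed explicitly into the caller's variable.
template <typename Narrow, typename Wide>
void store_parsed(Narrow& value, Wide parsed)
{
  // Functional-style cast, in the same form as the hunks above.
  value = Narrow(parsed);
}
```

The explicit conversion states the intent to narrow, which is what the
diagnostic is asking for.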

[Bug c/111884] [13/14 Regression] unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

Andrew Pinski  changed:

   What|Removed |Added

Summary|unsigned char no longer |[13/14 Regression] unsigned
   |aliases anything under  |char no longer aliases
   |-std=c2x|anything under -std=c2x
   Keywords||diagnostic
   Target Milestone|--- |13.3

[Bug c/111884] unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 CC||mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Marek Polacek  ---
Thanks, I can test

--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -3828,8 +3828,9 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
 return -1;

-  /* Unlike char, char8_t doesn't alias. */
-  if (flag_char8_t && t == char8_type_node)
+  /* Unlike char, char8_t doesn't alias in C++.  (In C, char8_t is not
+ a distinct type.)  */
+  if (flag_char8_t && t == char8_type_node && c_dialect_cxx ())
 return -1;

   /* The C standard guarantees that any object may be accessed via an

[Bug testsuite/111883] Wstringop-overflow-6.C FAILs with -std=c++26

2023-10-19 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111883

--- Comment #2 from Jakub Jelinek  ---
Why doesn't it fail with -std=c++23 though?  Was there some C++26 change I'm
not aware of?
In the to_chars cases, we already use float(__value) casts in the
_Float16/__bfloat16_t cases (but others too), so I think we just want to add
explicit casts also to all the from_chars
  __value = __val;
lines (or at least the _Float16/__bfloat16_t cases).

[Bug c/111884] unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

--- Comment #2 from joseph at codesourcery dot com  ---
I'm going to guess this was introduced by the char8_t changes ("C: 
Implement C2X N2653 char8_t and UTF-8 string literal changes", commit 
703837b2cc8ac03c53ac7cc0fb1327055acaebd2).

  /* Unlike char, char8_t doesn't alias. */
  if (flag_char8_t && t == char8_type_node)
return -1;

is not correct for C, where char8_t is not a distinct type.

[Bug c/111884] unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

Sam James  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org,
   ||sjames at gcc dot gnu.org

--- Comment #1 from Sam James  ---
Originally reported at
https://inbox.sourceware.org/gcc-help/e18818e9-396c-41fe-6c4a-9ac86c564...@ispras.ru/T/#m147d6322a26e2bc094dd5e0935b7f2223bc910ac.
Curious indeed..

[Bug testsuite/111883] Wstringop-overflow-6.C FAILs with -std=c++26

2023-10-19 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111883

--- Comment #1 from Jonathan Wakely  ---
I think Jakub wrote that code, but it looks like we just want the explicit
casts. I can add those.

[Bug c/111884] New: unsigned char no longer aliases anything under -std=c2x

2023-10-19 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111884

Bug ID: 111884
   Summary: unsigned char no longer aliases anything under
-std=c2x
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

int f(int i)
{
    int f = 1;
    return i[(unsigned char *)&f];
}
int g(int i)
{
    int f = 1;
    return i[(signed char *)&f];
}
int h(int i)
{
    int f = 1;
    return i[(char *)&f];
}


gcc -O2 -std=c2x compiles 'f' as though inspecting representation via an
'unsigned char *' is not valid (with a confusing warning under -Wall).
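
For reference, a minimal sketch of the access pattern in 'f', written out with
an explicit helper. byte_of is a hypothetical name; reading any object's bytes
through an 'unsigned char *' is permitted by the aliasing rules, so the
compiler must not treat the load as unrelated to the int:

```cpp
#include <cstddef>

// Return byte i of the object representation of value.
// The caller is assumed to pass i < sizeof(int).
inline unsigned char byte_of(int value, std::size_t i)
{
  const unsigned char *p = reinterpret_cast<const unsigned char *>(&value);
  return p[i];
}
```

Which byte holds which part of the value depends on the target's endianness,
but the read itself is well defined.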

[Bug target/111645] Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-10-19 Thread carll at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645

Carl Love  changed:

   What|Removed |Added

 CC||carll at gcc dot gnu.org

--- Comment #5 from Carl Love  ---

There are a couple of issues with the test case in the attachment.  For example
one of the tests is:


static inline vui64_t
vec_vsldbi_64 (vui64_t vra, vui64_t vrb, const unsigned int shb)
{
 return vec_sldb (vra, vrb, shb);
}

When I tried to compile it, it seemed to compile.  However, if I take off the
static inline, then I get an error about incompatible arguments.  The built-in
requires an explicit integer to be passed as the third argument.  The following
worked for me:


static inline vui64_t
vec_vsldbi_64 (vui64_t vra, vui64_t vrb, const unsigned int shb)
{
 return vec_sldb (vra, vrb, 1);
}

The compiler/assembler needs an explicit value for the third argument as it has
to generate the instruction with the immediate shift value as part of the
instruction.  Hence a variable for the third argument will not work.
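
One way to express the same constraint in C++ source is to make the shift
count a template argument, so that it is a compile-time constant at every
call. The following is a scalar analogue under stated assumptions, not the
intrinsic itself: sldb64 is a hypothetical name, it works on two 64-bit
integers instead of vectors, and the 0 to 7 range is assumed from the 3-bit
immediate field:

```cpp
#include <cstdint>

// Concatenate hi:lo, shift left by Shb bits, and return the upper 64 bits.
template <unsigned Shb>
constexpr std::uint64_t sldb64(std::uint64_t hi, std::uint64_t lo)
{
  static_assert(Shb < 8, "shift count must be an immediate in 0..7");
  if constexpr (Shb == 0)
    return hi;  // avoids the undefined 64-bit shift in the other branch
  else
    return (hi << Shb) | (lo >> (64 - Shb));
}
```

A call such as sldb64<1>(a, b) is accepted, while a run-time shift count
cannot be used as the template argument and is rejected at compile time.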

Agreed that the __int128 arguments can and should be supported.  A patch to add
that support is in progress but will require getting the LLVM/OpenXL team to
agree to adding the __int128 variants as well.

[Bug target/111876] aarch64: bf16 complex mul/div does not work when the target has +fp16 support or when -fexcess-precision=16 is supplied

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111876

Andrew Pinski  changed:

   What|Removed |Added

 Target|aarch64 |aarch64 x86_64
Summary|aarch64: bf16 complex   |aarch64: bf16 complex
   |mul/div does not work when  |mul/div does not work when
   |the target has +fp16|the target has +fp16
   |support.|support or when
   ||-fexcess-precision=16 is
   ||supplied
 CC||jakub at gcc dot gnu.org

--- Comment #4 from Andrew Pinski  ---
So the difference between with and without +fp16 is the same as the difference
between using -fexcess-precision=16 and not using it.

On x86_64 with -fexcess-precision=16, we get the same link failure.
Maybe Jakub understands what the correct behavior is here, since he added
bfloat16 support to x86_64 (r13-3292-gc2565a31c1622ab0926aeef4a6579413e121b9f9).

[Bug target/111876] aarch64: bf16 complex mul/div does not work when the target has +fp16 support.

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111876

--- Comment #3 from Andrew Pinski  ---
(In reply to Iain Sandoe from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > This could either be wrong code for not doing the promotion or just missing
> > the libgcc functions (which could be implemented as doing the promotion).
> > 
> > Either way, confirmed.
> 
> Thanks for checking,
> but I think the underlying concern is that providing a disjoint extension
> (+fp16) should not alter the behaviour of bf16 (in this case I did some
> limited poking about but could not see any obvious place where the addition
> of fp16 alters complex number handling).

The difference comes from the front-end, which is what adds the promotions.

[Bug target/111876] aarch64: bf16 complex mul/div does not work when the target has +fp16 support.

2023-10-19 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111876

--- Comment #2 from Iain Sandoe  ---
(In reply to Andrew Pinski from comment #1)
> This could either be wrong code for not doing the promotion or just missing
> the libgcc functions (which could be implemented as doing the promotion).
> 
> Either way, confirmed.

Thanks for checking,
but I think the underlying concern is that providing a disjoint extension
(+fp16) should not alter the behaviour of bf16 (in this case I did some limited
poking about but could not see any obvious place where the addition of fp16
alters complex number handling).

[Bug target/111876] aarch64: bf16 complex mul/div does not work when the target has +fp16 support.

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111876

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
   Last reconfirmed||2023-10-19
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
Summary|aarch64: Wrong code for |aarch64: bf16 complex
   |bf16 complex mul/div when   |mul/div does not work when
   |the target has +fp16|the target has +fp16
   |support.|support.

--- Comment #1 from Andrew Pinski  ---
This could either be wrong code for not doing the promotion or just missing the
libgcc functions (which could be implemented as doing the promotion).

Either way, confirmed.

[Bug tree-optimization/111878] [14 Regression] ICE: in get_loop_exit_edges, at cfgloop.cc:1204 with -O3 -fgraphite-identity -fsave-optimization-record/-fdump-tree-graphite

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111878

Andrew Pinski  changed:

   What|Removed |Added

Summary|[14 Regression] ICE: in |[14 Regression] ICE: in
   |get_loop_exit_edges, at |get_loop_exit_edges, at
   |cfgloop.cc:1204 with -O3|cfgloop.cc:1204 with -O3
   |-fgraphite-identity |-fgraphite-identity
   |-fsave-optimization-record  |-fsave-optimization-record/
   ||-fdump-tree-graphite

--- Comment #3 from Andrew Pinski  ---
If we wrap the whole function inside a loop, there is no crash.
That is:
```
int long_c2i_ltmp;
int *long_c2i_cont;

void
long_c2i (long utmp, int i)
{
  for (int j = 0; j < 100; j++)
    {
      int neg = 1;
      switch (long_c2i_cont[0])
      case 0:
        neg = 0;

      for (; i; i++)
        if (neg)
          utmp |= long_c2i_cont[i] ^ 5;
        else
          utmp |= long_c2i_cont[i];
      long_c2i_ltmp = utmp;
    }
}
```

That is because the loop that is being chosen here for the `->loop_father` is
the outermost loop (the one that was just added).

[Bug tree-optimization/111878] [14 Regression] ICE: in get_loop_exit_edges, at cfgloop.cc:1204 with -O3 -fgraphite-identity -fsave-optimization-record

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111878

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-10-19
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
`-O3 -fgraphite-identity -fdump-tree-graphite` will also cause the ICE.

The code in graphite is:
dump_user_location_t loc = find_loop_location
  (scops[i]->scop_info->region.entry->dest->loop_father);
dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
 "loop nest optimized\n");

[Bug ipa/111873] [12/13/14 Regression] runtime Segmentation fault with '-O3 -fno-code-hoisting -fno-early-inlining -fno-tree-fre -fno-tree-loop-optimize -fno-tree-pre'

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111873

Andrew Pinski  changed:

   What|Removed |Added

Summary|runtime Segmentation fault  |[12/13/14 Regression]
   |with '-O3   |runtime Segmentation fault
   |-fno-code-hoisting  |with '-O3
   |-fno-early-inlining |-fno-code-hoisting
   |-fno-tree-fre   |-fno-early-inlining
   |-fno-tree-loop-optimize |-fno-tree-fre
   |-fno-tree-pre'  |-fno-tree-loop-optimize
   ||-fno-tree-pre'
   Target Milestone|--- |12.4
 Status|UNCONFIRMED |NEW
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2023-10-19
  Component|middle-end  |ipa

--- Comment #3 from Andrew Pinski  ---
Confirmed.
It is the inliner which messes up and introduces the store to a const variable.
We go from:
```
int main ()
{
   [local count: 1073741824]:
  h (c);
  return 0;

}


void h (const struct a i)
{
  const short int i$b;
  int _1;

   [local count: 1073741824]:
  i$b_5 = i.b;
  i.b = i$b_5;
```

To:
```
int main ()
{
  int D.2031;
  const short int i$b;
  int _4;
  int _6;

   [local count: 1073741824]:
  i$b_3 = 0;
  c.b = i$b_3;
```

Which is totally wrong.

[Bug testsuite/111883] New: Wstringop-overflow-6.C FAILs with -std=c++26

2023-10-19 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111883

Bug ID: 111883
   Summary: Wstringop-overflow-6.C FAILs with -std=c++26
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

FAIL: g++.dg/warn/Wstringop-overflow-6.C  -std=gnu++26 (test for excess errors)

the excess warning is:

$ xg++ -c Wstringop-overflow-6.C  -std=gnu++26 -O2 -Wall -Wsystem-headers 
In file included from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:51,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/string:54,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_classes.h:40,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/ios_base.h:41,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/ios:44,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/ostream:40,
 from
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/iostream:41,
 from Wstringop-overflow-6.C:6:
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/charconv: In
function ‘std::from_chars_result std::from_chars(const char*, const char*,
_Float16&, chars_format)’:
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/charconv:687:17:
warning: converting to ‘_Float16’ from ‘float’ with greater conversion rank
  687 |   __value = __val;
  | ^
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/charconv: In
function ‘std::from_chars_result std::from_chars(const char*, const char*,
__gnu_cxx::__bfloat16_t&, chars_format)’:
/home/mpolacek/x/trunk/x86_64-pc-linux-gnu/libstdc++-v3/include/charconv:763:17:
warning: converting to ‘__gnu_cxx::__bfloat16_t’ {aka ‘__bf16’} from ‘float’
with greater conversion rank
  763 |   __value = __val;
  | ^

[Bug fortran/111880] [9/10/11/12/13] False positive warning of obsolescent COMMON block with Fortran submodule

2023-10-19 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111880

--- Comment #3 from Steve Kargl  ---
On Thu, Oct 19, 2023 at 05:20:46PM +, zed.three at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111880
> 
> --- Comment #2 from zed.three at gmail dot com ---
> The common block is in 'third_party_module', rather than 'foo',
> unless you mean that it is visible from 'foo'?

Exactly. 'not_my_code' is in the namespace for foo
through use association of third_party_module. 
It seems that trying to hide 'not_my_code' with PRIVATE
or 'use third_party_module, only : some_param' in foo 
does not mute the warning.  Likely, due to -std=f2018
and F2018, Sec. 4.2.

[Bug tree-optimization/110485] vectorizing simd clone calls without loop masking applied

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110485

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Andre Simoes Dias Vieira:

https://gcc.gnu.org/g:8b704ed0b8f35ec1a57e70bd8e6913ba0e9d1f24

commit r14-4765-g8b704ed0b8f35ec1a57e70bd8e6913ba0e9d1f24
Author: Andre Vieira 
Date:   Thu Oct 19 18:28:12 2023 +0100

vect: don't allow fully masked loops with non-masked simd clones [PR 110485]

When analyzing a loop and choosing a simdclone to use it is possible to choose
a simdclone that cannot be used 'inbranch' for a loop that can use partial
vectors.  This may lead to the vectorizer deciding to use partial vectors which
are not supported for notinbranch simd clones.  This patch fixes that by
disabling the use of partial vectors once a notinbranch simd clone has been
selected.

gcc/ChangeLog:

PR tree-optimization/110485
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial
vectors usage if a notinbranch simdclone has been selected.

gcc/testsuite/ChangeLog:

* gcc.dg/gomp/pr110485.c: New test.
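
For readers unfamiliar with the terminology, a minimal sketch of a notinbranch
clone and a loop that calls it. The function names are hypothetical, and the
OpenMP pragmas only take effect with -fopenmp or -fopenmp-simd. A notinbranch
clone takes no mask argument, which is why it cannot serve a loop that relies
on partial (masked) vectors:

```cpp
// Ask for SIMD clones that may only be called unconditionally.
#pragma omp declare simd notinbranch
int scale(int x)
{
  return x * 3;
}

// A loop the vectorizer may turn into calls to the SIMD clone of scale().
void scale_all(const int *in, int *out, int n)
{
#pragma omp simd
  for (int i = 0; i < n; i++)
    out[i] = scale(in[i]);
}
```

Without the OpenMP options the pragmas are ignored and the code runs as a
plain scalar loop with the same result.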

[Bug fortran/111880] [9/10/11/12/13] False positive warning of obsolescent COMMON block with Fortran submodule

2023-10-19 Thread zed.three at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111880

--- Comment #2 from zed.three at gmail dot com ---
The common block is in 'third_party_module', rather than 'foo', unless you mean
that it is visible from 'foo'? It is still a surprising warning in this
location at any rate!

[Bug c++/89038] #pragma GCC diagnostic ignored "-Wunknown-pragmas" does not work

2023-10-19 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89038

Lewis Hyatt  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #10 from Lewis Hyatt  ---
Fixed for GCC 14 and 13.3.

[Bug c++/89038] #pragma GCC diagnostic ignored "-Wunknown-pragmas" does not work

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89038

--- Comment #9 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Lewis Hyatt:

https://gcc.gnu.org/g:7a1de35f9cdc13098375baa277496147be271dd3

commit r13-7964-g7a1de35f9cdc13098375baa277496147be271dd3
Author: Lewis Hyatt 
Date:   Wed Oct 18 12:37:08 2023 -0400

c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038]

As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
the specific case of -Wunknown-pragmas, because that warning is issued
during preprocessing, but not by libcpp directly (it comes from the
cb_def_pragma callback).  Address that by handling this pragma in
addition to libcpp pragmas during the early pragma handler.

gcc/c-family/ChangeLog:

PR c++/89038
* c-pragma.cc (handle_pragma_diagnostic_impl): Handle
-Wunknown-pragmas during early processing.

gcc/testsuite/ChangeLog:

PR c++/89038
* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
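
The usage this enables, as a minimal sketch (the pragma name frobnicate and
the function answer are made up for illustration): with the fix, the unknown
pragma below should no longer warn under -Wall, because the ignored setting is
honoured during early pragma processing:

```cpp
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
#pragma frobnicate now   // made-up pragma that no compiler is expected to know
#pragma GCC diagnostic pop

int answer()
{
  return 42;
}
```

On compilers without the fix the code still builds; the only difference is
whether the -Wunknown-pragmas warning is emitted for the made-up pragma.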

[Bug c/100532] ICE: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in useless_type_conversion_p, at gimple-expr.c:259

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100532

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Andrew Pinski  ---
Fixed.

[Bug c/104822] -Wscalar-storage-order warning for initialization from NULL seems useless

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104822

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #7 from Andrew Pinski  ---
Fixed.

[Bug c/100532] ICE: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in useless_type_conversion_p, at gimple-expr.c:259

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100532

--- Comment #10 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:2454ba9e2d1ce2d1b9b2b46f6111e022364bf9b5

commit r14-4759-g2454ba9e2d1ce2d1b9b2b46f6111e022364bf9b5
Author: Andrew Pinski 
Date:   Thu Oct 19 05:42:02 2023 +

c: Fix ICE when an argument was an error mark [PR100532]

In the case of convert_argument, we would return the same expression
back rather than error_mark_node after the error message about
trying to convert to an incomplete type. This causes issues in
the gimplifier trying to see if another conversion is needed.

The code here dates back to before the revision history too so
it might be the case it never noticed we should return an error_mark_node.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR c/100532

gcc/c/ChangeLog:

* c-typeck.cc (convert_argument): After erroring out
about an incomplete type return error_mark_node.

gcc/testsuite/ChangeLog:

* gcc.dg/pr100532-1.c: New test.

[Bug c/104822] -Wscalar-storage-order warning for initialization from NULL seems useless

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104822

--- Comment #6 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:9f33e4c50ee92a2053f52e5eb8f205afa54d4cb0

commit r14-4758-g9f33e4c50ee92a2053f52e5eb8f205afa54d4cb0
Author: Andrew Pinski 
Date:   Wed Oct 18 20:49:05 2023 -0700

c: Don't warn about converting NULL to different sso endian [PR104822]

Just as we don't warn about converting a NULL pointer constant to a
different named address space, we should not warn about converting it to a
different sso endianness either.
This adds the simple check.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR c/104822

gcc/c/ChangeLog:

* c-typeck.cc (convert_for_assignment): Check for null pointer
before warning about an incompatible scalar storage order.

gcc/testsuite/ChangeLog:

* gcc.dg/sso-18.c: New test.
* gcc.dg/sso-19.c: New test.

[Bug c++/111872] GCC rejects out of class definition of inner private class template

2023-10-19 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111872

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-19
 Ever confirmed|0   |1

--- Comment #1 from Patrick Palka  ---
Confirmed, this is similar to the example in
http://eel.is/c++draft/class.access.general#example-3

[Bug tree-optimization/111882] [13/14 Regression] : internal compiler error: in get_expr_operand in ifcvt with Variable length arrays and bitfields inside a struct

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111882

--- Comment #2 from Andrew Pinski  ---
Looks like this was broken when bitfield expansion was added to ifcvt (I think
r13-3219-g25413fdb2ac24933214123e24ba165026452a6f2 ).

[Bug tree-optimization/111882] [13/14 Regression] : internal compiler error: in get_expr_operand in ifcvt with Variable length arrays inside a struct

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111882

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-19
 Ever confirmed|0   |1
Summary|[13/14 Regression] :|[13/14 Regression] :
   |internal compiler error: in |internal compiler error: in
   |get_expr_operand in ifcvt   |get_expr_operand in ifcvt
   ||with Variable length arrays
   ||inside a struct

--- Comment #1 from Andrew Pinski  ---
We start out with:
  (*A.1_23)[i_33]{lb: 0 sz: _10 * 4}[j_34]{lb: 0 sz: _11 * 4}.b{off: _13 * 4} = 2;

And then we add:
  _ifc__49 = (*A.1_23)[i_33]{lb: 0 sz: _10 * 4}[j_34]{lb: 0 sz: _11 * 4}.D.2748{off: ((sizetype) SAVE_EXPR  + 3 & 18446744073709551612) + 4};

[Bug tree-optimization/111882] [13/14 Regression] : internal compiler error: in get_expr_operand in ifcvt

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111882

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug target/99087] suboptimal codegen for division by constant 3

2023-10-19 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99087

Ivan Sorokin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Ivan Sorokin  ---
Since GCC 12 the issue no longer reproduces. Closing as fixed.

https://godbolt.org/z/ss7Y84a9f

[Bug c/111882] New: GCC: 14: internal compiler error: in get_expr_operands, at tree-ssa-operands.cc:940

2023-10-19 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111882

Bug ID: 111882
   Summary: GCC: 14: internal compiler error: in
get_expr_operands, at tree-ssa-operands.cc:940
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

Compiler Explorer: https://gcc.godbolt.org/z/h9MMnxbvK

When compiling this program with option -O2, gcc crashes:
```
static void __attribute__((noipa)) f(int n) {
  int i, j;
  struct S { char d[n]; int a; int b : 17; int c : 12; };
  struct S A[100][];
  for (i = 0; i < 100; i++) {
asm volatile("" : : "g"([0][0]) : "memory");
for (j = 0; j < ; j++) A[i][j].b = 2;
  }
}

void g(void) { f(1); }
```

The stack dump:
```
unhandled expression in get_expr_operands():
 
unit-size 
align:32 warn_if_not_align:0 symtab:-167332928 alias-set 3
canonical-type 0x7fe6f5ea55e8 precision:32 min  max 
pointer_to_this >
side-effects public
arg:0 
used ignored SI :3:10 size 
unit-size 
align:32 warn_if_not_align:0 context >>

during GIMPLE pass: ifcvt
: In function 'f':
:1:36: internal compiler error: in get_expr_operands, at
tree-ssa-operands.cc:940
1 | static void __attribute__((noipa)) f(int n) {
  |^
0x231788e internal_error(char const*, ...)
???:0
0xa0002a fancy_abort(char const*, int, char const*)
???:0
0x12e792d operands_scanner::get_expr_operands(tree_node**, int)
???:0
0x12e71b8 operands_scanner::get_expr_operands(tree_node**, int)
???:0
0x12e71b8 operands_scanner::get_expr_operands(tree_node**, int)
???:0
0x12e71b8 operands_scanner::get_expr_operands(tree_node**, int)
???:0
0x12e7e50 operands_scanner::parse_ssa_operands()
???:0
0x12e8cfa operands_scanner::build_ssa_operands()
???:0
0x12e8f44 update_stmt_operands(function*, gimple*)
???:0
0xd66fec gsi_insert_before(gimple_stmt_iterator*, gimple*, gsi_iterator_update)
???:0
0x1196551 tree_if_conversion(loop*, vec*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #20 from Tamar Christina  ---
(In reply to David Binderman from comment #19)
> Created attachment 56154 [details]
> C source code
> 
> You might like to have a go at getting the attached code working:
> 
> $ ~/gcc/results/bin/gcc -c -w -O3  bug967B.c
> bug967B.c: In function ‘__wcstod128_l_internal’:
> bug967B.c:10:1: error: stmt with wrong VUSE
>10 | __wcstod128_l_internal() {
>   | ^~
> 
> I have 20+ other cases. I can provide them, if you like.

No need :) They're all the same bug.  The idea for the fix was correct, but the
way I checked if the loop was versioned wasn't strong enough.

All the reported testcases now pass. I'll start regressions.

[Bug fortran/110644] Error in gfc_format_decoder

2023-10-19 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110644

--- Comment #9 from Steve Kargl  ---
On Thu, Oct 19, 2023 at 04:00:10PM +, aluaces at udc dot es wrote:
> 
> No, I meant building *gcc* with those flags, but alas each gcc compilation
> stage was still building with "-O2" so almost all of the compiler structures
> are still optimized.

Ah.  Yes, that can be a pain.  The only way I've ever "fixed" this
problem is hacking configure to use "-O -g" instead of "-O2 -g".

> Nevertheless I did what you suggest and climbed up those 6 levels to find that
> indeed expr->where has null fields.  To me it is not very strange since in my
> code there is a structure that has copy and assignment members. gfortran is
> arguing about them being called in another module, but of course there is no
> physical place where they are called, as this is done implicitly by the
> compiler when using those kind of objects, if I am correct.

This is definitely a bug in that expr->where should be set.  Are
you using OOP?   

> I am rebuilding with your suggested gfc_current_locus change and reporting the
> results.

It may simply give you nonsense, but the compilation should continue.
This may be a warning that can then be ignored.

[Bug fortran/111880] [9/10/11/12/13] False positive warning of obsolescent COMMON block with Fortran submodule

2023-10-19 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111880

kargl at gcc dot gnu.org changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Priority|P3  |P5

--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to zed.three from comment #0)
>  Warning: Fortran 2018 obsolescent feature: COMMON block at (1)
>   foo.f90:14:14:
>   
>  14 | submodule (foo) foo_submod
> |  1
>   Warning: Fortran 2018 obsolescent feature: COMMON block at (1)
> 

Not sure I would call this a false positive as the 
locus is pointing at module 'foo', and 'foo' contains
a common block.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

David Binderman  changed:

   What|Removed |Added

 CC||dcb314 at hotmail dot com

--- Comment #19 from David Binderman  ---
Created attachment 56154
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56154&action=edit
C source code

You might like to have a go at getting the attached code working:

$ ~/gcc/results/bin/gcc -c -w -O3  bug967B.c
bug967B.c: In function ‘__wcstod128_l_internal’:
bug967B.c:10:1: error: stmt with wrong VUSE
   10 | __wcstod128_l_internal() {
  | ^~

I have 20+ other cases. I can provide them, if you like.

[Bug fortran/110644] Error in gfc_format_decoder

2023-10-19 Thread aluaces at udc dot es via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110644

--- Comment #8 from Alberto Luaces  ---
No, I meant building *gcc* with those flags, but alas each gcc compilation
stage was still building with "-O2" so almost all of the compiler structures
are still optimized.

Nevertheless I did what you suggest and climbed up those 6 levels to find that
indeed expr->where has null fields.  To me it is not very strange since in my
code there is a structure that has copy and assignment members. gfortran is
arguing about them being called in another module, but of course there is no
physical place where they are called, as this is done implicitly by the
compiler when using those kind of objects, if I am correct.

I am rebuilding with your suggested gfc_current_locus change and reporting the
results.

[Bug target/111466] RISC-V: redundant sign extensions despite ABI guarantees

2023-10-19 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111466

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||law at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #5 from Jeffrey A. Law  ---
Fixed on the trunk now.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #18 from Tamar Christina  ---
The fix is too conservative: when there's no use in either loop it fails, as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877 shows.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

   What|Removed |Added

 CC||zsojka at seznam dot cz

--- Comment #17 from Tamar Christina  ---
*** Bug 111877 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/111877] [14 Regression] ICE: verify_ssa failed: PHI node with wrong VUSE on edge from BB 25 with -O -fno-tree-sink -ftree-vectorize

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877

Tamar Christina  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Tamar Christina  ---
merging the two

*** This bug has been marked as a duplicate of bug 111860 ***

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #16 from David Binderman  ---
Created attachment 56153
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56153&action=edit
C source code

[Bug fortran/110644] Error in gfc_format_decoder

2023-10-19 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110644

--- Comment #7 from Steve Kargl  ---
On Thu, Oct 19, 2023 at 08:00:27AM +, aluaces at udc dot es wrote:
> 
> It says something about a non-recursive function likely calling itself.  I will
> inspect my source, even it is a bit too big.  Maybe a better solution would be
> if I compiled gcc with debugging flags.  Can I just use C_FLAGS="-O0 -g" and
> CXX_FLAGS="-O0 -g" at configure time, or if is there a specific configure flag?

I doubt that these options will help with this problem.  Those
are for building your application with debugging info.  The issue
is the compiler itself is dying.

> 1078gcc_assert (loc->nextc - loc->lb->line >= 0);
> (gdb) bt
> #0  gfc_format_decoder (pp=0x2706750, text=0x7fffb840, spec=0x2708d10 "L",
> precision=, wide=false, set_locus=false, hash=false,
> quoted=0x7fffb667, buffer_ptr=0x2708b00)
> at ../../gcc-13.2.0/gcc/fortran/error.cc:1078
> #1  0x01b44c0a in pp_format (pp=,
> text=text@entry=0x7fffb840) at ../../gcc-13.2.0/gcc/pretty-print.cc:1475
> #2  0x01b34e02 in diagnostic_report_diagnostic (context=0x26ee380
> , diagnostic=diagnostic@entry=0x7fffb840) at
> ../../gcc-13.2.0/gcc/diagnostic.cc:1592
> #3  0x0071cbc8 in gfc_report_diagnostic (diagnostic=0x7fffb840) at
> ../../gcc-13.2.0/gcc/fortran/error.cc:890
> #4  gfc_warning(int, const char *, typedef __va_list_tag __va_list_tag *)
> (opt=0,
> gmsgid=0x1c9c420 "Non-RECURSIVE procedure %qs at %L is possibly calling
> itself recursively.  Declare it RECURSIVE or use %<-frecursive%>",
> ap=ap@entry=0x7fffb9c8)
> at ../../gcc-13.2.0/gcc/fortran/error.cc:923
> #5  0x0071d287 in gfc_warning (opt=opt@entry=0,
> gmsgid=gmsgid@entry=0x1c9c420 "Non-RECURSIVE procedure %qs at %L is possibly
> calling itself recursively.  Declare it RECURSIVE or use %<-frecursive%>")
> at ../../gcc-13.2.0/gcc/fortran/error.cc:954

I suspect that the locus associated with %L in the above
error message is undefined (i.e., a NULL pointer).  This
likely means expr in the next frame is missing some info.

> #6  0x007a275f in resolve_procedure_expression (expr=0x32a9530) at
> ../../gcc-13.2.0/gcc/fortran/resolve.cc:1956

Can you do 

(gdb) up 6
(gdb) p *expr

You'll likely see expr->where is either NULL or its components are.
My sources are a bit newer; resolve.cc:1974 is

  if (is_illegal_recursion (sym, gfc_current_ns))
    gfc_warning (0, "Non-RECURSIVE procedure %qs at %L is possibly calling"
                 " itself recursively.  Declare it RECURSIVE or use"
                 " %<-frecursive%>", sym->name, &expr->where);

You can try replacing &expr->where with gfc_current_locus.  I don't
remember if the & is needed with gfc_current_locus.

[Bug c/100532] ICE: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in useless_type_conversion_p, at gimple-expr.c:259

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100532

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633610.html

--- Comment #9 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633610.html

[Bug c/104822] -Wscalar-storage-order warning for initialization from NULL seems useless

2023-10-19 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104822

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633609.html

--- Comment #5 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633609.html

[Bug analyzer/111881] New: analyzer: ICE in ensure_closed, at analyzer/constraint-manager.cc:130 with -Ofast

2023-10-19 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111881

Bug ID: 111881
   Summary: analyzer: ICE in ensure_closed, at
analyzer/constraint-manager.cc:130 with -Ofast
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

The following fails (here on aarch64, but also on x86_64):

$ cat t.c
int int0(float sf1) { return sf1 <= 0 || sf1 >= 7 ? 0 : sf1; }

$ ./xgcc -B . -c t.c -fanalyzer -Ofast
during IPA pass: analyzer
t.c: In function ‘int0’:
t.c:1:55: internal compiler error: in ensure_closed, at
analyzer/constraint-manager.cc:130
1 | int int0(float sf1) { return sf1 <= 0 || sf1 >= 7 ? 0 : sf1; }
  |  ~^
0x26060c5 ana::bound::ensure_closed(ana::bound_kind)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:130
0x25fdb21 ana::bound::ensure_closed(ana::bound_kind)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:121
0x25fdb21 ana::range::add_bound(ana::bound, ana::bound_kind)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:309
0x25ff278 ana::constraint_manager::eval_condition(ana::equiv_class_id,
tree_code, tree_node*) const
   
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:2605
0x25ff574 ana::constraint_manager::eval_condition(ana::svalue const*,
tree_code, ana::svalue const*) const
   
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:2744
0x25ff6c8 ana::constraint_manager::add_constraint(ana::svalue const*,
tree_code, ana::svalue const*)
   
/home/alecop01/toolchain/src/gcc/gcc/analyzer/constraint-manager.cc:1773
0x2659ef0 ana::region_model::add_constraint(ana::svalue const*, tree_code,
ana::svalue const*, ana::region_model_context*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4565
0x265a087 ana::region_model::add_constraints_from_binop(ana::svalue const*,
tree_code, ana::svalue const*, bool*, ana::region_model_context*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4482
0x2659ec4 ana::region_model::add_constraint(ana::svalue const*, tree_code,
ana::svalue const*, ana::region_model_context*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4561
0x265a0db ana::region_model::add_constraints_from_binop(ana::svalue const*,
tree_code, ana::svalue const*, bool*, ana::region_model_context*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4495
0x2659ec4 ana::region_model::add_constraint(ana::svalue const*, tree_code,
ana::svalue const*, ana::region_model_context*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4561
0x265b271 ana::region_model::add_constraint(tree_node*, tree_code, tree_node*,
ana::region_model_context*, std::unique_ptr >*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/region-model.cc:4593
0x263dc74 ana::program_state::on_edge(ana::exploded_graph&,
ana::exploded_node*, ana::superedge const*, ana::uncertainty_t*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/program-state.cc:1148
0x262027e ana::exploded_graph::process_node(ana::exploded_node*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/engine.cc:4334
0x262132a ana::exploded_graph::process_worklist()
/home/alecop01/toolchain/src/gcc/gcc/analyzer/engine.cc:3486
0x26238a1 ana::impl_run_checkers(ana::logger*)
/home/alecop01/toolchain/src/gcc/gcc/analyzer/engine.cc:6154
0x2624756 ana::run_checkers()
/home/alecop01/toolchain/src/gcc/gcc/analyzer/engine.cc:6242
0x25e52e8 execute
/home/alecop01/toolchain/src/gcc/gcc/analyzer/analyzer-pass.cc:87
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug fortran/111880] New: [9/10/11/12/13] False positive warning of obsolescent COMMON block with Fortran submodule

2023-10-19 Thread zed.three at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111880

Bug ID: 111880
   Summary: [9/10/11/12/13] False positive warning of obsolescent
COMMON block with Fortran submodule
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zed.three at gmail dot com
  Target Milestone: ---

Created attachment 56152
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56152&action=edit
Minimal source code demonstrating issue.

Compiler Explorer link with complete reproducer:
https://godbolt.org/z/dd45enhWe

  module third_party_module
    integer :: some_param
    common /not_my_code/ some_param
  end module third_party_module

  module foo
    use third_party_module
    interface
      module subroutine bar()
      end subroutine bar
    end interface
  end module foo

  submodule (foo) foo_submod
  contains
    module procedure bar
    end procedure bar
  end submodule foo_submod


Compiling the above minimal program like:

  gfortran -std=f2018 -c foo.f90


gives the following warnings:

  foo.f90:3:22:

  3 |   common /not_my_code/ some_param
|  1
  Warning: Fortran 2018 obsolescent feature: COMMON block at (1)
  foo.f90:14:14:

 14 | submodule (foo) foo_submod
|  1
  Warning: Fortran 2018 obsolescent feature: COMMON block at (1)


The first warning is expected, but the second one is a false positive. I came
across this when building with a library outside of my control, so I cannot
remove the problem common block (actually this was with MPI, and it happens
with all the major implementations as the common block is required for
technical reasons).


If the submodule is removed, the extra warning disappears. The warning also
appears when building the submodule separately (in a different file and having
already built the parent module).

It also appears to affect only this warning, and not other F2018 obsolescent
feature warnings (e.g. labeled DO statements), or other warnings enabled at
`-Wall` for instance.

[Bug tree-optimization/111877] [14 Regression] ICE: verify_ssa failed: PHI node with wrong VUSE on edge from BB 25 with -O -fno-tree-sink -ftree-vectorize

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877

Tamar Christina  changed:

   What|Removed |Added

   Last reconfirmed||2023-10-19
   Assignee|unassigned at gcc dot gnu.org  |tnfchris at gcc dot gnu.org
   Priority|P3  |P1

--- Comment #2 from Tamar Christina  ---
(In reply to Richard Biener from comment #1)
> possibly fixed already

Sadly no, this is a third case where neither loop uses the value at all.

It's kept because the tree gets versioned and so it thinks the second loop
needs it.  I should probably always remove it if the first loop doesn't use it
and fix it up in the guard creation instead.

[Bug libstdc++/110167] excessive compile time for std::to_array with huge arrays

2023-10-19 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167

Jonathan Wakely  changed:

   What|Removed |Added

   Target Milestone|--- |12.4

[Bug tree-optimization/106878] [11/12 Regression] ICE: verify_gimple failed at -O2 with pointers and bitwise calculation

2023-10-19 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106878

--- Comment #18 from Alex Coplan  ---
(In reply to Jakub Jelinek from comment #15)
> Just note this had various follow-ups.
> r13-2658
> r13-2709
> r13-2891
> at least.

So for backports, it sounds like we want r13-2658 without the verify_gimple
changes, and the other two patches as is. Is that right? Would it make sense to
squash these if we were to backport them or should they be kept as separate
patches?

[Bug middle-end/111875] With -Og ubsan check inserted even though __builtin_assume_aligned guarantees no UB

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111875

--- Comment #3 from Richard Biener  ---
CCP propagates the alignment here.

(In reply to Filip Kastl from comment #1)
> I found out that this is caused by the copy_prop pass. With -Og, an instance
> of copy_prop runs after the fold_builtins pass but before the sanopt pass.
> The fold_builtins pass changes the statement p_2 =
> __builtin_assume_aligned(p_1, 4) to p_2 = p_1; and changes the alignment of
> p_2 to 32 bits. However the alignment of p_1 remains 8 bits so when
> copy_prop propagates all occurences of p_2 to instead be occurences of p_1,
> the information about alignment is lost. When the sanopt pass runs, it
> decides that casting p to (int *) possibly creates UB.
> 
> I see a few possible solutions:
> - Stop copy prop from propagating through assignments where the alignments
> differ
> - Modify copy prop to use the alignment information of the lhs ssa name when
> propagating through similar assignment statements
> - Modify fold_builtins to copy propagate in similar cases
> - Modify fold_builtins to also set alignment of the rhs ssa name when
> removing __builtin_assume_aligned in similar cases

I think in general none of those work.  IIRC the copyprop pass was put there
specifically as a "cheap" way to propagate constants exposed by
pass_fold_builtins.  git blame might tell - there was the alternative to
perform this propagation in fold_builtins but it's difficult to be
"complete" there.  The alternative would be to turn that into a proper
simple constant propagation pass.

Not sure it's all worth it for -Og just because of sanopt, though.
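
For readers without the original testcase at hand, the pattern under discussion looks roughly like the sketch below. It is a guess at the shape of the reproducer, not the testcase from the report; load_aligned and demo_load are hypothetical names. Per the description quoted above, building something like this with -Og and UBSan enabled is where the alignment check would show up despite the __builtin_assume_aligned promise.

```c
#include <stdlib.h>

/* Hypothetical helper: promise the compiler that P is 4-byte aligned,
   then read an int through it.  */
int
load_aligned (void *p)
{
  void *q = __builtin_assume_aligned (p, 4);
  return *(int *) q;
}

/* Round-trips a value through malloc'd storage, which is suitably
   aligned for int.  */
int
demo_load (int v)
{
  int *buf = malloc (sizeof *buf);
  if (buf == NULL)
    return v;
  *buf = v;
  int r = load_aligned (buf);
  free (buf);
  return r;
}
```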

[Bug c/107954] Support -std=c23/gnu23 as aliases of -std=c2x/gnu2x

2023-10-19 Thread daniel.lundin.mail at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107954

Daniel Lundin  changed:

   What|Removed |Added

 CC||daniel.lundin.mail at gmail dot com

--- Comment #6 from Daniel Lundin  ---
(In reply to jos...@codesourcery.com from comment #5)
> The straw poll at the June meeting said to keep calling it C23 (votes 
> 4/12/2 for/against/abstain on the question of changing the informal name 
> to C24).  Of course the actual standard will be ISO/IEC 9899:2024 (but 
> __STDC_VERSION__ will remain as 202311L, consistent with the informal name 
> rather than the publication date, in the absence of a technical DIS 
> comment requesting a change of version number being accepted, and 
> accepting any technical DIS comments would delay the standard by requiring 
> an FDIS).

Please keep in mind that -std=c2x uses a placeholder value of 202000L for
__STDC_VERSION__, whereas WG14 has decided that 202311L should be used for the
final version. Worth considering when -std=c23 goes live.
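
To make the difference concrete, here is a small sketch of how user code might tell the two values apart. The thresholds are the ones mentioned above (202000L as the placeholder, 202311L as the final value); the function name c_mode and the returned strings are made up for illustration.

```c
/* Classify the language mode from __STDC_VERSION__.
   Hypothetical helper; the strings are only labels for this example.  */
const char *
c_mode (void)
{
#if !defined (__STDC_VERSION__)
  return "pre-C95";
#elif __STDC_VERSION__ >= 202311L
  return "C23 (final value)";
#elif __STDC_VERSION__ > 201710L
  return "C2x placeholder";
#else
  return "C17 or earlier";
#endif
}
```

Code that tests for `__STDC_VERSION__ >= 202311L` would take the C23 branch only once the compiler reports the final value, which is why the placeholder matters when -std=c23 is introduced.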

[Bug tree-optimization/111877] [14 Regression] ICE: verify_ssa failed: PHI node with wrong VUSE on edge from BB 25 with -O -fno-tree-sink -ftree-vectorize

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111877

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||tnfchris at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
possibly fixed already

[Bug tree-optimization/111878] [14 Regression] ICE: in get_loop_exit_edges, at cfgloop.cc:1204 with -O3 -fgraphite-identity -fsave-optimization-record

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111878

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #26 from rguenther at suse dot de  ---
On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> 
> --- Comment #25 from JuzheZhong  ---
> (In reply to rguent...@suse.de from comment #24)
> > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > 
> > > --- Comment #23 from JuzheZhong  ---
> > > (In reply to rguent...@suse.de from comment #22)
> > > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > > 
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > > 
> > > > > --- Comment #21 from JuzheZhong  ---
> > > > > (In reply to rguent...@suse.de from comment #20)
> > > > > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > > > > 
> > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > > > > 
> > > > > > > --- Comment #19 from JuzheZhong  ---
> > > > > > > (In reply to Richard Biener from comment #18)
> > > > > > > > With RVV you have intrinsic calls in GIMPLE so nothing to 
> > > > > > > > optimize:
> > > > > > > > 
> > > > > > > > vbool8_t fn ()
> > > > > > > > {
> > > > > > > >   vbool8_t vmask;
> > > > > > > >   vuint8m1_t vand_m;
> > > > > > > >   vuint8m1_t varr;
> > > > > > > >   uint8_t arr[32];
> > > > > > > > 
> > > > > > > >[local count: 1073741824]:
> > > > > > > >   arr =
> > > > > > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > > > > > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > > > > > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot 
> > > > > > > > optimization]
> > > > > > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > > > > > > optimization]
> > > > > > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return 
> > > > > > > > slot
> > > > > > > > optimization]
> > > > > > > >= vmask_5;
> > > > > > > >   arr ={v} {CLOBBER(eol)};
> > > > > > > >   return ;
> > > > > > > > 
> > > > > > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything 
> > > > > > > > with those.
> > > > > > > > 
> > > > > > > > This is what Andrew said already.
> > > > > > > 
> > > > > > > Ok. I wonder why this issue is gone when I change it into:
> > > > > > > 
> > > > > > > arr as static
> > > > > > > 
> > > > > > > https://godbolt.org/z/Tdoshdfr6
> > > > > > 
> > > > > > Because the stack initialization isn't required then.
> > > > > 
> > > > > I have experimented with a simplified pattern:
> > > > > 
> > > > > 
> > > > > (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> > > > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > > > (const_vector:RVVMF8BI repeat [
> > > > > (const_int 1 [0x1])
> > > > > ])
> > > > > (reg:DI 143)
> > > > > (const_int 2 [0x2]) repeated x2
> > > > > (const_int 0 [0])
> > > > > (reg:SI 66 vl)
> > > > > (reg:SI 67 vtype)
> > > > > ] UNSPEC_VPREDICATE)
> > > > > (mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
> > > > > (const_vector:RVVM1QI repeat [
> > > > > (const_int 0 [0])
> > > > > ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
> > > > >  (nil))
> > > > > (insn 15 14 16 2 (set (reg:DI 144)
> > > > > (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
> > > > >  (nil))
> > > > > (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 
> > > > > 16] A8])
> > > > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > > > (const_vector:RVVMF8BI repeat [
> > > > > (const_int 1 [0x1])
> > > > > ])
> > > > > (reg:DI 144)
> > > > > (const_int 0 [0])
> > > > > (reg:SI 66 vl)
> > > > > (reg:SI 67 vtype)
> > > > > ] UNSPEC_VPREDICATE)
> > > > > (reg/v:RVVM1QI 134 [ varr ])
> > > > > (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
> > > > > "rvv.c":6:5 1592 {pred_storervvm1qi}
> > > > >  (nil))
> > > > > 
> > > > > You can see there is only one UNSPEC now. It still has the redundant
> > > > > stack transfer.
> > > > > 
> > > > > Is it because the pattern is too complicated?
> > > > 
> > > > It's because it has an UNSPEC in it - that makes it have target
> > > > specific (unknown to the middle-end) behavior so nothing can
> > > > be optimized here.
> > > > 
> > > > Specifically passes likely refuse to replace MEM operands in
> > > > such a construct.
> > > 
> > > I saw that the ARM SVE load/store intrinsics also have UNSPECs.
> > > They don't have such issues.
> > > 
> > > https://godbolt.org/z/fsW6Ko93z
> > > 
> > > But their patterns are much simpler than the RVV patterns.
> > > 

[Bug preprocessor/82335] Incorrect _Pragma expansion if macro passed to another macro

2023-10-19 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335

Lewis Hyatt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Lewis Hyatt  ---
Closing it now that the testcase has been added.

[Bug c++/89038] #pragma GCC diagnostic ignored "-Wunknown-pragmas" does not work

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89038

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt :

https://gcc.gnu.org/g:19cc4b9d74940f29c961e2a5a8b1fa84992d3d30

commit r14-4748-g19cc4b9d74940f29c961e2a5a8b1fa84992d3d30
Author: Lewis Hyatt 
Date:   Wed Oct 18 12:37:08 2023 -0400

c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic
[PR89038]

As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
the specific case of -Wunknown-pragmas, because that warning is issued
during preprocessing, but not by libcpp directly (it comes from the
cb_def_pragma callback).  Address that by handling this pragma in
addition to libcpp pragmas during the early pragma handler.

gcc/c-family/ChangeLog:

PR c++/89038
* c-pragma.cc (handle_pragma_diagnostic_impl):  Handle
-Wunknown-pragmas during early processing.

gcc/testsuite/ChangeLog:

PR c++/89038
* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
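
A minimal sketch of the usage this commit is meant to make work, assuming a build with -Wunknown-pragmas enabled (for instance via -Wall). It is not the content of Wunknown-pragmas-1.c; the pragma name hypothetical_vendor_pragma and the function answer are invented for the example. The commit concerns the C++ front end; the snippet is valid as both C and C++.

```c
/* Suppress -Wunknown-pragmas for one region only.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
/* The next pragma is made up; GCC does not know it.  */
#pragma hypothetical_vendor_pragma on
#pragma GCC diagnostic pop

/* Hypothetical function so the file has something to compile.  */
int
answer (void)
{
  return 42;
}
```

With the fix, the unknown pragma between push and pop should be accepted silently, while an unknown pragma placed after the pop would still be diagnosed.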

[Bug preprocessor/82335] Incorrect _Pragma expansion if macro passed to another macro

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt :

https://gcc.gnu.org/g:202a214d6859d91af5a95aa989321c5d2173c40a

commit r14-4747-g202a214d6859d91af5a95aa989321c5d2173c40a
Author: Lewis Hyatt 
Date:   Mon Oct 2 14:56:58 2023 -0400

libcpp: testsuite: Add test for fixed _Pragma bug [PR82335]

This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR
that is not represented elsewhere.

gcc/testsuite/ChangeLog:

PR preprocessor/82335
* c-c++-common/cpp/diagnostic-pragma-3.c: New test.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

Tamar Christina  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Tamar Christina  ---
Fixed, thanks for the report

[Bug tree-optimization/111879] No gather BB vectorization for

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111879

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
   Last reconfirmed||2023-10-19
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
The way we identify gathers/scatters in vect_analyze_data_refs doesn't work.
We probably have to delay gather/scatter "discovery" until after we do
group analysis or within that.

So it's not going to be simple.

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #14 from CVS Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4

commit r14-4746-g217a0fcb852aeb4aa9e3fb9baec6ff947c8de3d4
Author: Tamar Christina 
Date:   Thu Oct 19 13:44:01 2023 +0100

middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop

As the testcase shows, when a PHI node dominates the loop there is no new
definition inside the loop.  As such there would be no PHI nodes to update.

When we maintain LCSSA form we create an intermediate node in between the two
loops to thread along the value.  However later on when we update the second
loop we don't have any PHI nodes to update and so adjust_phi_and_debug_stmts
does nothing.  This leaves us with an incorrect phi node.  Normally this does
nothing and just gets ignored.  But in the case of the vUSE chain we end up
corrupting the chain.

As such whenever a PHI node's argument dominates the loop, we should remove
the newly created PHI node after edge redirection.

The one exception to this is when the loop has been versioned.  In such cases
the versioned loop may not use the value but the second loop can.

When this happens and we add the loop guard, unless the join block has the PHI
it can't find the original value for use inside the guard block.

The next refactoring in the series moves the formation of the guard block
inside peeling itself.  Here we have all the information and wouldn't
need to re-create it later.

gcc/ChangeLog:

PR tree-optimization/111860
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Remove PHI nodes that dominate loop.

gcc/testsuite/ChangeLog:

PR tree-optimization/111860
* gcc.dg/vect/pr111860.c: New test.

[Bug c++/89038] #pragma GCC diagnostic ignored "-Wunknown-pragmas" does not work

2023-10-19 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89038

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #7 from Marek Polacek  ---
Fix by Lewis:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633508.html

[Bug tree-optimization/111879] New: No gather BB vectorization for

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111879

Bug ID: 111879
   Summary: No gather BB vectorization for
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

void __attribute__((noipa))
f (int *restrict y, int *restrict x, int *restrict indices)
{ 
  int i = 0;
  y[i * 2] = x[indices[i * 2]] + 1;
  y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
  i++;
  y[i * 2] = x[indices[i * 2]] + 1;
  y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
  i++;
  y[i * 2] = x[indices[i * 2]] + 1;
  y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
  i++;
  y[i * 2] = x[indices[i * 2]] + 1;
  y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
}   

doesn't see the gather operation vectorized (extracted from the loop in
gcc.dg/vect/vect-gather-1.c).  Instead we only see the adds and the stores
vectorized.
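
For reference, the unrolled body above computes the same values as a four-trip
loop. The following is only a sketch: f_loop and check_f_loop are hypothetical
names, the input data is made up, and the loop is not claimed to be the exact
one in gcc.dg/vect/vect-gather-1.c.

```c
#include <assert.h>

/* Loop form of the straight-line code in the report: each iteration does
   two indexed (gather-style) loads from x and two contiguous stores to y. */
void f_loop(int *restrict y, int *restrict x, int *restrict indices)
{
  for (int i = 0; i < 4; i++)
    {
      y[i * 2] = x[indices[i * 2]] + 1;
      y[i * 2 + 1] = x[indices[i * 2 + 1]] + 2;
    }
}

/* Hypothetical driver with made-up data: x[k] = 10 * k, indices reversed. */
void check_f_loop(void)
{
  int x[8], y[8];
  int indices[8] = { 7, 6, 5, 4, 3, 2, 1, 0 };
  for (int k = 0; k < 8; k++)
    x[k] = 10 * k;

  f_loop(y, x, indices);

  assert(y[0] == 71 && y[1] == 62);
  assert(y[2] == 51 && y[3] == 42);
  assert(y[4] == 31 && y[5] == 22);
  assert(y[6] == 11 && y[7] == 2);
}
```

The loads x[indices[...]] are the gather; the +1/+2 adds and the stores to y
are the parts the report says do get vectorized.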

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #25 from JuzheZhong  ---
(In reply to rguent...@suse.de from comment #24)
> On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > 
> > --- Comment #23 from JuzheZhong  ---
> > (In reply to rguent...@suse.de from comment #22)
> > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > 
> > > > --- Comment #21 from JuzheZhong  ---
> > > > (In reply to rguent...@suse.de from comment #20)
> > > > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > > > 
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > > > 
> > > > > > --- Comment #19 from JuzheZhong  ---
> > > > > > (In reply to Richard Biener from comment #18)
> > > > > > > With RVV you have intrinsic calls in GIMPLE so nothing to 
> > > > > > > optimize:
> > > > > > > 
> > > > > > > vbool8_t fn ()
> > > > > > > {
> > > > > > >   vbool8_t vmask;
> > > > > > >   vuint8m1_t vand_m;
> > > > > > >   vuint8m1_t varr;
> > > > > > >   uint8_t arr[32];
> > > > > > > 
> > > > > > >[local count: 1073741824]:
> > > > > > >   arr =
> > > > > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > > > > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > > > > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot 
> > > > > > > optimization]
> > > > > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > > > > > optimization]
> > > > > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return 
> > > > > > > slot
> > > > > > > optimization]
> > > > > > >= vmask_5;
> > > > > > >   arr ={v} {CLOBBER(eol)};
> > > > > > >   return ;
> > > > > > > 
> > > > > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything 
> > > > > > > with those.
> > > > > > > 
> > > > > > > This is what Andrew said already.
> > > > > > 
> > > > > > Ok. I wonder why this issue is gone when I change it into:
> > > > > > 
> > > > > > arr as static
> > > > > > 
> > > > > > https://godbolt.org/z/Tdoshdfr6
> > > > > 
> > > > > Because the stacik initialization isn't required then.
> > > > 
> > > > I have experiment with a simplifed pattern:
> > > > 
> > > > 
> > > > (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> > > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > > (const_vector:RVVMF8BI repeat [
> > > > (const_int 1 [0x1])
> > > > ])
> > > > (reg:DI 143)
> > > > (const_int 2 [0x2]) repeated x2
> > > > (const_int 0 [0])
> > > > (reg:SI 66 vl)
> > > > (reg:SI 67 vtype)
> > > > ] UNSPEC_VPREDICATE)
> > > > (mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
> > > > (const_vector:RVVM1QI repeat [
> > > > (const_int 0 [0])
> > > > ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
> > > >  (nil))
> > > > (insn 15 14 16 2 (set (reg:DI 144)
> > > > (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
> > > >  (nil))
> > > > (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 
> > > > 16] A8])
> > > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > > (const_vector:RVVMF8BI repeat [
> > > > (const_int 1 [0x1])
> > > > ])
> > > > (reg:DI 144)
> > > > (const_int 0 [0])
> > > > (reg:SI 66 vl)
> > > > (reg:SI 67 vtype)
> > > > ] UNSPEC_VPREDICATE)
> > > > (reg/v:RVVM1QI 134 [ varr ])
> > > > (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
> > > > "rvv.c":6:5 1592 {pred_storervvm1qi}
> > > >  (nil))
> > > > 
> > > > You can see there is only one UNSPEC now. Still has redundant stack
> > > > transferring.
> > > > 
> > > > Is it because the pattern too complicated?
> > > 
> > > It's because it has an UNSPEC in it - that makes it have target
> > > specific (unknown to the middle-end) behavior so nothing can
> > > be optimized here.
> > > 
> > > Specifically passes likely refuse to replace MEM operands in
> > > such a construct.
> > 
> > I saw ARM SVE load/store intrinsic also have UNSPEC.
> > They don't have such issues.
> > 
> > https://godbolt.org/z/fsW6Ko93z
> > 
> > But their patterns are much simplier than RVV patterns. 
> > 
> > I am still trying find a way to optimize the RVV pattern for that.
> > However, it seems to be very diffcult since we are trying to merge each type
> > intrinsics into same single pattern to avoid explosion of the insn-ouput.cc
> > and insn-emit.cc
> 
> They also expose the semantics to GIMPLE instead of keeping
> builtin function calls:
> 
> void fn (svbool_t pg, uint8_t * out)
> {
>   

[Bug tree-optimization/111860] [14 Regression] incorrect vUSE after guard block loop skip block during vectorization.

2023-10-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111860

--- Comment #13 from Tamar Christina  ---
Patch posted https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633569.html

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 111131, which changed state.

Bug 111131 Summary: SLP of gathers incomplete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111131

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/111131] SLP of gathers incomplete

2023-10-19 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111131

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0
 Resolution|--- |FIXED

--- Comment #4 from Richard Biener  ---
Fixed.

[Bug tree-optimization/111131] SLP of gathers incomplete

2023-10-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111131

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:beab5b95c581452adeb26efd59ae84a61fb3b429

commit r14-4745-gbeab5b95c581452adeb26efd59ae84a61fb3b429
Author: Richard Biener 
Date:   Thu Oct 19 10:33:01 2023 +0200

tree-optimization/111131 - SLP for non-IFN gathers

The following implements SLP vectorization support for gathers
without relying on IFNs being pattern detected (and supported by
the target).  That includes support for emulated gathers but also
the legacy x86 builtin path.

PR tree-optimization/111131
* tree-vect-loop.cc (update_epilogue_loop_vinfo): Make
sure to update all gather/scatter stmt DRs, not only those
that eventually got VMAT_GATHER_SCATTER set.
* tree-vect-slp.cc (_slp_oprnd_info::first_gs_info): Add.
(vect_get_and_check_slp_defs): Handle gathers/scatters,
adding the offset as SLP operand and comparing base and scale.
(vect_build_slp_tree_1): Handle gathers.
(vect_build_slp_tree_2): Likewise.

* gcc.dg/vect/vect-gather-1.c: Now expected to vectorize
everywhere.
* gcc.dg/vect/vect-gather-2.c: Expected to not SLP anywhere.
Massage the scale case to more reliably produce a different
one.  Scan for the specific messages.
* gcc.dg/vect/vect-gather-3.c: Masked gather is also supported
for AVX2, but not emulated.
* gcc.dg/vect/vect-gather-4.c: Expected to not SLP anywhere.
Massage to more properly ensure this.
* gcc.dg/vect/tsvc/vect-tsvc-s353.c: Expect to vectorize
everywhere.

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #24 from rguenther at suse dot de  ---
On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> 
> --- Comment #23 from JuzheZhong  ---
> (In reply to rguent...@suse.de from comment #22)
> > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > 
> > > --- Comment #21 from JuzheZhong  ---
> > > (In reply to rguent...@suse.de from comment #20)
> > > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > > 
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > > 
> > > > > --- Comment #19 from JuzheZhong  ---
> > > > > (In reply to Richard Biener from comment #18)
> > > > > > With RVV you have intrinsic calls in GIMPLE so nothing to optimize:
> > > > > > 
> > > > > > vbool8_t fn ()
> > > > > > {
> > > > > >   vbool8_t vmask;
> > > > > >   vuint8m1_t vand_m;
> > > > > >   vuint8m1_t varr;
> > > > > >   uint8_t arr[32];
> > > > > > 
> > > > > >[local count: 1073741824]:
> > > > > >   arr =
> > > > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > > > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > > > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot 
> > > > > > optimization]
> > > > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > > > > optimization]
> > > > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return slot
> > > > > > optimization]
> > > > > >= vmask_5;
> > > > > >   arr ={v} {CLOBBER(eol)};
> > > > > >   return ;
> > > > > > 
> > > > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything with 
> > > > > > those.
> > > > > > 
> > > > > > This is what Andrew said already.
> > > > > 
> > > > > Ok. I wonder why this issue is gone when I change it into:
> > > > > 
> > > > > arr as static
> > > > > 
> > > > > https://godbolt.org/z/Tdoshdfr6
> > > > 
> > > > Because the stacik initialization isn't required then.
> > > 
> > > I have experiment with a simplifed pattern:
> > > 
> > > 
> > > (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > (const_vector:RVVMF8BI repeat [
> > > (const_int 1 [0x1])
> > > ])
> > > (reg:DI 143)
> > > (const_int 2 [0x2]) repeated x2
> > > (const_int 0 [0])
> > > (reg:SI 66 vl)
> > > (reg:SI 67 vtype)
> > > ] UNSPEC_VPREDICATE)
> > > (mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
> > > (const_vector:RVVM1QI repeat [
> > > (const_int 0 [0])
> > > ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
> > >  (nil))
> > > (insn 15 14 16 2 (set (reg:DI 144)
> > > (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
> > >  (nil))
> > > (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] 
> > > A8])
> > > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > > (const_vector:RVVMF8BI repeat [
> > > (const_int 1 [0x1])
> > > ])
> > > (reg:DI 144)
> > > (const_int 0 [0])
> > > (reg:SI 66 vl)
> > > (reg:SI 67 vtype)
> > > ] UNSPEC_VPREDICATE)
> > > (reg/v:RVVM1QI 134 [ varr ])
> > > (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
> > > "rvv.c":6:5 1592 {pred_storervvm1qi}
> > >  (nil))
> > > 
> > > You can see there is only one UNSPEC now. Still has redundant stack
> > > transferring.
> > > 
> > > Is it because the pattern too complicated?
> > 
> > It's because it has an UNSPEC in it - that makes it have target
> > specific (unknown to the middle-end) behavior so nothing can
> > be optimized here.
> > 
> > Specifically passes likely refuse to replace MEM operands in
> > such a construct.
> 
> I saw ARM SVE load/store intrinsic also have UNSPEC.
> They don't have such issues.
> 
> https://godbolt.org/z/fsW6Ko93z
> 
> But their patterns are much simplier than RVV patterns. 
> 
> I am still trying find a way to optimize the RVV pattern for that.
> However, it seems to be very diffcult since we are trying to merge each type
> intrinsics into same single pattern to avoid explosion of the insn-ouput.cc
> and insn-emit.cc

They also expose the semantics to GIMPLE instead of keeping
builtin function calls:

void fn (svbool_t pg, uint8_t * out)
{
  svuint8_t varr;
  static uint8_t arr[32] = 
"\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";

  <bb 2> [local count: 1073741824]:
  varr_3 = .MASK_LOAD (&arr, 8B, pg_2(D));
  .MASK_STORE (out_4(D), 8B, pg_2(D), varr_3); [tail call]
  return;

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #23 from JuzheZhong  ---
(In reply to rguent...@suse.de from comment #22)
> On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > 
> > --- Comment #21 from JuzheZhong  ---
> > (In reply to rguent...@suse.de from comment #20)
> > > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > > 
> > > > --- Comment #19 from JuzheZhong  ---
> > > > (In reply to Richard Biener from comment #18)
> > > > > With RVV you have intrinsic calls in GIMPLE so nothing to optimize:
> > > > > 
> > > > > vbool8_t fn ()
> > > > > {
> > > > >   vbool8_t vmask;
> > > > >   vuint8m1_t vand_m;
> > > > >   vuint8m1_t varr;
> > > > >   uint8_t arr[32];
> > > > > 
> > > > >[local count: 1073741824]:
> > > > >   arr =
> > > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot optimization]
> > > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > > > optimization]
> > > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return slot
> > > > > optimization]
> > > > >= vmask_5;
> > > > >   arr ={v} {CLOBBER(eol)};
> > > > >   return ;
> > > > > 
> > > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything with 
> > > > > those.
> > > > > 
> > > > > This is what Andrew said already.
> > > > 
> > > > Ok. I wonder why this issue is gone when I change it into:
> > > > 
> > > > arr as static
> > > > 
> > > > https://godbolt.org/z/Tdoshdfr6
> > > 
> > > Because the stacik initialization isn't required then.
> > 
> > I have experiment with a simplifed pattern:
> > 
> > 
> > (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > (const_vector:RVVMF8BI repeat [
> > (const_int 1 [0x1])
> > ])
> > (reg:DI 143)
> > (const_int 2 [0x2]) repeated x2
> > (const_int 0 [0])
> > (reg:SI 66 vl)
> > (reg:SI 67 vtype)
> > ] UNSPEC_VPREDICATE)
> > (mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
> > (const_vector:RVVM1QI repeat [
> > (const_int 0 [0])
> > ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
> >  (nil))
> > (insn 15 14 16 2 (set (reg:DI 144)
> > (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
> >  (nil))
> > (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] 
> > A8])
> > (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> > (const_vector:RVVMF8BI repeat [
> > (const_int 1 [0x1])
> > ])
> > (reg:DI 144)
> > (const_int 0 [0])
> > (reg:SI 66 vl)
> > (reg:SI 67 vtype)
> > ] UNSPEC_VPREDICATE)
> > (reg/v:RVVM1QI 134 [ varr ])
> > (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
> > "rvv.c":6:5 1592 {pred_storervvm1qi}
> >  (nil))
> > 
> > You can see there is only one UNSPEC now. Still has redundant stack
> > transferring.
> > 
> > Is it because the pattern too complicated?
> 
> It's because it has an UNSPEC in it - that makes it have target
> specific (unknown to the middle-end) behavior so nothing can
> be optimized here.
> 
> Specifically passes likely refuse to replace MEM operands in
> such a construct.

I saw that the ARM SVE load/store intrinsics also have UNSPECs.
They don't have such issues.

https://godbolt.org/z/fsW6Ko93z

But their patterns are much simpler than the RVV patterns.

I am still trying to find a way to optimize the RVV pattern for that.
However, it seems to be very difficult since we are trying to merge each type
of intrinsic into the same single pattern to avoid an explosion of
insn-output.cc and insn-emit.cc.

[Bug tree-optimization/111878] [14 Regression] ICE: in get_loop_exit_edges, at cfgloop.cc:1204 with -O3 -fgraphite-identity -fsave-optimization-record

2023-10-19 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111878

--- Comment #1 from Zdenek Sojka  ---
The correct version output for gcc-14 is:
$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4743-20231019111223-g947fb34a165-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-4743-20231019111223-g947fb34a165-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231019 (experimental) (GCC)

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #22 from rguenther at suse dot de  ---
On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> 
> --- Comment #21 from JuzheZhong  ---
> (In reply to rguent...@suse.de from comment #20)
> > On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > > 
> > > --- Comment #19 from JuzheZhong  ---
> > > (In reply to Richard Biener from comment #18)
> > > > With RVV you have intrinsic calls in GIMPLE so nothing to optimize:
> > > > 
> > > > vbool8_t fn ()
> > > > {
> > > >   vbool8_t vmask;
> > > >   vuint8m1_t vand_m;
> > > >   vuint8m1_t varr;
> > > >   uint8_t arr[32];
> > > > 
> > > >[local count: 1073741824]:
> > > >   arr =
> > > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot optimization]
> > > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > > optimization]
> > > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return slot
> > > > optimization]
> > > >= vmask_5;
> > > >   arr ={v} {CLOBBER(eol)};
> > > >   return ;
> > > > 
> > > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything with 
> > > > those.
> > > > 
> > > > This is what Andrew said already.
> > > 
> > > Ok. I wonder why this issue is gone when I change it into:
> > > 
> > > arr as static
> > > 
> > > https://godbolt.org/z/Tdoshdfr6
> > 
> > Because the stacik initialization isn't required then.
> 
> I have experiment with a simplifed pattern:
> 
> 
> (insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
> (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> (const_vector:RVVMF8BI repeat [
> (const_int 1 [0x1])
> ])
> (reg:DI 143)
> (const_int 2 [0x2]) repeated x2
> (const_int 0 [0])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> ] UNSPEC_VPREDICATE)
> (mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
> (const_vector:RVVM1QI repeat [
> (const_int 0 [0])
> ]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
>  (nil))
> (insn 15 14 16 2 (set (reg:DI 144)
> (const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
>  (nil))
> (insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])
> (if_then_else:RVVM1QI (unspec:RVVMF8BI [
> (const_vector:RVVMF8BI repeat [
> (const_int 1 [0x1])
> ])
> (reg:DI 144)
> (const_int 0 [0])
> (reg:SI 66 vl)
> (reg:SI 67 vtype)
> ] UNSPEC_VPREDICATE)
> (reg/v:RVVM1QI 134 [ varr ])
> (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
> "rvv.c":6:5 1592 {pred_storervvm1qi}
>  (nil))
> 
> You can see there is only one UNSPEC now. Still has redundant stack
> transferring.
> 
> Is it because the pattern too complicated?

It's because it has an UNSPEC in it - that makes it have target
specific (unknown to the middle-end) behavior so nothing can
be optimized here.

Specifically passes likely refuse to replace MEM operands in
such a construct.
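
The earlier point in this thread, that the problem disappears when arr is made
static because no stack initialization is needed, can be illustrated in plain
C without any vector intrinsics. This is only a sketch under assumptions: the
function names, the ARR_INIT macro and the odd-counting loop are hypothetical
stand-ins for the RVV load/and/reinterpret sequence, and the array contents
are the bytes shown in the GIMPLE dump.

```c
#include <assert.h>
#include <stdint.h>

/* Same 32 bytes as in the dump ("\t" is 9). */
#define ARR_INIT { 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9, \
                   1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9 }

/* Automatic array: conceptually the initializer is copied into the stack
   frame on each call before it is read.  An optimizer can often remove the
   copy for plain scalar code like this; it is harder when the only reader
   is opaque to the middle-end. */
unsigned count_odd_local(void)
{
  uint8_t arr[32] = ARR_INIT;
  unsigned n = 0;
  for (int i = 0; i < 32; i++)
    n += arr[i] & 1;
  return n;
}

/* Static array: the data lives in static storage, so there is no per-call
   stack initialization to begin with. */
unsigned count_odd_static(void)
{
  static uint8_t arr[32] = ARR_INIT;
  unsigned n = 0;
  for (int i = 0; i < 32; i++)
    n += arr[i] & 1;
  return n;
}

/* Hypothetical check: both variants compute the same result. */
void check_count_odd(void)
{
  assert(count_odd_local() == count_odd_static());
  assert(count_odd_local() == 20);
}
```

Both functions return the same value; the difference of interest is only
where the array data comes from when the load happens.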

[Bug target/111720] RISC-V: Ugly codegen in RVV

2023-10-19 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720

--- Comment #21 from JuzheZhong  ---
(In reply to rguent...@suse.de from comment #20)
> On Thu, 19 Oct 2023, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111720
> > 
> > --- Comment #19 from JuzheZhong  ---
> > (In reply to Richard Biener from comment #18)
> > > With RVV you have intrinsic calls in GIMPLE so nothing to optimize:
> > > 
> > > vbool8_t fn ()
> > > {
> > >   vbool8_t vmask;
> > >   vuint8m1_t vand_m;
> > >   vuint8m1_t varr;
> > >   uint8_t arr[32];
> > > 
> > >[local count: 1073741824]:
> > >   arr =
> > > "\x01\x02\x07\x01\x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t\x01\x02\x07\x01
> > > \x03\x04\x05\x03\x01\x00\x01\x02\x04\x04\t\t";
> > >   varr_3 = __riscv_vle8_v_u8m1 (, 32); [return slot optimization]
> > >   vand_m_4 = __riscv_vand_vx_u8m1 (varr_3, 1, 32); [return slot 
> > > optimization]
> > >   vmask_5 = __riscv_vreinterpret_v_u8m1_b8 (vand_m_4); [return slot
> > > optimization]
> > >= vmask_5;
> > >   arr ={v} {CLOBBER(eol)};
> > >   return ;
> > > 
> > > and on RTL I see lots of UNSPECs, RTL opts cannot do anything with those.
> > > 
> > > This is what Andrew said already.
> > 
> > Ok. I wonder why this issue is gone when I change it into:
> > 
> > arr as static
> > 
> > https://godbolt.org/z/Tdoshdfr6
> 
> Because the stacik initialization isn't required then.

I have experimented with a simplified pattern:


(insn 14 13 15 2 (set (reg/v:RVVM1QI 134 [ varr ])
(if_then_else:RVVM1QI (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
(const_int 1 [0x1])
])
(reg:DI 143)
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(mem:RVVM1QI (reg:DI 142) [0  S[16, 16] A8])
(const_vector:RVVM1QI repeat [
(const_int 0 [0])
]))) "rvv.c":5:23 1476 {*pred_movrvvm1qi}
 (nil))
(insn 15 14 16 2 (set (reg:DI 144)
(const_int 32 [0x20])) "rvv.c":6:5 206 {*movdi_64bit}
 (nil))
(insn 16 15 0 2 (set (mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])
(if_then_else:RVVM1QI (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
(const_int 1 [0x1])
])
(reg:DI 144)
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(reg/v:RVVM1QI 134 [ varr ])
(mem:RVVM1QI (reg/v/f:DI 135 [ out ]) [0  S[16, 16] A8])))
"rvv.c":6:5 1592 {pred_storervvm1qi}
 (nil))

You can see there is only one UNSPEC now. It still has the redundant stack
transfer.

Is it because the pattern is too complicated?

[Bug tree-optimization/111878] New: [14 Regression] ICE: in get_loop_exit_edges, at cfgloop.cc:1204 with -O3 -fgraphite-identity -fsave-optimization-record

2023-10-19 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111878

Bug ID: 111878
   Summary: [14 Regression] ICE: in get_loop_exit_edges, at
cfgloop.cc:1204 with -O3 -fgraphite-identity
-fsave-optimization-record
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56151
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56151&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O3 -fgraphite-identity -fsave-optimization-record
testcase.c
during GIMPLE pass: graphite
testcase.c: In function 'long_c2i':
testcase.c:5:1: internal compiler error: in get_loop_exit_edges, at
cfgloop.cc:1204
5 | long_c2i (long utmp, int i)
  | ^~~~
0x71d5c7 get_loop_exit_edges(loop const*, basic_block_def**)
/repo/gcc-trunk/gcc/cfgloop.cc:1204
0x177ed5e find_loop_location(loop*)
/repo/gcc-trunk/gcc/tree-vect-loop-manip.cc:1844
0x25dd9db graphite_transform_loops()
/repo/gcc-trunk/gcc/graphite.cc:478
0x25ddb70 graphite_transforms
/repo/gcc-trunk/gcc/graphite.cc:541
0x25ddb70 execute
/repo/gcc-trunk/gcc/graphite.cc:620
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-13-branch/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-13-branch/binary-13-branch-20231014001921-ga9f39860efa-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-13-branch//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-13-branch//binary-13-branch-20231014001921-ga9f39860efa-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.1 20231014 (GCC)
