date:20230731

[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-31 Thread linkw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741

Kewen Lin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Kewen Lin  ---
Should be fixed everywhere.

[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741

--- Comment #7 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:cf10790c24b302b3265afdfc96abdfc35932fbd0

commit r13-7677-gcf10790c24b302b3265afdfc96abdfc35932fbd0
Author: Kewen Lin 
Date:   Wed Jul 26 03:42:29 2023 -0500

rs6000: Correct vsx operands output for xxeval [PR110741]

PR110741 exposes one issue that we didn't use the correct
character for vsx operands in output operand substitution,
consequently it can map to the wrong registers which hold
some unexpected values.

PR target/110741

gcc/ChangeLog:

* config/rs6000/vsx.md (define_insn xxeval): Correct vsx
operands output with "x".

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr110741.C: New test.

(cherry picked from commit 96a839233ced3a0bfc3d5492a6d8b102e6981472)

[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741

--- Comment #6 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:e58899645f09fb26fbca03ed8b9b13f162ff32dc

commit r12-9795-ge58899645f09fb26fbca03ed8b9b13f162ff32dc
Author: Kewen Lin 
Date:   Wed Jul 26 03:42:29 2023 -0500

rs6000: Correct vsx operands output for xxeval [PR110741]

PR110741 exposes one issue that we didn't use the correct
character for vsx operands in output operand substitution,
consequently it can map to the wrong registers which hold
some unexpected values.

PR target/110741

gcc/ChangeLog:

* config/rs6000/vsx.md (define_insn xxeval): Correct vsx
operands output with "x".

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr110741.C: New test.

(cherry picked from commit 96a839233ced3a0bfc3d5492a6d8b102e6981472)

[Bug target/110741] vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110741

--- Comment #5 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:2ad68e7ce034f74ac0e74b6140b3207c21b6573a

commit r11-10931-g2ad68e7ce034f74ac0e74b6140b3207c21b6573a
Author: Kewen Lin 
Date:   Wed Jul 26 03:42:29 2023 -0500

rs6000: Correct vsx operands output for xxeval [PR110741]

PR110741 exposes one issue that we didn't use the correct
character for vsx operands in output operand substitution,
consequently it can map to the wrong registers which hold
some unexpected values.

PR target/110741

gcc/ChangeLog:

* config/rs6000/altivec.md (define_insn xxeval): Correct vsx
operands output with "x".

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr110741.C: New test.

(cherry picked from commit 96a839233ced3a0bfc3d5492a6d8b102e6981472)

[Bug preprocessor/87299] #pragma GCC target behaves differently when using -save-temps

2023-07-31 Thread lhyatt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87299

Lewis Hyatt  changed:

   What|Removed |Added

 CC||lhyatt at gcc dot gnu.org

--- Comment #4 from Lewis Hyatt  ---
Patch submitted for review:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625924.html

[Bug c++/110866] No UDC lookup with va_arg

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110866

--- Comment #2 from Andrew Pinski  ---
In fact ICC rejects all cases:
(11): error: incorrect use of va_start
va_start(va, fmt); va.engaged = 1;
^

(12): error: incorrect use of va_arg
va_arg(va, int);
^

(14): error: incorrect use of va_copy
va_copy(va, va);
^

(14): error: incorrect use of va_copy
va_copy(va, va);
^

(16): error: incorrect use of va_end
va_end(va);
^

[Bug c++/110866] No UDC lookup with va_arg

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110866

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
I think this is not valid C++ code at all. All compilers I tried reject either
va_start/va_end or the va_arg.
The only portable fix is to do:
void f(const char *fmt, ...)
{
    va_wrap va;
    va_start(static_cast<va_list &>(va), fmt); va.engaged = 1;
    va_arg(static_cast<va_list &>(va), int);
    if (0)
        va_copy(static_cast<va_list &>(va), static_cast<va_list &>(va));
    if (0)
        va_end(static_cast<va_list &>(va));
}

[Bug c++/110866] New: No UDC lookup with va_arg

2023-07-31 Thread jengelh at inai dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110866

Bug ID: 110866
   Summary: No UDC lookup with va_arg
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jengelh at inai dot de
  Target Milestone: ---

Bugtype: enhancement
Version: 13.1.1 20230720 [revision 9aac37ab8a7b919a89c6d64bc7107a8436996e93]
(SUSE Linux)

Input
=====

#include <cstdarg>
struct va_wrap {
    ~va_wrap() { if (engaged) va_end(vl); }
    va_list vl;
    operator va_list &() { return vl; }
    bool engaged = false;
};
void f(const char *fmt, ...)
{
    va_wrap va;
    va_start(va, fmt); va.engaged = 1;
    va_arg(va, int);
    if (0)
        va_copy(va, va);
    if (0)
        va_end(va);
}
int main()
{
    f("", 42);
}


Observed output
===============
$ g++ -c x.cpp
x.cpp:12:20: error: first argument to ‘va_arg’ not of type ‘va_list’


Expected output
===============
Succeed.

If va_start and va_end can take a va_wrap and have the user-defined conversion
operator invoked, surely va_arg could do so too.

[Bug c++/104113] invalid template argument causes the type to become int which confuses the rest of the diagnostic

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104113

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org

--- Comment #9 from Andrew Pinski  ---
Not going to handle this, as there will be many paths inside the C++ front-end
that will need to be changed to handle error_mark_node rather than just a type
...

[Bug c++/104113] invalid template argument causes the type to become int which confuses the rest of the diagnostic

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104113

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> from decl.cc:
> 
>   if (type_was_error_mark_node && template_parm_flag)
>     /* FIXME we should be able to propagate the error_mark_node as is
>        for other contexts too.  */
>     type = error_mark_node;
>   else
>     type = integer_type_node;

Changing this to not check template_parm_flag causes a regression in
g++.dg/other/nontype-1.C. What happens is that the type here is now
error_mark_node, but the parser is not expecting that yet. We get instead:
```
t1.cc:3:37: error: expected ‘)’ before ‘,’ token
3 |Op::first_argument_type a, // { dg-error "not a type" }
  | ^
  | )
t1.cc:2:11: note: to match this ‘(’
2 | bool asfun(Op f,
  |   ^
```
Which is totally bad error recovery ...

[Bug libstdc++/110862] format out of bands read on format string "{0:{0}"

2023-07-31 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110862

康桓瑋  changed:

   What|Removed |Added

 CC||hewillk at gmail dot com

--- Comment #1 from 康桓瑋  ---
It does throw:

https://godbolt.org/z/5q3bb51YE

[Bug modula2/110865] Unable to access copied const array

2023-07-31 Thread gaius at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110865

Gaius Mulley  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Gaius Mulley  ---
Closing now that the patch has been applied.

[Bug modula2/110865] Unable to access copied const array

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110865

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:8a47474f2cf48837d6adf4a1232a89fd398ca7fa

commit r14-2892-g8a47474f2cf48837d6adf4a1232a89fd398ca7fa
Author: Gaius Mulley 
Date:   Tue Aug 1 01:42:16 2023 +0100

PR modula2/110865 Unable to access copied const array

This patch allows constants of an array type to be indexed.

gcc/m2/ChangeLog:

PR modula2/110865
* gm2-compiler/M2Quads.mod (BuildDesignatorArray):
Rename t as type and d as dim.  New variable result.
Allow constants of an array type to be indexed.

gcc/testsuite/ChangeLog:

PR modula2/110865
* gm2/iso/pass/constvec.mod: New test.
* gm2/iso/pass/constvec2.mod: New test.
* gm2/iso/run/pass/constvec3.mod: New test.

Signed-off-by: Gaius Mulley

[Bug tree-optimization/93044] extra cast is not removed

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93044

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/93044] extra cast is not removed

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93044

--- Comment #7 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:cc2003cd87532f319c94028f17d20a327df5ccfa

commit r14-2890-gcc2003cd87532f319c94028f17d20a327df5ccfa
Author: Andrew Pinski 
Date:   Sun Jul 23 21:44:39 2023 +

Fix PR 93044: extra cast is not removed

In this case we are not removing a conversion to a bigger size
followed by a conversion back to the same size (or smaller) if the
signedness does not match.
For example:
```
  signed char _1;
...
  _1 = *a_4(D);
  b_5 = (short unsigned int) _1;
  _2 = (unsigned char) b_5;
```
The inner cast is not needed and can be removed, but was not.
The match pattern for removing the extra cast is overly
complex, so I decided to add a new case rather than trying
to modify the current if statement here.

Committed as approved. Bootstrapped and tested on x86_64-linux-gnu with no
regressions.

gcc/ChangeLog:

PR tree-optimization/93044
* match.pd (nested int casts): A truncation (to the same size or
smaller) can always remove the inner cast.

gcc/testsuite/ChangeLog:

PR tree-optimization/93044
* gcc.dg/tree-ssa/cast-1.c: New test.
* gcc.dg/tree-ssa/cast-2.c: New test.
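
The identity behind the new match.pd case can be checked in plain C. The sketch below is not taken from the GCC testsuite; the helper names `with_inner_cast`, `without_inner_cast` and `check_casts` are made up for illustration. It mirrors the GIMPLE sequence quoted in the commit message.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors the quoted GIMPLE: widen a signed char to unsigned short,
   then truncate the result to unsigned char. */
static unsigned char with_inner_cast(signed char v)
{
    unsigned short b = (unsigned short)v;
    return (unsigned char)b;
}

/* Same computation with the inner (widening) cast removed. */
static unsigned char without_inner_cast(signed char v)
{
    return (unsigned char)v;
}

/* Asserts that both forms agree for every signed char value. */
void check_casts(void)
{
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++)
        assert(with_inner_cast((signed char)v)
               == without_inner_cast((signed char)v));
}
```

Because the final truncation keeps only the low bits, the signedness of the wider intermediate type cannot change the result, which is why the inner cast is removable.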

[Bug testsuite/110858] [14 Regression] gcc.dg/unroll-1.c UNRESOLVED

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110858

--- Comment #4 from Andrew Pinski  ---
I think the scan-rtl-dump-not should only be tested for
`{ target { { i?86-*-* x86_64-*-* } && ia32 } }`. So this is just a testsuite issue ...

[Bug testsuite/110858] [14 Regression] gcc.dg/unroll-1.c UNRESOLVED

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110858

Andrew Pinski  changed:

   What|Removed |Added

 Target|powerpc64le-linux-gnu   |
   Host|powerpc64le-linux-gnu   |
  Build|powerpc64le-linux-gnu   |

--- Comment #3 from Andrew Pinski  ---
(In reply to seurer from comment #2)
> Also seen on powerpc64

And on x86_64.

[Bug c++/110855] std::source_location doesn't work with C++20 coroutine

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110855

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-07-31
 Status|UNCONFIRMED |NEW
   Keywords||rejects-valid
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug c++/110855] std::source_location doesn't work with C++20 coroutine

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110855

--- Comment #1 from Andrew Pinski  ---
Created attachment 55669
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55669&action=edit
full testcase

[Bug testsuite/110858] [14 Regression] gcc.dg/unroll-1.c UNRESOLVED

2023-07-31 Thread seurer at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110858

seurer at gcc dot gnu.org changed:

   What|Removed |Added

  Build||powerpc64le-linux-gnu
   Host||powerpc64le-linux-gnu
 Target||powerpc64le-linux-gnu
 CC||seurer at gcc dot gnu.org

--- Comment #2 from seurer at gcc dot gnu.org ---
Also seen on powerpc64

[Bug modula2/110865] Unable to access copied const array

2023-07-31 Thread gaius at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110865

--- Comment #2 from Gaius Mulley  ---
Created attachment 55668
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55668&action=edit
Proposed fix

Here is a proposed patch, which I'll apply if the bootstrap succeeds and no
further regressions are introduced.

[Bug analyzer/110830] -Wanalyzer-use-of-uninitialized-value false negative due to use-after-free::supercedes_p.

2023-07-31 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110830

--- Comment #2 from David Malcolm  ---
The "supercedes_p" logic is called in
diagnostic_manager::emit_saved_diagnostics here:
  best_candidates.handle_interactions (this);

I *think* every saved_diagnostic ought to have a non-NULL m_best_epath by the
time this is called.

[Bug analyzer/110830] -Wanalyzer-use-of-uninitialized-value false negative due to use-after-free::supercedes_p.

2023-07-31 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110830

--- Comment #1 from David Malcolm  ---
For reference, I implemented use_after_free::supercedes_p in commit
g:33255ad3ac14e3953750fe0f2d82b901c2852ff6 as part of the gcc 12
(re)implementation of -Wanalyzer-use-of-uninitialized-value.

[Bug modula2/110865] Unable to access copied const array

2023-07-31 Thread gaius at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110865

Gaius Mulley  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2023-07-31

--- Comment #1 from Gaius Mulley  ---
Indeed - both statements should successfully access the const array.

[Bug modula2/110865] New: Unable to access copied const array

2023-07-31 Thread gaius at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110865

Bug ID: 110865
   Summary: Unable to access copied const array
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: gaius at gcc dot gnu.org
  Target Milestone: ---

As reported on the gm2 mailing list:

> MODULE MayBeBuggy;
>
> TYPE  V5= ARRAY [1.. 5] OF LONGREAL;
>
> CONST wg10  = V5{
> 6.66713443086881375935688098933317928579E-02,
> 1.49451349150580593145776339657697332403E-01,
> 2.19086362515982043995534934228163192459E-01,
> 2.69266719309996355091226921569469352860E-01,
> 2.95524224714752870173892994651338329421E-01
>   }; (* weights of the 10-point Gauss-Konrod rule *)
>
> CONST WG  = wg10;
>
> VAR   x   : LONGREAL;
> BEGIN
>   x := wg10[3]; (* OK *)
>
>   x := WG[3];   (* NOT OK *)
> END MayBeBuggy.

[Bug middle-end/110864] [14 Regression] ICE in combine.cc causes stage2 build failure on RISCV

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110864

--- Comment #2 from Andrew Pinski  ---
That is:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=7cdd0860949c6c3232e6cff1d7ca37bb5234074c

[Bug middle-end/110864] [14 Regression] ICE in combine.cc causes stage2 build failure on RISCV

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110864

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Keywords||ice-on-valid-code

[Bug middle-end/110864] [14 Regression] ICE in combine.cc causes stage2 build failure on RISCV

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110864

--- Comment #1 from Andrew Pinski  ---
Most likely the same issue as on arm:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625913.html

[Bug middle-end/110864] New: [14 Regression] ICE in combine.cc causes stage2 build failure on RISCV

2023-07-31 Thread patrick at rivosinc dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110864

Bug ID: 110864
   Summary: [14 Regression] ICE in combine.cc causes stage2 build
failure on RISCV
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Created attachment 55667
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55667&action=edit
Full build log with the error

Using https://github.com/riscv-collab/riscv-gnu-toolchain with tip-of-tree GCC
fails to build rv32gc linux target.

Here's the error:
during RTL pass: combine
../../../../gcc/libitm/beginend.cc: In static member function 'static uint32_t
GTM::gtm_thread::begin_transaction(uint32_t, const GTM::gtm_jmpbuf*)':
../../../../gcc/libitm/beginend.cc:425:1: internal compiler error: in
decompose, at rtl.h:2297
  425 | }
  | ^

The full log with the traceback is attached.

Affects rv32gc linux. rv64gc linux and rv{32|64}gc Newlib/musl still build.

I haven't attempted to bootstrap other architectures so it may affect those as
well.

I've partially bisected it down to 5 possible commits:
Known to fail: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=75d62394
Known to succeed: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=05986af

I'll continue bisecting and keep this issue updated.

[Bug testsuite/110858] [14 Regression] gcc.dg/unroll-1.c UNRESOLVED

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110858

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-07-31
  Component|c   |testsuite
   Keywords||testsuite-fail
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-07-31

--- Comment #1 from Andrew Pinski  ---
Confirmed. This code definitely looks questionable:
  __guess += max((int)__builtin_log10(__builtin_abs(__v)) / 2, 1);

Most likely the author of the above code thought __builtin_abs was type
generic rather than taking an integer type.

The reason I say that is because below we have:
  if (!__builtin_signbit(__v))

Most likely __builtin_abs(__v) should be replaced with
`(__builtin_signbit(__v) ? -__v : __v)` (which does get optimized to just
`ABS_EXPR` internally).
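
To illustrate the difference, here is a minimal C sketch (not libstdc++ code; the helper names are made up). It makes the int conversion that `__builtin_abs` implies explicit, and compares it with the signbit-based form suggested above. The values are chosen to stay within the range of `int`; for something like 2e304 the conversion to `int` is undefined, which matches the bug title.

```c
#include <assert.h>

/* Models __builtin_abs applied to a double: the argument is converted
   to int first (written out explicitly here), so the fraction is lost.
   Values outside the range of int make the conversion undefined. */
static int abs_via_int(double v)
{
    return __builtin_abs((int)v);
}

/* The suggested replacement: stays in floating point throughout. */
static double abs_via_signbit(double v)
{
    return __builtin_signbit(v) ? -v : v;
}

/* Asserts the behavior on a few representative inputs. */
void check_abs(void)
{
    assert(abs_via_int(-2.75) == 2);            /* fraction dropped */
    assert(abs_via_int(-0.5) == 0);             /* collapses to zero */
    assert(abs_via_signbit(-2.75) == 2.75);     /* exact */
    assert(abs_via_signbit(2e304) == 2e304);    /* large values are fine */
}
```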

[Bug middle-end/110859] New FAIL: 23_containers/vector/bool/110807.cc

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110859

Andrew Pinski  changed:

   What|Removed |Added

 Target||!LP64
 Summary|[14 Regression] New FAIL: 23_containers/vector/bool/110807.cc |New FAIL: 23_containers/vector/bool/110807.cc

--- Comment #2 from Andrew Pinski  ---
It is a new testcase and see also bug 110807 comment #8 where it was known to
fail in this case ...

[Bug libstdc++/110859] [14 Regression] New FAIL: 23_containers/vector/bool/110807.cc

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110859

Andrew Pinski  changed:

   What|Removed |Added

 CC||seurer at gcc dot gnu.org

--- Comment #1 from Andrew Pinski  ---
*** Bug 110863 has been marked as a duplicate of this bug. ***

[Bug other/110863] New test case 23_containers/vector/bool/110807.cc from r14-2797-g7931a1de9ec87b fails on big endian

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110863

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
Dup, it is failing on 32bit ...

*** This bug has been marked as a duplicate of bug 110859 ***

[Bug fortran/50410] [11/12/13/14 Regression] ICE in record_reference, pointer variable in data statement

2023-07-31 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50410

--- Comment #48 from anlauf at gcc dot gnu.org ---
Created attachment 55666
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55666&action=edit
Incremental patch

This is a cut-down and revised version of the patch by Tobias that deals with
invalid allocatable and pointer components in data statements, and adjusted
to F2018, and fixing the issue mentioned by Steve.

I've removed the other part that tries to detect the double initialization.
I think this is the wrong place, as it would not detect e.g. the following:

program p
  type t
 integer :: g
  end type t
  type(t) :: u
  data u /t(3)/
  data u%g /2/
end

A better-suited place is probably the loop in gfc_assign_data_value, but
find_con_by_component seems not to be able to handle the current situation.

[Bug middle-end/87403] [Meta-bug] Issues that suggest a new warning

2023-07-31 Thread jsm28 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
Bug 87403 depends on bug 65213, which changed state.

Bug 65213 Summary: Extend -Wmissing-declarations to variables [i.e. add 
-Wmissing-variable-declarations]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65213

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c/65213] Extend -Wmissing-declarations to variables [i.e. add -Wmissing-variable-declarations]

2023-07-31 Thread jsm28 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65213

Joseph S. Myers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |14.0

--- Comment #5 from Joseph S. Myers  ---
Implemented for GCC 14.

[Bug c/65213] Extend -Wmissing-declarations to variables [i.e. add -Wmissing-variable-declarations]

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65213

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Joseph Myers :

https://gcc.gnu.org/g:ffc74822468a39324722eef4c4412ea3224ca976

commit r14-2888-gffc74822468a39324722eef4c4412ea3224ca976
Author: Hamza Mahfooz 
Date:   Mon Jul 31 19:03:47 2023 +

c: add -Wmissing-variable-declarations [PR65213]

Resolves:
PR c/65213 - Extend -Wmissing-declarations to variables [i.e. add
-Wmissing-variable-declarations]

gcc/c-family/ChangeLog:

PR c/65213
* c.opt (-Wmissing-variable-declarations): New option.

gcc/c/ChangeLog:

PR c/65213
* c-decl.cc (start_decl): Handle
-Wmissing-variable-declarations.

gcc/ChangeLog:

PR c/65213
* doc/invoke.texi (-Wmissing-variable-declarations): Document
new option.

gcc/testsuite/ChangeLog:

PR c/65213
* gcc.dg/Wmissing-variable-declarations.c: New test.

Signed-off-by: Hamza Mahfooz
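
As a rough illustration of what the new option is meant to catch, consider the hypothetical C file below (the variable and function names are invented, and the comments state the expected behavior based on the option's description rather than on a verified GCC 14 run). It would be compiled with something like `gcc -c -Wmissing-variable-declarations demo.c`.

```c
#include <assert.h>

extern int declared_counter;    /* would normally come from a header */
int declared_counter = 1;       /* previous declaration exists: no warning expected */

int undeclared_counter = 2;     /* no previous declaration: warning expected */

static int private_counter = 3; /* internal linkage: no warning expected */

/* Uses all three variables so none of them is reported as unused. */
int sum_counters(void)
{
    return declared_counter + undeclared_counter + private_counter;
}

/* Sanity check on the values above. */
void check_counters(void)
{
    assert(sum_counters() == 6);
}
```

The usual way to silence the warning is either to declare the variable in a header that the defining file includes, or to make it `static` if it is not meant to be visible outside the file.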

[Bug other/110863] New: New test case 23_containers/vector/bool/110807.cc from r14-2797-g7931a1de9ec87b fails on big endian

2023-07-31 Thread seurer at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110863

Bug ID: 110863
   Summary: New test case 23_containers/vector/bool/110807.cc from
r14-2797-g7931a1de9ec87b fails on big endian
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:7931a1de9ec87b996d51d3d60786f5c81f63919f, r14-2797-g7931a1de9ec87b 

This is failing on our powerpc64 BE machines and I saw this in gcc 13 as well
recently.

make  -k check RUNTESTFLAGS="--target_board=unix'{-m32,m64}'
conformance.exp=23_containers/vector/bool/110807.cc"
FAIL: 23_containers/vector/bool/110807.cc (test for excess errors)

Excess errors:
/home/seurer/gcc/git/build/gcc-test/powerpc64-unknown-linux-gnu/32/libstdc++-v3/include/bits/stl_algobase.h:437:
warning: 'void* __builtin_memmove(void*, const void*, unsigned int)' writing
between 5 and 268435455 bytes into a region of size 4 overflows the destination
[-Wstringop-overflow=]


commit 7931a1de9ec87b996d51d3d60786f5c81f63919f (HEAD)
Author: Jonathan Wakely 
Date:   Wed Jul 26 14:09:24 2023 +0100

libstdc++: Avoid bogus overflow warnings in std::vector [PR110807]
* testsuite/23_containers/vector/bool/110807.cc: New test.

[Bug modula2/110284] [14 Regression] Bootstrap failures with m2

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110284

--- Comment #17 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:1f0933298c4aa76646b4ea964d6fbc07504526c8

commit r13-7675-g1f0933298c4aa76646b4ea964d6fbc07504526c8
Author: Gaius Mulley 
Date:   Mon Jul 31 19:43:02 2023 +0100

modula2: Fix bootstrap

Combining 3 patches from master for Make-lang.in and header
file changes to ensure that this sequence does not break git bisect.

gcc/m2/ChangeLog:

PR modula2/110284
* Make-lang.in: Build $(generated_files) before building
all $(GM2_C_OBJS).
(m2_OBJS): Assign $(GM2_C_OBJS).  Add m2/gm2-gcc/rtegraph.o and
m2/gm2-compiler-boot/m2flex.o.
(GM2_C_OBJS): Remove m2/stor-layout.o.
(m2/stor-layout.o): Remove rule.
* gm2-gcc/gcc-consolidation.h (rtl.h): Remove include.
(df.h): Remove include.
(except.h): Remove include.
(c-family/m2pp.o): Remove.
* Make-maintainer.in (c-family/m2pp.o): Add.

Signed-off-by: Gaius Mulley

[Bug tree-optimization/96923] Failure to optimize a select-related bool pattern to or+not

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96923
Bug 96923 depends on bug 100864, which changed state.

Bug 100864 Summary: (a&!b) | b is not opimized to a | b for comparisons
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100864

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/100864] (a&!b) | b is not opimized to a | b for comparisons

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100864

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/105903] Missed optimization for __synth3way

2023-07-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105903

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Andrew Pinski  ---
Fixed by r14-2886 .

[Bug tree-optimization/100864] (a&!b) | b is not opimized to a | b for comparisons

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100864

--- Comment #9 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b9237226fdc9387bccf584a811b30c5d3689ffd2

commit r14-2885-gb9237226fdc9387bccf584a811b30c5d3689ffd2
Author: Andrew Pinski 
Date:   Fri Jul 28 20:27:03 2023 -0700

tree-optimization: [PR100864] `(a&!b) | b` is not opimized to `a | b` for
comparisons

This is a new version of the patch.
Instead of matching the inverted comparison directly inside
match, create a new function (bitwise_inverted_equal_p) to do it.
It is very similar to bitwise_equal_p, which was added in
r14-2751-g2a3556376c69a1fb,
but instead it checks `expr1 == ~expr2`. A follow-on patch will
use this function in other patterns where we try to match `@0` and
`(bit_not @0)`.

Changed the name bitwise_not_equal_p to bitwise_inverted_equal_p.

Committed as approved after bootstrapping and testing on x86_64-linux-gnu
with no regressions.

PR tree-optimization/100864

gcc/ChangeLog:

* generic-match-head.cc (bitwise_inverted_equal_p): New function.
* gimple-match-head.cc (bitwise_inverted_equal_p): New macro.
(gimple_bitwise_inverted_equal_p): New function.
* match.pd ((~x | y) & x): Use bitwise_inverted_equal_p
instead of direct matching bit_not.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-3.c: New test.
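
The shape from the bug title can be sanity-checked in C. This is only a sketch with invented helper names, not the contents of bitops-3.c: `b` is taken to be the comparison `x < y`, and its inversion is written as the opposite comparison `x >= y` rather than as an explicit `!b`, which is the situation the commit message says bitwise_inverted_equal_p is meant to recognize.

```c
#include <assert.h>

/* (a & !b) | b, with b = (x < y) and !b spelled as (x >= y). */
static int unsimplified(int a, int x, int y)
{
    return (a & (x >= y)) | (x < y);
}

/* The simplified shape, a | b. */
static int simplified(int a, int x, int y)
{
    return a | (x < y);
}

/* Asserts agreement for boolean a (0 or 1) and a small range of x, y. */
void check_inverted(void)
{
    for (int a = 0; a <= 1; a++)
        for (int x = -3; x <= 3; x++)
            for (int y = -3; y <= 3; y++)
                assert(unsimplified(a, x, y) == simplified(a, x, y));
}
```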

[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164

--- Comment #13 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:ac0e0966ebf08c454d53042a649403e2880ccbc1

commit r14-2887-gac0e0966ebf08c454d53042a649403e2880ccbc1
Author: Andrew Pinski 
Date:   Sat Jul 29 21:52:31 2023 -0700

MATCH: Add `a == b | a cmp b` and `a != b & a cmp b` simplifications

Even though these are done by combine_comparisons, we can add them to match
to allow simplifications during match rather than just during
reassoc/ifcombine.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/106164
* match.pd (`a != b & a <= b`, `a != b & a >= b`,
`a == b | a < b`, `a == b | a > b`): Handle these cases
too.

gcc/testsuite/ChangeLog:

PR tree-optimization/106164
* gcc.dg/tree-ssa/cmpbit-2.c: New test.
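
For reference, the four forms listed in the ChangeLog entry each collapse to a single comparison. The C sketch below checks those identities over a small integer range; the right-hand sides are the obvious mathematical simplifications and are an assumption here, not a quote from match.pd or from cmpbit-2.c.

```c
#include <assert.h>

/* Asserts the four identities over a small range of integers. */
void check_eq_cmp(void)
{
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++) {
            assert(((a != b) & (a <= b)) == (a < b));
            assert(((a != b) & (a >= b)) == (a > b));
            assert(((a == b) | (a < b)) == (a <= b));
            assert(((a == b) | (a > b)) == (a >= b));
        }
}
```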

[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164

--- Comment #12 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:0258b73680e21fd96290af961c80966ac6b3cc68

commit r14-2886-g0258b73680e21fd96290af961c80966ac6b3cc68
Author: Andrew Pinski 
Date:   Sat Jul 29 16:59:10 2023 -0700

MATCH: PR 106164 : Optimize `(X CMP1 Y) AND/IOR (X CMP2 Y)`

I noticed that there are patterns that optimize
`(X CMP1 CST1) AND/IOR (X CMP2 CST2)` and we can easily extend
them to support `(X CMP1 Y) AND/IOR (X CMP2 Y)` by treating the two
right-hand operands as comparing equal. This allows for this kind of
optimization for integral and pointer types (which have the same semantics).

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/106164
* match.pd: Extend the `(X CMP1 CST1) AND/IOR (X CMP2 CST2)`
patterns to support `(X CMP1 Y) AND/IOR (X CMP2 Y)`.

gcc/testsuite/ChangeLog:

PR tree-optimization/106164
* gcc.dg/tree-ssa/cmpbit-1.c: New test.
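
A few identities of the `(X CMP1 Y) AND/IOR (X CMP2 Y)` shape, checked in C over a small range. The first is the case from the bug title; the others are further examples of the same shape and are not claimed to be exactly the set that match.pd handles.

```c
#include <assert.h>

/* Asserts identities where both comparisons share the same operands. */
void check_same_operands(void)
{
    for (int x = -3; x <= 3; x++)
        for (int y = -3; y <= 3; y++) {
            /* The case from the bug title. */
            assert(((x > y) & (x >= y)) == (x > y));
            /* Further identities of the same shape. */
            assert(((x < y) | (x > y)) == (x != y));
            assert(((x <= y) & (x >= y)) == (x == y));
            assert(((x < y) & (x > y)) == 0);
        }
}
```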

[Bug driver/77576] gcc-ar doesn't work if all options are read from file

2023-07-31 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77576

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Jeffrey A. Law  ---
Fixed on the trunk.

[Bug driver/77576] gcc-ar doesn't work if all options are read from file

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77576

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:c6523ae786e36dccd64589682140e9221628bb5b

commit r14-2884-gc6523ae786e36dccd64589682140e9221628bb5b
Author: Costas Argyris 
Date:   Mon Jul 31 10:56:20 2023 -0600

Re: [PATCH] gcc-ar: Handle response files properly [PR77576]

Problem: gcc-ar fails when a @file is passed to it:

$ cat rsp
--version
$ gcc-ar @rsp
/usr/bin/ar: invalid option -- '@'

This is because a dash '-' is prepended to the first
argument if it doesn't start with one, resulting in
the wrong call 'ar -@rsp'.

Fix: Expand argv to get rid of any @files and if any
expansions were made, pass everything through a
temporary response file.

$ gcc-ar @rsp
GNU ar (GNU Binutils for Debian) 2.35.2
...

gcc/
PR driver/77576
* gcc-ar.cc (main): Expand argv and use
temporary response file to call ar if any
expansions were made.
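
The problem and the first half of the fix can be modelled as follows. This is only a sketch: `prepend_dash` and `expand_at_files` are hypothetical names, the response files are simulated with an in-memory map, and the temporary response file that the real patch writes is not modelled.

```cpp
#include <map>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Behaviour described in the problem statement: a dash is prepended to
// the first argument when it does not already start with one.
Args prepend_dash(Args args) {
  if (!args.empty() && !args[0].empty() && args[0][0] != '-')
    args[0].insert(0, "-");
  return args;
}

// Hypothetical stand-in for @file expansion. `files` maps a response file
// name to the arguments it contains; `expanded` reports whether any
// @file argument was replaced.
Args expand_at_files(const Args& args,
                     const std::map<std::string, Args>& files,
                     bool& expanded) {
  Args out;
  expanded = false;
  for (const std::string& arg : args) {
    std::map<std::string, Args>::const_iterator it = files.end();
    if (arg.size() > 1 && arg[0] == '@')
      it = files.find(arg.substr(1));
    if (it != files.end()) {
      out.insert(out.end(), it->second.begin(), it->second.end());
      expanded = true;
    } else {
      out.push_back(arg);
    }
  }
  return out;
}
```

Applying `prepend_dash` directly to `@rsp` yields `-@rsp`, the wrong call shown above. Expanding first turns the argument list into `--version`, which already starts with a dash and is left alone.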

[Bug bootstrap/109250] Invalid configuration `loongarch64-linux-gnu': machine `loongarch64-unknown' not recognized

2023-07-31 Thread xry111 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109250

--- Comment #8 from Xi Ruoyao  ---
Should be fixed with GMP 6.3.0.

[Bug target/109713] [14 Regression] gcc/config/riscv/sync.md:66:1: error: control reaches end of non-void function [-Werror=return-type] since r14-406-gbff7c773864479

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109713

--- Comment #8 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick O'Neill:

https://gcc.gnu.org/g:1e9180b3298def6c01d9055d558fdb52231f8d2d

commit r13-7674-g1e9180b3298def6c01d9055d558fdb52231f8d2d
Author: Martin Liska 
Date:   Wed May 3 16:35:26 2023 +0200

riscv: fix error: control reaches end of non-void function

Fixes:
gcc/config/riscv/sync.md:66:1: error: control reaches end of non-void
function [-Werror=return-type]
66 |   [(set (attr "length") (const_int 4))])
   | ^

PR target/109713

gcc/ChangeLog:

* config/riscv/sync.md: Add gcc_unreachable to a switch.
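
The general shape of this class of warning, and of the fix, can be sketched as below. The enumeration and function are hypothetical and unrelated to the actual contents of sync.md, and `std::abort` stands in for GCC's internal `gcc_unreachable ()`.

```cpp
#include <cstdlib>

// Hypothetical enumeration used only for illustration.
enum class Kind { Small, Medium, Large };

int length_for(Kind k) {
  switch (k) {
    case Kind::Small:  return 4;
    case Kind::Medium: return 8;
    case Kind::Large:  return 12;
  }
  // Every enumerator returns above, but the compiler cannot assume the
  // value is one of them. Without a terminating statement here,
  // -Wreturn-type reports that control reaches the end of a non-void
  // function, which becomes an error under -Werror=return-type.
  std::abort();  // stand-in for gcc_unreachable ()
}
```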

[Bug target/89835] The RISC-V target uses amoswap.w for relaxed stores

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89835

--- Comment #4 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Patrick O'Neill:

https://gcc.gnu.org/g:74abe200bc9b06e10f0f3cad74f11da4fae90cd3

commit r13-7668-g74abe200bc9b06e10f0f3cad74f11da4fae90cd3
Author: Patrick O'Neill 
Date:   Wed Apr 5 09:56:33 2023 -0700

RISC-V: Strengthen atomic stores

This change makes atomic stores strictly stronger than table A.6 of the
ISA manual. This mapping makes the overall patchset compatible with
table A.7 as well.

2023-04-27 Patrick O'Neill 

PR target/89835

gcc/ChangeLog:

* config/riscv/sync.md (atomic_store): Use simple store
instruction in combination with fence(s).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr89835.c: New test.

Signed-off-by: Patrick O'Neill
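
For reference, these are the source-level operations whose lowering is affected. The sketch below is generic C++ and not part of the patch; the variable and function names are made up. The exact instruction sequences GCC emits are not reproduced here and are best checked by compiling for a RISC-V target with `-S`.

```cpp
#include <atomic>

// Hypothetical shared variable used only for illustration.
std::atomic<int> shared_flag{0};

// Atomic stores with increasing ordering strength. Per the ChangeLog
// above, atomic_store is now implemented with a simple store instruction
// in combination with fence(s).
void store_relaxed(int v) { shared_flag.store(v, std::memory_order_relaxed); }
void store_release(int v) { shared_flag.store(v, std::memory_order_release); }
void store_seq_cst(int v) { shared_flag.store(v, std::memory_order_seq_cst); }

int load_acquire() { return shared_flag.load(std::memory_order_acquire); }
```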

[Bug tree-optimization/110582] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu

2023-07-31 Thread amacleod at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110582

Andrew Macleod  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Andrew Macleod  ---
There was a bug in the implementation of fur_list where it was using the
supplied values for *any* encountered operand, not just ssa_names.  So when
the phi analyzer was trying to find an initial value for the PHI a_lsm.2_29 it
was miscalculating the value of iftmp.1_15 and thought it had an (incorrect)
limited range of [-2,2].

 # a_lsm.12_29 = PHI 
  iftmp.1_15 = 3 / a_lsm.12_29; 

It then thought it could fold away the condition.

The patch adjusts the operand fetcher to work properly, and then the phi
analyzer calculates the range for the statement properly, and we can no longer
remove the condition.
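
A much simplified model of the described bug is sketched below. It is not the code in gimple-range-fold.cc: the `Range`, `Operand` and `SuppliedRanges` types are hypothetical, and the ranges used with it are arbitrary illustration values.

```cpp
#include <climits>
#include <cstddef>
#include <utility>
#include <vector>

struct Range { long lo, hi; };

// An operand is either an SSA name (value unknown here) or a constant.
struct Operand { bool is_ssa_name; long constant; };

// Hypothetical operand fetcher that is handed a list of ranges to use
// in place of the operands of a statement.
class SuppliedRanges {
 public:
  explicit SuppliedRanges(std::vector<Range> r) : ranges_(std::move(r)) {}

  // Buggy variant: hands the next supplied range to any operand.
  Range get_operand_buggy(const Operand& op) {
    if (next_ < ranges_.size()) return ranges_[next_++];
    return fallback(op);
  }

  // Fixed variant: supplied ranges are only used for SSA names.
  Range get_operand_fixed(const Operand& op) {
    if (op.is_ssa_name && next_ < ranges_.size()) return ranges_[next_++];
    return fallback(op);
  }

 private:
  // A constant is its own singleton range; an SSA name is unconstrained.
  static Range fallback(const Operand& op) {
    if (op.is_ssa_name) return Range{LONG_MIN, LONG_MAX};
    return Range{op.constant, op.constant};
  }

  std::vector<Range> ranges_;
  std::size_t next_ = 0;
};
```

For a statement like `3 / a`, the buggy variant gives the constant 3 the range that was meant for `a`, so the computed result range is wrong. The fixed variant returns [3,3] for the constant and keeps the supplied range for the SSA name.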

[Bug ipa/110378] IPA-SRA for destructors

2023-07-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110378

--- Comment #4 from Martin Jambor  ---
Created attachment 55665
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55665&action=edit
Testcase with non-zero offset with pass-through split

This testcase is similar to the previous one but on top of it also deals with
non-zero offsets which is currently not supported by IPA-SRA analysis and will
also require some assumption adjustments on the modification side (see TODOs in
ptr_parm_has_nonarg_uses and in ipa_param_body_adjustments::modify_call_stmt).

[Bug tree-optimization/110582] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110582

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Andrew Macleod :

https://gcc.gnu.org/g:b769811e7c1b3dff2fa0ec2c37b52859d7bceed4

commit r14-2883-gb769811e7c1b3dff2fa0ec2c37b52859d7bceed4
Author: Andrew MacLeod 
Date:   Mon Jul 31 10:08:51 2023 -0400

fur_list should not use the range vector for non-ssa operands.

gcc/
PR tree-optimization/110582
* gimple-range-fold.cc (fur_list::get_operand): Do not use the
range vector for non-ssa names.

gcc/testsuite/
* gcc.dg/pr110582.c: New.

[Bug ipa/110378] IPA-SRA for destructors

2023-07-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110378

--- Comment #3 from Martin Jambor  ---
Created attachment 55664
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55664&action=edit
Testcase with single inheritance

This testcase is somewhat more difficult and addressing will mean not
just changes in the IPA-SRA analysis but also in the modification.
Both need to deal with the following:

 _1 = _2(D)->D.2842;
{anonymous}::foo::~foo (_1);

Modification must see _1 before its use to understand it is just
another name for a (split) parameter and must behave accordingly when
processing the call statement (i.e. prepare ground for call
redirection later).

[Bug ipa/110378] IPA-SRA for destructors

2023-07-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110378

Martin Jambor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-07-31
 Ever confirmed|0   |1

[Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

--- Comment #22 from Martin Jambor  ---
(In reply to Jan Hubicka from comment #21)
> Fixing loop distribution and vectorizer profile update seems to do the trick
> with profile feedback. Without we are still worse than in July last year on
> zen2 tester (zen3 and ice lake seems to behave differently perhaps due to
> different vectorization decisions)
> 
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?highlight_run=38536.
> 0=476.180.0
> 
> shows two jumps last year.
> g:d489ec082ea21410 (2022-06-30 16:46) and 3731dd0bea8994c3 (2022-07-04 00:16)

On a machine very similar to lntzen3, hmmer binary built with these
two revisions ran for pretty much the same time.

> g:3731dd0bea8994c3 (2022-07-04 00:16) and 07dd0f7ba27d1fe9 (2022-07-05 14:05)

Bisecting in this range led to g:d2a8980 but that is the commit
referenced in the summary of this bug.

[Bug target/106346] [11/12/13/14 Regression] Potential regression on vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94

2023-07-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346

Tamar Christina  changed:

   What|Removed |Added

   Target Milestone|11.5|14.0

[Bug analyzer/109361] RFE: SARIF output could contain timing/profile information

2023-07-31 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109361

David Malcolm  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from David Malcolm  ---
Implemented in trunk for gcc 14 by the above patch.

[Bug analyzer/109361] RFE: SARIF output could contain timing/profile information

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109361

--- Comment #5 from CVS Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:75d623946d4b6ea80a777b789b116d4b4a2298dc

commit r14-2881-g75d623946d4b6ea80a777b789b116d4b4a2298dc
Author: David Malcolm 
Date:   Mon Jul 31 11:13:02 2023 -0400

SARIF and -ftime-report's output [PR109361]

This patch adds support for embedding profiling information about the
compiler itself into the SARIF output.

Specifically, if SARIF diagnostic output is requested, via
-fdiagnostics-format=sarif-file or -fdiagnostics-format=sarif-stderr,
then any -ftime-report output is written in JSON form into the SARIF
output, rather than to stderr.

In earlier versions of this patch I extended -ftime-report so that
*as well* as writing to stderr, it would embed the information in any
SARIF output.  This turned out to be awkward to use, in that I found
myself needing to get the data in JSON form without also having it
emitted on stderr (which was fouling my build scripts).

The timing information is written to the SARIF as a "gcc/timeReport"
property within a property bag of the "invocation" object.

Here's an example of the output:

  "invocations": [
  {
  "executionSuccessful": true,
  "toolExecutionNotifications": [],
  "properties": {
  "gcc/timeReport": {
  "timevars": [
  {
  "name": "phase setup",
  "elapsed": {
  "user": 0.04,
  "sys": 0,
  "wall": 0.04,
  "ggc_mem": 1863472
  }
  },

  [...snip...]

  {
  "name": "analyzer: processing worklist",
  "elapsed": {
  "user": 0.06,
  "sys": 0,
  "wall": 0.06,
  "ggc_mem": 48
  }
  },
  {
  "name": "analyzer: emitting diagnostics",
  "elapsed": {
  "user": 0.01,
  "sys": 0,
  "wall": 0.01,
  "ggc_mem": 0
  }
  },
  {
  "name": "TOTAL",
  "elapsed": {
  "user": 0.21,
  "sys": 0.03,
  "wall": 0.24,
  "ggc_mem": 3368736
  }
  }
  ],
  "CHECKING_P": true,
  "flag_checking": true
  }
  }
  }
  ]

The documentation notes that the precise output format is subject
to change.

I have successfully used this in my analyzer integration tests to get
timing information about which source files get slowed down by the
analyzer.  I've validated the generated .sarif files against the SARIF
schema.

gcc/ChangeLog:
PR analyzer/109361
* diagnostic-client-data-hooks.h (class sarif_object): New forward
decl.
(diagnostic_client_data_hooks::add_sarif_invocation_properties):
New vfunc.
* diagnostic-format-sarif.cc: Include "diagnostic-format-sarif.h".
(class sarif_invocation): Inherit from sarif_object rather than
json::object.
(class sarif_result): Likewise.
(class sarif_ice_notification): Likewise.
(sarif_object::get_or_create_properties): New.
(sarif_invocation::prepare_to_flush): Add "context" param.  Use it
to call the context's add_sarif_invocation_properties hook.
(sarif_builder::flush_to_file): Pass m_context to
sarif_invocation::prepare_to_flush.
* diagnostic-format-sarif.h: New header.
* doc/invoke.texi (Developer Options): Clarify that -ftime-report
writes to stderr.  Document that if SARIF diagnostic output is
requested then any timing information is written in JSON form as
part of the SARIF output, rather than to stderr.
* timevar.cc: Include "json.h".
(timer::named_items::m_hash_map): Split out type into...
(timer::named_items::hash_map_t): ...this new typedef.
(timer::named_items::make_json): New function.

[Bug middle-end/110378] IPA-SRA for destructors

2023-07-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110378

--- Comment #2 from Martin Jambor  ---
Created attachment 55663
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55663&action=edit
Simplest testcase

The PR 109849 testcase behavior changes over time, so I prepared three
specialized for this PR.  This one is simplest, just making sure a clobber is
not considered a write for the purposes of IPA-SRA parameter splitting.

[Bug libstdc++/110862] New: format out of bands read on format string "{0:{0}"

2023-07-31 Thread gcc at pauldreik dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110862

Bug ID: 110862
   Summary: format out of bands read on format string "{0:{0}"
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at pauldreik dot se
  Target Milestone: ---

The following program with an incorrect format string causes an out of bounds
read when compiled with gcc 13.2:

#include <format>
#include <cstdio>

int main() {
unsigned short v = 0;
std::puts(std::vformat("{0:{0}", std::make_format_args(v)).c_str());
}

I expected an exception to be thrown.

Link to reproducer: 
https://godbolt.org/z/WrqxGE1jG

[Bug fortran/110360] ABI issue with character,value dummy argument

2023-07-31 Thread dje at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110360

David Edelsohn  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org

--- Comment #38 from David Edelsohn  ---
As mentioned by Mikael in PR110419, GFORTRAN calling convention has a hole.

The problem on PowerPC Big Endian is that sometimes GFORTRAN passes a character
as a character and sometimes it passes a character as a STRING of length 1
passed by value.  On Little Endian systems, the single value happens to end up
in the same location in the register.  On Big Endian systems, the values are at
opposite ends of the register.

Or to put it another way, the GFORTRAN internal "type" of CHARACTER as
represented by/to GCC type system is inconsistent.  GFORTRAN is breaking the
GCC type system, which causes the parameter to be represented differently in
the register.
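
The endianness effect can be modelled on the host with the sketch below. It is an illustration only, not the PowerPC ABI: the "register" is modelled as a 64-bit word, a character passed as a scalar is zero-extended into it, and a length-1 string passed by value is modelled as its bytes copied to the lowest address of the word. All names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Character passed as a scalar: the value is zero-extended, so it sits
// in the least significant byte of the word.
std::uint64_t as_scalar(char c) {
  return static_cast<unsigned char>(c);
}

// Character passed as a length-1 string by value: the byte is placed at
// the lowest address of the word image.
std::uint64_t as_string_image(char c) {
  std::uint64_t word = 0;
  std::memcpy(&word, &c, 1);
  return word;
}

// Detect host byte order by inspecting the first byte of a known value.
bool host_is_little_endian() {
  std::uint16_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

On a little-endian host the two representations coincide. On a big-endian host the string image holds the byte in the most significant position, which corresponds to the "opposite ends of the register" observation above.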

[Bug c++/110861] Bad codegen leading to runtime segfault when mixing import and #include

2023-07-31 Thread headch at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110861

--- Comment #1 from Christopher Head  ---
Note that I ran into this while trying to find a small repro case for the same
symptoms (uninitialized “this” and segfault using a string_view) in a larger
project, where I originally saw the same thing happen while only importing
stdlib headers, not including any of them, and eventually made this as a
smaller example.

[Bug c++/110861] New: Bad codegen leading to runtime segfault when mixing import and #include

2023-07-31 Thread headch at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110861

Bug ID: 110861
   Summary: Bad codegen leading to runtime segfault when mixing
import and #include
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: headch at gmail dot com
  Target Milestone: ---

$ g++-13 --version
g++-13 (Gentoo 13.2.0 p3) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


$ cat hello.cpp
#include <string_view>
import <iostream>;

void f(std::string_view sv) {
std::cout << sv;
}

int main() {
f("hello\n");
}


$ g++-13 -O1 -Wall -Wextra -c -std=c++20 -x c++-system-header -fmodules-ts
-fmodule-header=system iostream


$ g++-13 -std=c++20 -O1 -Wall -Wextra -fmodules-ts -ohello hello.cpp
In file included from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/basic_string.h:47,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/string:54,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/locale_classes.h:40,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/ios_base.h:41,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/ios:44,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/ostream:40,
 from
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/iostream:41,
of module /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/iostream,
imported at hello.cpp:2:
In member function ‘constexpr const std::basic_string_view<_CharT,
_Traits>::value_type* std::basic_string_view<_CharT, _Traits>::data() const
[with _CharT = char; _Traits = std::char_traits<char>]’,
inlined from ‘std::basic_ostream<_CharT, _Traits>&
std::operator<<(basic_ostream<_CharT, _Traits>&, basic_string_view<_CharT,
_Traits>) [with _CharT = char; _Traits = char_traits<char>]’ at
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/string_view:762:30,
inlined from ‘void f(std::string_view)’ at hello.cpp:5:15:
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/string_view:292:22:
warning: ‘this’ is used uninitialized [-Wuninitialized]
  292 |   { return this->_M_str; }
  |  ^~
In file included from hello.cpp:1:
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/string_view: In function
‘void f(std::string_view)’:
/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/string_view:760:5: note:
‘this’ was declared here
  760 | operator<<(basic_ostream<_CharT, _Traits>& __os,
  | ^~~~


$ ./hello
Segmentation fault (core dumped)


If I use optimization level zero, the problem goes away (both the segfault and
the “uninitialized” warning). Likewise if I import both headers rather than
including one of them. But as far as I’m aware, I haven’t found any indication
that the stdlib is required to be either all-included or all-imported and not a
mix.

[Bug testsuite/110419] [14 regression] new test case gfortran.dg/value_9.f90 in r14-2050-gd130ae8499e0c6 fails

2023-07-31 Thread mikael at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110419

--- Comment #18 from Mikael Morin  ---
Created attachment 55662
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55662&action=edit
Updated tentative patch

This fixes comment #4 as well, but the failure on value_9 remains on 32 bit
powerpc.

It is almost testsuite clean on x86_64.

There is a regression on c_char_tests_2.f03 because there is a hole in the
handling of single char values in gfc_conv_procedure_call.
Length one arguments are handled with:

  else if (fsym && fsym->attr.value)
    {
      if (fsym->ts.type == BT_CHARACTER
          && fsym->ts.is_c_interop
          && fsym->ns->proc_name != NULL
          && fsym->ns->proc_name->attr.is_bind_c)
        {
          // Pass single char value
        }
      else
        {
          gfc_conv_expr (&parmse, e);

          if (!fsym->ts.is_c_interop
              && gfc_length_one_character_type_p (&fsym->ts))
            {
              // pass single char value


The failing case is when the type is interoperable (character(c_char)), but the
procedure is not bind(c).  So the translation from string to single character
is neither done in the if branch (the procedure is not bind(c)) nor in the if
of the else branch (the type is interoperable).
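
The gap can be restated as a small decision function. This is a sketch with hypothetical names that mirrors only the conditions quoted above, not the actual code in gfc_conv_procedure_call.

```cpp
// Hypothetical flags mirroring the conditions in the quoted code.
struct ValueCharArg {
  bool type_is_c_interop;  // fsym->ts.is_c_interop
  bool proc_is_bind_c;     // fsym->ns->proc_name->attr.is_bind_c
  bool is_length_one;      // length-one character type
};

// Returns true when one of the two quoted paths converts the argument
// from a string to a single character value.
bool converted_to_single_char(const ValueCharArg& a) {
  if (a.type_is_c_interop && a.proc_is_bind_c)
    return true;   // the "if" branch
  if (!a.type_is_c_interop && a.is_length_one)
    return true;   // the "if" inside the "else" branch
  return false;    // neither path applies
}
```

An interoperable type in a procedure that is not bind(c), `{true, false, true}`, falls through both tests, which is the failing case.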

[Bug libstdc++/110860] New: std::format("{:f}",2e304) invokes undefined behaviour

2023-07-31 Thread gcc at pauldreik dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

Bug ID: 110860
   Summary: std::format("{:f}",2e304) invokes undefined behaviour
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at pauldreik dot se
  Target Milestone: ---

The following program, compiled with gcc 13.2:

#include <format>
#include <cstdio>

int main() {
std::puts(std::format("{:f}",2e304).c_str());
}

causes ubsan to warn:
/opt/compiler-explorer/gcc-13.2.0/include/c++/13.2.0/format:1489:52: runtime
error: negation of -2147483648 cannot be represented in type 'int'; cast to an
unsigned type to negate this value to itself

I believe the problem is using __builtin_abs() which uses integer, but is fed a
double.
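
The ubsan hint ("cast to an unsigned type to negate this value to itself") can be illustrated in isolation. The sketch below is not the libstdc++ code or its fix; `magnitude` is a hypothetical helper showing a negation that is well defined for every `int`, including the minimum value.

```cpp
#include <climits>

// Magnitude of x as an unsigned value. Negating INT_MIN as an int
// overflows, but unsigned arithmetic wraps and is well defined.
unsigned magnitude(int x) {
  unsigned u = static_cast<unsigned>(x);
  return x < 0 ? 0u - u : u;
}

// Convenience wrapper for the problematic input from the report.
unsigned magnitude_of_int_min() {
  return magnitude(INT_MIN);
}
```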


Link to reproducer:

https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,selection:(endColumn:1,endLineNumber:6,positionColumn:1,positionLineNumber:6,selectionStartColumn:1,selectionStartLineNumber:6,startColumn:1,startLineNumber:6),source:'%23include+%3Cformat%3E%0A%23include+%3Ccstdio%3E%0A%0Aint+main()+%7B%0Astd::puts(std::format(%22%7B:f%7D%22,2e304).c_str())%3B%0A%7D'),l:'5',n:'0',o:'C%2B%2B+source+%231',t:'0')),k:33.336,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:g132,deviceViewOpen:'1',filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'0',intel:'0',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:c%2B%2B,libs:!(),options:'-std%3Dgnu%2B%2B2b+-fsanitize%3Daddress,undefined,leak+-g',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+x86-64+gcc+13.2+(Editor+%231)',t:'0')),k:33.336,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(editorid:1,fontScale:14,fontUsePx:'0',j:1,wrap:'1'),l:'5',n:'0',o:'Output+of+x86-64+gcc+13.2+(Compiler+%231)',t:'0')),k:33.33,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4

[Bug libstdc++/110859] New: [14 Regression] New FAIL: 23_containers/vector/bool/110807.cc

2023-07-31 Thread carlos.seo at linaro dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110859

Bug ID: 110859
   Summary: [14 Regression] New FAIL:
23_containers/vector/bool/110807.cc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: carlos.seo at linaro dot org
  Target Milestone: ---

Commit f30e62b0ee05befd20863466d1fb55a34d15c228 introduced a new failure:

FAIL: 23_containers/vector/bool/110807.cc (test for excess errors)

Confirmed on arm linux by the CI -
https://ci.linaro.org/job/tcwg_gnu_native_check_gcc--master-arm-build/488/

[Bug c/110858] New: [14 Regression] gcc.dg/unroll-1.c UNRESOLVED

2023-07-31 Thread carlos.seo at linaro dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110858

Bug ID: 110858
   Summary: [14 Regression] gcc.dg/unroll-1.c UNRESOLVED
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: carlos.seo at linaro dot org
  Target Milestone: ---

Commit a7d4310aed539b04345894ebafb49ca364780653 caused a new UNRESOLVED test
output on trunk:

UNRESOLVED: gcc.dg/unroll-1.c scan-rtl-dump-not loop2_unroll "Invalid sum"

Confirmed on arm linux, found by the precommit CI -
https://ci.linaro.org/job/tcwg_gcc_check--master-arm-precommit/2405/artifact/artifacts/artifacts.precommit/

[Bug gcov-profile/110545] gcda files not generated for some shared libs

2023-07-31 Thread gejoed at rediffmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110545

--- Comment #3 from Gejoe  ---
Any other comment or suggestion here for this case ?

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Richard Biener via Gcc-bugs

On Mon, Jul 31, 2023 at 2:57 PM Changbin Du via Gcc  wrote:
>
> Hello, folks.
> This is to discuss Gcc's heuristic strategy about Predicated Instructions and
> Branches. And probably something needs to be improved.
>
> [The story]
> Weeks ago, I built a huffman encoding program with O2, O3, and PGO respectively.
> This program is nothing special, just a random code I found on the internet. You
> can download it from http://cau.ac.kr/~bongbong/dsd08/huffman.c.
>
> Build it with O2/O3/PGO (My GCC is 13.1):
> $ gcc -O2 -march=native -g -o huffman huffman.c
> $ gcc -O3 -march=native -g -o huffman.O3 huffman.c
>
> $ gcc -O2 -march=native -g -fprofile-generate -o huffman.instrumented huffman.c
> $ ./huffman.instrumented test.data
> $ gcc -O2 -march=native -g -fprofile-use=huffman.instrumented.gcda -o huffman.pgo huffman.c
>
> Run them on my 12900H laptop:
> $ head -c 50M /dev/urandom > test.data
> $ perf stat  -r3 --table -- taskset -c 0 ./huffman test.data
> $ perf stat  -r3 --table -- taskset -c 0 ./huffman.O3 test.data
> $ perf stat  -r3 --table -- taskset -c 0 ./huffman.pgo test.data
>
> The result (p-core, no ht, no turbo, performance mode):
>
>                                    O2              O3              PGO
> cycles                             2,581,832,749   8,638,401,568   9,394,200,585
>                                    (1.07s)         (3.49s)         (3.80s)
> instructions                       12,609,600,094  11,827,675,782  12,036,010,638
> branches                           2,303,416,221   2,671,184,833   2,723,414,574
> branch-misses                      0.00%           7.94%           8.84%
> cache-misses                       3,012,613       3,055,722       3,076,316
> L1-icache-load-misses              11,416,391      12,112,703      11,896,077
> icache_tag.stalls                  1,553,521       1,364,092       1,896,066
> itlb_misses.stlb_hit               6,856           21,756          22,600
> itlb_misses.walk_completed         14,430          4,454           15,084
> baclears.any                       131,573         140,355         131,644
> int_misc.clear_resteer_cycles      2,545,915       586,578,125     679,021,993
> machine_clears.count               22,235          39,671          37,307
> dsb2mite_switches.penalty_cycles   6,985,838       12,929,675      8,405,493
> frontend_retired.any_dsb_miss      28,785,677      28,161,724      28,093,319
> idq.dsb_cycles_any                 1,986,038,896   5,683,820,258   5,971,969,906
> idq.dsb_uops                       11,149,445,952  26,438,051,062  28,622,657,650
> idq.mite_uops                      207,881,687     216,734,007     212,003,064
>
>
> Above data shows:
>   o O3/PGO lead to a *2.3x/2.6x* performance drop relative to O2, respectively.
>   o O3/PGO reduced instructions by 6.2% and 4.5%. I attribute this to
>     aggressive inlining.
>   o O3/PGO introduced very bad branch prediction. I will explain it later.
>   o Code built with O3 has a high iTLB miss count but a much lower sTLB miss
>     count. This is not what I expected.
>   o O3/PGO introduced 78% and 68% more machine clears. This is interesting and
>     I don't know why. (subcategory MC is not measured yet)
>   o O3 has much higher dsb2mite_switches.penalty_cycles than O2/PGO.
>   o The idq.mite_uops of O3/PGO increased 4%, while idq.dsb_uops increased 2x.
>     The DSB hits well, so frontend fetching and decoding are not a problem for
>     O3/PGO.
>   o Other events are mainly affected by branch misprediction.
>
> Additionally, here is the TMA level 2 analysis: The main changes in the pipeline
> slots are in the Bad Speculation and Frontend Bound categories. I doubt the
> accuracy of tma_fetch_bandwidth given the frontend_retired.any_dsb_miss and
> idq.mite_uops data above.
>
> $ perf stat --topdown --td-level=2 --cputype core -- taskset -c 0 ./huffman test.data
> test.data.huf is 1.00% of test.data
>
>  Performance counter stats for 'taskset -c 0 ./huffman test.data':
>
>  %  tma_branch_mispredicts     0.0
>  %  tma_core_bound             0.8
>  %  tma_heavy_operations      11.4
>  %  tma_light_operations      76.8
>  %  tma_memory_bound           2.0
>  %  tma_fetch_bandwidth        8.3
>  %  tma_fetch_latency          0.8
>  %  tma_machine_clears         0.0
>
>1.073381357 seconds time elapsed
>
>0.945233000 seconds user
>0.095719000 seconds sys
>
>
> $ perf stat --topdown --td-level=2 --cputype core -- taskset -c 0 ./huffman.O3 test.data
> test.data.huf is 1.00% of test.data
>
>  Performance counter stats for 'taskset -c 0 ./huffman.O3 test.data':
>
>  %  tma_branch_mispredicts%  tma_core_bound %  tma_heavy_operations %  
> tma_light_operations  %  tma_memory_bound %  tma_fetch_bandwidth %  
> tma_fetch_latency %  tma_machine_clears
>   38.2  6.6  3.5  21.7  0.9  20.9

[Bug tree-optimization/110280] [13 Regression] internal compiler error: in const_unop, at fold-const.cc:1884

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110280

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
  Known to fail||13.2.0
  Known to work||13.2.1

--- Comment #18 from Richard Biener  ---
Fixed.

[Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Changbin Du via Gcc-bugs

Hello, folks.
This is to discuss Gcc's heuristic strategy about Predicated Instructions and
Branches. And probably something needs to be improved.

[The story]
Weeks ago, I built a huffman encoding program with O2, O3, and PGO respectively.
This program is nothing special, just a random code I found on the internet. You
can download it from http://cau.ac.kr/~bongbong/dsd08/huffman.c.

Build it with O2/O3/PGO (My GCC is 13.1):
$ gcc -O2 -march=native -g -o huffman huffman.c
$ gcc -O3 -march=native -g -o huffman.O3 huffman.c

$ gcc -O2 -march=native -g -fprofile-generate -o huffman.instrumented huffman.c
$ ./huffman.instrumented test.data
$ gcc -O2 -march=native -g -fprofile-use=huffman.instrumented.gcda -o huffman.pgo huffman.c

Run them on my 12900H laptop:
$ head -c 50M /dev/urandom > test.data
$ perf stat  -r3 --table -- taskset -c 0 ./huffman test.data
$ perf stat  -r3 --table -- taskset -c 0 ./huffman.O3 test.data
$ perf stat  -r3 --table -- taskset -c 0 ./huffman.pgo test.data

The result (p-core, no ht, no turbo, performance mode):

                                   O2              O3              PGO
cycles                             2,581,832,749   8,638,401,568   9,394,200,585
                                   (1.07s)         (3.49s)         (3.80s)
instructions                       12,609,600,094  11,827,675,782  12,036,010,638
branches                           2,303,416,221   2,671,184,833   2,723,414,574
branch-misses                      0.00%           7.94%           8.84%
cache-misses                       3,012,613       3,055,722       3,076,316
L1-icache-load-misses              11,416,391      12,112,703      11,896,077
icache_tag.stalls                  1,553,521       1,364,092       1,896,066
itlb_misses.stlb_hit               6,856           21,756          22,600
itlb_misses.walk_completed         14,430          4,454           15,084
baclears.any                       131,573         140,355         131,644
int_misc.clear_resteer_cycles      2,545,915       586,578,125     679,021,993
machine_clears.count               22,235          39,671          37,307
dsb2mite_switches.penalty_cycles   6,985,838       12,929,675      8,405,493
frontend_retired.any_dsb_miss      28,785,677      28,161,724      28,093,319
idq.dsb_cycles_any                 1,986,038,896   5,683,820,258   5,971,969,906
idq.dsb_uops                       11,149,445,952  26,438,051,062  28,622,657,650
idq.mite_uops                      207,881,687     216,734,007     212,003,064


Above data shows:
  o O3/PGO lead to a *2.3x/2.6x* performance drop relative to O2, respectively.
  o O3/PGO reduced instructions by 6.2% and 4.5%. I attribute this to
    aggressive inlining.
  o O3/PGO introduced very bad branch prediction. I will explain it later.
  o Code built with O3 has a high iTLB miss count but a much lower sTLB miss
    count. This is not what I expected.
  o O3/PGO introduced 78% and 68% more machine clears. This is interesting and
    I don't know why. (subcategory MC is not measured yet)
  o O3 has much higher dsb2mite_switches.penalty_cycles than O2/PGO.
  o The idq.mite_uops of O3/PGO increased 4%, while idq.dsb_uops increased 2x.
    The DSB hits well, so frontend fetching and decoding are not a problem for
    O3/PGO.
  o Other events are mainly affected by branch misprediction.
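
Since the subject of this thread is predicated instructions versus branches, the two ways of expressing a data-dependent decision can be sketched as below. This is a generic illustration, not code from huffman.c; the function names are made up, and the compiler is free to turn either form into the other depending on its heuristics.

```cpp
#include <cstddef>
#include <cstdint>

// Branchy form: the increment is guarded by a data-dependent branch,
// which predicts poorly on random input.
std::uint64_t count_at_least_branchy(const unsigned char* data,
                                     std::size_t n, unsigned threshold) {
  std::uint64_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i] >= threshold)
      ++count;
  }
  return count;
}

// Branchless form: the comparison result (0 or 1) is added directly,
// which maps naturally onto setcc/cmov-style predicated code.
std::uint64_t count_at_least_branchless(const unsigned char* data,
                                        std::size_t n, unsigned threshold) {
  std::uint64_t count = 0;
  for (std::size_t i = 0; i < n; ++i)
    count += static_cast<std::uint64_t>(data[i] >= threshold);
  return count;
}
```

Both functions return the same count for any input. Only the generated control flow may differ, which is what the branch-miss counters in the table reflect.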

Additionally, here is the TMA level 2 analysis: The main changes in the pipeline
slots are in the Bad Speculation and Frontend Bound categories. I doubt the
accuracy of tma_fetch_bandwidth given the frontend_retired.any_dsb_miss and
idq.mite_uops data above.

$ perf stat --topdown --td-level=2 --cputype core -- taskset -c 0 ./huffman test.data
test.data.huf is 1.00% of test.data

 Performance counter stats for 'taskset -c 0 ./huffman test.data':

 %  tma_branch_mispredicts     0.0
 %  tma_core_bound             0.8
 %  tma_heavy_operations      11.4
 %  tma_light_operations      76.8
 %  tma_memory_bound           2.0
 %  tma_fetch_bandwidth        8.3
 %  tma_fetch_latency          0.8
 %  tma_machine_clears         0.0

   1.073381357 seconds time elapsed

   0.945233000 seconds user
   0.095719000 seconds sys


$ perf stat --topdown --td-level=2 --cputype core -- taskset -c 0 ./huffman.O3 test.data
test.data.huf is 1.00% of test.data

 Performance counter stats for 'taskset -c 0 ./huffman.O3 test.data':

 %  tma_branch_mispredicts    38.2
 %  tma_core_bound             6.6
 %  tma_heavy_operations       3.5
 %  tma_light_operations      21.7
 %  tma_memory_bound           0.9
 %  tma_fetch_bandwidth       20.9
 %  tma_fetch_latency          7.5
 %  tma_machine_clears         0.8

   3.501875873 seconds time elapsed

   3.378572000 seconds user
   0.084163000 seconds sys


$ perf stat --topdown --td-level=2 --cputype core -- taskset -c 0 ./huffman.pgo test.data
test.data.huf is

[Bug target/110625] [AArch64] Vect: SLP fails to vectorize a loop as the reduction_latency calculated by new costs is too large

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110625

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Hao Liu :

https://gcc.gnu.org/g:bf67bf4880ce5be0b6e48c7c35828528b7be12ed

commit r14-2877-gbf67bf4880ce5be0b6e48c7c35828528b7be12ed
Author: Hao Liu 
Date:   Mon Jul 31 20:53:37 2023 +0800

AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

The new costs should only count reduction latency by multiplying count for
single_defuse_cycle.  For other situations, this will increase the reduction
latency a lot and miss vectorization opportunities.

Tested on aarch64-linux-gnu.

gcc/ChangeLog:

PR target/110625
* config/aarch64/aarch64.cc (count_ops): Only '* count' for
single_defuse_cycle while counting reduction_latency.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr110625_1.c: New testcase.
* gcc.target/aarch64/pr110625_2.c: New testcase.
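
The rule in the commit message above can be sketched roughly as follows. This is
only an illustration, not the code in config/aarch64/aarch64.cc: the function
name reduction_latency_for and its parameters are hypothetical, chosen to mirror
the wording of the ChangeLog entry.

```cpp
// Hypothetical helper, not taken from aarch64.cc: returns the latency that a
// reduction statement contributes to the loop-carried dependency chain.
unsigned reduction_latency_for(unsigned base_latency, unsigned count,
                               bool single_defuse_cycle) {
  // Only with a single def-use cycle do all `count` copies of the statement
  // feed one accumulator in sequence, so only then is the latency scaled.
  if (single_defuse_cycle)
    return base_latency * count;
  // Otherwise the copies are independent and the latency is not multiplied.
  return base_latency;
}
```

With a base latency of 2 and a count of 4, the sketch yields 8 for a single
def-use cycle and 2 otherwise, which shows how the unconditional multiply could
inflate the estimate and make vectorization look unprofitable.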

[Bug c++/110848] Consider enabling -Wvla by default in non-GNU C++ modes

2023-07-31 Thread aaron at aaronballman dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110848

--- Comment #8 from Aaron Ballman  ---
(In reply to Richard Biener from comment #7)
> I think -std=c++XY should diagnose (at least with a warning) the use of GNU
> extensions.  Let me alter the summary and confirm.

Thanks! I still think this should be diagnosed in all language modes due to the
ease of accidental usage along with the feature's security concerns, but at
least getting it diagnosed by default in C++ language modes is a step in the
right direction. Some more evidence of the security concerns (VLAs in general,
not specific to C++):

https://nvd.nist.gov/vuln/detail/CVE-2015-5147
https://nvd.nist.gov/vuln/detail/CVE-2020-11203
https://nvd.nist.gov/vuln/detail/CVE-2021-3527

That said, it sounds like GCC maintainers feel (at least somewhat) strongly
that this extension should not be diagnosed by default in GNU mode. I think
Clang can follow suit so that there are fewer problems for folks porting between
the two compilers. But we've recently started being more aggressive about
diagnosing things that have security implications in C and C++ because of
warnings to not use these languages due to poor security practices and lack of
coverage with tooling:

https://advocacy.consumerreports.org/wp-content/uploads/2023/01/Memory-Safety-Convening-Report-1-1.pdf
https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY.PDF

I think VLA usage in C++ meets the bar as something to be more aggressive with
warning users about. It's not that the extension is broken, it's that it's very
often a surprise you're using the extension in the first place. It's
unfortunate to have to opt out of diagnostics about an extension you're
intentionally using; IMO, it's more unfortunate to have a CVE for your product
due to accidentally using an extension you weren't aware of.
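
One common way to end up with a VLA by accident can be sketched as below. The
example is an illustration and does not come from the bug report; the function
names are hypothetical. It assumes a compiler that accepts VLAs in C++ as an
extension (GCC and Clang do), and -Wvla diagnoses the marked line.

```cpp
#include <cstddef>
#include <vector>

// `n` is const, but it is initialized from a run-time value, so it is not a
// constant expression. The array below is therefore a VLA (a GNU extension
// in C++), and its stack usage is decided at run time.
std::size_t accidental_vla_bytes(int runtime_value) {
  const int n = runtime_value;
  int buf[n];          // -Wvla warns here
  return sizeof(buf);  // evaluated at run time because buf is a VLA
}

// Standard alternative: heap storage with an explicit size.
std::size_t vector_bytes(int runtime_value) {
  std::vector<int> buf(static_cast<std::size_t>(runtime_value));
  return buf.size() * sizeof(int);
}
```

The declaration looks like an ordinary fixed-size array, which is why the use
of the extension is easy to miss without a diagnostic.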

[Bug middle-end/110857] New: aarch64-linux-gnu profiledbootstrap broken

2023-07-31 Thread prathamesh3492 at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857

Bug ID: 110857
   Summary: aarch64-linux-gnu profiledbootstrap broken
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Bootstrapping gcc with profiledbootstrap results in the following failure:

during GIMPLE pass: ivcanon
../../gcc/gcc/cfgrtl.cc: In function ‘bool could_fall_through(basic_block,
basic_block)’:
../../gcc/gcc/cfgrtl.cc:670:1: internal compiler error: in operator>, at
profile-count.h:995
  670 | could_fall_through (basic_block src, basic_block target)
  | ^~
0xc6c89f profile_count::operator>(profile_count const&) const
../../gcc/gcc/profile-count.h:995
0xc6c89f profile_count::operator>(profile_count const&) const
../../gcc/gcc/profile-count.h:987
0xc6c89f update_loop_exit_probability_scale_dom_bbs(loop*, edge_def*,
profile_count)
../../gcc/gcc/cfgloopmanip.cc:641
0xc6cb2b scale_loop_profile(loop*, profile_probability, long)
../../gcc/gcc/cfgloopmanip.cc:776
0x1338a5f try_unroll_loop_completely
../../gcc/gcc/tree-ssa-loop-ivcanon.cc:927
0x1338a5f canonicalize_loop_induction_variables
../../gcc/gcc/tree-ssa-loop-ivcanon.cc:1274
0x13396cf canonicalize_induction_variables()
../../gcc/gcc/tree-ssa-loop-ivcanon.cc:1317
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

This seems most likely to be caused by:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=88618fa0211d77d91b70f7af9b02e08a34b57912

Thanks,
Prathamesh

[Bug tree-optimization/110838] [14 Regression] wrong code on x365-3.5, -O3, sign extraction

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110838

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #6 from Richard Biener  ---
I have a patch.

[Bug target/110309] Wrong code for masked load expansion

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110309

Richard Biener  changed:

   What|Removed |Added

  Known to work||13.1.1, 14.0
   Target Milestone|--- |11.5
  Known to fail||13.1.0
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Richard Biener  ---
Fixed.

[Bug tree-optimization/88540] Issues with vectorization of min/max operations

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug rtl-optimization/94864] Failure to combine vunpckhpd+movsd into single vunpckhpd

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94864
Bug 94864 depends on bug 88540, which changed state.

Bug 88540 Summary: Issues with vectorization of min/max operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/88540] Issues with vectorization of min/max operations

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Richard Biener  ---
Both issues are now fixed.

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 88540, which changed state.

Bug 88540 Summary: Issues with vectorization of min/max operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/110062] missed vectorization in graphicsmagick

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110062

--- Comment #10 from Richard Biener  ---
We now also apply SLP when vectorizing the loop, but as said, the high VF is
probably prohibitive and causes quite a bit of spilling:

.L7:
vmovdqu (%r14), %ymm2
vmovdqu 32(%r14), %ymm1
subq    $-128, %r14
subq    $-128, %rdx
vmovups -128(%rdx), %ymm10
vmovdqu -64(%r14), %ymm0
vpshufb .LC7(%rip), %ymm2, %ymm4
vmovups -96(%rdx), %ymm9
vmovups -64(%rdx), %ymm8
vpshufb .LC8(%rip), %ymm1, %ymm3
vpermq  $78, %ymm4, %ymm4
vpermq  $78, %ymm3, %ymm3
...
vmulps  %ymm7, %ymm0, %ymm0
vaddps  136(%rsp), %ymm0, %ymm7
vaddps  %ymm3, %ymm15, %ymm15
vmovaps %ymm4, 168(%rsp)
vmovaps %ymm7, 136(%rsp)
cmpq    %r13, %r14
jne .L7

Maybe we should more aggressively reject vectorization when the VF is
equal to the smallest element number of vector lanes.  When we then
also detect SLP this usually means BB-level SLP can do something.
Note we fail to support V2SF -> V2QI now, not sure what changed here.
vectorizable_conversion doesn't support float->int->short->char but
only either float->char, float->int->char or float->short->char, but
at least for 2-element vectors we don't support these (the vectorizer
could support extra intermediate steps as well).

[Bug tree-optimization/110280] [13 Regression] internal compiler error: in const_unop, at fold-const.cc:1884

2023-07-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110280

--- Comment #17 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Prathamesh Kulkarni :

https://gcc.gnu.org/g:f4029de35fb1b293a4fd586574b1b4b73ddf7880

commit r13-7661-gf4029de35fb1b293a4fd586574b1b4b73ddf7880
Author: Prathamesh Kulkarni 
Date:   Wed Jul 26 22:36:26 2023 +0530

[aarch64/match.pd] Fix ICE observed in PR110280.

gcc/ChangeLog:
PR tree-optimization/110280
* match.pd (vec_perm_expr(v, v, mask) -> v): Explicitly build vector
using build_vector_from_val with the element of input operand, and
mask's type if operand and mask's types don't match.

gcc/testsuite/ChangeLog:
PR tree-optimization/110280
* gcc.target/aarch64/sve/pr110280.c: New test.

(cherry picked from commit 85d8e0d8d5342ec8b4e6a54e22741c30b33c6f04)

[Bug c++/110856] New: GCC rejects template alias of function type as invalid template parameter

2023-07-31 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110856

Bug ID: 110856
   Summary: GCC rejects template alias of function type as invalid
template parameter
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

GCC rejects the following valid case:

#include 
#include 
#include 

template
using identity = T;

template
using MakeFunType = decltype(
  [](std::index_sequence) {
return std::type_identity...)>{};
  }(std::make_index_sequence{})
)::type;

template
class S {
  using F1 = std::function>; // ok
  using F2 = std::function>; // syntax error
};

https://godbolt.org/z/fc638b4Y1

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-07-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #17 from Richard Biener  ---
(In reply to Alexander Monakov from comment #16)
> In C11 and C++11 the issue of compiler-introduced racing loads is discussed
> as follows (5.1.2.4 Multi-threaded executions and data races in C11):
> 
> 28 NOTE 14 Transformations that introduce a speculative read of a
> potentially shared memory location may not preserve the semantics of the
> program as defined in this standard, since they potentially introduce a data
> race. However, they are typically valid in the context of an optimizing
> compiler that targets a specific machine with well-defined semantics for
> data races. They would be invalid for a hypothetical machine that is not
> tolerant of races or provides hardware race detection.
> 
> 
> So for TSan we'd allow hoisting only after TSan instrumentation, and for
> Helgrind we'd ask them to handle load-load-cmov combo as only consuming one
> of the loads?

Yes.  I think for testing optimized code (aka production code) Helgrind
is more useful while TSan could as well be used with -O0.  Note I've
never had the need to use either of them.

> I think the other optimizations from comment #8 introduce racing loads more
> rarely in practice, because they are limited to non-trapping accesses.

Yes, that's true.

Generally, given the above note from the standard, I wonder if we want to
have an option to control speculating loads (of "global" memory, i.e.
memory whose address possibly escapes to another thread?).

[Bug c++/110848] Consider enabling -Wvla by default in non-GNU C++ modes

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110848

Richard Biener  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-07-31
Summary|Consider enabling -Wvla by  |Consider enabling -Wvla by
   |default in C++ modes|default in non-GNU C++
   ||modes

--- Comment #7 from Richard Biener  ---
I think -std=c++XY should diagnose (at least with a warning) the use of GNU
extensions.  Let me alter the summary and confirm.

[Bug sanitizer/110799] [tsan] False positive due to -fhoist-adjacent-loads

2023-07-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799

--- Comment #16 from Alexander Monakov  ---
In C11 and C++11 the issue of compiler-introduced racing loads is discussed as
follows (5.1.2.4 Multi-threaded executions and data races in C11):

28 NOTE 14 Transformations that introduce a speculative read of a potentially
shared memory location may not preserve the semantics of the program as defined
in this standard, since they potentially introduce a data race. However, they
are typically valid in the context of an optimizing compiler that targets a
specific machine with well-defined semantics for data races. They would be
invalid for a hypothetical machine that is not tolerant of races or provides
hardware race detection.


So for TSan we'd allow hoisting only after TSan instrumentation, and for
Helgrind we'd ask them to handle load-load-cmov combo as only consuming one of
the loads?


I think the other optimizations from comment #8 introduce racing loads more
rarely in practice, because they are limited to non-trapping accesses.
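
The load-load-cmov combination mentioned above can be sketched at the source
level as follows. The struct and function names are hypothetical and the second
function only models the effect of -fhoist-adjacent-loads by hand; it is not
compiler output.

```cpp
struct Pair {
  int a;
  int b;
};

// Source form: exactly one of the two adjacent fields is read.
int select_field(const Pair *p, bool use_a) {
  if (use_a)
    return p->a;
  return p->b;
}

// Hand-written model of the hoisted form: both fields are loaded
// unconditionally and a conditional move picks one of them.
int select_field_hoisted(const Pair *p, bool use_a) {
  int a = p->a;  // speculative when use_a is false
  int b = p->b;  // speculative when use_a is true
  return use_a ? a : b;
}
```

Both functions return the same value, but the second one reads a field that the
source never asked for. If another thread is writing that field, a race
detector that instruments the hoisted code sees a racing load, which is the
false positive discussed in this bug.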

[Bug lto/110844] LTO sometimes fail with -save-temp -dumpdir options

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110844

--- Comment #3 from Richard Biener  ---
We shouldn't pass it twice either.

[Bug target/110762] [11/12/13 Regression] inappropriate use of SSE (or AVX) insns for v2sf mode operations

2023-07-31 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762

--- Comment #23 from Hongtao.liu  ---
(In reply to Uroš Bizjak from comment #22)
> It looks to me that partial vector half-float instructions have the same
> issue.

Yes, I'll take a look.

[Bug target/110843] [14 Regression] ICE in convert_insn, at config/i386/i386-features.cc:1438 since r14-2405-g4814b63c3c2326

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110843

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Richard Biener  ---
Fixed.

[Bug target/110762] [11/12/13 Regression] inappropriate use of SSE (or AVX) insns for v2sf mode operations

2023-07-31 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110762

--- Comment #22 from Uroš Bizjak  ---
It looks to me that partial vector half-float instructions have the same issue.

[Bug target/81904] FMA and addsub instructions

2023-07-31 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #7 from Hongtao.liu  ---

> 
> to .VEC_ADDSUB possibly loses exceptions (the vectorizer now directly
> creates .VEC_ADDSUB when possible).
Let's put it under -fno-trapping-math.

[Bug middle-end/110832] [14 Regression] 14% capacita -O2 regression between g:9fdbd7d6fa5e0a76 (2023-07-26 01:45) and g:ca912a39cccdd990 (2023-07-27 03:44) on zen3 and core

2023-07-31 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832

--- Comment #10 from Uroš Bizjak  ---
(In reply to Hongtao.liu from comment #9)
> for mov_internal, we can just set alternative (v,v) with mode DI, then
> it will use vmovq, for other alternatives which set sse_regs, the
> instructions has already cleared the upper bits.
Move instructions can be sanitized in ix86_expand_vector_move. If the target is
in V2SFmode and the source is a subreg register, then movq_v2sf_to_sse should
be emitted. However, we would still like to emit MOVAPS reg, reg for V2SF to
V2SF moves, because MOVAPS may be eliminated by hardware, while MOVQ won't be.

[Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-31 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

--- Comment #21 from Jan Hubicka  ---
Fixing loop distribution and vectorizer profile update seems to do the trick
with profile feedback. Without it, we are still worse than in July last year on
the zen2 tester (zen3 and Ice Lake seem to behave differently, perhaps due to
different vectorization decisions).

https://lnt.opensuse.org/db_default/v4/SPEC/graph?highlight_run=38536=476.180.0

shows two jumps last year.
g:d489ec082ea21410 (2022-06-30 16:46) and 3731dd0bea8994c3 (2022-07-04 00:16)
g:3731dd0bea8994c3 (2022-07-04 00:16) and 07dd0f7ba27d1fe9 (2022-07-05 14:05)

Both seem to differ from the patch listed (which is even older).
Visually, the second jump seems to be gone, but it is hard to tell a year
later.
Martin, it would be great to bisect these two.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-07-31 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 110758, which changed state.

Bug 110758 Summary: [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5ef113 (2023-07-16 00:16) and g:a5088dc3f5ef73c8 (2023-07-17 03:24)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5

2023-07-31 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758

Jan Hubicka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Jan Hubicka  ---
Fixing profile update in loop distribution fixed the regression and we get
a better result than before.  The first tester still shows a regression compared
to July last year, but we have PR106293 to track this.

[Bug c++/110855] New: std::source_location doesn't work with C++20 coroutine

2023-07-31 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110855

Bug ID: 110855
   Summary: std::source_location doesn't work with C++20 coroutine
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

When I try to use source_location to debug C++ coroutine:

  auto initial_suspend(const std::source_location location =
                           std::source_location::current()) {
    return std::suspend_never{};
  }

I found that GCC rejects this with:

  cc1plus: error: taking address of an immediate function 'static consteval 
  std::source_location std::source_location::current(__builtin_ret_type)'
  In function 'void bar(bar()::_Z3barv.Frame*)':

I'm not sure whether this is valid code, although other compilers accept it.

https://godbolt.org/z/qddrsvsT9

[Bug target/81904] FMA and addsub instructions

2023-07-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #6 from Richard Biener  ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Richard Biener from comment #1)
> > Hmm, I think the issue is we see
> > 
> > f (__m128d x, __m128d y, __m128d z)
> > {
> >   vector(2) double _4;
> >   vector(2) double _6;
> > 
> >[100.00%]:
> >   _4 = x_2(D) * y_3(D);
> >   _6 = __builtin_ia32_addsubpd (_4, z_5(D)); [tail call]
> We can fold the builtin into .VEC_ADDSUB, and optimize MUL + VEC_ADDSUB ->
> VEC_FMADDSUB in match.pd?

I think MUL + .VEC_ADDSUB can be handled in the FMA pass.  For my example
above we early (before FMA recog) get

  _4 = x_2(D) * y_3(D);
  tem2_7 = _4 + z_6(D);
  tem3_8 = _4 - z_6(D);
  _9 = VEC_PERM_EXPR ;

we could recognize that as .VEC_ADDSUB.  I think we want to avoid doing
this too early, not sure if doing this within the FMA pass itself will
work since we key FMAs on the mult but would need to key the addsub
on the VEC_PERM (we are walking stmts from BB start to end).  Looking
at the code it seems changing the walking order should work.

Note matching

  tem2_7 = _4 + z_6(D);
  tem3_8 = _4 - z_6(D);
  _9 = VEC_PERM_EXPR ;

to .VEC_ADDSUB possibly loses exceptions (the vectorizer now directly
creates .VEC_ADDSUB when possible).
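
The shapes discussed above can be modelled with plain two-element arrays. This
is a scalar sketch for illustration, not GIMPLE and not what the FMA pass does.
The function names are hypothetical, and the lane choice (subtract in lane 0,
add in lane 1) is an assumption taken from the addsubpd instruction, since the
permute selector is not shown in the dump above.

```cpp
#include <array>

using V2 = std::array<double, 2>;

// Lane semantics assumed from addsubpd: subtract in lane 0, add in lane 1.
V2 addsub(const V2 &a, const V2 &b) {
  return {a[0] - b[0], a[1] + b[1]};
}

// The add / sub / permute form: both full-width results are computed and the
// permute keeps one lane of each. The discarded lanes (sum[0] and diff[1])
// are still evaluated here, so they can raise floating-point exceptions that
// a direct addsub never would.
V2 addsub_via_perm(const V2 &a, const V2 &b) {
  V2 sum = {a[0] + b[0], a[1] + b[1]};
  V2 diff = {a[0] - b[0], a[1] - b[1]};
  return {diff[0], sum[1]};
}

// A multiply feeding addsub, the shape a fused multiply-addsub would cover
// (ignoring the single rounding of a real fused operation).
V2 mul_addsub(const V2 &x, const V2 &y, const V2 &z) {
  V2 prod = {x[0] * y[0], x[1] * y[1]};
  return addsub(prod, z);
}
```

The kept lanes of addsub_via_perm match addsub exactly; only the work done in
the discarded lanes differs, which is where the exception behaviour can change.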

[Bug target/110478] RISC-V multilib gcc zicsr in the -march causing incorrect libgcc to be used

2023-07-31 Thread bmeng.cn at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110478

--- Comment #7 from Bin Meng  ---
Any update about this issue?
