[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

--- Comment #6 from Kito Cheng  ---
> There needs to be a -Wabi warning for this too for the change between 
> versions.

This bug only happens on trunk, and GCC 13 is OK, so there is no change between
released versions to warn about, I think?

[Bug target/113193] [SH] ICE in gen_reg_rtx, at emit-rtl.cc:1177 with -mfcsa -funsafe-math-operations

2024-01-04 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113193

Oleg Endo  changed:

   What|Removed |Added

  Known to fail||13.2.1
Version|14.0|13.0
 Status|UNCONFIRMED |NEW
   Keywords||ra
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-05

--- Comment #1 from Oleg Endo  ---
I was able to reproduce the bug with the following compiler options on 13
branch:

-x c++ -std=c++17 -O3 -m4-single -ml -mfsca -mfsrra -funsafe-math-optimizations


combined.cpp: In function 'void transform(const model_transform&, const vec3*,
int, float*)':
combined.cpp:574:1: internal compiler error: in gen_reg_rtx, at
emit-rtl.cc:1171
0x61da53 gen_reg_rtx(machine_mode)
../../gcc/gcc/emit-rtl.cc:1171
0x9e8cc1 copy_to_reg(rtx_def*)
../../gcc/gcc/explow.cc:622
0x9d7727 operand_subword_force(rtx_def*, poly_int<1u, unsigned long>,
machine_mode)
../../gcc/gcc/emit-rtl.cc:1812
0xa0eddc emit_move_multi_word
../../gcc/gcc/expr.cc:4129
0xa12bdb gen_move_insn(rtx_def*, rtx_def*)
../../gcc/gcc/expr.cc:4359
0xccfcda gen_reload
../../gcc/gcc/reload1.cc:8614
0xcd8956 emit_output_reload_insns
../../gcc/gcc/reload1.cc:7667
0xcd8956 do_output_reload
../../gcc/gcc/reload1.cc:7939
0xcd8956 emit_reload_insns
../../gcc/gcc/reload1.cc:8003
0xcd8956 reload_as_needed
../../gcc/gcc/reload1.cc:4543
0xcdc460 reload(rtx_insn*, int)
../../gcc/gcc/reload1.cc:1047
0xb78004 do_reload
../../gcc/gcc/ira.cc:5975
0xb78004 execute
../../gcc/gcc/ira.cc:6149


With added -mlra:

combined.cpp: In function 'void transform(const model_transform&, const vec3*,
int, float*)':
combined.cpp:574:1: internal compiler error: maximum number of generated reload
insns per insn achieved (90)
0xbd0699 lra_constraints(bool)
../../gcc/gcc/lra-constraints.cc:5258
0xbbc182 lra(_IO_FILE*)
../../gcc/gcc/lra.cc:2375
0xb77999 do_reload
../../gcc/gcc/ira.cc:5963
0xb77999 execute
../../gcc/gcc/ira.cc:6149

[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-05
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #5 from Andrew Pinski  ---
.

[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

--- Comment #4 from Lehua Ding  ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Lehua Ding from comment #2)
> > (In reply to Andrew Pinski from comment #1)
> > > There needs to be a -Wabi warning for this too for the change between
> > > versions.
> > 
> > I'm more inclined to think of it as a bug, since the vector ABI
> > specification is still being worked out and isn't stable, and code that
> > currently uses vector arguments gets a warning that the ABI is unstable.
> 
> Yes but other targets had bugs in argument passing and now throw a warning
> about the change ... Plus there is no current -Wpsabi warning in previous
> versions of GCC 

OK, makes sense.

[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

--- Comment #3 from Andrew Pinski  ---
(In reply to Lehua Ding from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > There needs to be a -Wabi warning for this too for the change between
> > versions.
> 
> I'm more inclined to think of it as a bug, since the vector ABI
> specification is still being worked out and isn't stable, and code that
> currently uses vector arguments gets a warning that the ABI is unstable.

Yes but other targets had bugs in argument passing and now throw a warning
about the change ... Plus there is no current -Wpsabi warning in previous
versions of GCC 

[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

--- Comment #2 from Lehua Ding  ---
(In reply to Andrew Pinski from comment #1)
> There needs to be a -Wabi warning for this too for the change between
> versions.

I'm more inclined to think of it as a bug, since the vector ABI specification
is still being worked out and isn't stable, and code that currently uses vector
arguments gets a warning that the ABI is unstable.

[Bug ipa/112783] core dump on libxo when function is inlined

2024-01-04 Thread jiangchuanganghw at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112783

Jiang ChuanGang  changed:

   What|Removed |Added

 CC||jiangchuanganghw at outlook dot com

--- Comment #9 from Jiang ChuanGang  ---
(In reply to Andrew Pinski from comment #3)
> The problem is not in GCC but rather a bad assumption on the code part.
> 
> Basically we have:
> 
> memcpy(nbuf, name, nlen);
> ...
> 
> 
> if (!name)
> ...
> 
> 
> But the check after a memcpy can be removed as passing a null pointer to
> memcpy is undefined even if nlen is 0 here.
> 
> 
> This patch to the sources fixes the issue for me:
> ```
> diff --git a/libxo/libxo.c b/libxo/libxo.c
> index 916a111..ea71723 100644
> --- a/libxo/libxo.c
> +++ b/libxo/libxo.c
> @@ -4300,7 +4300,8 @@ xo_format_value (xo_handle_t *xop, const char *name,
> ssize_t nlen,
> if ((xsp->xs_flags & (XSF_EMIT | XSF_EMIT_KEY))
> || !(xsp->xs_flags & XSF_EMIT_LEAF_LIST)) {
> char nbuf[nlen + 1];
> -   memcpy(nbuf, name, nlen);
> +if (name)
> + memcpy(nbuf, name, nlen);
> nbuf[nlen] = '\0';
> 
> ssize_t rc = xo_transition(xop, 0, nbuf, XSS_EMIT_LEAF_LIST);
> @@ -4324,7 +4325,8 @@ xo_format_value (xo_handle_t *xop, const char *name,
> ssize_t nlen,
> 
> } else if (!(xsp->xs_flags & XSF_EMIT_KEY)) {
> char nbuf[nlen + 1];
> -   memcpy(nbuf, name, nlen);
> +if (name)
> + memcpy(nbuf, name, nlen);
> nbuf[nlen] = '\0';
> 
> ssize_t rc = xo_transition(xop, 0, nbuf, XSS_EMIT);
> @@ -4342,7 +4344,8 @@ xo_format_value (xo_handle_t *xop, const char *name,
> ssize_t nlen,
> if ((xsp->xs_flags & XSF_EMIT_LEAF_LIST)
> || !(xsp->xs_flags & XSF_EMIT)) {
> char nbuf[nlen + 1];
> -   memcpy(nbuf, name, nlen);
> +if (name)
> + memcpy(nbuf, name, nlen);
> nbuf[nlen] = '\0';
> 
> ssize_t rc = xo_transition(xop, 0, nbuf, XSS_EMIT);
> 
> ```

I've encountered the same bug, and your solution does fix it.
But strangely enough, I can't reproduce it with code like the following.
The exact conditions that trigger this bug still puzzle me. Do you have any
thoughts on this?


#include <stdio.h>
#include <string.h>

static const char *xo_xml_leader_len(const char *name, int len)
{
    if (name == NULL || name[0] == '_')
        return "";
    return "_";
}

void test()
{
    char *name = NULL;
    int len = 0;
    char mbuf[len + 1];
    memcpy(mbuf, name, len);
    mbuf[len] = '\0';
    const char *leader = xo_xml_leader_len(name, len);
    printf("leader: %s", leader);
}

int main()
{
    test();
    return 0;
}
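
For reference, the guarded pattern from the patch in comment #3 can be sketched
in isolation as follows. This is only a minimal illustration: copy_name and
name_is_missing are hypothetical helpers, not libxo functions, and the fixed
32-byte buffer is an assumption made to keep the example short. Whether a given
compiler removes a null check placed after an unguarded memcpy depends on
inlining and optimization decisions, so a small test may well not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from libxo: copy nlen bytes of name into
   nbuf and NUL-terminate it.  memcpy is skipped when name is NULL, so no
   null pointer ever reaches memcpy, even when nlen is 0. */
static void copy_name(char *nbuf, const char *name, size_t nlen)
{
    assert(nbuf != NULL);
    if (name != NULL)
        memcpy(nbuf, name, nlen);
    nbuf[nlen] = '\0';
}

/* Returns 1 when name is NULL, 0 otherwise.  Because memcpy is only called
   with a non-null name, the compiler has no basis for assuming that name is
   non-null at this point. */
static int name_is_missing(const char *name, size_t nlen)
{
    char nbuf[32];

    if (nlen >= sizeof nbuf)
        nlen = sizeof nbuf - 1;   /* keep the sketch within the buffer */
    copy_name(nbuf, name, nlen);
    return name == NULL;
}
```

With the guard in place, the later check on name stays meaningful for both
null and non-null callers.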

[Bug target/113240] Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #1 from Andrew Pinski  ---
There needs to be a -Wabi warning for this too for the change between versions.

[Bug target/113240] New: Use wrong rule to pass fixed-length(size<=2*XLEN) vector argument

2024-01-04 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113240

Bug ID: 113240
   Summary: Use wrong rule to pass fixed-length(size<=2*XLEN)
vector argument
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lehua.ding at rivai dot ai
  Target Milestone: ---

According to a recent proposal
(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/416), GCC
incorrectly passes fixed-length vector arguments with size less than or equal
to 2*XLEN by reference instead of passing them in scalar registers.
Reproduced online: https://godbolt.org/z/3ooovcz7c

C Code:
```
#include 

typedef int v4si __attribute__ ((vector_size (16)));

v4si foo (v4si a, v4si b)
{
  v4si c = a + b;
  return c;
}
```

Asm:
```
foo:
        vsetivli        zero,4,e32,m1,ta,ma
        vle32.v v1,0(a1)
        vle32.v v2,0(a2)
        vadd.vv v1,v1,v2
        vse32.v v1,0(a0)
        ret
```
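
The rule described above can be sketched as a small classifier. This is a
simplified reading of the proposal, not GCC's implementation: the names
pass_kind and classify_fixed_vector are hypothetical, the split between one
and two registers is an assumption borrowed from how small aggregates are
usually handled, and register availability is ignored.

```c
/* Hypothetical classification, for illustration only. */
enum pass_kind { PASS_IN_ONE_GPR, PASS_IN_TWO_GPRS, PASS_BY_REFERENCE };

/* Classify a fixed-length vector argument by its size in bits relative to
   XLEN: anything up to 2*XLEN goes in scalar registers, anything larger is
   passed by reference. */
static enum pass_kind classify_fixed_vector(unsigned size_bits,
                                            unsigned xlen_bits)
{
    if (size_bits <= xlen_bits)
        return PASS_IN_ONE_GPR;
    if (size_bits <= 2 * xlen_bits)
        return PASS_IN_TWO_GPRS;
    return PASS_BY_REFERENCE;
}
```

Under this reading, the 16-byte v4si from the test case (128 bits) on RV64
(XLEN = 64) would be expected in scalar registers rather than behind the
pointers in a1 and a2 seen in the assembly above.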

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread yangyujie at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

--- Comment #6 from Yang Yujie  ---
(In reply to Xi Ruoyao from comment #4)
> (In reply to Jan Hubicka from comment #3)
> > > Confirm.  But option save/restore has been always implemented:
> > > 
> > > .section .gnu.lto_.opts,"",@progbits
> > > .ascii  "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection"
> > > .ascii  "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m"
> > > .ascii  "simd=lasx' '-mcmodel=normal' '-mtune=loongarch64' '-flto"
> > > .ascii  "'\000"
> > > 
> > > So -msimd=lasx is correctly recorded.  Not sure why it does not work.
> > 
> > With LTO we need to mix code compiled with different sets of options.
> > For this reason we imply for every function definition an optimization
> > and target attribute which records the flags.  So it seems the target
> > attribute is likely broken for this flag.
> 
> Target attribute is not implemented for LoongArch.  And I don't think it's a
> good idea to implement it in stage 3.

Yes, target attribute may have to wait.  But save/restore can be implemented
without target attributes of functions.  By marking options as "Save" in .opt
or implementing custom TARGET_OPTION_{SAVE,RESTORE} hooks, we can stream the
target configuration (which may come from the command line / the target
attributes / #pragma GCC target) into the per-function LTO bytecode, so that
lto1 can pick up and use them later when generating code for each function.

[Bug demangler/86152] Failure to demange clone names with digits

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86152

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed for GCC 12 by r12-6154-gbe674bdd11d5fa .

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-04 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #28 from JuzheZhong  ---
(In reply to Patrick O'Neill from comment #27)
> Linking the discussion/plan here since more interested people are CCd here.
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206#c9
> Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on:
> zvl128b (All runtime fails):
> 527.cam4 (Runtime)
> 531.deepsjeng (Runtime)
> 521.wrf (Runtime)
> 523.xalancbmk (Runtime)
> 
> zvl256b:
> 507.cactuBSSN (Runtime)
> 521.wrf (Build)
> 527.cam4 (Runtime)
> 531.deepsjeng (Runtime)
> 549.fotonik3d (Runtime)
> 
> With that info I think the next steps are:
> 1. Triage the zvl256b 521.wrf build failure
> 2. Bisect the newly-failing testcases
> 3. Finish triaging the remaining testcases the fuzzer found
> 4. Attempt to manually reduce cam4 for zvl128b (since it seems to have the
> fastest build+runtime)
> 5. Attempt to manually reduce other fails.

Please reduce cam4 for zvl128b first. No need to care about the wrf build
failure. We already know the reason: it's a middle-end issue which will take
some time.

[Bug target/113156] AVR build broken due to ICE while compiling libgcc, started with r14-6201-gf0a90c7d7333fc

2024-01-04 Thread paulbendixen at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113156

Paul M. Bendixen  changed:

   What|Removed |Added

 CC||paulbendixen at gmail dot com

--- Comment #3 from Paul M. Bendixen  ---
I can confirm this also happens for me.

I have found the same commit to be the problem and have tried with both gcc 11
and 12 
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
gcc-12 (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0

For reproducibility it was configured using:
../configure --prefix=$PREFIX --target=avr --enable-languages=c,c++
--disable-nls --disable-libssp --with-dwarf2

The error only seems to happen when compiling the long64 part of the multilib
(or at least first), the parts compiled for avrXX and xmegaX targets run fine.

[Bug demangler/86152] Failure to demange clone names with digits

2024-01-04 Thread ssbssa at yahoo dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86152

Hannes Domani  changed:

   What|Removed |Added

 CC||ssbssa at yahoo dot de

--- Comment #1 from Hannes Domani  ---
Looks like this was fixed by this commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=be674bdd11d5fa6b20d469e6d6f43c26da9e744f

[Bug tree-optimization/113239] [13 regression] After 822a11a1e64, bogus -Warray-bounds warnings in std::vector

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113239

--- Comment #1 from Andrew Pinski  ---
So I suspect it is either inlining differences due to slightly increased sizes
in some cases or jump threading due to the extra check. I highly doubt that
patch is the underlying cause of the warning ...

[Bug tree-optimization/113186] [13/14 Regression] `(a^c) & (a^!c)` is not optimized to 0 for bool

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113186

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/113186] [13/14 Regression] `(a^c) & (a^!c)` is not optimized to 0 for bool

2024-01-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113186

--- Comment #5 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:97def769e6b28832f5ba4087d6fcdd44e18bf005

commit r14-6927-g97def769e6b28832f5ba4087d6fcdd44e18bf005
Author: Andrew Pinski 
Date:   Sun Dec 31 16:38:30 2023 -0800

Match: Improve inverted_equal_p for bool and `^` and `==` [PR113186]

For boolean types, `a ^ b` is a valid form for `a != b`. This means for
gimple_bitwise_inverted_equal_p, we catch some inverted value forms. This
patch extends inverted_equal_p to allow matching of `^` with the
corresponding `==`. Note in the testcase provided we used to optimize
in GCC 12 to just `return 0` where `a == b` was used,
this allows us to do that again.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/113186

gcc/ChangeLog:

* gimple-match-head.cc (gimple_bitwise_inverted_equal_p):
Match `^` with the `==` for 1bit integral types.
* match.pd (maybe_cmp): Allow for bit_xor for 1bit
integral types.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-bool-1.c: New test.

Signed-off-by: Andrew Pinski 
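
The two facts the commit message relies on can be checked exhaustively for
bool operands. The following is a small standalone sketch, not part of the
patch or of the new test; both function names are hypothetical.

```c
#include <stdbool.h>

/* Returns true if a ^ b has the same value as a != b for every pair of
   bool operands. */
static bool xor_equals_not_equal(void)
{
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            bool a = i, b = j;
            if ((a ^ b) != (a != b))
                return false;
        }
    return true;
}

/* Counts the bool inputs for which (a ^ c) & (a ^ !c) is nonzero. */
static int count_nonzero_results(void)
{
    int nonzero = 0;

    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            bool a = i, c = j;
            if ((a ^ c) & (a ^ !c))
                nonzero++;
        }
    return nonzero;
}
```

Since a ^ c and a ^ !c are each other's inverse for bool, their conjunction is
zero for all four input combinations, which is why the expression in the bug
title can fold to 0.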

[Bug testsuite/113238] [14] RISC-V: gcc.dg vect-tsvc flakey test timeouts when under heavy workload

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113238

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I see PR 112890 was already linked, and yes, I see some of these failing some
of the time, but those runs are with qemu and I have not seen them fail on real
HW yet.

[Bug c++/113239] New: [13 regression] After 822a11a1e64, bogus -Warray-bounds warnings in std::vector

2024-01-04 Thread dimitry--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113239

Bug ID: 113239
   Summary: [13 regression] After 822a11a1e64, bogus
-Warray-bounds warnings in std::vector
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dimi...@unified-streaming.com
  Target Milestone: ---

We noticed spurious warnings in some C++17 code compiled with g++ 13.2.0, and I
bisected it to commit 822a11a1e642e0abe92a996e7033a5066905a447 ("libstdc++: Do
not use memmove for 1-element ranges [PR108846]") for bug 108846 ("std::copy,
std::copy_n and std::copy_backward on potentially overlapping subobjects").

Reduced test case:

// g++ -std=c++17 -Wall -O2 -c testcase.cpp

#include <cstdint>
#include <vector>

struct frame_t
{
  uint64_t pts_;
  uint32_t timescale_;
  std::vector<unsigned char> data_;
};

struct frame_source_t
{
  virtual frame_t get() = 0;
};

struct frame_filter_t : frame_source_t
{
  frame_t get() override
  {
    if(current_frame_.data_.empty())
    {
      return current_frame_;
    }
    else
    {
      return frame_t();
    }
  }

  frame_t current_frame_;
};

frame_filter_t create_frame_filter()
{
  return frame_filter_t();
}

// EOT

With gcc-13-6371-ga41a56dee5c, this compiles without any warning. With
gcc-13-6372-g822a11a1e64 and later, up to gcc-14-6924-g00dea7e8c41 (master as
of 2024-01-04), you get:

In file included from
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/vector:62,
 from testcase.cpp:5:
In static member function 'static void std::__copy_move::__assign_one(_Tp*, _Up*) [with _Tp = unsigned
char; _Up = const unsigned char]',
inlined from 'static _Up* std::__copy_move<_IsMove, true,
std::random_access_iterator_tag>::__copy_m(_Tp*, _Tp*, _Up*) [with _Tp = const
unsigned char; _Up = unsigned char; bool _IsMove = false]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:441:20,
inlined from '_OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove =
false; _II = const unsigned char*; _OI = unsigned char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:507:30,
inlined from '_OI std::__copy_move_a1(_II, _II, _OI) [with bool _IsMove =
false; _II = const unsigned char*; _OI = unsigned char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:534:42,
inlined from '_OI std::__copy_move_a(_II, _II, _OI) [with bool _IsMove =
false; _II = __gnu_cxx::__normal_iterator >; _OI = unsigned char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:541:31,
inlined from '_OI std::copy(_II, _II, _OI) [with _II =
__gnu_cxx::__normal_iterator >; _OI
= unsigned char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:637:7,
inlined from 'static _ForwardIterator
std::__uninitialized_copy::__uninit_copy(_InputIterator, _InputIterator,
_ForwardIterator) [with _InputIterator = __gnu_cxx::__normal_iterator >; _ForwardIterator = unsigned
char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_uninitialized.h:147:27,
inlined from '_ForwardIterator std::uninitialized_copy(_InputIterator,
_InputIterator, _ForwardIterator) [with _InputIterator =
__gnu_cxx::__normal_iterator >;
_ForwardIterator = unsigned char*]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_uninitialized.h:185:15,
inlined from '_ForwardIterator std::__uninitialized_copy_a(_InputIterator,
_InputIterator, _ForwardIterator, allocator<_Tp>&) [with _InputIterator =
__gnu_cxx::__normal_iterator >;
_ForwardIterator = unsigned char*; _Tp = unsigned char]' at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_uninitialized.h:373:37,
inlined from 'std::vector<_Tp, _Alloc>::vector(const std::vector<_Tp,
_Alloc>&) [with _Tp = unsigned char; _Alloc = std::allocator]'
at
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_vector.h:603:31,
inlined from 'frame_t::frame_t(const frame_t&)' at testcase.cpp:7:8,
inlined from 'virtual frame_t frame_filter_t::get()' at testcase.cpp:25:14:
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/stl_algobase.h:399:17:
warning: array subscript 0 is outside array bounds of 'unsigned char [0]'
[-Warray-bounds=]
  399 | { *__to = *__from; }
  |   ~~^
In file included from
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/x86_64-pc-linux-gnu/bits/c++allocator.h:33,
 from
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/bits/allocator.h:46,
 from
/home/dim/ins/gcc-14-6924-g00dea7e8c41/include/c++/14.0.0/vector:63:
In member function '_Tp* std::__new_allocator<_Tp>::allocate(size_type, const
void*) [with _Tp = unsigned char]',
inlined from 'static _Tp* std::allocator_traits

[Bug testsuite/113238] [14] RISC-V: gcc.dg vect-tsvc flakey test timeouts when under heavy workload

2024-01-04 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113238

--- Comment #1 from Edwin Lu  ---
Debug log for one of the flakey tests

spawn -ignore SIGHUP
/github/patrick-postcommit-runner-2/_work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
-B/github/patrick-postcommit-runner-2/_work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
/github/patrick-postcommit-runner-2/_work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s351.c
-march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow
-fdiagnostics-plain-output -flto -ffat-lto-objects --param riscv-vector-abi
-ftree-vectorize -fno-tree-loop-distribute-patterns -fno-vect-cost-model
-fno-common -O2 -fdump-tree-vect-details --param vect-epilogues-nomask=0 -lm -o
./vect-tsvc-s351.exe
PASS: gcc.dg/vect/tsvc/vect-tsvc-s351.c -flto -ffat-lto-objects (test for
excess errors)
spawn riscv64-unknown-linux-gnu-run ./vect-tsvc-s351.exe
WARNING: program timed out.
FAIL: gcc.dg/vect/tsvc/vect-tsvc-s351.c -flto -ffat-lto-objects execution test

[Bug testsuite/113238] New: [14] RISC-V: gcc.dg vect-tsvc flakey test timeouts when under heavy workload

2024-01-04 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113238

Bug ID: 113238
   Summary: [14] RISC-V: gcc.dg vect-tsvc flakey test timeouts
when under heavy workload
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ewlu at rivosinc dot com
  Target Milestone: ---

The following tests are flakey on our post-commit ci
https://github.com/patrick-rivos/gcc-postcommit-ci/ for various vector targets. 

FAIL: gcc.dg/vect/tsvc/vect-tsvc-vpv.c 
FAIL: gcc.dg/vect/tsvc/vect-tsvc-vtv.c
FAIL: gcc.dg/vect/tsvc/vect-tsvc-s176.c
FAIL: gcc.dg/vect/tsvc/vect-tsvc-s351.c
FAIL: gcc.dg/vect/tsvc/vect-tsvc-s431.c
FAIL: gcc.dg/vect/tsvc/vect-tsvc-s1281.c


The test FAIL: gcc.dg/vect/tsvc/vect-tsvc-s1351.c might also be flakey as it
appeared several times but not as frequently as the rest.

[Bug middle-end/107436] Is -fsignaling-nans still experimental?

2024-01-04 Thread florian.schanda at bmw dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107436

--- Comment #8 from Florian Schanda  ---
I am no longer working at BMW.


For safety topics please contact alexander.schem...@bmw.de or
markus.schur...@bmw.de

For TRLC and LOBSTER topics please contact philipp.wullstein-kamm...@bmw.de or
create issues on public github
https://github.com/bmw-software-engineering/trlc/issues

[Bug tree-optimization/113237] [14 Regression] ICE verify_ssa failed when building 500.perlbench_r since r14-6822-g01f4251b8775c8

2024-01-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113237

Tamar Christina  changed:

   What|Removed |Added

   Priority|P3  |P1
   Last reconfirmed||2024-01-04
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |tnfchris at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #3 from Tamar Christina  ---
Thanks,

Indeed the patch for PR 113137 won't fix this one as it looks like the peeling
code has gotten confused about which exit is which when adjusting
virtual_operands.

It looks like it's swapped them, and this happens because none of the loop
exits is a counting one, so it just picks a random one.

It looks like the one it picks is not the latch-connected one:

perl.c:10:8: note:   using as main loop exit: 11 -> 7 [AUX: (nil)]
perl.c:10:8: note: === get_loop_niters ===
perl.c:10:8: note: Loop has 2 exits.
perl.c:10:8: note: Analyzing exit 0...
perl.c:10:8: note: Analyzing exit 1...

which then incorrectly peels:

 # iters_46 = PHI 

which should be:

 # iters_46 = PHI 

I started implementing a fix for this same situation earlier for PR 113178 but
didn't finish it because I didn't think we'd get this far with a legit loop.

I'll finish that part.  Thanks for the testcase!

[Bug libfortran/113223] NAMELIST internal write missing leading blank character

2024-01-04 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113223

--- Comment #4 from kargl at gcc dot gnu.org ---
(In reply to Jerry DeLisle from comment #3)
> Created attachment 56990 [details]
> Suggested patch including affected test cases
> 
> Regression tested OK.  Three test cases affected.
> 

Thanks, and whoops, sorry about the lack of regtesting.

The change looks simple enough that if you want to
backport, then go for it.

[Bug tree-optimization/113237] [14 Regression] ICE verify_ssa failed when building 500.perlbench_r since r14-6822-g01f4251b8775c8

2024-01-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113237

--- Comment #2 from Tamar Christina  ---
Ah wait, I see. Ok, taking a look.

[Bug tree-optimization/113237] [14 Regression] ICE verify_ssa failed when building 500.perlbench_r since r14-6822-g01f4251b8775c8

2024-01-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113237

--- Comment #1 from Tamar Christina  ---
> I have bisected the failure to r14-6822-g01f4251b8775c8 (middle-end: Support
> vectorization of loops with multiple exits).  I checked whether the patch
> attached to PR 113137 helps, but unfortunately it does not.

Indeed this should be fixed by the patch in PR 113136 not 113137 :)

[Bug testsuite/113226] [14 Regression] testsuite/std/ranges/iota/max_size_type.cc fails for cris-elf after r14-6888-ga138b99646a555

2024-01-04 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113226

--- Comment #3 from Hans-Peter Nilsson  ---
(In reply to Patrick Palka from comment #1)
> Huh, how bizarre.

Indeed.  I'm *not* ruling out an actual gcc bug.  Whether in the target or
middle-end this time I dare not guess; too few posts.

JFTR; I already mentioned this in the gcc-patches post: I see only posts on
gcc-testresults@ that include r14-6888-ga138b99646a555 for 64-bit-targets with
"-m32" multilibs, and I don't trust them to treat that hw_type the same.

[Bug tree-optimization/113237] [14 Regression] ICE verify_ssa failed when building 500.perlbench_r since r14-6822-g01f4251b8775c8

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113237

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org
   Keywords||ice-on-valid-code
   Target Milestone|--- |14.0

[Bug tree-optimization/113237] New: [14 Regression] ICE verify_ssa failed when building 500.perlbench_r since r14-6822-g01f4251b8775c8

2024-01-04 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113237

Bug ID: 113237
   Summary: [14 Regression] ICE verify_ssa failed when building
500.perlbench_r since r14-6822-g01f4251b8775c8
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: tnfchris at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

With a compiler configured with --enable-checking=yes and the following
testcase derived from 500.perlbench_r with -O3 -march=x86-64-v3 I get a
verify_ssa ICE:

$ cat test.c 
long Perl_pp_split_limit;
int Perl_block_gimme();
int Perl_pp_split() {
  char strend;
  long iters;
  int gimme = Perl_block_gimme();
  while (--Perl_pp_split_limit) {
    if (gimme)
      iters++;
    if (strend)
      break;
  }
  if (iters)
    return 0;
}

$ $PREFIX/gcc -O3 -march=x86-64-v3  -S test.c 
test.c: In function ‘Perl_pp_split’:
test.c:3:5: error: definition in block 4 does not dominate use in block 6
3 | int Perl_pp_split() {
  | ^
for SSA_NAME: vect_iters_12.12_110 in statement:
vect_iters_12.12_111 = PHI 
PHI argument
vect_iters_12.12_110
for PHI node
vect_iters_12.12_111 = PHI 
during GIMPLE pass: vect
test.c:3:5: internal compiler error: verify_ssa failed
0x129673f verify_ssa(bool, bool)
/home/mjambor/gcc/mine/src/gcc/tree-ssa.cc:1203
0xf0bcd5 execute_function_todo
/home/mjambor/gcc/mine/src/gcc/passes.cc:2095
0xf0c13e execute_todo
/home/mjambor/gcc/mine/src/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

I have bisected the failure to r14-6822-g01f4251b8775c8 (middle-end: Support
vectorization of loops with multiple exits).  I checked whether the patch
attached to PR 113137 helps, but unfortunately it does not.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang

2024-01-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Xi Ruoyao  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||xry111 at gcc dot gnu.org
   Last reconfirmed||2024-01-04
Summary|SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 |SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang
 Status|UNCONFIRMED |NEW

--- Comment #3 from Xi Ruoyao  ---
GCC trunk still gets around 200 MiB/s (on a Tiger Lake, but I've not used
-march) with -fno-semantic-interposition.

Confirmed, and I'm removing "on xxx" from the subject as the uarch seems
irrelevant.

[Bug tree-optimization/110176] [11/12/13/14 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

--- Comment #7 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #6)
> Started with r11-2446-g3e61a2056335ca7d4e2009823efae4ee2dc950ee

Note r10-9757-gec97d2e842022a3f112e27d6d8 is the backport to the GCC 10
branch.

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-04 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #27 from Patrick O'Neill  ---
Linking the discussion/plan here since more interested people are CCd here.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206#c9
Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on:
zvl128b (All runtime fails):
527.cam4 (Runtime)
531.deepsjeng (Runtime)
521.wrf (Runtime)
523.xalancbmk (Runtime)

zvl256b:
507.cactuBSSN (Runtime)
521.wrf (Build)
527.cam4 (Runtime)
531.deepsjeng (Runtime)
549.fotonik3d (Runtime)

With that info I think the next steps are:
1. Triage the zvl256b 521.wrf build failure
2. Bisect the newly-failing testcases
3. Finish triaging the remaining testcases the fuzzer found
4. Attempt to manually reduce cam4 for zvl128b (since it seems to have the
fastest build+runtime)
5. Attempt to manually reduce other fails.

[Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4

2024-01-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #2 from Xi Ruoyao  ---
The test file can be downloaded from
http://phoronix-test-suite.com/benchmark-files/smhasher-20220822.tar.xz.  Just
build it with cmake and run "./SMHasher --test=Speed sha3-256".  The building
system enables -O3 and LTO by default.

With GCC 13 I get about 180 MiB/s, but Clang 17 produces 250 MiB/s.

Part of the difference is caused by the different -fsemantic-interposition
default, if I pass -fno-semantic-interposition GCC 13 produces about 200 MiB/s.
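
As a rough illustration of what that default changes (a minimal sketch, not
taken from SMHasher; the names helper and api are hypothetical): when code is
built as position-independent code for a shared library,
-fsemantic-interposition makes GCC assume that an exported function may be
replaced at load time, so it does not inline it into other exported
functions.  With -fno-semantic-interposition GCC is free to inline helper
into api.  Whether this is what happens in the SHA3-256 hot path is not
established in this thread.

```c
#include <assert.h>

/* Hypothetical exported function.  In a -fPIC build with the GCC default
   (-fsemantic-interposition), another definition of helper could be
   substituted at load time, e.g. via LD_PRELOAD.  */
int helper(int x)
{
    return x * 2 + 1;
}

/* Hypothetical exported caller.  Under -fsemantic-interposition GCC keeps
   these as real calls in a -fPIC build; under -fno-semantic-interposition
   it may inline helper here.  The result is the same either way.  */
int api(int x)
{
    return helper(x) + helper(x + 1);
}

/* Sanity check of the arithmetic: helper(1) is 3 and helper(2) is 5.  */
void interposition_demo_check(void)
{
    assert(api(1) == 8);
}
```

Comparing the assembly from "gcc -O2 -fPIC -S" with and without
-fno-semantic-interposition should show the difference in how api calls
helper.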

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #14 from Jan Hubicka  ---
> I thought the goal was to handle what is in predict-18.c, i.e.
> b * __builtin_expect (c, 0)
> or similar.  If it is about
> __builtin_expect_with_probability (b, 42, 0.25) *
> __builtin_expect_with_probability (c, 0, 0.42)
> sure, my version will merge the probabilities, while you'll pick the
> probability from
> the 0 case.

The probability from the 0 case is a better estimate, so I think it makes
sense to handle it right.  I did not collect many stats on how often it
happens, but on my TODO list is to turn this into a value range predictor,
which may have a better chance of success.  We can also handle
constants other than INTEGER_CST.

I will see if I can clean up the code a bit more or add a comment, since
it is indeed a bit confusing as written now.  I will also look into more
testcases.

Thanks a lot!
Honza

[Bug target/113236] WebP benchmark is 20% slower vs. Clang on AMD Zen 4

2024-01-04 Thread aros at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113236

--- Comment #1 from Artem S. Tashkinov  ---
That's WebP image encode, Quality 100, highest compression.

Also applies to MTL:
https://www.phoronix.com/review/intel-meteorlake-gcc-clang/3

[Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4

2024-01-04 Thread aros at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #1 from Artem S. Tashkinov  ---
Also valid for MTL:
https://www.phoronix.com/review/intel-meteorlake-gcc-clang/2

[Bug testsuite/113226] [14 Regression] testsuite/std/ranges/iota/max_size_type.cc fails for cris-elf after r14-6888-ga138b99646a555

2024-01-04 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113226

--- Comment #2 from Patrick Palka  ---
(In reply to Patrick Palka from comment #1)
> Huh, how bizarre.
> 
> > i == 1, j == -100, i*j == 4294967196, max_type(i) == 1, max_type(i)*j == 
> > -100
> 
> Here i and j are just ordinary 'long long', so I don't get why i*j is
> 4294967196 instead of -100?

Everything else, in particular that int64_t(max_type(i)*j) is -100, seems
correct/expected to me.  FWIW that expression computes the product of the
corresponding promoted/sign-extended 65-bit precision values, and the overall
check is analogous to

  int32_t i = 1, j = -100;
  assert (int64_t(i*j) == int64_t(i)*j);

except the two precisions are 64/65 bits instead of 32/64 bits.

(When shorten_p is true, the overall check is analogous to
 assert (i*j == int32_t(int64_t(i)*j)) instead.)
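
The 32/64-bit analogy above can be written out as a small standalone C
sketch.  The function names are hypothetical and are not part of the
libstdc++ test; the sketch also assumes the narrow product fits in int32_t,
as it does for i == 1, j == -100.

```c
#include <assert.h>
#include <stdint.h>

/* Analogue of the default check: multiply in the narrow type and widen the
   result, then compare with the product computed in the wide type.  */
int widening_check(int32_t i, int32_t j)
{
    return (int64_t)(i * j) == (int64_t)i * j;
}

/* Analogue of the shorten_p check: multiply in the wide type, narrow the
   result, then compare with the product computed in the narrow type.  */
int shortening_check(int32_t i, int32_t j)
{
    return i * j == (int32_t)((int64_t)i * j);
}

/* The values from the failing test: both checks are expected to hold.  */
void precision_demo_check(void)
{
    assert(widening_check(1, -100));
    assert(shortening_check(1, -100));
}
```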

[Bug rtl-optimization/113236] New: WebP benchmark is 20% slower vs. Clang on AMD Zen 4

2024-01-04 Thread aros at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113236

Bug ID: 113236
   Summary: WebP benchmark is 20% slower vs. Clang on AMD Zen 4
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: aros at gmx dot com
  Target Milestone: ---

According to Phoronix Test Suite WebP 1.2.4 is 20% slower when built with GCC
13.2/GCC git snapshot vs Clang:

https://www.phoronix.com/review/gcc-clang-eoy2023/4

[Bug middle-end/103500] Stack slots for overaligned stack temporaries are not properly aligned

2024-01-04 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103500

Alex Coplan  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW

--- Comment #3 from Alex Coplan  ---
No longer working on this.

[Bug libfortran/113223] NAMELIST internal write missing leading blank character

2024-01-04 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113223

--- Comment #3 from Jerry DeLisle  ---
Created attachment 56990
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56990&action=edit
Suggested patch including affected test cases

Regression tested OK.  Three test cases affected.

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add ..." to update what will be committed)
  (use "git restore ..." to discard changes in working directory)
modified:   gcc/testsuite/gfortran.dg/dtio_25.f90
modified:   gcc/testsuite/gfortran.dg/namelist_57.f90
modified:   gcc/testsuite/gfortran.dg/namelist_65.f90
modified:   libgfortran/io/write.c

no changes added to commit (use "git add" and/or "git commit -a")


This looks good to me and I will commit as simple and obvious.

[Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4

2024-01-04 Thread aros at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Bug ID: 113235
   Summary: SMHasher SHA3-256 benchmark is almost 40% slower vs.
Clang on AMD Zen 4
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: aros at gmx dot com
  Target Milestone: ---

According to Phoronix Test Suite SMHasher SHA3-256 is almost 40% slower when
built with GCC 13.2/GCC git snapshot vs Clang:

https://www.phoronix.com/review/gcc-clang-eoy2023/3

FormHash32 x86_64 AVX is also a lot slower.

[Bug target/113206] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-04 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206

--- Comment #9 from Patrick O'Neill  ---
(In reply to JuzheZhong from comment #8)
> It seems that we still didn't locate the real problem of failed SPEC you ran.
> Do you have any other ideas to locate the real problem?
> 
> Li Pan didn't locate the problem either.

Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on:
zvl128b (All runtime fails):
527.cam4 (Runtime)
531.deepsjeng (Runtime)
521.wrf (Runtime)
523.xalancbmk (Runtime)

zvl256b:
507.cactuBSSN (Runtime)
521.wrf (Build)
527.cam4 (Runtime)
531.deepsjeng (Runtime)
549.fotonik3d (Runtime)

With that info I think the next steps are:
1. Triage the zvl256b 521.wrf build failure
2. Bisect the newly-failing testcases
3. Finish triaging the remaining testcases the fuzzer found
4. Attempt to manually reduce cam4 for zvl128b (since it seems to have the
fastest build+runtime)
5. Attempt to manually reduce other fails.

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #13 from Jakub Jelinek  ---
But, when you are touching the PHI case, I think
  /* If this PHI has itself as an argument, we cannot
 determine the string length of this argument.  However,
 if we can find an expected constant value for the other
 PHI args then we can still be sure that this is
 likely a constant.  So be optimistic and just
 continue with the next argument.  */
is a pasto from somewhere else (get_range_strlen), this function doesn't care
about string lengths...

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #12 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #11)
> I added the early exits to handle the following case.
> 
> a = b * c
> 
> If b is predicted to be 0 with predictor1, while c is predicted to be 1
> with predictor2, your version will predict a to be 0, but will merge
> predictor1 and 2, leading to a lower probability than predictor1 alone.
> So the early exit will give a bit higher chance of not losing
> information.

I thought the goal was to handle what is in predict-18.c, i.e.
b * __builtin_expect (c, 0)
or similar.  If it is about
__builtin_expect_with_probability (b, 42, 0.25) *
__builtin_expect_with_probability (c, 0, 0.42)
sure, my version will merge the probabilities, while you'll pick the
probability from
the 0 case.
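
For reference, the two source patterns being discussed look roughly like
this in user code.  This is only a sketch: the function names are
hypothetical, and the actual contents of predict-18.c are not reproduced in
this thread.

```c
#include <assert.h>

/* First pattern: only one operand carries a prediction.  c is expected to
   be 0, so the product is expected to be 0 whatever b is.  */
int product_with_expect(int b, int c)
{
    if (b * __builtin_expect(c, 0))
        return 1;   /* expected to be the unlikely path */
    return 0;
}

/* Second pattern: both operands carry a prediction with an explicit
   probability, so the two predictions have to be reconciled somehow.  */
int product_with_two_probabilities(int b, int c)
{
    if (__builtin_expect_with_probability(b, 42, 0.25)
        * __builtin_expect_with_probability(c, 0, 0.42))
        return 1;
    return 0;
}

/* The builtins only affect branch prediction, never the computed value.  */
void expect_demo_check(void)
{
    assert(product_with_expect(7, 0) == 0);
    assert(product_with_expect(7, 3) == 1);
    assert(product_with_two_probabilities(42, 0) == 0);
    assert(product_with_two_probabilities(2, 3) == 1);
}
```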

[Bug middle-end/113228] [14 Regression] ICE: recalculate_side_effects, at gimplify.cc:3347

2024-01-04 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113228

--- Comment #10 from Patrick O'Neill  ---
(In reply to Andrew Pinski from comment #9)
> Oh ok, I was deciding if I should look further into this or let someone else
> handle it. Since it is from a fuzzer, I am just going to say I don't have
> time to look into this latent bug even though I exposed it :).

Makes sense, I'll start adding a little blurb to future testcases that come
from the fuzzer so people can prioritize accordingly.

[Bug middle-end/113228] [14 Regression] ICE: recalculate_side_effects, at gimplify.cc:3347

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113228

--- Comment #9 from Andrew Pinski  ---
(In reply to Patrick O'Neill from comment #8)
> (In reply to Andrew Pinski from comment #7)
> > This seems like a reduced testcase, where is the original testcase from? Or
> > is it an automated code generator?
> 
> This was found with the fuzzer we're using to try to nail down some spec
> fails in risc-v vector [1]. We've had some success with this approach.

Oh ok, I was deciding if I should look further into this or let someone else
handle it. Since it is from a fuzzer, I am just going to say I don't have time
to look into this latent bug even though I exposed it :).

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #11 from Jan Hubicka  ---
> > + int p1 = get_predictor_value (*predictor, *probability);
> > + int p2 = get_predictor_value (predictor2, probability2);
> > + /* If both predictors agrees, it does not matter from which
> 
> s/agrees/agree/
> 
> > + Consequently failing to fold both means that we will not suceed
> > determinging
> 
> s/suceed/succeed/;s/determinging/determining/

Fixed that, thanks!
> 
> Otherwise yes, but I think the code could be still simplified the way I had in
> my patch (i.e. drop parts of the r14-2219 changes, and simply assume that
> failed recursion for one operand is PRED_UNCONDITIONAL instead of returning
> early, and not requiring the operands are INTEGER_CSTs, just that the result 
> of
> the binop folds to INTEGER_CST.

I added the early exits to handle the following case.

a = b * c

If b is predicted to be 0 with predictor1, while c is predicted to be 1
with predictor2, your version will predict a to be 0, but will merge
predictor1 and 2, leading to a lower probability than predictor1 alone.
So the early exit will give a bit higher chance of not losing
information.

The code is still lax if both b and c are predicted to be 0, in which case
we can work out that the combined probability is at least the max of the
two predictor probabilities, but I was not sure if that is worth the extra
folding overhead.
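
To make the arithmetic concrete, here is a toy sketch.  It assumes that
merging two predictions multiplies their probabilities as if they were
independent; that assumption, the function names and the numbers 0.9 and 0.6
are all hypothetical and are not taken from predict.cc.

```c
#include <assert.h>

/* Merged probability under the independence assumption of this sketch.  */
double merged_probability(double p1, double p2)
{
    return p1 * p2;
}

/* Early exit for a = b * c: once b is predicted to be 0 the product is 0
   whatever c is, so predictor1's probability can be kept unchanged.  */
double early_exit_probability(double p_b_is_zero)
{
    return p_b_is_zero;
}

/* Lower bound when both b and c are predicted to be 0: the product is 0 if
   either prediction holds, so it is at least the larger probability.  */
double both_zero_lower_bound(double p1, double p2)
{
    return p1 > p2 ? p1 : p2;
}

void merge_demo_check(void)
{
    /* b predicted to be 0 with probability 0.9, c predicted to be 1 with
       probability 0.6: merging loses confidence, the early exit does not.  */
    assert(merged_probability(0.9, 0.6) < early_exit_probability(0.9));
    assert(both_zero_lower_bound(0.9, 0.6) == 0.9);
}
```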

[Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os

2024-01-04 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Roger Sayle  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |roger at nextmovesoftware dot com

--- Comment #4 from Roger Sayle  ---
I'm testing a patch, for more accurate conversion gains/costs in the
scalar-to-vector pass.  Adding -mno-stv will work around the problem.

[Bug middle-end/113228] [14 Regression] ICE: recalculate_side_effects, at gimplify.cc:3347

2024-01-04 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113228

--- Comment #8 from Patrick O'Neill  ---
(In reply to Andrew Pinski from comment #7)
> This seems like a reduced testcase, where is the original testcase from? Or
> is it an automated code generator?

This was found with the fuzzer we're using to try to nail down some spec fails
in risc-v vector [1]. We've had some success with this approach.

I can share the unreduced testcase if that's interesting to you?

[1]: Csmith used w/ scripts to compare risc-v qemu with native build/runs:
https://github.com/patrick-rivos/gcc-fuzz-ci

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #10 from Jakub Jelinek  ---
(In reply to Jan Hubicka from comment #9)
> By removing the logic we lose ability to optimize things like
>   a = b * c;
> where b is predicted to value 0 and c has no useful prediction on it.

No, that is what my unposted WIP patch did, but the predict-18.c test caught
that.

> > @@ -2631,6 +2623,9 @@ expr_expected_value_1 (tree type, tree o
> > 
> >   if (predictor2 < *predictor)
> > *predictor = predictor2;
> > + if (*predictor != PRED_BUILTIN_EXPECT
> > + && *predictor != PRED_BUILTIN_EXPECT_WITH_PROBABILITY)
> > +   *probability = -1;
> 
> This can still "upgrade" the prediction to a predictor of lower enum value
> but higher probability, which is not a conservative thing to do.
> > 
> >   return res;
> > }
> I ended up with the following patch that also takes care of various cases
> of phi merging and downgrading the predictor to a new
> PRED_COMBINED_VALUE_PREDICTION which can, like PRED_BUILTIN_EXPECT, hold a
> custom probability but is not treated as FIRST_MATCH.
> What do you think?

> +   int p1 = get_predictor_value (*predictor, *probability);
> +   int p2 = get_predictor_value (predictor2, probability2);
> +   /* If both predictors agrees, it does not matter from which

s/agrees/agree/

> + Consequently failing to fold both means that we will not suceed
> determinging

s/suceed/succeed/;s/determinging/determining/

Otherwise yes, but I think the code could be still simplified the way I had in
my patch (i.e. drop parts of the r14-2219 changes, and simply assume that
failed recursion for one operand is PRED_UNCONDITIONAL instead of returning
early, and not requiring the operands are INTEGER_CSTs, just that the result of
the binop folds to INTEGER_CST.

[Bug target/110934] m68k: ICE with -fzero-call-used-regs=all compiling openssh 9.3p2

2024-01-04 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110934

--- Comment #11 from Mikael Pettersson  ---
Reduced test case:

> cat ../pr110934.c 
extern double clobber_fp0(void);
void f(void) { clobber_fp0(); }
> gcc/xgcc -Bgcc -fzero-call-used-regs=used -fPIC -O -S ../pr110934.c
during RTL pass: zero_call_used_regs
../pr110934.c: In function 'f':
../pr110934.c:2:31: internal compiler error: in change_address_1, at
emit-rtl.cc:2299
2 | void f(void) { clobber_fp0(); }
  |  ^
0x40dc15 change_address_1
/mnt/scratch/other/mikpe-gcc.git/gcc/emit-rtl.cc:2299
0x69dc1e emit_move_insn(rtx_def*, rtx_def*)
/mnt/scratch/other/mikpe-gcc.git/gcc/expr.cc:4717
0xa0cf63 default_zero_call_used_regs(unsigned long)
/mnt/scratch/other/mikpe-gcc.git/gcc/targhooks.cc:1112
0x6ed8fa gen_call_used_regs_seq
/mnt/scratch/other/mikpe-gcc.git/gcc/function.cc:5928
0x6ed8fa execute
/mnt/scratch/other/mikpe-gcc.git/gcc/function.cc:6785
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

The issue is that to zero %fp0 the generic code synthesizes a move from
(const_double:XF 0.0 [0x0.0p+0]) which is replaced by a label referencing a
literal in .rodata, but that label is rejected by
m68k_legitimate_constant_address_p due to -fPIC, which triggers the assertion
failure in change_address_1.

[Bug testsuite/113226] [14 Regression] testsuite/std/ranges/iota/max_size_type.cc fails for cris-elf after r14-6888-ga138b99646a555

2024-01-04 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113226

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
Huh, how bizarre.

> i == 1, j == -100, i*j == 4294967196, max_type(i) == 1, max_type(i)*j == -100

Here i and j are just ordinary 'long long', so I don't get why i*j is
4294967196 instead of -100?

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #9 from Jan Hubicka  ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
> 
> --- Comment #7 from Jakub Jelinek  ---
> So, what about following patch (which also fixes the ICE, would of course need
> to add the testcase) and doesn't regress any predict-*.c tests)?
> 
> --- gcc/predict.cc.jj   2024-01-03 11:51:32.0 +0100
> +++ gcc/predict.cc  2024-01-04 16:28:55.041507010 +0100
> @@ -2583,44 +2583,36 @@ expr_expected_value_1 (tree type, tree o
>if (get_gimple_rhs_class (code) == GIMPLE_BINARY_RHS)
>  {
>tree res;
> -  tree nop0 = op0;
> -  tree nop1 = op1;
> -  if (TREE_CODE (op0) != INTEGER_CST)
> -   {
> - /* See if expected value of op0 is good enough to determine the
> result.  */
> - nop0 = expr_expected_value (op0, visited, predictor, probability);
> - if (nop0
> - && (res = fold_build2 (code, type, nop0, op1)) != NULL
> - && TREE_CODE (res) == INTEGER_CST)
> -   return res;
> - if (!nop0)
> -   nop0 = op0;
> -}

By removing the logic we lose ability to optimize things like
  a = b * c;
where b is predicted to value 0 and c has no useful prediction on it.
> @@ -2631,6 +2623,9 @@ expr_expected_value_1 (tree type, tree o
> 
>   if (predictor2 < *predictor)
> *predictor = predictor2;
> + if (*predictor != PRED_BUILTIN_EXPECT
> + && *predictor != PRED_BUILTIN_EXPECT_WITH_PROBABILITY)
> +   *probability = -1;

This can still "upgrade" the prediction to a predictor of lower enum value
but higher probability, which is not a conservative thing to do.
> 
>   return res;
> }
I ended up with the following patch that also takes care of various cases
of phi merging and downgrading the predictor to a new
PRED_COMBINED_VALUE_PREDICTION which can, like PRED_BUILTIN_EXPECT, hold a
custom probability but is not treated as FIRST_MATCH.
What do you think?

gcc/ChangeLog:

* predict.cc (expr_expected_value_1):
(get_predictor_value):
* predict.def (PRED_COMBINED_VALUE_PREDICTIONS):

diff --git a/gcc/predict.cc b/gcc/predict.cc
index 2e9b7dd07a7..cdfaea1e607 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -2404,16 +2404,29 @@ expr_expected_value_1 (tree type, tree op0, enum
tree_code code,
   if (!bitmap_set_bit (visited, SSA_NAME_VERSION (op0)))
return NULL;

-  if (gimple_code (def) == GIMPLE_PHI)
+  if (gphi *phi = dyn_cast <gphi *> (def))
{
  /* All the arguments of the PHI node must have the same constant
 length.  */
- int i, n = gimple_phi_num_args (def);
- tree val = NULL, new_val;
+ int i, n = gimple_phi_num_args (phi);
+ tree val = NULL;
+ bool has_nonzero_edge = false;
+
+ /* If we already proved that given edge is unlikely, we do not need
+to handle merging of the probabilities.  */
+ for (i = 0; i < n && !has_nonzero_edge; i++)
+   {
+ tree arg = PHI_ARG_DEF (phi, i);
+ if (arg == PHI_RESULT (phi))
+   continue;
+ profile_count cnt = gimple_phi_arg_edge (phi, i)->count ();
+ if (!cnt.initialized_p () || cnt.nonzero_p ())
+   has_nonzero_edge = true;
+   }

  for (i = 0; i < n; i++)
{
- tree arg = PHI_ARG_DEF (def, i);
+ tree arg = PHI_ARG_DEF (phi, i);
  enum br_predictor predictor2;

  /* If this PHI has itself as an argument, we cannot
@@ -2422,26 +2435,50 @@ expr_expected_value_1 (tree type, tree op0, enum
tree_code code,
 PHI args then we can still be sure that this is
 likely a constant.  So be optimistic and just
 continue with the next argument.  */
- if (arg == PHI_RESULT (def))
+ if (arg == PHI_RESULT (phi))
continue;

+ /* Skip edges which we already predicted as unlikely.  */
+ if (has_nonzero_edge)
+   {
+ profile_count cnt = gimple_phi_arg_edge (phi, i)->count ();
+ if (cnt.initialized_p () && !cnt.nonzero_p ())
+   continue;
+   }
  HOST_WIDE_INT probability2;
- new_val = expr_expected_value (arg, visited, &predictor2,
-&probability2);
+ tree new_val = expr_expected_value (arg, visited, &predictor2,
+ &probability2);
+ /* If we know nothing about value, give up.  */
+ if (!new_val)
+   return NULL;

- /* It is difficult to combine value predictors.  Simply assume
-that later predictor is weaker and take its prediction.  */
- if (*predictor < predictor2)
+ /* If this is a first edge, trust its prediction.  */
+ if 

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #8 from Jakub Jelinek  ---
Created attachment 56989
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56989&action=edit
gcc14-pr110852.patch

Full untested patch.

[Bug target/113217] [14 Regression][aarch64] ICE in rtl_verify_bb_insns, at cfgrtl.cc:2796 since r14-6605-gc0911c6b357ba9

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113217

--- Comment #5 from Andrew Pinski  ---
(In reply to Alex Coplan from comment #4)
> Looks like the fix in r14-6784-gaca1f9d7cab3dc1a374a7dc0ec6f7a8d02d2869a
> wasn't sufficient to prevent trying to move throwing accesses above debug
> insns.  ICEs with just -O -fnon-call-exceptions -g.  I'll see what can be
> done about that.  I don't think we need to punt on such opportunities.

Please note we also want to keep with and without -g code generation the same
too.  -fcompare-debug can be helpful there too ...

[Bug target/112804] ICE in aarch64 crosscompiler in plus_constant, at explow.cc:102 with -mabi=ilp32 and -finline-stringops

2024-01-04 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112804

Filip Kastl  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Filip Kastl  ---
Marking this as fixed.

[Bug tree-optimization/113234] missing folding to builtin_isunordered if manual nan comparison is used

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113234

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Component|rtl-optimization|tree-optimization
   Last reconfirmed||2024-01-04
   Keywords||missed-optimization
 Ever confirmed|0   |1
   Severity|normal  |enhancement

--- Comment #2 from Andrew Pinski  ---
Even with -fno-trapping-math, we don't combine:

  _1 = i_3(D) != i_3(D);
  _2 = j_4(D) != j_4(D);
  _5 = _1 | _2;

into:

  _5 = i_2(D) unord j_3(D);


Confirmed.
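
At the source level the two forms correspond roughly to the following
functions.  This is a sketch with hypothetical names, using double operands
as an assumption; it only compares the returned values and ignores any
difference in floating-point exception behavior.

```c
#include <assert.h>
#include <math.h>

/* Manual NaN test: x != x is true only when x is a NaN.  */
int manual_unordered(double i, double j)
{
    return (i != i) | (j != j);
}

/* The form the optimizer could fold the above into.  */
int builtin_unordered(double i, double j)
{
    return __builtin_isunordered(i, j) != 0;
}

/* Both forms agree for ordered and unordered operand pairs.  */
void unordered_demo_check(void)
{
    assert(manual_unordered(1.0, 2.0) == builtin_unordered(1.0, 2.0));
    assert(manual_unordered(NAN, 2.0) == builtin_unordered(NAN, 2.0));
    assert(manual_unordered(1.0, NAN) == builtin_unordered(1.0, NAN));
    assert(manual_unordered(NAN, NAN) == builtin_unordered(NAN, NAN));
}
```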

[Bug libstdc++/113230] 27_io/print/1.cc fails when run with qemu

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113230

--- Comment #5 from Andrew Pinski  ---
Created attachment 56988
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56988&action=edit
dejagnu board that I use

And run the testsuite like:
export SIM_ARM=qemu-aarch64
export QEMU_LD_PREFIX=${SYSROOT}

make -j${CPUS} -k check RUNTESTFLAGS="--target_board=qemu_board
 SIM_ARM=${SIM_ARM} $1"

And have the target libraries installed already in the SYSROOT .

[Bug rtl-optimization/113234] missing folding to builtin_isunordered if manual nan comparison is used

2024-01-04 Thread jsm28 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113234

--- Comment #1 from Joseph S. Myers  ---
Note that if flag_signaling_nans, __builtin_isnan should not raise exceptions
for signaling NaN argument (bug 66462), but ==, != and __builtin_isunordered
should raise exceptions for signaling NaN argument.

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #7 from Jakub Jelinek  ---
So, what about following patch (which also fixes the ICE, would of course need
to add the testcase) and doesn't regress any predict-*.c tests)?

--- gcc/predict.cc.jj   2024-01-03 11:51:32.0 +0100
+++ gcc/predict.cc  2024-01-04 16:28:55.041507010 +0100
@@ -2583,44 +2583,36 @@ expr_expected_value_1 (tree type, tree o
   if (get_gimple_rhs_class (code) == GIMPLE_BINARY_RHS)
 {
   tree res;
-  tree nop0 = op0;
-  tree nop1 = op1;
-  if (TREE_CODE (op0) != INTEGER_CST)
-   {
- /* See if expected value of op0 is good enough to determine the
result.  */
- nop0 = expr_expected_value (op0, visited, predictor, probability);
- if (nop0
- && (res = fold_build2 (code, type, nop0, op1)) != NULL
- && TREE_CODE (res) == INTEGER_CST)
-   return res;
- if (!nop0)
-   nop0 = op0;
-}
   enum br_predictor predictor2;
   HOST_WIDE_INT probability2;
-  if (TREE_CODE (op1) != INTEGER_CST)
+  tree nop0 = expr_expected_value (op0, visited, predictor, probability);
+  if (!nop0)
+   {
+ nop0 = op0;
+ *predictor = PRED_UNCONDITIONAL;
+ *probability = -1;
+   }
+  tree nop1 = expr_expected_value (op1, visited, &predictor2,
&probability2);
+  if (!nop1)
+   {
+ nop1 = op1;
+ predictor2 = PRED_UNCONDITIONAL;
+ probability2 = -1;
+   }
+  /* Finally see if we have two known values.  */
+  res = fold_build2 (code, type, nop0, nop1);
+  if (TREE_CODE (res) == INTEGER_CST)
{
- /* See if expected value of op1 is good enough to determine the
result.  */
- nop1 = expr_expected_value (op1, visited, &predictor2,
&probability2);
- if (nop1
- && (res = fold_build2 (code, type, op0, nop1)) != NULL
- && TREE_CODE (res) == INTEGER_CST)
+ /* If one operand is PRED_UNCONDITIONAL, aka directly or indirectly
+constant, prefer the other predictor.  */
+ if (predictor2 == PRED_UNCONDITIONAL)
+   return res;
+ if (*predictor == PRED_UNCONDITIONAL)
{
  *predictor = predictor2;
  *probability = probability2;
  return res;
}
- if (!nop1)
-   nop1 = op1;
-}
-  if (nop0 == op0 || nop1 == op1)
-   return NULL;
-  /* Finally see if we have two known values.  */
-  res = fold_build2 (code, type, nop0, nop1);
-  if (TREE_CODE (res) == INTEGER_CST
- && TREE_CODE (nop0) == INTEGER_CST
- && TREE_CODE (nop1) == INTEGER_CST)
-   {
  /* Combine binary predictions.  */
  if (*probability != -1 || probability2 != -1)
{
@@ -2631,6 +2623,9 @@ expr_expected_value_1 (tree type, tree o

  if (predictor2 < *predictor)
*predictor = predictor2;
+ if (*predictor != PRED_BUILTIN_EXPECT
+ && *predictor != PRED_BUILTIN_EXPECT_WITH_PROBABILITY)
+   *probability = -1;

  return res;
}

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

--- Comment #6 from Jan Hubicka  ---
> which fixes the ICE by preferring PRED_BUILTIN_EXPECT* over others.
> At least in this case when one operand is a constant and another one is
> __builtin_expect* result that seems like the right choice to me, the fact that
> one operand is constant doesn't mean the outcome of the binary operation is
> unconditionally constant when the other operand is __builtin_expect* based.

The code attempts to solve a problem with no good solution.  If you have two
probabilities that a certain thing happens, the combination of them is
neither correct (since we do not know that both probabilities are
independent) nor does the resulting value correspond to one of the two
predictors contributing to the outcome.

One exception is when one value is 100% sure, which is the case of
PRED_UNCONDITIONAL.  So I would say we could simply special case 0% and
100% probability and pick the other predictor in that case.

> But reading the
> "This incorrectly takes precedence over more reliable heuristics predicting
> that call
> to cold noreturn is likely not going to happen."
> in the description makes me wonder whether that is what we want always 
> (though,
> I must say I don't understand the cold noreturn argument because
> PRED_COLD_FUNCTION is never returned from expr_expected_value_1.  But

Once expr_expected_value finishes its job, it attaches a predictor to an edge
and later all predictions sitting on an edge are combined.  If you end
up declaring the prediction as BUILTIN_EXPECT_WITH_PROBABILITY, the logic
combining predictions will believe that the value is very reliable and
will ignore other predictors.  This is the PRED_FLAG_FIRST_MATCH mode.

So computing an uncertain probability and declaring it to be
BUILTIN_EXPECT_WITH_PROBABILITY is worse than declaring it to be a
predictor with PRED_DS_THEORY merging mode
(which assumes that the value is determined by possibly unreliable
heuristics).

So I would go with special casing 0% and 100% predictors (which can be
simply stripped and forgotten).  For the rest we could probably introduce
PRED_COMBINED_VALUE which will be like BUILTIN_EXPECT_WITH_PROBABILITY
but with DS_THEORY merging mode.  It is probably better than nothing, but
certainly cannot be trusted anymore.
> 
> Oh, another thing is that before the patch there were two spots that merged 
> the
> predictors, one for the binary operands (which previously picked the weaker
> predictor and now picks stronger predictor), but also in the PHI handling
> (which still picks weaker predictor); shouldn't that be changed to match,
> including the != -1 handling?

PHI is even more fishy, since we have no idea of the probability of entering
the basic block via a given edge.  We can probably ignore basic blocks
that already have a profile_count of 0.

Otherwise we may try to do a DS theory combination of the incoming
values.  I can cook up a patch.

(also, fixing other profile-related issues is very high on my TODO now)

[Bug libstdc++/113230] 27_io/print/1.cc fails when run with qemu

2024-01-04 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113230

--- Comment #4 from Andreas Schwab  ---
What does "run with qemu" mean exactly?

[Bug analyzer/112790] -Wanalyzer-deref-before-check false positives seen in Linux kernel due to inlining

2024-01-04 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112790

David Malcolm  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #4 from David Malcolm  ---
Should be fixed by the above patch on trunk.

Keeping open as it still affects GCC 13.

[Bug c/113232] wrong code at -fpack-struct on x86_64-pc-linux-gnu

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113232

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
This is not a bug.

Without -fpack-struct, e.b and e.d.c do not have any bits in common due to
a padding field in `struct a`, because the alignment of short is 2. But with
-fpack-struct, the padding byte/field goes away and you get an overlap.
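
A small sketch of the layout difference, assuming a typical target with a
2-byte, 2-byte-aligned short.  It uses the GCC `packed` attribute instead of
-fpack-struct so that both layouts can live in one file; the `_default` and
`_packed` type names and the helper functions are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct a_default { char b; short c; };
struct a_packed  { char b; short c; } __attribute__ ((packed));

union u_packed { short b; struct a_packed d; };

/* Offset of member c with and without the padding byte.  */
size_t offset_of_c_default (void) { return offsetof (struct a_default, c); }
size_t offset_of_c_packed (void)  { return offsetof (struct a_packed, c); }

/* Replays the loop from the report on the packed layout and returns e.b.  */
short b_after_loop_packed (void)
{
  union u_packed e;
  assert (sizeof (short) == 2);  /* assumption stated above */
  memset (&e, 0, sizeof e);
  e.b = 0;
  for (e.d.c = 0; e.d.c < 8; e.d.c++)
    ;
  return e.b;
}
```

With the packed layout, c starts at offset 1 and shares a byte with the
union's b.  On a little-endian target the final value 8 of e.d.c lands in the
high byte of e.b, which gives 8 << 8 = 2048, the value seen in the report.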

[Bug analyzer/106358] [meta-bug] tracker bug for building the Linux kernel with -fanalyzer

2024-01-04 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106358
Bug 106358 depends on bug 113222, which changed state.

Bug 113222 Summary: ICE with -fanalyzer seen on Linux kernel kernel/sched/core.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113222

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug analyzer/113222] ICE with -fanalyzer seen on Linux kernel kernel/sched/core.c

2024-01-04 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113222

David Malcolm  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from David Malcolm  ---
Should be fixed by the above commit.

[Bug rtl-optimization/113234] New: missing folding to builtin_isunordered if manual nan comparison is used

2024-01-04 Thread denis.campredon at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113234

Bug ID: 113234
   Summary: missing folding to builtin_isunordered if manual nan
comparison is used
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: denis.campredon at gmail dot com
  Target Milestone: ---

Compiled with -O1 on x86-64, f1 and f2 should produce the same code as f3
and f4, but f1 and f2 use two comparisons whereas f3 and f4 use only one.


---
bool f1(float i, float j)
{
return i !=i || j != j;
}

bool f2(float i, float j)
{
return i == i && j == j;
}

bool f3(float i, float j)
{
return __builtin_isnan(i) || __builtin_isnan(j);
}

bool f4(float i, float j)
{
return !__builtin_isnan(i) && !__builtin_isnan(j);
}
---


It seems that gcc does not fold "f != f" to __builtin_isnan, or too late,
leading to the missed optimisation.
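
The equivalence the report relies on can be checked with a short sketch.
It assumes default floating-point flags (in particular no
-ffinite-math-only), and the function names are made up for illustration:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Manual forms, as in f1 and f2.  */
bool any_nan_manual (float i, float j)  { return i != i || j != j; }
bool none_nan_manual (float i, float j) { return i == i && j == j; }

/* Builtin forms, as in f3 and f4.  */
bool any_nan_builtin (float i, float j)
{
  return __builtin_isnan (i) || __builtin_isnan (j);
}

bool none_nan_builtin (float i, float j)
{
  return !__builtin_isnan (i) && !__builtin_isnan (j);
}

/* Aborts if the manual and builtin forms disagree on (i, j).  */
void check_forms_agree (float i, float j)
{
  assert (any_nan_manual (i, j) == any_nan_builtin (i, j));
  assert (none_nan_manual (i, j) == none_nan_builtin (i, j));
}
```

Since both pairs compute the same predicate, comparing the -O1 assembly of
the manual and builtin versions shows whether the fold happened.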

[Bug libstdc++/113230] 27_io/print/1.cc fails when run with qemu

2024-01-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113230

--- Comment #3 from Andrew Pinski  ---
(In reply to Jonathan Wakely from comment #2)
> The point of the test is to write out a byte that isn't valid UTF-8, and
> check that it's printed unchanged, as a single byte. If something does some
> kind of iconv-like conversion on the test output and "fixes" the non-UTF-8
> output, then the test's assumption will not hold.


Yes, that is what I think is happening here. Qemu is doing the conversion from
Latin-1 to UTF-8.

I filed this to record it until I have some time to look into it further. Maybe
there is some env setting I am supposed to set for qemu to prevent this from
happening.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #62 from Rich Felker  ---
The process described there would have to end at least N bits before the end of
the destination buffer. The point was that it would destroy information
internal to the buffer at each step along the way, before it got to the end.

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
So, what happens here is that expr_expected_value_1 is called on a binary
operation (GT_EXPR in this case, but that is irrelevant) with 2 SSA_NAME
operands, where one SSA_NAME is set to an INTEGER_CST (not propagated because
of -fno-tree-fre) and the other one result of __builtin_expect.  For the
INTEGER_CST, we get *predictor PRED_UNCONDITIONAL with *probability -1, for
__builtin_expect predictor2 PRED_BUILTIN_EXPECT with probability2 9000.
Next we try to merge the predictors.  As at least one of them has probability
!= -1,
  /* Combine binary predictions.  */
  if (*probability != -1 || probability2 != -1)
{
  HOST_WIDE_INT p1 = get_predictor_value (*predictor,
*probability);
  HOST_WIDE_INT p2 = get_predictor_value (predictor2,
probability2);
  *probability = RDIV (p1 * p2, REG_BR_PROB_BASE);
}
is done to combine the value and we get 9000 out of that in this case,
but then pick the predictor with the smaller value which is PRED_UNCONDITIONAL
in:
  if (predictor2 < *predictor)
*predictor = predictor2;
which causes the later ICE, because only PRED_BUILTIN_EXPECT* should have
probability
other than -1.
Now, given the combination code I wrote:
--- gcc/predict.cc.jj   2024-01-03 11:51:32.0 +0100
+++ gcc/predict.cc  2024-01-04 14:04:40.996639979 +0100
@@ -2626,10 +2626,20 @@ expr_expected_value_1 (tree type, tree o
{
  HOST_WIDE_INT p1 = get_predictor_value (*predictor,
*probability);
  HOST_WIDE_INT p2 = get_predictor_value (predictor2,
probability2);
+ if (*probability != -1 && probability2 != -1)
+   {
+ /* If both predictors are PRED_BUILTIN_EXPECT*, pick the
+smaller one from them.  */
+ if (predictor2 < *predictor)
+   *predictor = predictor2;
+   }
+ /* Otherwise, if at least one predictor is PRED_BUILTIN_EXPECT*,
+use that one for the combination.  */
+ else if (probability2 != -1)
+   *predictor = predictor2;
  *probability = RDIV (p1 * p2, REG_BR_PROB_BASE);
}
-
- if (predictor2 < *predictor)
+ else if (predictor2 < *predictor)
*predictor = predictor2;

  return res;
which fixes the ICE by preferring PRED_BUILTIN_EXPECT* over others.
At least in this case when one operand is a constant and another one is
__builtin_expect* result that seems like the right choice to me, the fact that
one operand is constant doesn't mean the outcome of the binary operation is
unconditionally constant when the other operand is __builtin_expect* based.
But reading the
"This incorrectly takes precedence over more reliable heuristics predicting
that call
to cold noreturn is likely not going to happen."
in the description makes me wonder whether that is what we want always (though,
I must say I don't understand the cold noreturn argument because
PRED_COLD_FUNCTION is never returned from expr_expected_value_1.  But
PRED_COMPARE_AND_SWAP (in between
PRED_BUILTIN_EXPECT_WITH_PROBABILITY and PRED_BUILTIN_EXPECT), whatever the
fortran IFN_BUILTIN_EXPECTs have (all above PRED_BUILTIN_EXPECT*),
PRED_MALLOC_NONNULL (above PRED_BUILTIN_EXPECT*) can appear.
So, shall we prefer PRED_BUILTIN_EXPECT* over PRED_UNCONDITIONAL for the binary
operations, but prefer PRED_COMPARE_AND_SWAP over PRED_BUILTIN_EXPECT (and then
set *probability to -1 obviously)?  The others don't matter because they have
higher value and so aren't picked up.
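
For readers following along, the merge rule from the patch above can be
modelled standalone.  This is a simplified sketch, not predict.cc: the enum
lists only three predictors in the relative order described in this comment,
the `value` field stands for whatever get_predictor_value would return, and
`merge_binary` is a made-up name:

```c
#include <assert.h>

#define PROB_BASE 10000L  /* stands in for REG_BR_PROB_BASE */

/* Lower value means the predictor wins a "pick the smaller one" merge.  */
enum pred { UNCONDITIONAL, BUILTIN_EXPECT, MALLOC_NONNULL };

struct prediction
{
  enum pred predictor;
  long probability;  /* explicit probability, or -1 for "use the default" */
  long value;        /* resolved hit rate in PROB_BASE units */
};

struct prediction merge_binary (struct prediction a, struct prediction b)
{
  struct prediction r = a;
  if (a.probability != -1 || b.probability != -1)
    {
      if (a.probability != -1 && b.probability != -1)
        {
          /* Both carry an explicit probability: pick the smaller one.  */
          if (b.predictor < a.predictor)
            r.predictor = b.predictor;
        }
      else if (b.probability != -1)
        /* Only b carries one: it must supply the predictor.  */
        r.predictor = b.predictor;
      r.probability = (a.value * b.value + PROB_BASE / 2) / PROB_BASE;
      r.value = r.probability;
      /* The invariant whose violation caused the ICE: an explicit
         probability is only valid on a __builtin_expect predictor.  */
      assert (r.predictor == BUILTIN_EXPECT);
    }
  else if (b.predictor < a.predictor)
    r = b;
  return r;
}
```

Merging an INTEGER_CST operand (UNCONDITIONAL, -1) with a __builtin_expect
result (BUILTIN_EXPECT, 9000) keeps BUILTIN_EXPECT with probability 9000 in
this model, instead of pairing probability 9000 with UNCONDITIONAL.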

The reason this doesn't ICE without -fno-tree-fre is that if one of the binary
operands is already INTEGER_CST, it just uses the predictor from the other
operand (which is another argument in support of ignoring PRED_UNCONDITIONAL
operand vs. other predictors when merging).

Oh, another thing is that before the patch there were two spots that merged the
predictors, one for the binary operands (which previously picked the weaker
predictor and now picks stronger predictor), but also in the PHI handling
(which still picks weaker predictor); shouldn't that be changed to match,
including the != -1 handling?

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #61 from Richard Earnshaw  ---
Then I don't understand what you're trying to say in c57.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #60 from Rich Felker  ---
Nobody said anything about writing past end of buffer. Obviously you can't do
that.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #59 from Richard Earnshaw  ---
Memcpy must never write beyond the end of the specified buffer, even if reading
it is safe.  That wouldn't be thread safe.

[Bug analyzer/112790] -Wanalyzer-deref-before-check false positives seen in Linux kernel due to inlining

2024-01-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112790

--- Comment #3 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:05c99b1c7965f46f0ff17d5e8f4020a62c643ae5

commit r14-6919-g05c99b1c7965f46f0ff17d5e8f4020a62c643ae5
Author: David Malcolm 
Date:   Thu Jan 4 09:19:06 2024 -0500

analyzer: add sarif properties for checker events

As another followup to r14-6057-g12b67d1e13b3cf, optionally add SARIF
property bags to threadFlowLocation objects when writing out diagnostic
paths, and add analyzer-specific properties to them.

This was useful for debugging PR analyzer/112790.

gcc/analyzer/ChangeLog:
* checker-event.cc: Include "diagnostic-format-sarif.h" and
"tree-logical-location.h".
(checker_event::maybe_add_sarif_properties): New.
(superedge_event::maybe_add_sarif_properties): New.
(superedge_event::superedge_event): Add comment.
* checker-event.h (checker_event::maybe_add_sarif_properties): New
decl.
(superedge_event::maybe_add_sarif_properties): New decl.

gcc/ChangeLog:
* diagnostic-format-sarif.cc
(sarif_builder::make_logical_location_object): Convert to...
(make_sarif_logical_location_object): ...this.
(sarif_builder::set_any_logical_locs_arr): Update for above
change.
(sarif_builder::make_thread_flow_location_object): Call
maybe_add_sarif_properties on each diagnostic_event.
* diagnostic-format-sarif.h (class logical_location): New forward
decl.
(make_sarif_logical_location_object): New decl.
* diagnostic-path.h (class sarif_object): New forward decl.
(diagnostic_event::maybe_add_sarif_properties): New vfunc.

Signed-off-by: David Malcolm 

[Bug analyzer/112790] -Wanalyzer-deref-before-check false positives seen in Linux kernel due to inlining

2024-01-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112790

--- Comment #2 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:5743e1899d596497800f7d6f4273d535ea0abcdd

commit r14-6918-g5743e1899d596497800f7d6f4273d535ea0abcdd
Author: David Malcolm 
Date:   Thu Jan 4 09:15:18 2024 -0500

analyzer: fix deref-before-check false positives due to inlining [PR112790]

gcc/analyzer/ChangeLog:
PR analyzer/112790
* checker-event.cc (class inlining_info): Move to...
* inlining-iterator.h (class inlining_info): ...here.
* sm-malloc.cc: Include "analyzer/inlining-iterator.h".
(maybe_complain_about_deref_before_check): Reject stmts that were
inlined from another function.

gcc/testsuite/ChangeLog:
PR analyzer/112790
* c-c++-common/analyzer/deref-before-check-pr112790.c: New test.

Signed-off-by: David Malcolm 

[Bug analyzer/113222] ICE with -fanalyzer seen on Linux kernel kernel/sched/core.c

2024-01-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113222

--- Comment #2 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:db5b01d282a0e3ddcac737e55f9758c8b081cf4b

commit r14-6917-gdb5b01d282a0e3ddcac737e55f9758c8b081cf4b
Author: David Malcolm 
Date:   Thu Jan 4 09:12:40 2024 -0500

analyzer: handle arrays of unknown size in access diagrams [PR113222]

gcc/analyzer/ChangeLog:
PR analyzer/113222
* access-diagram.cc (valid_region_spatial_item::add_boundaries):
Handle TYPE_DOMAIN being null.
(valid_region_spatial_item::add_array_elements_to_table):
Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/113222
* gcc.dg/analyzer/out-of-bounds-diagram-pr113222.c: New test.

Signed-off-by: David Malcolm 

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

--- Comment #5 from Xi Ruoyao  ---
Note that x86 also passes the recorded -mavx2 etc. to lto1.

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

--- Comment #4 from Xi Ruoyao  ---
(In reply to Jan Hubicka from comment #3)
> > Confirm.  But option save/restore has been always implemented:
> > 
> > .section.gnu.lto_.opts,"",@progbits
> > .ascii  "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection"
> > .ascii  "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m"
> > .ascii  "simd=lasx' '-mcmodel=normal' '-mtune=loongarch64' '-flto"
> > .ascii  "'\000"
> > 
> > So -msimd=lasx is correctly recorded.  Not sure why it does not work.
> 
> With LTO we need to mix code compiled with different sets of options.
> For this reason we imply for every function definition an optimization
> and target attribute which record the flags.  So it seems target
> attribute is likely broken for this flag.

Target attribute is not implemented for LoongArch.  And I don't think it's a
good idea to implement it in stage 3.

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

--- Comment #3 from Jan Hubicka  ---
> Confirm.  But option save/restore has been always implemented:
> 
> .section.gnu.lto_.opts,"",@progbits
> .ascii  "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection"
> .ascii  "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m"
> .ascii  "simd=lasx' '-mcmodel=normal' '-mtune=loongarch64' '-flto"
> .ascii  "'\000"
> 
> So -msimd=lasx is correctly recorded.  Not sure why it does not work.

With LTO we need to mix code compiled with different sets of options.
For this reason we imply for every function definition an optimization
and target attribute which record the flags.  So it seems target
attribute is likely broken for this flag.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #58 from Jakub Jelinek  ---
(In reply to Rich Felker from comment #57)
> and more concerned about the consequences of LTO/whole-program-analysis where
> something in the translation process can see the violated restrict
> qualifier, infer UB, and blow everything up.

That can't happen, in GCC in GIMPLE these are represented as assignments, not
{__builtin_,}memcpy calls and are turned into the calls (or inline expansion of
the copying) only when being expanded into RTL.
All the LTO/whole program optimizations happen on GIMPLE, so at that point
nothing can be inferred from that because it simply isn't present in the IL and
only after all LTO & IPA optimizations are done individual functions go through
the rest of GIMPLE optimizations and then RTL ones.
The only exception to that is IPA-RA, which intra partition (for LTO, otherwise
within the TU) can take into account what hard registers are used/unused by
previously emitted
functions and take that knowledge into their callers emitted later; but for
those this is a library call like any other.
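
For context, the source-level construct this PR is about can be sketched as
follows.  The type and function names are illustrative only; the point is
that the aggregate assignment is valid C even when both pointers are equal,
while the compiler remains free to expand it as a memcpy call:

```c
#include <assert.h>
#include <string.h>

struct block { char data[256]; };

/* Plain aggregate assignment.  Exact overlap (dst == src) is allowed
   by the language; only partial overlap is undefined.  */
void copy_block (struct block *dst, const struct block *src)
{
  *dst = *src;
}

/* Returns nonzero if self-assignment left the contents intact.  */
int self_assign_preserves (struct block *b)
{
  struct block saved;
  assert (b != NULL);
  memcpy (&saved, b, sizeof saved);
  copy_block (b, b);
  return memcmp (&saved, b, sizeof saved) == 0;
}
```

At the GIMPLE level this stays an assignment, so nothing restrict-related is
visible to LTO or IPA; the memcpy call only appears at RTL expansion.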

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

Xi Ruoyao  changed:

   What|Removed |Added

 Target||loongarch*
   Last reconfirmed||2024-01-04
 CC||xry111 at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Xi Ruoyao  ---
Confirm.  But option save/restore has been always implemented:

.section.gnu.lto_.opts,"",@progbits
.ascii  "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection"
.ascii  "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m"
.ascii  "simd=lasx' '-mcmodel=normal' '-mtune=loongarch64' '-flto"
.ascii  "'\000"

So -msimd=lasx is correctly recorded.  Not sure why it does not work.

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread bugdal at aerifal dot cx via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #57 from Rich Felker  ---
I think one could reasonably envision an implementation that does some sort of
vector loads/stores where, due to some performance constraint or avoiding
special casing for possible page boundary past the end of the copy, it only
wants to load N bits at a time, but the efficient store instruction always
stores a full vector of 2N bits. Of course, one could also argue quite
reasonably that this is a weird enough thing to do that the implementation
should then just check for src==dest and early-out.

I'm far less concerned about whether such mechanical breakage exists, and more
concerned about the consequences of LTO/whole-program-analysis where something
in the translation process can see the violated restrict qualifier, infer UB,
and blow everything up.

The change being requested here is really one of removing the restrict
qualification from the arguments and making a custom weaker condition. This may
in turn have consequences on what types of transformations are possible.

[Bug libstdc++/113230] 27_io/print/1.cc fails when run with qemu

2024-01-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113230

--- Comment #2 from Jonathan Wakely  ---
The point of the test is to write out a byte that isn't valid UTF-8, and check
that it's printed unchanged, as a single byte. If something does some kind of
iconv-like conversion on the test output and "fixes" the non-UTF-8 output, then
the test's assumption will not hold.

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread yangyujie at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

--- Comment #1 from Yang Yujie  ---
I've already made a patch for this, will push it to gcc-patc...@gcc.gnu.org
later.

[Bug target/113233] New: LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread yangyujie at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233

Bug ID: 113233
   Summary: LoongArch: target options from LTO objects not
respected during linking
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyujie at loongson dot cn
  Target Milestone: ---

Compiling LTO objects with certain target options like -march=, -msimd= has no
effect on the final linked binary, since COLLECT_GCC_OPTIONS from the "linker
GCC" driver always overrides these options.

For instance, running the following command:

$ gcc -S -o- -xc - -msimd=lasx  -O3 <<< "void m2 (float *a, int n) { for (int i
= 0; i < n; i++) a[i] += 1; }" | grep vfadd

outputs SIMD instructions (as enabled by -msimd=lasx):

xvfadd.s$xr4,$xr4,$xr0
xvfadd.s$xr2,$xr2,$xr0
xvfadd.s$xr1,$xr1,$xr0
xvfadd.s$xr3,$xr3,$xr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr1,$vr1,$vr0
vfadd.s $vr0,$vr1,$vr0

However, LTO generates code according to the compiler's default config
(-msimd=none).

Running:
$ gcc -flto -c -o test.o -xc - -msimd=lasx -O3 <<< "void m2 (float *a, int n) {
for (int i = 0; i < n; i++) a[i] += 1; }"

and then:
$ gcc -S -o- -xlto test.o | grep vfadd

outputs nothing, indicating that the final code generation has LSX/LASX
disabled.

While:
$ gcc -S -o- -xlto test.o -mlasx | grep vfadd

outputs the same "vfadd" / "xvfadd" instructions as above.

-
Proposed solution: implement option save/restore to enable LTO option
streaming.
-

[Bug libstdc++/113230] 27_io/print/1.cc fails when run with qemu

2024-01-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113230

--- Comment #1 from Jonathan Wakely  ---
But any single character should match "." in the regex. Is the output being
converted (somewhere) from Latin-1 to UTF-8 which means that "À" becomes more
than one byte, and the regex doesn't match?
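
To illustrate the suspected conversion, here is a minimal sketch of encoding
a single Latin-1 byte as UTF-8.  The function name is made up, and nothing
here claims to be what qemu or the test harness actually runs:

```c
#include <assert.h>
#include <stddef.h>

/* Encodes one Latin-1 byte as UTF-8 into out[] and returns the number
   of bytes written (1 for ASCII, 2 for 0x80..0xFF).  */
size_t latin1_to_utf8 (unsigned char c, unsigned char out[2])
{
  assert (out != NULL);
  if (c < 0x80)
    {
      out[0] = c;
      return 1;
    }
  out[0] = (unsigned char) (0xC0 | (c >> 6));
  out[1] = (unsigned char) (0x80 | (c & 0x3F));
  return 2;
}
```

For the Latin-1 byte 0xC0 ("À") this yields the two bytes 0xC3 0x80, so a
byte-oriented match that expects exactly one character there would no longer
line up.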

[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy

2024-01-04 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #56 from Richard Earnshaw  ---
I've never heard of a memcpy implementation that corrupts data if called with
memcpy (p, p, n).  (The problems come from partial overlaps where the direction
of the copy may matter).

Has anybody considered asking the standards committee to bless this as a
special exception?

Of course, if n is large, then performing an early test is still worthwhile,
but for small n, the cost of the check possibly exceeds the benefit of eliding
the copy.
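
The early test mentioned above could look like the following sketch.  The
wrapper name is hypothetical; it only tolerates exact overlap and still
requires the buffers not to overlap partially:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies n bytes from src to dst.  Skips the memcpy call when both
   pointers are equal, so memcpy itself never sees overlapping buffers.  */
void *copy_exact_overlap_ok (void *dst, const void *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  if (dst != src)
    memcpy (dst, src, n);
  return dst;
}
```

The extra comparison is the cost being weighed here: cheap relative to a
large copy, but possibly not worth it for a handful of bytes.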

[Bug target/113231] x86_64 uses SSE instructions for `*mem <<= const` at -Os

2024-01-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113231

Uroš Bizjak  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot 
com

--- Comment #3 from Uroš Bizjak  ---
CC Roger.

[Bug tree-optimization/110176] [11/12/13/14 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
Summary|[11/12/13/14 Regression]|[11/12/13/14 Regression]
   |wrong code at -Os and above |wrong code at -Os and above
   |on x86_64-linux-gnu |on x86_64-linux-gnu since
   ||r11-2446
   Keywords|needs-bisection |

--- Comment #6 from Jakub Jelinek  ---
Started with r11-2446-g3e61a2056335ca7d4e2009823efae4ee2dc950ee

[Bug target/113048] [13/14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1862 (unable to find a register to spill) {*andndi3_doubleword_bmi} with -march=cascadelake since r13-1716

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113048

--- Comment #4 from Jakub Jelinek  ---
(In reply to Manuel Lauss from comment #2)
> I'm seeing similar ICE in xgcc when trying to build GCC-14 for MIPS32;
> it goes away when I drop "-fPIC" or "-march=mips32":

Please file this separately, that is extremely unlikely to be the same issue.

[Bug target/113048] [13/14 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1862 (unable to find a register to spill) {*andndi3_doubleword_bmi} with -march=cascadelake since r13-1716

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113048

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[13/14 Regression] ICE: in  |[13/14 Regression] ICE: in
   |lra_split_hard_reg_for, at  |lra_split_hard_reg_for, at
   |lra-assigns.cc:1862 (unable |lra-assigns.cc:1862 (unable
   |to find a register to   |to find a register to
   |spill)  |spill)
   |{*andndi3_doubleword_bmi}   |{*andndi3_doubleword_bmi}
   |with -march=cascadelake |with -march=cascadelake
   ||since r13-1716
 CC||jakub at gcc dot gnu.org,
   ||sayle at gcc dot gnu.org,
   ||uros at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Started with r13-1716-gfd3d25d6df1cbd385d2834ff3059dfb6905dd75c

[Bug c/113232] New: wrong code at -fpack-struct on x86_64-pc-linux-gnu

2024-01-04 Thread jwzeng at nuaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113232

Bug ID: 113232
   Summary: wrong code at -fpack-struct on x86_64-pc-linux-gnu
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jwzeng at nuaa dot edu.cn
  Target Milestone: ---

I compiled the following code with gcc at -fpack-struct, and it produces the
wrong code. 

The correct output result should be 0, but 2048 was output under -fpack-struct. 

This bug seems to be a long-standing bug that exists on almost all gcc
versions.

Compiler explorer: https://godbolt.org/z/v4sq988P9

```c
$ cat test.c
int printf(const char *, ...);
struct a {
  char b;
  short c;
};
union {
  short b;
  struct a d;
} e;
int main() {
  e.b = 0;
  for (e.d.c = 0; e.d.c < 8; e.d.c++)
;  
  printf("%d\n", e.b);
}
$
$ gcc-tk test.c -O0; ./test.c
0
$ gcc-tk test.c -fpack-struct; ./test.c
2048
$ ccomp test.c -O0; ./a.out
0
$
$ gcc-tk --version
gcc (GCC) 13.1.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$
$ ccomp --version
The CompCert C verified compiler, version 3.12
```

[Bug c++/68703] __attribute__((vector_size(N))) template member confusion

2024-01-04 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68703

Richard Sandiford  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #11 from Richard Sandiford  ---
FWIW, the following adaption of the original testcase still fails on trunk, but
is accepted by Clang:

template 
struct D {
using t = int __attribute__((vector_size(N * sizeof(int))));
t v;
int f1() { return this->v[N-1]; }
int f2() { return v[N-1]; }
};

int main(int ac, char**)
{
  D<> d = { { ac } };
  return d.f1() + d.f2();
}

Same with a typedef instead of "using".  But that's probably just another
instance of PR88600/PR58855.

[Bug target/113184] [14 Regression] ICE: in extract_insn, at recog.cc:2812 (unrecognizable insn) with -O -frounding-math -fnon-call-exceptions since r14-6605

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113184

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug target/113184] [14 Regression] ICE: in extract_insn, at recog.cc:2812 (unrecognizable insn) with -O -frounding-math -fnon-call-exceptions since r14-6605

2024-01-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113184

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-01-04
Summary|[14 Regression] ICE: in |[14 Regression] ICE: in
   |extract_insn, at|extract_insn, at
   |recog.cc:2812   |recog.cc:2812
   |(unrecognizable insn) with  |(unrecognizable insn) with
   |-O -frounding-math  |-O -frounding-math
   |-fnon-call-exceptions   |-fnon-call-exceptions since
   ||r14-6605
 CC||acoplan at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jakub Jelinek  ---
Started with r14-6605-gc0911c6b357ba916ae24926b7d8b9ca35234f33c

[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value

2024-01-04 Thread syq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914

YunQiang Su  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #26 from YunQiang Su  ---
Since we now have two fixes that both address this problem,
let's close it.

Should we backport them to gcc13/gcc12?

[Bug rtl-optimization/104914] [MIPS] wrong comparison with scrabbled int value

2024-01-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914

--- Comment #25 from GCC Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:3ac58063114cf491891072be6205d32a42c6707d

commit r14-6915-g3ac58063114cf491891072be6205d32a42c6707d
Author: Roger Sayle 
Date:   Thu Jan 4 10:49:33 2024 +

Improved RTL expansion of field assignments into promoted registers.

This patch fixes PR rtl-optimization/104914 by tweaking/improving the way
the fields are written into a pseudo register that needs to be kept sign
extended.

The motivating example from the bugzilla PR is:

extern void ext(int);
void foo(const unsigned char *buf) {
  int val;
  ((unsigned char*)&val)[0] = *buf++;
  ((unsigned char*)&val)[1] = *buf++;
  ((unsigned char*)&val)[2] = *buf++;
  ((unsigned char*)&val)[3] = *buf++;
  if(val > 0)
ext(1);
  else
ext(0);
}

which at the end of the tree optimization passes looks like:

void foo (const unsigned char * buf)
{
  int val;
  unsigned char _1;
  unsigned char _2;
  unsigned char _3;
  unsigned char _4;
  int val.5_5;

  <bb 2> [local count: 1073741824]:
  _1 = *buf_7(D);
  MEM[(unsigned char *)&val] = _1;
  _2 = MEM[(const unsigned char *)buf_7(D) + 1B];
  MEM[(unsigned char *)&val + 1B] = _2;
  _3 = MEM[(const unsigned char *)buf_7(D) + 2B];
  MEM[(unsigned char *)&val + 2B] = _3;
  _4 = MEM[(const unsigned char *)buf_7(D) + 3B];
  MEM[(unsigned char *)&val + 3B] = _4;
  val.5_5 = val;
  if (val.5_5 > 0)
    goto <bb 3>; [59.00%]
  else
    goto <bb 4>; [41.00%]

  <bb 3> [local count: 633507681]:
  ext (1);
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 440234144]:
  ext (0);

  <bb 5> [local count: 1073741824]:
  val ={v} {CLOBBER(eol)};
  return;

}

Here four bytes are being sequentially written into the SImode value
val.  On some platforms, such as MIPS64, this SImode value is kept in
a 64-bit register, suitably sign-extended.  The function expand_assignment
contains logic to handle this via SUBREG_PROMOTED_VAR_P (around line 6264
in expr.cc) which outputs an explicit extension operation after each
store_field (typically insv) to such promoted/extended pseudos.

The first observation is that there's no need to perform sign extension
after each byte in the example above; the extension is only required
after changes to the most significant byte (i.e. to a field that overlaps
the most significant bit).

The bug fix itself is a bit more subtle: at this point during
code expansion it's not safe to use a SUBREG when sign-extending this
field.  Currently, GCC generates (sign_extend:DI (subreg:SI (reg:DI) 0)),
but combine (and other RTL optimizers) later realize that, because SImode
values are always sign-extended in their 64-bit hard registers,
this is a no-op, and eliminate it.  The trouble is that it's unsafe to
refer to the SImode lowpart of a 64-bit register using SUBREG at those
critical points when temporarily the value isn't correctly sign-extended,
and the usual backend invariants don't hold.  At these critical points,
the middle-end needs to use an explicit TRUNCATE rtx (as this isn't a
TRULY_NOOP_TRUNCATION), so that the explicit sign-extension looks like
(sign_extend:DI (truncate:SI (reg:DI))), which avoids the problem.
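
To make the difference concrete, here is a small C model of a 64-bit
register that is supposed to hold a sign-extended 32-bit value.  It is
only a sketch of the reasoning above, not GCC code: the function names
insert_byte, sign_extend_truncated and sign_extend_assumed_noop are
made up for this example, and the "assumed no-op" function simply
models what is left once an optimizer has deleted the extension.

```c
#include <stdint.h>

/* Model of an insv-style store: replace the 8 bits starting at BITPOS
   in the 64-bit register REG, leaving every other bit untouched.
   BITPOS is assumed to be at most 56.  */
static uint64_t
insert_byte (uint64_t reg, unsigned bitpos, uint8_t byte)
{
  uint64_t mask = (uint64_t) 0xff << bitpos;
  return (reg & ~mask) | ((uint64_t) byte << bitpos);
}

/* Model of (sign_extend:DI (truncate:SI reg)): look only at the low
   32 bits and rebuild the upper 32 bits from bit 31.  Unsigned
   arithmetic is used so the wrap-around is well defined.  */
static uint64_t
sign_extend_truncated (uint64_t reg)
{
  uint64_t low = reg & 0xffffffffULL;
  return (low ^ 0x80000000ULL) - 0x80000000ULL;
}

/* Model of what remains after the extension of a SUBREG has been
   treated as redundant and removed: the register is left as it is.  */
static uint64_t
sign_extend_assumed_noop (uint64_t reg)
{
  return reg;
}
```

Starting from a register holding 0 and inserting the byte 0x80 at bit
position 24 gives 0x0000000080000000, which is not a valid sign-extended
SImode value.  In this model sign_extend_truncated repairs it to
0xffffffff80000000, whereas sign_extend_assumed_noop leaves the upper
bits clear, which is the kind of broken invariant described above.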

2024-01-04  Roger Sayle  
Jeff Law  

gcc/ChangeLog
PR rtl-optimization/104914
* expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P
a sign or zero extension is only required if the modified field
overlaps the SUBREG's most significant bit.  On MODE_REP_EXTENDED
targets, don't refer to the temporarily incorrectly extended value
using a SUBREG, but instead generate an explicit TRUNCATE rtx.

[Bug target/113217] [14 Regression][aarch64] ICE in rtl_verify_bb_insns, at cfgrtl.cc:2796 since r14-6605-gc0911c6b357ba9

2024-01-04 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113217

--- Comment #4 from Alex Coplan  ---
Looks like the fix in r14-6784-gaca1f9d7cab3dc1a374a7dc0ec6f7a8d02d2869a wasn't
sufficient to prevent trying to move throwing accesses above debug insns.  ICEs
with just -O -fnon-call-exceptions -g.  I'll see what can be done about that. 
I don't think we need to punt on such opportunities.

[Bug testsuite/60031] dg-require-effective-target powerpc_vsx_ok is not enough

2024-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60031

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Kewen Lin  ---
We have the vsx_hw effective-target keyword, which uses check_vsx_hw_available:

# Return 1 if the target supports executing VSX instructions, 0
# otherwise.  Cache the result.

Doesn't it satisfy the requirement? Or am I missing something?
