[Bug tree-optimization/104604] New: wrong code with -O2 -fconserve-stack --param=vrp1-mode=ranger

2022-02-18 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104604

Bug ID: 104604
   Summary: wrong code with -O2 -fconserve-stack
--param=vrp1-mode=ranger
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 52479
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52479&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O2 -fconserve-stack --param=vrp1-mode=ranger
testcase.c -Wno-psabi
$ ./a.out 
Aborted

The value of x[] is {3, 3, 3, 3, ... }
It seems "i /= c;" got optimised out.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-7293-20220218075854-gfe79d652c96-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r12-7293-20220218075854-gfe79d652c96-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220218 (experimental) (GCC)

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread herumi at nifty dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #7 from herumi  ---
Created attachment 52478
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52478&action=edit
an original code

The array-bounds.zip file is a slightly stripped-down version of the original issue.

array-bounds% g++-11.2 -O2 -c a.cpp -Warray-bounds -save-temps

[Bug rtl-optimization/104596] Means to add a comment in the assembly

2022-02-18 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596

--- Comment #2 from Tom de Vries  ---
(In reply to Andrew Pinski from comment #1)
> I am trying to understand what you are trying to do.
> You want to mark an insn with a comment 

One or more insns, yes.

> which is emitted during formation of
> the prologue generation as being generated because of a specific option?

Not necessarily in the prologue, it could be anywhere.

> and you don't want to add some extra patterns to do the marking?
> 

Yes.  Instead, ideally I'd like gcc to provide a gen_comment (or some
alternative, like the ability to tag insns themselves with a comment that is
output before the insn).

[ Both approaches raise questions in the context of optimizations.  But I'm
planning to use this late enough in the compiler not to have to bother with
those questions. ]

> Is there a reason why you want to annotate the instruction in the assembly
> besides just easier to see if it was emitted because of that option or is
> there some assembler reason?
> 

No, it's just the former.

> If it is just for debugging, why not, while emitting the prologue, print out
> the instruction # that was added (if details dump is enabled) and then use
> -dP to see the instruction in the assembler output?
> 

That's feasible if you're interested in, say, one insn (though it still requires
you to go and reproduce the command line to generate the code, add the dump
flags, find the relevant line in the dump file and then find the corresponding
insn in the assembly; if the compiler already emits the comment, all you need
to do is read the assembly).

Another scenario is: I have assembly for an entire executable, including all
libraries, and I want to easily be able to find all insns that were introduced
because of a compiler pass, without having to diff against a version with that
pass disabled.

Note that for nvptx, a library or executable is assembly, so those comments can
still be useful when 'disassembling' an executable.  For instance, there are
various workarounds in the nvptx port that introduce insns, and an executable is
hand-editable, so we can do the sort of tinkering of: the executable fails,
this code introduced by the workaround looks suspicious, let's disable it and
see if it passes.

> The other thing you could do is have an INSN_NOTE which takes a string which you
> then output during the final scan. This requires adding some extra stuff to
> the rest of the compiler but it should work.

Sure, that would work as well.  Though I think the concept just maps very well
onto the user-level 'asm ("// comment")'.
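
For readers unfamiliar with the user-level idiom mentioned above, here is a
minimal sketch.  It assumes a GCC-compatible compiler and an assembler that
accepts C-style /* ... */ comments (the GNU assembler does); the function name
add_with_marker and the marker text are made up for illustration and are not
part of any proposed gen_comment interface.

```cpp
// Sketch only: add_with_marker and the marker text are hypothetical.
// An asm statement whose template is nothing but an assembler comment emits
// no instruction, but the comment text is copied into the assembly output.
int add_with_marker(int a, int b) {
  int sum = a + b;
  // "volatile" asks the compiler not to delete the statement as dead code.
  asm volatile("/* marker: inserted for illustration */");
  return sum;
}
```

Compiling this with -S and searching the output for "marker:" should show the
comment in the body of the function, which is roughly the effect being asked
for at the RTL level.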

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #6 from Andrew Pinski  ---
(In reply to herumi from comment #5)
> >Can you file a separate issue with the preprocessed source (-save-temps)
> >since it really does look like a separate issue altogether.
> 
> May I attach a zipped a.ii which is generated by the following commands?

Zipped is perfect.

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread herumi at nifty dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #5 from herumi  ---

>Can you file a separate issue with the preprocessed source (-save-temps) since
>it really does look like a separate issue altogether.

May I attach a zipped a.ii which is generated by the following commands?
The size of a.ii is over 1700KiB.

---
>cat a.cpp
#include 

using namespace Xbyak::util;

void f()
{
  ptr[eax] == ptr[eax];
}
---
---
g++-11.2 -O2 -I ../ -Warray-bounds -c a.cpp -save-temps
---

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #4 from Andrew Pinski  ---
(In reply to herumi from comment #3)
> The reason why I made this code is from the issue:
> https://github.com/herumi/xbyak/issues/137

Can you file a separate issue with the preprocessed source (-save-temps), since
it really does look like a separate issue altogether. If it turns out not to be
a separate issue in the end, at least we will have recorded the issue with the
original source.

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread herumi at nifty dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #3 from herumi  ---
>Also if this is from some larger code,
>it might be useful to have the non-reduced testcase
>since the reduced testcase might be showing something different.

I made this code because of the issue:
https://github.com/herumi/xbyak/issues/137

[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

Andrew Pinski  changed:

   What|Removed |Added

Summary|wrong detection of g++  |[10/11/12 Regression] wrong
   |-Warray-bounds about|detection of
   |downcast|-Warray-bounds for
   ||interesting tail resusive
   ||case
  Component|c++ |tree-optimization
   Target Milestone|--- |10.4

--- Comment #2 from Andrew Pinski  ---
Someone else needs to look into this further, because the warning only fires
in cases where the access could happen but its result is never actually used.

Also, if this comes from some larger code, it might be useful to have the
non-reduced testcase, since the reduced testcase might be showing something
different.

[Bug c++/104603] wrong detection of g++ -Warray-bounds about downcast

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

--- Comment #1 from Andrew Pinski  ---
-DA just changes inlining.

This is just an inlining mess which you can see from the diagnostic on the
trunk:
In member function 'bool Base::isX() const',
inlined from 'bool Base::operator==(const Base&) const' at :16:15,
inlined from 'bool X::operator==(const X&) const' at :10:51,
inlined from 'bool Base::operator==(const Base&) const' at :16:63,
inlined from 'bool X::operator==(const X&) const' at :10:51,
inlined from 'void f()' at :24:11:
:4:29: warning: array subscript 2 is outside array bounds of 'X [1]'
[-Warray-bounds]
4 |   bool isX() const { return isX_; }
  | ^~~~


The warning happens before some other optimizations happen which allows GCC to
prove the function will just always return false ...

[Bug c++/104603] New: wrong detection of g++ -Warray-bounds about downcast

2022-02-18 Thread herumi at nifty dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603

Bug ID: 104603
   Summary: wrong detection of g++ -Warray-bounds about downcast
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: herumi at nifty dot com
  Target Milestone: ---

Created attachment 52477
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52477&action=edit
a minimal sample of the bug

g++-10.3.0 and g++-11.2 with -O2 -Warray-bounds on Ubuntu 20.04.3 LTS emit
spurious warnings for the attached code.

g++ -O2 -Warray-bounds -DA t.cpp does not show the warnings.
g++-9 -O2 -Warray-bounds does not show them either.

---
>cat t.cpp
struct Base {
  bool isX_;
  Base(bool isX = false) : isX_(isX) { }
  bool isX() const { return isX_; }
  bool operator==(const Base& rhs) const;
};

struct X : public Base {
  X(const Base& b) : Base(true), b_(b) { }
  bool operator==(const X& rhs) const { return b_ == rhs.b_; }
  Base b_;
};

inline bool Base::operator==(const Base& rhs) const
{
return isX() && rhs.isX() && static_cast<const X&>(*this) ==
static_cast<const X&>(rhs);
}

Base base;

#ifndef A
void f()
{
  X(base) == X(base);
}
#endif

int main()
{
#ifdef A
  X(base) == X(base);
#endif
}
---

---
% g++-10 --version
g++-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
% g++-10 -O2 -Warray-bounds array-bounds-bug.cpp
array-bounds-bug.cpp: In function 'void f()':
array-bounds-bug.cpp:18:29: warning: array subscript 2 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:9: note: while referencing ''
   38 |   X(base) == X(base);
  | ^
array-bounds-bug.cpp:18:29: warning: array subscript 2 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:20: note: while referencing ''
   38 |   X(base) == X(base);
  |^
array-bounds-bug.cpp:18:29: warning: array subscript 3 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:9: note: while referencing ''
   38 |   X(base) == X(base);
  | ^
array-bounds-bug.cpp:18:29: warning: array subscript 3 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:20: note: while referencing ''
   38 |   X(base) == X(base);
  |^
array-bounds-bug.cpp:18:29: warning: array subscript 4 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:9: note: while referencing ''
   38 |   X(base) == X(base);
  | ^
array-bounds-bug.cpp:18:29: warning: array subscript 4 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:20: note: while referencing ''
   38 |   X(base) == X(base);
  |^
array-bounds-bug.cpp:18:29: warning: array subscript 5 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:9: note: while referencing ''
   38 |   X(base) == X(base);
  | ^
array-bounds-bug.cpp:18:29: warning: array subscript 5 is outside array bounds
of 'X [1]' [-Warray-bounds]
   18 |   bool isX() const { return isX_; }
  | ^~~~
array-bounds-bug.cpp:38:20: note: while referencing ''
   38 |   X(base) == X(base);
  |^
array-bounds-bug.cpp:24:51: warning: array subscript 2 is outside array bounds
of 'X [1]' [-Warray-bounds]
   24 |   bool operator==(const X& rhs) const { return b_ == rhs.b_; }
  |~~~^
array-bounds-bug.cpp:38:9: note: while referencing ''
   38 |   X(base) == X(base);
  | ^
array-bounds-bug.cpp:24:58: warning: array subscript 2 is outside array bounds
of 'X [1]' [-Warray-bounds]
   24 |   bool operator==(const X& rhs) const { return b_ == rhs.b_; }
  |  ^~
array-bounds-bug.cpp:38:20: note: while referencing ''
   38 |   X(base) == X(base);
---

[Bug libstdc++/104602] std::source_location::current uses cast from void*

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104602

--- Comment #1 from Andrew Pinski  ---
https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534374.html
Explains why it is currently this way.

[Bug libstdc++/104602] New: std::source_location::current uses cast from void*

2022-02-18 Thread foom at fuhm dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104602

Bug ID: 104602
   Summary: std::source_location::current uses cast from void*
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: foom at fuhm dot net
  Target Milestone: ---

I'm working on implementing __builtin_source_location() in Clang
(https://reviews.llvm.org/D120159).

In testing it against the libstdc++ <source_location> header, I ran into a
minor issue.

"current()" in GNU libstdc++ is defined like so:

static consteval source_location
current(const void* __p = __builtin_source_location()) noexcept
{
  source_location __ret;
  __ret._M_impl = static_cast <const __impl*>(__p);
  return __ret;
}

But! A static_cast from a `const void*` parameter to `const __impl*` is not
permitted in constexpr evaluation:
"""
5. An expression E is a core constant expression unless the evaluation of E,
[...] would evaluate one of the following:
[...]
5.15. a conversion from type cv void* to a pointer-to-object type;
"""
http://eel.is/c++draft/expr.const#5.15

Clang diagnoses this rule, but GCC apparently does not. (it's not really clear
to me why this rule really needs to exist in the standard -- why bother to
police which kinds of pointer casts you're allowed to do, instead of just
raising an error upon _access_ through the wrong type?)

Anyhow, to work around this issue, I plan to simply hardcode an exception to the
check in Clang for casts which occur in a "std::source_location::current"
method. Yet, although it's perhaps too late to avoid this workaround, it'd be
nice if libstdc++ didn't require the use of an invalid cast.

In clang (in my proposed change), __builtin_source_location already returns the
expected `const __impl*` type, rather than `const void*` as it does in GCC. So,
the issue is only the cast TO `void*` and back again in libstdc++. ISTM this
would be fixed by moving the `static_cast<const __impl*>` into the default
parameter expression. That would then be a no-op cast on clang, and an (invalid
but undiagnosed) cast from void in GCC.
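
To make the suggested restructuring concrete, here is a rough sketch of the two
shapes side by side.  It is not the libstdc++ code: Impl, Location, demo_impl
and fake_builtin are hypothetical stand-ins (fake_builtin plays the role of
__builtin_source_location() as GCC declares it, returning `const void*`), and
the functions are ordinary runtime functions rather than consteval, so the
sketch only illustrates where the cast sits, not the constant-evaluation rule.

```cpp
// Sketch only: every name below is a hypothetical stand-in.
struct Impl {
  const char* file;
  unsigned line;
};

static const Impl demo_impl{"demo.cpp", 42};

// Stand-in for a builtin that hands back its data as an untyped pointer.
static const void* fake_builtin() { return &demo_impl; }

struct Location {
  const Impl* impl = nullptr;

  // Current shape: const void* parameter, cast inside the function body.
  static Location current_cast_in_body(const void* p = fake_builtin()) {
    Location ret;
    ret.impl = static_cast<const Impl*>(p);
    return ret;
  }

  // Suggested shape: the cast lives in the default argument, so the body
  // only ever sees an already-typed pointer.
  static Location current_cast_in_default(
      const Impl* p = static_cast<const Impl*>(fake_builtin())) {
    Location ret;
    ret.impl = p;
    return ret;
  }
};
```

Both variants yield the same pointer.  The difference is only that in the
second one a builtin which already returns the typed pointer would make the
cast a no-op, leaving no void* round trip inside the function.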

[Bug ipa/104597] LTO does not inline indirect call

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||lto, missed-optimization
   Severity|normal  |enhancement
 CC||marxin at gcc dot gnu.org
  Component|c++ |ipa

--- Comment #3 from Andrew Pinski  ---
I suspect this is just the standard issue where we don't inline again after
some optimizations. There was another bug like that before.

clang does though.

[Bug go/104290] [12 Regression] trunk 20220214 fails to build libgo on i686-gnu

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290

--- Comment #23 from CVS Commits  ---
The master branch has been updated by Ian Lance Taylor :

https://gcc.gnu.org/g:3343e7e2c4cd2cd111cda86737f539cc6eda49ff

commit r12-7298-g3343e7e2c4cd2cd111cda86737f539cc6eda49ff
Author: Ian Lance Taylor 
Date:   Fri Feb 18 15:04:00 2022 -0800

libgo: update Hurd support

Patches from Svante Signell for PR go/104290.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/386797

[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

--- Comment #5 from Andrew Pinski  ---
But adding noipa to f does though:
[[gnu::noipa]]
std::optional f() { return 1; }

[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

--- Comment #4 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #2)
> This changed with r11-3408-ge977dd5edbcc3a3b88c3bd7efa1026c845af7487

Hmm, even -fno-ipa-modref does not prevent the wrong code from showing up.

[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2

2022-02-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

--- Comment #3 from Jakub Jelinek  ---
Testcase without the unneeded  which aborts if miscompiled.

#include 
#include 
inline std::optional a(std::vector::iterator b,
std::vector::iterator c,
std::optional h(int)) {
  std::optional d;
  find_if(b, c, [&](auto e) {
d = h(e);
return d;
  });
  return d;
}
std::optional f(int) { return 1; }
int
main() {
  std::vector g(100);
  auto b = g.begin();
  auto c = g.end();
  auto e = a(b, c, f);
  if (!e)
__builtin_abort();
}

[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2

2022-02-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

Jakub Jelinek  changed:

   What|Removed |Added

   Last reconfirmed||2022-02-18
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||hubicka at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
This changed with r11-3408-ge977dd5edbcc3a3b88c3bd7efa1026c845af7487

[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

Andrew Pinski  changed:

   What|Removed |Added

Summary|[11 Regression] Invalid |[11/12 Regression] Invalid
   |branch elimination at -O2   |branch elimination at -O2
   Target Milestone|--- |11.3

--- Comment #1 from Andrew Pinski  ---
From fre3 (with details):
Value numbering stmt = *__pred$__d_53 = _58 ();
Setting value number of .MEM_143 to .MEM_135 (changed)
Value numbering stmt = SR.60_59 = MEM  [(const struct optional
&)__pred$__d_53 + 4];
Setting value number of SR.60_59 to 0 (changed)
Value numbering stmt = _60 = VIEW_CONVERT_EXPR(SR.60_59);
Match-and-simplified VIEW_CONVERT_EXPR(SR.60_59) to 0
RHS VIEW_CONVERT_EXPR(SR.60_59) simplified to 0


Hmm,
 Somehow *__pred$__d_53 is missed.

[Bug go/104290] [12 Regression] trunk 20220126 fails to build libgo on i686-gnu

2022-02-18 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290

--- Comment #22 from Ian Lance Taylor  ---
Thanks.  I'll commit your patches #1 through #8.

Your patch #9 is to a generated file.  The fix there can't be to patch just the
top-level Makefile.in.  It has to be to patch whatever is causing Makefile.in
to be generated the way that it is.  I don't myself know what is going wrong
there.

[Bug c++/104601] New: [11 Regression] Invalid branch elimination at -O2

2022-02-18 Thread markus.boeck02 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601

Bug ID: 104601
   Summary: [11 Regression] Invalid branch elimination at -O2
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: markus.boeck02 at gmail dot com
  Target Milestone: ---

The following code was produced by reduction with `creduce`.
When compiled with `-O2`, GCC 11 and later versions incorrectly print `f`,
whereas with `-O1` or lower, or with an older version of GCC, the program
correctly prints 't'.

#include 
#include 
#include 
inline std::optional a(std::vector::iterator b,
std::vector::iterator c,
std::optional h(int)) {
  std::optional d;
  find_if(b, c, [&](auto e) {
d = h(e);
return d;
  });
  return d;
}
std::optional f(int) { return 1; }
main() {
  std::vector g(100);
  auto e = a(g.begin(), g.end(), f);
  printf("%c", e ? 't' : 'f');
}

For the sake of completeness, this was the original code:
https://godbolt.org/z/enx19v7E5

[Bug gcov-profile/100289] [11/12 Regression] libgcc/libgcov.h: bootstrap failure due to missing #include

2022-02-18 Thread j at uriah dot heep.sax.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100289

Joerg Wunsch  changed:

   What|Removed |Added

 CC||j at uriah dot heep.sax.de

--- Comment #16 from Joerg Wunsch  ---
Can confirm this bug when building an AVR cross-compiler (11.2) on FreeBSD.

To get it working, I'm now patching it to #undef HAVE_SYS_MMAN_H in libgcov.h
before starting.

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #30 from Segher Boessenkool  ---
Btw, does this issue exist for the corresponding __builtin_{un,}pack_ibm128
builtins as well?

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #29 from Segher Boessenkool  ---
(In reply to Peter Bergner from comment #28)
> (In reply to Segher Boessenkool from comment #27)
> > OTOH, it makes no sense to test if we have hard float.  The pack and unpack
> > builtins should work (and work the same) whenever long double is
> > double-double.
> 
> Agreed.  For soft-float, the value would be a GPR pair versus a FPR pair
> (for -m64).  It's a little trickier for -m32 -msoft-float compiles, since a
> 128-bit long double would live in 4 32-bit GPRs, so more regs than it takes
> to hold them in FPRs.  Not much of a complication, but just needs to be
> tested on 32-bit to ensure it works as expected.

It can be in memory, even; it doesn't matter.  But it is boring data
movement, and in many cases it doesn't generate any code even :-)

[Bug rtl-optimization/104596] Means to add a comment in the assembly

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596

--- Comment #1 from Andrew Pinski  ---
I am trying to understand what you are trying to do.
You want to mark an insn with a comment which is emitted during formation of
the prologue generation as being generated because of a specific option?
and you don't want to add some extra patterns to do the marking?

Is there a reason why you want to annotate the instruction in the assembly
besides just easier to see if it was emitted because of that option or is there
some assembler reason?

If it is just for debugging, why not, while emitting the prologue, print out the
instruction # that was added (if details dump is enabled) and then use -dP to
see the instruction in the assembler output?

The other thing you could do is have an INSN_NOTE which takes a string which you
then output during the final scan. This requires adding some extra stuff to the
rest of the compiler but it should work.

[Bug c/104506] [12 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.cc:87 on invalid symbol redeclaration

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104506

Andrew Pinski  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2022-Februar
   ||y/590595.html
   Keywords||patch

--- Comment #5 from Andrew Pinski  ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590595.html

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #18 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #6)
> Hmm:
>  _14 = {_1, _5};
>   _8 = VIEW_CONVERT_EXPR<__int128>(_14);
> 
> Wouldn't it better to convert that to just (hopefully I got the order
> correct):
> t1 = (__128)_1
> _8 = BIT_INSERT_EXPR(t1, 64, _5);
> 
> ?

I filed that as PR 104600 since it might be useful in the general case too.

[Bug tree-optimization/104600] New: VCE(vector){} should be converted (or expanded) into BIT_INSERT_EXPR

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104600

Bug ID: 104600
   Summary: VCE(vector){} should be converted (or
expanded) into BIT_INSERT_EXPR
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

When I looked at PR 104582, I noticed that we had:

 _14 = {_1, _5};
  _8 = VIEW_CONVERT_EXPR<__int128>(_14);

Which can be converted into (with the ordering corrected for endianness):
t1 = (__128)_1
_8 = BIT_INSERT_EXPR(t1, 64, _5);

You can see this by taking the following testcases:

#define vector __attribute__((vector_size(16)))

__int128 f(long a, long b)
{
  vector long t = {a, b};
  return (__int128)t;
}

void f1(__int128 *t1, long a, long b)
{
  vector long t = {a, b};
  *t1 =  (__int128)t;
}

void f2(__int128 *t1, long a, long b)
{
  vector long t = {a, b};
  *t1 =  ((__int128)t) + 1;
}


f2 is really bad for x86_64 as GCC does a store to the stack and then loads
back.

Note that if you use | instead of +, GCC even does the right thing.
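
As a source-level illustration of why the two GIMPLE forms are equivalent, here
is a small sketch.  It assumes an LP64 little-endian target such as x86_64
(64-bit long, __int128 available); the function names are made up, and the
scalar version zero-extends and ORs rather than using a real BIT_INSERT_EXPR,
which gives the same 128-bit value.

```cpp
#include <cstring>

// Sketch only: assumes LP64 and little-endian; names are hypothetical.

// Models VIEW_CONVERT_EXPR<__int128>({a, b}): reinterpret two adjacent
// 64-bit lanes as one 128-bit integer.
__int128 via_view_convert(long a, long b) {
  long lanes[2] = {a, b};
  __int128 result;
  static_assert(sizeof(lanes) == sizeof(result), "expects 64-bit long");
  std::memcpy(&result, lanes, sizeof(result));
  return result;
}

// Scalar equivalent: a fills the low 64 bits, b is inserted at bit 64.
__int128 via_bit_insert(long a, long b) {
  unsigned __int128 result = static_cast<unsigned long>(a);  // zero-extend a
  result |= static_cast<unsigned __int128>(static_cast<unsigned long>(b)) << 64;
  return static_cast<__int128>(result);
}
```

On a big-endian target the roles of a and b would swap, which is the ordering
caveat mentioned above.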

[Bug c++/96445] extern template results in missing constructor symbol

2022-02-18 Thread tyu at eridex dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96445

--- Comment #2 from tyu at eridex dot org ---
The extern template and constant are what would appear in the header file for
class C. The explicit instantiation would appear in the source file:

// -- C.h 
template 
class C {
  private:
constexpr C(T){};
  public:
constexpr static C make(T t) { return C(t); }
};
extern template class C;
inline constexpr C constant = C::make(0);

// -- C.cpp --
template class C;

// -- main.cpp ---
int main() { return 0; }
// ---

[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb

2022-02-18 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121

Alexandre Oliva  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=103302

--- Comment #11 from Alexandre Oliva  ---
Ok, now I think the patch for bug 103302, that brought us this regression, is
wrong.  Unlike the old reload, lra computes live ranges for reload pseudos, and
without the clobbers, they end up much longer, possibly overlapping, to the
point that assignments become impossible.

But this is unrelated to the loop.  find_reload_regno_insns assumes
single-insn input and output reloads, and it won't find sequences like those
emitted by emit_move_multi_word (or emit_move_complex_parts, for that matter). 
That was fine when we had sequences that amounted to a clobber plus a pair of
moves, because those plus start_insn added up to more than 3, the cut-off for
find_reload_regno_insns before entering the endless loop.

But an expander for a reload insn that issued two insns could, AFAICT, trigger
the problem in which we find a first_insn and then loop forever looking for the
second_insn after next_insn became NULL and prev_insn isn't looked at any more,
or vice-versa for an output reload.  Alas, neither of the fixes for that solves
the problem:

- getting the loop to terminate and return false when we won't find all of the
reload insns with the current logic gets us an infinite loop one level up, as
we attempt to spill the reg and assign it again indefinitely.

- getting the loop to recognize the entire contiguous sequences, which is what
we should probably do, enables progress, but then, we issue more reloads, and
because of the extended live ranges, we also fail to assign them, and so on,
until we hit the lra max iteration count.

Restoring the clobber renders these changes unnecessary, and I guess that's
what we should do.  It will however bring back the obscure reloading problem we
had on risc-v, that likely affects v850 as well, in which a shared register
assignment crossing such a clobber could end up killing the source assigned to
the same hardware register before copying it to the reload destination.  That
is far less common, but far more painful when it silently hits.

[Bug middle-end/104550] bogus warning from -Wuninitialized + -ftrivial-auto-var-init=pattern

2022-02-18 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104550

--- Comment #18 from qinzhao at gcc dot gnu.org ---
One question here, for the following testing case:

[opc@qinzhao-ol7u9 104550]$ cat t1.c
struct vx_audio_level {
 int has_monitor_level : 1;
};

void vx_set_monitor_level() {
 struct vx_audio_level info;
 __builtin_clear_padding (&info);
}
[opc@qinzhao-ol7u9 104550]$ sh t
/home/opc/Install/latest/bin/gcc -O -Wuninitialized -Wall t1.c -S
t1.c: In function ‘vx_set_monitor_level’:
t1.c:7:2: warning: ‘info’ is used uninitialized [-Wuninitialized]
7 |  __builtin_clear_padding (&info);
  |  ^~~
t1.c:6:24: note: ‘info’ declared here
6 |  struct vx_audio_level info;
  |^~~~

We can see that the compiler emitted exactly the same warning as with
-ftrivial-auto-var-init=pattern.

My question is: is the "info" in __builtin_clear_padding() a REAL use of
"info"?  Is it correct to report the uninitialized-use message for it?
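
For context on that question, here is a small sketch of what the builtin does
to an object, going by its documented behaviour of clearing padding bits.  The
struct Padded and the function padding_is_zero_after_clear are hypothetical,
and the check is skipped (the function just returns true) on compilers that do
not provide __builtin_clear_padding.

```cpp
#include <cstddef>
#include <cstring>

// Sketch only: Padded and padding_is_zero_after_clear are hypothetical names.
struct Padded {
  char tag;   // typically followed by padding bytes before value
  int value;
};

// Returns true if every byte between tag and value reads as zero after
// __builtin_clear_padding, or trivially if the builtin is unavailable.
bool padding_is_zero_after_clear() {
  Padded p;
  std::memset(&p, 0xff, sizeof p);  // fill everything, padding included
  p.tag = 1;
  p.value = 2;
#if defined(__has_builtin)
#if __has_builtin(__builtin_clear_padding)
  __builtin_clear_padding(&p);  // takes the address of the object
  unsigned char bytes[sizeof p];
  std::memcpy(bytes, &p, sizeof p);
  for (std::size_t i = sizeof(char); i < offsetof(Padded, value); ++i)
    if (bytes[i] != 0) return false;
#endif
#endif
  return true;
}
```

The builtin only needs the address and layout of the object in order to write
its padding; whether taking that address should count as a use for
-Wuninitialized is exactly the open question here.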

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #28 from Peter Bergner  ---
(In reply to Segher Boessenkool from comment #27)
> OTOH, it makes no sense to test if we have hard float.  The pack and unpack
> builtins should work (and work the same) whenever long double is
> double-double.

Agreed.  For soft-float, the value would be a GPR pair versus a FPR pair (for
-m64).  It's a little trickier for -m32 -msoft-float compiles, since a 128-bit
long double would live in 4 32-bit GPRs, so more regs than it takes to hold
them in FPRs.  Not much of a complication, but just needs to be tested on
32-bit to ensure it works as expected.
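
The register arithmetic above can be sketched as a small compile-time
calculation.  This is only an illustration: regs_needed is a hypothetical
helper, not anything in the rs6000 backend, and the widths are the ones named
in the comment (128-bit double-double, 64-bit FPRs, 32-bit or 64-bit GPRs).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many registers of reg_bits are needed to hold a
// value of value_bits (rounded up).
constexpr std::size_t regs_needed(std::size_t value_bits, std::size_t reg_bits) {
    return (value_bits + reg_bits - 1) / reg_bits;
}

// A 128-bit double-double long double occupies an FPR pair (hard float) or a
// GPR pair (-m64 soft float), but four GPRs with -m32 -msoft-float.
static_assert(regs_needed(128, 64) == 2, "FPR pair or 64-bit GPR pair");
static_assert(regs_needed(128, 32) == 4, "four 32-bit GPRs");
```

So the -m32 soft-float case needs twice as many registers as the FPR case,
which is why it deserves its own test run.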

[Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #12 from CVS Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:1931cbad498e625b1e24452dcfffe02539b12224

commit r12-7295-g1931cbad498e625b1e24452dcfffe02539b12224
Author: H.J. Lu 
Date:   Fri Feb 18 10:36:53 2022 -0800

pieces-memset-21.c: Expect vzeroupper for ia32

Update gcc.target/i386/pieces-memset-21.c to expect vzeroupper for ia32
caused by

commit fe79d652c96b53384ddfa43e312cb0010251391b
Author: Richard Biener 
Date:   Thu Feb 17 14:40:16 2022 +0100

target/104581 - compile-time regression in mode-switching

PR target/104581
* gcc.target/i386/pieces-memset-21.c: Expect vzeroupper for ia32.

[Bug go/104290] [12 Regression] trunk 20220126 fails to build libgo on i686-gnu

2022-02-18 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290

--- Comment #21 from Ian Lance Taylor  ---
*** Bug 103573 has been marked as a duplicate of this bug. ***

[Bug go/103573] [12 Regression] trunk 20211203 fails to build libgo on i686-gnu (hurd)

2022-02-18 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103573

Ian Lance Taylor  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #2 from Ian Lance Taylor  ---
Closing in favor of PR 104290.

*** This bug has been marked as a duplicate of bug 104290 ***

[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024

--- Comment #3 from Segher Boessenkool  ---
Most of those options were removed.  Does this problem (adjusted properly,
those options are now enabled iff you use -mcpu=power10 or later) still
happen on trunk?

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #27 from Segher Boessenkool  ---
OTOH, it makes no sense to test if we have hard float.  The pack and unpack
builtins should work (and work the same) whenever long double is double-double.

[Bug middle-end/104550] bogus warning from -Wuninitialized + -ftrivial-auto-var-init=pattern

2022-02-18 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104550

--- Comment #17 from qinzhao at gcc dot gnu.org ---
So, based on the discussion so far, I'd like to take the following steps:

1. In GCC12, I will take a conservative solution to fix this bug, i.e:

mark the load "MEM" as not needing a warning during __builtin_clear_padding
folding phase;

This should resolve the issue with the lowest risk of introducing new issues.

2. In GCC13, seek a better way to do padding initialization.  Right now,
based on the discussion so far, there is no conclusion on which way is better
yet.

let me know if you have other comments or suggestions.

[Bug testsuite/104599] New: [12 regression] gcc.dg/deprecated.c has excess errors after r12-7287-g1b71bc7c8b18bd

2022-02-18 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104599

Bug ID: 104599
   Summary: [12 regression] gcc.dg/deprecated.c has excess errors
after r12-7287-g1b71bc7c8b18bd
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:1b71bc7c8b18bd1b22debfde155f175fd1654942, r12-7287-g1b71bc7c8b18bd
make  -k check-gcc RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c"
FAIL: gcc.dg/deprecated.c  (test for warnings, line 28)
FAIL: gcc.dg/deprecated.c (test for excess errors)
# of expected passes		22
# of unexpected failures	2

Excess errors:
/home/seurer/gcc/git/gcc-test/gcc/testsuite/gcc.dg/deprecated.c:28:1: warning:
type is deprecated [-Wdeprecated-declarations]

commit 1b71bc7c8b18bd1b22debfde155f175fd1654942 (HEAD, refs/bisect/bad)
Author: Jason Merrill 
Date:   Tue Feb 15 19:17:03 2022 -0500

tree: tweak warn_deprecated_use

[Bug tree-optimization/104595] unvectorized loop due to bool condition loaded from memory

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104595

--- Comment #2 from Segher Boessenkool  ---
This is exactly the same as the char case here though, so it is a bit silly
that we miss it :-)

[Bug target/104598] [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8

2022-02-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Priority|P3  |P1
 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Fixed now.

[Bug target/104257] rs6000/*intrin.h headers using non-uglified automatic variables

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104257

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:df5ed150ee5fbcb8255e05eed978c4af2b3d9bcc

commit r12-7294-gdf5ed150ee5fbcb8255e05eed978c4af2b3d9bcc
Author: Jakub Jelinek 
Date:   Fri Feb 18 17:21:43 2022 +0100

rs6000: Fix up posix_memalign call in _mm_malloc [PR104598]

The uglification changes went in one spot too far and uglified also
the name of function, posix_memalign should be called like that and
not a non-existent function instead of it.

2022-02-18  Jakub Jelinek  

PR target/104257
PR target/104598
* config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign
rather than __posix_memalign.
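
For readers unfamiliar with the function, here is a minimal sketch of calling
posix_memalign directly.  It is not the code from config/rs6000/mm_malloc.h;
aligned_alloc_sketch is a hypothetical helper that only illustrates the POSIX
signature int posix_memalign (void **, size_t, size_t), and it assumes a POSIX
system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Hypothetical helper, for illustration only.
// Returns storage aligned to `alignment` bytes, or nullptr on failure.
inline void *aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    // posix_memalign requires the alignment to be a power of two and a
    // multiple of sizeof (void *); raise too-small requests to that minimum.
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);

    void *ptr = nullptr;
    // The public name is posix_memalign; there is no __posix_memalign.
    if (posix_memalign(&ptr, alignment, size) != 0)
        return nullptr;
    return ptr;   // release with free()
}
```

The result is released with plain free(), which is one reason the function has
to be called by its real, non-uglified name.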

[Bug target/104598] [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:df5ed150ee5fbcb8255e05eed978c4af2b3d9bcc

commit r12-7294-gdf5ed150ee5fbcb8255e05eed978c4af2b3d9bcc
Author: Jakub Jelinek 
Date:   Fri Feb 18 17:21:43 2022 +0100

rs6000: Fix up posix_memalign call in _mm_malloc [PR104598]

The uglification changes went in one spot too far and uglified also
the name of function, posix_memalign should be called like that and
not a non-existent function instead of it.

2022-02-18  Jakub Jelinek  

PR target/104257
PR target/104598
* config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign
rather than __posix_memalign.

[Bug target/104598] New: [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8

2022-02-18 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598

Bug ID: 104598
   Summary: [12 regression] g++.dg/ext/undef-bool-1.C has excess
errors after r12-7284-gefbb17db52afd8
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:efbb17db52afd802300c4dcce208fab326ec2915, r12-7284-gefbb17db52afd8
make  -k check-gcc RUNTESTFLAGS="dg.exp=g++.dg/ext/undef-bool-1.C"
FAIL: g++.dg/ext/undef-bool-1.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/ext/undef-bool-1.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ext/undef-bool-1.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/ext/undef-bool-1.C  -std=gnu++20 (test for excess errors)
# of unexpected failures	4

Excess errors:
/home/seurer/gcc/git/build/gcc-test/gcc/include/mm_malloc.h:50:7: error:
'__posix_memalign' was not declared in this scope; did you mean
'posix_memalign'?

commit efbb17db52afd802300c4dcce208fab326ec2915 (HEAD, refs/bisect/bad)
Author: Paul A. Clarke 
Date:   Wed Feb 16 20:01:41 2022 -0600

rs6000: __Uglify non-uglified local variables in headers

[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb

2022-02-18 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121

--- Comment #10 from Alexandre Oliva  ---
and then, as I reduced it myself down to the following and compared with the
minimized test, I've finally turned on both of my neurons ;-) and it finally
hit me: "only with -mv850e2v3" didn't mean "not with other multilibs", but
rather "without any optimization".  of course, none of the minimized tests would
survive with optimization.  doh!

this one triggers with -O2 -g -mv850e2v3:

typedef float DFtype __attribute__ ((mode (DF)));
typedef _Complex float DCtype __attribute__ ((mode (DC)));
DCtype
__muldc3 (DFtype a, DFtype b, DFtype c, DFtype d)
{
   DFtype x = __builtin_huge_val () * (a * c - b * d);
   DFtype y = __builtin_huge_val () * (a * d + b * c);

   DCtype res;
  __real__ res = x;
  __imag__ res = y;
  return res;
}

[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb

2022-02-18 Thread aoliva at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121

Alexandre Oliva  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |aoliva at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #9 from Alexandre Oliva  ---
Thanks, I've succeeded in duplicating the problem with the preprocessed
testcase, both with the earlier tree and with a more recent one.  Now I can
look into it.

[Bug c++/104597] LTO does not inline indirect call

2022-02-18 Thread m.cencora at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597

--- Comment #2 from m.cencora at gmail dot com ---
Similarly, when the indirect call is the result of a virtual function call, gcc
cannot optimize it, while clang can:

// main.cpp
struct foo
{
   virtual int getInt0() const = 0;
   virtual int getInt1() const = 0;
};

const foo& getFooInstance();

namespace
{
int test()
{
   auto& foo = getFooInstance();
   return foo.getInt1();
}
}

int main()
{
   return test();
}

// lib1.cpp

struct foo
{
   virtual int getInt0() const = 0;
   virtual int getInt1() const = 0;
};

namespace
{

struct bar final : foo
{
   int getInt0() const override
   {
  return 0;
   }

   int getInt1() const override
   {
  return 1;
   }
};

constexpr bar b;

}

const foo& getFooInstance()
{
   return b;
}


gcc-11 output:
Dump of assembler code for function main:
   0x1040 <+0>: endbr64 
   0x1044 <+4>: lea    0x2d75(%rip),%rdi        # 0x3dc0 <_ZN12_GLOBAL__N_1L1bE>
   0x104b <+11>:        jmp    0x1150 <_ZNK12_GLOBAL__N_13bar7getInt1Ev>


clang-12 output:
Dump of assembler code for function main:
   0x00401110 <+0>: mov    $0x1,%eax
   0x00401115 <+5>: ret

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #17 from Richard Biener  ---
For

FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9]
2

we used to produce

 :
   0:   48 83 ec 28             sub    $0x28,%rsp
   4:   c4 e1 f9 6e d7          vmovq  %rdi,%xmm2
   9:   c4 e1 f9 6e da          vmovq  %rdx,%xmm3
   e:   c4 e3 e9 22 ce 01       vpinsrq $0x1,%rsi,%xmm2,%xmm1
  14:   c4 e3 e1 22 c1 01       vpinsrq $0x1,%rcx,%xmm3,%xmm0
  1a:   48 89 e7                mov    %rsp,%rdi
  1d:   c5 f9 7f 0c 24          vmovdqa %xmm1,(%rsp)
  22:   c5 f9 7f 44 24 10       vmovdqa %xmm0,0x10(%rsp)
  28:   e8 00 00 00 00          call   2d
  2d:   48 83 c4 28             add    $0x28,%rsp
  31:   c3                      ret

but now reject this on costing grounds.  The scalar code is

 :
   0:   48 83 ec 28             sub    $0x28,%rsp
   4:   48 89 3c 24             mov    %rdi,(%rsp)
   8:   48 89 e7                mov    %rsp,%rdi
   b:   48 89 74 24 08          mov    %rsi,0x8(%rsp)
  10:   48 89 54 24 10          mov    %rdx,0x10(%rsp)
  15:   48 89 4c 24 18          mov    %rcx,0x18(%rsp)
  1a:   e8 00 00 00 00          call   1f
  1f:   48 83 c4 28             add    $0x28,%rsp
  23:   c3                      ret

I think the scalar variant is 5 uops up to the call while the vector variant
is 9 uops.  The scalar variant can also execute 4 of the uops in parallel
(well, I guess only up to 3 with 3 store ports).  I think the scalar
variant is better and so I'm inclined to adjust the testcase.

[Bug c++/102286] [constexpr] construct_at incorrectly starts union array lifetime in some cases

2022-02-18 Thread gcc at ebasoft dot com.pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102286

Artur Bać  changed:

   What|Removed |Added

 CC||gcc at ebasoft dot com.pl

--- Comment #4 from Artur Bać  ---
This is not legal in C++20, but GCC allows accessing a non-active member of a
union in constexpr

https://godbolt.org/z/KsEhffeEa

clang refuses and it is right

"construction of subobject of member 'object' of union with active member
'init' is not allowed in a constant expression"
With C++20 there is no way to have aligned_storage for a non-trivial type in
constexpr (storage without initialization, not std::array, which requires
construction of non-trivial objects upon member activation)

https://godbolt.org/z/9EGrf8fbr

#include <memory>

struct foo
{
int * i;
constexpr foo() { i = new int; }
constexpr ~foo() { delete(i); }
};
union storage
{
  constexpr storage() 
  : init{}
  {}
  constexpr ~storage() {}

  foo object[1];
  char init = '\0';
  };

consteval bool test()
{
storage u;
auto p = std::addressof(u.object[0]);
std::construct_at(p);
std::destroy_at(p);
return true;
}

int main()
{ 
static_assert( test() );
return test();
}

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

--- Comment #16 from Richard Biener  ---
See also PR99881 where this XPASSes its testcase for eventual fallout in x264_r
on CLX and 538.imagick_r on Kabylake.  Unlike the fix for that PR I'm simply
re-using x86_cost->sse_to_integer here.

[Bug c++/104597] LTO does not inline indirect call

2022-02-18 Thread m.cencora at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597

--- Comment #1 from m.cencora at gmail dot com ---
clang-12 optimizes it to:
Dump of assembler code for function main:
   0x00401110 <+0>: mov    $0x1,%eax
   0x00401115 <+5>: ret

[Bug c++/104597] New: LTO does not inline indirect call

2022-02-18 Thread m.cencora at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597

Bug ID: 104597
   Summary: LTO does not inline indirect call
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: m.cencora at gmail dot com
  Target Milestone: ---

Given the following files:
// main.cpp
using intfunc = int (*)();

intfunc getIntFunc(int i);

namespace
{
int test()
{
   auto func = getIntFunc(1);
   return func();
}
}

int main()
{
   return test();
}


// lib1.cpp
namespace
{

int getInt0()
{
   return 0;
}

int getInt1()
{
   return 1;
}

int getInt2()
{
   return 2;
}

}

using intfunc = int (*)();

intfunc getIntFunc(int i)
{
   if (i == 0)
   {
  return getInt0;
   }
   else if (i == 1)
   {
  return getInt1;
   }
   else if (i == 2)
   {
  return getInt2;
   }
   __builtin_abort();
}


and compilation with:
g++ -std=c++20 -Wall -Wextra -O3 -flto -fvisibility=hidden
-fvisibility-inlines-hidden -ffunction-sections -Wl,-gc-sections main.cpp
lib1.cpp -o test

Call to getInt1 does not get inlined:
Dump of assembler code for function main:
   0x1040 <+0>: endbr64 
   0x1044 <+4>: jmp    0x1140 <_ZN12_GLOBAL__N_17getInt1Ev>

[Bug c++/104588] memset loses alignment infomation in some cases

2022-02-18 Thread lh_mouse at 126 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104588

--- Comment #3 from LIU Hao  ---
Sounds so. Changing `char a[32]` to `long a[4]` or `void* a[4]` makes GCC
generate MOVAPS like Clang, but `int a[8]` or `short a[16]` does not.

[Bug target/104590] ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary.

2022-02-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
 Ever confirmed|0   |1
   Last reconfirmed||2022-02-18

--- Comment #2 from Segher Boessenkool  ---
Please send the patch to gcc-patches@ if you want it included.  It cannot be
committed until stage 1 opens though (but feel free to send it).

Thanks!

[Bug rtl-optimization/104596] New: Means to add a comment in the assembly

2022-02-18 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596

Bug ID: 104596
   Summary: Means to add a comment in the assembly
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

I wanted to mark some insns in a way that is visible in the assembly, without
having to tinker with the .md file.

The user-level equivalent would be something like:
...
  asm ("// Start: added by x")
  ...
  asm ("// End: added by x")
...

So I wrote:
...
static rtx
gen_comment (const char *s)
{
  const char *sep = " ";
  size_t len = strlen (ASM_COMMENT_START) + strlen (sep) + strlen (s) + 1;
  char *comment = (char *) alloca (len);
  snprintf (comment, len, "%s%s%s", ASM_COMMENT_START, sep, s);
  return gen_rtx_ASM_INPUT_loc (VOIDmode, ggc_strdup (comment),
cfun->function_start_locus);
}
...
and used it to generate comments like this:
...
// #APP 
// 2 "pr53465.c" 1  
// Start: Added by -minit-regs=3:   
// #NO_APP  
mov.u32 %r25, 0;
// #APP 
// 2 "pr53465.c" 1  
// End: Added by -minit-regs=3: 
// #NO_APP  
...

This however is a bit verbose.

The APP/NO_APP is there to separate user insn from compiler insn, but in this
case, the compiler added the comments.

Furthermore, the file info is not meaningful either, we just use
cfun->function_start_locus because with UNKNOWN_LOCATION we run into a
segfault.

Both these issues are addressed by:
...
diff --git a/gcc/final.cc b/gcc/final.cc
index a9868861bd2c..5d47f3d5ba0e 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2642,15 +2642,20 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
if (string[0])
  {
expanded_location loc;
-
-   app_enable ();
-   loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body));
-   if (*loc.file && loc.line)
- fprintf (asm_out_file, "%s %i \"%s\" 1\n",
-  ASM_COMMENT_START, loc.line, loc.file);
+   bool unknown_loc_p
+ = ASM_INPUT_SOURCE_LOCATION (body) == UNKNOWN_LOCATION;
+
+   if (!unknown_loc_p)
+ {
+   app_enable ();
+   loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body));
+   if (*loc.file && loc.line)
+ fprintf (asm_out_file, "%s %i \"%s\" 1\n",
+  ASM_COMMENT_START, loc.line, loc.file);
+ }
fprintf (asm_out_file, "\t%s\n", string);
 #if HAVE_AS_LINE_ZERO
-   if (*loc.file && loc.line)
+   if (!unknown_loc_p && loc.file && *loc.file && loc.line)
  fprintf (asm_out_file, "%s 0 \"\" 2\n", ASM_COMMENT_START);
 #endif
  }
...
after which we can do in gen_comment:
...
-  return gen_rtx_ASM_INPUT_loc (VOIDmode, ggc_strdup (comment),
-   cfun->function_start_locus);
+  return gen_rtx_ASM_INPUT (VOIDmode, ggc_strdup (comment));
...
and have the less verbose:
...
// Start: Added by -minit-regs=3:   
mov.u32 %r25, 0;
// End: Added by -minit-regs=3:
  ...

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #15 from Richard Biener  ---
The patch will cause

FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9]
2
FAIL: gcc.target/i386/pr92658-avx512bw-2.c scan-assembler-times pmovsxdq 2
FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxbq 2
FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxdq 2
FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxwq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxbq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxdq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxwq 2
XPASS: gcc.target/i386/pr99881.c scan-assembler-not xmm[0-9]

I have to look into some of them.  The pr92658 ones seem to be cases like

void
bar_u32_u64 (v2di * dst, v4si src)
{
  unsigned long long tem[2];
  tem[0] = src[0];
  tem[1] = src[1];
  dst[0] = *(v2di *) tem;
}

where we fail to recognize the BIT_FIELD_REF as accessing a pre-existing
vector (we only support a subset of cases during SLP discovery):

  _1 = BIT_FIELD_REF ;
  _2 = (long long unsigned int) _1;
  tem[0] = _2;
  _3 = BIT_FIELD_REF ;
  _4 = (long long unsigned int) _3;
  tem[1] = _4;

but when vectorizing just the store and the conversion as

   [local count: 1073741824]:
  _1 = BIT_FIELD_REF ;
  _3 = BIT_FIELD_REF ;
  _13 = {_1, _3};
  vect__2.110_14 = (vector(2) long long unsigned int) _13;
  MEM  [(long long unsigned int *)] =
vect__2.110_14;

we can recover things on the RTL side.

So we just realize that costing is a difficult thing.

Cost model analysis:
_2 1 times scalar_store costs 12 in body
_4 1 times scalar_store costs 12 in body
(long long unsigned int) _1 1 times scalar_stmt costs 4 in body
(long long unsigned int) _3 1 times scalar_stmt costs 4 in body
(long long unsigned int) _1 1 times vector_stmt costs 4 in body
node 0x415e268 1 times vec_construct costs 20 in prologue
_2 1 times vector_store costs 16 in body
Cost model analysis for part in loop 0:
  Vector cost: 40
  Scalar cost: 32
not vectorized: vectorization is not profitable.

Note this uses icelake-server costs, which have an unusually high
sse_to_integer cost.

The fix here would best be to recognize the BIT_FIELD_REF vector use of course.
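
To make the arithmetic in the cost dump above explicit, here is a small sketch
that re-adds the listed entries.  The CostEntry type and the helper names are
made up for illustration and the final comparison is a simplification; it is
not the vectorizer's actual costing code, only the sums 12+12+4+4 and 4+20+16.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record mirroring one line of the dump:
// "<count> times <kind> costs <cost>".
struct CostEntry {
    const char *kind;
    int count;
    int cost;   // cost as printed in the dump for that line
};

inline int total_cost(const std::vector<CostEntry> &entries) {
    int sum = 0;
    for (const CostEntry &e : entries)
        sum += e.cost;
    return sum;
}

// Entries copied from the dump above.
inline std::vector<CostEntry> scalar_entries() {
    return {{"scalar_store", 1, 12}, {"scalar_store", 1, 12},
            {"scalar_stmt", 1, 4},   {"scalar_stmt", 1, 4}};
}

inline std::vector<CostEntry> vector_entries() {
    return {{"vector_stmt", 1, 4}, {"vec_construct", 1, 20},
            {"vector_store", 1, 16}};
}

// Simplified decision: vectorize only if the vector total is cheaper.
inline bool looks_profitable() {
    return total_cost(vector_entries()) < total_cost(scalar_entries());
}
```

With these numbers the vector total is 40 against a scalar total of 32, so the
vec_construct entry alone (20) is enough to tip the decision.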

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #14 from Richard Biener  ---
Another testcase is

struct S { double a, b; } s;

void
foo (double a, double b)
{
  s.a = a;
  s.b = b;
}

which also receives the same costs and compiles vectorized to

  unpcklpd %xmm1,%xmm0
  movaps %xmm0,0x0(%rip)  
  ret

which is also smaller than unvectorized.

[Bug tree-optimization/104595] unvectorized loop due to bool condition loaded from memory

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104595

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2022-02-18
 Blocks||53947
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
Confirmed.  I think this is an omission somewhere in bool pattern recog since
we need a

  tem = pb[i] != 0 ? -1 : 0;

kind of computation to generate a mask suitable for vectorization here.  We
might have a duplicate bugreport as well.
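
As a scalar illustration of that mask computation (a sketch only, not what the
vectorizer emits), the select can be rewritten so that the bool is first
widened to an all-ones or all-zeros value and then used for a bitwise blend.
select_by_mask is a hypothetical name and the element type is assumed to be an
8-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// c[i] = pb[i] ? a[i] : b[i], expressed through an explicit mask.
inline void select_by_mask(const bool *pb, const std::int8_t *a,
                           const std::int8_t *b, std::int8_t *c,
                           std::size_t n) {
    for (std::size_t i = 0; i < n; i++) {
        // tem = pb[i] != 0 ? -1 : 0;  all bits set when the condition holds.
        const std::int8_t tem = pb[i] != 0 ? -1 : 0;
        // Blend: take the bits of a[i] where the mask is set, b[i] elsewhere.
        c[i] = static_cast<std::int8_t>((a[i] & tem) | (b[i] & ~tem));
    }
}
```

The char-predicated loop already has this shape available, which matches the
observation that only the bool-predicated loop is missed.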


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug c++/104000] Ordinary constructor cannot delegate to `consteval` constructor

2022-02-18 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104000

--- Comment #5 from Fedor Chelnokov  ---
Based on a Stack Overflow answer, a modified example was found with the delegation
to consteval constructor:
```
struct A {   
int i = 0;
consteval A() = default;
A(const A&) = delete;
A(int) : A(A()) {}
};
```
which is accepted in GCC. Demo: https://gcc.godbolt.org/z/5PjraK5ox

Clang rejects it until one removes `A(const A&) = delete`, which is probably
another issue.

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #13 from Richard Biener  ---
Created attachment 52476
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52476=edit
minimal patch

This is a minimal untested patch adjusting APIs to allow for the cost hook to
receive a slp_node in addition to a stmt_vec_info and make the x86 backend
use it and successfully disregard the vectorization that's not doing
a CTOR from memory.

Other targets need minimal adjustments as well of course, and some of the
cleanups (additional overloads for record/add_stmt_cost for scalar and branch
stmts, and two fixes using scalar_stmt rather than vector_stmt kinds for
versioning costs) can and will be split out.

Richard - any comments?  Would you object to doing this for GCC 12 (given we
changed the costing API anyway)?

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

Jonathan Wakely  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #7 from Jonathan Wakely  ---
Not a bug.

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 CC||rsandifo at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug target/104593] Problem with va_list

2022-02-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

Jonathan Wakely  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #7 from Jonathan Wakely  ---
.

Re: Bug#1005297: gcc-12-12-20220214: FTBFS on hurd-i386; Was: gcc-12-12-20220206: FTBFS on hurd-i386

2022-02-18 Thread Svante Signell via Gcc-bugs
retitle 1005297 gcc-12-12-20220214: FTBFS on hurd-i386
thanks

On Thu, 2022-02-10 at 19:34 +0100, Svante Signell wrote:
> Source: gcc-12
> Version: 12_12-20220206-1
> Severity: important
> Tags: patch
> User: debian-h...@lists.debian.org
> Usertags: hurd

Hi again,

Attached are patches to successfully build gcc-12-12-20220214-1 from source:

debian_rules.patch.patch: Adds hurd-specific patches to debian/rules.patch.
gcc_config_gnu.h.diff: Re-enables split-stack support.

The patches below are needed for a successful build of libgo.so.21.0.0:
libgo_go_net_unixsock_readmsg_cloexec.go.diff
libgo_go_runtime_netpoll_hurd.go.diff
libgo_go_runtime_os_hurd.go.diff
libgo_go_syscall_exec_bsd.go.diff
libgo_go_syscall_exec_hurd.go.diff
libgo_go_os_user_cgo_listgroups_unix.go.diff
libgo_go_os_user_getgrouplist_unix.go.diff
libgo_go_internal_testenv_testenv_unix.go.diff
libgo_go_os_exec_internal_fdtest_exists_unix.go.diff

Makefile.in.diff: This patch is modified from the upstream patch
src_Makefile.in.diff since the Debian-specific patch gm2.diff makes changes to
src/Makefile.in before Makefile.in.diff is applied.

All patches have been submitted upstream in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290

The patch, libgo_go_net_unixsock_readmsg_cloexec.go.diff, has already been
committed upstream by Ian.

Thanks!
--- a/debian/rules.patch	2022-02-15 23:07:04.0 +0100
+++ b/debian/rules.patch	2022-02-15 23:14:51.0 +0100
@@ -192,6 +192,17 @@
 
 ifeq ($(DEB_TARGET_ARCH_OS),hurd)
   debian_patches += hurd-changes
+  debian_patches += gcc_config_gnu.h
+  debian_patches += libgo_go_net_unixsock_readmsg_cloexec.go
+  debian_patches += libgo_go_runtime_netpoll_hurd.go
+  debian_patches += libgo_go_runtime_os_hurd.go
+  debian_patches += libgo_go_syscall_exec_bsd.go
+  debian_patches += libgo_go_syscall_exec_hurd.go
+  debian_patches += libgo_go_os_user_cgo_listgroups_unix.go
+  debian_patches += libgo_go_os_user_getgrouplist_unix.go
+  debian_patches += libgo_go_internal_testenv_testenv_unix.go
+  debian_patches += libgo_go_os_exec_internal_fdtest_exists_unix.go
+  debian_patches += Makefile.in
 endif
 
 debian_patches += gcc-ice-dump
--- a/src/gcc/config/gnu.h	2022-02-06 11:59:41.0 +0100
+++ b/src/gcc/config/gnu.h	2022-02-06 12:00:19.0 +0100
@@ -19,6 +19,9 @@
 along with GCC.  If not, see .
 */
 
+#define OPTION_GLIBC_P(opts)	(DEFAULT_LIBC == LIBC_GLIBC)
+#define OPTION_GLIBC		OPTION_GLIBC_P (_options)
+
 #undef GNU_USER_TARGET_OS_CPP_BUILTINS
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()		\
 do {	\
--- a/src/libgo/go/internal/testenv/testenv_unix.go	2022-02-14 03:23:21.0 +0100
+++ b/src/libgo/go/internal/testenv/testenv_unix.go	2022-02-15 13:06:16.0 +0100
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd || solaris
+//go:build aix || darwin || dragonfly || freebsd || hurd || linux || netbsd || openbsd || solaris
 
 package testenv
 
--- a/src/libgo/go/net/unixsock_readmsg_cloexec.go	2022-02-15 00:27:45.0 +0100
+++ b/src/libgo/go/net/unixsock_readmsg_cloexec.go	2022-02-15 00:29:01.0 +0100
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build aix || darwin || freebsd || solaris
+//go:build aix || darwin || freebsd || hurd || solaris
 
 package net
 
--- a/src/libgo/go/os/exec/internal/fdtest/exists_unix.go	2022-02-14 03:23:21.0 +0100
+++ b/src/libgo/go/os/exec/internal/fdtest/exists_unix.go	2022-02-15 13:38:42.0 +0100
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd || solaris
+//go:build aix || darwin || dragonfly || freebsd || hurd || linux || netbsd || openbsd || solaris
 
 // Package fdtest provides test helpers for working with file descriptors across exec.
 package fdtest
--- a/src/libgo/go/os/user/cgo_listgroups_unix.go	2022-02-14 03:23:22.0 +0100
+++ b/src/libgo/go/os/user/cgo_listgroups_unix.go	2022-02-15 12:13:35.0 +0100
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build (dragonfly || darwin || freebsd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo
+//go:build (dragonfly || darwin || freebsd || hurd || (!android && linux) || netbsd || openbsd || (solaris && !illumos)) && cgo && !osusergo
 
 package user
 
--- a/src/libgo/go/os/user/getgrouplist_unix.go	2022-02-14 03:23:22.0 +0100
+++ b/src/libgo/go/os/user/getgrouplist_unix.go	2022-02-15 12:17:12.0 +0100
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found 

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

--- Comment #6 from Jonathan Wakely  ---
I don't know why you think this is a bug in GCC, the spiff_profile_id is a
scoped enumeration type, which cannot be written to an ostream unless there is
an operator<< overload for it.
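
As a minimal sketch of the point above: a scoped enumeration only becomes
streamable once an operator<< overload exists for it. The type profile_id and
its enumerators below are hypothetical stand-ins (they are not the real
charls::spiff_profile_id definition), and to_text is a helper made up for the
illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a scoped enumeration such as
// charls::spiff_profile_id; names and values are made up for illustration.
enum class profile_id : int { none = 0, other = 1 };

// Scoped enumerations do not convert implicitly to their underlying type,
// so streaming needs an explicit overload like this one.
std::ostream& operator<<(std::ostream& os, profile_id id)
{
    return os << static_cast<std::underlying_type<profile_id>::type>(id);
}

// Hypothetical helper: formats a value through a stringstream, as the
// failing code in the report does.
std::string to_text(profile_id id)
{
    std::stringstream ss;
    ss << id;  // compiles only because of the overload above
    return ss.str();
}
```

Without the overload, the `ss << id` line is rejected with a "no match for
'operator<<'" error of the kind quoted in the original report.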

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

--- Comment #5 from Jonathan Wakely  ---
No, that is not preprocessed source; please read https://gcc.gnu.org/bugs/ again.

And please attach the preprocessed source here, instead of using a dodgy
download site with popup ads and dodgy certs:

Connecting to fs12n4.sendspace.com (fs12n4.sendspace.com)|69.31.136.53|:443...
connected.
ERROR: cannot verify fs12n4.sendspace.com's certificate, issued by
‘/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G2’:
  Unable to locally verify the issuer's authority.

[Bug tree-optimization/104595] New: unvectorized loop due to bool condition loaded from memory

2022-02-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104595

Bug ID: 104595
   Summary: unvectorized loop due to bool condition loaded from
memory
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

For the case:

#include "stdbool.h"
#define N 256
typedef char T;
extern T a[N];
extern T b[N];
extern T c[N];
extern bool pb[N];
extern char pc[N];

void predicate_by_bool() {
  for (int i = 0; i < N; i++)
    c[i] = pb[i] ? a[i] : b[i];
}

void predicate_by_char() {
  for (int i = 0; i < N; i++)
    c[i] = pc[i] ? a[i] : b[i];
}

When this is compiled with -Ofast -mcpu=power10, the vectorizer vectorizes the
second function, predicate_by_char, but not the first. It seems that GCC
currently supports only a few cases involving bool types, such as the patterns
handled in vect_recog_bool_pattern.

I guess the size of bool is the problem here. On the size of bool, C
says "An object declared as type _Bool is large enough to store the values 0
and 1.", and C++ says "The value of sizeof(bool) is implementation defined and
might differ from 1.". But "implementation defined" here appears to mean
defined by the compiler, so the compiler knows the size when compiling. If so,
we can load the value with an integer type of the same size and compare it with
zero to get the predicate, just like the char variant. I think it is then
reasonable to expect both of these loops to be vectorized?
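
The proposed rewrite can be sketched at source level as follows. This is only
an illustration of the idea, not what the vectorizer does internally; it
assumes a one-byte bool (checked by the static_assert), and the function names
and the forms_agree helper are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: bool occupies one byte on the target. This holds for common
// ABIs but is implementation-defined in general.
static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Original form: the predicate is loaded as a bool.
void select_by_bool(char* c, const bool* p, const char* a, const char* b,
                    std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        c[i] = p[i] ? a[i] : b[i];
}

// Rewritten form: load the predicate through an unsigned char of the same
// size and compare it with zero, like the char variant.
void select_by_byte(char* c, const bool* p, const char* a, const char* b,
                    std::size_t n)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i)
        c[i] = (bytes[i] != 0) ? a[i] : b[i];
}

// Hypothetical helper: returns true when both forms agree on a small input.
bool forms_agree()
{
    const bool p[4] = {true, false, false, true};
    const char a[4] = {'a', 'b', 'c', 'd'};
    const char b[4] = {'w', 'x', 'y', 'z'};
    char r1[4];
    char r2[4];
    select_by_bool(r1, p, a, b, 4);
    select_by_byte(r2, p, a, b, 4);
    for (int i = 0; i < 4; ++i)
        if (r1[i] != r2[i])
            return false;
    return r1[0] == 'a' && r1[1] == 'x' && r1[2] == 'y' && r1[3] == 'd';
}
```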

[Bug target/104593] Problem with va_list

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

--- Comment #6 from Jamaika  ---
https://www.sendspace.com/file/e4n2xj

echo off
set PATH=C:\MSYS1201\bin;%PATH%
rem echo %PATH%
rem cd "C:\MSYS1201\bin"


cd lib\CommonLib
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -c %%f -o %%~nf.o
cd ..\..
cd lib\apputils
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602
-DVVENC_ENABLE_THIRDPARTY_JSON=1 -c %%f -o %%~nf.o
cd ..\..
cd lib\Utilities
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -c %%f -o %%~nf.o
cd ..\..
cd lib\EncoderLib
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602
-DVVENC_ENABLE_THIRDPARTY_JSON=1 -c %%f -o %%~nf.o
cd ..\..
cd lib\DecoderLib
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -c %%f -o %%~nf.o
cd ..\..
cd lib\vvenc
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -c %%f -o %%~nf.o
cd ..\..

cd App\vvencapp
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -DNDEBUG -ftree-vectorize -g0
-O3 -fPIC -DWIN32 -DWINVER=0x0602 -D_WIN32_WINNT=0x0602 -c %%f -o %%~nf.o
cd ..\..

g++ -std=gnu++14 -fPIC -ftree-vectorize -g0 -O3 -Wall -Wextra -Werror -o
VVEncoderApp.exe App\vvencapp\vvencapp.o lib\apputils\ParseArg.o
lib\apputils\VVEncAppCfg.o lib\apputils\YuvFileIO.o lib\vvenc\vvenc.o
lib\vvenc\vvencCfg.o lib\vvenc\vvencimpl.o lib\Utilities\NoMallocThreadPool.o
lib\EncoderLib\BitAllocation.o lib\EncoderLib\BinEncoder.o
lib\EncoderLib\CABACWriter.o lib\EncoderLib\EncAdaptiveLoopFilter.o
lib\EncoderLib\EncGOP.o lib\EncoderLib\EncCu.o lib\EncoderLib\EncHRD.o
lib\EncoderLib\EncLib.o lib\EncoderLib\EncModeCtrl.o
lib\EncoderLib\EncPicture.o lib\EncoderLib\EncReshape.o
lib\EncoderLib\EncSampleAdaptiveOffset.o lib\EncoderLib\EncSlice.o
lib\EncoderLib\InterSearch.o lib\EncoderLib\IntraSearch.o
lib\EncoderLib\LegacyRateCtrl.o lib\EncoderLib\NALwrite.o
lib\CommonLib\ProfileLevelTier.o lib\EncoderLib\RateCtrl.o
lib\EncoderLib\SEIEncoder.o lib\EncoderLib\SEIwrite.o
lib\EncoderLib\VLCWriter.o lib\CommonLib\AdaptiveLoopFilter.o
lib\CommonLib\AffineGradientSearch.o lib\CommonLib\BitStream.o
lib\CommonLib\Buffer.o lib\CommonLib\CodingStructure.o
lib\CommonLib\ContextModelling.o lib\CommonLib\Contexts.o
lib\CommonLib\DepQuant.o lib\CommonLib\dtrace.o
lib\CommonLib\InterpolationFilter.o lib\CommonLib\InterPrediction.o
lib\CommonLib\IntraPrediction.o lib\CommonLib\LoopFilter.o
lib\CommonLib\MatrixIntraPrediction.o lib\CommonLib\MCTF.o lib\CommonLib\Mv.o
lib\CommonLib\Picture.o lib\CommonLib\PicYuvMD5.o lib\CommonLib\Quant.o
lib\CommonLib\QuantRDOQ.o lib\CommonLib\QuantRDOQ2.o lib\CommonLib\RdCost.o
lib\CommonLib\Reshape.o lib\CommonLib\Rom.o lib\CommonLib\RomTr.o
lib\CommonLib\SampleAdaptiveOffset.o lib\CommonLib\SEI.o lib\CommonLib\Slice.o
lib\CommonLib\TrQuant.o lib\CommonLib\TrQuant_EMT.o lib\CommonLib\Unit.o
lib\CommonLib\UnitPartitioner.o lib\CommonLib\UnitTools.o
lib\DecoderLib\AnnexBread.o lib\DecoderLib\CABACReader.o
lib\DecoderLib\BinDecoder.o lib\DecoderLib\DecCu.o lib\DecoderLib\DecLib.o
lib\DecoderLib\DecSlice.o lib\DecoderLib\NALread.o lib\DecoderLib\SEIread.o
lib\DecoderLib\VLCReader.o 
pause

[Bug target/104593] Problem with va_list

2022-02-18 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

--- Comment #5 from Jonathan Wakely  ---
No, that's your compiler. Please provide the preprocessed source. As requested
when creating bug reports, please read https://gcc.gnu.org/bugs/ for an
explanation of what you need to provide.

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

--- Comment #4 from Jamaika  ---
https://www.sendspace.com/file/ubncby

echo off
set PATH=C:\msys1201\bin;%PATH%
rem echo %PATH%
rem cd "C:\msys1201\bin"

for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -ftree-vectorize -g0 -O3
-fPIC -DHAVE_ISATTY -c %%f -o %%~nf.o
cd boost
for %%f in ("%~dp1*.cpp") do g++.exe -std=gnu++14 -ftree-vectorize -g0 -O3
-fPIC -c %%f -o %%~nf.o
cd ..


g++.exe -std=gnu++14 -Wall -Wextra -ftree-vectorize -g0 -O3 -fPIC cjpls.o
cjpls_options.o crc32.o dest.o format.o image.o jls.o options.o pnm.o raw.o
source.o utils.o boost/cmdline.o boost/config_file.o boost/convert.o
boost/options_description.o boost/parsers.o boost/positional_options.o
boost/split.o boost/utf8_codecvt_facet.o boost/value_semantic.o
boost/variables_map.o boost/winmain.o jpegls/charls_jpegls_decoder.o
jpegls/charls_jpegls_encoder.o jpegls/jpegls_error.o
jpegls/jpeg_stream_reader.o jpegls/jpeg_stream_writer.o jpegls/jpegls.o
jpegls/util.o jpegls/version.o -o cjpls.exe
pause

[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb

2022-02-18 Thread sebastian.huber--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121

--- Comment #8 from Sebastian Huber  ---
I can't reproduce the issue with the reduced test case, however, compiling the
preprocessed file still results in an infinite loop.

[Bug tree-optimization/100464] [11 Regression] emitted binary code changes when -g is enabled at -O3

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100464

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|ASSIGNED|RESOLVED
  Known to fail||11.2.0
 Resolution|--- |FIXED
  Known to work||11.2.1

--- Comment #17 from Richard Biener  ---
Fixed.

[Bug go/100537] [12 Regression] Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM after gcc-12-657-ga076632e274a

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100537

--- Comment #21 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener
:

https://gcc.gnu.org/g:8a1e92ff45e8e254fb557d20dcfa54a88d354329

commit r11-9592-g8a1e92ff45e8e254fb557d20dcfa54a88d354329
Author: Ian Lance Taylor 
Date:   Sat May 22 19:19:13 2021 -0700

compiler: mark global variables whose address is taken

To implement this, change the backend to use flag bits for variables.

Fixes https://gcc.gnu.org/PR100537

PR go/100537
* go-gcc.cc (class Gcc_backend): Update methods that create
variables to take a flags parameter.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/322129
(cherry picked from commit 358832c46a378e5a0b8a2fa3c2739125e3e680c7)

[Bug c++/100468] set_up_extended_ref_temp via extend_ref_init_temps_1 drops TREE_ADDRESSABLE

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100468

--- Comment #7 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener
:

https://gcc.gnu.org/g:462900ba21f5fdf865c93f693083da3179dd3151

commit r11-9591-g462900ba21f5fdf865c93f693083da3179dd3151
Author: Richard Biener 
Date:   Fri May 7 09:51:18 2021 +0200

middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts

canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases
of ADDR_EXPRs but that's futile when we're dealing with CTOR values
in debug stmts.  This rips out the code which was added for Java
and should have been an assertion when we didn't have debug stmts.
To not regress g++.dg/tree-ssa/array-temp1.C we have to adjust the
testcase to not look for a no longer applied invalid optimization.

2021-05-10  Richard Biener  

PR middle-end/100464
PR c++/100468
gcc/
* gimple-fold.c (canonicalize_constructor_val): Do not set
TREE_ADDRESSABLE.

gcc/cp/
* call.c (set_up_extended_ref_temp): Mark the temporary
addressable if the TARGET_EXPR was.

gcc/testsuite/
* gcc.dg/pr100464.c: New testcase.
* g++.dg/tree-ssa/array-temp1.C: Adjust.

(cherry picked from commit a076632e274abe344ca7648b7c7f299273d4cbe0)

[Bug tree-optimization/100464] [11 Regression] emitted binary code changes when -g is enabled at -O3

2022-02-18 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100464

--- Comment #16 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Richard Biener
:

https://gcc.gnu.org/g:462900ba21f5fdf865c93f693083da3179dd3151

commit r11-9591-g462900ba21f5fdf865c93f693083da3179dd3151
Author: Richard Biener 
Date:   Fri May 7 09:51:18 2021 +0200

middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts

canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases
of ADDR_EXPRs but that's futile when we're dealing with CTOR values
in debug stmts.  This rips out the code which was added for Java
and should have been an assertion when we didn't have debug stmts.
To not regress g++.dg/tree-ssa/array-temp1.C we have to adjust the
testcase to not look for a no longer applied invalid optimization.

2021-05-10  Richard Biener  

PR middle-end/100464
PR c++/100468
gcc/
* gimple-fold.c (canonicalize_constructor_val): Do not set
TREE_ADDRESSABLE.

gcc/cp/
* call.c (set_up_extended_ref_temp): Mark the temporary
addressable if the TARGET_EXPR was.

gcc/testsuite/
* gcc.dg/pr100464.c: New testcase.
* g++.dg/tree-ssa/array-temp1.C: Adjust.

(cherry picked from commit a076632e274abe344ca7648b7c7f299273d4cbe0)

[Bug c++/104594] narrowing of -1 to unsigned char not detected with requires concepts

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104594

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-02-18
Summary|narrowing conversion of -1  |narrowing of -1 to unsigned
   |to unsigned char at compile |char not detected with
   |time not detected   |requires concepts
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.
GCC correctly detects:
template 
constexpr bool Geometry = (DIM_FROM == -1);
template 
constexpr bool tt1 = Geometry;
template
struct X {
  static constexpr int n = N;
};
bool t = tt1>;

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #12 from rguenther at suse dot de  ---
On Fri, 18 Feb 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
> 
> --- Comment #11 from Jakub Jelinek  ---
> True.
> So another option is to try to undo some of those short vectorization cases
> during isel, expansion or later, though e.g. for the negdi2 case it will go
> already during expansion into memory.

Yes, there are duplicates about this issue and it's really hard to
solve generally.  There's the possibility to try improving on the
costing side but currently the cost hooks just see

ix86_vector_costs::add_stmt_cost (this=0x41b88c0, count=1, 
kind=vec_construct, stmt_info=0x0, vectype=, 
misalign=0, where=vect_prologue)

so they have no idea about the feeding stmts.  The cost entry
is generated by vect_prologue_cost_for_slp which knows the
scalar operands but we do not pass the SLP node down to the cost
hooks (that's something on my list but my idea was to push it back
when we only have SLP nodes and thus could go w/o the stmt_info then).

The other possibility is (for the original testcase) to anticipate
that RTL expansion will expand 'w' to a TImode register and take
that as a reason to pessimize vectorization (but we don't know how
it's going to be used, so that's probably a flawed attempt).

The only short-term fixes are a) biasing the costing, regressing
the from memory case, b) pass down the SLP node where available
and look at the defs of the CTOR components, costing a gpr->xmm
move where it can be anticipated.

b) is more future-proof, if we'd take that at this point I can
see how intrusive it would be.

[Bug c++/104594] New: narrowing conversion of -1 to unsigned char at compile time not detected

2022-02-18 Thread raffael at casagrande dot ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104594

Bug ID: 104594
   Summary: narrowing conversion of -1 to unsigned char at compile
time not detected
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: raffael at casagrande dot ch
  Target Milestone: ---

The current gcc trunk compiles the following piece of code:

template 
concept Geometry = (DIM_FROM == -1);

template 
requires Geometry
auto GaussNewton(const INIT& init) -> void {}

template
struct X {
  static constexpr int n = N;
};

int main() { GaussNewton(X<-1>{}); }
--

I think this should NOT compile since it entails a narrowing conversion of -1
to an unsigned char type at compile time. Clang as well as MSVC fail to compile
the code.
(In many other cases, gcc also fails to compile if such a narrowing conversion
happens at compile time.)
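
To illustrate why converting -1 to unsigned char counts as narrowing, here is a
small sketch that does not depend on the reporter's templates. The names
wrapped and fits_unsigned_char are made up for the example, and the second
static_assert assumes the usual case where int is wider than unsigned char.

```cpp
#include <cassert>
#include <limits>

// An explicit conversion is allowed and wraps modulo 2^N.
constexpr unsigned char wrapped = static_cast<unsigned char>(-1);
static_assert(wrapped == std::numeric_limits<unsigned char>::max(),
              "-1 wraps to the maximum value");

// In a comparison the unsigned char is promoted to int (assuming int is
// wider than unsigned char), so the wrapped value no longer equals -1.
static_assert(!(wrapped == -1), "the promoted value is not -1");

// List-initialization rejects the same conversion as narrowing:
//   unsigned char bad{-1};                      // ill-formed
// Non-type template arguments are converted constant expressions and
// follow the same rule:
//   template <unsigned char D> struct Dim {};
//   Dim<-1> d;                                  // ill-formed

// Hypothetical helper: true when v is representable as unsigned char,
// i.e. when the conversion would not be narrowing.
constexpr bool fits_unsigned_char(int v)
{
    return v >= 0 && v <= std::numeric_limits<unsigned char>::max();
}
```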

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

--- Comment #3 from Andrew Pinski  ---
(In reply to Jamaika from comment #2)
> http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z

That is the GCC binary.
Please read https://gcc.gnu.org/bugs/ and provide the preprocessed source for
what you are compiling.
And also all of the options.

[Bug target/104593] Problem with va_list

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

--- Comment #4 from Jamaika  ---
(In reply to Andrew Pinski from comment #1)
> Can you provide the preprocessed source?

http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

--- Comment #2 from Jamaika  ---
http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z

[Bug target/104593] Problem with va_list

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

--- Comment #3 from Andrew Pinski  ---
#include 
#include 
extern std::function g_msgFnc;

Does not warn for me on x86_64-linux-gnu.

[Bug target/104593] Problem with va_list

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic
  Component|c++ |target

--- Comment #2 from Andrew Pinski  ---
Also what exact target is this on?

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #11 from Jakub Jelinek  ---
True.
So another option is to try to undo some of those short vectorization cases
during isel, expansion or later, though e.g. for the negdi2 case it will go
already during expansion into memory.

[Bug c++/104593] Problem with va_list

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2022-02-18
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Can you provide the preprocessed source?

[Bug c++/104593] New: Problem with va_list

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593

Bug ID: 104593
   Summary: Problem with va_list
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lukaszcz18 at wp dot pl
  Target Milestone: ---

I use vvenc/vvdec c++14
https://github.com/fraunhoferhhi/vvenc/commit/69469d7ac5de882d9f5e12b24ee87f376df20262

In file included from AffineGradientSearch.h:53,
 from AffineGradientSearch.cpp:57:
CommonDef.h:595:62: warning: ignoring attributes on template argument
'void(void*, int, const char*, va_list)' {aka 'void(void*, int, const char*,
char*)'} [-Wignored-attributes]
  595 | extern std::function
g_msgFnc;

[Bug libstdc++/104592] Problem with std::basic_ostream

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2022-02-18
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Can you provide the preprocessed source?

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #10 from Richard Biener  ---
Btw, I think it makes sense to build libgcc with -mno-sse, maybe even
-mgeneral-regs-only.  Or globally with -fno-tree-vectorize (but we likely do
not want %xmm uses for parameter setup either with the move-by-pieces changes -
IIRC I've seen uses in the unwinder code trapping because of a misaligned stack
in an executable).

[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow

2022-02-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug c++/104592] New: Problem with std::basic_ostream

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592

Bug ID: 104592
   Summary: Problem with std::basic_ostream
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lukaszcz18 at wp dot pl
  Target Milestone: ---

I use boost 1.78 c++11
https://boostorg.jfrog.io/artifactory/main/release/1.78.0/source/boost_1_78_0.zip
and charls-tools
https://github.com/malaterre/charls-tools/commit/b27c6071a42996b26ca91fbb015d4b42238d13cf
and
https://github.com/team-charls/charls/commit/662d4f2a0238357ccc4d89cd14b1fa67d2597ff1


jplsinfo.cpp:26:12: error: no match for 'operator<<' (operand types are
'std::stringstream' {aka 'std::__cxx11::basic_stringstream'} and 'const
charls::spiff_profile_id')
   26 | ss << val;
  | ~~~^~
In file included from c:\msys1200\include\c++\12.0.1\istream:39,
 from c:\msys1200\include\c++\12.0.1\fstream:38,
 from jplsinfo.cpp:6:
c:\msys1200\include\c++\12.0.1\ostream:108:7: note: candidate:
'std::basic_ostream<_CharT, _Traits>::__ostream_type&
std::basic_ostream<_CharT, _Traits>::operator<<(__ostream_type&
(*)(__ostream_type&)) [with _CharT = char; _Traits = std::char_traits;
__ostream_type = std::basic_ostream]'
  108 |   operator<<(__ostream_type& (*__pf)(__ostream_type&))
  |   ^~~~
c:\msys1200\include\c++\12.0.1\ostream:108:36: note:   no known conversion for
argument 1 from 'const charls::spiff_profile_id' to
'std::basic_ostream::__ostream_type&
(*)(std::basic_ostream::__ostream_type&)' {aka 'std::basic_ostream&
(*)(std::basic_ostream&)'}
  108 |   operator<<(__ostream_type& (*__pf)(__ostream_type&))
  |  ~~^~

[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow

2022-02-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024

--- Comment #2 from Kewen Lin  ---
Created attachment 52475
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52475=edit
Tested patch

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #9 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #8)
> Just trying a dumb microbenchmark:
> struct S { unsigned long a, b; } s;
> 
> __attribute__((noipa)) void
> foo (unsigned long a, unsigned long b)
> {
>   s.a = a;
>   s.b = b;
> }
> 
> int
> main ()
> {
>   int i;
>   for (i = 0; i < 10; i++)
>     foo (42, 43);
>   return 0;
> }
> the GCC 11 vs. GCC 12 code:
> - movq    %rdi, s(%rip)
> - movq    %rsi, s+8(%rip)
> + movq    %rdi, %xmm0
> + movq    %rsi, %xmm1
> + punpcklqdq  %xmm1, %xmm0
> + movaps  %xmm0, s(%rip)
> seems to be exactly the same speed (on i9-7960X) and the GCC 11 code is 7
> bytes smaller.

The GCC 12 code is 30% slower on Zen 2 (the gpr -> xmm move is comparatively
more costly there).  As said we fail to account for that.  But as I said
the cost is not there if it's

struct S { unsigned long a, b; } s;

__attribute__((noipa)) void
foo (unsigned long *a, unsigned long *b)
{
  unsigned long a_ = *a;
  unsigned long b_ = *b;
  s.a = a_;
  s.b = b_;
}

which vectorizes to

movq    (%rdi), %xmm0
movhps  (%rsi), %xmm0
movaps  %xmm0, s(%rip)
ret

which is _smaller_ than the scalar code.  So it's important to be able
to distinguish those cases.  The above is also

a__3 1 times scalar_store costs 12 in body
b__5 1 times scalar_store costs 12 in body
a__3 1 times vector_store costs 12 in body
 1 times vec_construct costs 8 in prologue
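
The arithmetic behind the dump above can be spelled out as a small sketch. The
cost figures are copied from the dump; the rule "prefer the vector form when
its total is lower" is an assumed simplification of the cost comparison, and
the names are made up for the example.

```cpp
#include <cassert>

// Cost figures copied from the dump above.
constexpr int scalar_store_cost = 12;   // per scalar store, two of them
constexpr int vector_store_cost = 12;   // one vector store in the body
constexpr int vec_construct_cost = 8;   // building the vector in the prologue

constexpr int scalar_total() { return 2 * scalar_store_cost; }
constexpr int vector_total() { return vector_store_cost + vec_construct_cost; }

// Assumed simplified rule: the vector form wins when its total is lower.
constexpr bool vector_looks_cheaper()
{
    return vector_total() < scalar_total();
}
```

With these numbers the vector form totals 20 against 24 for the scalar form, so
it looks cheaper whether the operands come from general registers or from
memory, because the vec_construct entry is the same 8 in both cases.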

[Bug libstdc++/104591] Problem with unary_function

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=91260

--- Comment #2 from Andrew Pinski  ---
See PR 91260.

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #26 from Kewen Lin  ---
Created attachment 52474
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52474=edit
Untested patch

[Bug target/104353] ppc64le: Apparent reliance on undefined behavior of xvcvdpsxws

2022-02-18 Thread seiko at imavr dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104353

Sayed Adel  changed:

   What|Removed |Added

 CC||seiko at imavr dot com

--- Comment #3 from Sayed Adel  ---
That close leads to another bug, since GCC follows the previous versions of the
ISA in several places, for example in the VSX intrinsics vec_floate, vec_floato,
etc. I filed a bug for it; see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590

[Bug libstdc++/104591] Problem with unary_function

2022-02-18 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
  Component|c++ |libstdc++
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
std::unary_function has been deprecated since C++11 and was removed in
C++17.
GCC versions before 12 simply did not warn about the deprecation.
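
A common way to silence the warning is to drop the base class and, only if
other code still relies on them, spell out the nested typedefs that
std::unary_function used to provide. The sketch below uses a hypothetical key
type Position and hash functor PositionHash; it is not the code from the
reported project.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical key type, made up for illustration.
struct Position { int x; int y; };

inline bool operator==(const Position& l, const Position& r)
{
    return l.x == r.x && l.y == r.y;
}

// Before: struct PositionHash
//             : public std::unary_function<Position, std::size_t>
// After: no base class. The typedefs below are optional and only kept in
// case existing code still refers to argument_type or result_type.
struct PositionHash
{
    typedef Position argument_type;
    typedef std::size_t result_type;

    std::size_t operator()(const Position& p) const
    {
        return std::hash<int>()(p.x) ^ (std::hash<int>()(p.y) << 1);
    }
};

// Hypothetical helper: inserts three positions, two of them equal, and
// reports how many distinct ones the set kept.
std::size_t count_unique()
{
    std::unordered_set<Position, PositionHash> seen;
    seen.insert(Position{1, 2});
    seen.insert(Position{1, 2});  // duplicate, not stored twice
    seen.insert(Position{3, 4});
    return seen.size();
}
```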

[Bug target/104590] ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary.

2022-02-18 Thread seiko at imavr dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590

--- Comment #1 from Sayed Adel  ---
Forgot to mention:
- vector signed int vec_signede(vector double) -> xvcvdpsxws
- vector signed int vec_signedo(vector double) -> xvcvdpsxws

[Bug c++/104591] New: Problem with unary_function

2022-02-18 Thread lukaszcz18 at wp dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591

Bug ID: 104591
   Summary: Problem with unary_function
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lukaszcz18 at wp dot pl
  Target Milestone: ---

I use library vvenc c++14
https://github.com/fraunhoferhhi/vvenc/commit/69469d7ac5de882d9f5e12b24ee87f376df20262
or jvetvvc c++11
https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/commit/ab0bea02235bb876c9d3bd8a9d3b2fca7ad1b8eb
and https://github.com/Jamaika1/mingw_std_threads
and http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z

In file included from Unit.h:53,
 from AdaptiveLoopFilter.h:54:
Common.h:184:41: warning: 'template struct
std::unary_function' is deprecated [-Wdeprecated-declarations]
  184 |   struct hash : public unary_function
  | ^~
In file included from c:\msys1200\include\c++\12.0.1\string:48,
 from c:\msys1200\include\c++\12.0.1\bits\locale_classes.h:40,
 from c:\msys1200\include\c++\12.0.1\bits\ios_base.h:41,
 from c:\msys1200\include\c++\12.0.1\ios:42,
 from c:\msys1200\include\c++\12.0.1\ostream:38,
 from c:\msys1200\include\c++\12.0.1\iostream:39,
 from CommonDef.h:53:
c:\msys1200\include\c++\12.0.1\bits\stl_function.h:117:12: note: declared here
  117 | struct unary_function
  |^~

[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain

2022-02-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623

--- Comment #25 from Kewen Lin  ---
The key difference from the previous bif support is that previously we checked
TARGET_HARD_FLOAT but now we don't. I think we still need to check it, since
according to the documentation at
https://gcc.gnu.org/onlinedocs/gcc/Basic-PowerPC-Built-in-Functions-Available-on-ISA-2_002e05.html,
these bifs require the "-mhard-float" option. Also, all the alternatives of
unpack_nodm and pack with mode iterator FMOVE128 use
constraint d, which only takes effect with -mhard-float.

For the record, these are the guards in the previous support:

/* 128-bit long double floating point builtins.  */
#define BU_LDBL128_2(ENUM, NAME, ATTR, ICODE)   \
  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
"__builtin_" NAME,  /* NAME */  \
(RS6000_BTM_HARD_FLOAT  /* MASK */  \
 | RS6000_BTM_LDBL128), \
(RS6000_BTC_ ## ATTR/* ATTR */  \
 | RS6000_BTC_BINARY),  \
CODE_FOR_ ## ICODE) /* ICODE */

/* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
   __ibm128 is available).  */
#define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)\
  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,  /* ENUM */  \
"__builtin_" NAME,  /* NAME */  \
(RS6000_BTM_HARD_FLOAT  /* MASK */  \
 | RS6000_BTM_FLOAT128),\
(RS6000_BTC_ ## ATTR/* ATTR */  \
 | RS6000_BTC_BINARY),  \
CODE_FOR_ ## ICODE) /* ICODE */
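
How such a mask guard behaves can be sketched as below. The bit values are
hypothetical (the real RS6000_BTM_* constants live in the rs6000 backend), and
the rule that every bit of a built-in's mask must be enabled is an assumption
made for this illustration, not a quote of the backend code.

```cpp
#include <cassert>

// Hypothetical bit values; the real RS6000_BTM_* constants are defined in
// the rs6000 backend and are not reproduced here.
enum : unsigned {
    BTM_HARD_FLOAT = 1u << 0,
    BTM_LDBL128    = 1u << 1,
    BTM_FLOAT128   = 1u << 2
};

// Assumed rule: a built-in is usable only when every bit of its mask is
// enabled for the current target options.
constexpr bool builtin_enabled(unsigned required, unsigned enabled)
{
    return (required & enabled) == required;
}

// Mask in the style of BU_LDBL128_2: hard float plus 128-bit long double.
constexpr unsigned ldbl128_builtin_mask = BTM_HARD_FLOAT | BTM_LDBL128;
```

Under this reading, a target with 128-bit long double but without hard float
would not get the built-in, which is the check that the comment above argues
should be kept.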

[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization

2022-02-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #8 from Jakub Jelinek  ---
Just trying a dumb microbenchmark:
struct S { unsigned long a, b; } s;

__attribute__((noipa)) void
foo (unsigned long a, unsigned long b)
{
  s.a = a;
  s.b = b;
}

int
main ()
{
  int i;
  for (i = 0; i < 10; i++)
    foo (42, 43);
  return 0;
}
the GCC 11 vs. GCC 12 code:
-   movq    %rdi, s(%rip)
-   movq    %rsi, s+8(%rip)
+   movq    %rdi, %xmm0
+   movq    %rsi, %xmm1
+   punpcklqdq  %xmm1, %xmm0
+   movaps  %xmm0, s(%rip)
seems to be exactly the same speed (on i9-7960X) and the GCC 11 code is 7 bytes
smaller.
