date:20190304

[Bug target/88497] Improve Accumulation in Auto-Vectorized Code

2019-03-04 Thread linkw at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

Kewen Lin  changed:

   What|Removed |Added

 Status|SUSPENDED   |ASSIGNED

[Bug target/88497] Improve Accumulation in Auto-Vectorized Code

2019-03-04 Thread linkw at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

--- Comment #9 from Kewen Lin  ---
As Kelvin mentioned in the last comment, there is something we can teach reassoc
to improve the code below, although it is low priority.

double foo (double accumulator, vector double arg2[], vector double arg3[])
{
  vector double temp;

  temp = arg2[0] * arg3[0];
  accumulator += temp[0] + temp[1];
  temp = arg2[1] * arg3[1];
  accumulator += temp[0] + temp[1];
  temp = arg2[2] * arg3[2];
  accumulator += temp[0] + temp[1];
  temp = arg2[3] * arg3[3];
  accumulator += temp[0] + temp[1];
  return accumulator;
}

Confirmed.
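
One way to picture what reassoc could be taught here: each `temp[0] + temp[1]` is a horizontal reduction, so the four products can be summed lane-wise first, leaving a single horizontal add at the end. The sketch below is a hand-written illustration, not GCC output. It assumes the GCC/Clang `vector_size` extension in place of the PowerPC `vector double` type, and the names `v2df`, `foo_as_written` and `foo_reassociated` are made up for this example. Reordering floating-point additions like this is only permitted under -ffast-math style assumptions.

```c
/* GCC/Clang generic vector extension; stands in for "vector double".  */
typedef double v2df __attribute__ ((vector_size (16)));

/* The accumulation as written in the report: one horizontal add per
   product, each chained through the scalar accumulator.  */
double
foo_as_written (double accumulator, const v2df *arg2, const v2df *arg3)
{
  for (int i = 0; i < 4; i++)
    {
      v2df temp = arg2[i] * arg3[i];
      accumulator += temp[0] + temp[1];
    }
  return accumulator;
}

/* Hand-reassociated form: add the products lane-wise first, then do a
   single horizontal add.  Floating-point addition is not associative,
   so in general the result can differ in the last bits.  */
double
foo_reassociated (double accumulator, const v2df *arg2, const v2df *arg3)
{
  v2df sum = arg2[0] * arg3[0];
  for (int i = 1; i < 4; i++)
    sum += arg2[i] * arg3[i];
  return accumulator + (sum[0] + sum[1]);
}
```

The second form needs three vector adds and one horizontal add instead of four horizontal adds, which is the kind of saving the report is after.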

[Bug target/89411] RISC-V backend will generate wrong instruction for longlong type like lw a3,-2048(a5)

2019-03-04 Thread wangtao42 at huawei dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89411

Tao Wang  changed:

   What|Removed |Added

 CC||wangtao42 at huawei dot com

--- Comment #2 from Tao Wang  ---
While debugging this with gdb, I forced riscv_valid_lo_sum_p to return false,
and it does solve the problem. But inside riscv_valid_lo_sum_p, a lot of the
information needed to identify the problematic scenario has been lost. I think
it might work to use a global variable that is set in expand_XXX, but that does
not seem like a good approach. So is there another way to solve this? Or how
can I identify the problematic scenario?

Thanks

[Bug c/65403] -Wno-error= is an error

2019-03-04 Thread alexhenrie24 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65403

--- Comment #11 from Alex Henrie  ---
Created attachment 45889
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45889&action=edit
Proposed patches

I fixed up the patch from comment 4 and added a second patch with tests. Now
I'm just waiting to receive a copyright assignment form.

[Bug c++/68975] Request: Provide alternate keyword for decltype in C++03

2019-03-04 Thread eric at efcs dot ca

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68975

Eric Fiselier  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #2 from Eric Fiselier  ---
Apparently I don't need this as bad as I thought. And nobody else seems to be
asking for it. Closing.

[Bug c++/86641] Regression: non-ODR used auto class data members fail to deduce.

2019-03-04 Thread eric at efcs dot ca

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86641

Eric Fiselier  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #1 from Eric Fiselier  ---
I can't reproduce anymore. It must have been fixed.

[Bug rtl-optimization/89588] New: [8/9 Regression] ICE in unroll_loop_constant_iterations, at loop-unroll.c:498

2019-03-04 Thread asolokha at gmx dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89588

Bug ID: 89588
   Summary: [8/9 Regression] ICE in
unroll_loop_constant_iterations, at loop-unroll.c:498
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc-9.0.0-alpha20190303 snapshot (r269357) ICEs when compiling the following
testcase w/ -O1 -fno-tree-loop-optimize:

void
bb (void)
{
  int nv;

#pragma GCC unroll 2
  for (nv = 0; nv <= 2; nv += 2)
{
}
}

% gcc-9.0.0-alpha20190303 -O1 -fno-tree-loop-optimize -c hiz1wdrt.c
during RTL pass: loop2_unroll
hiz1wdrt.c: In function 'bb':
hiz1wdrt.c:10:1: internal compiler error: in unroll_loop_constant_iterations,
at loop-unroll.c:498
   10 | }
  | ^
0x65443d unroll_loop_constant_iterations
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190303/work/gcc-9-20190303/gcc/loop-unroll.c:498
0x65443d unroll_loops(int)
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190303/work/gcc-9-20190303/gcc/loop-unroll.c:295
0xbcbf0f execute
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190303/work/gcc-9-20190303/gcc/loop-init.c:584
0xbcbf0f execute
    /var/tmp/portage/sys-devel/gcc-9.0.0_alpha20190303/work/gcc-9-20190303/gcc/loop-init.c:571

Even though I cannot reproduce the ICE w/ gcc older than 8, it seems the firing
assert has been there forever. -fno-tree-loop-optimize breaks the designed-in
assumption on which this particular transformation is based.

[Bug c++/45065] -fvisibility-inlines-hidden: Decl order in derived class affects visibility of inlines in base.

2019-03-04 Thread egallager at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45065

--- Comment #4 from Eric Gallager  ---
(In reply to Eric Gallager from comment #3)
> (In reply to Dean Edmonds from comment #0)
> > Compiling with -fvisibility=hidden and -fvisibility-inlines-hidden.
> > 
> > I have a Base class with default visibility which contains two virtual
> > methods, one inlined and the other not. A Derived class with hidden
> > visibility overrides the non-inlined method and doesn't touch the inlined
> > one. If the declaration of the overridden method appears *before* the
> > Derived's virtual destructor then the object file for Derived weakly exports
> > the Base class's inlined method. But if the declaration appears *after*
> > Derived's virtual destructor then the object for Derived doesn't export the
> > Base class's inlined method at all.
> > 
> > Given that I'm compiling with -fvisibility-inlines-hidden I *think* that
> > means that the Base class's inlined method should never be exported. Even if
> > I'm wrong about that, surely it should not matter the order in which the
> > Derived class's methods are declared.
> > 
> > Here's an example which demonstrates the problem:
> > 
> > class __attribute__ ((visibility("default"))) Base
> > {
> > public:
> > Base();
> > virtual ~Base();
> > virtual void func()  const;
> > virtual void inlineFunc()   {}
> > };
> > 
> > class Derived : public Base
> > {
> > public:
> > Derived();
> > void func() const;
> > virtual ~Derived();
> > };
> > 
> > void Derived::func() const
> > {}
> > 
> > Compiled on OSX 10.6.4 with g++ 4.2.1, using the following command:
> > 
> >   g++-4.2 -Wall -c -arch x86_64 -fvisibility=hidden
> > -fvisibility-inlines-hidden -O3 -m64 -isysroot
> > /Developer/SDKs/MacOSX10.6.sdk -mmacosx-version-min=10.6 -o Derived.o
> > Derived.cpp
> > 
> > Looking at the object file using 'nm -m Derived.o | grep inlineFunc' gives:
> > 
> >   0010 (__TEXT,__textcoal_nt) weak private external
> > __ZN6Common10inlineFuncEv
> >   0098 (__TEXT,__eh_frame) weak private external
> > __ZN6Common10inlineFuncEv.eh
> > 
> > If I move the declaration of Derived::func() so that it comes after
> > ~Derived() then 'nm -m Derived.o | grep inlineFunc' returns nothing.
> > 
> > 
> 
> On 10.5 with gcc8, the grep only returns one line:
> 
> $ /usr/local/bin/g++ -Wall -c -arch x86_64 -fvisibility=hidden
> -fvisibility-inlines-hidden -O3 -m64 -isysroot
> /Developer/SDKs/MacOSX10.5.sdk -mmacosx-version-min=10.5 -Wextra -pedantic
> -o 45065.o 45065.cc
> $ nm -m 45065.o | grep inlineFunc
> 0010 (__TEXT,__textcoal_nt) weak private external
> __ZN4Base10inlineFuncEv
> 
> The difference is no version suffixed with '.eh' so I think it's a 10.5 to
> 10.6 difference.

huh, that's strange, I checked on 10.6 and I don't get the '.eh'-suffixed
version here either, so I guess it isn't a 10.5 vs. 10.6 difference after
all... what version cctools were you using? 

> 
> > I see similar behaviour on GNU/Linux (2.6.30.9-96.fc11.x86_64) using g++
> > 4.4.1. Compiling with this command:
> > 
> >   g++ -Wall -c -fvisibility=hidden -fvisibility-inlines-hidden -O3 -m64 -o
> > Derived.o Derived.cpp
> > 
> > and using 'objdump -t Derived.o | grep inlineFunc' to inspect the result
> > gives this when Derived::func() is declared before ~Derived():
> > 
> >    ld  .text._ZN4Base10inlineFuncEv   
> > .text._ZN4Base10inlineFuncEv
> >     wF .text._ZN4Base10inlineFuncEv   0002
> > .hidden _ZN4Base10inlineFuncEv
> > 
> > and gives nothing when Derived::func() is declared after ~Derived().
> 
> (In reply to Paolo Carlini from comment #2)
> > I can confirm the behavior with today's mainline. And seems weird indeed.
> 
> Changing status to NEW then since it's confirmed.

[Bug target/66203] aarch64-none-elf does not automatically find librdimon

2019-03-04 Thread egallager at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66203

--- Comment #5 from Eric Gallager  ---
(In reply to Richard Earnshaw from comment #4)
> The Arm builds that do not need anything from libgloss (and thus do not need
> a specs file) while linking come from a configuration that hard codes the
> underlying runtime monitor (usually the arm semihosting ABI) directly into
> newlib.
> 
> I understand that's deprecated and was not implemented for AArch64.

So does this bug need to stay open then?

[Bug driver/89587] New: gcc's rs6000 configuration unconditionally sets MULTIARCH_DIRNAME, even when multiarch is disabled

2019-03-04 Thread awilfox at adelielinux dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89587

Bug ID: 89587
   Summary: gcc's rs6000 configuration unconditionally sets
MULTIARCH_DIRNAME, even when multiarch is disabled
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: awilfox at adelielinux dot org
  Target Milestone: ---
  Host: powerpc-linux-*
Target: powerpc-linux-*

A simple way to reproduce this is to build GCC to target powerpc-*-linux-*;
powerpc-unknown-linux-gnu and powerpc-foxkit-linux-musl both exhibit this
behaviour.

The config/rs6000/t-linux file unconditionally sets MULTIARCH_DIRNAME, even
when ./configure is passed --disable-multiarch.  This results in, for instance:

$ gcc -print-multiarch
powerpc-linux-musl

No other architecture's config file does this, other than powerpcspe, which is
already deprecated.

[Bug tree-optimization/89566] [9 Regression] ICE on compilable C++ code: in gimple_call_arg, at gimple.h:3166

2019-03-04 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89566

Martin Sebor  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code
 CC||msebor at gcc dot gnu.org

--- Comment #3 from Martin Sebor  ---
This is a recurring problem.  I wonder how many more bugs like this there are. 
See for example below.  There ought to be a better way to prevent it than to
keep patching all the places in the middle end where the bad calls end up.

$ cat t.c && gcc -O2 -S -Wall t.c
void f (void)
{
  ((void (*)()) __builtin_free) ();
}
during RTL pass: expand
t.c: In function ‘f’:
t.c:3:4: internal compiler error: tree check: accessed operand 4 of call_expr
with 3 operands in maybe_emit_free_warning, at builtins.c:10608
3 |   ((void (*)()) __builtin_free) ();
  |   ~^~~
0x15b045d tree_operand_check_failed(int, tree_node const*, char const*, int,
char const*)
/src/gcc/svn/gcc/tree.c:10059
0x827e8c tree_operand_check(tree_node*, int, char const*, int, char const*)
/src/gcc/svn/gcc/tree.h:3676
0x9d4bdb maybe_emit_free_warning
/src/gcc/svn/gcc/builtins.c:10608
0x9cc317 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
/src/gcc/svn/gcc/builtins.c:8305
0xbf115c expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/src/gcc/svn/gcc/expr.c:11029
0xbe33e3 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
/src/gcc/svn/gcc/expr.c:8274
0xa06252 expand_expr
/src/gcc/svn/gcc/expr.h:279
0xa0f581 expand_call_stmt
/src/gcc/svn/gcc/cfgexpand.c:2724
0xa12f29 expand_gimple_stmt_1
/src/gcc/svn/gcc/cfgexpand.c:3691
0xa135e4 expand_gimple_stmt
/src/gcc/svn/gcc/cfgexpand.c:3850
0xa136fc expand_gimple_tailcall
/src/gcc/svn/gcc/cfgexpand.c:3897
0xa1bc7c expand_gimple_basic_block
/src/gcc/svn/gcc/cfgexpand.c:5863
0xa1da9a execute
/src/gcc/svn/gcc/cfgexpand.c:6509
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug middle-end/61112] Simple example triggers false-positive -Wmaybe-uninitialized warning

2019-03-04 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61112

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
 CC||msebor at gcc dot gnu.org

--- Comment #6 from Martin Sebor  ---
The test case in comment #0 was fixed by r227942:

[PATCH] Fix 47679 by improving jump threading

PR tree-optimization/47679
* tree-ssa-dom.c (record_temporary_equivalences): No longer static.
* tree-ssa-dom.h (record_temporary_equivalences): Add prototype.
* tree-ssa-threadedge.c: Include tree-ssa-dom.h.
(thread_through_normal_block): Use record_temporary_equivalences.

PR tree-optimization/47679
* g++.dg/warn/Wuninitialized-6.C: New test.

The test case in comment #5 still results in -Wmaybe-uninitialized with GCC 9. 
As far as bisection shows, it has always been diagnosed.

[Bug target/68211] Free __m128d subreg of double

2019-03-04 Thread glisse at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68211

--- Comment #8 from Marc Glisse  ---
(In reply to Steven Bosscher from comment #7)
>   __m128d y = { x, 0 };
>   return _mm_cvtsd_f64(_mm_sqrt_round_sd(y, y,
> _MM_FROUND_TO_POS_INF|_MM_FROUND_NO_EXC));

I don't necessarily advocate for optimizing out an existing explicit mov. Maybe
I should, but there could be cases where mov makes the code faster, and I
haven't experimented enough. I am only asking for a way to explicitly skip it
if I believe I know what I am doing. Of course, if we start optimizing out the
mov, it makes my request useless.

[Bug middle-end/89501] Odd lack of warning about missing initialization

2019-03-04 Thread ncm at cantrip dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501

--- Comment #13 from ncm at cantrip dot org ---
What I am getting is that the compiler is leaving that permitted optimization
-- eliminating the inode check -- on the table. It is doing that not so much
because it would make Linus angry, but as an artifact of the particular
optimization processes used in Gcc at the moment. Clang, or some later release
of Gcc or Clang, or even this Gcc under different circumstances, might choose
differently.

But maybe there are some flavors of UB, among which returning uninitialized
variables might be the poster child, that you don't ever want to use to drive
some kinds of optimizations. Maybe Gcc's process has that baked in.

[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread harald at gigawatt dot nl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

--- Comment #5 from Harald van Dijk  ---
According to , the
GCC 8 backport was not supposed to break existing code, it was supposed to warn
about code that would become invalid in GCC 9. It seems like this case was
missed.

[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread harald at gigawatt dot nl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

Harald van Dijk  changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|DUPLICATE   |---

--- Comment #4 from Harald van Dijk  ---
Re-opening. Marking this as invalid for GCC 9 is fine. Breaking existing code
without so much as a warning in a minor release is not.

[Bug c++/89579] -Wclobbered warning false positive when compiling with -Og

2019-03-04 Thread egallager at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89579

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org
 Blocks||82738

--- Comment #3 from Eric Gallager  ---
There's a bunch of other -Wclobbered bugs; is this related to any of them?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82738
[Bug 82738] [meta-bug] issues with the -Og optimization level

[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread egallager at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||egallager at gcc dot gnu.org
 Resolution|--- |DUPLICATE

--- Comment #3 from Eric Gallager  ---
dup of bug 89443

*** This bug has been marked as a duplicate of bug 89443 ***

[Bug c++/89443] toplevel inline-asm with volatile after the asm is not anymore support in C++

2019-03-04 Thread egallager at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89443

Eric Gallager  changed:

   What|Removed |Added

 CC||harald at gigawatt dot nl

--- Comment #4 from Eric Gallager  ---
*** Bug 89585 has been marked as a duplicate of this bug. ***

[Bug c++/84605] [7/8 Regression] internal compiler error: in xref_basetypes, at cp/decl.c:13818

2019-03-04 Thread paolo.carlini at oracle dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84605

Paolo Carlini  changed:

   What|Removed |Added

Summary|[7/8/9 Regression] internal |[7/8 Regression] internal
   |compiler error: in  |compiler error: in
   |xref_basetypes, at  |xref_basetypes, at
   |cp/decl.c:13818 |cp/decl.c:13818

--- Comment #4 from Paolo Carlini  ---
Fixed in trunk so far.

[Bug c++/84605] [7/8/9 Regression] internal compiler error: in xref_basetypes, at cp/decl.c:13818

2019-03-04 Thread paolo at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84605

--- Comment #3 from paolo at gcc dot gnu.org  ---
Author: paolo
Date: Mon Mar  4 23:49:23 2019
New Revision: 269378

URL: https://gcc.gnu.org/viewcvs?rev=269378&root=gcc&view=rev
Log:
/cp
2019-03-04  Paolo Carlini  

PR c++/84605
* parser.c (cp_parser_class_head): Reject TYPE_BEING_DEFINED too.

/testsuite
2019-03-04  Paolo Carlini  

PR c++/84605
* g++.dg/parse/crash69.C: New.

Added:
trunk/gcc/testsuite/g++.dg/parse/crash69.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/parser.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/89501] Odd lack of warning about missing initialization

2019-03-04 Thread torva...@linux-foundation.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501

--- Comment #12 from Linus Torvalds  ---
(In reply to Jeffrey A. Law from comment #11)
> 
> More generally we  have considered whether or not we could eliminate the
> control dependent path which leads to undefined behavior.  But you have to
> be careful  because statements on the path prior to the actual undefined
> behavior may have observable side effects.

Note that for the kernel, we consider those kinds of "optimizations" completely
and utterly wrong-headed, and will absolutely refuse to use a compiler that
does things like that.

It's basically the compiler saying "I don't care what you meant, I can do
anything I want, and that means I will screw the code up on purpose".

I will personally switch the kernel immediately to clang the moment we cannot
turn off idiotic broken behavior like that.

We currently already disable 'strict-aliasing', 'strict-overflow' and
'delete-null-pointer-checks'. Because compilers that intentionally create
garbage code are evil and wrong.

Compiler standards bodies that add even more of them (the C aliasing rules
being the prime example) should be ashamed of themselves.

And compiler people that think it's ok to say "it's undefined, so we do
anything we damn well like" shouldn't be working with compilers IMNSHO.

If you know something is undefined,  you warn about it. You don't silently
generate random code that doesn't match the source code just because some paper
standard says you "can".

[Bug middle-end/89501] Odd lack of warning about missing initialization

2019-03-04 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501

--- Comment #11 from Jeffrey A. Law  ---
WRT c#9.  Linus is right.  The condition is dynamic and we don't want to remove
it in this circumstance.

More generally we have considered whether or not we could eliminate the
control dependent path which leads to undefined behavior.  But you have to be
careful because statements on the path prior to the actual undefined behavior
may have observable side effects.

So what we do in those cases is isolate the path  via block copying and CFG
manipulations (usually it's just a subset of paths to a particular statement
which result in undefined behavior).  Once the path is isolated we can replace
the statement which triggers the undefined behavior with a trap.  That in turn
allows some CFG simplifications, const/copy propagation and usually further
dead code elimination.

Right now we do this just for NULLs that are dereferenced and division by zero.
 But I can see doing it for out of bounds array indexing, bogus mem* calls and
perhaps other cases of undefined behavior.  ie, if you have something like

char foo[10];

if (shouldn't happen)
  indx = -1;
else
  indx = whatever ();
something = foo[indx];
... more work ...

This kind of idiom occurs fairly often when -1 is used to represent a can't
happen case and you've got abstraction layers that get eliminated via inlining.
 Path isolation would turn that into something like this:

if (shouldn't happen) {
  indx = -1;
  trap;
} else {
  indx = whatever ();
  something = foo[indx];
}
... more work ...

Which will simplify in various useful ways.
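
To make the transformation concrete, here is a compilable C sketch of the array-indexing case above, written out by hand. It is not what GCC emits. `whatever`, `lookup_original` and `lookup_isolated` are hypothetical names, the table contents are arbitrary, and `__builtin_trap ()` (a GCC/Clang builtin) stands in for the trap that path isolation would insert.

```c
static char foo[10] = { 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 };

/* Hypothetical stand-in for "whatever ()": always yields a valid index.  */
static unsigned int
whatever (unsigned int key)
{
  return key % 10u;
}

/* Before: the "shouldn't happen" path flows into the array access with
   indx == -1, which is undefined behavior if it is ever taken.  */
int
lookup_original (int shouldnt_happen, unsigned int key)
{
  int indx;
  if (shouldnt_happen)
    indx = -1;
  else
    indx = (int) whatever (key);
  return foo[indx];
}

/* After: the undefined path is isolated and ends in a trap, so the
   remaining access is only reached with a valid index.  */
int
lookup_isolated (int shouldnt_happen, unsigned int key)
{
  if (shouldnt_happen)
    __builtin_trap ();
  return foo[whatever (key)];
}
```

On the well-defined path both functions return the same value; they differ only in that the isolated form stops deterministically where the original would read out of bounds.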



--

If we were to apply those concepts to this example we'd conceptually turn the
code into something like this:

if (somecondition) {
ret = functioncall(x);
if (ret)
  return ret;
}
... some more work ...
trap

[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread harald at gigawatt dot nl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

--- Comment #2 from Harald van Dijk  ---
This was intentionally(!) broken in a backport,
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=267534, which
specifically adds a test that this results not even in a user-friendly error
message, but a bad syntax error, even though it had previously been accepted.

[Bug libobjc/89586] warning: cast between incompatible function types when building libobjc

2019-03-04 Thread ubizjak at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89586

Uroš Bizjak  changed:

   What|Removed |Added

 CC||bernd.edlinger at hotmail dot de
Version|unknown |9.0

--- Comment #1 from Uroš Bizjak  ---
CC author of the warning patch.

[Bug libobjc/89586] New: warning: cast between incompatible function types when building libobjc

2019-03-04 Thread ubizjak at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89586

Bug ID: 89586
   Summary: warning: cast between incompatible function types when
building libobjc
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libobjc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ubizjak at gmail dot com
  Target Milestone: ---

This warning has been emitted during gcc bootstrap ever since the warning for
invalid function casts was introduced:

/space/homedirs/uros/gcc-svn/trunk/libobjc/sendmsg.c: In function
‘__objc_get_forward_imp’:
/space/homedirs/uros/gcc-svn/trunk/libobjc/sendmsg.c:129:16: warning: cast
between incompatible function types from ‘__big (*)(struct objc_object *, const
struct objc_selector *, ...)’ {aka ‘struct  (*)(struct objc_object
*, const struct objc_selector *, ...)’} to ‘struct objc_object * (*)(struct
objc_object *, const struct objc_selector *, ...)’ [-Wcast-function-type]
  129 | return (IMP)__objc_block_forward;
  |^
/space/homedirs/uros/gcc-svn/trunk/libobjc/sendmsg.c:131:16: warning: cast
between incompatible function types from ‘double (*)(struct objc_object *,
const struct objc_selector *, ...)’ to ‘struct objc_object * (*)(struct
objc_object *, const struct objc_selector *, ...)’ [-Wcast-function-type]
  131 | return (IMP)__objc_double_forward;
  |^


We have the following snippets in libobjc/sendmsg.c:

--cut here--
/* Various forwarding functions that are used based upon the
   return type for the selector.
   __objc_block_forward for structures.
   __objc_double_forward for floats/doubles.
   __objc_word_forward for pointers or types that fit in registers.  */
static double __objc_double_forward (id, SEL, ...);
static id __objc_word_forward (id, SEL, ...);
typedef struct { id many[8]; } __big;
#if INVISIBLE_STRUCT_RETURN 
static __big 
#else
static id
#endif
__objc_block_forward (id, SEL, ...);

...

/* Given a selector, return the proper forwarding implementation.  */
inline
IMP
__objc_get_forward_imp (id rcv, SEL sel)
{
  ...
return (IMP)__objc_block_forward;
  else if (t && (*t == 'f' || *t == 'd'))
return (IMP)__objc_double_forward;
  else
return (IMP)__objc_word_forward;
}
}
--cut here--

with IMP defined in libobjc/objc/objc.h:

--cut here--
/* An 'id' is an object of an unknown class.  The way the object data
   is stored inside the object is private and what you see here is
   only the beginning of the actual struct.  The first field is always
   a pointer to the Class that the object belongs to.  */
typedef struct objc_object
{
  /* 'class_pointer' is the Class that the object belongs to.  In case
 of a Class object, this pointer points to the meta class.

 Compatibility Note: The Apple/NeXT runtime calls this field
 'isa'.  To access this field, use object_getClass() from
 runtime.h, which is an inline function so does not add any
 overhead and is also portable to other runtimes.  */
  Class class_pointer;
} *id;

/* 'IMP' is a C function that implements a method.  When retrieving
   the implementation of a method from the runtime, this is the type
   of the pointer returned.  The idea of the definition of IMP is to
   represent a 'pointer to a general function taking an id, a SEL,
   followed by other unspecified arguments'.  You must always cast an
   IMP to a pointer to a function taking the appropriate, specific
   types for that function, before calling it - to make sure the
   appropriate arguments are passed to it.  The code generated by the
   compiler to perform method calls automatically does this cast
   inside method calls.  */
typedef id (*IMP)(id, SEL, ...); 
--cut here--
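
One possible way to quiet such a warning, offered here only as a sketch and not as the fix adopted for libobjc: GCC documents that the function type `void (*) (void)` is special for -Wcast-function-type and matches everything, so casting through it suppresses the diagnostic. The types and names below (`obj_t`, `imp_fn`, `double_forward`, `get_double_forward`, `call_double_imp`) are simplified, hypothetical stand-ins for `id`, `IMP` and `__objc_double_forward`.

```c
#include <stddef.h>

/* Simplified stand-ins for libobjc's id and IMP.  */
typedef struct obj_s { int tag; } *obj_t;
typedef obj_t (*imp_fn) (obj_t, int, ...);

/* Returns double, so its type is incompatible with imp_fn.  */
static double
double_forward (obj_t self, int sel, ...)
{
  (void) self;
  return sel * 0.5;
}

/* Casting through void (*) (void) avoids -Wcast-function-type.  */
imp_fn
get_double_forward (void)
{
  return (imp_fn) (void (*) (void)) double_forward;
}

/* As the IMP comment requires, cast back to the real type before
   calling; calling through the mismatched imp_fn type would be
   undefined behavior.  */
double
call_double_imp (imp_fn imp, obj_t self, int sel)
{
  double (*fn) (obj_t, int, ...)
    = (double (*) (obj_t, int, ...)) (void (*) (void)) imp;
  return fn (self, sel);
}
```

The intermediate cast changes nothing at run time; it only tells the compiler that the conversion is intentional.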

[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread harald at gigawatt dot nl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

--- Comment #1 from Harald van Dijk  ---
(Sorry, to be clear, the comment about -Wasm-ignored-qualifier is about a
warning clang emits for this construct, not a GCC warning.)

[Bug c/89585] New: GCC 8.3: asm volatile no longer accepted at file scope

2019-03-04 Thread harald at gigawatt dot nl

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

Bug ID: 89585
   Summary: GCC 8.3: asm volatile no longer accepted at file scope
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: harald at gigawatt dot nl
  Target Milestone: ---

From GCC 3.4 to 8.2, GCC would accept

  asm volatile("");

at file scope in C++ mode, though not in C mode. This no longer works in GCC
8.3, in either mode. It now results in

test.cc:1:5: error: expected ‘(’ before ‘volatile’
 asm volatile("");
 ^~~~
 (
test.cc:1:14: error: expected unqualified-id before string constant
 asm volatile("");
  ^~
test.cc:1:14: error: expected ‘)’ before string constant
 asm volatile("");
 ~^~
  )

This may have been useless (as covered by -Wasm-ignored-qualifier), but making
it a hard error breaks the compilation of existing projects after upgrading
from GCC 8.2 to 8.3.
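
For projects hit by this, the usual source-level fix is to drop the qualifier: GCC's documentation states that basic asm statements outside functions may not use any qualifiers. A minimal sketch, using the `__asm__` spelling so that it also compiles in strict ISO C modes; `answer` is just a placeholder function so the file has something to call:

```c
/* File-scope basic asm with no qualifier: accepted by GCC in both C
   and C++ mode.  The empty string emits no instructions.  */
__asm__ ("");

/* Inside a function, the volatile qualifier remains valid.  */
int
answer (void)
{
  __asm__ volatile ("");
  return 42;
}
```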

[Bug c++/89538] [7.3.0] GCC miscompiling LLVM because of wrong vectorization

2019-03-04 Thread twoh at fb dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89538

--- Comment #5 from Taewook Oh  ---
The name of the function is "void llvm::SmallVectorTemplateBase
>::grow(size_t) [with T = std::pair, const
llvm::DICompositeType*>; bool  = false]".

I tried with GCC 7.4.0, and it seems that GCC 7.4.0 doesn't attempt to vectorize
the problematic code.

Thank you for taking a look!

[Bug ipa/89584] New: CPU2000 degradations with r268448 (172.mgrid -22%, 252.eon -8%)

2019-03-04 Thread pthaugen at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89584

Bug ID: 89584
   Summary: CPU2000 degradations with r268448 (172.mgrid -22%,
252.eon -8%)
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pthaugen at linux dot ibm.com
CC: dje at gcc dot gnu.org, hubicka at gcc dot gnu.org,
marxin at gcc dot gnu.org, rguenth at gcc dot gnu.org,
segher at gcc dot gnu.org, wschmidt at gcc dot gnu.org
  Target Milestone: ---
  Host: powerpc64-unknown-linux-gnu
Target: powerpc64-unknown-linux-gnu
 Build: powerpc64-unknown-linux-gnu

Revision 268448 introduced the noted degradations. Compile flags are -m64 -O3
-mcpu=power7 -fpeel-loops -funroll-loops -ffast-math -mpopcntd -mrecip=all.

I dug into the mgrid degradation further to get some more detail. The main
difference appears to be that the last call to RESID() in the main function is
now inlined. RESID() is actually cloned, and this call is to the clone,
resid_.constprop.0. I'm not sure if this is another instance of losing RESTRICT
on the parameters as seen in prior PRs (54497/55334 and 84737), or just that
inlining that specific call into an inner loop now creates too much register
pressure and we spill too much (I suspect the latter). Following is a simple
static instruction count comparison of the vectorized loop from
resid_.constprop.0() and the same loop after inlining; note the obvious
increase in load/store insns.

Old = constprop.s
New = constprop_inline.s
INSTR        Old  New  Change
----------   ---  ---  ------
addi           1   29      28
bc             1    1       0
cmpl           1    1       0
ld             0   17      17
lxvd2x        19   33      14
ori            0    5       5
stxvd2x        1   15      14
xvadddp       17   17       0
xvnmsubadp     1    1       0
xvnmsubmdp     3    3       0
xxlor          3    2      -1
----------   ---  ---  ------
load          19   50      31
store          1   15      14
total         47  124      77

[Bug middle-end/89544] Argument marshalling incorrectly assumes stack slots are naturally aligned.

2019-03-04 Thread bernd.edlinger at hotmail dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89544

--- Comment #5 from Bernd Edlinger  ---
Created attachment 45888
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45888&action=edit
untested patch

This is an update of my previous patch. It avoids the unaligned mem:DI
in about the same way that a TREE_ADDRESSABLE decl avoids the invalid code.

Creates good code in the first test case, but a lot worse code (compared to the
previous patch version at least) in the second test case.

So I don't see invalid RTL with this patch.

Thoughts?

[Bug fortran/71203] ICE in add_init_expr_to_sym, at fortran/decl.c:1512 and :1564

2019-03-04 Thread anlauf at gmx dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71203

--- Comment #8 from Harald Anlauf  ---
The following obvious patch fixes the character-related issues
(z1,z2,z3,z3a,z3b):

Index: expr.c
===
--- expr.c  (revision 269357)
+++ expr.c  (working copy)
@@ -1897,8 +1897,14 @@
string_len = 0;

  if (!p->ts.u.cl)
-   p->ts.u.cl = gfc_new_charlen (p->symtree->n.sym->ns,
- NULL);
+   {
+ if (p->symtree)
+   p->ts.u.cl = gfc_new_charlen (p->symtree->n.sym->ns,
+ NULL);
+ else
+   p->ts.u.cl = gfc_new_charlen (gfc_current_ns,
+ NULL);
+   }
  else
gfc_free_expr (p->ts.u.cl->length);

However, due to my limited understanding of namespace handling,
this might be an improper solution.

The cases z4,z5 are different issues.

[Bug libstdc++/88996] Implement P0439R0 - Make std::memory_order a scoped enumeration.

2019-03-04 Thread emsr at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88996

--- Comment #7 from emsr at gcc dot gnu.org ---
Author: emsr
Date: Mon Mar  4 20:11:14 2019
New Revision: 269372

URL: https://gcc.gnu.org/viewcvs?rev=269372&root=gcc&view=rev
Log:
2019-03-04  Edward Smith-Rowland  <3dw...@verizon.net>

PR libstdc++/88996 Implement P0439R0
Make std::memory_order a scoped enumeration.
* include/bits/atomic_base.h: For C++20 make memory_order a scoped enum,
add variables for the old enumerators.  Adjust calls.
* testsuite/29_atomics/headers/atomic/types_std_c++2a.cc: New test.
* testsuite/29_atomics/headers/atomic/types_std_c++2a_neg.cc: New test.


Added:
trunk/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++2a.cc
trunk/libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++2a_neg.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/atomic_base.h

[Bug middle-end/89501] Odd lack of warning about missing initialization

2019-03-04 Thread torva...@linux-foundation.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501

--- Comment #10 from Linus Torvalds  ---
(In reply to ncm from comment #9)
> What I don't understand is why it doesn't optimize away the check on
> (somecondition), since it is assuming the code in the dependent block always
> runs.

No, it very much doesn't assume that. The 'somecondition' really is dynamic.

What happens is simply that because gcc sees only a single assignment to the
variable, and that assignment is then limited by the subsequent value test to a
single value, gcc will just say "ok, any other place where that variable is
used, just use the known single value".

And it does that whether the 'if (somecondition)' path was taken or not.

It's a perfectly valid optimization. In fact, it's a lovely optimization, I'm
not at all complaining about the code generation.

It's just that as part of that (quite reasonable) optimization it also
optimized away the whole "oops, there wasn't really a valid initialization in
case the if-statement wasn't done".

Obviously that's undefined behavior, and the optimization is valid regardless,
but the lack of warning means that we didn't see that we had technically
undefined behavior that the compiler has silently just fixed up for us.

I think the cause of this all is quite reasonable and understandable, and I
also see why gcc really does want to throw away the undefined case entirely
(because otherwise you can get into the reverse situation where you warn
unnecessarily, because gcc isn't smart enough to see that some undefined case
will never ever actually happen). 

Plus I assume it simplifies things a lot to just not even have to track the
undefined case at all. You can just track "ok, down this path we have a known
value for this SSA, and we don't need to keep track of any inconvenient phi
nodes etc".
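
For readers following along, the pattern described above can be sketched
roughly as follows. This is a hypothetical reduction, not the test case from
the report; the names maybe_init, somecondition and value are made up for
illustration.

```cpp
// Hypothetical reduction of the pattern discussed above (not the PR's test case).
// 'ret' has a single assignment, and the value test right after it leaves
// exactly one value (0) that can flow to the final return.
int maybe_init(bool somecondition, int value)
{
    int ret;                 // deliberately left uninitialized

    if (somecondition) {
        ret = value;         // the only assignment to 'ret'
        if (ret != 0)
            return -1;       // any non-zero value leaves the function here
        // falling through, 'ret' is known to be 0
    }

    // Reading 'ret' is undefined behavior when somecondition was false.
    // A compiler is allowed to treat 'ret' as 0 on every path that reaches
    // this point, and may then have nothing left to warn about.
    return ret;
}
```

Only calls with somecondition true are well defined here; the call with it
false is the case whose missing warning this report is about.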

[Bug c++/71446] Incorrect overload resolution when using designated initializers

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71446

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #20 from Jakub Jelinek  ---
Should be fixed for GCC9 and later.

[Bug middle-end/89501] Odd lack of warning about missing initialization

2019-03-04 Thread ncm at cantrip dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89501

ncm at cantrip dot org changed:

   What|Removed |Added

 CC||ncm at cantrip dot org

--- Comment #9 from ncm at cantrip dot org ---
What I don't understand is why it doesn't optimize away the check on
(somecondition), since it is assuming the code in the dependent block always
runs.

[Bug c++/71446] Incorrect overload resolution when using designated initializers

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71446

--- Comment #19 from Jakub Jelinek  ---
Author: jakub
Date: Mon Mar  4 18:57:13 2019
New Revision: 269371

URL: https://gcc.gnu.org/viewcvs?rev=269371&root=gcc&view=rev
Log:
PR c++/71446
* call.c (field_in_pset): New function.
(build_aggr_conv): Handle CONSTRUCTOR_IS_DESIGNATED_INIT correctly.

* g++.dg/cpp2a/desig12.C: New test.
* g++.dg/cpp2a/desig13.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp2a/desig12.C
trunk/gcc/testsuite/g++.dg/cpp2a/desig13.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/call.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/89561] feature request: undefined behaviour compile-time configuration

2019-03-04 Thread bugsthecode at mail dot ru

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89561

--- Comment #8 from bugsthecode at mail dot ru ---
(In reply to Martin Sebor from comment #7)
> It's not possible to detect all instances of undefined behavior and emit
> some "reasonable" or "safe" code (whatever that might mean in each
> instance), certainly not without compromising efficiency.  Timing a program
> compiled with the -fsanitize= options shows just how much of an impact even
> a subset of such detection has.
> 

Yes, it's a hard task to detect all undefined behaviour. It's a less hard task
to generate sane code.

And the main impact of -fsanitize, as far as I understand, is that it's a
runtime check. Of course runtime checks can cost a lot of performance. Try
running something under valgrind: that's the performance hit of runtime
debugging. But the issue lies in compile-time actions, since invalid code is
generated at compile time. There is a choice between fast compilation emitting
miscompiled stuff and proper compilation that is maybe a bit slower. Does the
speed of code generation matter? Or should quality matter? The same questions
apply to runtime: should it run properly or just crash very fast?

> On the other hand, it certainly is possible to provide options to control
> what sort of code GCC should emit in addition to giving a warning when it
> does detect such undefined behavior.  In response to pr89218, I don't think
> it's unreasonable to ask for an option to make GCC emit the same code as if
> the function returned zero (since GCC issues a warning, what the default
> setting of the option should be can be debated).  GCC does that in other
> contexts.  For example, in:
> 
>   const char* const a[] = { "1", "12", "123" };
>   const char* f (void) { return a[99]; }
> 
> GCC replaces the argument of the return statement with zero (unfortunately
> without a warning).  Or in 
> 
>   void *f (void) { int i; return &i; }
> 
> GCC has f() return null rather than a dangling pointer, in addition to
> issuing a warning.
> 
> At the same time, in
> 
>   const char a[][4] = { "1", "12", "123" };
>   const char* f (void) { return a[99]; }
> 
> GCC emits code returning an invalid address (but it does issue a warning).
> 

And what prohibits emitting proper code in those cases? A dangling pointer,
you say? Ok, I'll take it. It may be an unused dangling pointer. Still much
better than code executing arbitrary data located after the body of the
function, or even dropping the actual body of the function like in bug 87515.

And how does returning nullptr instead of a dangling pointer make generated
code faster? Unless the goal is to drop as much code as possible at all costs.
In that case I may propose to just generate any application as if it were just
'void main() {}'. That'd be the fastest code ever.

> The trouble here, from my point of view, is more than just the lack of
> consistency, but the lack of consensus on how to respond to such instances,
> or if an effort should even be made to deal with these cases.

Well, it did take some effort to break the compiler in these cases first (and
probably in other cases as well), so of course it'd take some effort to fix the
broken stuff now.

[Bug c++/89550] [8/9 Regression] Spurious array-bounds warning when using __PRETTY_FUNCTION__ as a string_view

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89550

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
find_last_off returns npos (i.e. size_t(-1)) if the character is not found, and
with that value remove_prefix is invalid.  You get this warning whenever the
compiler neither decides to unroll the loop nor proves through some other
optimization that the npos value will certainly not be returned.
The number of n characters determines the length of the __PRETTY_FUNCTION__
string and thus also affects the decision if it should completely unroll the
loop or not.
Unfortunately, our current SCEV final value replacement isn't able to handle
this, e.g. because the loop has multiple exits.  Would be nice if we could try
constexpr-like evaluation (JITing) loops and see if we can actually evaluate
their final value even with multiple exits etc.
As a workaround, perhaps use
  auto l = s.find_last_of(' ');
  if (l != std::string::npos)
    s.remove_prefix(l);
?

[Bug c++/89561] feature request: undefined behaviour compile-time configuration

2019-03-04 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89561

--- Comment #7 from Martin Sebor  ---
It's not possible to detect all instances of undefined behavior and emit some
"reasonable" or "safe" code (whatever that might mean in each instance),
certainly not without compromising efficiency.  Timing a program compiled with
the -fsanitize= options shows just how much of an impact even a subset of such
detection has.

On the other hand, it certainly is possible to provide options to control what
sort of code GCC should emit in addition to giving a warning when it does
detect such undefined behavior.  In response to pr89218, I don't think it's
unreasonable to ask for an option to make GCC emit the same code as if the
function returned zero (since GCC issues a warning, what the default setting of
the option should be can be debated).  GCC does that in other contexts.  For
example, in:

  const char* const a[] = { "1", "12", "123" };
  const char* f (void) { return a[99]; }

GCC replaces the argument of the return statement with zero (unfortunately
without a warning).  Or in 

  void *f (void) { int i; return &i; }

GCC has f() return null rather than a dangling pointer, in addition to issuing
a warning.

At the same time, in

  const char a[][4] = { "1", "12", "123" };
  const char* f (void) { return a[99]; }

GCC emits code returning an invalid address (but it does issue a warning).

The trouble here, from my point of view, is more than just the lack of
consistency, but the lack of consensus on how to respond to such instances, or
if an effort should even be made to deal with these cases.

[Bug libstdc++/86655] std::assoc_legendre should not constrain the value of m

2019-03-04 Thread emsr at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86655

--- Comment #6 from emsr at gcc dot gnu.org ---
Also, the legendre functions should not be constrained on the argument x either.
They are just polynomials.  The recursions are numerically good in this range
(|x| > 1) also.
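
As a rough illustration of the point about the recursion, here is a sketch of
the standard three-term recurrence for the plain Legendre polynomials,
(k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x). It is not the libstdc++
implementation, and the name legendre_p is made up for the example; it simply
places no restriction on x.

```cpp
// Evaluate the Legendre polynomial P_n(x) with the three-term recurrence
//   (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)
// No range check on x: the polynomial is defined for every real x.
double legendre_p(unsigned n, double x)
{
    double prev = 1.0;          // P_0(x)
    if (n == 0)
        return prev;

    double cur = x;             // P_1(x)
    for (unsigned k = 1; k < n; ++k) {
        double next = ((2.0 * k + 1.0) * x * cur - k * prev) / (k + 1.0);
        prev = cur;
        cur = next;
    }
    return cur;
}
```

For example, at x = 2 this gives P_2 = (3*4 - 1)/2 = 5.5 and P_3 = 17.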

[Bug ipa/89139] GCC emits code for static functions that aren't used by the optimized code

2019-03-04 Thread ppalka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89139

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #7 from Patrick Palka  ---
Here's a simple C++ example using std::function that illustrates the same
thing.  The call to h below is optimized away, but two now-unused static member
functions _M_invoke and _M_manager remain in the optimized code.


#include <functional>

int
m ()
{
  std::function<int (int)> h = [] (int i) { return i - 1; };
  return h (5);
}


-fdump-tree-optimized:
;; Function std::_Function_handler<int(int), m()::<lambda(int)> >::_M_invoke
(_ZNSt17_Function_handlerIFiiEZ1mvEUliE_E9_M_invokeERKSt9_Any_dataOi,
funcdef_no=1738, decl_uid=33290, cgraph_uid=651, symbol_order=654)

std::_Function_handler<int(int), m()::<lambda(int)> >::_M_invoke (const union
_Any_data & {ref-all} __functor, int & __args#0)
{
  ...
}
;; Function std::_Function_base::_Base_manager<m()::<lambda(int)> >::_M_manager
(_ZNSt14_Function_base13_Base_managerIZ1mvEUliE_E10_M_managerERSt9_Any_dataRKS3_St18_Manager_operation,
funcdef_no=1739, decl_uid=33254, cgraph_uid=652, symbol_order=655)

Removing basic block 7
Removing basic block 9
std::_Function_base::_Base_manager<m()::<lambda(int)> >::_M_manager (union
_Any_data & {ref-all} __dest, const union _Any_data & {ref-all} __source,
_Manager_operation __op)
{
  ...
}

;; Function m (_Z1mv, funcdef_no=1392, decl_uid=31150, cgraph_uid=311,
symbol_order=314)

m ()
{
  <bb 2> [local count: 1073741824]:
  return 4;

}

[Bug debug/89498] [8/9 Regression] ICE in AT_loc_list, at dwarf2out.c:4871

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89498

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-03-04
 CC||aoliva at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #4 from Jakub Jelinek  ---
CCing Alex as this is his location views code for feedback.

[Bug tree-optimization/21982] GCC should combine adjacent stdio calls

2019-03-04 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982

Martin Sebor  changed:

   What|Removed |Added

 CC||msebor at gcc dot gnu.org

--- Comment #39 from Martin Sebor  ---
I don't know what happened to the patch but since it was posted a printf
optimization and warning pass has been added to GCC that deals with a small
subset of the issues raised here (e.g., it has its own format string parser and
detects its own set of printf problems).

I think merging only printf calls that are adjacent in GIMPLE with no
intervening assignments to subsequent printf arguments from variables that may
have escaped would obviate the problems raised in comment #18 and comment #29. 
I don't have a sense of how much that would limit the optimization.

FWIW, I think a more interesting and more widely applicable optimization
opportunity than merging printf calls is in transforming sprintf and especially
snprintf calls with format strings containing multiple %s directives (and no
others) into sequences of strcpy/memcpy/memccpy calls (pr88813).
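
To make the last idea concrete, here is a hand-written sketch of what such a
transformation could amount to for the format "%s%s". It is only an
illustration under my own assumptions, not what GCC emits or what pr88813
proposes in detail; both function names are invented.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Reference version: what the source code would contain.
int concat2_snprintf(char *dst, std::size_t size, const char *a, const char *b)
{
    return std::snprintf(dst, size, "%s%s", a, b);
}

// Hand-written equivalent that uses only strlen and memcpy.
int concat2_memcpy(char *dst, std::size_t size, const char *a, const char *b)
{
    const std::size_t la = std::strlen(a);
    const std::size_t lb = std::strlen(b);

    if (size != 0) {
        std::size_t room = size - 1;                  // keep one byte for the NUL
        const std::size_t na = la < room ? la : room; // bytes of 'a' that fit
        std::memcpy(dst, a, na);
        room -= na;
        const std::size_t nb = lb < room ? lb : room; // bytes of 'b' that fit
        std::memcpy(dst + na, b, nb);
        dst[na + nb] = '\0';
    }

    // Like snprintf, report the length the untruncated result would have.
    return static_cast<int>(la + lb);
}
```

The sketch ignores the case where the combined length exceeds INT_MAX, which
a real transformation would have to handle.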

[Bug debug/89498] [8/9 Regression] ICE in AT_loc_list, at dwarf2out.c:4871

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89498

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 45887
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45887&action=edit
gcc9-pr89498.patch

Untested fix.  If output_view_list_offset is correct, then I think we need this
patch to match what output_view_list_offset will actually do.  If it is not
correct for -gsplit-dwarf (whether DWARF-5 or 4), then the other functions will
need to be adjusted for what it does.

[Bug fortran/89574] [7/8/9 Regression] internal compiler error: in conv_function_val, at fortran/trans-expr.c:3792

2019-03-04 Thread dominiq at lps dot ens.fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89574

Dominique d'Humieres  changed:

   What|Removed |Added

   Priority|P3  |P4
 Status|UNCONFIRMED |NEW
  Known to work||4.8.5
   Keywords||ice-on-valid-code
   Last reconfirmed||2019-03-04
 Ever confirmed|0   |1
Summary|internal compiler error: in |[7/8/9 Regression] internal
   |conv_function_val, at   |compiler error: in
   |fortran/trans-expr.c:3792   |conv_function_val, at
   ||fortran/trans-expr.c:3792
   Target Milestone|--- |7.5
  Known to fail||7.4.1, 8.3.1, 9.0

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.9 up to trunk (9.0).
The test compiles with 4.8.5 and prints " Hello!" at runtime.

> The flag -save-temps doesn't produce any *.i files, only one *.s file.

See pr81615.

[Bug c++/89561] feature request: undefined behaviour compile-time configuration

2019-03-04 Thread bugsthecode at mail dot ru

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89561

bugsthecode at mail dot ru changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|WONTFIX |---

--- Comment #6 from bugsthecode at mail dot ru ---
(In reply to Richard Biener from comment #5)
> Note that iff GCC could easily see "what you want" and see that some
> undefined behavior rule contradicts this then from a QOI perspective GCC
> already tries
> to do what you want.

What does "QOI" mean? And no, there is no perspective where GCC tries to do
what I want, unless it's a perspective with brain damage. I'm not talking even
about "legacy" option, an "error" option is unavailable as well. Only
"generate-crap" is available, and happens more and more often.

> The difficult thing is to detect what you want (from
> inside generic analysis infrastructure).

You literally have source code to see what I want. I wrote it there. Maybe not
ideally and without bugs, but it is there.

That's definitely not what the user wants:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89218

That's miscompilation by gcc. The user didn't request a crash there. And gcc-7
didn't generate such crap. But it's not recognized as a regression because it's
swept under the "we can generate any crap when undefined behaviour is
encountered" crap.

> For example GCC will not misoptimize
> 
> int i;
> 
> int main() { *(float *)&i = 0.0; return i; }
> 
> even if it could (because type-based alias rules make the code undefined)
> because it sees the must-alias.
> 

Are you going to fix this case to generate crap as well in next gcc release?

> That is, -fundefined-behavior=XYZ is impossible besides making all undefined
> behavior implementation-defined (there are many options to individually
> control
> such thing already, like -fwrapv for example).

Ok, there's a huge number of warning flags, but there's also -Wall and -Wextra
to enable a lot of them at once. And there is -Werror to turn all of them into
errors. Is there a global option for undefined behaviour configuration? None
that I know of.

And what's so bad about changing undefined behaviour into
implementation-defined behaviour? Better than a potential CVE in my book.

Why does an option such as -fwrapv even exist? Why not use safe defaults, and
add options like -funsafe-fast-no-wrapv which would disable such behaviour but
potentially make the binary faster? -O2 used to be the safe recommended
optimization level, but now it generates a lot of crap. Maybe fast crap, but
still a pile of vulnerable crashing crap.

[Bug c++/89579] -Wclobbered warning false positive when compiling with -Og

2019-03-04 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89579

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at redhat dot com
   Assignee|unassigned at gcc dot gnu.org  |law at redhat dot com

--- Comment #2 from Jeffrey A. Law  ---
I've got some patches in this space that might help.  Essentially we compute an
inaccurate view of lifetime information in the presence of setjmp/longjmp. 
They're a bit gross as they have to deal with the various forms of RTL we can
see between the setjmp point and testing of the return value and I was having
trouble convincing myself of their correctness in one case.  But they're
certainly worth revisiting.

I'm going to assign to myself to do that analysis, but not commit to fixing
(yet) :-)
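
As background for readers unfamiliar with the warning, the sketch below shows
the kind of situation -Wclobbered is meant to catch in general: a local that
is modified between setjmp and longjmp. It is not the reproducer attached to
this bug, and the names retry_point and run_until are invented.

```cpp
#include <csetjmp>

static std::jmp_buf retry_point;

// Runs the body repeatedly, re-entering through longjmp, until 'limit'
// attempts have been made. Returns the number of attempts.
int run_until(int limit)
{
    // 'attempts' is modified between setjmp and longjmp, so it has to be
    // volatile to have a determinate value after the jump. Without
    // volatile it could live in a register that the jump clobbers.
    volatile int attempts = 0;

    if (setjmp(retry_point) != 0) {
        // Arrived here via longjmp; nothing else to do in this sketch.
    }

    attempts = attempts + 1;
    if (attempts < limit)
        std::longjmp(retry_point, 1);

    return attempts;
}
```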

[Bug c++/89579] -Wclobbered warning false positive when compiling with -Og

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89579

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||law at gcc dot gnu.org,
   ||redi at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
This is caused by the uninitialized __capacity variable in
basic_string& operator=(basic_string&& __str);

The code is like:
// Just move the allocated pointer, our allocator can free it.
pointer __data = nullptr;
size_type __capacity;
if (!_M_is_local())
  {
if (_Alloc_traits::_S_always_equal())
  {
// __str can reuse our existing storage.
__data = _M_data();
__capacity = _M_allocated_capacity;
  }
else // __str can't use it, so free it.
  _M_destroy(_M_allocated_capacity);
  }

_M_data(__str._M_data());
_M_length(__str.length());
_M_capacity(__str._M_allocated_capacity);
if (__data)
  {
__str._M_data(__data);
__str._M_capacity(__capacity);
  }
else
  __str._M_data(__str._M_local_buf);

Although __capacity is never actually used uninitialized (and e.g. the uninit
warning pass figures that out), because it is used only if __data is non-NULL
and __data is non-NULL only if __capacity is also initialized, with -Og or e.g.
with -O1 -fno-tree-dominator-opts, if we don't perform jump threading, we end
up with something like:
  if (_M_local_buf != _35)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 173485181]:
  __capacity_44 = dummy.D.19123._M_allocated_capacity;

   [local count: 247835973]:
  # __data_47 = PHI <_35(14), 0B(13)>
  # __capacity_48 = PHI <__capacity_44(14), __capacity_50(D)(13)>
  dummy._M_dataplus._M_p = _37;
  _45 = D.24766._M_string_length;
  dummy._M_string_length = _45;
  _46 = D.24766.D.19123._M_allocated_capacity;
  dummy.D.19123._M_allocated_capacity = _46;
  if (__data_47 != 0B)
goto ; [70.00%]
  else
goto ; [30.00%]

   [local count: 173485181]:
  D.24766._M_dataplus._M_p = __data_47;
  D.24766.D.19123._M_allocated_capacity = __capacity_48;
  goto ; [100.00%]

before expansion and later when IRA checks for the clobbered vars, it does:
  if (VAR_P (decl)
  && DECL_RTL_SET_P (decl)
  && REG_P (DECL_RTL (decl))
      && regno_clobbered_at_setjmp (setjmp_crosses, REGNO (DECL_RTL (decl))))
where the last one uses:
  return ((REG_N_SETS (regno) > 1
   || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
   regno))
  && REGNO_REG_SET_P (setjmp_crosses, regno));
While REG_N_SETS is just 1 here for the pseudo used for __capacity, because of
the uninitialized use in the PHI it is considered live on the entry block and
that is why the warning is emitted.

Changing the libstdc++ code to size_type __capacity = 0; makes the warning go
away, though we'd need to check that it doesn't result in worse generated code.
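
A stripped-down sketch of that change, outside of libstdc++, could look like
this. Storage and steal are hypothetical stand-ins for basic_string and its
move assignment; the only point is the correlated pair data/capacity with
capacity initialized up front.

```cpp
#include <cstddef>

// Hypothetical stand-in for the string's heap storage.
struct Storage {
    char       *data;
    std::size_t capacity;
};

// Mimics the shape of the move assignment quoted above: 'self' takes over
// the storage of 'other', and 'other' may get the old storage of 'self'.
inline void steal(Storage &self, Storage &other, bool self_is_reusable)
{
    char       *data = nullptr;
    std::size_t capacity = 0;   // initialized, as proposed in the comment

    if (self_is_reusable) {
        data = self.data;
        capacity = self.capacity;
    }

    self = other;

    if (data) {
        // 'capacity' is only read when 'data' is non-null.
        other.data = data;
        other.capacity = capacity;
    } else {
        other.data = nullptr;
        other.capacity = 0;
    }
}
```

With the initializer in place no uninitialized value flows into the merge
point, whether or not the compiler threads the two tests on data together.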

[Bug c++/89579] -Wclobbered warning false positive when compiling with -Og

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89579

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-03-04
 Ever confirmed|0   |1

[Bug ada/89583] New: GNAT.Sockets.Bind_Socket fails with IPv4 address

2019-03-04 Thread simon at pushface dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89583

Bug ID: 89583
   Summary: GNAT.Sockets.Bind_Socket fails with IPv4 address
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: simon at pushface dot org
  Target Milestone: ---

Created attachment 45886
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45886&action=edit
Reproducer

On Darwin 18.2.0 (Mojave), gcc (GCC) 9.0.1 20190219 (experimental),
GNAT.Sockets.Bind_Socket fails if given an IPv4 address.

I believe that this is caused by the introduction of the new 
IPv6-aware Sockaddr, which is an Unchecked_Union where the size 
in bytes can be 4 (unspec), 16 (IPv4) or 28 (IPv6) bytes; when 
Bind_Socket says

  Sin : aliased Sockaddr;
  Len : constant C.int := Sin'Size / 8;

   begin
  Set_Address (Sin'Unchecked_Access, Address);

  Res := C_Bind (C.int (Socket), Sin'Address, Len);

it sets Len to be 28, and macOS (darwin 15, darwin 18) says this 
is an invalid argument; see demonstrator.
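
To illustrate the size mismatch in C-level terms, the following sketch picks
the length from the address family rather than from the largest variant. It
is written in C++ against the POSIX headers, not Ada, and is not a proposed
fix for GNAT.Sockets; the helper name sockaddr_length is invented.

```cpp
#include <sys/socket.h>
#include <netinet/in.h>

// Hypothetical helper: the length to pass to bind() for a given address
// family, instead of the size of the largest member of a union.
socklen_t sockaddr_length(int family)
{
    switch (family) {
    case AF_INET:
        return sizeof(sockaddr_in);    // 16 bytes, as in the report
    case AF_INET6:
        return sizeof(sockaddr_in6);   // 28 bytes, as in the report
    default:
        return 0;                      // other families not handled here
    }
}
```

Passing 28 for an AF_INET address is what the report describes macOS
rejecting as an invalid argument.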

[Bug target/89581] Unneeded stack alignment on windows x86

2019-03-04 Thread yyc1992 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89581

--- Comment #1 from Yichao Yu  ---
The problem is still there when compiled with -O2

```
f:
pushq   %rbp
vmovq   (%r8), %xmm1
movq%rcx, %rax
vmovq   8(%r8), %xmm0
vaddsd  (%rdx), %xmm1, %xmm1
vaddsd  8(%rdx), %xmm0, %xmm0
movq%rsp, %rbp
andq$-16, %rsp
vmovsd  %xmm1, (%rcx)
vmovsd  %xmm0, 8(%rcx)
leave
ret
```


but is not there under `-O2` when the arguments and results are passed
explicitly by reference.

```
void f2(vdouble *res, const vdouble *x, const vdouble *y)
{
*res = (vdouble){x->x1 + y->x1, x->x2 + y->x2};
}
```


```
f2:
vmovsd  8(%rdx), %xmm0
vmovsd  (%rdx), %xmm1
vaddsd  8(%r8), %xmm0, %xmm0
vaddsd  (%r8), %xmm1, %xmm1
vmovsd  %xmm0, 8(%rcx)
vmovsd  %xmm1, (%rcx)
```

The problem comes back, however, with the explicit pass by reference version
when compiled under -O3

```
f2:
pushq   %rbp
vmovapd (%rdx), %xmm0
vaddpd  (%r8), %xmm0, %xmm0
movq%rsp, %rbp
andq$-16, %rsp
vmovaps %xmm0, (%rcx)
leave
ret
```

[Bug target/89582] New: Suboptimal code generated for floating point struct in -O3 compare to -O2

2019-03-04 Thread yyc1992 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89582

Bug ID: 89582
   Summary: Suboptimal code generated for floating point struct in
-O3 compare to -O2
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

When testing the code for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89581 on
linux, I noticed that the code seems suboptimal when compiled under -O3 rather
than -O2 on linux x64.

```
typedef struct {
double x1;
double x2;
} vdouble __attribute__((aligned(16)));

vdouble f(vdouble x, vdouble y)
{
return (vdouble){x.x1 + y.x1, x.x2 + y.x2};
}
```

Compiled with `-O2` produces
```
f:
addsd   %xmm3, %xmm1
addsd   %xmm2, %xmm0
ret
```

With `-O3` or `-Ofast`, however, the code produced is,

```
f:
movq%xmm0, -40(%rsp)
movq%xmm1, -32(%rsp)
movapd  -40(%rsp), %xmm4
movq%xmm2, -24(%rsp)
movq%xmm3, -16(%rsp)
addpd   -24(%rsp), %xmm4
movaps  %xmm4, -40(%rsp)
movsd   -32(%rsp), %xmm1
movsd   -40(%rsp), %xmm0
ret
```

It seems that gcc tries to use the vector instruction but has to go through the
stack to do so. I did a quick benchmark which confirms that the -O3 version is
much slower than the -O2 version.

Clang produces

```
f:
addsd   %xmm2, %xmm0
addsd   %xmm3, %xmm1
retq
```

As long as any optimizations are on, which seems appropriate.

[Bug target/89581] New: Unneeded stack alignment on windows x86

2019-03-04 Thread yyc1992 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89581

Bug ID: 89581
   Summary: Unneeded stack alignment on windows x86
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

On windows, when compiling the following code with ` gcc -mavx2 a.c -o - -S -O3
-g0 -fno-asynchronous-unwind-tables -fomit-frame-pointer -Wall -Wextra`

```
typedef struct {
double x1;
double x2;
} vdouble __attribute__((aligned(16)));

vdouble f(vdouble x, vdouble y)
{
return (vdouble){x.x1 + y.x1, x.x2 + y.x2};
}
```

I got

```
pushq   %rbp
vmovdqa (%r8), %xmm0
movq%rcx, %rax
vaddpd  (%rdx), %xmm0, %xmm0
movq%rsp, %rbp
andq$-16, %rsp
vmovaps %xmm0, (%rcx)
leave
ret
```

which includes 4 extra instructions to align the stack without actually using
it.

FWIW, clang has a similar problem on linux...
https://bugs.llvm.org/show_bug.cgi?id=40844

Also worth noting that with -O2 all three vector instructions are split into
scalar ones whereas clang does this transformation at -O2...

[Bug target/88530] [8 Regression] AArch64 Unsupported options passed to assemblers when it doesn't need to.

2019-03-04 Thread tnfchris at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88530

--- Comment #7 from Tamar Christina  ---
Author: tnfchris
Date: Mon Mar  4 15:48:49 2019
New Revision: 269366

URL: https://gcc.gnu.org/viewcvs?rev=269366&root=gcc&view=rev
Log:
AArch64: Make test options_set_10.c not run on native.

The test options_set_10.c shouldn't run when cross compiled.
In addition to gating it on linux I'm also gating it on native now.

gcc/testsuite/ChangeLog:

PR target/88530
* gcc.target/aarch64/options_set_10.c:


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/options_set_10.c

[Bug bootstrap/89560] [9 regression] ICE In function 'rtx_def* gen_vec_extract_lo_v64qi(rtx, rtx)'

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89560

--- Comment #9 from Jakub Jelinek  ---
Created attachment 45885
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45885&action=edit
gcc9-pr89560.patch

Fix the buffer overflow.  Unlike most other trees, CALL_EXPR has variable size,
on 64-bit targets 48 + call_expr_nargs * 8, while fold_checksum_tree was using
fixed size 216 byte long buffer, so any CALL_EXPR with 22 or more arguments and
TREE_NO_WARNING flag set caused buffer overflow.
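
The arithmetic behind the "22 or more arguments" threshold can be checked
with a few lines. The sketch below only restates the numbers from the
paragraph above; kFixedBuffer, call_expr_size and fits are illustrative
names, not GCC identifiers.

```cpp
#include <cstddef>

// Size of the fixed buffer quoted in the comment above.
constexpr std::size_t kFixedBuffer = 216;

// Size of a CALL_EXPR on a 64-bit target, using the formula from the comment.
constexpr std::size_t call_expr_size(std::size_t nargs)
{
    return 48 + nargs * 8;
}

// True if a CALL_EXPR with 'nargs' arguments fits into the fixed buffer.
constexpr bool fits(std::size_t nargs)
{
    return call_expr_size(nargs) <= kFixedBuffer;
}

static_assert(call_expr_size(21) == 216, "21 arguments fill the buffer exactly");
static_assert(fits(21), "21 arguments still fit");
static_assert(!fits(22), "22 arguments need 224 bytes and overflow");
```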

[Bug bootstrap/89560] [9 regression] ICE In function 'rtx_def* gen_vec_extract_lo_v64qi(rtx, rtx)'

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89560

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-03-04
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #8 from Jakub Jelinek  ---
Oops, caused by my previous fold_checksum_tree fix r269303 aka PR89503.

Short testcase:
#define TEN(x) x##0, x##1, x##2, x##3, x##4, x##5, x##6, x##7, x##8, x##9,
#define HUNDRED(x) TEN(x##0) TEN(x##1) TEN(x##2) TEN(x##3) TEN(x##4) \
   TEN(x##5) TEN(x##6) TEN(x##7) TEN(x##8) TEN(x##9)
int foo (int, ...);

int
bar (void)
{
  return (foo (HUNDRED (1) 0));
}

[Bug c++/89580] New: overload resolution for pointers fails to consider conversion operator

2019-03-04 Thread eric at efcs dot ca

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89580

Bug ID: 89580
   Summary: overload resolution for pointers fails to consider
conversion operator
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eric at efcs dot ca
  Target Milestone: ---

I believe the following code is valid and should be accepted.

// g++ -std=c++11
struct Foo {
  template <class T> operator T() const;
};
// error: no match for 'operator==' (operand types are 'Foo' and 'void*')
bool R = (Foo{} == static_cast<void*>(nullptr));


Note that when Foo's conversion operator is declared as `operator T*()`, the
code is accepted.

See https://godbolt.org/z/nc43wx

[Bug c++/89579] New: -Wclobbered warning false positive when compiling with -Og

2019-03-04 Thread abigail.buccaneer at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89579

Bug ID: 89579
   Summary: -Wclobbered warning false positive when compiling with
-Og
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: diagnostic, rejects-valid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: abigail.buccaneer at gmail dot com
  Target Milestone: ---

Created attachment 45884
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45884&action=edit
Minimal reproduction

When compiling with `-Wclobbered -Og`, we see false positives from the
-Wclobbered warning. These don't occur with any other optimization level. This
breaks our build under `-Wextra -Werror -Og`.

Attached is a minimal reproduction test case. This bug is present from GCC
5.5.0 to GCC trunk (9.0.1 20190303). GCC 5.4.0 and below are not affected.

[Bug target/89578] [9 Regression] 5% runtime regression for 481.wrf at -Ofast -flto

2019-03-04 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89578

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-03-04
 Ever confirmed|0   |1

--- Comment #3 from Martin Liška  ---
(In reply to Richard Biener from comment #2)
> Suspicious changes include the fix for PR87609 and the new
> pass_remove_partial_avx_dependency.  I see no other relevant changes.

Yes, started with r269098.

[Bug target/68211] Free __m128d subreg of double

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68211

Steven Bosscher  changed:

   What|Removed |Added

 Status|REOPENED|NEW
   Last reconfirmed|2016-04-19 00:00:00 |2019-3-4

--- Comment #7 from Steven Bosscher  ---
"g++ (Compiler-Explorer-Build) 9.0.1 20190303 (experimental)":

#include <immintrin.h>

double sqrt_up(double x){
  __m128d y = { x, 0 };
  return _mm_cvtsd_f64(_mm_sqrt_round_sd(y, y,
_MM_FROUND_TO_POS_INF|_MM_FROUND_NO_EXC));
}

double f(double x)
{
  return __builtin_sqrt(x);
}


sqrt_up(double):
vmovq   %xmm0, %xmm0
vsqrtsd {ru-sae}, %xmm0, %xmm0, %xmm0
ret
f(double):
vsqrtsd %xmm0, %xmm0, %xmm0
ret

[Bug target/89578] [9 Regression] 5% runtime regression for 481.wrf at -Ofast -flto

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89578

Richard Biener  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 Status|ASSIGNED|UNCONFIRMED
   Last reconfirmed|2019-03-04 00:00:00 |
 CC||hjl.tools at gmail dot com
   Assignee|marxin at gcc dot gnu.org  |unassigned at gcc dot gnu.org
   Target Milestone|--- |9.0
 Ever confirmed|1   |0

--- Comment #2 from Richard Biener  ---
Suspicious changes include the fix for PR87609 and the new
pass_remove_partial_avx_dependency.  I see no other relevant changes.

[Bug target/89578] [9 Regression] 5% runtime regression for 481.wrf at -Ofast -flto

2019-03-04 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89578

Martin Liška  changed:

   What|Removed |Added

   Keywords||needs-bisection
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-03-04
 CC||marxin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Also seen here:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=2.270.0=7.270.0=4.270.0=8.270.0

I'll bisect that.

[Bug c++/89538] [7.3.0] GCC miscompiling LLVM because of wrong vectorization

2019-03-04 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89538

--- Comment #4 from Martin Liška  ---
> 
> So I investigated further to figure out which instruction actually sets
> "0x0" to the new location, and found that instruction 202aef4 below is the
> one
> 
>  202aed0: 48 c7 00 00 00 00 00  movq   $0x0,(%rax)
>  202aed7: f3 0f 6f 08           movdqu (%rax),%xmm1
>  202aedb: 48 83 c0 20           add    $0x20,%rax
>  202aedf: 48 83 c1 20           add    $0x20,%rcx
>  202aee3: 48 c7 40 f0 00 00 00  movq   $0x0,-0x10(%rax)
>  202aeea: 00
>  202aeeb: f3 0f 6f 40 f0        movdqu -0x10(%rax),%xmm0
>  202aef0: 0f 11 49 e0           movups %xmm1,-0x20(%rcx)
>  202aef4: 0f 11 41 f0           movups %xmm0,-0x10(%rcx)
>  202aef8: 48 39 f8              cmp    %rdi,%rax
>  202aefb: 75 d3                 jne    202aed0

What's the name of the function this assembly belongs to?

Can you please test the 7.4.0 release?

[Bug target/89578] New: [9 Regression] 5% runtime regression for 481.wrf at -Ofast -flto

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89578

Bug ID: 89578
   Summary: [9 Regression] 5% runtime regression for 481.wrf at
-Ofast -flto
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

I see with -march=native on haswell in addition to -Ofast -flto ~5% runtime
regression for 481.wrf with the last known good rev. being r269093 and
the first known bad one r269146.

[Bug tree-optimization/17217] not removing removal of nested structs

2019-03-04 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17217

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
 Ever confirmed|1   |0

--- Comment #7 from Andrew Pinski  ---
(In reply to Steven Bosscher from comment #6)
> What code is expected in this really old bug? Trunk x86-64 today:
> __attribute__((noinline))

g1 should not be noinline for the nested testcase.

The un-nested testcase actually should not be optimized.
The nested function testcase shows the issue.
For the original testcase:
int h(int *a);
int f(int i, int j)
{
  int t = i;
  int t1 = j;
  int g()
  {
     return h(&t) + t1;
  }
  return g() + t1;
}

We should optimize this to the same as:
int h(int *a);
int f(int i, int j)
{
  return h(&i) + 2 * j;
}

[Bug tree-optimization/89437] [9 regression] incorrect result for sinl (atanl (x))

2019-03-04 Thread wilco at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89437

Wilco  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Wilco  ---
Fixed

[Bug gcov-profile/89577] In the manual, replace -fprofile-arcs -ftest-coverage by the simpler --coverage

2019-03-04 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89577

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-03-04
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
   Target Milestone|--- |9.0
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Yep, let me update that.

[Bug tree-optimization/17217] not removing removal of nested structs

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17217

Steven Bosscher  changed:

   What|Removed |Added

 Status|REOPENED|WAITING

--- Comment #6 from Steven Bosscher  ---
What code is expected in this really old bug? Trunk x86-64 today:

int h(int *a);
struct G
{
int t;
int t1;
};
static int g1(struct G*);
int f1(int i, int j)
{
  struct G nestedf1={i,j};
  return g1(&nestedf1)+nestedf1.t1;
}
__attribute__((noinline))
static int g1(struct G *nestedf1)
{
  return h(&nestedf1->t)+ nestedf1->t1;
}



g1(G*):
pushq   %rbx
movq    %rdi, %rbx
call    h(int*)
addl    4(%rbx), %eax
popq    %rbx
ret
f1(int, int):
subq    $24, %rsp
movl    %edi, 8(%rsp)
leaq    8(%rsp), %rdi
movl    %esi, 12(%rsp)
call    g1(G*)
addl    12(%rsp), %eax
addq    $24, %rsp
ret

[Bug ipa/89567] [missed-optimization] Should not be initializing unused struct parameter members

2019-03-04 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89567

--- Comment #5 from Martin Jambor  ---
(In reply to Eyal Rozenberg from comment #4)
> > In the first example, the interprocedural constant propagation pass
> > (IPA-CP) found that foo1 is so small that copying all of it might be
> > worth not passing the unused argument and so it does, that is why
> > you'll find function foo1 twice in the assembly.
> 
> Why does this have anything to do with constant propagation? I also don't
> understand the sense in two identical copies.

The transformation literally removes a parameter from a function.  All
(direct) callers in the same compilation unit then call the new clone,
all indirect clones and callers from other compilation units call the
old one, with the old calling convention.  I understand that in your
simple testcase that does not matter but in others it might (and
IPA-CP is a high-level pass that does not know about physical
registers, calling conventions etc.).

> 
> It also sounds like "the wrong optimization" is being used if it's not about
> noticing unused parameters.

You can call it that if you like.  It was just easy to add
there, it does a good job and has no practical disadvantages.

> 
> > This functionality
> > in the pass is there just "on the side" and it is not easy to make it
> > also work with aggregates, not even desirable (that is the job of a
> > different pass, see below).
> >
> > Both examples are compiled better if you make foo1 and foo2 static.
> 
> This really makes no sense to me! bar() is not affected by other TUs at
> all...

IPA-SRA primarily changes foo.  

> 
> > In the latter case, you get exactly what you want, the structure is
> > split and only the used part survives.  In the first example, you
> > don't get a clone emitted which you probably don't need.  Both of
> > these transformations are done by a pass called interprocedural scalar
> > replacement of aggregates (IPA-SRA), which specifically also aims to
> > remove unused arguments, but it never creates multiple clones.
> 
> I like this pass :-) ... so, why does it work for the static case with
> bar2() but doesn't work with bar1() ?

I don't understand your question; just make foo1 and/or foo2 static
and it will trigger.   The pass needs to adjust all callers and
therefore only works on static functions because otherwise there may
be other calls in other compilation units.

> 
> > I'm afraid you'd need to provide a strong real-world use-case to make
> > me investigate how to make IPA-SRA clone so you might not need static
> > and/or LTO because that would mean devising a cost/benefit
> > (size/speedup) heuristics and that is not easy.
> 
> For now I'm just trying to understand why this isn't already happening. Then
> I'll perhaps try to understand why clang does do this.
> 
> But - don't necessarily clone. IIUC,  cloning would possibly mean removing
> that parameter even though it's a field of a struct. But even if you _don't_
> clone, functions calling foo() should still not have to initialize that
> member. It seems like we're talking about different optimizations.

Indeed, what you really have in mind is an IPA-DSE (DSE stands for dead
store elimination), which would only affect callers.  We don't have
that; it might be an alternative to IPA-SRA when we cannot or
do not want to clone.
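
To make the distinction concrete, here is a minimal sketch of the kind of code
under discussion.  It is not the reporter's testcase: the names `params`,
`foo`, `bar` and `self_check` are hypothetical, and whether a given GCC
version actually drops the dead store depends on the options used.

```c
#include <assert.h>

/* Hypothetical aggregate: foo() only ever reads `used`. */
struct params {
  int used;
  int unused;
};

/* `static` means every caller is visible in this translation unit,
   which is the precondition for IPA-SRA described above. */
static int foo(struct params p)
{
  return p.used * 2;
}

/* The caller initializes both members; the store to `unused` is dead
   as far as foo() is concerned. */
int bar(int x)
{
  struct params p = { x, 42 };
  return foo(p);
}

/* Small self-check: the observable result depends only on `used`. */
void self_check(void)
{
  assert(bar(21) == 42);
}
```

With `foo` static, a pass that rewrites the callee and all of its callers
(IPA-SRA) can pass only `used`.  Without `static`, only a caller-side
transformation such as the IPA-DSE mentioned above could avoid the store
without cloning.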

[Bug target/40072] Nonoptimal code - CMOVxx %eax,%edi; mov %edi,%eax; retq

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40072

Steven Bosscher  changed:

   What|Removed |Added

 Status|REOPENED|NEW
   Last reconfirmed|2009-05-08 16:08:31 |2019-3-4
 CC|steven at gcc dot gnu.org  |
  Known to fail||8.2.0

[Bug target/86011] Inefficient code generated for ldivmod with constant value

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86011

Steven Bosscher  changed:

   What|Removed |Added

 Status|WAITING |NEW

[Bug tree-optimization/83352] Missed optimization in math expression: sqrt(sqrt(a)) == pow(a, 1/4)

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83352

Steven Bosscher  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Steven Bosscher  ---
For more than a year there has been no explanation of how "pow(a, 1/4)" can
be compiled to faster code than "sqrt(sqrt(a))" => WONTFIX.

[Bug c/89569] line number is not accurate on large file gcc compared to clang

2019-03-04 Thread dmalcolm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89569

--- Comment #1 from David Malcolm  ---
I don't think the word "accurate" is right here: both gcc and clang print the
wrong line number - they're just getting it wrong in different ways.

[Bug tree-optimization/65964] [meta] Operand Shortening

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65964
Bug 65964 depends on bug 14844, which changed state.

Bug 14844 Summary: [tree-ssa] narrow types if wide result is not needed for 
unsigned types or when wrapping is true
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14844

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug middle-end/19986] [meta-bug] fold missing optimizations (compared to RTL)

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19986
Bug 19986 depends on bug 14844, which changed state.

Bug 14844 Summary: [tree-ssa] narrow types if wide result is not needed for 
unsigned types or when wrapping is true
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14844

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug tree-optimization/14844] [tree-ssa] narrow types if wide result is not needed for unsigned types or when wrapping is true

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14844

Steven Bosscher  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug other/89394] libiberty :stack overflow in nm

2019-03-04 Thread wcventure at 126 dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89394

--- Comment #5 from Cheng Wen  ---
So many similar cases and repetitive CVEs.

This problem has been fixed before, but it has not been completely fixed.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85122
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85452
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87335
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87636
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87675
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87681

[Bug middle-end/78824] multiple add should in my opinion be optimized to multiplication

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78824

Steven Bosscher  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug c++/89576] [8/9 Regression] constexpr not working if implicitly captured in a lambda in a function template (gcc 8.3+)

2019-03-04 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89576

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-03-04
 CC||jakub at gcc dot gnu.org,
   ||nathan at gcc dot gnu.org
   Target Milestone|--- |8.4
Summary|constexpr not working if|[8/9 Regression] constexpr
   |implicitly captured in a|not working if implicitly
   |lambda in a function|captured in a lambda in a
   |template (gcc 8.3+) |function template (gcc
   ||8.3+)
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with r268016.  When not in a template, it is still accepted.

[Bug other/89394] libiberty :stack overflow in nm

2019-03-04 Thread wcventure at 126 dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89394

Cheng Wen  changed:

   What|Removed |Added

 CC||wcventure at 126 dot com

--- Comment #4 from Cheng Wen  ---
This issue is similar to CVE-2018-18700 & CVE-2018-18701

[Bug tree-optimization/14455] Structs that cannot alias are not SRA'd

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14455

Steven Bosscher  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

[Bug other/89577] New: In the manual, replace -fprofile-arcs -ftest-coverage by the simpler --coverage

2019-03-04 Thread vincent-gcc at vinc17 dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89577

Bug ID: 89577
   Summary: In the manual, replace -fprofile-arcs -ftest-coverage
by the simpler --coverage
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincent-gcc at vinc17 dot net
  Target Milestone: ---

The GCC manual Section 10 (gcov)

  https://gcc.gnu.org/onlinedocs/gcc/Invoking-Gcov.html
  https://gcc.gnu.org/onlinedocs/gcc/Gcov-and-Optimization.html

uses "-fprofile-arcs -ftest-coverage" while the simpler option --coverage could
be used instead (I also assume that --coverage may be better in general, as it
generates the right option when linking). I suggest the replacement.

[Bug tree-optimization/21982] GCC should combine adjacent stdio calls

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982

Steven Bosscher  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING
   Last reconfirmed|2006-02-05 21:14:21 |2019-3-4

--- Comment #38 from Steven Bosscher  ---
What happened with Diego's patch?
(https://gcc.gnu.org/ml/gcc-patches/2005-06/msg00909.html)

[Bug rtl-optimization/37516] ~(-2 - a) is not being optimized into a + 1

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37516

--- Comment #7 from Richard Biener  ---
A match.pd rule should be reasonably easy to write.  Beware that ~a is not
undefined but -a - 1 might be if it overflows.  That is

 (simplify
  (bit_not (minus INTEGER_CST@0 @1))
  (plus @1 (minus (negate @0) { build_one_cst (type); })))

with the constant folding done at compile-time, caring for overflow in the
constant term.  As said, not sure how to avoid @1 + new-cst from overflowing.
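
The algebra behind the rule can be checked outside the compiler.  The sketch
below is only an illustration, not part of any patch: it uses unsigned
arithmetic so that wrap-around is well defined, and the function names are
made up for the example.

```c
#include <assert.h>
#include <limits.h>

/* Left-hand side of the rule: ~(c - a). */
unsigned lhs(unsigned c, unsigned a)
{
  return ~(c - a);
}

/* Right-hand side: a + (-c - 1), with the constant term folded first,
   as the rule would do at compile time. */
unsigned rhs(unsigned c, unsigned a)
{
  unsigned folded = -c - 1u;
  return a + folded;
}

/* Compares both sides on a few values, including c == -2 from the
   bug title.  Returns 1 if they always agree. */
int identity_holds(void)
{
  const unsigned cs[] = { 0u, 1u, (unsigned)-2, UINT_MAX, 12345u };
  const unsigned as[] = { 0u, 1u, 7u, UINT_MAX, (unsigned)INT_MAX };
  for (unsigned i = 0; i < 5; i++)
    for (unsigned j = 0; j < 5; j++)
      if (lhs(cs[i], as[j]) != rhs(cs[i], as[j]))
        return 0;
  return 1;
}

/* Self-check for the identity and for the c == -2 case. */
void self_check(void)
{
  assert(identity_holds());
  assert(rhs((unsigned)-2, 5u) == 6u);  /* ~(-2 - a) == a + 1 */
}
```

For signed types the same identity holds only as long as neither the folded
constant nor the final addition overflows, which is the open question above.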

[Bug tree-optimization/63864] Missed late memory CSE

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63864

--- Comment #4 from Steven Bosscher  ---
Code looks pretty much the same for "test_ok" and "test_slow" since GCC 6 for
x86-64, and since GCC 7 for i686.

GCC 6.3 x86-64:
test_ok(float (*) [3], float, float, float, float, float):
mulss   %xmm3, %xmm0
movss   4(%rdi), %xmm6
mulss   %xmm3, %xmm1
mulss   %xmm3, %xmm2
movss   12(%rdi), %xmm3
movaps  %xmm0, %xmm5
addss   %xmm4, %xmm1
movss   (%rdi), %xmm0
addss   %xmm4, %xmm5
addss   %xmm4, %xmm2
mulss   %xmm1, %xmm3
mulss   %xmm5, %xmm0
mulss   %xmm5, %xmm6
mulss   8(%rdi), %xmm5
addss   %xmm3, %xmm0
movss   24(%rdi), %xmm3
mulss   %xmm2, %xmm3
addss   %xmm3, %xmm0
movss   16(%rdi), %xmm3
mulss   %xmm1, %xmm3
mulss   20(%rdi), %xmm1
addss   %xmm3, %xmm6
movss   28(%rdi), %xmm3
mulss   %xmm2, %xmm3
mulss   32(%rdi), %xmm2
addss   %xmm1, %xmm5
addss   %xmm3, %xmm6
addss   %xmm2, %xmm5
addss   %xmm6, %xmm0
addss   %xmm5, %xmm0
ret
test_slow(mat3&, float, float, float, float, float):
mulss   %xmm3, %xmm0
mulss   %xmm3, %xmm1
mulss   %xmm2, %xmm3
movss   16(%rdi), %xmm2
movaps  %xmm0, %xmm6
addss   %xmm4, %xmm1
movss   4(%rdi), %xmm0
addss   %xmm4, %xmm6
addss   %xmm3, %xmm4
movss   (%rdi), %xmm3
mulss   %xmm1, %xmm2
mulss   %xmm6, %xmm0
mulss   %xmm6, %xmm3
mulss   8(%rdi), %xmm6
addss   %xmm2, %xmm0
movss   28(%rdi), %xmm2
mulss   %xmm4, %xmm2
addss   %xmm2, %xmm0
movss   12(%rdi), %xmm2
mulss   %xmm1, %xmm2
mulss   20(%rdi), %xmm1
addss   %xmm2, %xmm3
movss   24(%rdi), %xmm2
mulss   %xmm4, %xmm2
mulss   32(%rdi), %xmm4
addss   %xmm6, %xmm1
addss   %xmm2, %xmm3
addss   %xmm4, %xmm1
addss   %xmm3, %xmm0
addss   %xmm1, %xmm0
ret


GCC 7.4 i686:
test_ok(float (*) [3], float, float, float, float, float):
flds    20(%esp)
flds    8(%esp)
fmul    %st(1), %st
movl    4(%esp), %eax
fadds   24(%esp)
flds    12(%esp)
fmul    %st(2), %st
fadds   24(%esp)
fxch    %st(2)
fmuls   16(%esp)
fadds   24(%esp)
flds    (%eax)
fmul    %st(2), %st
flds    12(%eax)
fmul    %st(4), %st
faddp   %st, %st(1)
flds    24(%eax)
fmul    %st(2), %st
faddp   %st, %st(1)
flds    4(%eax)
fmul    %st(3), %st
flds    16(%eax)
fmul    %st(5), %st
faddp   %st, %st(1)
flds    28(%eax)
fmul    %st(3), %st
faddp   %st, %st(1)
faddp   %st, %st(1)
fxch    %st(2)
fmuls   8(%eax)
fxch    %st(3)
fmuls   20(%eax)
faddp   %st, %st(3)
fmuls   32(%eax)
faddp   %st, %st(2)
faddp   %st, %st(1)
ret
test_slow(mat3&, float, float, float, float, float):
flds    20(%esp)
flds    8(%esp)
fmul    %st(1), %st
movl    4(%esp), %eax
fadds   24(%esp)
flds    12(%esp)
fmul    %st(2), %st
fadds   24(%esp)
fxch    %st(2)
fmuls   16(%esp)
fadds   24(%esp)
flds    4(%eax)
fmul    %st(2), %st
flds    16(%eax)
fmul    %st(4), %st
faddp   %st, %st(1)
flds    28(%eax)
fmul    %st(2), %st
faddp   %st, %st(1)
flds    (%eax)
fmul    %st(3), %st
flds    12(%eax)
fmul    %st(5), %st
faddp   %st, %st(1)
flds    24(%eax)
fmul    %st(3), %st
faddp   %st, %st(1)
faddp   %st, %st(1)
fxch    %st(2)
fmuls   8(%eax)
fxch    %st(3)
fmuls   20(%eax)
faddp   %st, %st(3)
fmuls   32(%eax)
faddp   %st, %st(2)
faddp   %st, %st(1)
ret

[Bug ipa/89567] [missed-optimization] Should not be initializing unused struct parameter members

2019-03-04 Thread eyalroz at technion dot ac.il

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89567

--- Comment #4 from Eyal Rozenberg  ---

> In the first example, the interprocedural constant propagation pass
> (IPA-CP) found that foo1 is so small that copying all of it might be
> worth not passing the unused argument and so it does, that is why
> you'll find function foo1 twice in the assembly.

Why does this have anything to do with constant propagation? I also don't
understand the sense in two identical copies.

It also sounds like "the wrong optimization" is being used if it's not about
noticing unused parameters.

> This functionality
> in the pass is there just "on the side" and it is not easy to make it
> also work with aggregates, not even desirable (that is the job of a
> different pass, see below).
>
> Both examples are compiled better if you make foo1 and foo2 static.

This really makes no sense to me! bar() is not affected by other TUs at all...

> In the latter case, you get exactly what you want, the structure is
> split and only the used part survives.  In the first example, you
> don't get a clone emitted which you probably don't need.  Both of
> these transformations are done by a pass called interprocedural scalar
> replacement of aggregates (IPA-SRA), which specifically also aims to
> remove unused arguments, but it never creates multiple clones.

I like this pass :-) ... so, why does it work for the static case with bar2()
but doesn't work with bar1() ?


> I'm afraid you'd need to provide a strong real-world use-case to make
> me investigate how to make IPA-SRA clone so you might not need static
> and/or LTO because that would mean devising a cost/benefit
> (size/speedup) heuristics and that is not easy.

For now I'm just trying to understand why this isn't already happening. Then
I'll perhaps try to understand why clang does do this.

But - don't necessarily clone. IIUC,  cloning would possibly mean removing that
parameter even though it's a field of a struct. But even if you _don't_ clone,
functions calling foo() should still not have to initialize that member. It
seems like we're talking about different optimizations.

[Bug tree-optimization/41320] XFAIL gcc.dg/tree-ssa/forwprop-12.c

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41320

--- Comment #4 from Richard Biener  ---
Actually reconstructing array-refs is dangerous so I think the testcase looks
for something unwanted...

[Bug tree-optimization/15241] [tree-ssa] Convert a <= 7 && b <= 7 into (a | b) <= 7.

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15241

Steven Bosscher  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Steven Bosscher  ---
Fixed since GCC 8
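
For reference, the equivalence named in the summary can be checked with a
small sketch.  It assumes unsigned operands (the summary does not state the
types), and it relies on 7 being of the form 2^k - 1; the function names are
made up for the example.

```c
#include <assert.h>

/* Original form: two comparisons joined by a logical AND. */
int both_small(unsigned a, unsigned b)
{
  return a <= 7 && b <= 7;
}

/* Combined form: one bitwise OR and a single comparison. */
int both_small_or(unsigned a, unsigned b)
{
  return (a | b) <= 7;
}

/* Exhaustively compares the two forms on a small range.
   Returns 1 if they always agree. */
int forms_agree(void)
{
  for (unsigned a = 0; a < 64; a++)
    for (unsigned b = 0; b < 64; b++)
      if (both_small(a, b) != both_small_or(a, b))
        return 0;
  return 1;
}

/* Self-check covering the boundary values. */
void self_check(void)
{
  assert(forms_agree());
  assert(both_small_or(7u, 7u) == 1);
  assert(both_small_or(8u, 0u) == 0);
}
```

The forms agree because `(a | b) <= 7` holds exactly when neither operand has
a bit set above bit 2.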

[Bug middle-end/19987] [meta-bug] fold missing optimizations in general

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987
Bug 19987 depends on bug 15241, which changed state.

Bug 15241 Summary: [tree-ssa] Convert a <= 7 && b <= 7 into (a | b) <= 7.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15241

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/89437] [9 regression] incorrect result for sinl (atanl (x))

2019-03-04 Thread wilco at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89437

--- Comment #1 from Wilco  ---
Author: wilco
Date: Mon Mar  4 12:36:04 2019
New Revision: 269364

URL: https://gcc.gnu.org/viewcvs?rev=269364&root=gcc&view=rev
Log:
Fix PR89437

Fix PR89437. Fix the sinatan-1.c testcase to not run without
a C99 target system.  Use nextafterl for long double initialization.

Fix an issue with sinl (atanl (sqrtl (LDBL_MAX)) returning 0.0
instead of 1.0 by using x < sqrtl (LDBL_MAX) in match.pd.

gcc/
PR tree-optimization/89437
* match.pd: Use lt in sin(atan(x)) and cos(atan(x)) simplifications.

testsuite/
PR tree-optimization/89437
* gcc.dg/sinatan-1.c: Fix testcase.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/match.pd
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/sinatan-1.c

[Bug tree-optimization/45144] SRA optimization issue of bit-field

2019-03-04 Thread steven at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45144

Steven Bosscher  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||steven at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #6 from Steven Bosscher  ---
With/without -fno-tree-sra gives same code since GCC 5.4.1.

[Bug c++/89576] New: constexpr not working if implicitly captured in a lambda in a function template (gcc 8.3+)

2019-03-04 Thread wielkiegie at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89576

Bug ID: 89576
   Summary: constexpr not working if implicitly captured in a
lambda in a function template (gcc 8.3+)
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wielkiegie at gmail dot com
  Target Milestone: ---

There are many similar bugs that I have found in the bugzilla (see below), but
this one differs from the other ones:
1. It happens only in gcc 8.3 and trunk (gcc 8.2 and earlier are fine)
2. The lambda is not generic (but it is contained in a function template which
might be similar internally)
3. The capture is implicit

Example to reproduce:


template <typename T>
void foo()
{
constexpr int x = 0;
[&] {
if constexpr (x) {}
};
}


Produces the following in gcc 8.3 and trunk:


<source>: In lambda function:
<source>:6:24: error: lambda capture of 'x' is not a constant expression
 if constexpr (x) {}
^


The following changes to the example make it compile:
1. Adding static to the constexpr int
2. Making it a regular function and not a template
3. Removing the [&] capture

Changing the capture from [&] to [=] doesn't help.

If I change it to an explicit capture [x], it fails in all gcc versions except
gcc 7.4 and passes on clang.

The other similar bug reports: bug 86429, bug 82643

[Bug tree-optimization/89572] [7/8 Regression] ICE in dyn_cast(gimple) / get_loop_exit_condition(loop const)

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89572

--- Comment #4 from Richard Biener  ---
Author: rguenth
Date: Mon Mar  4 12:23:17 2019
New Revision: 269363

URL: https://gcc.gnu.org/viewcvs?rev=269363&root=gcc&view=rev
Log:
2019-03-04  Richard Biener  

PR middle-end/89572
* tree-scalar-evolution.c: (get_loop_exit_condition): Use
safe_dyn_cast.

* gcc.dg/torture/pr89572.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/torture/pr89572.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-scalar-evolution.c

[Bug tree-optimization/89572] [7/8 Regression] ICE in dyn_cast(gimple) / get_loop_exit_condition(loop const)

2019-03-04 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89572

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
  Known to work||9.0
Summary|[7/8/9 Regression] ICE in   |[7/8 Regression] ICE in
   |dyn_cast(gimple*) /  |gimple>(gimple*) /
   |get_loop_exit_condition(loo |get_loop_exit_condition(loo
   |p const*)   |p const*)
  Known to fail|9.0 |

--- Comment #3 from Richard Biener  ---
Fixed on trunk so far.

[Bug ipa/89567] [missed-optimization] Should not be initializing unused struct parameter members

2019-03-04 Thread jamborm at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89567

Martin Jambor  changed:

   What|Removed |Added

 CC||jamborm at gcc dot gnu.org

--- Comment #3 from Martin Jambor  ---
In the first example, the interprocedural constant propagation pass
(IPA-CP) found that foo1 is so small that copying all of it might be
worth not passing the unused argument and so it does, that is why
you'll find function foo1 twice in the assembly.  This functionality
in the pass is there just "on the side" and it is not easy to make it
also work with aggregates, not even desirable (that is the job of a
different pass, see below).

Both examples are compiled better if you make foo1 and foo2 static.
In the latter case, you get exactly what you want, the structure is
split and only the used part survives.  In the first example, you
don't get a clone emitted which you probably don't need.  Both of
these transformations are done by a pass called interprocedural scalar
replacement of aggregates (IPA-SRA), which specifically also aims to
remove unused arguments, but it never creates multiple clones.

If you cannot make these functions static, you need link-time
optimization (LTO, option -flto) because you need information about
one compilation unit to optimize others.  Unfortunately, the current
IPA-SRA cannot make use of it, but I have a replacement for it that can;
hopefully it will be part of GCC 10.

I'm afraid you'd need to provide a strong real-world use-case to make
me investigate how to make IPA-SRA clone so you might not need static
and/or LTO because that would mean devising a cost/benefit
(size/speedup) heuristics and that is not easy.

[Bug tree-optimization/89570] [9 Regression] ICE in prepare_cmp_insn, at optabs.c:4001

2019-03-04 Thread rguenther at suse dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89570

--- Comment #5 from rguenther at suse dot de  ---
On Mon, 4 Mar 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89570
> 
> Jakub Jelinek  changed:
> 
>What|Removed |Added
> 
>  Status|NEW |ASSIGNED
>Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
> 
> --- Comment #4 from Jakub Jelinek  ---
> Created attachment 45880
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45880&action=edit
> gcc9-pr89570.patch
> 
> Untested fix.  No way to test on aarch64 with SVE though (nor experience with
> that in cross-testing).

LGTM

[Bug rtl-optimization/89575] LRA for msp430 - Max. number of generated reload insns - frame pointer subreg simplification

2019-03-04 Thread jozef.l at mittosystems dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89575

--- Comment #2 from Jozef Lawrynowicz  ---
Created attachment 45883
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45883&action=edit
reload dump

[Bug rtl-optimization/89575] LRA for msp430 - Max. number of generated reload insns - frame pointer subreg simplification

2019-03-04 Thread jozef.l at mittosystems dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89575

--- Comment #1 from Jozef Lawrynowicz  ---
Created attachment 45882
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45882&action=edit
ira dump

[Bug rtl-optimization/89575] New: LRA for msp430 - Max. number of generated reload insns - frame pointer subreg simplification

2019-03-04 Thread jozef.l at mittosystems dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89575

Bug ID: 89575
   Summary: LRA for msp430 - Max. number of generated reload insns
- frame pointer subreg simplification
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jozef.l at mittosystems dot com
  Target Milestone: ---

Created attachment 45881
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45881&action=edit
testcase

When enabling LRA for msp430, libgcc fails to build, specifically _muldi3.o.

> gcc -S tester.i -O2

> during RTL pass: reload
> ../../../../libgcc/libgcc2.c: In function '__muldi3':
> ../../../../libgcc/libgcc2.c:558:1: internal compiler error: Max. number of 
> generated reload insns per insn is achieved (90)
> 
>   558 | }
>   | ^
> 0xa2ab20 lra_constraints(bool)
>   ../../gcc/lra-constraints.c:4875
> 0xa12b84 lra(_IO_FILE*)
>   ../../gcc/lra.c:2461
> 0x9c68f1 do_reload
>   ../../gcc/ira.c:5516
> 0x9c68f1 execute
>   ../../gcc/ira.c:5700

The cycling reload occurs because IRA assigns hard register R4 (also
FRAME_POINTER_REGNUM, but not fixed for this use) to a pseudo reg, but when LRA
goes to simplify a subreg of the pseudo, it disallows simplification of this
subreg.

Specifically, simplify_subreg_regno (rtlanal.c):

> /* We shouldn't simplify stack-related registers.  */
> if ((!reload_completed || frame_pointer_needed)
> && xregno == FRAME_POINTER_REGNUM)
>   return -1;

This is in an output reload, so a new set of mov insns are generated to load
the value back into the original, problematic pseudo of R4. Once again
simplify_subreg_regno is called to simplify the pseudo of R4, but it is
disallowed and the cycle continues.

From the IRA dump:

> Disposition:
> 0:r28  l0 82:r30  l0 41:r31  l0 4
> ...
> (insn 2 6 3 2 (set (subreg:HI (reg/v:DI 30 [ arg1 ]) 0)
> (reg:HI 12 R12 [ arg1 ])) "tester.c":16:1 12 {movhi}
>  (expr_list:REG_DEAD (reg:HI 12 R12 [ arg1 ])
> (nil)))

From the reload dump:

> Creating newreg=37 from oldreg=30, assigning class NO_REGS to subreg reg 
> r37
>   2: r37:DI#0=R12:HI
>   ...
>   Inserting subreg reload after:
>  42: r30:DI#0=r37:DI#0
>  ...
> Creating newreg=38 from oldreg=30, assigning class NO_REGS to subreg reg 
> r38
>  42: r38:DI#0=r37:DI#0
>  ...
>   Inserting subreg reload after:
>  52: r30:DI#0=r38:DI#0
And so on.

Is it OK to allow simplification of a subreg of FRAME_POINTER_REGNUM
when lra_in_progress is true? After all, constraints on the allocation of hard
regs shouldn't get more restrictive as compilation progresses?
e.g.

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 3873b4098b0..9700928ff4e 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -3971,7 +3971,7 @@ simplify_subreg_regno (unsigned int xregno, machine_mode xmode,
 return -1;

   /* We shouldn't simplify stack-related registers.  */
-  if ((!reload_completed || frame_pointer_needed)
+  if ((!(reload_completed || lra_in_progress) || frame_pointer_needed)
   && xregno == FRAME_POINTER_REGNUM)
 return -1;

This fixes the cycling reload for insn 2, as the frame pointer is not needed,
but there are further separate issues building the test case.
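
To show exactly which situation the proposed condition changes, the guard can
be modelled in isolation.  This is only a sketch: `struct state` and the
function names are hypothetical stand-ins, and the three flags merely mirror
the GCC variables of the same names used in the patch above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three GCC flags used in the guard. */
struct state {
  bool reload_completed;
  bool lra_in_progress;
  bool frame_pointer_needed;
};

/* Current guard: true means the subreg of FRAME_POINTER_REGNUM is
   NOT simplified (the function returns -1). */
bool refuses_before(struct state s)
{
  return !s.reload_completed || s.frame_pointer_needed;
}

/* Guard with the proposed change applied. */
bool refuses_after(struct state s)
{
  return !(s.reload_completed || s.lra_in_progress)
         || s.frame_pointer_needed;
}

/* Counts the flag combinations on which the two guards disagree. */
int count_differences(void)
{
  int n = 0;
  for (int bits = 0; bits < 8; bits++) {
    struct state s = { bits & 1, (bits >> 1) & 1, (bits >> 2) & 1 };
    if (refuses_before(s) != refuses_after(s))
      n++;
  }
  return n;
}

/* Self-check: the guards differ in exactly one combination. */
void self_check(void)
{
  assert(count_differences() == 1);
}
```

The single differing combination is reload not completed, LRA in progress,
and no frame pointer needed, which is the situation described for insn 2.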

I've attached a reduced test case, and the IRA and reload dumps.

> gcc -v

> Target: msp430-elf
> Configured with: ../configure --target=msp430-elf --disable-nls 
> --enable-languages=c,c++
> Thread model: single
> gcc version 9.0.1 20190301 (experimental) (GCC)
