[Bug tree-optimization/32648] missed-optimization: bit-manipulation via bool's

2018-04-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32648

--- Comment #4 from Andrew Pinski  ---
For ARM64 we get:
f1:
ubfx x1, x0, 5, 1
ubfx x0, x0, 3, 1
eor w0, w1, w0
ret

f2:
eor w0, w0, w0, lsl 2
ubfx x0, x0, 5, 1
ret

These might really be just about the same, depending on whether the lsl is
split away from the eor.
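
For readers without the bug's testcase at hand, here is a rough C sketch of
two functions that could plausibly lead to the assembly above. It is a
reconstruction from the instructions only, not the testcase from the PR: f1
extracts bits 5 and 3 and XORs them, f2 XORs x with x << 2 and then extracts
bit 5. The helper check_equivalent is hypothetical and only confirms that the
two forms agree.

```c
#include <assert.h>
#include <stdbool.h>

/* Two bit extractions followed by an XOR (matches ubfx, ubfx, eor). */
bool f1(unsigned long x)
{
    bool bit5 = (x >> 5) & 1;
    bool bit3 = (x >> 3) & 1;
    return bit5 ^ bit3;
}

/* One shifted XOR followed by a single extraction (matches eor ..., lsl 2; ubfx). */
bool f2(unsigned long x)
{
    return ((x ^ (x << 2)) >> 5) & 1;
}

/* Hypothetical helper: asserts f1 == f2 for every x below limit,
   and returns how many values were checked. */
unsigned long check_equivalent(unsigned long limit)
{
    unsigned long checked = 0;
    for (unsigned long x = 0; x < limit; x++) {
        assert(f1(x) == f2(x));
        checked++;
    }
    return checked;
}
```

Bit 5 of x ^ (x << 2) is bit 5 of x XORed with bit 3 of x, which is why the
two functions agree.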

[Bug tree-optimization/52345] Missed optimization dealing with bools

2018-04-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52345

--- Comment #4 from Andrew Pinski  ---
The generic rule is:
original 3 expressions;

(((int)b) | i) == 0 -> (b == 0) & (i == 0) -> ~b & (i == 0)
(4 expressions; maybe 3 if b is a comparison and single use)
(((int)b) | i) != 0 -> (b != 0) | (i != 0) -> b | (i != 0)
(still 3 expressions)

(((int)b) & i) == 0 -> (b == 0) | ((i&1) == 0) -> ~b | ((i&1) == 0)
(4 expressions; maybe 3 if b is a comparison and single use)

(((int)b) & i) != 0 -> (b != 0) & ((i&1) != 0) -> b & ((i&1) != 0)
(still 3 expressions)
Where b is a "boolean" type variable.
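
A small C sketch that checks the four rewrites above by brute force. The
function names are made up for illustration, and !b is used where the rule
writes ~b, since in C a bool is promoted to int before ~ is applied (the ~b in
the rule is meant as a 1-bit negation).

```c
#include <assert.h>
#include <stdbool.h>

/* Original forms: the bool is widened to int first. */
bool or_eq0_orig(bool b, int i)  { return (((int)b) | i) == 0; }
bool or_ne0_orig(bool b, int i)  { return (((int)b) | i) != 0; }
bool and_eq0_orig(bool b, int i) { return (((int)b) & i) == 0; }
bool and_ne0_orig(bool b, int i) { return (((int)b) & i) != 0; }

/* Rewritten forms: operate on the bool directly; (!b) plays the role of ~b. */
bool or_eq0_new(bool b, int i)  { return (!b) & (i == 0); }
bool or_ne0_new(bool b, int i)  { return b | (i != 0); }
bool and_eq0_new(bool b, int i) { return (!b) | ((i & 1) == 0); }
bool and_ne0_new(bool b, int i) { return b & ((i & 1) != 0); }

/* Returns true if every rewrite agrees with its original for both values
   of b and every i in [lo, hi].  Intended for small ranges only. */
bool rules_hold(int lo, int hi)
{
    assert(lo <= hi);
    for (int i = lo; i <= hi; i++) {
        for (int bi = 0; bi <= 1; bi++) {
            bool b = (bool)bi;
            if (or_eq0_orig(b, i)  != or_eq0_new(b, i))  return false;
            if (or_ne0_orig(b, i)  != or_ne0_new(b, i))  return false;
            if (and_eq0_orig(b, i) != and_eq0_new(b, i)) return false;
            if (and_ne0_orig(b, i) != and_ne0_new(b, i)) return false;
        }
    }
    return true;
}
```

Note that the & cases need the (i&1) mask: b only has bit 0, so the other
bits of i cannot contribute to the result.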

[Bug tree-optimization/71636] Missed optimization in variable alignment test

2018-04-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71636

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |7.0

--- Comment #6 from Andrew Pinski  ---
Fixed as mentioned.

[Bug rtl-optimization/48696] Horrible bitfield code generation on x86

2018-04-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

--- Comment #17 from Andrew Pinski  ---
I think some of this is due to SLOW_BYTE_ACCESS being set to 0.

[Bug fortran/67076] [6/7/8 Regression] [F08] Critical inside a module procedure

2018-04-21 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67076

Thomas Koenig  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Thomas Koenig  ---
Works on gcc-6, gcc-7 and gcc-8.

Closing as fixed.

[Bug target/85485] Remove -mcet

2018-04-21 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85485

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-04-21
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #1)
> I think it is better to keep it as alias, I think -mcet is much more
> familiar to people than -mshstk.

-mcet doesn't give you any shadow stack protection.  -mcet/-mshstk are used to
enable SHSTK intrinsics to IMPLEMENT shadow stack, not to USE shadow stack.
To enable shadow stack protection, you need to use -fcf-protection=return.
-mcet will only lead to user confusion.

[Bug target/85485] Remove -mcet

2018-04-21 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85485

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
I think it is better to keep it as alias, I think -mcet is much more familiar
to people than -mshstk.

[Bug c++/58372] internal compiler error: ix86_compute_frame_layout

2018-04-21 Thread cantabile.desu at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58372

Bitterblue  changed:

   What|Removed |Added

 CC||cantabile.desu at gmail dot com

--- Comment #12 from Bitterblue  ---
Hi.

This bug still exists in GCC 7.3.0. It comes up when cross-compiling Qt 5.10.1
from 64 bit Linux for 32 bit Windows.

Well, I assume it's the same bug because of comments seen online.

https://github.com/mxe/mxe/issues/2011
https://bugreports.qt.io/browse/QTBUG-64707

The offending function can be seen here:
https://github.com/qt/qtbase/blob/6c6ace9d23f90845fd424e474d38fe30f070775e/src/corelib/global/qrandom.cpp#L104

% i686-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=i686-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-w64-mingw32/7.3.0/lto-wrapper
Target: i686-w64-mingw32
Configured with: /build/mingw-w64-gcc/src/gcc/configure --prefix=/usr
--libexecdir=/usr/lib --target=i686-w64-mingw32
--enable-languages=c,lto,c++,objc,obj-c++,fortran,ada --enable-shared
--enable-static --enable-threads=posix --enable-fully-dynamic-string
--enable-libstdcxx-time=yes --with-system-zlib --enable-cloog-backend=isl
--enable-lto --disable-dw2-exceptions --enable-libgomp --disable-multilib
--enable-checking=release
Thread model: posix
gcc version 7.3.0 (GCC) 

No problem with x86_64-w64-mingw32, of course.

[Bug target/85220] [meta-bug, nvptx] Run trunk with og7 openacc testcases and analyze execution failures

2018-04-21 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85220

Tom de Vries  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Tom de Vries  ---
(In reply to Tom de Vries from comment #3)
> Marking resolved-fixed.

[Bug fortran/68933] ICE when mixing "-fprofile-arcs -ftest-coverage" and "-fcoarray=lib" on gcc-6 only

2018-04-21 Thread zbeekman at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68933

--- Comment #6 from Zaak  ---
Thanks, I'll check it out.
On Sat, Apr 21, 2018 at 8:20 AM dominiq at lps dot ens.fr <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68933
>
> --- Comment #5 from Dominique d'Humieres  ---
> It seems to work with 7.3.0.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-21 Thread david at westcontrol dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #37 from David Brown  ---
(In reply to Martin Sebor from comment #35)
> Here are the proposed changes:
> 
> Pointer Provenance:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm#proposed-technical-corrigendum
> 
> Trap Representations:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2220.htm#proposed-technical-corrigendum
> 
> Unspecified Values:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2221.htm#proposed-technical-corrigendum

I am a little unsure of the suggestions for unspecified values here.  Can I
give some examples, to see if my interpretation is correct?

Let's use a new gcc builtin "__builtin_unspecified()" that returns an
unspecified value of int type, with no possible traps (no gcc target has trap
representations for int types, AFAIK).

int x = __builtin_unspecified();
int y = __builtin_unspecified();

if (x == y) doThis();  // The compiler can skip doThis()
if (x != y) doThat();  // The compiler can skip doThat() too

if (x == y) doThis(); else doThat();  
// The compiler can choose to doThis() or doThat(),
// but must do one or the other

if (x == x) doThis();  // The compiler must doThis()
if (x != x) doThat();  // The compiler cannot doThat()

if (x == 3) doThis();  // The compiler can choose to doThis()
// if the compiler does choose to doThis() then it fixes the value of x as 3

if (x & 0x01) doThis(); else doThat();
// The compiler can choose to doThis() or doThat(),
// but that choice fixes the LSB of x

This could allow for a range of possible optimisations, especially if there is
a nice way to make unspecified values like __builtin_unspecified(). 
(Unspecified values of other types could be made by casts.)  For example:

struct opt_int { bool valid; int value; };
struct opt_int safe_sqrt(struct opt_int x) {
    struct opt_int y;
    if (!x.valid || x.value < 0) {
        y.valid = false;
        y.value = __builtin_unspecified();
    } else {
        y.valid = true;
        y.value = unsafe_sqrt(x.value);
    }
    return y;
}

This kind of structure would mean minimal effort when you only need part of a
struct to contain specified values.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-21 Thread david at westcontrol dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #36 from David Brown  ---
(In reply to Martin Sebor from comment #34)

> I think in the use case below:
> 
>struct { int i; char buf[4]; } s, r;
>*(float *)s.buf = 1.;
>r = s;
> 
> the aggregate copy has to be viewed as a recursive copy of each of its
> members and copying buf[4] must be viewed as a memcpy.  Char is definitely
> special (it can access anything with impunity, even indeterminate values).
> That said, I don't think the rules allow char arrays to be treated as
> allocated storage so while the store to s.buf via float* may be valid it
> doesn't change the effective type of s.buf and so the only way to read the
> float value stored in it is to copy it byte-by-byte (i.e., copy the float
> representation) to an object whose effective type is float.  Some of the
> papers that deal with the effective type rules might touch on this (e.g., DR
> 236, Clark's N1520

In bare metal embedded development, it is common to need a way to treat
statically declared storage (like a char[] array) as a pool for dynamic storage.
Often you don't want to use standard library malloc() because of requirements
on deterministic timing, etc.  What you are saying here is that this is not
possible - meaning there is no way to write such a malloc replacement in normal
C code.  (It is possible, I think, to use gcc extensions such as the "may_alias"
type attribute and the "malloc" function attribute.  And -fno-strict-aliasing is
always a safe resort.)  It would be /very/ nice if there were a way to declare
statically allocated pools of memory that could be doled out by user-made
functions and - like malloc'ed memory - take their effective type when used.
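
To make the use case concrete, here is a minimal sketch of the kind of pool
allocator I mean. All names (pool, POOL_SIZE, pool_alloc, pool_store_float,
pool_load_float) are made up for illustration. Whether typed access through
the returned pointer is covered by the effective type rules is exactly the
open question, so the sketch copies values in and out with memcpy, which is
fine under either reading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 256

/* Statically allocated backing store, aligned for any object type. */
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Bump allocator: hands out aligned chunks from the pool and never frees.
   Returns NULL when the request does not fit. */
void *pool_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > POOL_SIZE)
        return NULL;                      /* also avoids overflow when rounding */
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > POOL_SIZE - pool_used)
        return NULL;
    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}

/* Store a float into a slot by copying its representation. */
void pool_store_float(void *slot, float value)
{
    assert(slot != NULL);
    memcpy(slot, &value, sizeof value);
}

/* Load a float from a slot by copying its representation back out. */
float pool_load_float(const void *slot)
{
    float value;
    assert(slot != NULL);
    memcpy(&value, slot, sizeof value);
    return value;
}
```

The timing of pool_alloc is trivially deterministic, and the memory usage is
known at link time, which is the whole point on small targets.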

It would be even better if there were a standard way to say that the initial
value of such memory is "unspecified".  The compiler and linker could give such
memory a static allocation (essential for small embedded systems with limited
memory, so that you can be sure of your memory usage) but there would be no
need for zeroing the memory at startup.

[Bug target/85491] [8 Regression] scimark LU Decomposition test 15% slower than GCC 7, 30% slower than peak

2018-04-21 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85491

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization,
   ||needs-bisection
 Target||x86_64-*-*
 Blocks||79703, 53947
   Target Milestone|--- |8.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79703
[Bug 79703] [meta-bug] SciMark 2.0 performance issues

[Bug target/85491] New: [8 Regression] scimark LU Decomposition test 15% slower than GCC 7, 30% slower than peak

2018-04-21 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85491

Bug ID: 85491
   Summary: [8 Regression] scimark LU Decomposition test 15%
slower than GCC 7, 30% slower than peak
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

https://gcc.opensuse.org/gcc-old/c++bench-czerny/nbench/ shows an LU
decomposition slowdown between r257710 and r257760 after a previous
improvement between r253986 and r254023.

These are numbers on Haswell with -O3 -ffast-math -funroll-loops -march=native.

The improvement is likely r253993/r254012, x86 vectorization cost changes.  The
regression might be r257734.

[Bug fortran/67076] [6/7/8 Regression] [F08] Critical inside a module procedure

2018-04-21 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67076

Dominique d'Humieres  changed:

   What|Removed |Added

 CC|dominiq at lps dot ens.fr  |

--- Comment #6 from Dominique d'Humieres  ---
It seems to have been fixed by revision r231649.

[Bug fortran/68933] ICE when mixing "-fprofile-arcs -ftest-coverage" and "-fcoarray=lib" on gcc-6 only

2018-04-21 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68933

--- Comment #5 from Dominique d'Humieres  ---
It seems to work with 7.3.0.

[Bug bootstrap/85490] New: Missing STAGE4_CFLAGS in bootstrap-cet.mk

2018-04-21 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85490

Bug ID: 85490
   Summary: Missing STAGE4_CFLAGS in bootstrap-cet.mk
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: igor.v.tsimbalist at intel dot com
Blocks: 81652
  Target Milestone: ---
Target: x86_64-*-*, i?86-*-*

Since profiledbootstrap uses

STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use

bootstrap-cet.mk should define STAGE4_CFLAGS  to support profiledbootstrap
with CET.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81652
[Bug 81652] [meta-bug] -fcf-protection=full bugs

[Bug rtl-optimization/85423] [8 Regression] ICE in code_motion_process_successors, at sel-sched.c:6403

2018-04-21 Thread abel at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85423

Andrey Belevantsev  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |abel at gcc dot gnu.org

--- Comment #4 from Andrey Belevantsev  ---
Sigh, the condition I put in the previous patch was too broad and also
allowed some legitimate dependencies between debug insns (so Alex was kind of
right when he expressed his concern).  I'm testing the following after checking
that all of the previous PRs and this one pass:

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index ee970522890..85ff5bd3eb4 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -3308,7 +3308,7 @@ has_dependence_note_dep (insn_t pro, ds_t ds ATTRIBUTE_UNUSED)
  that a bookkeeping copy should be movable as the original insn.
  Detect that here and allow that movement if we allowed it before
  in the first place.  */
-  if (DEBUG_INSN_P (real_con)
+  if (DEBUG_INSN_P (real_con) && !DEBUG_INSN_P (real_pro)
   && INSN_UID (NEXT_INSN (pro)) == INSN_UID (real_con))
 return;

[Bug rtl-optimization/79985] ICE in code_motion_path_driver, at sel-sched.c:6580

2018-04-21 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79985

--- Comment #8 from Alexander Monakov  ---
Unfortunately the above doesn't fully address the issue, as schedulers and
other passes still have no idea that DF makes those assumptions and will allow
reordering of asms:

register int r asm("ebx");

int f(int x, int y)
{
    int t = x/y/r;
    asm("#asm" );
    return t-x;
}

_Z1fii:
#APP
#asm
#NO_APP
movl    %edi, %eax
cltd
idivl   %esi
cltd
idivl   %ebx
subl    %edi, %eax
ret

See how the asm comes first, even though from the DF point of view it should
remain after the read of %ebx for the division by r; here cprop_hardreg makes
the offending propagation.

So currently GCC has a rather split personality when it comes to deps w.r.t.
global reg vars in asm statements. The documentation should spell out the
intended behavior. My suggestion is to require that references are exposed to
the compiler via constraints, which would allow removing the ad-hoc treatment
in DF. I intend to do that early in stage 1.

[Bug target/85381] [og7, nvptx, openacc] parallel-loop-1.c fails with default vector length 128

2018-04-21 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85381

Tom de Vries  changed:

   What|Removed |Added

   Keywords||openacc
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Tom de Vries  ---
I've committed this workaround:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b957d7c71a0a8984a88f5bfccf5a4b6fd080c47d
:
...
[nvptx, openacc] Don't emit barriers for empty loops

2018-04-21  Tom de Vries  

PR target/85381
* config/nvptx/nvptx.c (nvptx_process_pars): Don't emit barriers for
empty loops.

* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr85381.c: New test.
...

Submitted here: https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01023.html .

This fixes the hang that I observed, so I'm closing this PR.

[ If nvidia comes back with a clear workaround description, I'll implement
that, but there's no point in keeping the PR open. ]

Wrong snprintf optimization

2018-04-21 Thread Dávid Bolvanský
Hello,

#include <stdio.h>
int main(void)
{
    char buf[10];
    return snprintf(buf, 0, "string");
}

GCC simplifies it to
main:
mov eax, 6
ret

but 0 is correct I think.
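
For reference, the C standard (7.19.6.5 in C99, 7.21.6.5 in C11) specifies
that snprintf returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character,
and that nothing is written when n is zero. On that reading, folding the call
to 6 is consistent with the standard. A small sketch, where the helper names
formatted_length and original_case are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Length that formatting s with "%s" would need, excluding the
   terminating null.  With n == 0 nothing is written, and the buffer
   pointer is allowed to be null. */
int formatted_length(const char *s)
{
    assert(s != NULL);
    return snprintf(NULL, 0, "%s", s);
}

/* The call from the testcase above, wrapped in a function. */
int original_case(void)
{
    char buf[10];
    return snprintf(buf, 0, "string");
}
```

This return value is what makes the usual "call once with size 0 to measure,
then allocate and call again" idiom work.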


[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

2018-04-21 Thread cpphackster at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466

--- Comment #22 from Daniel Elliott  ---
(In reply to Marc Glisse from comment #21)
> (In reply to Daniel Elliott from comment #20)
> > still clang is 1.64x faster. had a look at the assembly. My limited
> > understanding makes me think that the ucomiss is not fully vectorized and
> > the clang one is (clangs ucomiss %xmm0,%xmm1 vs gcc's ucomiss
> > 0x218b4(%rip),%xmm0). Feel free to correct me if I am wrong.
> 
> Nothing gets vectorized (likely because of the "dontoptimize" code). The
> ucomiss difference is that llvm keeps the constant .5f in a register, while
> gcc reloads it every time. I don't know if the speed difference comes from
> that, or from some subtle tuning arrangement of the operations (I didn't try
> to understand why llvm has 4 mov where gcc has only 2).

Right, I thought that because it used xmm0, a vector register, the operation
was vectorized. I'm going to go and read up on assembly!

[Bug fortran/67076] [6/7/8 Regression] [F08] Critical inside a module procedure

2018-04-21 Thread janus at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67076

--- Comment #5 from janus at gcc dot gnu.org ---
I also see it working with gfortran 7.2 and OpenCoarrays 1.9.1 (as packaged in
Ubuntu 17.10).

[Bug c++/81837] Internal compiler error (cp/typeck2.c:1264)

2018-04-21 Thread paolo.carlini at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81837

--- Comment #10 from Paolo Carlini  ---
In my opinion we should *always* set it when closing bugs: many, many users
complain that it isn't always clear which is the first release where a bug is
fixed. Actually, we should also spend more time on keeping the "Known to work"
and "Known to fail" fields up to date.

[Bug libstdc++/85466] Performance is slow when doing 'branchless' conditional style math operations

2018-04-21 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466

--- Comment #21 from Marc Glisse  ---
(In reply to Daniel Elliott from comment #20)
> still clang is 1.64x faster. had a look at the assembly. My limited
> understanding makes me think that the ucomiss is not fully vectorized and
> the clang one is (clangs ucomiss %xmm0,%xmm1 vs gcc's ucomiss
> 0x218b4(%rip),%xmm0). Feel free to correct me if I am wrong.

Nothing gets vectorized (likely because of the "dontoptimize" code). The
ucomiss difference is that llvm keeps the constant .5f in a register, while gcc
reloads it every time. I don't know if the speed difference comes from that, or
from some subtle tuning arrangement of the operations (I didn't try to
understand why llvm has 4 mov where gcc has only 2).

[Bug fortran/67076] [6/7/8 Regression] [F08] Critical inside a module procedure

2018-04-21 Thread juergen.reuter at desy dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67076

Jürgen Reuter  changed:

   What|Removed |Added

 CC||dominiq at lps dot ens.fr,
   ||janus at gcc dot gnu.org,
   ||juergen.reuter at desy dot de,
   ||tkoenig at gcc dot gnu.org

--- Comment #4 from Jürgen Reuter  ---
This example works for me with the 8.0.1 trunk. As version 5 of the compiler is
no longer supported, I wouldn't see much sense in keeping this open. Of course,
one should check versions 6 and 7. I had to link libcaf_mpi dynamically, btw.