[Bug debug/90586] New: [gdb] gdb wrongly set the breakpoint as expected

2019-05-22 Thread yangyibiao at nju dot edu.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90586

Bug ID: 90586
   Summary: [gdb] gdb wrongly set the breakpoint as expected
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyibiao at nju dot edu.cn
  Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --enable-languages=c,c++
--disable-multilib --prefix=/usr/local/gcc-trunk
Thread model: posix
gcc version 10.0.0 20190517 (experimental) (GCC)

$ gdb -v
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.

$ cat small.c
int c()
{
  int b = 1;
f:
  if (b) {
    short g[1];
    for (; b < 0;) {
      goto f;
      return 0; // line 9
    }
    return 0;
  } else
    ;
  return 0;
}

void main() 
{ 
  c();
}

$ gcc -O0 -g small.c;  gdb -batch -x cmds a.out
Breakpoint 1 at 0x40049b: file small.c, line 9.

Breakpoint 1, c () at small.c:11
11  return 0;
g = {64}
b = 1
Kill the program being debugged? (y or n) [answered Y; input not from terminal]

$ cat cmds
b 9
r
info locals
kill
q


We set a breakpoint at line 9 ("b 9" in cmds). Line 9 is never executed, so the
expected behavior is for the program to exit normally. However, gdb stopped at
line 11, where we did not set a breakpoint. Thus, I was wondering whether this
is a bug in gdb.

[Bug tree-optimization/89479] __restrict on a pointer ignored when a function is passed alongside it

2019-05-22 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89479

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81009

--- Comment #8 from Eric Gallager  ---
(In reply to Eyal Rozenberg from comment #2)
> (In reply to Marc Glisse from comment #1)
> > Seems similar enough.
> 
> With respect  - this is not about x being a const __restrict pointer; what I
> said (including the clang behavior) applies exactly the same when we remove
> the const. See: https://godbolt.org/z/hH643a (where the const is gone).

OK, but even if it's not a dup, I still think it's related enough to go under
"See Also"

[Bug libgomp/90585] libgomp hsa plugin ftbfs in the x32 multilib variant

2019-05-22 Thread doko at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90585

--- Comment #1 from Matthias Klose  ---
looks like libgomp/configure.ac always sets -Werror, not respecting the
--disable-werror configure option.

[Bug c/88144] remove long-obsolete syntax for designated initializers

2019-05-22 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88144

--- Comment #6 from Eric Gallager  ---
(In reply to Jonathan Wakely from comment #3)
> Maybe -Wdeprecated or -Wdeprecated-declarations

I think clang puts this under -Wgnu-designator: 
https://clang.llvm.org/docs/DiagnosticsReference.html#wgnu-designator

Just brainstorming an options entry:
Wgnu-designator
C ObjC C++ ObjC++ Warning Var(warn_gnu_designator) LangEnabledBy(C ObjC C++
ObjC++,Wall || Wextra || Wpedantic || Wdeprecated || Wdeprecated-declarations
|| Wdesignated-init)
Warn on use of obsolete GNU syntax for designated initializers.

[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63

2019-05-22 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

--- Comment #5 from uros at gcc dot gnu.org ---
Author: uros
Date: Thu May 23 04:55:40 2019
New Revision: 271537

URL: https://gcc.gnu.org/viewcvs?rev=271537&root=gcc&view=rev
Log:
Backported from mainline
2019-05-21  Uroš Bizjak  

* config/i386/cpuid.h (__cpuid): For 32bit targets, zero
%ebx and %ecx before calling cpuid with leaf 1 or
non-constant leaf argument.

2019-05-21  Uroš Bizjak  

PR target/90547
* config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

Backported from mainline
2019-05-21  Uroš Bizjak  

PR target/90547
* gcc.target/i386/pr90547.c: New test.


Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr90547.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/i386/cpuid.h
branches/gcc-7-branch/gcc/config/i386/i386.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63

2019-05-22 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

Uroš Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Uroš Bizjak  ---
Fixed everywhere.

[Bug libgomp/90585] New: libgomp hsa plugin ftbfs in the x32 multilib variant

2019-05-22 Thread doko at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90585

Bug ID: 90585
   Summary: libgomp hsa plugin ftbfs in the x32 multilib variant
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: doko at debian dot org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

seen when configuring with

  --enable-offload-targets=nvptx-none,hsa
  --with-multilib-list=m32,m64,mx32

libtool: compile:  /home/packages/gcc/9/gcc-9-9.1.0/build/./gcc/xgcc
-B/home/packages/gcc/9/gcc-9-9.1.0/build/./gcc/ -B/usr/x86_64-linux-gnu/bin/
-B/usr/x86_64-linux-gnu/lib/ -isystem /usr/x86_64-linux-gnu/include
-isystem /usr/x86_64-linux-gnu/sys-include
-isystem /home/packages/gcc/9/gcc-9-9.1.0/build/sys-include -DHAVE_CONFIG_H -I.
-I../../../../src/libgomp -I../../../../src/libgomp/config/linux/x86
-I../../../../src/libgomp/config/linux -I../../../../src/libgomp/config/posix
-I../../../../src/libgomp -I../../../../src/libgomp/../include
-D_GNU_SOURCE -Wall -Werror -ftls-model=initial-exec -pthread
-DUSING_INITIAL_EXEC_TLS -g -O2 -mx32
-MT libgomp_plugin_hsa_la-plugin-hsa.lo -MD -MP
-MF .deps/libgomp_plugin_hsa_la-plugin-hsa.Tpo -c
../../../../src/libgomp/plugin/plugin-hsa.c  -fPIC -DPIC
-o .libs/libgomp_plugin_hsa_la-plugin-hsa.o
../../../../src/libgomp/plugin/plugin-hsa.c: In function
'release_kernel_dispatch':
../../../../src/libgomp/plugin/plugin-hsa.c:1158:22: error: cast to pointer
from integer of different size [-Werror=int-to-pointer-cast]
 1158 |   shadow->debug, (void *) shadow->debug);
  |  ^
../../../../src/libgomp/plugin/plugin-hsa.c:261:19: note: in definition of
macro 'HSA_LOG'
  261 |  fprintf (stderr, __VA_ARGS__); \
  |   ^~~
../../../../src/libgomp/plugin/plugin-hsa.c:1157:3: note: in expansion of macro
'HSA_DEBUG'
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n",
shadow,
  |   ^
../../../../src/libgomp/plugin/plugin-hsa.c:1157:14: error: format '%lu'
expects argument of type 'long unsigned int', but argument 4 has type
'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n",
shadow,
  |  ^~~~
 1158 |   shadow->debug, (void *) shadow->debug);
  |   ~
  | |
  | uint64_t {aka long long unsigned int}
../../../../src/libgomp/plugin/plugin-hsa.c:261:19: note: in definition of
macro 'HSA_LOG'
  261 |  fprintf (stderr, __VA_ARGS__); \
  |   ^~~
../../../../src/libgomp/plugin/plugin-hsa.c:1157:3: note: in expansion of macro
'HSA_DEBUG'
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n",
shadow,
  |   ^
../../../../src/libgomp/plugin/plugin-hsa.c:1157:57: note: format string is
defined here
 1157 |   HSA_DEBUG ("Released kernel dispatch: %p has value: %lu (%p)\n",
shadow,
  |   ~~^
  | |
  | long unsigned
int
  |   %llu
../../../../src/libgomp/plugin/plugin-hsa.c: In function
'print_kernel_dispatch':
../../../../src/libgomp/plugin/plugin-hsa.c:1279:31: error: format '%lu'
expects argument of type 'long unsigned int', but argument 3 has type
'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1279 |   fprintf (stderr, "object: %lu\n", dispatch->object);
  | ~~^ 
  |   | |
  |   | uint64_t {aka long long
unsigned int}
  |   long unsigned int
  | %llu
../../../../src/libgomp/plugin/plugin-hsa.c:1281:31: error: format '%lu'
expects argument of type 'long unsigned int', but argument 3 has type
'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1281 |   fprintf (stderr, "signal: %lu\n", dispatch->signal);
  | ~~^ 
  |   | |
  |   | uint64_t {aka long long
unsigned int}
  |   long unsigned int
  | %llu
../../../../src/libgomp/plugin/plugin-hsa.c:1289:44: error: format '%lu'
expects argument of type 'long unsigned int', but argument 3 has type
'uint64_t' {aka 'long long unsigned int'} [-Werror=format=]
 1289 |   fprintf (stderr, "children dispatches: %lu\n",
  |  ~~^
  |
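
The errors above come from printing a uint64_t with %lu and from casting it
straight to a pointer, both of which assume that long and pointers are 64 bits
wide. A minimal sketch in C of a width-independent spelling follows. The helper
names format_object and u64_to_pointer are hypothetical, they are not part of
libgomp, and this is not necessarily the fix that was applied upstream.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not libgomp code): format a 64-bit handle.
   PRIu64 expands to the length modifier that matches uint64_t on the
   target, e.g. "lu" on LP64 x86_64 and "llu" on x32.  */
static int
format_object (char *buf, size_t len, uint64_t object)
{
  assert (buf != NULL);
  return snprintf (buf, len, "object: %" PRIu64 "\n", object);
}

/* Hypothetical helper: convert a 64-bit integer to a pointer for %p.
   Going through uintptr_t makes the narrowing explicit, which avoids
   -Wint-to-pointer-cast; where pointers are 32 bits wide (x32), the
   upper half of the value is dropped.  */
static void *
u64_to_pointer (uint64_t value)
{
  return (void *) (uintptr_t) value;
}
```

A call such as format_object (buf, sizeof buf, dispatch_object) would then
compile without format warnings for -m32, -m64 and -mx32 alike;
dispatch_object here stands in for any uint64_t value.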

[Bug c++/78388] Bogus "declaration shadows template parameter" error with parenthesized function-style casts

2019-05-22 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78388

Eric Gallager  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||nathan at gcc dot gnu.org

--- Comment #2 from Eric Gallager  ---
cc-ing C++ FE maintainers

[Bug middle-end/88784] Middle end is missing some optimizations about unsigned

2019-05-22 Thread ffengqi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784

--- Comment #9 from Qi Feng  ---
And there's another problem. Take `x >  y  &&  x != 0   -->   x > y' for
example, I would also like to do

   x <  y  &&  y != 0  -->  x < y
   x != 0  &&  x >  y  -->  x > y
   y != 0  &&  x <  y  -->  x < y

If the assumption that the constant always comes in as the second operand is
incorrect, these would have to be doubled.

I tried to add :c to truth_andif, but got the `operation is not commutative'
error.  I also tried to make truth_andif commutative by modifying genmatch.c,
but again, I don't know it well and am afraid that I would break something.

The patterns I wrote look like:

/* x >  y  &&  x != 0 --> x > y
   Only for unsigned x and y.  */
(simplify
 (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop))
 (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0))
  && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1)))
   @2))

I have to write 4 of these, with minor modifications, for a single
transformation.  If there's a better way to do it, please do leave a comment.
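
As a sanity check on the rule itself (not on the match.pd syntax), the
equivalence can be verified exhaustively for 8-bit operands. The sketch below is
plain C, with hypothetical function names, and it also shows why the pattern has
to be restricted to unsigned types: for signed operands the two forms differ
whenever x is 0 and y is negative.

```c
#include <assert.h>
#include <stdbool.h>

/* Left-hand side of the proposed rule, for unsigned operands.  */
static bool
lhs_unsigned (unsigned char x, unsigned char y)
{
  return x > y && x != 0;
}

/* Count the unsigned 8-bit pairs for which "x > y && x != 0" and
   "x > y" give different answers.  */
static unsigned
count_unsigned_mismatches (void)
{
  unsigned mismatches = 0;
  for (unsigned x = 0; x <= 255; x++)
    for (unsigned y = 0; y <= 255; y++)
      if (lhs_unsigned ((unsigned char) x, (unsigned char) y) != (x > y))
        mismatches++;
  return mismatches;
}

/* The same count for signed 8-bit values, where the rule does not hold.  */
static unsigned
count_signed_mismatches (void)
{
  unsigned mismatches = 0;
  for (int x = -128; x <= 127; x++)
    for (int y = -128; y <= 127; y++)
      if ((x > y && x != 0) != (x > y))
        mismatches++;
  return mismatches;
}

/* Aborts if the unsigned rule is ever violated.  */
static void
check_unsigned_rule (void)
{
  assert (count_unsigned_mismatches () == 0);
}
```

For unsigned operands x > y already implies x != 0, because nothing is smaller
than 0, so the second test is redundant in every operand order listed above.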

[Bug libstdc++/90415] std::is_copy_constructible> is incomplete

2019-05-22 Thread rafael at espindo dot la
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415

--- Comment #3 from Rafael Avila de Espindola  ---
I see now that the corresponding commit on trunk was
31011b9a94fed33170c009292e82558336d1c4d7 (r261146).

At that revision, the test in this bug passes. There was a more recent
regression on trunk on revision a9b768f8f4fd471e315623b23c4f9e83463bf92e
(r270433).

[Bug debug/90584] New: [gdb] gdb is not stopped at a breakpoint in an executed line of code

2019-05-22 Thread yangyibiao at nju dot edu.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90584

Bug ID: 90584
   Summary: [gdb] gdb is not stopped at a breakpoint in an
executed line of code
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyibiao at nju dot edu.cn
  Target Milestone: ---

$ gcc --version
gcc (GCC) 10.0.0 20190517 (experimental)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb --version
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.


$ cat small.c
#include <stdio.h>
int main()
{
  int i = 0;
  int j = 0;
  for (; i<=1; i++) {
    for (; j<=1; j++) {
      goto lbl;
    }
  }
lbl:  // line 11
  printf("hello\n");
  return 0;
}

$ gcc -O0 -g small.c; ./a.out
hello

$ gdb -batch -x cmds a.out
Breakpoint 1 at 0x40051a: file small.c, line 11.
hello
[Inferior 1 (process 2774) exited normally]
cmds:3: Error in sourced command file:
No frame selected.

$ cat cmds
b 11
r
info locals
kill
q

According to the program output, line 11 is executed. Thus, when we set a
breakpoint at line 11, gdb should stop there and print something. However, the
program ran to completion and exited directly.

[Bug c++/90462] Internal compiler error with deprecated-copy and json diagnostics

2019-05-22 Thread dmalcolm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90462

--- Comment #4 from David Malcolm  ---
r271535 should fix the ICE on trunk, but it doesn't fix the missing "finish"
location for the warning described in comment #2.

[Bug c++/90583] New: Implement DR 1722, lambda to function pointer conversion should be noexcept

2019-05-22 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90583

Bug ID: 90583
   Summary: Implement DR 1722, lambda to function pointer
conversion should be noexcept
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

Cf. http://wg21.link/cwg1722

void
foo ()
{
  auto l = [](int){ return 42; };
  static_assert(noexcept((int (*)(int))(l)), "");
}

[Bug c++/90462] Internal compiler error with deprecated-copy and json diagnostics

2019-05-22 Thread dmalcolm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90462

--- Comment #3 from David Malcolm  ---
Author: dmalcolm
Date: Thu May 23 00:42:03 2019
New Revision: 271535

URL: https://gcc.gnu.org/viewcvs?rev=271535&root=gcc&view=rev
Log:
Bulletproof -fdiagnostics-format=json against bad locations (PR c++/90462)

PR c++/90462 reports an ICE with -fdiagnostics-format=json when
attempting to serialize a malformed location to JSON.

The compound location_t in question has meaningful "caret" and "start"
locations, but has UNKNOWN_LOCATION for its "finish" location,
leading to a NULL pointer dereference when attempting to build a JSON
string for the filename.

This patch bulletproofs the JSON output so that attempts to write
a JSON object for a location with a NULL file will lead to an object
with no "file" key, and attempts to write a compound location with
UNKNOWN_LOCATION for its start or finish will lead to the corresponding
JSON child object being omitted.

This patch also adds a json::object::get member function, for self-testing
the above.

gcc/ChangeLog:
PR c++/90462
* diagnostic-format-json.cc: Include "selftest.h".
(json_from_expanded_location): Only add "file" key for non-NULL
file strings.
(json_from_location_range): Don't add "start" and "finish"
children if they are UNKNOWN_LOCATION.
(selftest::test_unknown_location): New selftest.
(selftest::test_bad_endpoints): New selftest.
(selftest::diagnostic_format_json_cc_tests): New function.
* json.cc (json::object::get): New function.
(selftest::test_object_get): New selftest.
(selftest::json_cc_tests): Call it.
* json.h (json::object::get): New decl.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::diagnostic_format_json_cc_tests.
* selftest.h (selftest::diagnostic_format_json_cc_tests): New
decl.

gcc/testsuite/ChangeLog:
PR c++/90462
* g++.dg/pr90462.C: New test.


Added:
trunk/gcc/testsuite/g++.dg/pr90462.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/diagnostic-format-json.cc
trunk/gcc/json.cc
trunk/gcc/json.h
trunk/gcc/selftest-run-tests.c
trunk/gcc/selftest.h
trunk/gcc/testsuite/ChangeLog
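
The behaviour described in the r271535 log (emit a location object without a
"file" key when the file is NULL) can be illustrated with a small stand-alone
sketch. This is plain C with a hypothetical struct and function; it does not use
GCC's json.h classes, and it skips string escaping for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an expanded source location.  */
struct sketch_location
{
  const char *file;   /* may be NULL for an unknown location */
  int line;
  int column;
};

/* Write LOC as a JSON object into BUF.  The "file" key is only emitted
   for a non-NULL file, so a bad location never dereferences NULL.
   Simplification: the file name is assumed to need no JSON escaping.
   Returns what snprintf returns.  */
static int
location_to_json (char *buf, size_t len, const struct sketch_location *loc)
{
  assert (buf != NULL && loc != NULL);
  if (loc->file != NULL)
    return snprintf (buf, len,
                     "{\"file\": \"%s\", \"line\": %d, \"column\": %d}",
                     loc->file, loc->line, loc->column);
  return snprintf (buf, len, "{\"line\": %d, \"column\": %d}",
                   loc->line, loc->column);
}
```

A consumer of the JSON then has to treat "file" as optional, which is the
trade-off the log describes: a partial object instead of a crash.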

[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check

2019-05-22 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536

--- Comment #14 from Steve Kargl  ---
On Wed, May 22, 2019 at 11:21:52PM +, j.ravens.nz at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536
> 
> --- Comment #13 from Jonathan Ravens  ---
> Thanks everyone for your input on this issue.  I hadn't realised that it
> could cause such dissent.

There is no dissent.

> As a software developer, my major driver is to manage the users'
> expectations. 
> In that respect, declaring a byte and being able to set it to a valid value
> should not raise a warning, especially when an option called no-range-check is
> in use which, intuitively, would suppress range-checking errors instead of
> causing them.  I suggest it might only be technically correct from a
> developer's perspective, but not from the user's.

You appear to be conflating 2 issues.  Range checking has nothing
to do with type conversion.  In your original code, you have a BOZ
of '89'X (which to be standard conforming should be written as Z'89').
This BOZ is either an INTEGER(8) or INTEGER(16) (depends on the target)
because gfortran follows how Fortran 95 handles a BOZ in a DATA
statement (the only place a BOZ can appear in valid Fortran 95 code).
It has a value of 137.  So, you now have 2 problems when you are
trying to assign it to a BYTE (aka INTEGER(1)) entity:
  1) It is out-of-range.
  2) It has a type of INTEGER(8) or INTEGER(16).

-fno-range-check takes care of 1).
-Wno-conversion takes care of 2).

Now, when you have '09'X (or correctly Z'09'), this BOZ has a
value of 9, but it is still an INTEGER(8) or INTEGER(16) entity.
When gfortran performs the range checking for assigning 9 to
a BYTE (aka INTEGER(1)) entity, it inhibits the conversion warning
because 9 is in range of a BYTE (aka INTEGER(1)).  A warning isn't
needed because gfortran knows there is no problem.

When you specify -Wall -fno-range-check, the only thing that
gfortran knows is that you're assigning an INTEGER(8) or 
INTEGER(16) entity to a BYTE (aka INTEGER(1)). So, gfortran 
brings the potential problem to your attention.  You specifically
requested this behavior via the options!
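
The arithmetic behind the two cases can be checked with a short sketch. It is
written in C rather than Fortran, with int8_t standing in for INTEGER(1) / BYTE,
and the function names are hypothetical. The wraparound helper assumes
two's-complement behaviour; it is an illustration, not a statement of what
gfortran does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if VALUE is representable in a signed 8-bit integer, the C
   counterpart of INTEGER(1) / BYTE.  */
static bool
fits_in_int8 (int64_t value)
{
  return value >= INT8_MIN && value <= INT8_MAX;
}

/* Value a signed 8-bit integer holds if VALUE is stored with
   two's-complement wraparound (assumption, see the text above).  */
static int
wrap_to_int8 (int64_t value)
{
  int low = (int) (value & 0xFF);   /* keep the low 8 bits: 0..255 */
  return low > INT8_MAX ? low - 256 : low;
}
```

With these, fits_in_int8 (0x89) is false because 0x89 is 137, above the
maximum of 127, while fits_in_int8 (0x09) is true; wrap_to_int8 (0x89)
gives -119.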

> If commonly-used constructs such as BYTE are to be removed from gfortran, I'd
> expect that to require a lot of re-coding for people in general, given the
> amount of legacy Fortran code in use.  In our case, I think the best option
> would be to phase out usage of gfortran.

No decisions have been made.  I'll raise an RFC about deprecation
of a number of mistakes in gfortran (when time permits as I am not
paid to contribute to gfortran).

The plan would be to issue a deprecation notice in the 10.x
releases of gfortran with removal of the mistakes in 11.1.
A deprecation notice cannot be suppressed by an option, so
users will see the notice every time they compile their
code.  So, removal won't happen for 2 or more years.  If removal
of a mistake such as BYTE causes you to stop using gfortran,
oh well.

[Bug ipa/88231] aligned functions laid down inefficiently

2019-05-22 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231

--- Comment #8 from Martin Sebor  ---
(In reply to Martin Liška from comment #7)
> Can we do such an optimization without GAS information about size of every
> function?

My thought was that we could use alignment alone if we didn't know the sizes of
instructions on targets like i386 with variable instruction lengths, as a
guesstimate, to do better than chance.  On RISC targets with fixed instruction
length like SPARC it should be possible to get the size just by counting
instructions.  I don't know this part of GCC so I have no idea what's
available.

[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check

2019-05-22 Thread j.ravens.nz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536

--- Comment #13 from Jonathan Ravens  ---
Thanks everyone for your input on this issue.  I hadn't realised that it could
cause such dissent.

As a software developer, my major driver is to manage the users' expectations. 
In that respect, declaring a byte and being able to set it to a valid value
should not raise a warning, especially when an option called no-range-check is
in use which, intuitively, would suppress range-checking errors instead of
causing them.  I suggest it might only be technically correct from a
developer's perspective, but not from the user's.

If commonly-used constructs such as BYTE are to be removed from gfortran, I'd
expect that to require a lot of re-coding for people in general, given the
amount of legacy Fortran code in use.  In our case, I think the best option
would be to phase out usage of gfortran.

[Bug libstdc++/83237] Values returned by std::poisson_distribution are not distributed correctly

2019-05-22 Thread hp at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83237

Hans-Peter Nilsson  changed:

   What|Removed |Added

 CC||hp at gcc dot gnu.org

--- Comment #6 from Hans-Peter Nilsson  ---
A note for the record:

(In reply to pa...@gcc.gnu.org from comment #5)
> Author: paolo
> Date: Sun Dec 24 22:08:52 2017
> New Revision: 255993
> 
> URL: https://gcc.gnu.org/viewcvs?rev=255993&root=gcc&view=rev
> Log:
> 2017-12-24  Michele Pezzutti 
> 
>   PR libstdc++/83237
>   * include/bits/random.tcc (poisson_distribution<>::operator()):
>   Fix __x = 1 case - see updated Errata of Devroye's treatise.
>   * testsuite/26_numerics/random/poisson_distribution/operators/
>   values.cc: Add test.

Please don't "add test" to an existing file like that, instead put it in a new
file.

(This method of adding a test can cause side-effects such as a timeout. 
Example: cris-elf which runs in a simulator, now needs >10 minutes on a
"i7-4770K CPU @ 3.50GHz".  I intend to split up the test, as has been done in
the past.)

Also a question: is there a reasonable (much) lower number combination than the 
"testDiscreteDist<100, 200>" in the test?  Perhaps that part of the test
can reasonably be disabled for simulator targets?

(Also, it seems this PR should be closed as the original issue has been fixed.)

[Bug target/90582] AArch64 stack-protector wastes an instruction on address-generation

2019-05-22 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582

--- Comment #1 from Andrew Pinski  ---
> I assume EOR / CBNZ is at least as efficient as SUBS / BNE on
> all/most AArch64 microarchitectures, but someone should check.

It is similar to x86 in that respect on some cores (Marvell's cores mostly).
That is, ThunderX, ThunderX2, OcteonTX and OcteonTX2 all have the ability to
do macro-combining of the two instructions into one micro-op.

[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63

2019-05-22 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

--- Comment #4 from uros at gcc dot gnu.org ---
Author: uros
Date: Wed May 22 22:50:39 2019
New Revision: 271529

URL: https://gcc.gnu.org/viewcvs?rev=271529&root=gcc&view=rev
Log:
Backported from mainline
2019-05-21  Uroš Bizjak  

* config/i386/cpuid.h (__cpuid): For 32bit targets, zero
%ebx and %ecx before calling cpuid with leaf 1 or
non-constant leaf argument.

2019-05-21  Uroš Bizjak  

PR target/90547
* config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

Backported from mainline
2019-05-21  Uroš Bizjak  

PR target/90547
* gcc.target/i386/pr90547.c: New test.


Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr90547.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/i386/cpuid.h
branches/gcc-8-branch/gcc/config/i386/i386.md
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569

--- Comment #6 from Jonathan Wakely  ---
This bug also affects 32-bit GNU/Linux with older versions of glibc.

[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jonathan Wakely  ---
Fixed for GCC 9.2 and trunk. Thanks for the report.

[Bug target/90582] New: AArch64 stack-protector wastes an instruction on address-generation

2019-05-22 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582

Bug ID: 90582
   Summary: AArch64 stack-protector wastes an instruction on
address-generation
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peter at cordes dot ca
  Target Milestone: ---

void protect_me() {
    volatile int buf[2];
    buf[1] = 3;
}

https://godbolt.org/z/xdlr5w AArch64 gcc8.2 -O3 -fstack-protector-strong

protect_me:
        stp     x29, x30, [sp, -32]!
        adrp    x0, __stack_chk_guard
        add     x0, x0, :lo12:__stack_chk_guard   ### this instruction
        mov     x29, sp           # frame pointer even though
                                  # -fomit-frame-pointer is part of -O3.
                                  # Goes away with explicit -fomit-frame-pointer

        ldr     x1, [x0]          # copy the cookie
        str     x1, [sp, 24]
        mov     x1,0              # and destroy the reg

        mov     w1, 3             # right before it's already destroyed
        str     w1, [sp, 20]      # buf[1] = 3

        ldr     x1, [sp, 24]      # canary
        ldr     x0, [x0]          # key destroys the key pointer
        eor     x0, x1, x0
        cbnz    x0, .L5
        ldp     x29, x30, [sp], 32  # FP and LR save/restore (for some reason?)
        ret
.L5:
          # can the store of the link register go here, for backtracing?
        bl      __stack_chk_fail

A function that returns a global can embed the low 12 bits of the address into
the load instruction.  AArch64 instructions are fixed-width, so there's no
reason (AFAIK) not to do this.

f:
        adrp    x0, foo
        ldr     w0, [x0, #:lo12:foo]
        ret

I'm not an AArch64 performance expert; it's plausible that zero displacements
are worth spending an extra instruction on for addresses that are used twice,
but unlikely.

So we should be doing 

        adrp    x0, __stack_chk_guard
        ldr     x1, [x0, #:lo12:__stack_chk_guard]  # in prologue to copy cookie
        ...
        ldr     x0, [x0, #:lo12:__stack_chk_guard]  # in epilogue to check cookie

This also avoids leaving an exact pointer right to __stack_chk_guard in a
register, in case a vulnerable callee or code in the function body can be
tricked into dereferencing it and leaking the cookie.  (In non-leaf functions,
we generate the pointer in a call-preserved register like x19, so yes it will
be floating around in a register for callees).

I'd hate to suggest destroying the pointer when copying to the stack, because
that would require another adrp later.

Finding a gadget that has exactly the right offset (the low 12 bits of
__stack_chk_guard's address) is a lot less likely than finding an  ldr from
[x0].  Of course this will introduce a lot of LDR instructions with an
#:lo12:__stack_chk_guard offset, but hopefully they won't be part of useful
gadgets because they lead to writing the stack, or to EOR/CBNZ to
__stack_chk_fail



I don't see a way to optimize canary^key == 0 any further, unlike x86-64 PR
90568.  I assume EOR / CBNZ is at least as efficient as SUBS / BNE on
all/most AArch64 microarchitectures, but someone should check.
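
For reference, the two instruction pairs compute the same predicate. A minimal
C sketch with hypothetical function names: the XOR form corresponds to
EOR / CBNZ and the subtraction form to SUBS / BNE. Which one is cheaper on a
given core is a separate question that this does not answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EOR / CBNZ style: the canary is intact iff the XOR is zero.  */
static bool
canary_ok_xor (uint64_t canary, uint64_t key)
{
  return (canary ^ key) == 0;
}

/* SUBS / BNE style: unsigned subtraction wraps modulo 2^64, so the
   difference is zero iff the two values are equal.  */
static bool
canary_ok_sub (uint64_t canary, uint64_t key)
{
  return canary - key == 0;
}
```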



-O3 includes -fomit-frame-pointer according to -fverbose-asm, but functions
protected with -fstack-protector-strong still get a frame pointer in x29
(costing a MOV x29, sp instruction, and save/restore with STP/LDP along with
x30.)

However, explicitly using -fomit-frame-pointer stops that from happening.  Is
that a separate bug, or am I missing something?



Without stack-protector, the function is vastly simpler

protect_me:
        sub     sp, sp, #16
        mov     w0, 3
        str     w0, [sp, 12]
        add     sp, sp, 16
        ret

Does stack-protector really need to spill/reload x29/x30 (FP and LR)?  Bouncing
the return address through memory seems inefficient, even though branch
prediction does hide that latency.

Is that just so __stack_chk_fail can backtrace?  Can we move the store of the
link register into the __stack_chk_fail branch, off the fast path?

Or if we do unconditionally store x30 (the link register), at least don't
bother reloading it in a leaf function if register allocation didn't need to
clobber it.  Unlike x86-64, the return address can't be attacked with buffer
overflows if it stays safe in a register the whole function.

Obviously my test-case with a volatile array and no inputs at all is making
-fstack-protector-strong look dumb by protecting a perfectly safe function. 
IDK how common it is to have leaf functions with arrays or structs that just
use them for some computation on function args or globals and then return,
maybe after copying the array back to somewhere else.  A sort function might
use a tmp 

[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557

--- Comment #3 from Jonathan Wakely  ---
Author: redi
Date: Wed May 22 22:36:21 2019
New Revision: 271528

URL: https://gcc.gnu.org/viewcvs?rev=271528&root=gcc&view=rev
Log:
PR libstdc++/90557 fix path assignment that alters source

Backport from mainline
2019-05-22  Jonathan Wakely  

PR libstdc++/90557
* src/c++17/fs_path.cc (path::_List::operator=(const _List&)): Fix
reversed arguments to uninitialized_copy_n.
* testsuite/27_io/filesystem/path/assign/copy.cc: Check that source
is unchanged by copy assignment.
* testsuite/util/testsuite_fs.h (compare_paths): Use std::equal to
compare path components.

Modified:
branches/gcc-9-branch/libstdc++-v3/ChangeLog
branches/gcc-9-branch/libstdc++-v3/src/c++17/fs_path.cc
branches/gcc-9-branch/libstdc++-v3/testsuite/27_io/filesystem/path/assign/copy.cc
branches/gcc-9-branch/libstdc++-v3/testsuite/util/testsuite_fs.h

[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557

--- Comment #2 from Jonathan Wakely  ---
Author: redi
Date: Wed May 22 22:14:34 2019
New Revision: 271527

URL: https://gcc.gnu.org/viewcvs?rev=271527&root=gcc&view=rev
Log:
PR libstdc++/90557 fix path assignment that alters source

PR libstdc++/90557
* src/c++17/fs_path.cc (path::_List::operator=(const _List&)): Fix
reversed arguments to uninitialized_copy_n.
* testsuite/27_io/filesystem/path/assign/copy.cc: Check that source
is unchanged by copy assignment.
* testsuite/util/testsuite_fs.h (compare_paths): Use std::equal to
compare path components.

Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/src/c++17/fs_path.cc
trunk/libstdc++-v3/testsuite/27_io/filesystem/path/assign/copy.cc
trunk/libstdc++-v3/testsuite/util/testsuite_fs.h

[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64

2019-05-22 Thread dave.anglin at bell dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #16 from dave.anglin at bell dot net ---
On 2019-05-22 5:23 p.m., bugzilla-gcc at thewrittenword dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577
>
> --- Comment #15 from The Written Word  ---
> (In reply to dave.anglin from comment #12)
>> It might help to compile stage1 with -O2 or -Os.
> How does one do this? After ./configure, "gmake CFLAGS=-Os"? BOOT_CFLAGS
> applies to stage2/3.
STAGE1_CFLAGS and STAGE1_CXXFLAGS used to work:
make  STAGE1_CFLAGS="-O2 -g"  STAGE1_CXXFLAGS="-O2 -g" -j2 bootstrap

[Bug preprocessor/90581] provide an option to adjust the maximum depth of nested #include

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90581

Jonathan Wakely  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug middle-end/20408] Unnecessary code generated for empty structs

2019-05-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20408

--- Comment #22 from Jason Merrill  ---
Author: jason
Date: Wed May 22 21:39:08 2019
New Revision: 271523

URL: https://gcc.gnu.org/viewcvs?rev=271523&root=gcc&view=rev
Log:
PR c++/20408 - unnecessary code for empty struct.

Here initializing the argument from a TARGET_EXPR isn't an empty class
copy even though the type is !TREE_ADDRESSABLE, so we should check
simple_empty_class_p.

* call.c (build_call_a): Use simple_empty_class_p.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/empty-3.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/call.c
trunk/gcc/cp/cp-gimplify.c
trunk/gcc/cp/cp-tree.h

[Bug middle-end/90549] missing -Wreturn-local-addr maybe returning an address of a local array plus offset

2019-05-22 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90549

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #4 from Martin Sebor  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01525.html

[Bug c/71924] missing -Wreturn-local-addr returning alloca result

2019-05-22 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71924

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #5 from Martin Sebor  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01525.html

[Bug preprocessor/90581] New: provide an option to adjust the maximum depth of nested #include

2019-05-22 Thread qinzhao at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90581

Bug ID: 90581
   Summary: provide an option to adjust the maximum depth of
nested #include
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: qinzhao at gcc dot gnu.org
  Target Milestone: ---

For some large, complicated applications, the depth of nested #include
directives can be very large, exceeding the current hard-coded limit of 200:

directives.c:

  if (pfile->line_table->depth >= CPP_STACK_MAX)
    cpp_error (pfile, CPP_DL_ERROR, "#include nested too deeply");

internal.h:

#define CPP_STACK_MAX 200

This PR requests a first-class option that lets users adjust this limit at
compile time, so that such large applications can be compiled successfully.
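
A minimal sketch of what a configurable limit could look like, assuming the limit becomes a value fixed at start-up instead of a macro. The IncludeStack class and its member names are hypothetical and are not cpplib code; 200 is kept as the default only because it is the current CPP_STACK_MAX.

```cpp
#include <cstddef>

// Hypothetical stand-in for the preprocessor's include stack.
class IncludeStack {
public:
  // max_depth would come from a command-line option in a real design.
  explicit IncludeStack(std::size_t max_depth = 200)
    : max_depth_(max_depth), depth_(0) {}

  // Called when an #include is entered. Returns false, leaving the depth
  // unchanged, when the nesting limit has been reached.
  bool enter()
  {
    if (depth_ >= max_depth_)
      return false;   // caller would report "#include nested too deeply"
    ++depth_;
    return true;
  }

  // Called when the included file has been fully processed.
  void leave()
  {
    if (depth_ > 0)
      --depth_;
  }

  std::size_t depth() const { return depth_; }

private:
  std::size_t max_depth_;
  std::size_t depth_;
};
```

The only design point here is that the comparison is made against a member set once at construction, so the default behaviour stays the same when no option is given.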

[Bug target/61577] [4.9.0] can't compile on hp-ux v3 ia64

2019-05-22 Thread bugzilla-gcc at thewrittenword dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61577

--- Comment #15 from The Written Word ---
(In reply to dave.anglin from comment #12)
> It might help to compile stage1 with -O2 or -Os.

How does one do this? After ./configure, "gmake CFLAGS=-Os"? BOOT_CFLAGS
applies to stage2/3.

[Bug libstdc++/90557] [9/10 Regression] Incorrect std::filesystem::path::operator=(std::filesystem::path const&) in gcc 9.1.0

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90557

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |8.3.0
   Target Milestone|---                         |9.2
            Summary|Incorrect                   |[9/10 Regression] Incorrect
                   |std::filesystem::path::oper |std::filesystem::path::oper
                   |ator=(std::filesystem::path |ator=(std::filesystem::path
                   |const&) in gcc 9.1.0        |const&) in gcc 9.1.0
      Known to fail|                            |10.0, 9.1.0

[Bug c/90580] New: error: ‘offsetof’ undeclared when it is declared, but used with the wrong number of arguments

2019-05-22 Thread slandden at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90580

Bug ID: 90580
   Summary: error: ‘offsetof’ undeclared when it is declared, but
used with the wrong number of arguments
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slandden at gmail dot com
  Target Milestone: ---

test2.c: In function ‘main’:
test2.c:113:42: error: macro "offsetof" requires 2 arguments, but only 1 given
  113 | printf("offsetof %u", offsetof(key.rounds));
  |  ^
In file included from test2.c:64:
/usr/lib/gcc/powerpc64le-linux-gnu/9/include/stddef.h:406: note: macro
"offsetof" defined here
  406 | #define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER)
  | 
test2.c:113:23: error: ‘offsetof’ undeclared (first use in this function)
  113 | printf("offsetof %u", offsetof(key.rounds));
  |   ^~~~
test2.c:65:1: note: ‘offsetof’ is defined in header ‘<stddef.h>’; did you
forget to ‘#include <stddef.h>’?
   64 | #include <stddef.h>
  +++ |+#include <stddef.h>
   65 | /*
test2.c:113:23: note: each undeclared identifier is reported only once for each
function it appears in
  113 | printf("offsetof %u", offsetof(key.rounds));
  |   ^~~~
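
For reference, the first diagnostic is the accurate one: offsetof takes a type and a member name, not an object expression. A small sketch of the two-argument form, assuming a hypothetical Key struct (the real type in test2.c is not shown in the report):

```cpp
#include <cstddef>

// Hypothetical stand-in for the reporter's key type; the actual layout in
// test2.c is not part of the report.
struct Key {
  unsigned char bytes[16];
  int rounds;
};

// offsetof(TYPE, MEMBER): the first argument is the type, the second the
// member name. Writing offsetof(key.rounds) passes only one argument.
std::size_t rounds_offset()
{
  return offsetof(Key, rounds);
}
```

The result is a size_t, so a printf call would want %zu rather than %u.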

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #39 from Jonathan Wakely  ---
(In reply to dave.anglin from comment #37)
> I believe I changed the glibc value because of the pthread mutex issue.

Aha.

> MALLOC_ABI_ALIGNMENT is defined in pa32-linux.h as follows:
> #define MALLOC_ABI_ALIGNMENT 128
> 
> So, the defines are now consistent on linux.  The only remaining problem is
> 64-bit hpux where the actual
> malloc alignment is 8 bytes.  The resource_adaptor.cc test still fails on

I've just committed a change to the resource_adaptor implementation, but I
don't expect it to change the FAIL for hpux yet. I hope the FAILs are fixed for
Solaris now though, and if so then we can make the special case apply to 64-bit
hpux too, like so (are these the right macros to check for?):

diff --git a/libstdc++-v3/include/experimental/memory_resource b/libstdc++-v3/include/experimental/memory_resource
index dde3753fab7..dd6f3099a78 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -413,7 +413,8 @@ namespace pmr {
   do_allocate(size_t __bytes, size_t __alignment) override
   {
// Cannot use max_align_t on 32-bit Solaris x86, see PR libstdc++/77691
-#if ! (defined __sun__ && defined __i386__)
+#if ! (defined __sun__ && defined __i386__) \
+   && ! (defined __hpux && defined _LP64)
if (__alignment == alignof(max_align_t))
  return _M_allocate(__bytes);
 #endif
@@ -439,7 +440,8 @@ namespace pmr {
   do_deallocate(void* __ptr, size_t __bytes, size_t __alignment) noexcept
   override
   {
-#if ! (defined __sun__ && defined __i386__)
+#if ! (defined __sun__ && defined __i386__) \
+   && ! (defined __hpux && defined _LP64)
if (__alignment == alignof(max_align_t))
  return (void) _M_deallocate(__ptr, __bytes);
 #endif
diff --git a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
index 7dcb408f3f7..d4353ff6464 100644
--- a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -23,7 +23,8 @@
 #include 
 #include 

-#if defined __sun__ && defined __i386__
+#if (defined __sun__ && defined __i386__) \
+   || (defined __hpux && defined _LP64)
 // See PR libstdc++/77691
 # define BAD_MAX_ALIGN_T 1
 #endif




> it.  Maybe I should change BIGGEST_ALIGNMENT
> and MALLOC_ABI_ALIGNMENT to match the malloc implementation?

I think that makes sense (although it won't change anything until we make the
suggestion from PR 90569 as well, so I'll do that this week).

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #38 from Jonathan Wakely  ---
Author: redi
Date: Wed May 22 20:29:39 2019
New Revision: 271522

URL: https://gcc.gnu.org/viewcvs?rev=271522&root=gcc&view=rev
Log:
PR libstdc++/77691 fix resource_adaptor failures due to max_align_t bugs

Remove the hardcoded whitelist of allocators expected to return memory
aligned to alignof(max_align_t), because that doesn't work when the
platform's malloc() and GCC's max_align_t do not agree what the largest
fundamental alignment is. It's also sub-optimal for user-defined
allocators that return memory suitable for any fundamental alignment.

Instead use a hardcoded list of alignments that are definitely supported
by the platform malloc, and use a copy of the allocator rebound to a POD
type with the requested alignment. Only allocate an oversized
buffer to use with std::align for alignments larger than any of the
hardcoded values.

For 32-bit Solaris x86 do not include alignof(max_align_t) in the
hardcoded values.

PR libstdc++/77691
* include/experimental/memory_resource: Add system header pragma.
(__resource_adaptor_common::__guaranteed_alignment): Remove.
(__resource_adaptor_common::_Types)
(__resource_adaptor_common::__new_list)
(__resource_adaptor_common::_New_list)
(__resource_adaptor_common::_Alignments)
(__resource_adaptor_common::_Fund_align_types): New utilities for
creating a list of types with fundamental alignments.
(__resource_adaptor_imp::do_allocate): Call new _M_allocate function.
(__resource_adaptor_imp::do_deallocate): Call new _M_deallocate
function.
(__resource_adaptor_imp::_M_allocate): New function that first tries
to use an allocator rebound to a type with a fundamental alignment.
(__resource_adaptor_imp::_M_deallocate): Likewise for deallocation.
* testsuite/experimental/memory_resource/new_delete_resource.cc:
Adjust expected allocation sizes.
* testsuite/experimental/memory_resource/resource_adaptor.cc: Remove
xfail.

Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/experimental/memory_resource
trunk/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
trunk/libstdc++-v3/testsuite/experimental/memory_resource/resource_adaptor.cc
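
The fallback mentioned in the log (an oversized buffer used with std::align) can be illustrated with a short sketch. This is not the libstdc++ implementation: AlignedBlock, allocate_overaligned and is_aligned are hypothetical names, plain malloc stands in for the wrapped allocator, and the alignment is assumed to be a power of two.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical result type: raw is what must later be passed to free(),
// aligned is the pointer handed to the user.
struct AlignedBlock {
  void* raw;
  void* aligned;
};

// True if p is a multiple of alignment.
bool is_aligned(const void* p, std::size_t alignment)
{
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Over-allocate by `alignment` bytes, then let std::align slide the pointer
// forward to the next suitably aligned address inside the buffer.
AlignedBlock allocate_overaligned(std::size_t bytes, std::size_t alignment)
{
  std::size_t space = bytes + alignment;
  void* raw = std::malloc(space);
  if (raw == nullptr)
    return {nullptr, nullptr};

  void* p = raw;
  void* aligned = std::align(alignment, bytes, p, space);
  if (aligned == nullptr) {
    std::free(raw);
    return {nullptr, nullptr};
  }
  return {raw, aligned};
}
```

The cost of this approach is the extra `alignment` bytes per allocation, which is presumably why the log reserves it for alignments larger than any of the hardcoded values.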

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread dave.anglin at bell dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #37 from dave.anglin at bell dot net ---
On 2019-05-22 3:41 p.m., redi at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691
>
> --- Comment #36 from Jonathan Wakely  ---
> Interesting. Yes, definitely similar ideas. It looks like it was solved
> differently though, as config/pa/pa.h has
>
> #define MALLOC_ABI_ALIGNMENT (TARGET_64BIT ? 128 : 64)
>
> which should get used by the aligned new code, even without my suggested 
> change
> in PR 90569.
>
> As an aside, the comment on MALLOC_ABI_ALIGNMENT says "The glibc 
> implementation
> currently provides 8-byte alignment." But glibc malloc was changed to 16-byte
> alignment a couple of years ago.
I believe I changed the glibc value because of the pthread mutex issue.

MALLOC_ABI_ALIGNMENT is defined in pa32-linux.h as follows:
#define MALLOC_ABI_ALIGNMENT 128

So, the defines are now consistent on linux.  The only remaining problem is
64-bit hpux where the actual
malloc alignment is 8 bytes.  The resource_adaptor.cc test still fails on it.
Maybe I should change BIGGEST_ALIGNMENT
and MALLOC_ABI_ALIGNMENT to match the malloc implementation?

[Bug target/90330] gcc 9.1.0 fails to install on macOS 10.14.4

2019-05-22 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90330

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2019-05-22
     Ever confirmed|0                           |1

[Bug target/90330] gcc 9.1.0 fails to install on macOS 10.14.4

2019-05-22 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90330

--- Comment #14 from Iain Sandoe  ---
(In reply to Iain Sandoe from comment #13)
> (In reply to Matt Thompson from comment #12)
> > (In reply to Iain Sandoe from comment #11)
> > > (In reply to Matt Thompson from comment #10)
> > > > (In reply to Iain Sandoe from comment #9)
> > > > > (In reply to Matt Thompson from comment #8)
> 

> > Well, 9.1.0 built just fine with 8.2.0 loaded in my environment. This seems
> > to point to clang, which, well, doesn't surprise me as clang and I have had
> > a difficult life together, but then again clang built 5.4.0 up to 8.2.0 just
> > fine for me. 
> > 
> > I ran a 'make check' and got:
> > 
> > Fixed:  time.h
> > Fixed:  tinfo.h
> > Fixed:  types/vxTypesBase.h
> > Fixed:  unistd.h
> > Newly fixed header:  sys/ucred.h
> > 
> > There were fixinclude test FAILURES
> > make[2]: *** [Makefile:177: check] Error 1
> > make[2]: Leaving directory
> > '/Users/mathomp4/src/GCC/gcc-9.1.0-BUILD-820loaded/fixincludes'
> > make[1]: *** [Makefile:3829: check-fixincludes] Error 2
> > make[1]: Leaving directory
> > '/Users/mathomp4/src/GCC/gcc-9.1.0-BUILD-820loaded'
> > make: *** [Makefile:2358: do-check] Error 2

This was 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90379

fixed on trunk and for 9.2 (so current snapshots from the branch should have
the fix).

Other than that, I can't reproduce the problem locally - it installs for me
whether built using the XC10.2 command line tools, or my own (GCC-8.3) toolset.

... is there anything more we need to do on this PR?
(very happy to help, but not sure how to make progress without a reproducer for
the issue).

[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377

2019-05-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539

--- Comment #22 from Thomas Koenig  ---
I've been trying out some things, and I cannot construct a failing
test case.

A sane way to build such an interface would be

$ cat tst.f90
module x
  use, intrinsic :: iso_c_binding, only : c_double
  implicit none
  interface
 subroutine foo(a) bind(c)
   import
   real(kind=c_double) :: a(*)
 end subroutine foo
  end interface
  private
  public :: bar

contains
  subroutine bar(a)
    real(kind=c_double), dimension(:) :: a
    a = 42._c_double
    call foo(a)
  end subroutine bar
end module x

program main
  use, intrinsic :: iso_c_binding, only : c_double
  use x
  implicit none
  real(kind=c_double), dimension(1) :: a
  call bar(a)
end program main
$ cat foo.c
#include <stdio.h>

void foo (double *a)
{
  printf("%f\n", *a);
}
$ gfortran -flto -O tst.f90 foo.c
$ ./a.out
42.000000

This works as expected.

What I do not understand is (comment #17)

(gdb) p debug(fsym)
|| symbol: '_formal_107'  
  type spec : (REAL 8)
  attributes: (VARIABLE  DIMENSION DUMMY)
  Array spec:(0 [0])


This means that the dummy parameter has rank zero. How, then,
is it possible to pass a rank-1 argument to it?

(gdb) p debug(expr)
nf90_put_var_1d_eightbytereal:values(FULL) (REAL 8)

(gdb) p *expr->ref
$8 = {
  type = REF_ARRAY, 
  u = {
ar = {
  type = AR_FULL, 
  dimen = 1, 
  codimen = 0, 

Something very fishy going on here.

Please look up the Fortran interface to the C function that is called,
nc_put_vara_double.

Also, please break on gfc_conv_procedure_call for the call
in question and do

$ call debug(sym)
$ p args
$ call debug(args->expr)
$ p args->next
$ call debug(args->next->expr)

... and so on, until args->...->next becomes a null pointer.

I am starting to suspect that this is, in fact, another piece of SPEC
bugware where they made some sort of broken interface between C
and Fortran, which is exposed by my patch.

Hmpf...

[Bug c++/86485] [7/8 Regression] "anonymous" maybe-uninitialized false positive with ternary operator

2019-05-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86485

--- Comment #7 from Jason Merrill  ---
Author: jason
Date: Wed May 22 19:48:05 2019
New Revision: 271521

URL: https://gcc.gnu.org/viewcvs?rev=271521&root=gcc&view=rev
Log:
PR c++/86485 - simple_empty_class_p

Yet another tweak that would have fixed this bug: we should treat INIT_EXPR
and MODIFY_EXPR differently for determining whether this is a simple empty
class copy, since a TARGET_EXPR on the RHS is direct initialization if
INIT_EXPR but copy if MODIFY_EXPR.

* cp-gimplify.c (simple_empty_class_p): Also true for MODIFY_EXPR.

Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/cp-gimplify.c

[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86

2019-05-22 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568

--- Comment #5 from Peter Cordes  ---
And BTW, this only helps if the SUB and JNE are consecutive, which GCC
(correctly) doesn't currently optimize for with XOR.

If this sub/jne is different from a normal sub/branch and won't already get
optimized for macro-fusion, we may get even more benefit from this change by
teaching gcc to keep them adjacent.

GCC currently sometimes splits up the instructions like this:

xorq    %fs:40, %rdx
movl    %ebx, %eax
jne     .L7

from gcc8.3 (but not 9.1 or trunk in this case) on https://godbolt.org/z/nNjQ8u


#include <random>
unsigned int get_random_seed() {
    std::random_device rd;
    return rd();
}

Even with -O3 -march=skylake.
That's not wrong because XOR can't macro-fuse, but the point of switching to
SUB is that it *can* macro-fuse into a single sub-and-branch uop on
Sandybridge-family.  So we might need to teach gcc about that.

So when you change this, please make it aware of optimizing for macro-fusion by
keeping the sub and jne back to back.  Preferably with tune=generic (because
Sandybridge-family is fairly widespread and it doesn't hurt on other CPUs), but
definitely with -mtune=intel or -mtune=sandybridge or later.

Nehalem and earlier can only macro-fuse test/cmp

The potential downside of putting it adjacent instead of 1 or 2 insns earlier
for uarches that can't macro-fuse SUB/JNE should be about zero on average. 
These branches should predict very well, and there are no in-order x86 CPUs
still being sold.  So it's mostly just going to be variations in fetch/decode
that help sometimes, hurt sometimes, like any code alignment change.

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #36 from Jonathan Wakely  ---
Interesting. Yes, definitely similar ideas. It looks like it was solved
differently though, as config/pa/pa.h has

#define MALLOC_ABI_ALIGNMENT (TARGET_64BIT ? 128 : 64)

which should get used by the aligned new code, even without my suggested change
in PR 90569.

As an aside, the comment on MALLOC_ABI_ALIGNMENT says "The glibc implementation
currently provides 8-byte alignment." But glibc malloc was changed to 16-byte
alignment a couple of years ago.

[Bug testsuite/90565] [10 regression] test cases gcc.dg/uninit-18.c and uninit-pr90394-1-gimple.c broken as of r271460

2019-05-22 Thread seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90565

--- Comment #2 from seurer at gcc dot gnu.org ---
Also possibly gcc.dg/pr67512.c

[Bug tree-optimization/90579] New: Huge store forward stall due to vectorizer

2019-05-22 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Bug ID: 90579
   Summary: Huge store forward stall due to vectorizer
   Product: gcc
   Version: 9.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---
Target: x86-64

loop/avx256 branch at

https://gitlab.com/x86-benchmarks/microbenchmark

shows huge store forward stall due to vectorizer in

---
extern double r[6];
extern double a[];

double
loop (int k, double x)
{
  int i;
  double t=0;
  for (i=0;i<6;i++)
    r[i] = x * a[i + k];
  for (i=0;i<6;i++)
    t+=r[5-i];
  return t;
}
---

when compiled with -O3 -march=skylake:

[hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward ./event
loop: 229408

 Performance counter stats for './event':

 1  ld_blocks.store_forward:u   

   0.000478529 seconds time elapsed

   0.000502000 seconds user
   0.0 seconds sys


[hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward
./event-avx128
loop: 191390

 Performance counter stats for './event-avx128':

 1  ld_blocks.store_forward:u   

   0.000526154 seconds time elapsed

   0.000507000 seconds user
   0.0 seconds sys


[hjl@gnu-cfl-1 microbenchmark]$ perf stat -e ld_blocks.store_forward
./event-avx256
loop: 1312864

 Performance counter stats for './event-avx256':

30,001  ld_blocks.store_forward:u   

   0.000756643 seconds time elapsed

   0.000723000 seconds user
   0.0 seconds sys


[hjl@gnu-cfl-1 microbenchmark]$

[Bug target/88483] Unnecessary stack alignment

2019-05-22 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88483

--- Comment #5 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Wed May 22 18:53:37 2019
New Revision: 271517

URL: https://gcc.gnu.org/viewcvs?rev=271517&root=gcc&view=rev
Log:
x86: Don't allocate stack frame nor align stack if not needed

get_frame_size () returns used stack slots during compilation, which
may be optimized out later.  This patch does the following:

1. Add stack_frame_required to machine_function to indicate that the
function needs a stack frame.
2. Change ix86_find_max_used_stack_alignment to set stack_frame_required.
3. Always call ix86_find_max_used_stack_alignment to check if stack
frame is needed.

Tested on i686 and x86-64 with

--with-arch=native --with-cpu=native

Tested on AVX512 machine configured with

--with-arch=native --with-cpu=native

gcc/

PR target/88483
* config/i386/i386-options.c (ix86_init_machine_status): Set
stack_frame_required to true.
* config/i386/i386.c (ix86_get_frame_size): New function.
(ix86_frame_pointer_required): Replace get_frame_size with
ix86_get_frame_size.
(ix86_compute_frame_layout): Likewise.
(ix86_find_max_used_stack_alignment): Changed to void.  Set
stack_frame_required.
(ix86_finalize_stack_frame_flags): Always call
ix86_find_max_used_stack_alignment.  Replace get_frame_size with
ix86_get_frame_size.
* config/i386/i386.h (machine_function): Add stack_frame_required.

gcc/testsuite/

PR target/88483
* gcc.target/i386/stackalign/pr88483-1.c: New test.
* gcc.target/i386/stackalign/pr88483-2.c: Likewise.

Added:
trunk/gcc/testsuite/gcc.target/i386/stackalign/pr88483-1.c
trunk/gcc/testsuite/gcc.target/i386/stackalign/pr88483-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386-options.c
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/testsuite/ChangeLog

[Bug target/90547] [8/9/10 Regression] ICE in gen_lowpart_general, at rtlhooks.c:63

2019-05-22 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90547

--- Comment #3 from uros at gcc dot gnu.org ---
Author: uros
Date: Wed May 22 18:49:22 2019
New Revision: 271516

URL: https://gcc.gnu.org/viewcvs?rev=271516&root=gcc&view=rev
Log:
Backported from mainline
2019-05-21  Uroš Bizjak  

* config/i386/cpuid.h (__cpuid): For 32bit targets, zero
%ebx and %ecx before calling cpuid with leaf 1 or
non-constant leaf argument.

2019-05-21  Uroš Bizjak  

PR target/90547
* config/i386/i386.md (anddi_1 to andsi_1_zext splitter):
Avoid calling gen_lowpart with CONST operand.

testsuite/ChangeLog:

Backported from mainline
2019-05-21  Uroš Bizjak  

PR target/90547
* gcc.target/i386/pr90547.c: New test.


Added:
branches/gcc-9-branch/gcc/testsuite/gcc.target/i386/pr90547.c
Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/config/i386/cpuid.h
branches/gcc-9-branch/gcc/config/i386/i386.md
branches/gcc-9-branch/gcc/testsuite/ChangeLog

[Bug lto/90577] [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto

2019-05-22 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-apple-darwin*,
                   |                            |x86_64-gnu-linux

--- Comment #1 from Iain Sandoe  ---
this is repeatable on Linux (m32 and m64)

FAIL: gfortran.dg/lrshift_1.f90   -O2  execution test
FAIL: gfortran.dg/lrshift_1.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/lrshift_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/lrshift_1.f90   -Os  execution test

[Bug libstdc++/90415] std::is_copy_constructible> is incomplete

2019-05-22 Thread rafael at espindo dot la
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415

--- Comment #2 from Rafael Avila de Espindola  ---
The bug is still present on trunk.

[Bug libstdc++/90415] std::is_copy_constructible> is incomplete

2019-05-22 Thread rafael at espindo dot la
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415

Rafael Avila de Espindola  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at redhat dot com

--- Comment #1 from Rafael Avila de Espindola  ---
This bug was present when gcc 8 branched. It was fixed in the gcc 8 branch, but
I guess it was never fixed on trunk.

On the gcc 8 branch it was fixed by r261463
(d26c6b8b0c6abba9a67b87a1d48f0c3165d021cc).

[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86

2019-05-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-05-22
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #4 from Jakub Jelinek  ---
Ok, will change it then.  Thanks for the report.

[Bug lto/90577] [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto

2019-05-22 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-05-22
     Ever confirmed|0                           |1

[Bug fortran/90578] Wrong code with LSHIFT and optimization

2019-05-22 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90578

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-05-22
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=90577
     Ever confirmed|0                           |1

[Bug fortran/90578] New: Wrong code with LSHIFT and optimization

2019-05-22 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90578

Bug ID: 90578
   Summary: Wrong code with LSHIFT and optimization
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dominiq at lps dot ens.fr
  Target Milestone: ---

While experimenting with pr90577, I have found that the run-time output for the
following reduced test

program test_rshift_lshift
  implicit none
  integer :: i(15), j, n

  i = (/ -huge(i), -huge(i)/2, -129, -128, -127, -2, -1, 0, &
 1, 2, 127, 128, 129, huge(i)/2, huge(i) /)

  print *, lshift(i(1),-30) 
  print *, lshift(i(1),-29) 
  if (lshift(i(1),-30) /= 4) STOP 1

end program test_rshift_lshift

depends on the optimization level:

% gfc lrshift_1_red.f90
% ./a.out 
   4
   8
% gfc lrshift_1_red.f90 -O
% ./a.out
   2
   4
STOP 1

but

gfc lrshift_1_red.f90 -fauto-inc-dec -fbranch-count-reg
-fcombine-stack-adjustments -fcompare-elim -fcprop-registers -fdce -fdefer-pop
-fdse -fforward-propagate -fguess-branch-probability -fif-conversion
-fif-conversion2 -finline-functions-called-once -fipa-profile -fipa-pure-const
-fipa-reference -fipa-reference-addressable -fmerge-constants
-fmove-loop-invariants -fomit-frame-pointer -freorder-blocks -fshrink-wrap
-fshrink-wrap-separate -fsplit-wide-types -fssa-backprop -fssa-phiopt
-ftree-bit-ccp -ftree-ccp -ftree-ch -ftree-coalesce-vars -ftree-copy-prop
-ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
-ftree-phiprop -ftree-pta -ftree-scev-cprop -ftree-sink -ftree-slsr -ftree-sra
-ftree-ter -funit-at-a-time

gives also

   4
   8

I see this behavior from at least 4.8 up to trunk (10.0).

[Bug target/90568] stack protector should use cmp or sub, not xor, to allow macro-fusion on x86

2019-05-22 Thread peter at cordes dot ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568

--- Comment #3 from Peter Cordes  ---
(In reply to Jakub Jelinek from comment #2)
> The xor there is intentional, for security reasons we do not want the stack
> canary to stay in the register afterwards, because then it could be later
> spilled or accessible to some exploit in another way.

Ok, so we can't use CMP, therefore we should use SUB, which as I showed does
help on Sandybridge-family vs. XOR.

x - x = 0   just like 
x ^ x = 0

Otherwise SUB wouldn't set ZF.

SUB is not worse than XOR on any other CPUs; there are no CPUs with better XOR
throughput than ADD/SUB.

In the canary mismatch case, leaving  attacker_value - key  in a register seems
no worse than leaving attacker_value ^ key in a register.  Either value
trivially reveals the canary value to an attacker that knows what they
overwrote the stack with, if it does somehow leak.  We jump to __stack_chk_fail
in that case, not relying on the return value on the stack, so a ROP attack
wouldn't be sufficient to leak that value anywhere.
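
The two claims above (SUB detects a mismatch exactly when XOR does, and either residual gives away the canary to someone who knows the overwriting value) can be written out as a small sketch. The function names are hypothetical and 64-bit unsigned wrap-around arithmetic is assumed; this is an illustration of the arithmetic, not of the code GCC emits.

```cpp
#include <cstdint>

// Both checks are zero exactly when the stored value equals the key.
constexpr bool canary_matches_xor(std::uint64_t stored, std::uint64_t key)
{
  return (stored ^ key) == 0;
}

constexpr bool canary_matches_sub(std::uint64_t stored, std::uint64_t key)
{
  return (stored - key) == 0;
}

// On a mismatch the register holds attacker ^ key or attacker - key.
// Knowing the attacker value, the key follows from either residual.
constexpr std::uint64_t key_from_xor_residual(std::uint64_t attacker,
                                              std::uint64_t residual)
{
  return attacker ^ residual;
}

constexpr std::uint64_t key_from_sub_residual(std::uint64_t attacker,
                                              std::uint64_t residual)
{
  return attacker - residual;   // unsigned arithmetic wraps modulo 2^64
}
```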

[Bug lto/90577] New: [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with -O(2|3) and -flto

2019-05-22 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90577

Bug ID: 90577
   Summary: [9/10 Regression] FAIL: gfortran.dg/lrshift_1.f90 with
-O(2|3) and -flto
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dominiq at lps dot ens.fr
CC: hubicka at gcc dot gnu.org, iains at gcc dot gnu.org,
marxin at gcc dot gnu.org
  Target Milestone: ---

Testing fortran with -flto gives

Running /opt/gcc/work/gcc/testsuite/gfortran.dg/dg.exp ...
FAIL: gfortran.dg/lrshift_1.f90   -O2  execution test
FAIL: gfortran.dg/lrshift_1.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/lrshift_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/lrshift_1.f90   -Os  execution test

=== gfortran Summary for unix/-m32/-flto ===

# of expected passes            10
# of unexpected failures        4

The behavior changed between revisions r268729 (2019-02-09, OK) and r269160
(2019-02-23, wrong-code).

With the following change

   do n = 1, size(i)
 do j = -30, 30
+  print *, n, j, lshift(i(n),j) 
+  print *, n, j, c_lshift(i(n),j) 
   if (lshift(i(n),j) /= c_lshift(i(n),j)) STOP 1
   if (rshift(i(n),j) /= c_rshift(i(n),j)) STOP 2
 end do

the wrong code gives

   1 -30   2
   1 -30  -2
STOP 1

while the working one gives

   1 -30   4
   1 -30   4
   1 -29   8
   1 -29   8
...

I also see

FAIL: gfortran.dg/ISO_Fortran_binding_9.f90   -g -O3 -fwhole-program -flto 
execution test

[Bug rtl-optimization/64895] RA picks the wrong register for -fipa-ra

2019-05-22 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895

--- Comment #16 from Iain Sandoe  ---
Created attachment 46398
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46398&action=edit
testsuite patch

Will post this later, tested on x86_64-linux and x86_64-darwin.

[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569

--- Comment #5 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #4)
> Rainer, the change to gcc/cp/init.c would allow you to do:
> 
> #define MALLOC_ABI_ALIGNMENT 8

Oops, it's in bits not bytes, so that should be

#define MALLOC_ABI_ALIGNMENT 64
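
Since the macro is expressed in bits while malloc guarantees are usually quoted in bytes, the conversion behind this correction can be spelled out. A trivial sketch, assuming 8-bit bytes; the helper name alignment_bits_to_bytes is hypothetical:

```cpp
#include <cstddef>

// MALLOC_ABI_ALIGNMENT is given in bits. This converts to bytes,
// assuming CHAR_BIT == 8.
constexpr std::size_t alignment_bits_to_bytes(std::size_t bits)
{
  return bits / 8;
}

// An 8-byte malloc alignment corresponds to a macro value of 64,
// and a 16-byte alignment to 128.
static_assert(alignment_bits_to_bytes(64) == 8, "64 bits is 8 bytes");
static_assert(alignment_bits_to_bytes(128) == 16, "128 bits is 16 bytes");
```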

[Bug target/68485] ICE while building gpsd package on microblaze

2019-05-22 Thread giulio.benetti at micronovasrl dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68485

Giulio Benetti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |giulio.benetti@micronovasrl
                   |                            |.com

--- Comment #5 from Giulio Benetti  ---
This seems to be a duplicate of Bug 69401.

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread dave.anglin at bell dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #35 from dave.anglin at bell dot net ---
On 2019-05-22 11:03 a.m., redi at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691
>
> --- Comment #34 from Jonathan Wakely  ---
> (In reply to Jonathan Wakely from comment #33)
>> The correct fix is to adjust the value of __STDCPP_DEFAULT_NEW_ALIGNMENT__
>> on targets where malloc doesn't agree with GCC's alignof(max_align_t).
> That only helps for C++17 and later though :-(
>
> The <experimental/memory_resource> header is defined for C++14.
>
Reminds me of this patch:
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00528.html

[Bug tree-optimization/90576] New: [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413

2019-05-22 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90576

Bug ID: 90576
   Summary: [10 regression] SPEC CPU2006 450.soplex miscompiled
with -Os -flto after r271413
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

After
===
commit ce7b4f267706c23405705d848c1dcf686496f262
Author: hubicka 
Date:   Mon May 20 12:01:40 2019 +

   * tree-ssa-alias.c (compare_sizes): New function.
   (sompare_type_sizes): New function
   (aliasing_component_refs_p): Use it.
   (indirect_ref_may_alias_decl_p): Likewise.


   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@271413
138bc75d-0d04-0410-961f-82ee72b054a4
===
GCC miscompiles 450.soplex with -Os -flto at least on AArch64 and AArch32.  The
benchmark finishes within seconds with
===
450.soplex: copy 0 non-zero return code (exit code=11, signal=0)
===

FWIW, "-Os -fno-lto" seems to work.

Considering that both AArch64 and AArch32 are affected and the nature of the
patch, this likely affects other architectures as well.

Honza, would you please investigate?  Please let me know if it doesn't readily
reproduce for you, and I'll help with a testcase.

[Bug libstdc++/77691] [7/8/9/10 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #34 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #33)
> The correct fix is to adjust the value of __STDCPP_DEFAULT_NEW_ALIGNMENT__
> on targets where malloc doesn't agree with GCC's alignof(max_align_t).

That only helps for C++17 and later though :-(

The <experimental/memory_resource> header is defined for C++14.

[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569

--- Comment #4 from Jonathan Wakely  ---
Rainer, the change to gcc/cp/init.c would allow you to do:

#define MALLOC_ABI_ALIGNMENT 8

in gcc/config/i386/sol2.h and that would cause std::allocator to know that it
can't rely on malloc for 16-byte alignment.

Although that would only help for C++17, because otherwise __cpp_aligned_new
isn't defined ... drat.

It's better than nothing though.

Does that seem acceptable for your target?

[Bug debug/90575] New: -gsplit-dwarf leaves behind .dwo file in cwd

2019-05-22 Thread sbergman at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90575

Bug ID: 90575
   Summary: -gsplit-dwarf leaves behind .dwo file in cwd
   Product: gcc
   Version: 9.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sbergman at redhat dot com
  Target Milestone: ---

At least with current GCC 9.1.1:

> $ mkdir testdir
> $ echo 'int main(void) { return 0; }' > testdir/test.c
> $ gcc -gsplit-dwarf testdir/test.c -o testdir/test
> $ ls
> testdir  test.dwo

I at least wouldn't expect the above to leave behind a test.dwo in the current
working dir.

[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11

2019-05-22 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #3 from Jason Merrill  ---
(In reply to Jonathan Wakely from comment #0)
> unsigned
> malloc_alignment ()
> {
>   if (MALLOC_ABI_ALIGNMENT != BITS_PER_WORD)
> return MALLOC_ABI_ALIGNMENT;
>   return MAX (max_align_t_align(), MALLOC_ABI_ALIGNMENT);
> }

The last line can just be 

  return max_align_t_align();

Otherwise looks good to me.
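
A small C model of why the MAX is redundant, written with plain parameters
instead of the real GCC macros. The function names and the parameterised form
are illustrative only, and the equivalence assumes that the alignment of
max_align_t is never smaller than BITS_PER_WORD:

```c
#include <assert.h>

/* Model of the proposed function.  All values are in bits. */
static unsigned malloc_alignment_original(unsigned malloc_abi_alignment,
                                          unsigned bits_per_word,
                                          unsigned max_align_t_align)
{
    if (malloc_abi_alignment != bits_per_word)
        return malloc_abi_alignment;
    return max_align_t_align > malloc_abi_alignment
               ? max_align_t_align : malloc_abi_alignment;
}

/* Model of the simplified last line.  On the fall-through path
   malloc_abi_alignment equals bits_per_word, so the MAX can only pick
   max_align_t_align under the stated assumption. */
static unsigned malloc_alignment_simplified(unsigned malloc_abi_alignment,
                                            unsigned bits_per_word,
                                            unsigned max_align_t_align)
{
    assert(max_align_t_align >= bits_per_word); /* assumption of this sketch */
    if (malloc_abi_alignment != bits_per_word)
        return malloc_abi_alignment;
    return max_align_t_align;
}
```

For example, with a 32-bit word and a 128-bit max_align_t, both models return
128 when the target leaves MALLOC_ABI_ALIGNMENT at the word size, and both
return 64 when the target overrides it to 64.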

[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377

2019-05-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539

--- Comment #21 from Thomas Koenig  ---
OK, if the callee is a C function... what is its declaration
on the Fortran side?  Is there any interface, bind(c) or otherwise?

I suppose there must be something, otherwise nf_put_vara_double would
have a trailing underscore.

On the caller side, I see that an array is passed, but the fsym
has rank=0.  I think this would be flagged otherwise.

[Bug debug/90574] New: [gdb] gdb wrongly stopped at a breakpoint in an unexecuted line of code

2019-05-22 Thread yangyibiao at nju dot edu.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90574

Bug ID: 90574
   Summary: [gdb] gdb wrongly stopped at a breakpoint in an
unexecuted line of code
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yangyibiao at nju dot edu.cn
  Target Milestone: ---

$ gcc --version
gcc (GCC) 10.0.0 20190517 (experimental)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb --version
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 

$ cat small.c
#include <stdio.h>

int main(int argc, char **argv)
{
  if (argc == 0)
  {
int *ptr;
label:
  {
  }
  }
  if (argc == 1)
  {
 printf("hello\n");
  }
  return 0;
}

$ gcc -g small.c; ./a.out
hello

$ gdb -batch -x cmds a.out
Breakpoint 1 at 0x400501: file small.c, line 8.

Breakpoint 1, main (argc=1, argv=0x7fffde58) at small.c:8
8   label:
ptr = 
Kill the program being debugged? (y or n) [answered Y; input not from terminal]


$ cat cmds
b 8
r
info locals
kill
q


Line 8, in the body of "if (argc == 0)", is not executed according to the
program output.

Thus, when we set a breakpoint at line 8, gdb should not stop there. However,
in this case it stopped and printed the locals. I therefore wonder whether
this is a bug in gdb.

[Bug rtl-optimization/64895] RA picks the wrong register for -fipa-ra

2019-05-22 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iains at gcc dot gnu.org

--- Comment #15 from Iain Sandoe  ---
(IIUC the thread here) It looks to me that the codegen is now DTRT for both pic
and non pic.
Darwin is doing pic by default, so sees XPASSes
There is no Linux pic test (so the change was not noticed there):

(I will produce a patch for the tests on the basis that this is now fixed).

Linux x86-64 (r271505):

Running target unix/-fpic/-m32
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for
target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/iains/gcc-trunk/src-local/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running /home/iains/gcc-trunk/src-local/gcc/testsuite/gcc.target/i386/i386.exp
...
XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-not push
XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-not pop
XPASS: gcc.target/i386/fuse-caller-save-rec.c scan-assembler-times
addl\t%[re]?dx, %[re]?ax 1
FAIL: gcc.target/i386/fuse-caller-save-xmm.c scan-assembler-times
addpd\t\\.?LC0.*, %xmm0 1
XPASS: gcc.target/i386/fuse-caller-save-xmm.c scan-assembler-times
addpd\t%xmm1, %xmm0 1
XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-not push
XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-not pop
XPASS: gcc.target/i386/fuse-caller-save.c scan-assembler-times
addl\t%[re]?d[ix], %[re]?ax 1

=== gcc Summary for unix/-fpic/-m64 ===

# of expected passes            18

=
code output below - Darwin produces the same (or equivalent, for m32/pic).
=
(extraneous lines snipped for clarity)

fuse-caller-save-rec.c -O2 -fipa-ra -fomit-frame-pointer
-fno-optimize-sibling-calls -mregparm=1 -m32 -S {,-fpic}

bar:
        cmpl    $4, %eax
        jg      .L9
        xorl    %eax, %eax
        ret
.L9:
        subl    $12, %esp
        subl    $3, %eax
        call    bar
        addl    $12, %esp
        ret

foo:
        subl    $12, %esp
        movl    %eax, %edx
        call    bar
        addl    $12, %esp
        addl    %edx, %eax
        ret
=
fuse-caller-save.c -O2 -fipa-ra -fomit-frame-pointer -mregparm=1 -m32 -S
{,-fpic}

bar:
        addl    $3, %eax
        ret

foo:
        movl    %eax, %edx
        call    bar
        addl    %edx, %eax
        ret

= 
fuse-caller-save-xmm.c -O2 -fipa-ra -fomit-frame-pointer -msse2 -mno-avx -m32
-S

bar:
        addpd   .LC0, %xmm0
        ret

foo:
        subl    $12, %esp
        movapd  %xmm0, %xmm1
        call    bar
        addl    $12, %esp
        addpd   %xmm1, %xmm0
        ret

fuse-caller-save-xmm.c -O2 -fipa-ra -fomit-frame-pointer -msse2 -mno-avx -m32
-S -fpic

bar:
        call    __x86.get_pc_thunk.ax
        addl    $_GLOBAL_OFFSET_TABLE_, %eax
        movapd  .LC0@GOTOFF(%eax), %xmm1
        addpd   %xmm1, %xmm0
        ret

foo:
        subl    $12, %esp
        movapd  %xmm0, %xmm2
        call    bar
        addl    $12, %esp
        addpd   %xmm2, %xmm0
        ret

[Bug tree-optimization/90573] Avoid unnecessary data transfer into OMP construct

2019-05-22 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90573

--- Comment #1 from Thomas Schwinge  ---
Probably some of these transformations should come with compiler diagnostics,
especially for explicit clauses.

For example, need to relate this to 'OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT': PR70550
(r234779, r234824, r234826).

PR72781?

Or "the other way round", PR69876?

[Bug c++/68476] microblaze: compilation of btSoftBody.cpp doesn't terminate with optimisation

2019-05-22 Thread giulio.benetti at micronovasrl dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68476

Giulio Benetti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |giulio.benetti@micronovasrl
                   |                            |.com

--- Comment #8 from Giulio Benetti  ---
Duplicate.
It turns out that this bug behaves like 85180:
- hang on gcc version < 8.x with -O1/2/3

*** This bug has been marked as a duplicate of bug 85180 ***

[Bug c++/90569] __STDCPP_DEFAULT_NEW_ALIGNMENT__ is wrong for i386-pc-solaris2.11

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90569

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-05-22
     Ever confirmed|0                           |1

--- Comment #2 from Jonathan Wakely  ---
I thought we could work around this in libstdc++ like so:

diff --git a/libstdc++-v3/libsupc++/Makefile.am
b/libstdc++-v3/libsupc++/Makefile.am
index eec7b953514..a50a9848461 100644
--- a/libstdc++-v3/libsupc++/Makefile.am
+++ b/libstdc++-v3/libsupc++/Makefile.am
@@ -129,6 +129,8 @@ cp-demangle.o: cp-demangle.c


 # Use special rules for the C++17 sources so that the proper flags are passed.
+new_op.lo: new_op.cc
+   $(LTCXXCOMPILE) -std=gnu++1z -c $<
 new_opa.lo: new_opa.cc
$(LTCXXCOMPILE) -std=gnu++1z -c $<
 new_opant.lo: new_opant.cc
diff --git a/libstdc++-v3/libsupc++/Makefile.in
b/libstdc++-v3/libsupc++/Makefile.in
index 5d8ac5ca0ba..0e3cbff0055 100644
--- a/libstdc++-v3/libsupc++/Makefile.in
+++ b/libstdc++-v3/libsupc++/Makefile.in
@@ -956,6 +956,8 @@ cp-demangle.o: cp-demangle.c
$(C_COMPILE) -DIN_GLIBCPP_V3 -Wno-error -c $<

 # Use special rules for the C++17 sources so that the proper flags are passed.
+new_op.lo: new_op.cc
+   $(LTCXXCOMPILE) -std=gnu++1z -c $<
 new_opa.lo: new_opa.cc
$(LTCXXCOMPILE) -std=gnu++1z -c $<
 new_opant.lo: new_opant.cc
diff --git a/libstdc++-v3/libsupc++/new_op.cc
b/libstdc++-v3/libsupc++/new_op.cc
index 863530b7564..203c57d9171 100644
--- a/libstdc++-v3/libsupc++/new_op.cc
+++ b/libstdc++-v3/libsupc++/new_op.cc
@@ -27,6 +27,9 @@
 #include 
 #include 
 #include "new"
+#if defined __sun__ || defined __i386__
+# include 
+#endif

 using std::new_handler;
 using std::bad_alloc;
@@ -41,6 +44,14 @@ extern "C" void *malloc (std::size_t);
 _GLIBCXX_WEAK_DEFINITION void *
 operator new (std::size_t sz) _GLIBCXX_THROW (std::bad_alloc)
 {
+#if defined __sun__ || defined __i386__
+  if (sz >= alignof(std::max_align_t))
+{
+  std::align_val_t al{alignof(std::max_align_t)};
+  return ::operator new(sz, al);
+}
+#endif
+
   void *p;

   /* malloc (0) is unpredictable; avoid it.  */



This would force operator new to use aligned_alloc instead of malloc for
allocations that might be for objects large enough to require greater alignment
than malloc guarantees. But since Solaris 11 doesn't appear to define
aligned_alloc, this would use the fallback implementation in
libsupc++/new_opa.cc which is much less efficient than plain malloc.
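
For illustration, the difference between plain malloc and an explicitly
aligned allocation can be sketched in C11. This is not the libstdc++ code: the
helper names are hypothetical, and it assumes a libc that provides
aligned_alloc, which (as noted above) Solaris 11 does not appear to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round sz up to a multiple of align; align must be a power of two. */
static size_t round_up(size_t sz, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (sz + align - 1) & ~(align - 1);
}

/* Hypothetical helper: request storage aligned for max_align_t explicitly
   instead of trusting malloc's guarantee.  The size is rounded up because
   C11 aligned_alloc expects a multiple of the alignment. */
static void *alloc_max_aligned(size_t sz)
{
    size_t al = _Alignof(max_align_t);
    if (sz == 0)
        sz = 1; /* avoid a zero-sized request, as new_op.cc does for malloc */
    return aligned_alloc(al, round_up(sz, al));
}

/* Check whether p is a multiple of align (a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller would test the result with is_aligned(p, _Alignof(max_align_t)) and
release it with free(p).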

[Bug tree-optimization/90573] New: Avoid unnecessary data transfer into OMP construct

2019-05-22 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90573

Bug ID: 90573
   Summary: Avoid unnecessary data transfer into OMP construct
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: openacc, openmp
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

As mentioned in PR90067: "it might generally be beneficial to have a pass
promoting 'firstprivate(x)' with a dominating write operation on 'x' to
'private(x)'".  This will avoid unnecessary data transfer for (all too common!)
code like:

int i;
#pragma acc parallel loop // implicit 'firstprivate(i)'
for (i = 0; i < N; ++i)
  [...]

Similarly, there are cases where 'copy(x)' can be optimized to 'copyout(x)', or
'copyin(x)' to 'create(x)'.

This need not apply to implicit clauses only, but also to explicit ones, when
the user can't observe any difference.

The same applies to certain OpenMP clauses too, I suppose.
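
A sketch of what the promoted form would look like if written by hand,
assuming an OpenACC-enabled compiler (e.g. -fopenacc); without it the pragma
is ignored and the loop runs serially with the same result. The function name
sum_squares is made up for this example.

```c
/* 'i' is always written (i = 0) before it is read inside the region, so its
   incoming value never needs to be copied in: 'private(i)' is enough. */
long sum_squares(int n)
{
    long total = 0;
    int i;
#pragma acc parallel loop private(i) reduction(+:total)
    for (i = 0; i < n; ++i)
        total += (long)i * i;
    return total;
}
```

The proposed pass would perform this promotion automatically when it can
prove that a write to the variable dominates every read in the construct.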

[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

2019-05-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #39 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #36)
> Created attachment 46396 [details]
> poor mans solution^Whack
> 
> So this is what a hack looks like, basically sprinkling those asm()s
> throughout the code automatically.
> 
> Note I need to protect inputs, not outputs, otherwise the last
> testcase isn't fixed.
> 
> Improving this poor-mans solution by writing in some flow-sensitivity
> like tracking which values are already protected and if there's a possibly
> harmful FENV access inbetween maybe in a similar way tree-complex.c tracks
> complex components might work.
> 
> Note that the FENV pragma does _not_ enable -frounding-math (it really has
> no effect!) so you need to supply -frounding-math yourself (or fix the
> frontends to do that).
> 
> It's a hack of course.
> 
> But it fixes the testcase:
> 
> > ./xgcc -B. t.c -O3 -lm
> > ./a.out
> 1/0.2: down = 4.999 near = 4.999 up =
> 4.999
> a.out: t.c:32: main: Assertion `5.0 <= up' failed.
> Aborted
> > ./xgcc -B. t.c -O3 -lm -frounding-math
> > ./a.out
> 1/0.2: down = 4.999 near = 5 up = 5
> 
> IL after the lowering:
> 
> main ()
> {
>   static const char __PRETTY_FUNCTION__[5] = "main";
>   double near;
>   double up;
>   double down;
>   double op;
>   int D.3058;
> 
>   op = atof ("0.2");
>   fesetround (1024);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   down = 1.0e+0 / op;
>   fesetround (2048);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   up = 1.0e+0 / op;
>   fesetround (0);
>   __asm__ __volatile__("" : "=g" op : "0" op);
>   near = 1.0e+0 / op;
>   printf ("1/%.16g: down = %.16g near = %.16g up = %.16g\n", op, down, near,
> up);
> ...

How does this work if op is a SSA_NAME?

[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #5 from Martin Liška  ---
I've got a patch candidate.

[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

--- Comment #4 from Martin Liška  ---
(In reply to Jakub Jelinek from comment #3)
> Given the TREE_STATIC on:
>   static const int C.0[2] = {1, 2};
> I don't understand why there is ASAN_UNPOISON/ASAN_POISON for C.0, shouldn't
> that be applied solely to automatic variables, not block scope locals?

Ah, you are right. We shouldn't do it for static variables.

[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope

2019-05-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

--- Comment #3 from Jakub Jelinek  ---
Given the TREE_STATIC on:
  static const int C.0[2] = {1, 2};
I don't understand why there is ASAN_UNPOISON/ASAN_POISON for C.0, shouldn't
that be applied solely to automatic variables, not block scope locals?

[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r260969 where Jason emit initializer list initialization as
automatic variable instead of a const int variable.

Difference:

BEFORE:
stru::stru (struct stru * const this)
{
  struct initializer_list D.17010;
  const int D.16442[2];
  struct allocator_type D.16443;

  _1 = >v;
  D.16442[0] = 1;
  D.16442[1] = 2;
  D.17010._M_array = 
  D.17010._M_len = 2;
  .ASAN_MARK (UNPOISON, , 1);
  std::allocator::allocator ();
  try
{
  try
{
  std::vector::vector (_1, D.17010, );
}
  finally
{
  std::allocator::~allocator ();
}
}
  finally
{
  .ASAN_MARK (POISON, , 1);
}
  try
{
  this->i = 5;
}
  catch
{
  _2 = >v;
  std::vector::~vector (_2);
}
}

AFTER:

stru::stru (struct stru * const this)
{
  struct initializer_list D.17010;
  static const int C.0[2] = {1, 2};
  struct allocator_type D.16443;

  _1 = >v;
  .ASAN_MARK (UNPOISON, , 8);
  try
{
  D.17010._M_array = 
  D.17010._M_len = 2;
  .ASAN_MARK (UNPOISON, , 1);
  std::allocator::allocator ();
  try
{
  try
{
  std::vector::vector (_1, D.17010, );
}
  finally
{
  std::allocator::~allocator ();
}
}
  finally
{
  .ASAN_MARK (POISON, , 1);
}
}
  finally
{
  .ASAN_MARK (POISON, , 8);
}
  try
{
  this->i = 5;
}
  catch
{
  _2 = >v;
  std::vector::~vector (_2);
}
}

I believe we're doing good and the code is really invalid. Jason?

[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean

2019-05-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571

--- Comment #3 from Richard Biener  ---
Turning indirect calls into direct ones might be important enough to also
handle

int x, y;
int f() { return x; }
int g() { return y; }
int t0(bool b) { int (*i)() = b ? &f : &g; x = 1; return i(); }
int main(int ac, char**) { return t0(ac & 1); }

like where there are statements before the indirect call that prevent it
from being simply duplicated into the predecessor blocks.

The transformation "primitive" would then be to duplicate the
joiner up to the call and the "interesting" part of it is
creating all required PHI nodes (unless you want to make SSA rewrite
deal with this somehow).  Sinking stmts below the call and limiting
the amount of copying is important.  Note there are related PRs about
missing the chance to sink/hoist stmts through PHI nodes when that reduces
the number of PHI nodes.

The transform would likely split the block, insert "block-closed" PHI
nodes in the tail part for all SSA names defined in the first half and
live over the new edge and then duplicate the first half re-wiring
edges as needed.

This is as opposed to the original testcase where a simpler pattern-matching
scheme could be invented.  I wonder what the "original" testcase looked
like - the one in this bug is probably simplified from real-world code?
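
At the source level, the effect of duplicating the joiner up to the call can
be sketched in C as follows. The names t0_indirect and t0_direct are made up
for this illustration; the real transformation would operate on GIMPLE, not on
source code.

```c
#include <stdbool.h>

int x, y;
int f(void) { return x; }
int g(void) { return y; }

/* Before: the callee is selected first, then a statement runs, then the
   call is made through the pointer. */
int t0_indirect(bool b)
{
    int (*i)(void) = b ? f : g;
    x = 1;
    return i();
}

/* After: the statement before the call is duplicated into each predecessor,
   so both calls are direct and can be inlined. */
int t0_direct(bool b)
{
    if (b) {
        x = 1;
        return f();
    } else {
        x = 1;
        return g();
    }
}
```

Both versions store to x before the call, so f observes the new value either
way; the copying cost is the duplicated store.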

[Bug c++/90572] Wrong disambiguation in friend declaration as implicit typename context

2019-05-22 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90572

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-05-22
                 CC|                            |mpolacek at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |mpolacek at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Marek Polacek  ---
Thanks for the bug report; mine.

[Bug target/71124] Compiler enters infinite loop on Microblaze with -O1/-O2/-O3

2019-05-22 Thread giulio.benetti at micronovasrl dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71124

--- Comment #4 from Giulio Benetti  ---
Previous Comment was wrong.
This duplicates bug:

*** This bug has been marked as a duplicate of bug 85180 ***

[Bug target/71124] Compiler enters infinite loop on Microblaze with -O1/-O2/-O3

2019-05-22 Thread giulio.benetti at micronovasrl dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71124

Giulio Benetti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |giulio.benetti@micronovasrl
                   |                            |.com

--- Comment #3 from Giulio Benetti  ---
Duplicate then.

*** This bug has been marked as a duplicate of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85180 ***

[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

2019-05-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678

--- Comment #38 from Marc Glisse  ---
(In reply to Marc Glisse from comment #37)
> If you protect even constants, the current effects of -frounding-math become
> redundant.

Oops, forget that, the hack is too late for this sentence to be true, some
constant propagation has already happened by that time.

[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean

2019-05-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-05-22
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener  ---
Let me try.

[Bug tree-optimization/90571] Missed optimization opportunity when returning function pointers based on run-time boolean

2019-05-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|c++                         |tree-optimization

--- Comment #1 from Richard Biener  ---
I think there's a dup for this somewhere.  Basically we fail to optimize

  if (_2 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   :

   :
  # iftmp.0_10 = PHI 
  _11 = iftmp.0_10 ();

on the GIMPLE level.  It might be tempting to enable tree-ssa-phiprop.c to
transform this into

  if (_2 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   :
tem1 = f();
goto bb4;

  :
tem2 = g();

   :
  # iftmp.0_10 = PHI 
  _11 = iftmp.0_10;

[Bug middle-end/34678] Optimization generates incorrect code with -frounding-math option (#pragma STDC FENV_ACCESS not implemented)

2019-05-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678

--- Comment #37 from Marc Glisse  ---
(In reply to Richard Biener from comment #36)
> Created attachment 46396 [details]
> poor mans solution^Whack
> 
> So this is what a hack looks like, basically sprinkling those asm()s
> throughout the code automatically.
> 
> Note I need to protect inputs, not outputs, otherwise the last
> testcase isn't fixed.

Actually, you need to protect both inputs *and* outputs...

> Improving this poor-mans solution by writing in some flow-sensitivity
> like tracking which values are already protected

At least if you use "=x" (or whatever the right constraint is on each target)
it doesn't really hurt to have a dozen protections on the same variable.

> and if there's a possibly
> harmful FENV access in between maybe in a similar way tree-complex.c tracks
> complex components might work.
> 
> Note that the FENV pragma does _not_ enable -frounding-math (it really has
> no effect!) so you need to supply -frounding-math yourself (or fix the
> frontends to do that).

If you protect even constants, the current effects of -frounding-math become
redundant.

[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539

--- Comment #20 from Martin Liška  ---
(In reply to Thomas Koenig from comment #19)
> Thanks.
> 
> A bit more:
> 
> What are the declarations of the actual argument,
> of the dummy argument (on the callee side),
> and what is the argument in the call list?
> 
> 
> I'll try to construct a test case tonight then.

So the callee is actually a C function:

;; Function nf_put_vara_double_ (null)
;; enabled by -tree-original


{
  size_t B3[512];
  size_t B4[512];
  int A0;

  # DEBUG BEGIN STMT;
size_t B3[512];
  # DEBUG BEGIN STMT;
size_t B4[512];
  # DEBUG BEGIN STMT;
int A0;
  # DEBUG BEGIN STMT;
  A0 = nc_put_vara_double (*fncid, *fvarid + -1, (const size_t *) f2c_coords
(*fncid, *fvarid + -1, (const int *) A3, (size_t *) &B3), (const size_t *)
f2c_counts (*fncid, *fvarid + -1, (const int *) A4, (size_t *) &B4), A5);
  # DEBUG BEGIN STMT;
  return A0;
}

where nc_put_vara_double is defined as:

int
nc_put_vara_double(int ncid, int varid,
 const size_t *start, const size_t *edges, const double *value)
{
int status = NC_NOERR;
NC *ncp;
const NC_var *varp;
int ii;
size_t iocount;

status = NC_check_id(ncid, &ncp);
if(status != NC_NOERR)
return status;

if(NC_readonly(ncp))
return NC_EPERM;

if(NC_indef(ncp))
return NC_EINDEFINE;

varp = NC_lookupvar(ncp, varid);
if(varp == NULL)
return NC_ENOTVAR; /* TODO: lost NC_EGLOBAL */

if(varp->type == NC_CHAR)
return NC_ECHAR;

status = NCcoordck(ncp, varp, start);
if(status != NC_NOERR)
return status;
status = NCedgeck(ncp, varp, start, edges);
if(status != NC_NOERR)
return status;

if(varp->ndims == 0) /* scalar variable */
{
return( putNCv_double(ncp, varp, start, 1, value) );
}

if(IS_RECVAR(varp))
{
status = NCvnrecs(ncp, *start + *edges);
if(status != NC_NOERR)
return status;

if(varp->ndims == 1
&& ncp->recsize <= varp->len)
{
/* one dimensional && the only record variable  */
return( putNCv_double(ncp, varp, start, *edges, value)
);
}
}

/*
 * find max contiguous
 *   and accumulate max count for a single io operation
 */
ii = NCiocount(ncp, varp, edges, &iocount);

if(ii == -1)
{
return( putNCv_double(ncp, varp, start, iocount, value) );
}

assert(ii >= 0);


{ /* inline */
ALLOC_ONSTACK(coord, size_t, varp->ndims);
ALLOC_ONSTACK(upper, size_t, varp->ndims);
const size_t index = ii;

/* copy in starting indices */
(void) memcpy(coord, start, varp->ndims * sizeof(size_t));

/* set up in maximum indices */
set_upper(upper, start, edges, &upper[varp->ndims]);

/* ripple counter */
while(*coord < *upper)
{
const int lstatus = putNCv_double(ncp, varp, coord, iocount,
 value);
if(lstatus != NC_NOERR)
{
if(lstatus != NC_ERANGE)
{
status = lstatus;
/* fatal for the loop */
break;
}
/* else NC_ERANGE, not fatal for the loop */
if(status == NC_NOERR)
status = lstatus;
}
value += iocount;
odo1(start, upper, coord, &upper[index], &coord[index]);
}

FREE_ONSTACK(upper);
FREE_ONSTACK(coord);
} /* end inline */

return status;
}


that calls:

[Bug c++/90572] New: Wrong disambiguation in friend declaration as implicit typename context

2019-05-22 Thread blitzrakete at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90572

Bug ID: 90572
   Summary: Wrong disambiguation in friend declaration as implicit
typename context
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: blitzrakete at gmail dot com
  Target Milestone: ---

template <typename T> struct C {
  friend C(T::fn)();  // not implicit typename context, declarator-id of friend
  // declaration
};

Courtesy of rsmith. gcc fails to compile this with -std=c++2a, but accepts it
in C++17 mode.

:11:19: error: ISO C++ forbids declaration of 'C' with no type
[-fpermissive]

   11 |   friend C(T::fn)();  // not implicit typename context, declarator-id
of friend

  |   ^

:11:19: error: 'C' declared as function returning a function

gcc interprets this as a function taking a T::fn and returning a function,
while it should be a function returning C taking no parameters with the
(qualified) name T::fn.

[Bug fortran/90539] [10 Regression] 481.wrf slowdown by 25% on Intel Kaby with -Ofast -march=native starting with r271377

2019-05-22 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90539

--- Comment #19 from Thomas Koenig  ---
Thanks.

A bit more:

What are the declarations of the actual argument,
of the dummy argument (on the callee side),
and what is the argument in the call list?


I'll try to construct a test case tonight then.

[Bug tree-optimization/88440] size optimization of memcpy-like code

2019-05-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440

--- Comment #22 from Richard Biener  ---
The code in question was originally added with r202721 by Vlad and likely
became more costly after making the target macro a hook (no inlining
anymore).

[Bug c++/90571] New: Missed optimization opportunity when returning function pointers based on run-time boolean

2019-05-22 Thread vittorio.romeo at outlook dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90571

Bug ID: 90571
   Summary: Missed optimization opportunity when returning
function pointers based on run-time boolean
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vittorio.romeo at outlook dot com
  Target Milestone: ---

Given the following two functions:

int f() { return 0; }
int g() { return 1; }

And the following code to invoke one of them depending on a boolean `b`:

int t0(bool b) { return (b ? &f : &g)(); }
int t1(bool b) { return b ? f() : g(); }
int t2(bool b) { return b ? t0(true) : t0(false); }

Both `g++ (trunk)` and `clang++ (trunk)` with `-std=c++2a -Ofast -march=native`
fail to optimize the following code:

int main(int ac, char**) { return t0(ac & 1); }

Producing the following assembly:

> main:
>   and edi, 1
>   mov eax, OFFSET FLAT:f()
>   mov edx, OFFSET FLAT:g()
>   cmove rax, rdx
>   jmp rax
> 

Invoking `t1` or `t2` (instead of `t0`) produces the following optimized
assembly:

> main:
> mov eax, edi
> not eax
> and eax, 1
> ret

Everything can be reproduced live on **gcc.godbolt.org**:
https://godbolt.org/z/gh7270

[Bug ipa/88231] aligned functions laid down inefficiently

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|marxin at gcc dot gnu.org   |unassigned at gcc dot gnu.org

--- Comment #7 from Martin Liška  ---
(In reply to Martin Sebor from comment #5)
> The feature already exists at -Os by default (i.e., all functions are by
> default minimally aligned).  The suggestion here is only to let GCC minimize
> the amount of padding it adds to functions in order to align the explicitly
> overaligned ones that follow by changing the order it emits them in.
> 
> Outside -Os, functions would continue to be optimally aligned unless
> overridden by the attribute.  When their alignment is explicitly reduced by
> the attribute GCC could still be smart about ordering them so as to minimize
> wasted space.  Consider:
> 
>   __attribute__ ((aligned (4))) int f4 (int i) { return 2 * i; }
>   double f (double x) { return x * x * x; }
>   __attribute__ ((aligned (4))) int g4 (int i) { return i; }
> 
> for which GCC for x86_64 emits:
> 
> 0000 <f4>:                         ;; unnecessarily overaligned
>  0:   8d 04 3f                lea    (%rdi,%rdi,1),%eax
>  3:   c3                      retq
>  4:   66 90                   xchg   %ax,%ax
>  6:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>  d:   00 00 00
>
> 0010 <f>:                          ;; optimally aligned
> 10:   66 0f 28 c8             movapd %xmm0,%xmm1
> 14:   f2 0f 59 c8             mulsd  %xmm0,%xmm1
> 18:   f2 0f 59 c1             mulsd  %xmm1,%xmm0
> 1c:   c3                      retq
> 1d:   0f 1f 00                nopl   (%rax)
>
> 0020 <g4>:                         ;; also unnecessarily overaligned
> 20:   89 f8                   mov    %edi,%eax
> 22:   c3                      retq
> 
> If it laid down f first instead it would be able to avoid padding f4:
> 
> 0000 <f>:
>  0:   66 0f 28 c8             movapd %xmm0,%xmm1
>  4:   f2 0f 59 c8             mulsd  %xmm0,%xmm1
>  8:   f2 0f 59 c1             mulsd  %xmm1,%xmm0
>  c:   c3                      retq
>  d:   0f 1f 00                nopl   (%rax)
>
> 0010 <f4>:                         ;; unavoidably overaligned
> 10:   8d 04 3f                lea    (%rdi,%rdi,1),%eax
> 13:   c3                      retq
>
> 0014 <g4>:                         ;; aligned exactly as requested
> 14:   89 f8                   mov    %edi,%eax
> 16:   c3                      retq
> 

Can we do such an optimization without information from GAS about the size of
every function?

[Bug debug/86964] [7/8 Regression] Too many debug symbols included, especially for extern globals

2019-05-22 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86964

--- Comment #18 from Thomas De Schampheleire  ---
Second version of patch, fixing testsuite failures, was posted:
https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01403.html

[Bug ipa/88231] aligned functions laid down inefficiently

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88231

--- Comment #6 from Martin Liška  ---
(In reply to Andi Kleen from comment #4)
> I'm not sure it's a good idea to do this. Often the goal is not to get the
> absolute smallest code, but to get code that minimizes cache line usage.
> This is important for "frontend bound" code like gcc itself often is.
> 
> It would be rather better to use an algorithm like Pettis-Hansen or the one
> in hfsort (see
> https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf)
> to lay out the code based on expected call order to minimize footprint.
> For best results this would need profile feedback of course, but it might
> already do a reasonable job based on static call frequencies.

I'm planning to implement that for GCC 10 with LTO and PGO. So far, we've been
ordering functions with LTO by their first call. We can definitely do better.

[Bug sanitizer/90570] [9/10 Regression] AddressSanitizer: stack-use-after-scope

2019-05-22 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

Martin Liška  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                    |ASSIGNED
   Last reconfirmed|                               |2019-05-22
      Known to work|                               |8.3.0
           Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
   Target Milestone|---                            |9.2
            Summary|AddressSanitizer:              |[9/10 Regression]
                   |stack-use-after-scope          |AddressSanitizer:
                   |                               |stack-use-after-scope
     Ever confirmed|0                              |1
      Known to fail|                               |10.0, 9.1.0

--- Comment #1 from Martin Liška  ---
Let me take a look..

[Bug sanitizer/90570] New: AddressSanitizer: stack-use-after-scope

2019-05-22 Thread mtekieli at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90570

Bug ID: 90570
   Summary: AddressSanitizer: stack-use-after-scope
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mtekieli at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

root@marcin:~# cat main.cpp 
#include <vector>

struct stru
{
    std::vector<int> v{1,2,3,4};
    int i{5};
};

int main()
{
    stru s1;
    stru s2;

    return 0;
}
root@marcin:~# g++-9 -fsanitize=address main.cpp -o main  
root@marcin:~# ./main 
=
==1656==ERROR: AddressSanitizer: stack-use-after-scope on address
0x55fd2cb681c0 at pc 0x7f1c3d1a7b90 bp 0x7fff14bed7c0 sp 0x7fff14becf68

It doesn't matter if the vector is changed to a set or a map. The program runs
without errors when built with GCC 8.3 or clang 8.

[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0

2019-05-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543

--- Comment #8 from Jakub Jelinek  ---
(In reply to Jonathan Wakely from comment #6)
> Neither uintptr_t nor PRIxPTR (nor long long nor uint64_t) is part of C++98,
> which GCC still requires. I do see existing uses of intptr_t and uintptr_t
> in gcc/cp/*.c though.

For intptr_t and uintptr_t configure arranges to have those defined:
 -- Macro: AC_TYPE_UINTPTR_T
 If `stdint.h' or `inttypes.h' defines the type `uintptr_t', define
 `HAVE_UINTPTR_T'.  Otherwise, define `uintptr_t' to an unsigned
 integer type wide enough to hold a pointer, if such a type exists.
 -- Macro: AC_TYPE_INTPTR_T
 If `stdint.h' or `inttypes.h' defines the type `intptr_t', define
 `HAVE_INTPTR_T'.  Otherwise, define `intptr_t' to a signed integer
 type wide enough to hold a pointer, if such a type exists.
and all we require is that such a type exists, so hosts where pointers don't
have size of unsigned int, unsigned long or unsigned long long and don't
provide stdint.h or inttypes.h defining those are unsupported.  Are there any?

[Bug fortran/89100] Default widths for i, f and g format specifiers in format strings

2019-05-22 Thread jb at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89100

--- Comment #12 from Janne Blomqvist  ---
Author: jb
Date: Wed May 22 11:56:01 2019
New Revision: 271511

URL: https://gcc.gnu.org/viewcvs?rev=271511&root=gcc&view=rev
Log:
fortran/89100: Default widths with -fdec-format-defaults

gcc/fortran ChangeLog:

2019-05-22  Jeff Law  
Mark Eggleston  

PR fortran/89100
* gfortran.texi: Add Default widths for F, G and I format
descriptors to Extensions section.
* invoke.texi: Add -fdec-format-defaults
* io.c (check_format): Use default widths for i, f and g when
flag_dec_format_defaults is enabled.
* lang.opt: Add new option.
* options.c (set_dec_flags): Add SET_BITFLAG for
flag_dec_format_defaults.


gcc/testsuite ChangeLog:

2019-05-22  Mark Eggleston  

PR fortran/89100
* gfortran.dg/fmt_f_default_field_width_1.f90: New test.
* gfortran.dg/fmt_f_default_field_width_2.f90: New test.
* gfortran.dg/fmt_f_default_field_width_3.f90: New test.
* gfortran.dg/fmt_g_default_field_width_1.f90: New test.
* gfortran.dg/fmt_g_default_field_width_2.f90: New test.
* gfortran.dg/fmt_g_default_field_width_3.f90: New test.
* gfortran.dg/fmt_i_default_field_width_1.f90: New test.
* gfortran.dg/fmt_i_default_field_width_2.f90: New test.
* gfortran.dg/fmt_i_default_field_width_3.f90: New test.


libgfortran ChangeLog:

2019-05-22  Jeff Law  

PR fortran/89100
* io/format.c (parse_format_list): set default width when the
IOPARM_DT_DEC_EXT flag is set for i, f and g.
* io/io.h: add default_width_for_integer, default_width_for_float
and default_precision_for_float.
* io/write.c (write_boz): extra parameter giving length of data
corresponding to the type's kind.
(write_b): pass data length as extra parameter in calls to
write_boz.
(write_o): pass data length as extra parameter in calls to
write_boz.
(write_z): pass data length as extra parameter in calls to
write_boz.
(size_from_kind): also set size if default width is set.
* io/write_float.def (build_float_string): new parameter inserted
before result parameter. If default width use values passed
instead of the values in fnode.
(FORMAT_FLOAT): macro modified to check for default width and
calls to build_float_string to pass in default width.
(get_float_string): set width and precision to defaults when
needed.


Added:
trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_1.f90
trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_2.f90
trunk/gcc/testsuite/gfortran.dg/fmt_f_default_field_width_3.f90
trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_1.f90
trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_2.f90
trunk/gcc/testsuite/gfortran.dg/fmt_g_default_field_width_3.f90
trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_1.f90
trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_2.f90
trunk/gcc/testsuite/gfortran.dg/fmt_i_default_field_width_3.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/gfortran.texi
trunk/gcc/fortran/invoke.texi
trunk/gcc/fortran/io.c
trunk/gcc/fortran/lang.opt
trunk/gcc/fortran/options.c
trunk/gcc/testsuite/ChangeLog
trunk/libgfortran/ChangeLog
trunk/libgfortran/io/format.c
trunk/libgfortran/io/io.h
trunk/libgfortran/io/read.c
trunk/libgfortran/io/write.c
trunk/libgfortran/io/write_float.def

[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0

2019-05-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543

Jakub Jelinek  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
             Status|NEW                            |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
http://gcc.gnu.org/ml/gcc-patches/2019-05/msg01492.html

[Bug bootstrap/90543] Build failure on MINGW for gcc-9.1.0

2019-05-22 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90543

--- Comment #6 from Jonathan Wakely  ---
Neither uintptr_t nor PRIxPTR (nor long long nor uint64_t) is part of C++98,
which GCC still requires. I do see existing uses of intptr_t and uintptr_t in
gcc/cp/*.c though.

[Bug tree-optimization/88440] size optimization of memcpy-like code

2019-05-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440

--- Comment #21 from Richard Biener  ---
Ick.

static inline void
check_pseudos_live_through_calls (int regno,
                                  HARD_REG_SET last_call_used_reg_set,
                                  rtx_insn *call_insn)
{
...
  for (hr = 0; HARD_REGISTER_NUM_P (hr); hr++)
    if (targetm.hard_regno_call_part_clobbered (call_insn, hr,
                                                PSEUDO_REGNO_MODE (regno)))
      add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs,
                           PSEUDO_REGNO_MODE (regno), hr);

this loop is repeatedly computing an implicit hard-reg set for
which hard-regs are partly clobbered by the call for the _same_
actual instruction since check_pseudos_live_through_calls is called
via

  /* Mark each defined value as live.  We need to do this for
     unused values because they still conflict with quantities
     that are live at the time of the definition.  */
  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
    {
      if (reg->type != OP_IN)
        {
          update_pseudo_point (reg->regno, curr_point, USE_POINT);
          mark_regno_live (reg->regno, reg->biggest_mode);
          check_pseudos_live_through_calls (reg->regno,
                                            last_call_used_reg_set,
                                            call_insn);
...
    }

and

  EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, j)
    {
      IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set,
                        this_call_used_reg_set);

      if (flush)
        check_pseudos_live_through_calls (j,
                                          last_call_used_reg_set,
                                          last_call_insn);
    }

and

  /* Mark each used value as live.  */
  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
    if (reg->type != OP_OUT)
      {
        if (reg->type == OP_IN)
          update_pseudo_point (reg->regno, curr_point, USE_POINT);
        mark_regno_live (reg->regno, reg->biggest_mode);
        check_pseudos_live_through_calls (reg->regno,
                                          last_call_used_reg_set,
                                          call_insn);
      }

and

  EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, j, bi)
    {
      if (sparseset_cardinality (pseudos_live_through_calls) == 0)
        break;
      if (sparseset_bit_p (pseudos_live_through_calls, j))
        check_pseudos_live_through_calls (j, last_call_used_reg_set,
                                          call_insn);
    }

The pseudo's mode may change, but I guess it usually doesn't.  I also wonder
why the target hook doesn't return a hard-reg-set ...

That said, the above code doesn't scale well, at least for functions with a
lot of calls.  Also, the passed call_insn isn't the current insn and might
even be NULL.  No target except aarch64 even looks at the actual instruction
(all the more an argument for redesigning the hook with its use in mind).

I guess an artificial testcase with a lot of calls and a lot of live
pseudos (even single-BB) should show this issue easily.

Samples: 579  of event 'cycles:ppp', Event count (approx.): 257134187434191 
Overhead  Command  Shared Object Symbol 
  22.26%  f951 f951  [.] process_bb_lives
  15.06%  f951 f951  [.] ix86_hard_regno_call_part_clobbered
   8.55%  f951 f951  [.] concat
   6.88%  f951 f951  [.] find_base_term
   3.60%  f951 f951  [.] get_ref_base_and_extent
   3.27%  f951 f951  [.] find_base_term
   2.95%  f951 f951  [.] make_hard_regno_dead
