date:20190823

[Bug fortran/91537] New: Memory leak involving nested allocatable derived types

2019-08-23 Thread townsend at astro dot wisc.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91537

            Bug ID: 91537
           Summary: Memory leak involving nested allocatable derived types
           Product: gcc
           Version: 8.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: townsend at astro dot wisc.edu
  Target Milestone: ---

Created attachment 46748
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46748&action=edit
Leak demonstration program

The attached test program demonstrates a memory leak on gfortran 8.3.0.
Intriguingly, valgrind reports no leak; but the memory usage grows steadily
over time, even though only ALLOCATABLE arrays are used. By the end of
execution, 4GB is being used.

To demonstrate the memory growth, I use the routine system_mem_usage, which
looks at files inside /proc to find the current RSS. Of course, this only works
on Linux -- for those on other platforms, you may have to comment out (or
rework) the call.
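
For readers without the attachment, here is a minimal sketch of the kind of
/proc lookup described above, written in C rather than the Fortran of the test
program. It assumes the value of interest is the VmRSS line of
/proc/self/status; the attached system_mem_usage may well read a different
file. The function names parse_vmrss_kb and current_rss_kb are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the VmRSS value (in kB) from text laid out like /proc/self/status.
   Returns -1 if no parsable VmRSS line is found. */
long parse_vmrss_kb(const char *status_text)
{
    assert(status_text != NULL);
    const char *p = status_text;
    while (*p != '\0') {
        /* p always points at the start of a line here. */
        if (strncmp(p, "VmRSS:", 6) == 0) {
            long kb;
            return sscanf(p + 6, "%ld", &kb) == 1 ? kb : -1;
        }
        const char *nl = strchr(p, '\n');
        if (nl == NULL)
            break;
        p = nl + 1;
    }
    return -1;
}

/* Read the current process's RSS in kB (Linux only; assumes the file fits
   in the local buffer). Returns -1 on any failure. */
long current_rss_kb(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_vmrss_kb(buf);
}
```

Calling current_rss_kb() once per loop iteration and printing the result would
give output of the same shape as the listing in this report.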

Typical output is as follows (from ./test_leak_new | tail -10):

  91 3682816
  92 3722680
  93 3762808
  94 3802672
  95 3842800
  96 3882664
  97 3922792
  98 3962656
  99 4002784
 100 4042648

The first number is the iteration number, the second is the RSS. So, 4GB by the
end of execution, despite the fact that bp is explicitly deallocated at the end
of each loop.
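
As a rough check on the growth rate, the sketch below computes the total and
mean per-iteration growth from the ten samples above. It assumes the second
column is in kB, which is consistent with the 4GB figure but is not stated
explicitly; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* RSS samples copied from the output above (iterations 91 to 100). */
static const long rss_samples[] = {
    3682816, 3722680, 3762808, 3802672, 3842800,
    3882664, 3922792, 3962656, 4002784, 4042648
};

/* Total growth between the first and the last sample. */
long rss_total_growth(const long *samples, size_t count)
{
    assert(samples != NULL && count >= 2);
    return samples[count - 1] - samples[0];
}

/* Mean growth per iteration step (count - 1 steps between count samples). */
double rss_mean_growth(const long *samples, size_t count)
{
    return (double)rss_total_growth(samples, count) / (double)(count - 1);
}
```

For these samples the total is 359832 over nine steps, i.e. a little under
40000 per iteration, or roughly 39 MiB per iteration if the unit is kB.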

The test program may seem a little contrived, but it's a cut-down version of
production code (which shows the same valgrind-invisible leak behavior).

cheers,

Rich

[Bug c++/91521] [9/10 Regression] expression incorrectly evaluated as function with trailing return type

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91521

Marek Polacek changed:

           What    |Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Marek Polacek ---
Fixed.

[Bug c++/91521] [9/10 Regression] expression incorrectly evaluated as function with trailing return type

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91521

--- Comment #4 from Marek Polacek ---
Author: mpolacek
Date: Fri Aug 23 23:26:17 2019
New Revision: 274892

URL: https://gcc.gnu.org/viewcvs?rev=274892&root=gcc&view=rev
Log:
PR c++/91521 - wrong error with operator->.
* decl.c (grokdeclarator): Return error_mark_node for an invalid
trailing return type.

* g++.dg/parse/operator8.C: New test.

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/parse/operator8.C
Modified:
branches/gcc-9-branch/gcc/cp/ChangeLog
branches/gcc-9-branch/gcc/cp/decl.c

[Bug debug/91536] New: gcc generates invalid DW_OP_GNU_parameter_ref

2019-08-23 Thread robert at ocallahan dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91536

            Bug ID: 91536
           Summary: gcc generates invalid DW_OP_GNU_parameter_ref
           Product: gcc
           Version: 9.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: debug
          Assignee: unassigned at gcc dot gnu.org
          Reporter: robert at ocallahan dot org
  Target Milestone: ---

Compile the following test program with `gcc -g -O2 -o ~/tmp/test
~/tmp/test.cpp`, using `gcc (GCC) 9.1.1 20190503 (Red Hat 9.1.1-1)`:

volatile char volatile_store;
struct X {
  int field;
  X() : field(111) {}
  __attribute__((noinline))
  void method() {
    volatile_store = 2;
  }
};
int main(void) {
  X x;
  x.method();
  return 0;
}

gdb can't print `this` in `method`:

(gdb) break method
Breakpoint 1 at 0x401120: file /home/roc/tmp/test.cpp, line 7.
(gdb) run
Starting program: /home/roc/tmp/test 
Breakpoint 1, X::method (this=<optimized out>) at /home/roc/tmp/test.cpp:7
7   volatile_store = 2;

gcc has generated a location list for `this` that uses
`DW_OP_GNU_parameter_ref`:

< 2><0x0177>  DW_TAG_formal_parameter
                DW_AT_abstract_origin       <0x0127>
                DW_AT_location
                    [ 0]DW_OP_GNU_parameter_ref 0x0127
                        DW_OP_stack_value

`method`'s return address is 0x401025 in this case, which corresponds to this
`DW_TAG_GNU_call_site`:

< 2><0x010b>  DW_TAG_GNU_call_site
                DW_AT_low_pc                0x00401025
                DW_AT_abstract_origin       <0x0160>

Unfortunately this `DW_TAG_GNU_call_site` is completely useless because it
doesn't have any variables with `DW_AT_GNU_call_site_value`, so no wonder gdb
can't find the value of `this`. The subprogram at 0x160 is not helpful; that's
just the subprogram containing the `DW_TAG_formal_parameter` at 0x177.

Aside: the DWARF5 spec and the original proposal for `DW_TAG_(GNU_)call_site`
(http://www.dwarfstd.org/ShowIssue.php?issue=100909.2, I think) don't list
`DW_AT_low_pc` and `DW_AT_abstract_origin` as valid attributes, and thus don't
document what they mean here.

[Bug c++/91521] [9/10 Regression] expression incorrectly evaluated as function with trailing return type

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91521

--- Comment #3 from Marek Polacek ---
Author: mpolacek
Date: Fri Aug 23 23:24:46 2019
New Revision: 274891

URL: https://gcc.gnu.org/viewcvs?rev=274891&root=gcc&view=rev
Log:
PR c++/91521 - wrong error with operator->.
* decl.c (grokdeclarator): Return error_mark_node for an invalid
trailing return type.

* g++.dg/parse/operator8.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/parse/operator8.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/decl.c
trunk/gcc/testsuite/ChangeLog

[Bug target/91481] POWER9 "DARN" RNG intrinsic produces repeated output

2019-08-23 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91481

--- Comment #7 from Segher Boessenkool ---
Author: segher
Date: Fri Aug 23 22:19:40 2019
New Revision: 274889

URL: https://gcc.gnu.org/viewcvs?rev=274889&root=gcc&view=rev
Log:
rs6000: New darn testcase (PR91481)

We used to implement darn with unspecs, not unspec_volatiles, which
means two darn instructions could be CSEd together.

This testcase tests it by adding together four random numbers.  If all
is well that means we get four darn instructions, because such a small
loop is unrolled fine at -O2 already.  If things go bad, combine will
combine it all to one darn and a shift left by two.


gcc/testsuite/
PR target/91481
* gcc.target/powerpc/darn-3.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.target/powerpc/darn-3.c
Modified:
trunk/gcc/testsuite/ChangeLog
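
To illustrate the failure mode described in the commit log above, here is a
small portable sketch. It does not use the real darn instruction or any POWER9
builtin; fake_darn is a hypothetical stand-in (a simple counter-tracked
generator) and this is not the darn-3.c testcase. It shows only why merging
the four draws into one draw shifted left by two changes behaviour.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t fake_state = 1;   /* generator state */
static unsigned fake_calls = 0;   /* number of draws consumed so far */

/* Hypothetical stand-in for a random-number instruction: every call has a
   side effect, so two calls must never be merged into one. */
uint64_t fake_darn(void)
{
    fake_calls++;
    fake_state = fake_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return fake_state >> 33;
}

unsigned fake_darn_calls(void) { return fake_calls; }

void fake_darn_reset(void)
{
    fake_state = 1;
    fake_calls = 0;
}

/* What the source asks for: four independent draws added together. */
uint64_t sum_four_draws(void)
{
    uint64_t sum = 0;
    for (int i = 0; i < 4; i++)
        sum += fake_darn();
    return sum;
}

/* What a wrongly merged version computes: one draw, shifted left by two. */
uint64_t sum_merged(void)
{
    return fake_darn() << 2;
}

/* Usage: the two versions differ in how many draws they consume. */
void check_draw_counts(void)
{
    fake_darn_reset();
    (void)sum_four_draws();
    assert(fake_darn_calls() == 4);

    fake_darn_reset();
    (void)sum_merged();
    assert(fake_darn_calls() == 1);
}
```

The real testcase presumably checks the generated assembly for four darn
instructions; counting calls here is just a portable way to show the same
distinction.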

[Bug c++/79817] GCC does not recognize [[deprecated]] attribute for namespace

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79817

Marek Polacek changed:

           What    |Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Marek Polacek ---
Fixed in GCC 10.

[Bug c++/79817] GCC does not recognize [[deprecated]] attribute for namespace

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79817

--- Comment #5 from Marek Polacek ---
Author: mpolacek
Date: Fri Aug 23 22:04:32 2019
New Revision: 274888

URL: https://gcc.gnu.org/viewcvs?rev=274888&root=gcc&view=rev
Log:
PR c++/79817 - attribute deprecated on namespace.
* cp-tree.h (cp_warn_deprecated_use_scopes): Declare.
* decl.c (grokdeclarator): Call cp_warn_deprecated_use_scopes.
(type_is_deprecated): Likewise.
* decl2.c (cp_warn_deprecated_use_scopes): New function.
* name-lookup.c (handle_namespace_attrs): Handle attribute deprecated.
* parser.c (cp_parser_namespace_alias_definition): Call
cp_warn_deprecated_use_scopes.
(cp_parser_using_declaration): Likewise.
(cp_parser_using_directive): Likewise.
* semantics.c (finish_id_expression_1): Likewise.

* g++.dg/cpp0x/attributes-namespace1.C: New test.
* g++.dg/cpp0x/attributes-namespace2.C: New test.
* g++.dg/cpp0x/attributes-namespace3.C: New test.
* g++.dg/cpp0x/attributes-namespace4.C: New test.
* g++.dg/cpp0x/attributes-namespace5.C: New test.
* g++.dg/cpp1z/namespace-attribs.C: Adjust.
* g++.dg/cpp1z/namespace-attribs2.C: Adjust.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/attributes-namespace1.C
trunk/gcc/testsuite/g++.dg/cpp0x/attributes-namespace2.C
trunk/gcc/testsuite/g++.dg/cpp0x/attributes-namespace3.C
trunk/gcc/testsuite/g++.dg/cpp0x/attributes-namespace4.C
trunk/gcc/testsuite/g++.dg/cpp0x/attributes-namespace5.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/cp-tree.h
trunk/gcc/cp/decl.c
trunk/gcc/cp/decl2.c
trunk/gcc/cp/name-lookup.c
trunk/gcc/cp/parser.c
trunk/gcc/cp/semantics.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/cpp1z/namespace-attribs.C
trunk/gcc/testsuite/g++.dg/cpp1z/namespace-attribs2.C

[Bug middle-end/91535] New: missing warning on strchr reading from an empty constant array

2019-08-23 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91535

            Bug ID: 91535
           Summary: missing warning on strchr reading from an empty
                    constant array
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

GCC diagnoses calls to string functions like strcpy or strlen that attempt to
access an empty flexible array member of a constant object, but it doesn't
issue the same warning for calls to strchr, strrchr, strdup, and others.

The handlers of all built-ins that accept string arguments should be reviewed
to make sure they diagnose these bugs.

$ cat x.c && gcc -O2 -S -Wall x.c
const struct S { int i; char a[]; } s = { 0 };

int f0 (void)
{
  return __builtin_strlen (s.a);
}

int f1 (void)
{
  return __builtin_strcmp (s.a, "123");
}

int f2 (void)
{
  return __builtin_strcmp ("123", s.a);
}

void f3 (char *d)
{
  __builtin_strcpy (d, s.a);
}

int f4 (char *d)
{
  return 0 != __builtin_strchr (s.a, 'x');   // missing warning
}

int f5 (char *d)
{
  return 0 != __builtin_strrchr (s.a, 'x');   // missing warning
}

x.c: In function ‘f0’:
x.c:5:29: warning: offset ‘0’ outside bounds of constant string [-Warray-bounds]
    5 |   return __builtin_strlen (s.a);
      |                            ~^~
x.c:1:37: note: ‘s’ declared here
    1 | const struct S { int i; char a[]; } s = { 0 };
      |                                     ^
x.c: In function ‘f3’:
x.c:20:25: warning: offset ‘0’ outside bounds of constant string [-Warray-bounds]
   20 |   __builtin_strcpy (d, s.a);
      |                        ~^~
x.c:1:37: note: ‘s’ declared here
    1 | const struct S { int i; char a[]; } s = { 0 };
      |                                     ^
x.c: In function ‘f1’:
x.c:10:29: warning: offset ‘0’ outside bounds of constant string [-Warray-bounds]
   10 |   return __builtin_strcmp (s.a, "123");
      |                            ~^~
x.c:1:37: note: ‘s’ declared here
    1 | const struct S { int i; char a[]; } s = { 0 };
      |                                     ^
x.c: In function ‘f2’:
x.c:15:36: warning: offset ‘0’ outside bounds of constant string [-Warray-bounds]
   15 |   return __builtin_strcmp ("123", s.a);
      |                                   ~^~
x.c:1:37: note: ‘s’ declared here
    1 | const struct S { int i; char a[]; } s = { 0 };
      |                                     ^

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread danglin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #15 from John David Anglin ---
Created attachment 46747
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46747&action=edit
ld symbol resolution

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread dave.anglin at bell dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #14 from dave.anglin at bell dot net ---
On 2019-08-23 3:40 p.m., dave.anglin at bell dot net wrote:
> readelf shows:
>
> Symbol table '.symtab' contains 31 entries:
>    Num:    Value  Size Type    Bind   Vis  Ndx Name
>
>     29:  0 NOTYPE  WEAK   HIDDEN 1 
> pr41893_2.c.f7e743e4
>     30:  4 NOTYPE  WEAK   HIDDEN   UND
For the other file:

    30:  0 NOTYPE  WEAK   HIDDEN 1 pr41893_1.c.ebbf0839
    31:  4 NOTYPE  WEAK   HIDDEN   UND
    32:  8 NOTYPE  WEAK   HIDDEN   UND
    33:  4 NOTYPE  WEAK   HIDDEN   UND

[Bug target/91534] New: some defined builtins are not usable

2019-08-23 Thread pc at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91534

            Bug ID: 91534
           Summary: some defined builtins are not usable
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pc at gcc dot gnu.org
  Target Milestone: ---

On a ppc64le system, some builtins which appear to have the beginnings of
support are not usable at compilation time.

Example from gcc/config/rs6000/rs6000-builtin.def:
BU_VSX_X (XSMADDMDP,  "xsmaddmdp",  FP)

$ cat xsmaddmdp.c
#include 
#include 
double foo(double a, double b, double c) {
double d = __builtin_vsx_xsmaddmdp (2.0, 3.0, 11.0);
return d;
}

$ /opt/at12.0/bin/gcc --version
gcc (GCC) 8.2.1 20180813 (Advance-Toolchain-at12.0) [revision 263510]

$ /opt/at12.0/bin/gcc -c xsmaddmdp.c -mcpu=power9
xsmaddmdp.c: In function ‘foo’:
xsmaddmdp.c:4:13: warning: implicit declaration of function
‘__builtin_vsx_xsmaddmdp’; did you mean ‘__builtin_vsx_xvmadddp’?
[-Wimplicit-function-declaration]

Unscientifically, I took all of the __builtin_{altivec,vmx,vsx,vec} strings
from /opt/at12.0/libexec/gcc/powerpc64le-linux-gnu/8.2.1/cc1, and the following
builtins exhibit the same issue:
implicit declaration of function ‘__builtin_altivec_mask_for_store’ 
implicit declaration of function ‘__builtin_altivec_vec_init_v4si’  
implicit declaration of function ‘__builtin_altivec_vec_init_v8hi’
implicit declaration of function ‘__builtin_altivec_vec_init_v16qi’
implicit declaration of function ‘__builtin_altivec_vec_init_v4sf’  
implicit declaration of function ‘__builtin_altivec_vec_set_v4si’
implicit declaration of function ‘__builtin_altivec_vec_set_v8hi’
implicit declaration of function ‘__builtin_altivec_vec_set_v16qi’
implicit declaration of function ‘__builtin_altivec_vec_set_v4sf’
implicit declaration of function ‘__builtin_altivec_vec_ext_v4si’
implicit declaration of function ‘__builtin_altivec_vec_ext_v8hi’
implicit declaration of function ‘__builtin_altivec_vec_ext_v16qi’
implicit declaration of function ‘__builtin_altivec_vec_ext_v4sf’
implicit declaration of function ‘__builtin_vec_sldw’
implicit declaration of function ‘__builtin_vsx_lxsdx’
implicit declaration of function ‘__builtin_vsx_lxvdsx’
implicit declaration of function ‘__builtin_vsx_stxsdx’
implicit declaration of function ‘__builtin_vsx_xsabsdp’
implicit declaration of function ‘__builtin_vsx_xsadddp’
implicit declaration of function ‘__builtin_vsx_xscmpodp’
implicit declaration of function ‘__builtin_vsx_xscmpudp’
implicit declaration of function ‘__builtin_vsx_xscvdpsxds’
implicit declaration of function ‘__builtin_vsx_xscvdpsxws’
implicit declaration of function ‘__builtin_vsx_xscvdpuxds’
implicit declaration of function ‘__builtin_vsx_xscvdpuxws’
implicit declaration of function ‘__builtin_vsx_xscvsxddp’
implicit declaration of function ‘__builtin_vsx_xscvuxddp’
implicit declaration of function ‘__builtin_vsx_xsdivdp’
implicit declaration of function ‘__builtin_vsx_xsmaddadp’
implicit declaration of function ‘__builtin_vsx_xsmaddmdp’
implicit declaration of function ‘__builtin_vsx_xsmovdp’
implicit declaration of function ‘__builtin_vsx_xsmsubadp’
implicit declaration of function ‘__builtin_vsx_xsmsubmdp’
implicit declaration of function ‘__builtin_vsx_xsmuldp’
implicit declaration of function ‘__builtin_vsx_xsnabsdp’
implicit declaration of function ‘__builtin_vsx_xsnegdp’
implicit declaration of function ‘__builtin_vsx_xsnmaddadp’
implicit declaration of function ‘__builtin_vsx_xsnmaddmdp’
implicit declaration of function ‘__builtin_vsx_xsnmsubadp’
implicit declaration of function ‘__builtin_vsx_xsnmsubmdp’
implicit declaration of function ‘__builtin_vsx_xssubdp’
implicit declaration of function ‘__builtin_vsx_vec_init_v1ti’
implicit declaration of function ‘__builtin_vsx_vec_init_v2df’
implicit declaration of function ‘__builtin_vsx_vec_init_v2di’
implicit declaration of function ‘__builtin_vsx_vec_set_v1ti’
implicit declaration of function ‘__builtin_vsx_vec_set_v2df’
implicit declaration of function ‘__builtin_vsx_vec_set_v2di’
implicit declaration of function ‘__builtin_vsx_vec_ext_v1ti’
implicit declaration of function ‘__builtin_vsx_vec_ext_v2df’
implicit declaration of function ‘__builtin_vsx_vec_ext_v2di’
implicit declaration of function ‘__builtin_altivec_xst_len_r’

...that is certainly not a complete set, because it excludes all of the form
__builtin_.

[Bug target/91533] New: abs pattern generates MMX instructions but fails to call EMMS

2019-08-23 Thread kretz at kde dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91533

            Bug ID: 91533
           Summary: abs pattern generates MMX instructions but fails to
                    call EMMS
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

Test case (cf. https://godbolt.org/z/IfL1mF):

using V [[gnu::vector_size(8)]] = int;

V f(V a, long double& x) {
a = a < 0 ? -a : a;
x += 1;
return a;
}

Compile with e.g. `-O2 -march=skylake`. This generates a `PABSD mm1, mm2/m64`
instruction but never emits `EMMS`. It even interleaves the FPU instructions
with the MMX instructions. GCC 10 has a fix: it simply uses `PABSD xmm1,
xmm2/m128`.

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread dave.anglin at bell dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #13 from dave.anglin at bell dot net ---
On 2019-08-23 10:09 a.m., marxin at gcc dot gnu.org wrote:
> OK, I now understand the code in
> simple_object_elf_copy_lto_debug_sections better, and I implemented the
> suggested approach.
I still see the error:

COMPILER_PATH=/test/gnu/gcc/objdir/gcc/:/test/gnu/gcc/objdir/gcc/:/usr/ccs/bin/:/usr/ccs/bin
LIBRARY_PATH=/test/gnu/gcc/objdir/gcc/:/test/gnu/gcc/objdir/gcc/:/usr/ccs/lib/pa20_64/:/opt/langtools/lib/pa20_64/:/lib/pa20_64/:/usr/lib/pa20_64/:/usr/ccs/lib/pa20_64/:/opt/langtools/lib/pa20_64/:/lib/pa20_64/:/usr/lib/pa20_64/
COLLECT_GCC_OPTIONS='-fdiagnostics-color=never' '-c' '-fno-openmp'
'-fno-openacc' '-fPIC' '-O1' '-B' '/test/gnu/gcc/objdir/gcc/'
'-fno-diagnostics-show-caret' '-fno-diagnostics-show-line-numbers' '-gdwarf-2'
'-g1' '-fwhole-program' '-O' '-v' '-save-temps' '-dumpdir' './' '-dumpbase'
'pr41893-1.exe.ltrans0' '-fltrans' '-o' 'pr41893-1.exe.ltrans0.ltrans.o'
[Leaving LTRANS pr41893-1.exe.ltrans0.o]
ld: Unsatisfied hidden symbol "". Symbol was referenced from file
pr41893-1.o.debug.temp.o
ld: Unsatisfied hidden symbol "". Symbol was referenced from file
pr41893-1.o.debug.temp.o
ld: Unsatisfied hidden symbol "". Symbol was referenced from file
pr41893-1.o.debug.temp.o
ld: Unsatisfied hidden symbol "". Symbol was referenced from file
pr41893-2.o.debug.temp.o
4 errors.
collect2: fatal error: ld returned 1 exit status
compilation terminated.

readelf shows:

Symbol table '.symtab' contains 31 entries:
   Num:    Value  Size Type    Bind   Vis  Ndx Name

    29:  0 NOTYPE  WEAK   HIDDEN 1 pr41893_2.c.f7e743e4
    30:  4 NOTYPE  WEAK   HIDDEN   UND

[Bug c/91206] -Wformat doesn't warn for %hd with char parameter

2019-08-23 Thread ndesaulniers at google dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91206

Nick Desaulniers changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #4 from Nick Desaulniers ---
Thanks for the feedback. In https://reviews.llvm.org/rL369791, Nathan made
[unsigned] char -> [unsigned] short warn only for -Wformat-pedantic, not
-Wformat.
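
For context, the construct under discussion is a char argument passed to a
%hd conversion. The sketch below is a minimal illustration with a hypothetical
helper name; it is not the testcase from this bug. The char is promoted to int
when passed through the variadic call, and %hd converts that value to short
before printing, so the output is well defined even though the types do not
match exactly.

```c
#include <assert.h>
#include <stdio.h>

/* Format a char using the %hd conversion, the mismatch this bug is about.
   Returns the number of characters snprintf would have written. */
int format_char_as_short(char *out, size_t size, char c)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%hd", c);
}
```

Compiling this with -Wformat and again with -Wformat-pedantic (where the
compiler supports it) is a quick way to see which option, if any, diagnoses
the mismatch.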

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2019-08-23 Thread wilco at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308

Wilco changed:

           What    |Removed                     |Added

             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #68 from Wilco ---
Now also fixed when Neon is enabled (r274823, r274824, r274825). Softfp, vfp
and neon all generate similar instruction counts and stack size, all below 300
bytes with -O3.

[Bug c++/91361] Implement P1152R4: Deprecating some uses of volatile

2019-08-23 Thread mpolacek at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91361

Marek Polacek changed:

           What    |Removed                     |Added

           Keywords|                            |patch

--- Comment #3 from Marek Polacek ---
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01661.html

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread joseph at codesourcery dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

--- Comment #7 from joseph at codesourcery dot com ---
There's more or less the same ABI question as in bug 91398 about whether 
there is any constraint on the called function writing to the return value 
slot in cases where it does not return normally.

Supposing the ABI allows the return value slot (register or memory) to be 
written to by the called function even if it does not end up returning 
normally, then the optimization in this bug would be valid, while that in 
bug 91398 would not be valid if non-normal return is a possibility.  (The 
example in the present bug also doesn't allow non-normal return, unless we 
say longjmp from a SIGFPE handler is OK - is -fnon-call-exceptions only 
needed for language exceptions or also for longjmp?)

(Validity would also depend on it not affecting the observed address of 
the variable "result" in such a way as to make it equal to the observed 
address of some object in a calling function - but I expect the 
interesting cases for this optimization are where the variable is only 
stored to, not ones where addresses get compared, if it's even possible 
for the same return value slot to get used in more than one function on 
the call stack.)

[Bug middle-end/91512] [10 Regression] Fortran compile time regression.

2019-08-23 Thread skpgkp2 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91512

--- Comment #16 from Sunil Pandey ---
(In reply to rguent...@suse.de from comment #15)
> On Thu, 22 Aug 2019, skpgkp2 at gmail dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91512
> > 
> > --- Comment #14 from Sunil Pandey  ---
> > (In reply to Richard Biener from comment #7)
> > > (In reply to Sunil Pandey from comment #4)
> > > > Actually it is spec cpu 2017 521.wrf benchmark getting this problem
> > > > while compiling. Compilation taking forever, you can see while compiling
> > > > file module_first_rk_step_part1.fppized.f90 as a representative.
> > > 
> > > Note this file contains a single function which (besides USEing quite a
> > > number of modules...) has only function calls involving a lot of
> > > parameters effectively forwarding parameters from the function.  Thus
> > > 
> > > SUBROUTINE foo (psim, ..., ims, ime, jms, jme)
> > > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > > call sub1 (PSIM=psim, ...)
> > > call sub2 (PSIM=psim, ...)
> > > END SUBROUTINE
> > > 
> > > with a _lot_ of arrays being passed through.  A simple testcase like
> > > 
> > > SUBROUTINE sub1 (psim, ims, ime, jms, jme)
> > > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > > END SUBROUTINE
> > > SUBROUTINE foo (psim, ims, ime, jms, jme)
> > > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > > call sub1 (psim, ims, ime, jms, jme)
> > > END SUBROUTINE
> > > 
> > > doesn't show any extra loops generated though, so I'm not sure what to
> > > look after.
> > 
> > It seems very hard to create a small test case which reproduce the long
> > compile time problem. Unfortunately, I'm not allowed to upload spec source
> > file. Also it's very big with lots of module dependency. Assuming you have
> > spec 2017 sources,
> > 
> > Here is unmodified command line, which show compile time problem.
> > 
> > Spec build dir: 
> > ===
> > 
> > /local/skpandey/gccwork/specx5/cpu2017/benchspec/CPU/521.wrf_r/build/build_base_gcc-10.0.0-x86-64.
> > 
> > Before the commit in question:
> > ==
> > 
> > Take 41 second to compile unmodified file with -O2 -march=skylake
> > 
> > $ time
> > /local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release.a4ba5c3ec624008e899a8bcb687359db25140c23/usr/gcc-10.0.0-x86-64/bin/gfortran
> >  -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
> > -fno-unsafe-math-optimizations -mfpmath=sse -O2 -march=skylake -funroll-loops
> > -fconvert=big-endian module_first_rk_step_part1.fppized.f90
> > 
> > real    0m41.295s
> > user    0m41.031s
> > sys     0m0.204s
> > 
> > After the commit in question:
> > =
> > 
> > It take about 12 minute with -O2 -march=skylake
> > 
> > $ time
> > /local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release/usr/gcc-10.0.0-x86-64/bin/gfortran
> >  -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
> > -fno-unsafe-math-optimizations -mfpmath=sse -O2 -march=skylake -funroll-loops
> > -fconvert=big-endian module_first_rk_step_part1.fppized.f90
> > 
> > real    11m59.498s
> > user    11m53.304s
> > sys     0m4.835s
> > 
> > 
> > With higher optimization like -O3 or -Ofast, it take even longer and I have
> > to kill it.
> 
> Does it help to omit -funroll-loops?

Omitting -funroll-loops helps a bit, but not much.

$ time
/local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release/usr/gcc-10.0.0-x86-64/bin/gfortran
 -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
-fno-unsafe-math-optimizations -mfpmath=sse -O2 -march=skylake
-fconvert=big-endian module_first_rk_step_part1.fppized.f90

real    9m4.806s
user    9m2.180s
sys     0m1.620s
$ time
/local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release/usr/gcc-10.0.0-x86-64/bin/gfortran
 -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
-fno-unsafe-math-optimizations -mfpmath=sse -O3 -march=skylake
-fconvert=big-endian module_first_rk_step_part1.fppized.f90

real    18m7.810s
user    18m4.395s
sys     0m1.498s
$ time
/local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release/usr/gcc-10.0.0-x86-64/bin/gfortran
 -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
-fno-unsafe-math-optimizations -mfpmath=sse -O3 -march=skylake -funroll-loops
-fconvert=big-endian module_first_rk_step_part1.fppized.f90

real    25m47.889s
user    25m40.571s
sys     0m4.639s

[Bug tree-optimization/81810] unused strcpy to a local buffer not eliminated

2019-08-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81810

Jeffrey A. Law changed:

           What    |Removed                     |Added

             Status|NEW                         |RESOLVED
                 CC|                            |law at redhat dot com
         Resolution|---                         |DUPLICATE

--- Comment #2 from Jeffrey A. Law ---
This is really a degenerate case of 80576.

*** This bug has been marked as a duplicate of bug 80576 ***
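
As an illustration of the pattern named in this bug's title, the sketch below
copies a string into a local buffer that is never read afterwards. It is an
assumed example, not the testcase attached to bug 81810 or bug 80576, and the
function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copies src into a local buffer that is never read again, so the strcpy
   is a dead store that an optimizer could in principle remove. */
int unused_local_copy(const char *src)
{
    char buf[32];
    assert(src != NULL && strlen(src) < sizeof buf);
    strcpy(buf, src);            /* buf is not used after this point */
    return (int)strlen(src);
}
```

Inspecting the output of `gcc -O2 -S` for a function like this is one way to
check whether a given compiler version still emits the copy.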

[Bug tree-optimization/80576] dead strcpy and strncpy followed by memset not eliminated

2019-08-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80576

--- Comment #6 from Jeffrey A. Law ---
*** Bug 81810 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed

2019-08-23 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

--- Comment #23 from Martin Sebor ---
I get the same failure with -m32 -mtune=generic:

spawn -ignore SIGHUP /ssd/build/gcc-svn/gcc/testsuite/g++/../../xg++
-B/ssd/build/gcc-svn/gcc/testsuite/g++/../../
/src/gcc/svn/gcc/testsuite/g++.dg/tree-ssa/pr90883.C -m32 -mtune=generic
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -nostdinc++
-I/ssd/build/gcc-svn/x86_64-pc-linux-gnu/32/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/ssd/build/gcc-svn/x86_64-pc-linux-gnu/32/libstdc++-v3/include
-I/src/gcc/svn/libstdc++-v3/libsupc++
-I/src/gcc/svn/libstdc++-v3/include/backward
-I/src/gcc/svn/libstdc++-v3/testsuite/util -fmessage-length=0 -O2 -Os
-fdump-tree-dse-details -std=c++11 -S -o pr90883.s
PASS: g++.dg/tree-ssa/pr90883.C   (test for excess errors)
FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted redundant store:
.*.a = {}"

[Bug tree-optimization/90883] Generated code is worse if returned struct is unnamed

2019-08-23 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90883

--- Comment #22 from Jeffrey A. Law ---
The test is somewhat sensitive to target bits that select between various
strategies for implementing mem* routines.

Can you try with -mtune=generic?  If that works, I can adjust the testcase
appropriately.

[Bug libgomp/91530] Several libgomp./scan- tests FAIL without avx_runtime

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91530

--- Comment #2 from Jakub Jelinek ---
Created attachment 46746
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46746&action=edit
gcc10-pr91530.patch

Does the following patch fix it?

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

Martin Liška changed:

           What    |Removed                     |Added

           Keywords|needs-bisection             |
             Status|WAITING                     |ASSIGNED

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #12 from Martin Liška ---
Created attachment 46745
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46745&action=edit
Patch candidate

OK, I now understand the code in simple_object_elf_copy_lto_debug_sections
better, and I implemented the suggested approach.

@John: Can you please test it?

[Bug target/91518] [9/10 Regression] segfault when run CPU2006 465.tonto since r263875

2019-08-23 Thread wschmidt at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518

--- Comment #5 from Bill Schmidt ---
r8 should be the base address, for what it's worth.  For a version of GCC where
this is working, a data address is loaded there.  For the failing version, we
see a value of 1 loaded instead.

[Bug preprocessor/91517] Pragma expansion in variadic macro reorders pragmas and breaks code

2019-08-23 Thread paboyle at ph dot ed.ac.uk

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91517

--- Comment #5 from Peter Boyle ---
Hi Jakub,

The difference between these two cases (one maintaining the pragma in the right
place, the other not) suggested a viable workaround in the code.

I can eliminate the extra naked_for macro and (with some undesired code
replication) get a working solution.

However, that doesn't mean it isn't a bug, and it should of course be fixed!

Thanks for the pointer - I will make the change to the code to tolerate the
issue, because GCC is clearly an important target for us.

Best wishes,

Peter

[Bug preprocessor/91517] Pragma expansion in variadic macro reorders pragmas and breaks code

2019-08-23 Thread paboyle at ph dot ed.ac.uk

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91517

--- Comment #4 from Peter Boyle ---
Hi Jakub,

thanks for looking at this.

I'm trying to cut down a failure in a 100k-line package to the minimal thing
that I can submit.

www.github.com/paboyle/Grid

is the original package.

With -fopenmp, the following larger example still fails:

#define DO_PRAGMA_(x) _Pragma (#x)
#define DO_PRAGMA(x) DO_PRAGMA_(x)
#define thread_num(a) omp_get_thread_num()
#define thread_max(a) omp_get_max_threads()

#define naked_for(i,num,...) for ( uint64_t i=0;i
inline void blockProject(Lattice > ,
 const Lattice   ,
 const std::vector > )
{
  GridBase * fine  = fineData.Grid();
  GridBase * coarse= coarseData.Grid();
  int  _ndimension = coarse->_ndimension;

  // checks
  assert( nbasis == Basis.size() );
  subdivides(coarse,fine); 
  for(int i=0;i_rdimensions[d] / coarse->_rdimensions[d];
assert(block_r[d]*coarse->_rdimensions[d] == fine->_rdimensions[d]);
  }

  coarseData=Zero();

  auto fineData_   = fineData.View();
  auto coarseData_ = coarseData.View();
  // Loop over coarse parallel, and then loop over fine associated with coarse.
  thread_for( sf, fine->oSites(), {
int sc;
Coordinate coor_c(_ndimension);
Coordinate coor_f(_ndimension);
Lexicographic::CoorFromIndex(coor_f,sf,fine->_rdimensions);
for(int d=0;d<_ndimension;d++) coor_c[d]=coor_f[d]/block_r[d];
Lexicographic::IndexFromCoor(coor_c,sc,coarse->_rdimensions);

thread_critical {
  for(int i=0;i"
# 1 ""
# 1 "tmp.cc"
# 19 "tmp.cc"
template
inline void blockProject(Lattice > ,
const Lattice ,
const std::vector > )
{
  GridBase * fine = fineData.Grid();
  GridBase * coarse= coarseData.Grid();
  int _ndimension = coarse->_ndimension;

  assert( nbasis == Basis.size() );
  subdivides(coarse,fine);
  for(int i=0;i_rdimensions[d] / coarse->_rdimensions[d];
assert(block_r[d]*coarse->_rdimensions[d] == fine->_rdimensions[d]);
  }

  coarseData=Zero();

  auto fineData_ = fineData.View();
  auto coarseData_ = coarseData.View();


# 61 "tmp.cc"

# 61 "tmp.cc"
#pragma omp parallel for schedule(static)
# 47 "tmp.cc"
# 61 "tmp.cc"

# 61 "tmp.cc"
#pragma omp critical
# 55 "tmp.cc"
# 47 "tmp.cc"
  for ( uint64_t sf=0;sfoSites();sf++) { {{ int sc; Coordinate
coor_c(_ndimension); Coordinate coor_f(_ndimension);
Lexicographic::CoorFromIndex(coor_f,sf,fine->_rdimensions); for(int
d=0;d<_ndimension;d++) coor_c[d]=coor_f[d]/block_r[d];
Lexicographic::IndexFromCoor(coor_c,sc,coarse->_rdimensions); { for(int
i=0;i

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread rguenther at suse dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #11 from rguenther at suse dot de  ---
On Fri, 23 Aug 2019, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478
> 
> --- Comment #10 from Martin Liška  ---
> (In reply to Richard Biener from comment #9)
> > Yeah, we went through this back in time when I struggled to find a solution
> > working in all environments we support (HP ld, Solaris ld, AIX ld).
> 
> Hm, does it mean I'll have to revert all the removal of gnu_lto_v1?

Well, at least I doubt we can add a weak def of "".  As I said
repeatedly the other option is to really remove symbols but that
entails rewriting all relocation sections (ick).  Previously we've
had libgcc "provide" the __gnu_lto_v1 symbol (that was a hack, but
it worked...).

I guess we might want to at least _try_ doing the right thing
and remove the symbols for real...

I guess rematerializing gnu_lto_v1 just for the sake of removed
symbols would be odd.  I wonder what happens when we instead
of aliasing the removed to UNDEF gnu_lto_v1 (or "" as now)
use a random symbol that prevails... (we should have at least
one for the debuginfo entry).  Then we'd have

19:  0 NOTYPE  WEAK   HIDDEN 4 t.c.61d57031
20:  0 NOTYPE  LOCAL  DEFAULT  UND t.c.61d57031

for example.  Of course we then need to figure which linkers
are happy with that and which not...  And we need to do two passes
over the symtab as we need to find a prevailing symbol to use.
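
A rough sketch of the two-pass idea, using a made-up and heavily simplified
symbol record (the sym struct and redirect_removed are hypothetical, not
simple-object code or the ELF layout):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical symbol record for illustration only.
struct sym {
  bool removed;          // symbol is dropped from the output
  std::ptrdiff_t alias;  // index of the symbol it is redirected to, or -1
};

// Pass 1 finds the first surviving symbol; pass 2 points every removed
// symbol at it.  Returns false if no symbol prevails.
inline bool redirect_removed(std::vector<sym>& tab)
{
  std::ptrdiff_t prevailing = -1;
  for (std::size_t i = 0; i < tab.size(); ++i)
    if (!tab[i].removed)
      {
        prevailing = static_cast<std::ptrdiff_t>(i);
        break;
      }
  if (prevailing < 0)
    return false;
  assert(!tab[static_cast<std::size_t>(prevailing)].removed);
  for (sym& s : tab)
    if (s.removed)
      s.alias = prevailing;
  return true;
}
```

Whether a given linker accepts the resulting symbol table is exactly the open
question raised above; the sketch only shows why a single pass is not enough.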

[Bug libstdc++/91486] future::wait_for and shared_timed_mutex::wait_for do not work properly with float duration

2019-08-23 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486

--- Comment #6 from Jonathan Wakely  ---
Ah yes, of course. Thanks!

[Bug pch/61250] Random pch failures with -save-temps on x86_64-apple-darwin1(3-8).

2019-08-23 Thread iains at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61250

--- Comment #26 from Iain Sandoe  ---
so, should be fixed on trunk, so far.

[Bug tree-optimization/91532] New: [SVE] Redundant predicated store in gcc.target/aarch64/fmla_2.c

2019-08-23 Thread rsandifo at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91532

Bug ID: 91532
   Summary: [SVE] Redundant predicated store in
gcc.target/aarch64/fmla_2.c
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

In gcc.target/aarch64/fmla_2.c, we end up with two stores to the first array
after if-conversion:

  _ifc__59 = *_55;
  _ifc__61 = _4 != 0 ? iftmp.0_31 : _ifc__59;
  *_55 = _ifc__61;
  iftmp.1_35 = __builtin_fma (_6, pretmp_53, pretmp_54);
  _ifc__64 = _4 == 0 ? pretmp_53 : _ifc__61;
  *_55 = _ifc__64;

instead of:

  iftmp.1_35 = __builtin_fma (_6, pretmp_53, pretmp_54);
  _ifc__64 = _4 == 0 ? pretmp_53 : iftmp.0_31;
  *_55 = _ifc__64;

We never recover from this and end up with the two stores to *_55 in the
output:

st1d    z2.d, p0, [x0, x6, lsl 3]
...
st1d    z0.d, p0, [x0, x6, lsl 3]
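
The equivalence of the two GIMPLE sequences above can be checked with a small
scalar model.  This is only a sketch: the function and parameter names are
made up, and cond stands for _4 != 0.

```cpp
#include <cassert>

// Two dependent stores, as in the first sequence: the second select reads
// the value produced for the first store.
inline double two_stores(bool cond, double old_val, double iftmp0,
                         double pretmp53)
{
  double mem = old_val;                       // _ifc__59 = *_55
  double ifc61 = cond ? iftmp0 : mem;         // _4 != 0 ? iftmp.0_31 : _ifc__59
  mem = ifc61;                                // *_55 = _ifc__61
  double ifc64 = !cond ? pretmp53 : ifc61;    // _4 == 0 ? pretmp_53 : _ifc__61
  mem = ifc64;                                // *_55 = _ifc__64
  return mem;
}

// Single store, as in the second sequence: the old memory value is never
// needed because one of the two selects always overrides it.
inline double one_store(bool cond, double iftmp0, double pretmp53)
{
  return !cond ? pretmp53 : iftmp0;
}
```

For either value of cond the final contents of *_55 are the same, so the first
load and store are redundant.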

[Bug pch/61250] Random pch failures with -save-temps on x86_64-apple-darwin1(3-8).

2019-08-23 Thread iains at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61250

--- Comment #25 from Iain Sandoe  ---
Author: iains
Date: Fri Aug 23 12:41:39 2019
New Revision: 274856

URL: https://gcc.gnu.org/viewcvs?rev=274856&root=gcc&view=rev
Log:
[PATCH, c-family] Fix a PCH thinko (and thus PR61250).

When we are parsing a source file, the very first token might
be a PRAGMA_GCC_PCH_PREPROCESS.  This indicates that we are going
to read in a PCH file (named as the value of the pragma).  If we don't
see this pragma, then we know that it's OK to release any resources
that the host might have set aside for the PCH file.

This fixes a thinko in the current implementation, in that the decision
to release resources was happening unconditionally right after the first
token is extracted but before it's been checked or acted upon.

This leads to the pch bug (seen on Darwin), because we actually do release
resources - which are subsequently (reasonably) assumed to be available
when reading a PCH file.  We then get random crashes or hangs depending
on the interaction between munmap and malloc.

The bug is present everywhere but doesn't show on (say) Linux, since
the release of PCH resources is a NOP there.

This affects all the c-family front ends, because they all use
c_lex_with_flags () to implement this.

The solution is to check for the PRAGMA_GCC_PCH_PREPROCESS and only call
c_common_no_more_pch () when that is not the first token.
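
The ordering problem can be modelled in a few lines.  This is a toy sketch
only: first_token, pch_resources and the two start_parse functions are
invented for illustration and are not the GCC sources.

```cpp
#include <cassert>

enum class first_token { pch_preprocess_pragma, anything_else };

// Hypothetical model of the host resources set aside for reading a PCH.
struct pch_resources {
  bool reserved = true;
  void release() { reserved = false; }
};

// Order described as buggy: the release happens before the token is
// examined, so a later PCH read finds the reservation already gone.
inline bool start_parse_buggy(first_token t, pch_resources& r)
{
  r.release();
  if (t == first_token::pch_preprocess_pragma)
    return r.reserved;   // the PCH read needs the reservation: fails
  return true;
}

// Order described as the fix: examine the token first and release only
// when no PCH will be read.
inline bool start_parse_fixed(first_token t, pch_resources& r)
{
  if (t == first_token::pch_preprocess_pragma)
    return r.reserved;   // reservation still intact
  r.release();
  return true;
}
```

On a host where release() does nothing, both orders behave the same, which
matches the observation that the bug does not show on Linux.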

A secondary effect of the collection is that the name of the PCH file
can be collected during the ggc_pch_read() reset of state.  Therefore
we should issue any diagnostic that might name the file before the
collections are triggered.


gcc/c-family/

2019-08-23  Iain Sandoe  

PR pch/61250
* c-lex.c (c_lex_with_flags):  Don't call
c_common_no_more_pch () from here.

gcc/c/

2019-08-23  Iain Sandoe  

PR pch/61250
* c-parser.c (c_parse_file): Call c_common_no_more_pch ()
after determining that the first token is not
PRAGMA_GCC_PCH_PREPROCESS.

gcc/cp/

2019-08-23  Iain Sandoe  

PR pch/61250
* parser.c (cp_parser_initial_pragma): Call c_common_no_more_pch ()
after determining that the first token is not
PRAGMA_GCC_PCH_PREPROCESS.

gcc/

2019-08-23  Iain Sandoe  

PR pch/61250
* ggc-page.c (ggc_pch_read): Read the ggc_pch_ondisk structure
and issue any diagnostics needed before collecting the pre-PCH
state.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-lex.c
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-parser.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/parser.c
trunk/gcc/ggc-page.c

[Bug libstdc++/91486] future::wait_for and shared_timed_mutex::wait_for do not work properly with float duration

2019-08-23 Thread john.salmon at deshaw dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91486

--- Comment #5 from John Salmon  ---
C++17 already has the needed helper function:  ceil(duration).  

So just change all instances of:

 __clock_t::now() + __reltime

to

using __dur = typename __clock_t::duration;
__clock_t::now() + __chrono_detail::__ceil<__dur>(__reltime)

and make the C++17 implementation of ceil(duration) visible in all versions as
__chrono_detail::__ceil.
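
To illustrate why rounding up matters, here is a small sketch using the public
C++17 std::chrono::ceil rather than any libstdc++ internals.  The ms_clock
type and both helper names are assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical clock whose native duration is whole milliseconds.
struct ms_clock {
  using duration = std::chrono::milliseconds;
};

// Truncating conversion: a fractional timeout is shortened, so a wait
// based on it could return before the requested time has elapsed.
template <typename Clock, typename Rep, typename Period>
typename Clock::duration
to_clock_duration_trunc(const std::chrono::duration<Rep, Period>& rel)
{
  return std::chrono::duration_cast<typename Clock::duration>(rel);
}

// Rounding up: the converted timeout is never shorter than the request.
template <typename Clock, typename Rep, typename Period>
typename Clock::duration
to_clock_duration_ceil(const std::chrono::duration<Rep, Period>& rel)
{
  return std::chrono::ceil<typename Clock::duration>(rel);
}
```

With a 1.5 ms floating-point timeout, the truncating version yields 1 ms and
the rounding version yields 2 ms.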

[Bug libstdc++/91531] New: _Rb_tree's copy assignment should respect to POCCA regardless of is_always_equal

2019-08-23 Thread frankhb1989 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91531

Bug ID: 91531
   Summary: _Rb_tree's copy assignment should respect to POCCA
regardless of is_always_equal
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: frankhb1989 at gmail dot com
  Target Milestone: ---

In :

  template
_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>&
_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
operator=(const _Rb_tree& __x)
{
  if (this != &__x)
{
  // Note that _Key may be a constant type.
#if __cplusplus >= 201103L
  if (_Alloc_traits::_S_propagate_on_copy_assign())
{
  auto& __this_alloc = this->_M_get_Node_allocator();
  auto& __that_alloc = __x._M_get_Node_allocator();
  if (!_Alloc_traits::_S_always_equal()
  && __this_alloc != __that_alloc)
{
  // Replacement allocator cannot free existing storage, we need
  // to erase nodes first.
  clear();
  std::__alloc_on_copy(__this_alloc, __that_alloc);
}
}
#endif

  _Reuse_or_alloc_node __roan(*this);
  _M_impl._M_reset();
  _M_impl._M_key_compare = __x._M_impl._M_key_compare;
  if (__x._M_root() != 0)
_M_root() = _M_copy(__x, __roan);
}

  return *this;
}

Because `std::__alloc_on_copy` is called only when
`!_Alloc_traits::_S_always_equal() && __this_alloc != __that_alloc`, a POCCA
allocator will not be propagated if it is always equal. This is also
inconsistent with all the sequence and unordered associative standard
allocator-aware containers implemented in libstdc++.
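
A minimal sketch of the behaviour being asked for, with the propagation moved
out of the inequality check.  The tagged_alloc allocator and the holder class
are hypothetical stand-ins, not libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical allocator: always compares equal, yet carries a tag and
// requests propagation on container copy assignment (POCCA).
template <typename T>
struct tagged_alloc {
  using value_type = T;
  using propagate_on_container_copy_assignment = std::true_type;
  using is_always_equal = std::true_type;

  int tag = 0;

  tagged_alloc() = default;
  explicit tagged_alloc(int t) : tag(t) {}
  template <typename U>
  tagged_alloc(const tagged_alloc<U>& o) : tag(o.tag) {}

  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

  friend bool operator==(const tagged_alloc&, const tagged_alloc&) { return true; }
  friend bool operator!=(const tagged_alloc&, const tagged_alloc&) { return false; }
};

// Hypothetical stand-in for a container's allocator handling in operator=.
template <typename Alloc>
struct holder {
  Alloc alloc;
  bool cleared = false;   // records whether existing nodes had to be erased

  explicit holder(const Alloc& a) : alloc(a) {}

  holder& operator=(const holder& x)
  {
    using traits = std::allocator_traits<Alloc>;
    if (this != &x && traits::propagate_on_container_copy_assignment::value)
      {
        // Erase first only when the replacement allocator could not free
        // the existing storage.
        if (!traits::is_always_equal::value && alloc != x.alloc)
          cleared = true;   // a real container would call clear() here
        alloc = x.alloc;    // propagate in every POCCA case
      }
    return *this;
  }
};
```

After assigning a holder tagged 2 to a holder tagged 1, the target carries
tag 2 without having been cleared.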

[Bug libgomp/91530] Several libgomp./scan- tests FAIL without avx_runtime

2019-08-23 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91530

--- Comment #1 from Rainer Orth  ---
Forgot to mention: there are many other reports of the same failures on
gcc-testresults for all sorts of different x86 targets.

[Bug libgomp/91530] New: Several libgomp./scan- tests FAIL without avx_runtime

2019-08-23 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91530

Bug ID: 91530
   Summary: Several libgomp.*/scan-* tests FAIL without
avx_runtime
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
Target: i?86-*-*, x86_64-*-*

I just noticed that several libgomp.*/scan-* tests FAIL or are UNRESOLVED,
e.g. on i386-pc-solaris2.11:

* 32-bit:

+UNRESOLVED: libgomp.c++/scan-10.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-11.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-12.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-13.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-14.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-15.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-16.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c++/scan-9.C scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-11.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-12.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-13.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-14.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-15.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-16.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-17.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-18.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-19.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+UNRESOLVED: libgomp.c/scan-20.c scan-tree-dump-times vect "vectorized [2-6] loops" 2

  They all fail like this:

libgomp.c++/scan-10.C: dump file does not exist

  while with -mavx the dump is created.

* 64-bit:

+FAIL: libgomp.c/scan-13.c scan-tree-dump-times vect "vectorized [2-6] loops" 2
+FAIL: libgomp.c/scan-17.c scan-tree-dump-times vect "vectorized [2-6] loops" 2

  Similarly, they fail like

libgomp.c/scan-13.c: pattern found 0 times

  while adding -mavx fixes the failure.

[Bug libgomp/91530] Several libgomp./scan- tests FAIL without avx_runtime

2019-08-23 Thread ro at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91530

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |10.0

[Bug middle-end/91283] [10 regression] gcc.dg/torture/c99-contract-1.c FAILs

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91283

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jakub Jelinek  ---
Fixed now.

[Bug other/91511] documentation of the effect of #pragma omp simd

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91511

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
The omp simd pragma is documented in the OpenMP standard.  It doesn't permit the
compiler to do optimizations that affect floating point precision, such as
using fma, so for that you need some other option like -ffast-math or -Ofast or
the suboptions those enable.
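
As a small illustration of why fma counts as affecting precision, the sketch
below compares a product that is rounded before the addition with the fused
form from the standard library.  The function names and the chosen input
values are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Multiply, round the product to double, then add.  The volatile store
// forces the intermediate rounding and prevents contraction into an fma.
inline double unfused(double a, double c)
{
  volatile double p = a * a;
  return p + c;
}

// Fused multiply-add: a*a + c is computed with a single rounding.
inline double fused(double a, double c)
{
  return std::fma(a, a, c);
}
```

With a = 1 + 2^-27 the exact square is 1 + 2^-26 + 2^-54, and the last term
does not survive rounding to double.  Adding c = -(1 + 2^-26) therefore gives
0 in the unfused form and 2^-54 in the fused form.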

[Bug preprocessor/91517] Pragma expansion in variadic macro reorders pragmas and breaks code

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91517

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
It works properly if you use -fopenmp during preprocessing (or compilation).
Without -fopenmp, the pragmas aren't recognized.
In your use case, are you preprocessing separately without -fopenmp and then
compiling with -fopenmp?  If so, why?

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #10 from Martin Liška  ---
(In reply to Richard Biener from comment #9)
> Yeah, we went through this back in time when I struggled to find a solution
> working in all environments we support (HP ld, Solaris ld, AIX ld).

Hm, does it mean I'll have to revert all the removal of gnu_lto_v1?

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #9 from Richard Biener  ---
Yeah, we went through this back in time when I struggled to find a solution
working in all environments we support (HP ld, Solaris ld, AIX ld).

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread dave.anglin at bell dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #8 from dave.anglin at bell dot net ---
On 2019-08-23 7:20 a.m., marxin at gcc dot gnu.org wrote:
> Which GCC version are you testing? Do you have the following trunk commit:
I am testing trunk.  The error in Comment #1 was for r274539.
>
> Fix off-by-one in simple-object-elf.c (PR lto/91228).
>
> 2019-07-24  Martin Liska  
>
> PR lto/91228
> * simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
> Find first '\0' starting from gnu_lto + 1.
>
>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@273757
> 138bc75d-0d04-0410-961f-82ee72b054a4
The symbols shown by nm in Comment #5 compared r273661 and r273662.  Thus, the
symbols for r273662 are affected by the off-by-one bug.

In order to do regression search, I also had to apply this patch:

Index: lto-wrapper.c
===
--- lto-wrapper.c   (revision 274037)
+++ lto-wrapper.c   (working copy)
@@ -1112,7 +1112,7 @@

 /* Number of CPUs that can be used for parallel LTRANS phase.  */

-static unsigned long nthreads_var = 0;
+static unsigned long nthreads_var = 1;

 #ifdef HAVE_PTHREAD_AFFINITY_NP
 unsigned long cpuset_size;

This is because make objects to "-j0".

64-bit HP ld issues errors or warnings about unsats depending on the
+[no]allowunsats option.  It doesn't help to allow unsats as the dynamic linker
will object to unsats in an executable when it is run.  So, the symbols that
used to turn into gnu_lto_v1 need to turn into a common or defined weak symbol
on this target.

[Bug c++/91525] ICE (Segmentation Fault) on a bool conversion operator with concepts

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91525

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-23
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
I can confirm that, for the provided snippet, all releases fail for me.

[Bug ipa/91508] [9 Regression] Segfault due to referencing removed cgraph_node

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91508

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Martin Liška  ---
Fixed on gcc-9 branch.

[Bug ipa/91508] [9 Regression] Segfault due to referencing removed cgraph_node

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91508

--- Comment #5 from Martin Liška  ---
Author: marxin
Date: Fri Aug 23 11:42:19 2019
New Revision: 274853

URL: https://gcc.gnu.org/viewcvs?rev=274853&root=gcc&view=rev
Log:
Backport r274504

2019-08-23  Martin Liska  

PR ipa/91508
Backport from mainline
2019-08-15  Martin Liska  

PR ipa/91438
* cgraph.c (cgraph_node::remove): When setting
n->origin = NULL for all nested functions, reset
also next_nested.

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/cgraph.c
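
The cgraph.c change in the log above can be pictured with a stripped-down
node type.  This is a sketch under assumptions: only the field names origin,
nested and next_nested come from the ChangeLog entries; the struct and both
functions are invented for illustration.

```cpp
#include <cassert>

// Hypothetical, minimal stand-in for a call-graph node.
struct node {
  node* origin = nullptr;       // enclosing function, if any
  node* nested = nullptr;       // first function nested in this one
  node* next_nested = nullptr;  // next sibling nested in the same origin
};

// Link child as a function nested in parent.
inline void add_nested(node& parent, node& child)
{
  assert(child.origin == nullptr);
  child.origin = &parent;
  child.next_nested = parent.nested;
  parent.nested = &child;
}

// Detach all nested functions before parent goes away.  Clearing only
// origin would leave stale sibling links behind, so next_nested is reset
// as well.
inline void detach_nested(node& parent)
{
  node* n = parent.nested;
  while (n != nullptr)
    {
      node* next = n->next_nested;
      n->origin = nullptr;
      n->next_nested = nullptr;
      n = next;
    }
  parent.nested = nullptr;
}
```

After detach_nested, no former child points at the removed parent or at a
former sibling.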

[Bug ipa/91438] [10 Regression] Profiledbootstrap broken on i586 in Ada

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91438

--- Comment #10 from Martin Liška  ---
Author: marxin
Date: Fri Aug 23 11:42:19 2019
New Revision: 274853

URL: https://gcc.gnu.org/viewcvs?rev=274853&root=gcc&view=rev
Log:
Backport r274504

2019-08-23  Martin Liska  

PR ipa/91508
Backport from mainline
2019-08-15  Martin Liska  

PR ipa/91438
* cgraph.c (cgraph_node::remove): When setting
n->origin = NULL for all nested functions, reset
also next_nested.

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/cgraph.c

[Bug ipa/91404] [10 Regression] ICE in gt_ggc_mx_symtab_node at gcc/gtype-desc.c:1302

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91404

--- Comment #8 from Martin Liška  ---
Author: marxin
Date: Fri Aug 23 11:41:16 2019
New Revision: 274851

URL: https://gcc.gnu.org/viewcvs?rev=274851&root=gcc&view=rev
Log:
Backport r274502

2019-08-23  Martin Liska  

Backport from mainline
2019-08-15  Martin Liska  

PR ipa/91404
* passes.c (order): Remove.
(uid_hash_t): Likewise.
(remove_cgraph_node_from_order): Remove from set
of pointers (cgraph_node *).
(insert_cgraph_node_to_order): New.
(duplicate_cgraph_node_to_order): New.
(do_per_function_toporder): Register all 3 cgraph hooks.
Skip removed_nodes now as we know about all of them.

Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/passes.c

[Bug middle-end/91283] [10 regression] gcc.dg/torture/c99-contract-1.c FAILs

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91283

--- Comment #3 from Jakub Jelinek  ---
Author: jakub
Date: Fri Aug 23 11:37:29 2019
New Revision: 274850

URL: https://gcc.gnu.org/viewcvs?rev=274850&root=gcc&view=rev
Log:
PR middle-end/91283
* common.opt (fexcess-precision=): Add Optimization flag.  Use
flag_excess_precision variable instead of
flag_excess_precision_cmdline.
* flags.h (class target_flag_state): Remove x_flag_excess_precision
member.
(flag_excess_precision): Don't define.
* langhooks.c (lhd_post_options): Set flag_excess_precision instead of
flag_excess_precision_cmdline.  Remove comment.
* opts.c (set_fast_math_flags): Use frontend_set_flag_excess_precision
and x_flag_excess_precision instead of
frontend_set_flag_excess_precision_cmdline and
x_flag_excess_precision_cmdline.
(fast_math_flags_set_p): Use x_flag_excess_precision instead of
x_flag_excess_precision_cmdline.
* toplev.c (init_excess_precision): Remove.
(lang_dependent_init_target): Don't call it.
ada/
* gcc-interface/misc.c (gnat_post_options): Set flag_excess_precision
instead of flag_excess_precision_cmdline.
brig/
* brig-lang.c (brig_langhook_post_options): Set flag_excess_precision
instead of flag_excess_precision_cmdline.
c-family/
* c-common.c (c_ts18661_flt_eval_method): Use flag_excess_precision
instead of flag_excess_precision_cmdline.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): Likewise.
* c-opts.c (c_common_post_options): Likewise.
d/
* d-lang.cc (d_post_options): Set flag_excess_precision instead of
flag_excess_precision_cmdline.
fortran/
* options.c (gfc_post_options): Set flag_excess_precision instead of
flag_excess_precision_cmdline.  Remove comment.
go/
* go-lang.c (go_langhook_post_options): Set flag_excess_precision
instead of flag_excess_precision_cmdline.
lto/
* lto-lang.c (lto_post_options): Set flag_excess_precision instead of
flag_excess_precision_cmdline.  Remove comment.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ada/ChangeLog
trunk/gcc/ada/gcc-interface/misc.c
trunk/gcc/brig/ChangeLog
trunk/gcc/brig/brig-lang.c
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.c
trunk/gcc/c-family/c-cppbuiltin.c
trunk/gcc/c-family/c-opts.c
trunk/gcc/common.opt
trunk/gcc/d/ChangeLog
trunk/gcc/d/d-lang.cc
trunk/gcc/flags.h
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/options.c
trunk/gcc/go/ChangeLog
trunk/gcc/go/go-lang.c
trunk/gcc/langhooks.c
trunk/gcc/lto/ChangeLog
trunk/gcc/lto/lto-lang.c
trunk/gcc/opts.c
trunk/gcc/toplev.c

[Bug ipa/91508] [9 Regression] Segfault due to referencing removed cgraph_node

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91508

--- Comment #4 from Richard Biener  ---
Yes.

[Bug libstdc++/91480] Nonconforming definitions of standard library feature-test macros

2019-08-23 Thread frankhb1989 at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91480

--- Comment #4 from frankhb1989 at gmail dot com ---
(In reply to Jonathan Wakely from comment #3)
> (In reply to frankhb1989 from comment #0)
> > Also, in , `__cpp_lib_allocator_traits_is_always_equal` is
> > wrongly spelled as `__cpp_lib_allocator_is_always_equal`.
> 
> This is incorrect. We *also* define
> __cpp_lib_allocator_traits_is_always_equal, in the appropriate places. So we
> have an extra, non-standard macro. We don't spell the standard one wrong.
> 

OK, I see N4258 proposes changes both to [allocator.traits] and
[default.allocator]. The macro `__cpp_lib_allocator_is_always_equal` is likely
only for the latter and it's in the right header. Not a bug.

The issue remains for 'L'.

[Bug c++/91529] [8/9/10 Regression] -fmerge-all-constants leads to corrupt output without inlining

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91529

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-23
 CC||jason at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
  Known to work||7.4.0
   Target Milestone|--- |8.4
Summary|-fmerge-all-constants leads |[8/9/10 Regression]
   |to corrupt output without   |-fmerge-all-constants leads
   |inlining|to corrupt output without
   ||inlining
 Ever confirmed|0   |1
  Known to fail||10.0, 8.3.0, 9.2.0

--- Comment #1 from Martin Liška  ---
Confirmed, started with r258755.

[Bug c++/91529] New: -fmerge-all-constants leads to corrupt output without inlining

2019-08-23 Thread fiesh at zefix dot tv

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91529

Bug ID: 91529
   Summary: -fmerge-all-constants leads to corrupt output without
inlining
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fiesh at zefix dot tv
  Target Milestone: ---

(Filed this as C++ but don't think that's the right component.)

The following code compiles to a program that behaves as follows:

* Compiled with "g++ -std=c++17 -fmerge-all-constants": segfault
* Compiled with "g++ -std=c++17 -fmerge-all-constants -O1 -fno-inline":
segfault
* Compiled with "g++ -std=c++17 -fmerge-all-constants -O1": success
* Compiled with "g++ -std=c++17": success

It seems that for gcc-7, the program always succeeds, but for gcc-8, gcc-9, and
trunk, this behavior shows up.  It was reduced from what I believe is valid
code using std::variant.

template 
struct e
{
};
template 
e h;
template 
class ad;
long bi;
struct i
{
static constexpr bool bm = 0;
};
template 
union ap {
};
template 
union ap {
constexpr ap(e<0>) : aq() {}
template 
constexpr ap(e) : bt(h)
{
}
ag aq;
ap bt;
};
template 
struct as;
template 
struct as
{
template 
constexpr as(e) : av(h), aw(af)
{
}
void j() { aw = bi; }
~as() { j(); }
ap av;
int aw;
};
template 
using az = as;
template 
struct k : az
{
using bb = az;
bb::bb;
};
template 
using ce = k<0, bv...>;
template 
struct m : ce
{
using bb = ce;
bb::bb;
};
template 
using be = m<0, bv...>;
template 
struct p : be
{
using bb = be;
bb::bb;
};
template 
using bg = p<0, bv...>;
template 
struct q : bg
{
using bb = bg;
bb::bb;
};
template 
using ck = q<0, bv...>;
template 
struct r : ck
{
using bb = ck;
template 
constexpr r(e s) : bb(s)
{
}
};
template 
struct l;
template 
struct l>
{
static constexpr long c = 1;
};
template 
class ad : r
{
using bb = r;
template 
static constexpr long l = l::c;

public:
template 
constexpr ad(g) : ad(h>)
{
}
template 
constexpr ad(e) : bb(h)
{
}
};
template 
struct n
{
double d = 1.;
};
using ch = ad>;
main()
{
ch const o{n<0>()};
}

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

--- Comment #7 from Martin Liška  ---
@John:

Which GCC version are you testing? Do you have the following trunk commit:

Fix off-by-one in simple-object-elf.c (PR lto/91228).

2019-07-24  Martin Liska  

PR lto/91228
* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
Find first '\0' starting from gnu_lto + 1.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@273757
138bc75d-0d04-0410-961f-82ee72b054a4

?

[Bug libstdc++/91067] [9/10 Regression] Clang compiler can't link executable if std::filesystem::directory_iterator is encountered

2019-08-23 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91067

Jonathan Wakely  changed:

   What|Removed |Added

 CC||rafael at espindo dot la

--- Comment #18 from Jonathan Wakely  ---
*** Bug 91516 has been marked as a duplicate of this bug. ***

[Bug libstdc++/91516] Please also export the base object constructor for __shared_ptr;

2019-08-23 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91516

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Jonathan Wakely  ---
.

*** This bug has been marked as a duplicate of bug 91067 ***

[Bug ipa/91508] [9 Regression] Segfault due to referencing removed cgraph_node

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91508

--- Comment #3 from Martin Liška  ---
@Richard:

For the proper fix, I would like to backport the following 3 commits from trunk:

commit a413f183a85bc9a08e3dcd9e9d617086fce86460 (HEAD -> backport-9-v6,
origin/backport-9-v6)
Author: marxin 
Date:   Thu Aug 15 06:58:36 2019 +

Backport r274504

gcc/ChangeLog:

2019-08-15  Martin Liska  

PR ipa/91438
* cgraph.c (cgraph_node::remove): When setting
n->origin = NULL for all nested functions, reset
also next_nested.

commit 7fad5cd74a282bc49b14c4d9a5a95b3d1a212394
Author: marxin 
Date:   Thu Aug 15 06:58:26 2019 +

Backport r274503

gcc/ChangeLog:

2019-08-15  Martin Liska  

* cgraph.c (cgraph_node::verify_node): Verify origin, nested
and next_nested.

commit ebcb363be811c20d678dc7b985e68ca86afe4707
Author: marxin 
Date:   Thu Aug 15 06:58:09 2019 +

Backport r274502

gcc/ChangeLog:

2019-08-15  Martin Liska  

PR ipa/91404
* passes.c (order): Remove.
(uid_hash_t): Likewise.
(remove_cgraph_node_from_order): Remove from set
of pointers (cgraph_node *).
(insert_cgraph_node_to_order): New.
(duplicate_cgraph_node_to_order): New.
(do_per_function_toporder): Register all 3 cgraph hooks.
Skip removed_nodes now as we know about all of them.

I've just tested that on x86_64-linux-gnu. Are you fine with that approach?

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

--- Comment #6 from Richard Biener  ---
Oh, and then, since we vectorized things, we do not NRV because

 || DECL_ALIGN (found) > DECL_ALIGN (result)

thus we adjusted the VAR_DECLs alignment but the ABI says the return slot
isn't appropriately aligned (well, we do not end up returning in memory,
but...).

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

--- Comment #5 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #4)
> > so the C++ FE already elides the return copy by placing 'result' in the
> > return slot while the C FE doesn't do this.
> 
> That's because in C++ the language requires NRV to be performed in certain
> cases, while for C there is nothing like that and we do the tree NRV in that
> case only much later (nrv pass).
> 
> Joseph, any thoughts whether it would be a valid C FE optimization that
> valid C programs can't observe?

I think we're careful on the caller side not using the destination as
return slot in

  aggr = foo ();

already so no need to try to be clever on the callee-side?  Fixing this
might also fix some missed tail-calling.

Note in this particular case the return value is returned via xmm0/xmm2
so the extra copy we create during gimplification is even more pointless.

And I guess NRV doesn't do anything because of the CLOBBER?

  <retval> = result;
  result ={v} {CLOBBER};
  return <retval>;

or simply because

  /* If this function does not return an aggregate type in memory, then
 there is nothing to do.  */
  if (!aggregate_value_p (result, current_function_decl))
return 0;

I guess.  Or because 'result' ends up as TREE_ADDRESSABLE for some
reason!?  create_iv does this, as part of vectorization but after
that we never again do update_address_taken ... :/  I guess
after late FRE would be a good time.

[Bug target/91528] [10 Regression] ICE in ix86_expand_prologue at i386.c:7844 since r274481

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91528

Richard Biener  changed:

   What|Removed |Added

 Target||i?86-*-*
 CC||hjl.tools at gmail dot com

--- Comment #1 from Richard Biener  ---
(gdb) p x_rtl.drap_reg 
$1 = (rtx) 0x0

so

7843      /* Only need to push parameter pointer reg if it is caller saved.  */
7844      if (!call_used_regs[REGNO (crtl->drap_reg)])
7845        {

segfaults.  This must be really a latent issue.  I guess

  /* Conversion means we may have 128bit register spills/fills
     which require aligned stack.  */
  if (converted_insns)
    {
      if (crtl->stack_alignment_needed < 128)
...

needs to do some magic for -mforce-drap (which might be handled too early,
ignoring the late generated xmm uses?)

[Bug target/91527] [10 Regression] ICE in update_equiv_regs, at ira.c:3473 since r274694

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91527

Richard Biener  changed:

   What|Removed |Added

   Keywords||ra
 Status|NEW |ASSIGNED
 CC||vmakarov at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
This looks like a latent issue to me.  IRA is confused about the reg-equiv
note in

(insn 4 24 5 2 (set (subreg:V4SI (reg/v:SI 90 [ c ]) 0)
(subreg:V4SI (reg:SI 100) 0))
"/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr21463.C":11:4
1248 {movv4si_internal}
 (expr_list:REG_DEAD (reg:SI 100)
(expr_list:REG_EQUIV (mem/c:SI (plus:DI (reg/f:DI 16 argp)
(const_int 16 [0x10])) [1 c+0 S4 A64])
(nil

expecting the SET_DEST to be a REG_P (it's a paradoxical subreg).  Not sure
if that's a requirement for RTL in general(?) but at least the docs say
the dest may be a strict_low_part or zero_extract as well.

STV doesn't seem to do anything with notes and DF doesn't track uses in
notes (eh).  So it's probably safest to kill all of them on converted
insns?!  For the timode chain we update equal/equiv notes for REG dests.

I have a patch.

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
> so the C++ FE already elides the return copy by placing 'result' in the
> return slot while the C FE doesn't do this.

That's because in C++ the language requires NRV to be performed in certain
cases, while for C there is nothing like that and we do the tree NRV in that
case only much later (nrv pass).

Joseph, any thoughts whether it would be a valid C FE optimization that valid C
programs can't observe?

[Bug target/91306] [MSP430] libgcc/crtstuff.c: Alignment of frame_dummy .init_array entry is too big

2019-08-23 Thread jozefl at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91306

--- Comment #5 from jozefl at gcc dot gnu.org ---
Author: jozefl
Date: Fri Aug 23 09:21:26 2019
New Revision: 274846

URL: https://gcc.gnu.org/viewcvs?rev=274846&root=gcc&view=rev
Log:
2019-08-23  Jozef Lawrynowicz  

PR target/91306
* crtstuff.c (__CTOR_LIST__): Align to the "__alignof__" the array
element type, instead of "sizeof" the element type.
(__DTOR_LIST__): Likewise.
(__TMC_LIST__): Likewise.
(__do_global_dtors_aux_fini_array_entry): Likewise.
(__frame_dummy_init_array_entry): Likewise.
(__CTOR_END__): Likewise.
(__DTOR_END__): Likewise.
(__FRAME_END__): Likewise.
(__TMC_END__): Likewise.

Modified:
trunk/libgcc/ChangeLog
trunk/libgcc/crtstuff.c
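
The distinction the ChangeLog draws between "sizeof" and "__alignof__" can be sketched in portable C11. This is an analogy only: `struct Pair` is a hypothetical stand-in for an element whose size exceeds its required alignment, and none of the names below come from crtstuff.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type: two shorts, so on typical ABIs its size is
   larger than the alignment it actually requires.  */
struct Pair { short lo; short hi; };

size_t element_size(void)  { return sizeof(struct Pair); }
size_t element_align(void) { return _Alignof(struct Pair); }

/* Bytes of padding needed before an entry placed at 'offset' so that it
   satisfies 'align'.  */
size_t padding_for(size_t offset, size_t align)
{
    return (align - offset % align) % align;
}

/* These hold for every complete object type in C.  */
void check_alignment_invariants(void)
{
    assert(element_align() <= element_size());
    assert(element_size() % element_align() == 0);
}
```

With a section that currently ends at offset 2, aligning the next entry to 4 (the typical size of the element) inserts `padding_for(2, 4)` bytes of padding, while aligning to 2 (its typical alignment) inserts none. A gap of that kind inside an array section such as .init_array is the sort of problem over-alignment can cause.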

[Bug lto/64636] Bootstrapping gcc-4.9.2 fails if lto is enabled

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64636

Martin Liška  changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #8 from Martin Liška  ---
(In reply to John Paul Adrian Glaubitz from comment #7)
> (In reply to Martin Liška from comment #6)
> > Can you please debug the internal compiler error?
> > I'm interested in what the 'hist' struct looks like.
> 
> The gcc compile farm has a fast sparc64 porterbox running Debian unstable,
> so if you want, you can try it yourself.

Good, I have access to the compile farm machine.  So let me take a look.

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

--- Comment #3 from Andrew Pinski  ---
>Interestingly, even if the __restrict__ attribute is removed, it still gets 
>vectorized. Is this correct behavior?

Yes, as v1->v[0] cannot be the same as v2->v[1] or result->v[1], etc., because
the full object v1 can either be a completely different object from result or
exactly the same object as result, but not a partially overlapping object.
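
A small sketch of why exact aliasing is harmless here, using the out-parameter form of multiply without `__restrict__`. The helper `exact_alias_matches_distinct` is hypothetical. Each iteration reads element i before writing element i, so passing the same object as both `result` and `v1` gives the same values as passing distinct objects.

```c
#include <assert.h>

struct Vec { float v[8]; };

/* Out-parameter variant, deliberately without __restrict__.  */
void multiply(struct Vec *result, const struct Vec *v1, const struct Vec *v2)
{
    for (unsigned i = 0; i < 8; ++i)
        result->v[i] = v1->v[i] * v2->v[i];
}

/* Returns 1 if calling with result == v1 yields the same values as
   calling with three distinct objects.  */
int exact_alias_matches_distinct(void)
{
    struct Vec a, b, out;
    for (unsigned i = 0; i < 8; ++i) {
        a.v[i] = (float)(i + 1);
        b.v[i] = 2.0f;
    }

    multiply(&out, &a, &b);  /* all three objects distinct */
    multiply(&a, &a, &b);    /* result is exactly the same object as v1 */

    for (unsigned i = 0; i < 8; ++i)
        if (a.v[i] != out.v[i])
            return 0;
    return 1;
}

/* Aborts if the two calling conventions disagree.  */
void check_exact_alias(void)
{
    assert(exact_alias_matches_distinct());
}
```

A pointer into the middle of another `struct Vec` would be the partially overlapping case, and that is the one the explanation above rules out.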

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread bisqwit at iki dot fi

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

Joel Yliluoma  changed:

   What|Removed |Added

 CC||bisqwit at iki dot fi

--- Comment #2 from Joel Yliluoma  ---
The theory that it is related to RVO seems to be confirmed by the fact that if
the code is changed like this:

   struct Vec { float v[8]; };
   void multiply(struct Vec* result,
 const struct Vec* __restrict__ v1,
 const struct Vec* __restrict__ v2)
   {
   for(unsigned i = 0; i < 8; ++i)
   result->v[i] = v1->v[i] * v2->v[i];
   }

Then it gets compiled in the shorter and proper form. Interestingly, even if
the __restrict__ attribute is removed, it still gets vectorized. Is this
correct behavior?

[Bug fortran/91519] [10 Regression] ICE error in 521.wrf_r

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91519

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |10.0
Summary|[regression]ICE error in|[10 Regression] ICE error
   |521.wrf_r   |in 521.wrf_r

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Target||x86_64-*-*, i?86-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-23
 CC||jakub at gcc dot gnu.org,
   ||mpolacek at gcc dot gnu.org
  Component|target  |c
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
I think I've seen duplicates about this issue where C/C++ differ in the IL
presented to the middle-end for the aggregate return stmt which in the
end causes us to not elide an aggregate copy.  Usually SRA deals with
this but it has a hard job with heuristics and arrays...

The C++ FE does

;; Function Vec multiply(const Vec*, const Vec*) (null)
;; enabled by -tree-original


{
  struct Vec result [value-expr: <retval>];
^^^

while the C FE does

;; Function multiply (null)
;; enabled by -tree-original


{
  struct Vec result;

so the C++ FE already elides the return copy by placing 'result' in the
return slot while the C FE doesn't do this.

Let's make this a C enhancement request rather than a missed optimization
during GIMPLE optimizations (which there are dups for already).

Marek - any chance the C FE could do something like this?  Maybe we can also
do this during gimplification; we'd have to see what constraints the C++
FE has for performing this.

[Bug target/91528] [10 Regression] ICE in ix86_expand_prologue at i386.c:7844 since r274481

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91528

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-23
  Known to work||9.2.0
   Target Milestone|--- |10.0
 Ever confirmed|0   |1
  Known to fail||10.0

[Bug target/91528] New: [10 Regression] ICE in ix86_expand_prologue at i386.c:7844 since r274481

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91528

Bug ID: 91528
   Summary: [10 Regression] ICE in ix86_expand_prologue at
i386.c:7844 since r274481
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org, ubizjak at gmail dot com
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu

One more ICE caused by the revision:

$ gcc /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr67271.c -Os
-mavx512vbmi2 -mforce-drap -m32
during RTL pass: pro_and_epilogue
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr67271.c: In function
‘main’:
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr67271.c:13:1: internal
compiler error: Segmentation fault
   13 | }
  | ^
0xd9b2bf crash_signal
/home/marxin/Programming/gcc/gcc/toplev.c:326
0x7f16f048ee4f ???
   
/usr/src/debug/glibc-2.29-7.3.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x10e1fa0 ix86_expand_prologue()
/home/marxin/Programming/gcc/gcc/config/i386/i386.c:7844
0x13eee4b gen_prologue()
/home/marxin/Programming/gcc/gcc/config/i386/i386.md:12893
0x10d3848 target_gen_prologue
/home/marxin/Programming/gcc/gcc/config/i386/i386.md:19420
0xac22be make_prologue_seq
/home/marxin/Programming/gcc/gcc/function.c:5735
0xac2483 thread_prologue_and_epilogue_insns()
/home/marxin/Programming/gcc/gcc/function.c:5852
0xac2b82 rest_of_handle_thread_prologue_and_epilogue
/home/marxin/Programming/gcc/gcc/function.c:6343
0xac2b82 execute
/home/marxin/Programming/gcc/gcc/function.c:6385

[Bug target/91527] [10 Regression] ICE in update_equiv_regs, at ira.c:3473 since r274694

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91527

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-23
  Known to work||9.1.0
   Target Milestone|--- |10.0
 Ever confirmed|0   |1
  Known to fail||10.0

[Bug target/91527] New: [10 Regression] ICE in update_equiv_regs, at ira.c:3473 since r274694

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91527

Bug ID: 91527
   Summary: [10 Regression] ICE in update_equiv_regs, at
ira.c:3473 since r274694
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: marxin at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu

Since the revision, I see:

$ g++ /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/tree-ssa/pr21463.C -O3
-mabi=ms -msse4
during RTL pass: ira
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/tree-ssa/pr21463.C: In member
function ‘T foo_t::bar_ref(T, T) [with T = int]’:
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/tree-ssa/pr21463.C:13:2:
internal compiler error: in update_equiv_regs, at ira.c:3473
   13 |  }
  |  ^
0x73fb29 update_equiv_regs
/home/marxin/Programming/gcc/gcc/ira.c:3473
0xe03c62 ira
/home/marxin/Programming/gcc/gcc/ira.c:5308
0xe03c62 execute
/home/marxin/Programming/gcc/gcc/ira.c:5663

[Bug lto/91273] [7/8/9/10 Regression] ICE in warn_types_mismatch at ipa-devirt.c:995

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91273

--- Comment #9 from Martin Liška  ---
@Honza: Is there any progress?

[Bug ipa/91508] [9 Regression] Segfault due to referencing removed cgraph_node

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91508

Martin Liška  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Then let me make some backport patch that will also utilize the cgraph hooks.

[Bug lto/64636] Bootstrapping gcc-4.9.2 fails if lto is enabled

2019-08-23 Thread glaubitz at physik dot fu-berlin.de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64636

--- Comment #7 from John Paul Adrian Glaubitz  ---
(In reply to Martin Liška from comment #6)
> Can you please debug the internal compiler error?
> I'm interested in what the 'hist' struct looks like.

The gcc compile farm has a fast sparc64 porterbox running Debian unstable, so
if you want, you can try it yourself.

[Bug lto/91478] FAIL: gcc.dg/debug/pr41893-1.c -gdwarf-2 -g1 (test for excess errors)

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91478

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-08-23
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #6 from Martin Liška  ---
Then mine.

[Bug target/91518] [9/10 Regression] segfault when run CPU2006 465.tonto since r263875

2019-08-23 Thread rguenther at suse dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518

--- Comment #4 from rguenther at suse dot de  ---
On Fri, 23 Aug 2019, luoxhu at cn dot ibm.com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91518
> 
> --- Comment #3 from Xiong Hu XS Luo  ---
> (In reply to Richard Biener from comment #2)
> > Not seen on x86_64.  Given you bisected to r263875 it should appear with GCC
> > 9 as well - are the actual GCC 9 releases also affected?
> > 
> > I assume this is ppc64le.
> > 
> > Unless we know more I assume this is a target issue.  Please build with
> > debug info and see where exactly and why it segfaults.
> 
> Yes.  It still fails on both power8 and power9 even on GCC 10 (gcc version
> 10.0.0 20190823 (experimental) (GCC)).
> Reset to r263875, the register contents are shown below.  A wrong address is
> used for the lwzx instruction ($r8 is expected to hold a valid address):
> 
> 140│    0x101a5718 <+552>:   ld      r12,888(r31)
> 141│    0x101a571c <+556>:   ld      r0,856(r31)
> 142│    0x101a5720 <+560>:   ld      r17,880(r31)
> 143│    0x101a5724 <+564>:   ld      r8,848(r31)
> 144│    0x101a5728 <+568>:   addi    r21,r21,1
> 145│    0x101a572c <+572>:   cmpw    cr7,r21,r30
> 146│    0x101a5730 <+576>:   mulld   r4,r3,r12
> 147│    0x101a5734 <+580>:   add     r18,r4,r0
> 148│    0x101a5738 <+584>:   mulld   r11,r18,r17
> 149├>   0x101a573c <+588>:   lwzx    r3,r8,r11
> 
> 44: /x $r3 = 0x1
> 45: /x $r8 = 0x77
> 46: /x $r11 = 0x1770
> 47: /x $r18 = 0x7d
> 48: /x $r17 = 0x30
> 49: /x $r4 = 0x1
> 50: /x $r0 = 0x7c
> 51: /x $r3 = 0x1
> 52: /x $r12 = 0x1
> 53: /x $r21 = 0x2
> 54: /x $r8 = 0x77
> 55: /x $r17 = 0x30
> 56: /x $r0 = 0x7c
> 57: /x $r12 = 0x1
> 
> I am not sure whether this is the debug info you needed.
> The function call stack is already pasted in #c0.  As the source code may not
> be pasted, I can only say that the segmentation fault is at line 9375 of file
> mol.fppized.f90, in function make_image_of_shell.  Thanks.

That's

   call get_shell_(self,sh,b); nb = sh%n_comp; lb = sh%l; call destroy_ptr_part_(sh)

for me.  Maybe you can edit the source to split this line at statement
boundaries and include assembly up to the previous/next call.
I'm not very familiar with Power, so you will have to say which
of r3, r8 or r11 is supposed to be the base address and trace
it back to where that goes wrong.

As said I'm quite confident this is a target issue.
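
As a rough aid for reading the register dump quoted above, the arithmetic leading up to the faulting load can be replayed in C. This is only a sketch: it assumes r3 was not modified between the mulld at +576 and the lwzx, and it relies on lwzx forming its effective address as the sum of its two source registers (here r8 and r11). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of 'lwzx rT,rA,rB' when rA is not r0: rA + rB.  */
uint64_t lwzx_effective_address(uint64_t ra, uint64_t rb)
{
    return ra + rb;
}

/* Replays the instruction sequence using the values from the dump.  */
uint64_t faulting_address(void)
{
    const uint64_t r3 = 0x1, r12 = 0x1, r0 = 0x7c, r17 = 0x30, r8 = 0x77;

    uint64_t r4  = r3 * r12;    /* mulld r4,r3,r12   */
    uint64_t r18 = r4 + r0;     /* add   r18,r4,r0   */
    uint64_t r11 = r18 * r17;   /* mulld r11,r18,r17 */

    return lwzx_effective_address(r8, r11);
}

/* The dumped r18 and r11 agree with the replayed arithmetic.  */
void check_dump_consistency(void)
{
    assert(0x1ULL * 0x1ULL + 0x7cULL == 0x7dULL);
    assert(0x7dULL * 0x30ULL == 0x1770ULL);
}
```

The replay gives r11 = 0x1770, matching the dump, so the index computation looks self-consistent. The load address comes out as 0x17e7, which supports the observation that the value loaded into r8 from 848(r31) is the part that is wrong.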

[Bug lto/64636] Bootstrapping gcc-4.9.2 fails if lto is enabled

2019-08-23 Thread marxin at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64636

--- Comment #6 from Martin Liška  ---
Can you please debug the internal compiler error?
I'm interested in what the 'hist' struct looks like.

[Bug tree-optimization/91504] Inlining misses some logical operation folding

2019-08-23 Thread rguenth at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91504

--- Comment #4 from Richard Biener  ---
(In reply to Kamlesh Kumar from comment #3)
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 93dcef9..b62ef36 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -137,6 +137,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (pointer_plus integer_zerop @1)
>   (non_lvalue (convert @1)))
>  
> +/* (~value & C) ^ value -> value | C */
> +(simplify
> + (bit_xor:c (bit_and (bit_not @0) INTEGER_CST@1) @0)
> + (bit_ior @0 @1))
> + 

Looks good.  I think there's a related transform already you could put it
next to, also it shouldn't be restricted to INTEGER_CST @1?  Also the
inner bit_and should have :cs if @1 isn't INTEGER_CST.

/* (a & ~b) ^ ~a  -->  ~(a & b)  */
(simplify
 (bit_xor:c (bit_and:cs @0 (bit_not @1)) (bit_not @0))
 (bit_not (bit_and @0 @1)))
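
Both identities can be checked exhaustively on 8-bit values. The sketch below is a standalone sanity check of the bitwise algebra only; it says nothing about how match.pd applies the patterns, and the function names are hypothetical.

```c
#include <assert.h>

/* Checks (~value & C) ^ value == value | C for all 8-bit operands.  */
int proposed_fold_holds(void)
{
    for (unsigned v = 0; v < 256; ++v)
        for (unsigned c = 0; c < 256; ++c)
            if ((((~v & c) ^ v) & 0xffu) != ((v | c) & 0xffu))
                return 0;
    return 1;
}

/* Checks (a & ~b) ^ ~a == ~(a & b) for all 8-bit operands.  */
int existing_fold_holds(void)
{
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b)
            if ((((a & ~b) ^ ~a) & 0xffu) != (~(a & b) & 0xffu))
                return 0;
    return 1;
}

/* Aborts if either identity fails.  */
void check_folds(void)
{
    assert(proposed_fold_holds());
    assert(existing_fold_holds());
}
```

Since the first identity holds for every value of C, not just constants, the check is consistent with the suggestion that the pattern need not be restricted to INTEGER_CST.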

[Bug middle-end/91512] [10 Regression] Fortran compile time regression.

2019-08-23 Thread rguenther at suse dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91512

--- Comment #15 from rguenther at suse dot de  ---
On Thu, 22 Aug 2019, skpgkp2 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91512
> 
> --- Comment #14 from Sunil Pandey  ---
> (In reply to Richard Biener from comment #7)
> > (In reply to Sunil Pandey from comment #4)
> > > Actually it is spec cpu 2017 521.wrf benchmark getting this problem while
> > > compiling. Compilation taking forever, you can see while compiling file
> > > module_first_rk_step_part1.fppized.f90 as a representative.
> > 
> > Note this file contains a single function which (besides USEing quite a
> > number of modules...) has only function calls involving a lot of parameters
> > effectively forwarding parameters from the function.  Thus
> > 
> > SUBROUTINE foo (psim, ..., ims, ime, jms, jme)
> > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > call sub1 (PSIM=psim, ...)
> > call sub2 (PSIM=psim, ...)
> > END SUBROUTINE
> > 
> > with a _lot_ of arrays being passed through.  A simple testcase like
> > 
> > SUBROUTINE sub1 (psim, ims, ime, jms, jme)
> > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > END SUBROUTINE
> > SUBROUTINE foo (psim, ims, ime, jms, jme)
> > REAL,DIMENSION(ims:ime,jms:jme), INTENT(INOUT) :: psim
> > call sub1 (psim, ims, ime, jms, jme)
> > END SUBROUTINE
> > 
> > doesn't show any extra loops generated though, so I'm not sure what to
> > look after.
> 
> It seems very hard to create a small test case which reproduces the long
> compile time problem.  Unfortunately, I'm not allowed to upload the SPEC source
> file.  Also, it's very big with lots of module dependencies.  Assuming you have
> the SPEC 2017 sources,
> 
> Here is the unmodified command line, which shows the compile time problem.
> 
> Spec build dir: 
> ===
> 
> /local/skpandey/gccwork/specx5/cpu2017/benchspec/CPU/521.wrf_r/build/build_base_gcc-10.0.0-x86-64.
> 
> Before the commit in question:
> ==
> 
> Takes 41 seconds to compile the unmodified file with -O2 -march=skylake
> 
> $ time
> /local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release.a4ba5c3ec624008e899a8bcb687359db25140c23/usr/gcc-10.0.0-x86-64/bin/gfortran
>  -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
> -fno-unsafe-math-optimizations -mfpmath=sse -O2 -march=skylake -funroll-loops
> -fconvert=big-endian module_first_rk_step_part1.fppized.f90
> 
> real    0m41.295s
> user    0m41.031s
> sys     0m0.204s
> 
> After the commit in question:
> =
> 
> It takes about 12 minutes with -O2 -march=skylake
> 
> $ time
> /local/skpandey/gccwork/gcc_trunk/tools-build/gcc-debug/release/usr/gcc-10.0.0-x86-64/bin/gfortran
>  -m64 -c -o module_first_rk_step_part1.fppized.o -I. -I./netcdf/include -I./inc
> -fno-unsafe-math-optimizations -mfpmath=sse -O2 -march=skylake -funroll-loops
> -fconvert=big-endian module_first_rk_step_part1.fppized.f90
> 
> real    11m59.498s
> user    11m53.304s
> sys     0m4.835s
> 
> 
> With higher optimization like -O3 or -Ofast, it takes even longer and I have to
> kill it.

Does it help to omit -funroll-loops?

[Bug c/91526] New: Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread warp at iki dot fi

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

Bug ID: 91526
   Summary: Unnecessary SSE and other instructions generated when
compiling in C mode (vs. C++ mode)
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: warp at iki dot fi
  Target Milestone: ---

Consider the following piece of code:

//--
struct Vec { float v[8]; };

struct Vec multiply(const struct Vec* v1, const struct Vec* v2)
{
struct Vec result;
for(unsigned i = 0; i < 8; ++i)
result.v[i] = v1->v[i] * v2->v[i];
return result;
}
//--

If this is compiled as C++, using g++ 9.2 with options -Ofast -march=skylake,
the following result is produced:

_Z8multiplyPK3VecS1_:
  vmovups ymm0, YMMWORD PTR [rdx]
  mov rax, rdi
  vmulps ymm0, ymm0, YMMWORD PTR [rsi]
  vmovups YMMWORD PTR [rdi], ymm0
  vzeroupper
  ret

However, if it's compiled as C, using the same options, this is produced:

multiply:
  push rbp
  mov rax, rdi
  mov rbp, rsp
  and rsp, -32
  vmovups ymm0, YMMWORD PTR [rdx]
  vmulps ymm0, ymm0, YMMWORD PTR [rsi]
  vmovaps YMMWORD PTR [rsp-32], ymm0
  vmovdqa xmm2, XMMWORD PTR [rsp-16]
  vmovups XMMWORD PTR [rdi], xmm0
  vmovups XMMWORD PTR [rdi+16], xmm2
  vzeroupper
  leave
  ret

Not only are extra instructions surrounding the code, but moreover the
assignment of the result into [rdi] has for some reason been split into two
parts.

Both clang and icc produce the same result (very similar to the first result
above) regardless of whether compiling as C or C++.

[Bug target/90552] attribute((optimize(3))) not overriding -Os

2019-08-23 Thread ubizjak at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90552

Uroš Bizjak  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |10.0

--- Comment #7 from Uroš Bizjak  ---
(In reply to Eric Gallager from comment #6)

> Did this fix it?

Fixed.
