[Bug c++/88261] [9 Regression] ICE: verify_gimple failed (error: non-trivial conversion at assignment)

2018-12-14 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88261

--- Comment #10 from Bernd Edlinger  ---
Hmm, there are a few loose ends, where there is simply no decl.
For instance in this example:

/* PR c/5597 */
/* { dg-do compile } */
/* { dg-options "" } */

/* Verify that GCC forbids non-static initialization of
   flexible array members. */

struct str { int len; char s[]; };

struct str a = { 2, "a" };

void foo()
{
  static struct str d = (struct str) { 2, "d" }; /* { dg-error "(non-static)|(near initialization)" } */
  static struct str e = (struct str) { d.len, "e" }; /* { dg-error "(non-static)|(initialization)" } */
}

$ g++ -S array-6.c
array-6.c:14:47: error: non-static initialization of a flexible array member
   14 |   static struct str d = (struct str) { 2, "d" }; /* { dg-error
"(non-static)|(near initialization)" } */
  |   ^
array-6.c:15:51: error: non-static initialization of a flexible array member
   15 |   static struct str e = (struct str) { d.len, "e" }; /* { dg-error
"(non-static)|(initialization)" } */
  |   ^


the patch prints an error, but digest_init_r is called from
finish_compound_literal:
  compound_literal = digest_init_flags (type, compound_literal, LOOKUP_NORMAL,
                                        complain, NULL_TREE);

So here it is impossible to tell which decl is going to be initialized,
because this is called directly from the parser (cp_parser_postfix_expression).

However, the C front end does not have a decl here either, and I don't think I
can do much about that:

$ gcc -S array-6.c 
array-6.c: In function 'foo':
array-6.c:14:43: error: non-static initialization of a flexible array member
   14 |   static struct str d = (struct str) { 2, "d" }; /* { dg-error
"(non-static)|(near initialization)" } */
  |   ^~~
array-6.c:14:43: note: (near initialization for '(anonymous)')
array-6.c:15:47: error: non-static initialization of a flexible array member
   15 |   static struct str e = (struct str) { d.len, "e" }; /* { dg-error
"(non-static)|(initialization)" } */
  |   ^~~
array-6.c:15:47: note: (near initialization for '(anonymous)')
array-6.c:15:25: error: initializer element is not constant
   15 |   static struct str e = (struct str) { d.len, "e" }; /* { dg-error
"(non-static)|(initialization)" } */
  | ^

[Bug target/88498] [9 Regression] FAIL: gcc.target/i386/avx512vl-pr79299-1.c (internal compiler error)

2018-12-14 Thread xuepeng.guo at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88498

--- Comment #3 from Terry Guo  ---
I just tried, and both failures are gone with Jakub's patch.

[Bug rtl-optimization/88001] ASMCONS cannot handle properly UNSPEC(CONST)

2018-12-14 Thread vgupta at synopsys dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88001

--- Comment #11 from Vineet Gupta  ---
Sure, but how can I? If I click the "known to work" field it takes me to help.

The issue is certainly present on gcc-8-branch for ARC and presumably also on
tip/trunk.

[Bug c++/88509] Missing optimization of tls initialization

2018-12-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88509

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #1 from Andrew Pinski  ---
>As far as I can tell this can be applied to any static tls variable

Only in functions which have non-vague linkage.  Static variables in inline
functions have to have the same ABI across TUs, and there might be some in
already compiled code ...

[Bug target/88510] New: GCC generates inefficient U64x2 scalar multiply for NEON32

2018-12-14 Thread husseydevin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88510

Bug ID: 88510
   Summary: GCC generates inefficient U64x2 scalar multiply for
NEON32
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: husseydevin at gmail dot com
  Target Milestone: ---

Note: I use these typedefs here for brevity.

typedef uint64x2_t U64x2;
typedef uint32x2_t U32x2;
typedef uint32x2x2_t U32x2x2;
typedef uint32x4_t U32x4;

GCC and Clang both have issues with this code on ARMv7a NEON, and will switch
to scalar:

U64x2 multiply(U64x2 top, U64x2 bot)
{
    return top * bot;
}

gcc-8 -mfloat-abi=hard -mfpu=neon -O3 -S -march=armv7-a 

multiply:
        push    {r4, r5, r6, r7, lr}
        sub     sp, sp, #20
        vmov    r0, r1, d0  @ v2di
        vmov    r6, r7, d2  @ v2di
        vmov    r2, r3, d1  @ v2di
        vmov    r4, r5, d3  @ v2di
        mul     lr, r0, r7
        mla     lr, r6, r1, lr
        mul     ip, r2, r5
        umull   r0, r1, r0, r6
        mla     ip, r4, r3, ip
        add     r1, lr, r1
        umull   r2, r3, r2, r4
        strd    r0, [sp]
        add     r3, ip, r3
        strd    r2, [sp, #8]
        vld1.64 {d0-d1}, [sp:64]
        add     sp, sp, #20
        pop     {r4, r5, r6, r7, pc}

Clang's is worse, and you can compare the output, as well as the i386 SSE4.1
code here: https://godbolt.org/z/35owtL

Related LLVM bug 39967: https://bugs.llvm.org/show_bug.cgi?id=39967

I started the discussion in LLVM, as it had the worse problem, and we have come
up with a few options for faster code that does not require scalar. You can
also find the benchmark file (with outdated tests) and results there. They
are from Clang, but since they use intrinsics, results are similar.

While we don't have vmulq_u64, we do have faster ways to multiply without going
scalar.

I have benchmarked the code, and have found this option, based on the code
emitted for SSE4.1:

U64x2 goodmul_sse(U64x2 top, U64x2 bot)
{
    U32x2 topHi = vshrn_n_u64(top, 32);     // U32x2 topHi  = top >> 32;
    U32x2 topLo = vmovn_u64(top);           // U32x2 topLo  = top & 0xFFFFFFFF;
    U32x2 botHi = vshrn_n_u64(bot, 32);     // U32x2 botHi  = bot >> 32;
    U32x2 botLo = vmovn_u64(bot);           // U32x2 botLo  = bot & 0xFFFFFFFF;

    U64x2 ret64 = vmull_u32(topHi, botLo);  // U64x2 ret64   = (U64x2)topHi * (U64x2)botLo;
    ret64 = vmlal_u32(ret64, topLo, botHi); //       ret64  += (U64x2)topLo * (U64x2)botHi;
    ret64 = vshlq_n_u64(ret64, 32);         //       ret64 <<= 32;
    ret64 = vmlal_u32(ret64, topLo, botLo); //       ret64  += (U64x2)topLo * (U64x2)botLo;
    return ret64;
}
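
For readers without a NEON target at hand, the arithmetic behind goodmul_sse
can be checked with a portable scalar model of a single 64-bit lane. This is
only a sketch: mul64_via_32 and check_mul64_via_32 are hypothetical names, and
plain integers stand in for the NEON registers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of goodmul_sse: computes the low
// 64 bits of top * bot from 32-bit halves using three widening multiplies.
std::uint64_t mul64_via_32(std::uint64_t top, std::uint64_t bot)
{
    std::uint64_t topHi = top >> 32;            // vshrn_n_u64(top, 32)
    std::uint64_t topLo = top & 0xFFFFFFFFu;    // vmovn_u64(top)
    std::uint64_t botHi = bot >> 32;            // vshrn_n_u64(bot, 32)
    std::uint64_t botLo = bot & 0xFFFFFFFFu;    // vmovn_u64(bot)

    std::uint64_t ret = topHi * botLo;          // vmull_u32
    ret += topLo * botHi;                       // vmlal_u32 (may wrap, harmless)
    ret <<= 32;                                 // vshlq_n_u64: keeps low 32 bits of the sum
    ret += topLo * botLo;                       // vmlal_u32
    return ret;
}

// Spot check against the native 64-bit multiply, which wraps modulo 2^64.
void check_mul64_via_32()
{
    const std::uint64_t a = 0x123456789ABCDEF0ULL;
    const std::uint64_t b = 0x0FEDCBA987654321ULL;
    assert(mul64_via_32(a, b) == a * b);
    assert(mul64_via_32(3, 5) == 15);
}
```

The topHi * botHi term is never computed because it only contributes to bits
64 and above, which a 64-bit result discards.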

If GCC can figure out how to interleave one or two of the operands, for
example, changing this:

U64x2 inp1 = vld1q_u64(p);
U64x2 inp2 = vld1q_u64(q);
vec = goodmul_sse(inp1, inp2);

to this (if it knows inp1 and/or inp2 are only used for multiplication):

U32x2x2 inp1 = vld2_u32(p);
U32x2x2 inp2 = vld2_u32(q);
vec = goodmul_sse_interleaved(inp1, inp2);

then we can do this and save 4 cycles:

U64x2 goodmul_sse_interleaved(const U32x2x2 top, const U32x2x2 bot)
{
    U64x2 ret64 = vmull_u32(top.val[1], bot.val[0]);  // U64x2 ret64   = (U64x2)topHi * (U64x2)botLo;
    ret64 = vmlal_u32(ret64, top.val[0], bot.val[1]); //       ret64  += (U64x2)topLo * (U64x2)botHi;
    ret64 = vshlq_n_u64(ret64, 32);                   //       ret64 <<= 32;
    ret64 = vmlal_u32(ret64, top.val[0], bot.val[0]); //       ret64  += (U64x2)topLo * (U64x2)botLo;
    return ret64;
}
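
The reason the interleaved load helps can be seen in a small scalar model. The
sketch below assumes a little-endian layout, where the even-indexed 32-bit
words of two stored 64-bit values are the low halves and the odd-indexed words
the high halves. U32x2x2Model, deinterleave, mul_interleaved and
interleaved_matches are hypothetical names, not NEON intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for uint32x2x2_t: val[0] holds the low 32-bit
// halves of the two lanes, val[1] the high halves.
struct U32x2x2Model {
    std::array<std::uint32_t, 2> val[2];
};

// Models what a de-interleaving load yields for two 64-bit values stored
// little-endian: even-indexed words go to val[0], odd-indexed to val[1].
U32x2x2Model deinterleave(std::uint64_t a, std::uint64_t b)
{
    U32x2x2Model r;
    r.val[0] = {static_cast<std::uint32_t>(a), static_cast<std::uint32_t>(b)};
    r.val[1] = {static_cast<std::uint32_t>(a >> 32),
                static_cast<std::uint32_t>(b >> 32)};
    return r;
}

// Per-lane model of goodmul_sse_interleaved: no shift or narrow is needed
// to obtain the halves, they are already split by the load.
std::array<std::uint64_t, 2> mul_interleaved(const U32x2x2Model &top,
                                             const U32x2x2Model &bot)
{
    std::array<std::uint64_t, 2> out{};
    for (int lane = 0; lane < 2; ++lane) {
        std::uint64_t topLo = top.val[0][lane], topHi = top.val[1][lane];
        std::uint64_t botLo = bot.val[0][lane], botHi = bot.val[1][lane];
        std::uint64_t ret = topHi * botLo;
        ret += topLo * botHi;
        ret <<= 32;
        ret += topLo * botLo;
        out[lane] = ret;
    }
    return out;
}

// True when both lanes agree with the native wrapping 64-bit multiply.
bool interleaved_matches(std::uint64_t a0, std::uint64_t a1,
                         std::uint64_t b0, std::uint64_t b1)
{
    std::array<std::uint64_t, 2> r =
        mul_interleaved(deinterleave(a0, a1), deinterleave(b0, b1));
    return r[0] == a0 * b0 && r[1] == a1 * b1;
}
```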

Another user posted this (typos fixed).

It seems to use two fewer cycles when not interleaved (not 100% sure about it),
but is two cycles slower when fully interleaved.

U64x2 twomul(U64x2 top, U64x2 bot)
{
    U32x2 top_low = vmovn_u64(top);
    U32x2 bot_low = vmovn_u64(bot);
    U32x4 top_re = vreinterpretq_u32_u64(top);
    U32x4 bot_re = vrev64q_u32(vreinterpretq_u32_u64(bot));
    U32x4 prod = vmulq_u32(top_re, bot_re);
    U64x2 paired = vpaddlq_u32(prod);
    U64x2 shifted = vshlq_n_u64(paired, 32);
    return vmlal_u32(shifted, top_low, bot_low);
}

Either one of these is faster than scalar.
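
The twomul variant relies on the cross products only being needed modulo 2^32,
so a truncating 32-bit multiply is enough for them. A per-lane scalar sketch,
assuming nothing beyond standard integer arithmetic (twomul_lane and
check_twomul_lane are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one 64-bit lane of twomul.
std::uint64_t twomul_lane(std::uint64_t top, std::uint64_t bot)
{
    std::uint32_t topLo = static_cast<std::uint32_t>(top);
    std::uint32_t topHi = static_cast<std::uint32_t>(top >> 32);
    std::uint32_t botLo = static_cast<std::uint32_t>(bot);
    std::uint32_t botHi = static_cast<std::uint32_t>(bot >> 32);

    // vmulq_u32 against the word-swapped operand: truncated cross products.
    std::uint32_t cross0 = topLo * botHi;
    std::uint32_t cross1 = topHi * botLo;

    // vpaddlq_u32: widening pairwise add of the two 32-bit products.
    std::uint64_t paired = static_cast<std::uint64_t>(cross0) + cross1;

    // vshlq_n_u64 by 32, then vmlal_u32 with the low halves.
    std::uint64_t shifted = paired << 32;
    return shifted + static_cast<std::uint64_t>(topLo) * botLo;
}

// Spot check against the native wrapping 64-bit multiply.
void check_twomul_lane()
{
    const std::uint64_t a = 0xDEADBEEFCAFEF00DULL;
    const std::uint64_t b = 0x0123456789ABCDEFULL;
    assert(twomul_lane(a, b) == a * b);
}
```

Truncating the cross products early is safe because the shift by 32 discards
everything above their low 32 bits anyway.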

[Bug c/87310] -Wc90-c99-compat does not warn about bool usage

2018-12-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87310

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45780

--- Comment #6 from Eric Gallager  ---
I got this confused with bug 45780 for a second but that's actually something
different. Still putting it under "See Also" though, as a "just-in-case" for
the future...

[Bug debug/49167] dwarf marker for function return instruction

2018-12-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49167

--- Comment #4 from Eric Gallager  ---
Alexandre, did you just get assigned this because that's what happens with all
bugs with the "debug" component, or are you actually working on it?

[Bug c++/88509] New: Missing optimization of tls initialization

2018-12-14 Thread rafael at espindo dot la
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88509

Bug ID: 88509
   Summary: Missing optimization of tls initialization
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rafael at espindo dot la
  Target Milestone: ---

Given

struct foo {
  foo();
};
static thread_local foo bar;
foo *f() { return &bar; }
foo *g() {
  static thread_local foo *bar_ptr;
  if (bar_ptr == nullptr) {
    [&]() { bar_ptr = &bar; }();
  }
  return bar_ptr;
}

GCC has to make sure bar is only initialized once. For the function f it
produces

        pushq   %rbx
        cmpb    $0, %fs:__tls_guard@tpoff
        movq    %fs:0, %rbx
        je      .L6
        leaq    _ZL3bar@tpoff(%rbx), %rax
        popq    %rbx
        ret
.L6:
   

for g, the common code path is somewhat simpler:

        movq    %fs:_ZZ1gvE7bar_ptr@tpoff, %rax
        testq   %rax, %rax
        je      .L15
        ret
.L15:
   


The optimization is to use a pointer to the object as a guard instead of
using a boolean. As far as I can tell this can be applied to any static tls
variable (where the ABI is not a problem).
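
A runnable version of the pointer-as-guard pattern, written out by hand, might
look like the sketch below. It only shows the source-level idiom, not what any
compiler emits. The constructor counter and the name get_bar are additions for
the demonstration.

```cpp
#include <cassert>

struct foo {
    foo() { ++constructed; }
    static int constructed;   // demo-only counter of constructor runs
};
int foo::constructed = 0;

static thread_local foo bar;

// Hand-written pointer guard: the cached pointer doubles as the
// "already initialized" flag for the calling thread.
foo *get_bar()
{
    static thread_local foo *bar_ptr = nullptr;
    if (bar_ptr == nullptr) {
        bar_ptr = &bar;   // odr-use of bar: its constructor has run by now
    }
    return bar_ptr;
}
```

After the first call in a thread, later calls only load and test one
thread-local pointer, which is the shorter fast path shown for g above.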

[Bug c++/88395] ICE: Segmentation fault signal terminated program cc1plus, with -std=c++2a -fconcepts

2018-12-14 Thread xerofoify at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88395

--- Comment #7 from Nicholas Krause  ---
(In reply to Nicholas Krause from comment #6)
> Created attachment 45242 [details]
> Proposed Bug Fix

This is my proposed fix. After tracing the crash and reading the code
carefully, it seems that passing NULL_TREE rather than the correct decl tree
for the concept is causing the bug. If this is correct I will run the gcc test
suite and post the patch to the list.

[Bug c/78352] GCC lacks support for the Apple "blocks" extension to the C family of languages

2018-12-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352

--- Comment #6 from Eric Gallager  ---
(In reply to Eric Gallager from comment #5)
> (In reply to René J.V. Bertin from comment #4)
> > Any news on this front?
> 
> Last I heard from Iain he was still having to deal with water damage to his
> office...

He seems to be back now, but dealing with lower-hanging fruit in his backlog at
the moment. (I really wish I knew as much about the Darwin port as he does so I
could help reduce its "bus factor", but unfortunately I haven't been able to
focus hard enough to sit down and really learn it)

[Bug c++/88395] ICE: Segmentation fault signal terminated program cc1plus, with -std=c++2a -fconcepts

2018-12-14 Thread xerofoify at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88395

--- Comment #6 from Nicholas Krause  ---
Created attachment 45242
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45242&action=edit
Proposed Bug Fix

[Bug sanitizer/84863] false-positive -Warray-bounds warning with -fsanitize-coverage=object-size

2018-12-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84863

--- Comment #2 from Eric Gallager  ---
(In reply to Jakub Jelinek from comment #1)
> Never use -Werror with -fsanitize=*, those really do cause new warnings
> because the code intentionally is less optimized and the runtime check
> themselves prevent further optimizations, so warnings that depend on
> optimizations can't work properly.

Maybe gcc should explicitly forbid the 2 of them being combined and error out
in the option-handling part of the driver instead.

[Bug libstdc++/88508] New: std::bad_cast in std::basic_ostringstream.oss.fill()

2018-12-14 Thread dudin.roman at list dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88508

Bug ID: 88508
   Summary: std::bad_cast in
std::basic_ostringstream.oss.fill()
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dudin.roman at list dot ru
  Target Milestone: ---

Created attachment 45241
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45241&action=edit
Code

I am trying to add support for UTF-16 and UTF-32 in a C++17 environment.

Short fragment of my code:

namespace std {
typedef basic_ostringstream<char16_t> u16ostringstream;
}

std::u16string to_hexdump_u16(const std::u16string_view _v) {

std::u16ostringstream oss;
oss.setf(std::ios::hex, std::ios::basefield);
oss.fill(u'0');

...

return oss.str();
}

int main() {

auto str_u16 = u"Hello, my favorite World with native UTF-16 & UTF-32 support!♥❤♡";
auto str_dump_u16 = to_hexdump_u16(str_u16);

return 0;
}

I compile it via CMake with set(CMAKE_CXX_STANDARD 17)
It compiles successfully, but the bad_cast exception occurs when
"oss.fill(u'0')" is called. The internal function which throws the exception is
"__check_facet(const _Facet* __f)" in basic_ios.h.

Exception is also thrown when calling "oss << std::endl", but I think it's the
same problem.

The problem affects the latest library when compiled with clang (Linux) and
gcc 8.1 (mingw-w64, Linux). The code compiles and works fine with the latest
MSVC compiler.

The full one-file example is attached.
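
As a stopgap while the stream-based version throws, the hex dump itself can be
produced without any stream or locale facet. This is a workaround sketch, not
a fix for the reported behaviour; hex_dump_u16 is a hypothetical name and the
output format (four lowercase hex digits per code unit, space separated) is an
assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: formats each UTF-16 code unit as four hex digits
// without going through basic_ostringstream<char16_t>.
std::u16string hex_dump_u16(const std::u16string &v)
{
    static const char16_t digits[] = u"0123456789abcdef";
    std::u16string out;
    out.reserve(v.size() * 5);
    for (char16_t unit : v) {
        if (!out.empty())
            out.push_back(u' ');
        for (int shift = 12; shift >= 0; shift -= 4)
            out.push_back(digits[(unit >> shift) & 0xF]);
    }
    return out;
}
```

For example, hex_dump_u16(u"Hi") yields u"0048 0069".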

[Bug rtl-optimization/88001] ASMCONS cannot handle properly UNSPEC(CONST)

2018-12-14 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88001

--- Comment #10 from Segher Boessenkool  ---
Sure this can be backported...  But can you fill in known-to-{work,fail}
then please?  Thanks.

[Bug rtl-optimization/87759] [8/9 Regression] ICE in lra_assign, at lra-assigns.c:1624, or ICE: Maximum number of LRA assignment passes is achieved (30), or compile-time hog

2018-12-14 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87759

--- Comment #2 from Vladimir Makarov  ---
I've started to work on it.  The patch will probably be ready on Monday or
Tuesday.

[Bug c++/88507] New: utf8 not displayed

2018-12-14 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88507

Bug ID: 88507
   Summary: utf8 not displayed
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jg at jguk dot org
  Target Milestone: ---

Hello

This is a typo in the word "string"; I am just reporting it as perhaps GCC
could show £ correctly, as it does in the error for line 10. Perhaps it could
also show the stray bytes in hex as well, i.e. "0xF3C2"?
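
For reference, the octal escapes in the diagnostics, \302 and \243, are the two
bytes of the UTF-8 encoding of £ (U+00A3), which in hex are C2 A3. A small
sketch of the conversion (bytes_as_hex is a hypothetical helper, not part of
GCC):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: renders each byte of s as two uppercase hex digits,
// separated by single spaces.
std::string bytes_as_hex(const std::string &s)
{
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        if (!out.empty())
            out += ' ';
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

bytes_as_hex("\302\243") returns "C2 A3".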

$ g++ -Wall -o string string.cpp
string.cpp:8:7: error: stray ‘\302’ in program
 st��ing buf;
   ^
string.cpp:8:8: error: stray ‘\243’ in program
 st��ing buf;
^
string.cpp: In function ‘int main()’:
string.cpp:8:5: error: ‘st’ was not declared in this scope
 st£ing buf;
 ^~
string.cpp:10:5: error: ‘buf2’ was not declared in this scope
 buf2 = "£"
 ^~~~



// g++ -Wall -o string string.cpp
#include <string>

using namespace std;

int main()
{
st£ing buf;

buf = "£"
}

[Bug c++/88501] Improve suggested alternative to be closer to typo

2018-12-14 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88501

--- Comment #3 from Jonny Grant  ---
(In reply to David Malcolm from comment #2)
> Confirmed.  "string" vs "stting" has edit distance of 1, closer than
> "stdin".  I think the issue here is that it's not considering names that
> would be found via using-directives.  I have a work-in-progress patch to do
> so, which makes it successfully suggest "string" for this case.

Great!

I wonder, for typos if a simple byte compare would be enough? Vary each char by
1 letter, or length. This starts to get complicated I know..

A few that still work
"st4ing"
"st$ing"
"st%ing"

A few that do not work (I did think gcc could output UTF-8) but it would seem
not - I'll file a separate ticket

"st!ing"

"st£ing"

string.cpp:8:7: error: stray ‘\302’ in program
 st��ing buf;
   ^
string.cpp:8:8: error: stray ‘\243’ in program
 st��ing buf;

[Bug tree-optimization/88372] alloc_size attribute is ignored on function pointers

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88372

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88506

--- Comment #7 from Martin Sebor  ---
Bug 88506 tracks the missing warning.

[Bug c/88506] New: missing warning assigning to a pointer with incompatible attributes

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88506

Bug ID: 88506
   Summary: missing warning assigning to a pointer with
incompatible attributes
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

Similar to bug 88505, GCC silently accepts assignments of addresses of
functions to pointers declared with incompatible attributes.  See the test case
below.

GCC should check the attributes on pointer initialization and assignment for
compatibility and issue a warning if the destination is more restrictive than
the source.

$ cat u.c && gcc -O2 -S -Wall -Wextra -fdump-tree-optimized=/dev/stdout u.c
#include <stdlib.h>

__attribute__ ((alloc_align (2), alloc_size (1))) void*
(*paligned_alloc)(size_t, size_t) = aligned_alloc;

size_t f (void)
{
  size_t align = 16;
  void *p = paligned_alloc (48, align);
  return __builtin_object_size (p, 0);   // bad alloc_size used
}

void g (void)
{
  size_t align = 32;
  void *p = paligned_alloc (4, align); 
  if ((long)p & (align - 1)) // bad alloc_align used
    __builtin_abort ();
}

;; Function f (f, funcdef_no=10, decl_uid=2495, cgraph_uid=11, symbol_order=11)

f ()
{
  void * (*) (size_t, size_t) paligned_alloc.0_1;

  <bb 2> [local count: 1073741824]:
  paligned_alloc.0_1 = paligned_alloc;
  paligned_alloc.0_1 (48, 16);
  return 48;

}



;; Function g (g, funcdef_no=11, decl_uid=2500, cgraph_uid=12, symbol_order=12)

g ()
{
  void * (*) (size_t, size_t) paligned_alloc.1_1;

  <bb 2> [local count: 1073741824]:
  paligned_alloc.1_1 = paligned_alloc;
  paligned_alloc.1_1 (4, 32); [tail call]
  return;

}


tmp$ cat u.c && /ssd/build/gcc-svn/gcc/xgcc -B /ssd/build/gcc-svn/gcc -O2 -S
-Wall -Wextra -fdump-tree-optimized=/dev/stdout u.c

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug rtl-optimization/88478] [9 Regression] valgrind error in cselib_record_sets

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88478

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|---                         |FIXED
           Assignee|unassigned at gcc dot gnu.org    |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Fixed.

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

--- Comment #5 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 23:21:10 2018
New Revision: 267160

URL: https://gcc.gnu.org/viewcvs?rev=267160&root=gcc&view=rev
Log:
PR target/88489
* config/i386/sse.md (UNSPEC_SFIXUPIMM): New unspec enumerator.
(avx512f_sfixupimm): Use it
instead of UNSPEC_FIXUPIMM.

* gcc.target/i386/avx512vl-vfixupimmsd-2.c: New test.
* gcc.target/i386/avx512vl-vfixupimmss-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/avx512vl-vfixupimmsd-2.c
trunk/gcc/testsuite/gcc.target/i386/avx512vl-vfixupimmss-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog

[Bug fortran/87992] ICE in resolve_fl_variable, at fortran/resolve.c:12314

2018-12-14 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87992

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4
                 CC|                            |kargl at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org    |kargl at gcc dot gnu.org
   Target Milestone|---                         |9.0

--- Comment #1 from kargl at gcc dot gnu.org ---
Don't know if code is valid, but I have a patch that fixes
the ICE.  The code compiles.

[Bug rtl-optimization/88478] [9 Regression] valgrind error in cselib_record_sets

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88478

--- Comment #3 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 23:17:03 2018
New Revision: 267159

URL: https://gcc.gnu.org/viewcvs?rev=267159&root=gcc&view=rev
Log:
PR rtl-optimization/88478
* cselib.c (cselib_record_sets): Move sets[i].src_elt tests
after REG_P (dest) test.

* g++.dg/opt/pr88478.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/opt/pr88478.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cselib.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/88505] New: missing -Wbuiltin-declaration-mismatch on a declaration with incompatible attributes

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88505

Bug ID: 88505
   Summary: missing -Wbuiltin-declaration-mismatch on a
declaration with incompatible attributes
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

-Wbuiltin-declaration-mismatch diagnoses declarations of built-in functions
whose type conflicts with the type of the built-in.  This includes the return
type and argument types, but does not include incompatible attributes.  For
example, the C11 aligned_alloc() function is declared like so:

  void *aligned_alloc (size_t alignment, size_t size);

but GCC silently accepts the declarations that specify attributes in the
opposite order, and in some cases (in g() below) even relies on the conflicting
attributes to make optimization decisions.

The -Wbuiltin-declaration-mismatch warning should be enhanced to also detect
mismatches in function attributes.
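
For comparison, attributes that agree with the aligned_alloc prototype name
parameter 1 as the alignment and parameter 2 as the size. The sketch below
puts them on a wrapper rather than redeclaring the built-in.
checked_aligned_alloc and alloc_is_aligned are hypothetical names, and the
code assumes a GCC-compatible compiler and a C library that provides
aligned_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical wrapper: alloc_align refers to the alignment argument (1)
// and alloc_size to the size argument (2), matching the prototype
//   void *aligned_alloc (size_t alignment, size_t size);
__attribute__ ((alloc_align (1), alloc_size (2)))
void *checked_aligned_alloc (std::size_t alignment, std::size_t size)
{
  return ::aligned_alloc (alignment, size);
}

// Allocates, tests the alignment of the returned pointer, and frees it.
bool alloc_is_aligned (std::size_t alignment, std::size_t size)
{
  void *p = checked_aligned_alloc (alignment, size);
  if (p == nullptr)
    return false;
  bool ok = (reinterpret_cast<std::uintptr_t> (p) & (alignment - 1)) == 0;
  free (p);
  return ok;
}
```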

$ cat u.c && gcc -O2 -S -Wall -Wextra -fdump-tree-optimized=/dev/stdout u.c
typedef __SIZE_TYPE__ size_t;

extern __attribute__ ((alloc_align (2), alloc_size (1))) void*
aligned_alloc (size_t, size_t);

size_t f (void)
{
  size_t align = 16;
  void *p = aligned_alloc (48, align);
  return __builtin_object_size (p, 0);   // bad alloc_size ignored here
}

void g (void)
{
  size_t align = 32;
  void *p = aligned_alloc (4, align); 
  if ((long)p & (align - 1)) // bad alloc_align used here
    __builtin_abort ();
}

;; Function f (f, funcdef_no=0, decl_uid=1910, cgraph_uid=1, symbol_order=0)

f ()
{
  <bb 2> [local count: 1073741824]:
  return 16;

}



;; Function g (g, funcdef_no=1, decl_uid=1915, cgraph_uid=2, symbol_order=1)

g ()
{
  <bb 2> [local count: 1073741824]:
  return;

}

[Bug c++/87436] [7/8 Regression] G++ produces >300MB .rodata section to initialize struct with big array

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87436

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[7/8/9 Regression] G++      |[7/8 Regression] G++
                   |produces >300MB .rodata     |produces >300MB .rodata
                   |section to initialize       |section to initialize
                   |struct with big array       |struct with big array

--- Comment #9 from Jakub Jelinek  ---
Fixed on the trunk.

[Bug tree-optimization/88372] alloc_size attribute is ignored on function pointers

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88372

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Martin Sebor  ---
Fixed for GCC 9 in r267158.  A warning when a function without the attribute is
assigned to a pointer with it should also be implemented to help detect bugs
that this might lead to.

[Bug tree-optimization/88372] alloc_size attribute is ignored on function pointers

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88372

--- Comment #5 from Martin Sebor  ---
Author: msebor
Date: Fri Dec 14 22:45:55 2018
New Revision: 267158

URL: https://gcc.gnu.org/viewcvs?rev=267158&root=gcc&view=rev
Log:
PR tree-optimization/88372 - alloc_size attribute is ignored on function
pointers

gcc/ChangeLog:

PR tree-optimization/88372
* calls.c (maybe_warn_alloc_args_overflow): Handle function pointers.
* tree-object-size.c (alloc_object_size): Same.  Simplify.
* doc/extend.texi (Object Size Checking): Update.
(Other Builtins): Add __builtin_object_size.
(Common Type Attributes): Add alloc_size.
(Common Variable Attributes): Ditto.

gcc/testsuite/ChangeLog:

PR tree-optimization/88372
* gcc.dg/Walloc-size-larger-than-18.c: New test.
* gcc.dg/builtin-object-size-19.c: Same.


Added:
trunk/gcc/testsuite/gcc.dg/Walloc-size-larger-than-18.c
trunk/gcc/testsuite/gcc.dg/builtin-object-size-19.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/calls.c
trunk/gcc/doc/extend.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-object-size.c

[Bug tree-optimization/87096] "Optimised" snprintf is not POSIX conformant

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87096

Martin Sebor  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #13 from Martin Sebor  ---
Fixed for GCC 9 in r267157.

[Bug tree-optimization/87096] "Optimised" snprintf is not POSIX conformant

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87096

--- Comment #12 from Martin Sebor  ---
Author: msebor
Date: Fri Dec 14 22:38:08 2018
New Revision: 267157

URL: https://gcc.gnu.org/viewcvs?rev=267157&root=gcc&view=rev
Log:
PR tree-optimization/87096 - Optimised snprintf is not POSIX conformant

gcc/ChangeLog:

PR rtl-optimization/87096
* gimple-ssa-sprintf.c (sprintf_dom_walker::handle_gimple_call): Avoid
folding calls whose bound may exceed INT_MAX.  Diagnose bound ranges
that exceed the limit.

gcc/testsuite/ChangeLog:

PR tree-optimization/87096
* gcc.dg/tree-ssa/builtin-snprintf-4.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-4.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimple-ssa-sprintf.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/88504] New: Inconsistent error message notes when using forward-declared type as value

2018-12-14 Thread petschy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88504

Bug ID: 88504
   Summary: Inconsistent error message notes when using
forward-declared type as value
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: petschy at gmail dot com
  Target Milestone: ---

struct Foo;

struct Bar
{
Bar(Foo f_) :
m_foo(f_)
{
}

Foo m_foo;
};

Foo baz1()
{
}

void baz2(Foo f_)
{
}

void baz3()
{
Foo foo;
}

Foo g_foo;


$ g++-9.0.0 -Wall -Wextra -c 20181214-fwddecl_value.cpp 
20181214-fwddecl_value.cpp:10:6: error: field ‘m_foo’ has incomplete type ‘Foo’
   10 |  Foo m_foo;
  |  ^
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
1 | struct Foo;
  |^~~
20181214-fwddecl_value.cpp:5:10: error: ‘f_’ has incomplete type
5 |  Bar(Foo f_) :
  |  ^~
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
1 | struct Foo;
  |^~~
20181214-fwddecl_value.cpp:13:10: error: return type ‘struct Foo’ is incomplete
   13 | Foo baz1()
  |  ^
20181214-fwddecl_value.cpp:17:15: error: ‘f_’ has incomplete type
   17 | void baz2(Foo f_)
  |   ^~
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
1 | struct Foo;
  |^~~
20181214-fwddecl_value.cpp: In function ‘void baz2(Foo)’:
20181214-fwddecl_value.cpp:17:15: warning: unused parameter ‘f_’
[-Wunused-parameter]
   17 | void baz2(Foo f_)
  |   ^~
20181214-fwddecl_value.cpp: In function ‘void baz3()’:
20181214-fwddecl_value.cpp:23:6: error: aggregate ‘Foo foo’ has incomplete type
and cannot be defined
   23 |  Foo foo;
  |  ^~~
20181214-fwddecl_value.cpp: At global scope:
20181214-fwddecl_value.cpp:26:5: error: aggregate ‘Foo g_foo’ has incomplete
type and cannot be defined
   26 | Foo g_foo;
  | ^

Most messages contain the note where the forward decl occurred, but some don't:
- returning from baz1()
- local variable 'foo' in baz3()
- global variable 'g_foo'
Mentioning it everywhere would be helpful.

Quite a minor issue, but the wording of the error messages also varies
somewhat:
- inside the class: field ‘m_foo’ has incomplete type ‘Foo’
- as fn param, the name of the type is omitted : ‘f_’ has incomplete type
- when returning, 'struct Foo' is mentioned: return type ‘struct Foo’ is
incomplete
- but when defining a variable, Foo is not a struct, but an aggregate:
aggregate ‘Foo foo’ has incomplete type and cannot be defined

[Bug web/79738] Documentation for __attribute__((const)) slightly misleading

2018-12-14 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79738

--- Comment #6 from Martin Sebor  ---
Author: msebor
Date: Fri Dec 14 22:16:43 2018
New Revision: 267156

URL: https://gcc.gnu.org/viewcvs?rev=267156&root=gcc&view=rev
Log:
PR 79738 - Documentation for __attribute__((const)) slightly misleading

gcc/ChangeLog:
* doc/extend.texi (attribute const, pure): Clarify.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/extend.texi

[Bug c++/87814] [9 Regression] ICE in in tsubst_copy, at cp/pt.c:15962 with range-v3

2018-12-14 Thread aoliva at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87814

--- Comment #8 from Alexandre Oliva  ---
Author: aoliva
Date: Fri Dec 14 21:57:07 2018
New Revision: 267155

URL: https://gcc.gnu.org/viewcvs?rev=267155&root=gcc&view=rev
Log:
[PR c++/87814] undefer deferred noexcept on tsubst if request

tsubst_expr and tsubst_copy_and_build are not expected to handle
DEFERRED_NOEXCEPT exprs, but if tsubst_exception_specification takes a
DEFERRED_NOEXCEPT expr with !defer_ok, it just passes the expr on for
tsubst_copy_and_build to barf.

This patch arranges for tsubst_exception_specification to combine the
incoming args with those already stored in a DEFERRED_NOEXCEPT, and
then substitute them into the pattern, when retaining a deferred
noexcept is unacceptable.


for  gcc/cp/ChangeLog

PR c++/87814
* pt.c (tsubst_exception_specification): Handle
DEFERRED_NOEXCEPT with !defer_ok.

for  gcc/testsuite/ChangeLog

PR c++/87814
* g++.dg/cpp1z/pr87814.C: New.

Added:
trunk/gcc/testsuite/g++.dg/cpp1z/pr87814.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/pt.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/88501] Improve suggested alternative to be closer to typo

2018-12-14 Thread dmalcolm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88501

David Malcolm  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2018-12-14
           Assignee|unassigned at gcc dot gnu.org    |dmalcolm at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from David Malcolm  ---
Confirmed.  "string" vs "stting" has edit distance of 1, closer than "stdin". 
I think the issue here is that it's not considering names that would be found
via using-directives.  I have a work-in-progress patch to do so, which makes it
successfully suggest "string" for this case.
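
The distances quoted above can be reproduced with a plain Levenshtein
implementation. This is a textbook sketch for illustration only; the function
name is hypothetical and GCC's own spell-checking code may differ in its
details.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Classic two-row Levenshtein distance: minimum number of single-character
// insertions, deletions and substitutions turning a into b.
std::size_t levenshtein(const std::string &a, const std::string &b)
{
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] != b[j - 1] ? 1 : 0);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```

With it, levenshtein("stting", "string") is 1 and levenshtein("stting",
"stdin") is 2, so "string" is the closer candidate once it is considered at
all.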

[Bug c++/88503] New: 'invalid static_cast' error message could be more helpful

2018-12-14 Thread petschy at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88503

Bug ID: 88503
   Summary: 'invalid static_cast' error message could be more
helpful
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: petschy at gmail dot com
  Target Milestone: ---

class Parent;
class Derived;

Derived* foo(Parent* p)
{
    return static_cast<Derived*>(p);
}

$ g++-9.0.0 -Wall -Wextra -c 20181214-fwddecl_vs_static_cast.cpp 
20181214-fwddecl_vs_static_cast.cpp: In function ‘Derived* foo(Parent*)’:
20181214-fwddecl_vs_static_cast.cpp:6:32: error: invalid static_cast from type
‘Parent*’ to type ‘Derived*’
6 |  return static_cast<Derived*>(p);
  |^


Now imagine that your project is somewhat more complex than the above example,
and Derived derives from Parent. You know it, you are used to it. However, some
code needs only the forward declarations, at other places the full definition
is needed.

When I bumped into this error, it took me quite some time to figure out what
was going on. I double/triple/quad checked the names of the classes: no typo,
yet the error remained. It turned out, of course, that only the forward
declarations were available in that TU.

It would be really useful if the error message were a bit more verbose,
mentioning whether either type is only forward declared, so that no
parent-child relationship info is available.
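
For comparison, a minimal sketch of the same function with the full class
definitions visible. The empty class bodies are an assumption made for
illustration; with only the forward declarations the compiler cannot know that
Derived inherits from Parent, which is why the cast is rejected.

```cpp
// With only "class Parent; class Derived;" in scope, static_cast<Derived*>
// is rejected because the inheritance relationship is unknown.
// Once the full definitions are visible, the same cast is accepted.
class Parent {};
class Derived : public Parent {};

Derived* foo(Parent* p)
{
    return static_cast<Derived*>(p);  // OK: inheritance is known here
}
```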

[Bug target/88502] New: Inline built-in asinh, acosh, atanh for -ffast-math

2018-12-14 Thread jsm28 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88502

Bug ID: 88502
   Summary: Inline built-in asinh, acosh, atanh for -ffast-math
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jsm28 at gcc dot gnu.org
  Target Milestone: ---
Target: i?86-*-*

GCC should support inline code generation for asinh, acosh, atanh functions,
under appropriate fast-math conditions.

glibc's bits/mathinline.h, for 32-bit non-SSE fast-math x86 only, has:

/* The argument range of the inline version of asinhl is slightly reduced.  */
__inline_mathcodeNP (asinh, __x, \
  register long double  __y = __fabsl (__x);  \
  return (log1pl (__y * __y / (__libc_sqrtl (__y * __y + 1.0) + 1.0) + __y)   \
  * __sgn1l (__x)))

__inline_mathcodeNP (acosh, __x, \
  return logl (__x + __libc_sqrtl (__x - 1.0) * __libc_sqrtl (__x + 1.0)))

__inline_mathcodeNP (atanh, __x, \
  register long double __y = __fabsl (__x);   \
  return -0.5 * log1pl (-(__y + __y) / (1.0 + __y)) * __sgn1l (__x))

We're moving away from such inlines in glibc, preferring to leave it to the
compiler to inline standard functions under appropriate conditions.  This
inlining probably only makes sense when logl / log1pl are themselves expanded
inline (but in principle it's otherwise generic; note this x86 code uses long
double, so avoiding reducing the argument range for built-in functions for
narrower types).  (__sgn1l is another inline function, copying the sign of x to
the value 1.0L.)
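
The formulas above, transcribed into standalone C++ for experimentation. This
is only a sketch: the *_inline names and sgn1 are made up here, std::copysign
stands in for __sgn1l, and no claim is made about how GCC would expand these.

```cpp
#include <cmath>

// Sign of x as +1.0L or -1.0L (the role __sgn1l plays in the glibc code).
inline long double sgn1(long double x)
{
    return std::copysign(1.0L, x);
}

inline long double asinh_inline(long double x)
{
    long double y = std::fabs(x);
    return std::log1p(y * y / (std::sqrt(y * y + 1.0L) + 1.0L) + y) * sgn1(x);
}

inline long double acosh_inline(long double x)  // assumes x >= 1
{
    return std::log(x + std::sqrt(x - 1.0L) * std::sqrt(x + 1.0L));
}

inline long double atanh_inline(long double x)  // assumes |x| < 1
{
    long double y = std::fabs(x);
    return -0.5L * std::log1p(-(y + y) / (1.0L + y)) * sgn1(x);
}
```

For moderate arguments these should agree closely with std::asinh, std::acosh
and std::atanh.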

[Bug rtl-optimization/88001] ASMCONS cannot handle properly UNSPEC(CONST)

2018-12-14 Thread vgupta at synopsys dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88001

--- Comment #9 from Vineet Gupta  ---
Can this be backported to the stable gcc-8-branch as well?
glibc folks use that branch for their regular smoke testing, and without the
fix the ARC tools don't even build.

[Bug c++/78986] [7/8/9 Regression] template inner classes are not affected by access specifiers

2018-12-14 Thread balakrishnan.erode at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78986

Balakrishnan B  changed:

   What|Removed |Added

 CC||balakrishnan.erode at gmail dot com

--- Comment #2 from Balakrishnan B  ---
This has nothing to do with inheritance. If the inner class is a template,
access specifiers are ignored. Simpler example:

class A {
    struct B1 {};

    template <typename T>
    struct B2 {};
};

void foo() {
    //A::B1 b1; // This doesn't compile (GOOD)
    A::B2<int> b2; // This compiles (BAD)
}

Explorer: https://gcc.godbolt.org/z/S0YBPu

The bug is present in all versions from GCC 6.1 to trunk. GCC 5 and earlier
are good.

[Bug middle-end/88497] Improve Accumulation in Auto-Vectorized Code

2018-12-14 Thread wschmidt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

--- Comment #4 from Bill Schmidt  ---
Yes, reassociation sounds like the right place to look at this.

[Bug libgomp/88495] An OpenACC async queue is always synchronized with itself

2018-12-14 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88495

--- Comment #1 from Thomas Schwinge  ---
Author: tschwinge
Date: Fri Dec 14 20:43:02 2018
New Revision: 267152

URL: https://gcc.gnu.org/viewcvs?rev=267152&root=gcc&view=rev
Log:
[PR88495] An OpenACC async queue is always synchronized with itself

An OpenACC async queue is always synchronized with itself, so invocations like
"#pragma acc wait(0) async(0)", or "acc_wait_async (0, 0)" don't make a lot of
sense, but are still valid.

libgomp/
PR libgomp/88495
* plugin/plugin-nvptx.c (nvptx_wait_async): Don't refuse
"identical parameters".
* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: Update.
* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Remove.

Removed:
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-80.c
Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/plugin/plugin-nvptx.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c

[Bug libgomp/88484] OpenACC wait directive without wait argument but with async clause

2018-12-14 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88484

--- Comment #1 from Thomas Schwinge  ---
Author: tschwinge
Date: Fri Dec 14 20:42:50 2018
New Revision: 267151

URL: https://gcc.gnu.org/viewcvs?rev=267151&root=gcc&view=rev
Log:
[PR88484] OpenACC wait directive without wait argument but with async clause

We don't correctly handle "#pragma acc wait async (a)" for "a >= 0", handling
as a no-op whereas it should enqueue the appropriate wait operations on
"async (a)".

libgomp/
PR libgomp/88484
* oacc-parallel.c (GOACC_wait): Correct handling for "async >= 0".
* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: New file.

Added:
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c
Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/oacc-parallel.c

[Bug libgomp/88407] [OpenACC] Correctly handle unseen async-arguments

2018-12-14 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88407

--- Comment #1 from Thomas Schwinge  ---
Author: tschwinge
Date: Fri Dec 14 20:42:40 2018
New Revision: 267150

URL: https://gcc.gnu.org/viewcvs?rev=267150&root=gcc&view=rev
Log:
[PR88407] [OpenACC] Correctly handle unseen async-arguments

... which turn the operation into a no-op.

libgomp/
PR libgomp/88407
* plugin/plugin-nvptx.c (nvptx_async_test, nvptx_wait)
(nvptx_wait_async): Unseen async-argument is a no-op.
* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Update.
* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
* testsuite/libgomp.oacc-fortran/lib-12.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Merge into...
* testsuite/libgomp.oacc-c-c++-common/lib-69.c: ... this.  Update.
* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Merge into...
* testsuite/libgomp.oacc-c-c++-common/lib-74.c: ... this.  Update

Removed:
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-71.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-77.c
Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/plugin/plugin-nvptx.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-69.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-74.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-79.c
trunk/libgomp/testsuite/libgomp.oacc-fortran/lib-12.f90

[Bug libgomp/88370] acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync, acc_async_noval

2018-12-14 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88370

--- Comment #1 from Thomas Schwinge  ---
Author: tschwinge
Date: Fri Dec 14 20:42:08 2018
New Revision: 267147

URL: https://gcc.gnu.org/viewcvs?rev=267147&root=gcc&view=rev
Log:
[PR88370] acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync,
acc_async_noval

Per my reading of the OpenACC specification (and as supported by secondary
documentation, such as code examples, or presentations), it's valid to call
"acc_get_cuda_stream"/"acc_set_cuda_stream" also with "acc_async_sync",
"acc_async_noval" arguments, not just with the nonnegative values as currently
implemented.

libgomp/
PR libgomp/88370
* libgomp.texi (acc_get_current_cuda_context, acc_get_cuda_stream)
(acc_set_cuda_stream): Clarify.
* oacc-cuda.c (acc_get_cuda_stream, acc_set_cuda_stream): Use
"async_valid_p".
* plugin/plugin-nvptx.c (nvptx_set_cuda_stream): Refuse "async ==
acc_async_sync".
* testsuite/libgomp.oacc-c-c++-common/acc_set_cuda_stream-1.c: New
file.
* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Update.
* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.

Added:
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_set_cuda_stream-1.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/async_queue-1.c
Modified:
trunk/libgomp/ChangeLog
trunk/libgomp/libgomp.texi
trunk/libgomp/oacc-cuda.c
trunk/libgomp/plugin/plugin-nvptx.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-84.c
trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-85.c

[Bug c++/88261] [9 Regression] ICE: verify_gimple failed (error: non-trivial conversion at assignment)

2018-12-14 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88261

--- Comment #9 from Jeffrey A. Law  ---
Thanks for running with this Bernd.  My brain was too mushy last night to get
anywhere.  I agree that digest_init_r seems like the right place to try and
address this problem.

ISTM we could either add the DECL as a parameter or pass in some kind of state
to indicate if it's automatic or static.  I've got no strong opinion on which
of those two approaches is best -- but Jason will have the final call here.

[Bug c++/86823] [7/8/9 Regression] private member template struct/class is publicly accessible

2018-12-14 Thread aoliva at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86823

--- Comment #7 from Alexandre Oliva  ---
Author: aoliva
Date: Fri Dec 14 20:06:15 2018
New Revision: 267144

URL: https://gcc.gnu.org/viewcvs?rev=267144&root=gcc&view=rev
Log:
[PR86823] retain deferred access checks from outside firewall

We used to preserve deferred access check along with resolved template
ids, but a tentative parsing firewall introduced additional layers of
deferred access checks, so that we don't preserve the checks we
want to any more.

This patch moves the deferred access checks from outside the firewall
into it.


From: Jason Merrill 
for  gcc/cp/ChangeLog

PR c++/86823
* parser.c (cp_parser_template_id): Rearrange deferred access
checks into the firewall.

From: Alexandre Oliva 
for  gcc/testsuite/ChangeLog

PR c++/86823
* g++.dg/pr86823.C: New.

Added:
trunk/gcc/testsuite/g++.dg/pr86823.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/parser.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/88497] Improve Accumulation in Auto-Vectorized Code

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

--- Comment #3 from Jakub Jelinek  ---
Obviously, in either case for -ffast-math only.

[Bug middle-end/88497] Improve Accumulation in Auto-Vectorized Code

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Is that though something the vectorizer should do, or better something some
post-vectorization pass like reassoc which already has such infrastructure
should do?

[Bug target/88498] [9 Regression] FAIL: gcc.target/i386/avx512vl-pr79299-1.c (internal compiler error)

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88498

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Please try https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01085.html

[Bug c++/82294] Array of objects with constexpr constructors initialized from space-inefficient memory image

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82294

--- Comment #4 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 19:37:38 2018
New Revision: 267143

URL: https://gcc.gnu.org/viewcvs?rev=267143&root=gcc&view=rev
Log:
PR c++/82294
PR c++/87436
* expr.h (categorize_ctor_elements): Add p_unique_nz_elts argument.
* expr.c (categorize_ctor_elements_1): Likewise.  Compute it like
p_nz_elts, except don't multiply it by mult.  Adjust recursive call.
Fix up COMPLEX_CST handling.
(categorize_ctor_elements): Add p_unique_nz_elts argument, initialize
it and pass it through to categorize_ctor_elements_1.
(mostly_zeros_p, all_zeros_p): Adjust categorize_ctor_elements callers.
* gimplify.c (gimplify_init_constructor): Likewise.  Don't force
ctor into readonly data section if num_unique_nonzero_elements is
smaller or equal to 1/8 of num_nonzero_elements and size is >= 64
bytes.

* g++.dg/tree-ssa/pr82294.C: New test.
* g++.dg/tree-ssa/pr87436.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/pr82294.C
trunk/gcc/testsuite/g++.dg/tree-ssa/pr87436.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/expr.h
trunk/gcc/gimplify.c
trunk/gcc/testsuite/ChangeLog
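
The gimplify.c condition described in the ChangeLog above can be read as the
following predicate. This is a paraphrase of the ChangeLog text, not the
actual GCC source; prefer_runtime_init and its parameter names are
hypothetical, and the use of integer division for "1/8" is an assumption.

```cpp
#include <cstdint>

// Hypothetical helper paraphrasing the ChangeLog entry; the names and the
// exact comparison are assumptions, not a copy of gimplify.c.
// Returns true when a constructor is "mostly repeats": few distinct nonzero
// elements relative to the total nonzero count, and large enough to matter,
// so it should not be forced into the readonly data section.
bool prefer_runtime_init(std::uint64_t num_nonzero_elements,
                         std::uint64_t num_unique_nonzero_elements,
                         std::uint64_t size_in_bytes)
{
    const std::uint64_t ratio = 8;      // "1/8 of num_nonzero_elements"
    const std::uint64_t min_size = 64;  // "size is >= 64 bytes"
    return num_unique_nonzero_elements <= num_nonzero_elements / ratio
           && size_in_bytes >= min_size;
}
```

For instance, a large array whose million elements all come from one repeated
initializer satisfies the predicate, while a small 32-byte aggregate with all
elements distinct does not.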

[Bug c++/87436] [7/8/9 Regression] G++ produces >300MB .rodata section to initialize struct with big array

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87436

--- Comment #8 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 19:37:38 2018
New Revision: 267143

URL: https://gcc.gnu.org/viewcvs?rev=267143&root=gcc&view=rev
Log:
PR c++/82294
PR c++/87436
* expr.h (categorize_ctor_elements): Add p_unique_nz_elts argument.
* expr.c (categorize_ctor_elements_1): Likewise.  Compute it like
p_nz_elts, except don't multiply it by mult.  Adjust recursive call.
Fix up COMPLEX_CST handling.
(categorize_ctor_elements): Add p_unique_nz_elts argument, initialize
it and pass it through to categorize_ctor_elements_1.
(mostly_zeros_p, all_zeros_p): Adjust categorize_ctor_elements callers.
* gimplify.c (gimplify_init_constructor): Likewise.  Don't force
ctor into readonly data section if num_unique_nonzero_elements is
smaller or equal to 1/8 of num_nonzero_elements and size is >= 64
bytes.

* g++.dg/tree-ssa/pr82294.C: New test.
* g++.dg/tree-ssa/pr87436.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/tree-ssa/pr82294.C
trunk/gcc/testsuite/g++.dg/tree-ssa/pr87436.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/expr.c
trunk/gcc/expr.h
trunk/gcc/gimplify.c
trunk/gcc/testsuite/ChangeLog

[Bug c++/82294] Array of objects with constexpr constructors initialized from space-inefficient memory image

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82294

--- Comment #3 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 19:36:33 2018
New Revision: 267142

URL: https://gcc.gnu.org/viewcvs?rev=267142&root=gcc&view=rev
Log:
PR c++/82294
PR c++/87436
* init.c (build_vec_init): Change num_initialized_elts type from int
to HOST_WIDE_INT.  Build a RANGE_EXPR if e needs to be repeated more
than once.

Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/init.c

[Bug c++/87436] [7/8/9 Regression] G++ produces >300MB .rodata section to initialize struct with big array

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87436

--- Comment #7 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 14 19:36:33 2018
New Revision: 267142

URL: https://gcc.gnu.org/viewcvs?rev=267142&root=gcc&view=rev
Log:
PR c++/82294
PR c++/87436
* init.c (build_vec_init): Change num_initialized_elts type from int
to HOST_WIDE_INT.  Build a RANGE_EXPR if e needs to be repeated more
than once.

Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/init.c

[Bug c++/88501] Improve suggested alternative to be closer to typo

2018-12-14 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88501

Eric Gallager  changed:

   What|Removed |Added

   Keywords||diagnostic
 CC||dmalcolm at gcc dot gnu.org,
   ||egallager at gcc dot gnu.org

--- Comment #1 from Eric Gallager  ---
Another heuristic, besides length in characters, would be that "string" is a
typename and "stdin" isn't. David?

[Bug other/88499] Check for less than zero removed before floating point division causes division by zero (fast-math mode)

2018-12-14 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88499

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #4 from Andrew Pinski  ---
You need to turn trapping math back on with -ftrapping-math for this to work
correctly; -ffast-math turns -ftrapping-math off.

[Bug middle-end/88490] Missed autovectorization when indices are different

2018-12-14 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88490

--- Comment #5 from joseph at codesourcery dot com  ---
On Fri, 14 Dec 2018, rguenther at suse dot de wrote:

> Note I do not think the C standard is sufficiently clear with regard
> to restrict-qualified pointers loaded from memory.

I think this is where "Every access that modifies X shall be considered 
also to modify P, for the purposes of this subclause." comes in.  (See 
what I said at .)  
Modifying s->d[n][0] is considered to modify s->d[n], and so considered to 
modify s->d, and so considered to modify s.  (It's still perfectly valid 
to have n == k; what's not valid is aliasing between objects accessed via 
s->d[0] and s->d[1], for example.)

[Bug c++/88261] [9 Regression] ICE: verify_gimple failed (error: non-trivial conversion at assignment)

2018-12-14 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88261

Bernd Edlinger  changed:

   What|Removed |Added

 CC||bernd.edlinger at hotmail dot de

--- Comment #8 from Bernd Edlinger  ---
Interesting: the above patch adds an error in
gcc/testsuite/g++.dg/warn/Wplacement-new-size-1.C

where there is no ICE but only wrong code (I modified the test case
a bit to demonstrate the problem):

$ cat Wplacement-new-size-1.C
// PR c++/69662 - -Wplacement-new on allocated one element array members
// Exercising the more permissive -Wplacement-new=1.  The difference
// between -Wplacement-new=1 is denoted by "no warning at level 1" in
// the comments below.
// { dg-do compile }
// { dg-options "-Wno-pedantic -Wplacement-new=1" }

typedef __typeof__ (sizeof 0) size_t;

void* operator new (size_t, void *p) { return p; }
void* operator new[] (size_t, void *p) { return p; }

struct Ax { char n, a []; };

typedef __INT16_TYPE__ Int16;

char xx[3];
void fAx2 ()
{
  Ax ax2 = { 1, { 2, 3 } };

  new (ax2.a) Int16(123);
  __builtin_memcpy(xx, &ax2, 3);
}

int main()
{
  fAx2 ();
}

$ g++ -O2 Wplacement-new-size-1.C
$ ./a.out
Segmentation fault (core dumped)
$ g++ -S -O2 Wplacement-new-size-1.C
$ cat  Wplacement-new-size-1.s
        .file   "Wplacement-new-size-1.C"
        .text
        .p2align 4
        .globl  _ZnwmPv
        .type   _ZnwmPv, @function
_ZnwmPv:
.LFB0:
        .cfi_startproc
        movq    %rsi, %rax
        ret
        .cfi_endproc
.LFE0:
        .size   _ZnwmPv, .-_ZnwmPv
        .p2align 4
        .globl  _ZnamPv
        .type   _ZnamPv, @function
_ZnamPv:
.LFB5:
        .cfi_startproc
        movq    %rsi, %rax
        ret
        .cfi_endproc
.LFE5:
        .size   _ZnamPv, .-_ZnamPv
        .section        .rodata
.LC0:
        .byte   1
        .byte   2
        .byte   3
        .text
        .p2align 4
        .globl  _Z4fAx2v
        .type   _Z4fAx2v, @function
_Z4fAx2v:
.LFB2:
        .cfi_startproc
        movzbl  .LC0(%rip), %eax
        movb    %al, -1(%rsp)
        movl    $123, %eax
        movw    %ax, (%rsp)
        movzwl  -1(%rsp), %eax
        movw    %ax, xx(%rip)
        movzbl  1(%rsp), %eax
        movb    %al, xx+2(%rip)
        ret
        .cfi_endproc
.LFE2:
        .size   _Z4fAx2v, .-_Z4fAx2v
        .section        .text.startup,"ax",@progbits
        .p2align 4
        .globl  main
        .type   main, @function
main:
.LFB3:
        .cfi_startproc
        movzbl  .LC0(%rip), %eax
        movb    %al, -1(%rsp)
        movl    $123, %eax
        movw    %ax, (%rsp)
        movzwl  -1(%rsp), %eax
        movw    %ax, xx(%rip)
        movzbl  1(%rsp), %eax
        movb    %al, xx+2(%rip)
        xorl    %eax, %eax
        ret
        .cfi_endproc
.LFE3:
        .size   main, .-main
        .globl  xx
        .bss
        .type   xx, @object
        .size   xx, 3
xx:
        .zero   3
        .ident  "GCC: (GNU) 9.0.0 20181209 (experimental)"
        .section        .note.GNU-stack,"",@progbits



So ax2 actually has only 1 byte of space on the stack,
and "new (ax2.a) Int16(123);"
overwrites the return address on the stack.
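
To see why one byte is not enough, here is a small sketch.
ax_storage_needed and int16_fits are hypothetical helper names, and flexible
array members in C++ are a GNU extension, so this assumes g++ or a compatible
compiler.

```cpp
#include <cstddef>
#include <cstdint>

// Flexible array members are a GNU extension in C++ (standard in C99).
struct Ax { char n; char a[]; };

// Bytes an Ax object must occupy so that `payload` bytes fit in a[].
// sizeof(Ax) counts only `n`; the trailing array contributes nothing.
constexpr std::size_t ax_storage_needed(std::size_t payload)
{
    return sizeof(Ax) + payload;
}

// True when an object with `available` bytes of storage can hold an Int16
// constructed at a[] via placement new.
constexpr bool int16_fits(std::size_t available)
{
    return available >= ax_storage_needed(sizeof(std::int16_t));
}
```

An object that occupies only sizeof(Ax) bytes therefore has no room for the
Int16; at least 3 bytes are needed.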

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Created attachment 45240
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45240&action=edit
gcc9-pr88489.patch

Untested fix.  The problem was that the sd pattern had the same RTL as the pd
for 128-bit vectors, but behaved differently.

[Bug other/88499] Check for less than zero removed before floating point division causes division by zero (fast-math mode)

2018-12-14 Thread fuscated at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88499

--- Comment #3 from Teodor Petrov  ---
@Marc Glisse: Would it be possible to explain why this is not a good idea?
Is there a link to some documentation which explains that this behaviour is
expected?

[Bug c++/88501] New: Improve suggested alternative to be closer to typo

2018-12-14 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88501

Bug ID: 88501
   Summary: Improve suggested alternative to be closer to typo
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jg at jguk dot org
  Target Milestone: ---

"string" spelt as "stting" gives suggestion "stdin" which is 2 characters
different than the expected suggestion to "string".

I imagine it is difficult to get suggestions right, but ideally candidates of
the same length as the typo would be considered a closer match.


$ g++ -Wall -o string string.cpp
string.cpp: In function ‘int main()’:
string.cpp:8:5: error: ‘stting’ was not declared in this scope
 stting buf;
 ^~
string.cpp:8:5: note: suggested alternative: ‘stdin’
 stting buf;
 ^~
 stdin
string.cpp:10:5: error: ‘buf’ was not declared in this scope
 buf = "hello";
 ^~~




// g++ -Wall -o string string.cpp
#include <string>

using namespace std;

int main()
{
stting buf;

buf = "hello";
}

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

--- Comment #3 from Jakub Jelinek  ---
The test FAILs even at -O0 when built with -mavx512vl, when built e.g. with
-mavx512{bw,dq} it works fine.

[Bug tree-optimization/88464] AVX-512 vectorization of masked scatter failing with "not suitable for scatter store"

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88464

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek  ---
Created attachment 45239
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45239&action=edit
gcc9-pr88464-2.patch

Incremental untested patch.
First of all, I've missed an important case for gathers in the testsuite
coverage, float with long index type on -m64, that actually ICEd due to
multiple issues.
The rest of the patch implements masked scatters, though at least for now only
for 512-bit vectors.  The problem for 128-bit and 256-bit is that the
vectorizer computes the comparisons in an integral vector rather than bool
vector (i.e. int); perhaps we could just emit an instruction that sets masks
from the MSB bits of such a vector.
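
A scalar model of what the comment describes, as a sketch only. kLanes,
mask_from_msb and masked_scatter are hypothetical names chosen for
illustration; real code would use AVX-512 mask registers and scatter
instructions rather than loops.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;  // hypothetical lane count for illustration

// Build a bitmask from the most significant bit of each lane, as a scalar
// stand-in for an instruction that converts an integer vector to a mask.
std::uint8_t mask_from_msb(const std::array<std::int32_t, kLanes>& v)
{
    std::uint8_t mask = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (v[i] < 0)  // MSB set <=> negative in two's complement
            mask |= static_cast<std::uint8_t>(1u << i);
    return mask;
}

// Scalar model of a masked scatter: store values[i] to base[index[i]]
// only for lanes whose mask bit is set.
void masked_scatter(float* base,
                    const std::array<std::int32_t, kLanes>& index,
                    const std::array<float, kLanes>& values,
                    std::uint8_t mask)
{
    for (std::size_t i = 0; i < kLanes; ++i)
        if (mask & (1u << i))
            base[index[i]] = values[i];
}
```

A comparison result stored as all-ones (-1) or all-zeros per lane, which is
what an integral vector comparison produces, maps directly onto such a mask.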

[Bug middle-end/88487] union prevents autovectorization

2018-12-14 Thread bugzi...@poradnik-webmastera.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88487

--- Comment #4 from Daniel Fruzynski  ---
OK, I see. Is there any workaround for this? I tried to assign pointer to local
variable directly and with intermediate casting via void*, but it did not help.
Casting S1* to S2* also does not work.

[Bug target/88474] Inline built-in hypot for -ffast-math

2018-12-14 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88474

--- Comment #2 from uros at gcc dot gnu.org ---
Author: uros
Date: Fri Dec 14 17:04:48 2018
New Revision: 267137

URL: https://gcc.gnu.org/viewcvs?rev=267137&root=gcc&view=rev
Log:
PR target/88474
* internal-fn.def (HYPOT): New.
* optabs.def (hypot_optab): New.
* config/i386/i386.md (hypot3): New expander.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.md
trunk/gcc/internal-fn.def
trunk/gcc/optabs.def
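
The ChangeLog above only names the new expander. As a rough illustration of
why inlining hypot is tied to -ffast-math, here is a sketch of the textbook
formula, which is assumed (not confirmed by this message) to be the kind of
shortcut an inline expansion would take. hypot_naive is a made-up name.

```cpp
#include <cmath>

// Naive hypot: fast, but x*x or y*y can overflow or underflow even when the
// true result is representable. Assumed here to be the kind of shortcut that
// is acceptable only under fast-math rules.
inline double hypot_naive(double x, double y)
{
    return std::sqrt(x * x + y * y);
}
```

The library std::hypot is specified to avoid undue intermediate overflow, so
std::hypot(1e200, 1e200) is finite, whereas the naive formula may overflow to
infinity for the same inputs when evaluated in double precision.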

[Bug c++/88261] [9 Regression] ICE: verify_gimple failed (error: non-trivial conversion at assignment)

2018-12-14 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88261

--- Comment #7 from Bernd Edlinger  ---
Created attachment 45238
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45238&action=edit
untested patch

[Bug other/88499] Check for less than zero removed before floating point division causes division by zero (fast-math mode)

2018-12-14 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88499

--- Comment #2 from Marc Glisse  ---
I don't think using fenv.h with -ffast-math makes much sense.

[Bug go/88500] New: [SH]: SETCONTEXT_CLOBBERS_TLS needs to be handled in libgo

2018-12-14 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88500

Bug ID: 88500
   Summary: [SH]: SETCONTEXT_CLOBBERS_TLS needs to be handled in
libgo
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: glaubitz at physik dot fu-berlin.de
CC: cmang at google dot com, ian at airs dot com, jrtc27 at 
jrtc27 dot com,
kkojima at gcc dot gnu.org, olegendo at gcc dot gnu.org
  Target Milestone: ---
Target: sh*-*-*

Trying to build gcc-8 with gccgo enabled fails on sh4 now with:

libtool: compile:  /<>/build/./gcc/xgcc
-B/<>/build/./gcc/ -B/usr/sh4-linux-gnu/bin/
-B/usr/sh4-linux-gnu/lib/ -isystem /usr/sh4-linux-gnu/include -isystem
/usr/sh4-linux-gnu/sys-include -isystem /<>/build/sys-include
-DHAVE_CONFIG_H -I. -I../../../src/libgo -I ../../../src/libgo/runtime
-I../../../src/libgo/../libffi/include -I../libffi/include -pthread
-L../libatomic/.libs -fexceptions -fnon-call-exceptions -fno-stack-protector
-Wall -Wextra -Wwrite-strings -Wcast-qual -D_GNU_SOURCE -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I ../../../src/libgo/../libgcc -I
../../../src/libgo/../libbacktrace -I ../../gcc/include -g -O2 -MT
go-unsafe-pointer.lo -MD -MP -MF .deps/go-unsafe-pointer.Tpo -c
../../../src/libgo/runtime/go-unsafe-pointer.c -o go-unsafe-pointer.o
>/dev/null 2>&1
../../../src/libgo/runtime/proc.c:171:4: error: #error unknown case for
SETCONTEXT_CLOBBERS_TLS
 #  error unknown case for SETCONTEXT_CLOBBERS_TLS
^
../../../src/libgo/runtime/proc.c: In function 'runtime_gogo':
../../../src/libgo/runtime/proc.c:289:2: warning: implicit declaration of
function 'fixcontext'; did you mean 'setcontext'?
[-Wimplicit-function-declaration]
  fixcontext(ucontext_arg(>context[0]));
  ^~
  setcontext
../../../src/libgo/runtime/proc.c: In function 'runtime_mstart':
../../../src/libgo/runtime/proc.c:492:2: warning: implicit declaration of
function 'initcontext'; did you mean 'setcontext'?
[-Wimplicit-function-declaration]
  initcontext();
  ^~~
  setcontext

Looking at libgo/runtime/proc.c, it looks like this is because sh*-*-* defines
SETCONTEXT_CLOBBERS_TLS and it needs to be handled in libgo/runtime/proc.c.

[Bug d/88462] All D execution tests FAIL on Solaris/SPARC

2018-12-14 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88462

--- Comment #11 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #10 from Johannes Pfau  ---
> I guess the proper fix to the alignment problem is using
> 'https://dlang.org/phobos/std_traits.html#classInstanceAlignment' (or rather
> the druntime equivalent) instead of Mutex.alignof + the rounding / slice
> assignment fixes?

Seems plausible: the current situation is nothing more than a hack to
get me further along, and I've only just started reading up on D.

> Regarding the ModuleInfo problem: Although ModuleInfo does have a variable
> size, _flags is the first field in the struct. So the whole struct instance
> has to be misaligned for some reason? Is the minfo section aligned properly?

It is: the minfo sections in both libgdruntime.so and libgphobos.so are
4-byte aligned:

libdruntime/.libs/libgdruntime.so:


Section Header[28]:  sh_name: minfo
    sh_addr:      0x17b834        sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x344           sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0x16b834        sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x4

src/.libs/libgphobos.so:


Section Header[28]:  sh_name: minfo
    sh_addr:      0x6ff014        sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x224           sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0x6ef014        sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x4

And looking at a statically linked test program (gdc283.exe), I see

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x0007c970 in object.ModuleInfo.flags() const (this=...)
at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1541
1541@property uint flags() nothrow pure @nogc { return _flags; }
(gdb) p this
$1 = (const object.ModuleInfo &) @0x12ab33: {_flags = 4100, _index = 0}
(gdb) up
#1  0x0007d118 in object.ModuleInfo.importedModules() const (this=...)
at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1580
1580if (flags & MIimportedModules)
(gdb) up
#2  0x0008ed74 in rt.minfo.ModuleGroup.sortCtors(immutable(char)[]) (this=..., 
cycleHandling=...)
at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:259
259 foreach (imp; m.importedModules)
(gdb) p this
$2 = (rt.minfo.ModuleGroup &) @0x12f228: {_modules = {
  0x12932c , 
  0x1297ac , 
  0x129acc , 
  0x129ae4 , 
  0x12a99c , 0x12aafc , 
  0x12ab33 , 0x12ab42 , 
  0x12ab56 , 0x12ab67 , 
  0x12ab77 , 
  0x12ab89 , 

i.e. everything starts off alright, but goes astray from 0x12ab33
 onwards.
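
The addresses in the gdb output can be checked against the 4-byte alignment
that a uint field such as _flags needs on SPARC, which traps on misaligned
loads. A minimal sketch; is_aligned is a hypothetical helper, not part of
druntime.

```cpp
#include <cstdint>

// True when `addr` is a multiple of `alignment` (a power of two).
constexpr bool is_aligned(std::uintptr_t addr, std::uintptr_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}
```

Here is_aligned(0x12aafc, 4) holds, while 0x12ab33 and 0x12ab42 fail the
check, which matches the SIGSEGV on the ModuleInfo at 0x12ab33.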

[Bug other/88499] Check for less than zero removed before floating point division causes division by zero (fast-math mode)

2018-12-14 Thread fuscated at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88499

--- Comment #1 from Teodor Petrov  ---
Here are the commands used to reproduce the bug:
$ g++ -g -fPIC -Ofast  -msse4.2 -std=c++11 -ffunction-sections -fdata-sections
-ffast-math -fvisibility=hidden -fexceptions -Wno-c++11-extensions
gcc_division.cpp 
$ ./a.out 
p=0.000; i=0
Floating point exception (core dumped)

If I move the if-else that sets y0 out of the loop, to just after the
printf call, it works as expected and there is no SIGFPE.

[Bug other/88499] New: Check for less than zero removed before floating point division causes division by zero (fast-math mode)

2018-12-14 Thread fuscated at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88499

Bug ID: 88499
   Summary: Check for less than zero removed before floating point
division causes division by zero (fast-math mode)
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fuscated at gmail dot com
  Target Milestone: ---

Created attachment 45237
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45237&action=edit
minimal source to reproduce the problem

See the attached file.
I've tried 4.8.2 and 8.2.0 on x86-64 Linux. Both failed with SIGFPE.

[Bug d/88462] All D execution tests FAIL on Solaris/SPARC

2018-12-14 Thread johannespfau at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88462

Johannes Pfau  changed:

   What|Removed |Added

 CC||johannespfau at gmail dot com

--- Comment #10 from Johannes Pfau  ---
I guess the proper fix to the alignment problem is using
'https://dlang.org/phobos/std_traits.html#classInstanceAlignment' (or rather
the druntime equivalent) instead of Mutex.alignof + the rounding / slice
assignment fixes?

Regarding the ModuleInfo problem: Although ModuleInfo does have a variable
size, _flags is the first field in the struct. So the whole struct instance
has to be misaligned for some reason? Is the minfo section aligned properly?
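
The misalignment suspected here can be checked against the addresses in the gdb session quoted earlier in this archive, where the ModuleInfo in question sits at 0x12ab33. A minimal C sketch of that check and of the usual round-up computation follows. The helper names `is_aligned` and `align_up` are illustrative only (they are not druntime functions), and the 4-byte alignment is an assumption based on `_flags` being a 32-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align; align must be a power of two. */
bool is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align; align must be a power of two. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

Under that assumption, `is_aligned(0x12aafc, 4)` holds while `is_aligned(0x12ab33, 4)` does not, which is consistent with the entries going astray from 0x12ab33 onwards.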

[Bug target/88498] [9 Regression] FAIL: gcc.target/i386/avx512vl-pr79299-1.c (internal compiler error)

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88498

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-12-14
 CC||wei3.xiao at intel dot com,
   ||xuepeng.guo at intel dot com
 Ever confirmed|0   |1

--- Comment #1 from H.J. Lu  ---
Also:

FAIL: gcc.target/i386/pr81225.c (internal compiler error)

/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr81225.c
-march=corei7 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -mavx512ifma -O3 -ffloat-store -S -o pr81225.s
during GIMPLE pass: vect
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr81225.c: In
function ‘foo’:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr81225.c:10:1:
internal compiler error: in vect_gen_perm_mask_checked, at
tree-vect-stmts.c:7269
0x14c4956 vect_gen_perm_mask_checked(tree_node*, vec_perm_indices const&)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:7269
0x14b3759 vect_build_gather_load_calls
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:2698
0x14c5c2f vectorizable_load
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:7629
0x14ccb75 vect_transform_stmt(_stmt_vec_info*, gimple_stmt_iterator*,
_slp_tree*, _slp_instance*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:9655
0x14f157a vect_transform_loop_stmt
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:8139
0x14f224f vect_transform_loop(_loop_vec_info*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:8358
0x151d21e try_vectorize_loop_1
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:969
0x151d4b5 try_vectorize_loop
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:1019
0x151d685 vectorize_loops()
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:1101
0x1389785 execute
/export/gnu/import/git/sources/gcc/gcc/tree-ssa-loop.c:414
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
[hjl@gnu-cfl-1 gcc]$

[Bug target/88498] New: [9 Regression] FAIL: gcc.target/i386/avx512vl-pr79299-1.c (internal compiler error)

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88498

Bug ID: 88498
   Summary: [9 Regression] FAIL:
gcc.target/i386/avx512vl-pr79299-1.c (internal
compiler error)
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---
Target: i386,x86-64

On x86-64, r267123 has

[hjl@gnu-cfl-1 gcc]$
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ -march=corei7
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -Ofast -mavx512vl -masm=intel -c -o
avx512vl-pr79299-1.o
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-pr79299-1.c
during GIMPLE pass: vect
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-pr79299-1.c:
In function ‘f1’:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-pr79299-1.c:15:1:
internal compiler error: in vect_gen_perm_mask_checked, at
tree-vect-stmts.c:7269
0x14c4956 vect_gen_perm_mask_checked(tree_node*, vec_perm_indices const&)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:7269
0x14b3759 vect_build_gather_load_calls
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:2698
0x14c5c2f vectorizable_load
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:7629
0x14ccb75 vect_transform_stmt(_stmt_vec_info*, gimple_stmt_iterator*,
_slp_tree*, _slp_instance*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-stmts.c:9655
0x14f157a vect_transform_loop_stmt
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:8139
0x14f224f vect_transform_loop(_loop_vec_info*)
/export/gnu/import/git/sources/gcc/gcc/tree-vect-loop.c:8358
0x151d21e try_vectorize_loop_1
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:969
0x151d4b5 try_vectorize_loop
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:1019
0x151d685 vectorize_loops()
/export/gnu/import/git/sources/gcc/gcc/tree-vectorizer.c:1101
0x1389785 execute
/export/gnu/import/git/sources/gcc/gcc/tree-ssa-loop.c:414
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
[hjl@gnu-cfl-1 gcc]$

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-12-14
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
[hjl@gnu-skx-1 gcc]$ /export/ssd/git/gcc-test-native/bld/gcc/xgcc
-B/export/ssd/git/gcc-test-native/bld/gcc/
/export/ssd/git/gcc-test-native/src-trunk/gcc/testsuite/gcc.target/i386/avx512f-vfixupimmss-2.c
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -mavx512f -march=skylake-avx512
[hjl@gnu-skx-1 gcc]$ ./a.out 
Aborted
[hjl@gnu-skx-1 gcc]$ /export/ssd/git/gcc-test-native/bld/gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=/export/ssd/git/gcc-test-native/bld/gcc/xgcc
Target: x86_64-pc-linux-gnu
Configured with: ../src-trunk/configure --with-arch=native --with-cpu=native
--prefix=/usr/9.0.0 --enable-clocale=gnu --with-system-zlib --enable-shared
--enable-cet --with-demangler-in-ld --enable-libmpx
--with-multilib-list=m32,m64,mx32 --with-fpmath=sse
Thread model: posix
gcc version 9.0.0 20181214 (experimental) [trunk revision 267123] (GCC) 
[hjl@gnu-skx-1 gcc]$ 

--with-arch=native --with-cpu=native is equivalent to -march=skylake-avx512.
It used to pass before r265827.

[Bug libgomp/87835] nvptx offloading: libgomp.oacc-c-c++-common/asyncwait-1.c execution test intermittently fails at -O2

2018-12-14 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87835

--- Comment #1 from Tom de Vries  ---
(In reply to Thomas Schwinge from comment #0)
> After r264397 "[nvptx] Remove use of CUDA unified memory in libgomp", I'm
> seeing (intermittently only, and only on some systems):
> 

I see the failure reproduced consistently with a Quadro M1200.

> I have not yet analyzed what's causing this, but I have some ideas about
> pending patches that might cure it.

OK, let's see if those make it. If not, we may want to investigate and decide
if we want to revert the patch.

[Bug middle-end/88497] Improve Accumulation in Auto-Vectorized Code

2018-12-14 Thread kelvin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

kelvin at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kelvin at gcc dot gnu.org,
   ||rguenther at suse dot de,
   ||segher at gcc dot gnu.org,
   ||wschmidt at gcc dot gnu.org

--- Comment #1 from kelvin at gcc dot gnu.org ---
Consider the following loop:

double sacc = 0.00;
extern double x[], y[];
for (unsigned long long int i = 0; i < N; i++)
  sacc += x[i] * y[i];

Auto-vectorization turns the body of the loop into something close to the
following function foo:

double foo (double accumulator, vector double arg2[], vector double arg3[])
{
  vector double temp;

  temp = arg2[0] * arg3[0];
  accumulator += temp[0] + temp[1];
  temp = arg2[1] * arg3[1];
  accumulator += temp[0] + temp[1];
  temp = arg2[2] * arg3[2];
  accumulator += temp[0] + temp[1];
  temp = arg2[3] * arg3[3];
  accumulator += temp[0] + temp[1];
  return accumulator;
}

Compiled with -O3 -mcpu=power9 -ffast-math, this translates into 25
instructions:

foo:
.LFB11:
.cfi_startproc
lxv 6,0(5)
lxv 10,0(4)
lxv 7,16(5)
lxv 11,16(4)
lxv 8,32(5)
lxv 12,32(4)
lxv 9,48(5)
lxv 0,48(4)
xvmuldp 10,10,6
xvmuldp 11,11,7
xvmuldp 12,12,8
xvmuldp 0,0,9
xxpermdi 7,10,10,3
xxpermdi 8,11,11,3
fadd 10,7,10
xxpermdi 9,12,12,3
fadd 11,8,11
xxpermdi 6,0,0,3
fadd 12,9,12
fadd 0,6,0
fadd 10,10,1
fadd 11,11,10
fadd 1,12,11
fadd 1,0,1
blr


If auto-vectorization were to transform this loop into the following equivalent
code, the resulting translation is only 18 instructions:

double foo (double accumulator, vector double arg2[], vector double arg3[])
{
  vector double temp;

  temp[0] = accumulator;
  temp[1] = 0.0;
  temp += arg2[0] * arg3[0];
  temp += arg2[1] * arg3[1];
  temp += arg2[2] * arg3[2];
  temp += arg2[3] * arg3[3];
  return temp[0] + temp[1];
}

foo:
.LFB11: 
.cfi_startproc  
li 9,0  
lxv 10,0(4) 
lxv 6,0(5)  
lxv 11,16(4)
lxv 7,16(5) 
mtvsrd 0,9  
lxv 12,32(4)
lxv 8,32(5) 
lxv 9,48(5) 
xxpermdi 1,0,1,0
lxv 0,48(4) 
xvmaddadp 1,10,6
xvmaddadp 1,11,7
xvmaddadp 1,12,8
xvmaddadp 1,0,9 
xxpermdi 0,1,1,3
fadd 1,0,1  
blr  
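
The two accumulation shapes above can be compared in portable C. This is only a sketch: `vec2d` is a hypothetical two-lane stand-in for `vector double`, and the function names are made up for illustration. It shows that the proposed form keeps a lane-wise running sum and does a single horizontal add at the end, instead of one horizontal add per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 2-lane "vector double", in plain C. */
typedef struct { double lane[2]; } vec2d;

/* Lane-wise multiply, the analogue of xvmuldp / mulpd. */
static vec2d vec2d_mul(vec2d a, vec2d b)
{
    vec2d r = { { a.lane[0] * b.lane[0], a.lane[1] * b.lane[1] } };
    return r;
}

/* Current shape: horizontal add into the scalar accumulator every step. */
double dot_scalar_acc(double acc, const vec2d *x, const vec2d *y, size_t n)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        vec2d t = vec2d_mul(x[i], y[i]);
        acc += t.lane[0] + t.lane[1];
    }
    return acc;
}

/* Proposed shape: accumulate per lane, one horizontal add at the end. */
double dot_vector_acc(double acc, const vec2d *x, const vec2d *y, size_t n)
{
    assert(x != NULL && y != NULL);
    vec2d t = { { acc, 0.0 } };
    for (size_t i = 0; i < n; i++) {
        vec2d p = vec2d_mul(x[i], y[i]);
        t.lane[0] += p.lane[0];
        t.lane[1] += p.lane[1];
    }
    return t.lane[0] + t.lane[1];
}
```

The two functions add the products in a different order, so for general inputs they can differ in the last bits; that reassociation is why the transformation is discussed under -ffast-math. For inputs whose partial sums are exactly representable, both return the same value.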

I have also experimented with trunk's treatment of x86 targets, and the same
optimization is relevant there:


x86 -O3 -ffast-math optimized translation of the "original" source is:

_foo:
LFB1:
;; 17 insns in original code
;;  movapd: 4   
;;  mulpd:  4   
;;  addsd:  4   
;;  haddpd: 4   
;;  ret:1   
;;  total: 17   

movapd  32(%rdi), %xmm2 ; load arg2[2]  
mulpd   32(%rsi), %xmm2 ; multiply arg2[2] * arg3[2]
movapd  (%rdi), %xmm1   ; load arg2[0]  
movapd  16(%rdi), %xmm3 ; load arg2[1]

[Bug middle-end/88497] New: Improve Accumulation in Auto-Vectorized Code

2018-12-14 Thread kelvin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497

Bug ID: 88497
   Summary: Improve Accumulation in Auto-Vectorized Code
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kelvin at gcc dot gnu.org
  Target Milestone: ---

[Bug target/88496] New: Unnecessary stack adjustment with -mavx512f

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88496

Bug ID: 88496
   Summary: Unnecessary stack adjustment with -mavx512f
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: ubizjak at gmail dot com, wei3.xiao at intel dot com,
xuepeng.guo at intel dot com
  Target Milestone: ---
Target: i386,x86-64

[hjl@gnu-cfl-1 gcc]$ cat /tmp/x.i
struct B
{
  char a[12];
  int b;
};

struct B
f2 (void)
{
  struct B x = {};
  return x;
}
[hjl@gnu-cfl-1 gcc]$ ./xgcc -B./ -O2 -S /tmp/x.i -mavx2
[hjl@gnu-cfl-1 gcc]$ cat x.s
.file   "x.i"
.text
.p2align 4
.globl  f2
.type   f2, @function
f2:
.LFB0:
.cfi_startproc
xorl    %eax, %eax
xorl    %edx, %edx
ret
.cfi_endproc
.LFE0:
.size   f2, .-f2
.ident  "GCC: (GNU) 9.0.0 20181214 (experimental)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 gcc]$ ./xgcc -B./ -O2 -S /tmp/x.i -mavx512f
[hjl@gnu-cfl-1 gcc]$ cat x.s
.file   "x.i"
.text
.p2align 4
.globl  f2
.type   f2, @function
f2:
.LFB0:
.cfi_startproc
subq    $16, %rsp
.cfi_def_cfa_offset 24
xorl    %eax, %eax
xorl    %edx, %edx
addq    $16, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE0:
.size   f2, .-f2
.ident  "GCC: (GNU) 9.0.0 20181214 (experimental)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 gcc]$ 

subq and addq aren't necessary.

[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171

2018-12-14 Thread samtebbs at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044

--- Comment #7 from samtebbs at gcc dot gnu.org ---
I can confirm this test fails on arm-none-linux-gnueabihf when invoking with
"-mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16", as Christophe wrote. Please see the
attached log.

[Bug libgomp/88495] New: An OpenACC async queue is always synchronized with itself

2018-12-14 Thread tschwinge at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88495

Bug ID: 88495
   Summary: An OpenACC async queue is always synchronized with
itself
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: openacc, patch
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: tschwinge at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: cltang at gcc dot gnu.org, jakub at gcc dot gnu.org
  Target Milestone: ---

An OpenACC async queue is always synchronized with itself, so invocations like
"#pragma acc wait(0) async(0)", or "acc_wait_async(0, 0)" don't make a lot of
sense, but are still valid.

This will need to be fixed on all release branches.

I have a patch.
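
For illustration, the kind of invocation described above might appear in user code as follows. This is a sketch, not the test case from the pending patch: the function `scale_twice` and its data clauses are assumptions made for the example, and only the `wait(0) async(0)` directive itself comes from the report. When built without OpenACC support the pragmas are ignored and the loops run on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Doubles every element twice, with both kernels queued on async queue 0. */
void scale_twice(float *a, int n)
{
    assert(a != NULL && n >= 0);

#pragma acc parallel loop copy(a[0:n]) async(0)
    for (int i = 0; i < n; i++)
        a[i] *= 2.0f;

    /* Make queue 0 wait for queue 0.  Valid, but it adds no ordering,
       because operations on one queue are already synchronized with
       each other. */
#pragma acc wait(0) async(0)

#pragma acc parallel loop copy(a[0:n]) async(0)
    for (int i = 0; i < n; i++)
        a[i] *= 2.0f;

    /* Block the host until everything on queue 0 has completed. */
#pragma acc wait(0)
}
```

The result should be the same with or without the `wait(0) async(0)` line, which is the point of the report: the directive is valid but redundant.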

[Bug tree-optimization/88044] [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171

2018-12-14 Thread samtebbs at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88044

samtebbs at gcc dot gnu.org changed:

   What|Removed |Added

 CC||samtebbs at gcc dot gnu.org

--- Comment #6 from samtebbs at gcc dot gnu.org ---
Created attachment 45236
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45236&action=edit
arm-none-linux-gnueabihf with cpu and fpu options

[Bug c++/88261] [9 Regression] ICE: verify_gimple failed (error: non-trivial conversion at assignment)

2018-12-14 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88261

--- Comment #6 from Bernd Edlinger  ---
(In reply to Jeffrey A. Law from comment #5)
> Right, but we're not supposed to ICE, even on invalid code.

Yes, indeed.

I think what would be needed is adding this C-error to the C++FE:

array-6.c: In function 'foo':
array-6.c:12:23: error: non-static initialization of a flexible array member
   12 |   struct str b = { 2, "b" };
  |   ^~~
array-6.c:12:23: note: (near initialization for 'b')


the error depends on TREE_STATIC (decl);
the data flow in the C-FE is just weird,
see "require_constant_value", a global value holds that bit...

I believe the right place to add it in the C++ FE is probably
where this C++ -Wpedantic warning is emitted:

array-6.c:12:27: warning: initialization of a flexible array member
[-Wpedantic]
   12 |   struct str b = { 2, "b" };
  |   ^


Unfortunately that is in digest_init_r, which does not have
access to TREE_STATIC (decl).
If the object is static, that should be a warning, but for
an automatic object it should be an error; that would prevent the ICE.

Either one needs to add a static variable like the C-FE does, or pass the decl
from store_init_value, and probably other places too, into digest_init
and digest_init_flags.

Well, it will probably be better to add an additional decl parameter.

Thoughts?
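
For context, the automatic-storage declaration `struct str b = { 2, "b" };` is what the C front end rejects. A standard-C way to build the same object at run time is to allocate the flexible array member explicitly. The sketch below assumes the `struct str` layout from array-6.c; the helper `str_new` is hypothetical and not part of the testsuite.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as in array-6.c: a length followed by a flexible array. */
struct str { int len; char s[]; };

/* Allocate a struct str holding a copy of text, including its NUL.
   Returns NULL on allocation failure; the caller frees the result. */
struct str *str_new(const char *text)
{
    assert(text != NULL);
    size_t n = strlen(text) + 1;            /* bytes including the NUL */
    struct str *p = malloc(sizeof *p + n);  /* header plus array storage */
    if (p == NULL)
        return NULL;
    p->len = (int)n;
    memcpy(p->s, text, n);
    return p;
}
```

Here `str_new("b")` yields an object with `len == 2` and `s` equal to "b", matching the initializer `{ 2, "b" }` without relying on flexible array member initialization.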

[Bug target/88494] [9 Regression] polyhedron 10% mdbx runtime regression

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88494

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Target||x86_64-*-*
   Target Milestone|--- |9.0

[Bug target/88494] New: [9 Regression] polyhedron 10% mdbx runtime regression

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88494

Bug ID: 88494
   Summary: [9 Regression] polyhedron 10% mdbx runtime regression
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Between r266526 (good) and r266587 (bad) polyhedron mdbx runtime regressed from
6s to 6.7s on a Haswell machine with -Ofast -march=native -funroll-loops.

https://gcc.opensuse.org/gcc-old/c++bench-czerny/pb11/pb11-summary.txt-2-0.html

There are not many interesting changes in the revision range, but so far I have
neither reproduced this elsewhere nor bisected the above revisions.

[Bug target/88483] Unnecessary stack alignment

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88483

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0

--- Comment #2 from H.J. Lu  ---
Fixed for GCC 9.

[Bug target/87370] [7/8/9 Regression] Inefficient return code of struct values

2018-12-14 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87370

--- Comment #14 from H.J. Lu  ---
(In reply to trashyankes from comment #12)
> (In reply to H.J. Lu from comment #11)
> > (In reply to trashyankes from comment #10)
> > 
> > Which GCC are you using? GCC 8.2 generates:
> 
> GCC Explorer :D
> 
> g++ (GCC-Explorer-Build) 9.0.0 20181211 (experimental)
> 
> https://gcc.godbolt.org/z/7AQXiq
> 
> Code is copy paste of test case.
> Compiler is x86-64
> Important are flags: `-O3 -march=skylake-avx512`
> Without it compile this code fine.

It has been fixed in GCC 9 by r267133.

[Bug target/88483] Unnecessary stack alignment

2018-12-14 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88483

--- Comment #1 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Fri Dec 14 12:38:04 2018
New Revision: 267133

URL: https://gcc.gnu.org/viewcvs?rev=267133&root=gcc&view=rev
Log:
x86: Don't use get_frame_size when finalizing stack frame

get_frame_size () returns used stack slots during compilation, which
may be optimized out later.  Since ix86_find_max_used_stack_alignment
is called by ix86_finalize_stack_frame_flags to check if stack frame
is required, there is no need to call get_frame_size () which may give
inaccurate final stack frame size.

Tested on AVX512 machine configured with

--with-arch=native --with-cpu=native

gcc/

PR target/88483
* config/i386/i386.c (ix86_finalize_stack_frame_flags): Don't
use get_frame_size ().

gcc/testsuite/

PR target/88483
* gcc.target/i386/stackalign/pr88483.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/stackalign/pr88483.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/88490] Missed autovectorization when indices are different

2018-12-14 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88490

--- Comment #4 from rguenther at suse dot de  ---
On Fri, 14 Dec 2018, bugzi...@poradnik-webmastera.com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88490
> 
> --- Comment #3 from Daniel Fruzynski  ---
> In this case s->d is a pointer to pointer to double, and both pointer levels
> have the restrict qualifier. I wonder if you could add some tag that s->d[n]
> and s->d[k] point to separate memory areas. This tag could be later used to
> determine that s->d[n][0] and s->d[k][0] also do not overlap.

Not easily.  Consider the loads being hoisted before the if (k > n) check
for example.  Note I do not think the C standard is sufficiently
clear with regard to restrict-qualified pointers loaded from
memory.  Clearly s->d[n][0] and s->d[n][0] alias but they are not
based on each other.

[Bug middle-end/88490] Missed autovectorization when indices are different

2018-12-14 Thread bugzi...@poradnik-webmastera.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88490

--- Comment #3 from Daniel Fruzynski  ---
In this case s->d is a pointer to pointer to double, and both pointer levels have
the restrict qualifier. I wonder if you could add some tag that s->d[n] and s->d[k]
point to separate memory areas. This tag could be later used to determine that
s->d[n][0] and s->d[k][0] also do not overlap.
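
The data structure described here might look like the following. This is a hypothetical reconstruction, not the test case attached to the bug: the names `struct S` and `copy_pair` are invented, and the two-element copy guarded by `k > n` is an assumption based on the checks and memory accesses mentioned elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* d is a restrict pointer to an array of restrict pointers to double. */
struct S {
    double *restrict *restrict d;
};

/* Copy the first two doubles of row k into row n when k > n.
   The guard implies n != k, so the two rows are different pointers,
   but the compiler has to prove that from the loaded pointer values. */
void copy_pair(struct S *s, int n, int k)
{
    assert(s != NULL && s->d != NULL);
    if (k > n) {
        s->d[n][0] = s->d[k][0];
        s->d[n][1] = s->d[k][1];
    }
}
```

The two stores and two loads go through pointers that are themselves loaded from memory, which is the situation the request is about: the restrict qualifier on the inner level is only useful if the compiler can tell that `s->d[n]` and `s->d[k]` are distinct objects.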

[Bug tree-optimization/88464] AVX-512 vectorization of masked scatter failing with "not suitable for scatter store"

2018-12-14 Thread moritz.kreutzer at siemens dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88464

--- Comment #11 from Moritz Kreutzer  ---
Jakub, I can confirm it's working for masked gathers (we have a similar pattern
elsewhere in our code) with the latest trunk. Thanks for looking at the
scatters as well!

[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473

--- Comment #5 from Jakub Jelinek  ---
The rationale for doing it the way it currently is done:
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02612.html

[Bug rtl-optimization/88253] Inlining of function incorrectly deletes volatile register access when using XOR in avr-gcc

2018-12-14 Thread saaadhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88253

Senthil Kumar Selvaraj  changed:

   What|Removed |Added

 CC||saaadhu at gcc dot gnu.org

--- Comment #4 from Senthil Kumar Selvaraj  ---
Oh shoot, just sent https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01028.html :)

[Bug tree-optimization/88492] SLP optimization generates ugly code

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 CC||rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
IIRC we have a duplicate for this.  The issue is the SLP vectorizer doesn't
handle reductions (not implemented) and thus the vector results need
to be decomposed for the scalar reduction tail.  On x86 we get with -mavx2

vmovdqu (%rdi), %xmm0
vpshufb .LC0(%rip), %xmm0, %xmm0
vpmovzxbw   %xmm0, %xmm1
vpsrldq $8, %xmm0, %xmm0
vpmovzxwd   %xmm1, %xmm2
vpsrldq $8, %xmm1, %xmm1
vpmovzxbw   %xmm0, %xmm0
vpmovzxwd   %xmm1, %xmm1
vmovaps %xmm2, -72(%rsp)
movl    -68(%rsp), %eax
vmovaps %xmm1, -56(%rsp)
vpmovzxwd   %xmm0, %xmm1
vpsrldq $8, %xmm0, %xmm0
addl    -52(%rsp), %eax
vpmovzxwd   %xmm0, %xmm0
vmovaps %xmm1, -40(%rsp)
movl    -56(%rsp), %edx
addl    -36(%rsp), %eax
vmovaps %xmm0, -24(%rsp)
addl    -72(%rsp), %edx
addl    -20(%rsp), %eax
addl    -40(%rsp), %edx
addl    -24(%rsp), %edx
addl    %edx, %eax
movl    -48(%rsp), %edx
addl    -64(%rsp), %edx
addl    -32(%rsp), %edx
addl    -16(%rsp), %edx
addl    %edx, %eax
movl    -44(%rsp), %edx
addl    -60(%rsp), %edx
addl    -28(%rsp), %edx
addl    -12(%rsp), %edx
addl    %edx, %eax
ret

The main issue is of course that we fail to elide the stack temporary.
Re-running FRE after loop opts might help here, but of course
SLP vectorization handling the reduction would be best (though the
tail loop is structured badly, not matching up with the head one).

Whether vectorizing this specific testcase's head loop is profitable
or not is questionable on its own of course (but you can easily make
it so and still get similar ugly code in the tail).

[Bug target/88224] Wrong Cortex-R7 and Cortex-R8 FPU configuration

2018-12-14 Thread avieira at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88224

avieira at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from avieira at gcc dot gnu.org ---
Fixed on trunk and gcc-8.

[Bug middle-end/88490] Missed autovectorization when indices are different

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88490

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-12-14
 CC||rguenth at gcc dot gnu.org
 Blocks||49774
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
Confirmed.

  _33 = MEM[(double *)_28 clique 1 base 2];
  MEM[(double *)_32 clique 1 base 2] = _33;
  _35 = MEM[(double *)_28 + 8B clique 1 base 2];
  MEM[(double *)_32 + 8B clique 1 base 2] = _35;
  _49 = MEM[(double *)_28 clique 1 base 2];
  MEM[(double *)_32 clique 1 base 2] = _49;
  _51 = MEM[(double *)_28 + 8B clique 1 base 2];
  MEM[(double *)_32 + 8B clique 1 base 2] = _51;

t.c:13:24: note: can't determine dependence between MEM[(double *)_32 clique 1
base 2] and MEM[(double *)_28 + 8B clique 1 base 2]

As you can see we do not exploit that n != k when assigning restrict tags
to the indirect loaded pointers.  Nor would we do that when you load
those in an unrolled loop.  Our restrict machinery is simply not set up
for this.

I don't see how a compiler could reliably and reasonably implement this.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49774
[Bug 49774] [meta-bug] restrict qualification aliasing issues

[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473

Jakub Jelinek  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org,
   ||uros at gcc dot gnu.org,
   ||vmakarov at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
The *s on the =k, km *mov{di,si}_internal patterns (which I've copied to the
*zero_extend?i?i2 patterns) were introduced in
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01113.html
but there wasn't any discussion on why that has been introduced.  Was that a
fear that the register allocator will start using the mask registers for cases
like
  memory1 = memory2 | memory3;
instead of the GPRs?  I'd say it would be useful to slightly disparage
transfers from GPRs to mask registers and back and perhaps also slightly
disparage mask stores into memory if needed to prevent using mask registers for
the logical ops or shifts with only memory arguments and keep the rest of the
alternative constraints (like =k, km) without any modifiers.  And if that works,
change the mask intrinsics to be normal arithmetics instead of special builtins
with UNSPECs at RTL.  Thoughts on that?

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

Richard Biener  changed:

   What|Removed |Added

 Target|x86 |x86_64-*-*, i?86-*-*
   Target Milestone|--- |9.0

[Bug middle-end/88487] union prevents autovectorization

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88487

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-12-14
 Blocks||49774
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
t.c:22:1: note: can't determine dependence between MEM[(double *)_28 clique 1
base 0] and MEM[(double *)_25 + 8B clique 1 base 0]
t.c:22:1: note: removing SLP instance operations starting from: MEM[(double
*)_28 clique 1 base 0] = _29;
t.c:22:1: note: can't determine dependence between MEM[(double *)_43 clique 1
base 0] and MEM[(double *)_40 + 8B clique 1 base 0]
t.c:22:1: note: removing SLP instance operations starting from: MEM[(double
*)_43 clique 1 base 0] = _44;

  # PT = nonlocal escaped null
  _25 = MEM[(double * restrict *)_21 clique 1 base 0];
  # PT = nonlocal escaped null
  _28 = MEM[(double * restrict *)_26 clique 1 base 0];
..
  MEM[(double *)_28 clique 1 base 0] = _29;
  _31 = MEM[(double *)_25 + 8B clique 1 base 0];

while w/o unions we have

  # PT = null { D.2686 } (nonlocal, restrict)
  _25 = MEM[(double * restrict *)_21 clique 1 base 1];
..
  _29 = MEM[(double *)_25 clique 1 base 2];

that is, the indirect loads from non-union members produce restricted
pointers while those from union members do not.

The reason for this is that points-to analysis doesn't handle unions
in field-sensitive analysis and thus the restrict code doesn't apply.
This can probably be fixed in a reasonable manner in
push_fields_onto_fieldstack
by initializing only_restrict_pointers appropriately for the UNION case.

Not really my top-priority though.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49774
[Bug 49774] [meta-bug] restrict qualification aliasing issues

[Bug rtl-optimization/88478] [9 Regression] valgrind error in cselib_record_sets

2018-12-14 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88478

David Binderman  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org

--- Comment #2 from David Binderman  ---
svn blame says

266873 aoliva && sets[i].src_elt

Revision number is in the correct range, so adding to
distribution list.

[Bug target/88489] [9 Regression] FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution test

2018-12-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88489

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Can't reproduce, the test passes for me just fine on i9-7960X.

[Bug rtl-optimization/88478] [9 Regression] valgrind error in cselib_record_sets

2018-12-14 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88478

Richard Biener  changed:

   What|Removed |Added

  Component|c++ |rtl-optimization
Version|8.0 |9.0
   Target Milestone|--- |9.0
Summary|valgrind error in   |[9 Regression] valgrind
   |cselib_record_sets  |error in
   ||cselib_record_sets
