Go patch committed: Build a single backend type for a type alias

2018-07-11 Thread Ian Lance Taylor
A type alias and its underlying type are identical.  This patch to the
Go frontend by Cherry Zhang builds a single backend type for them.
Previously we build two backend types, which sometimes confuse the
backend's type system.

Also don't include type aliases into the list of named type
declarations, since they are not named types.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 262554)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ea7ac7784791dca517b6681a02c39c11bf136755
+267686fd1dffbc03e610e9f17dadb4e72c75f18d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 262540)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -7604,7 +7604,7 @@ Named_object::get_backend(Gogo* gogo, st
 case NAMED_OBJECT_TYPE:
   {
 Named_type* named_type = this->u_.type_value;
-   if (!Gogo::is_erroneous_name(this->name_))
+   if (!Gogo::is_erroneous_name(this->name_) && !named_type->is_alias())
  type_decls.push_back(named_type->get_backend(gogo));
 
 // We need to produce a type descriptor for every named
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 262540)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -991,6 +991,11 @@ Type::get_backend(Gogo* gogo)
   if (this->btype_ != NULL)
 return this->btype_;
 
+  if (this->named_type() != NULL && this->named_type()->is_alias()) {
+this->btype_ = this->unalias()->get_backend(gogo);
+return this->btype_;
+  }
+
   if (this->forward_declaration_type() != NULL
   || this->named_type() != NULL)
 return this->get_btype_without_hash(gogo);


[Bug c++/86500] accepts-invalid with :: before decltype

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86500

Martin Sebor  changed:

   What|Removed |Added

   Keywords||accepts-invalid
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-12
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Sebor  ---
Confirmed.  Clang, ICC, and MSVC all reject it.

[Bug c++/86499] lambda-expressions with capture-default are allowed at namespace scope

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86499

Martin Sebor  changed:

   What|Removed |Added

   Keywords||accepts-invalid
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-12
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Sebor  ---
Confirmed.  Clang rejects it with:

error: non-local lambda expression cannot have a capture-default

  auto l = [=]{};

^
ICC and MSVC also accept it.

Please remember include compiler options and output in bug reports:
https://gcc.gnu.org/bugs

[Bug c++/86498] g++ allows conversion from string literal to non-const char* in C++11 mode

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86498

Martin Sebor  changed:

   What|Removed |Added

   Keywords||rejects-valid
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-12
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Sebor  ---
Confirmed.  Clang compiles it fine.  ICC and MSVC also reject it.

Please include compiler output in bug reports: https://gcc.gnu.org/bugs

An issue on loop optimization/vectorization

2018-07-11 Thread jiangning liu
For the case below, the code generated by “gcc -O3” is very ugly.



char g_d[1024], g_s1[1024], g_s2[1024];

void test_loop(void)

{

char *d = g_d, *s1 = g_s1, *s2 = g_s2;



for( int y = 0; y < 128; y++ )

{

for( int x = 0; x < 16; x++ )

d[x] = s1[x] + s2[x];

d += 16;

}

}



If we change “for( int x = 0; x < 16; x++ )” to be like “for( int x = 0; x
< 32; x++ )”, very beautiful vectorization code would be generated,



test_loop:

.LFB0:

.cfi_startproc

adrpx2, g_s1

adrpx3, g_s2

add x2, x2, :lo12:g_s1

add x3, x3, :lo12:g_s2

adrpx0, g_d

adrpx1, g_d+2048

add x0, x0, :lo12:g_d

add x1, x1, :lo12:g_d+2048

ldp q1, q2, [x2]

ldp q3, q0, [x3]

add v1.16b, v1.16b, v3.16b

add v0.16b, v0.16b, v2.16b

.p2align 3,,7

.L2:

str q1, [x0]

str q0, [x0, 16]!

cmp x0, x1

bne .L2

ret


The code generated for " for( int x = 0; x < 8; x++ )" is also very ugly.


It looks gcc has potential bugs on loop vectorization. Any idea?



Thanks,

-Jiangning


[Bug c++/84993] Combination of fieldnames and accessor suggestions for misspelled private fields

2018-07-11 Thread dmalcolm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84993

David Malcolm  changed:

   What|Removed |Added

   Keywords||patch
 Status|NEW |ASSIGNED

--- Comment #2 from David Malcolm  ---
Candidate patch:
  https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00594.html

[PATCH] C++: suggestions for misspelled private members (PR c++/84993)

2018-07-11 Thread David Malcolm
PR c++/84993 identifies a problem with our suggestions for
misspelled member names in the C++ FE for the case where the
member is private.

For example, given:

class foo
{
public:
  double get_ratio() const { return m_ratio; }

private:
  double m_ratio;
};

void test(foo *ptr)
{
  if (ptr->ratio >= 0.5)
;// etc
}

...we currently emit this suggestion:

: In function 'void test(foo*)':
:12:12: error: 'class foo' has no member named 'ratio'; did you mean 
'm_ratio'?
   if (ptr->ratio >= 0.5)
^
m_ratio

...but if the user follows this suggestion, they get:

: In function 'void test(foo*)':
:12:12: error: 'double foo::m_ratio' is private within this context
   if (ptr->m_ratio >= 0.5)
^~~
:7:10: note: declared private here
   double m_ratio;
  ^~~
:12:12: note: field 'double foo::m_ratio' can be accessed via 'double 
foo::get_ratio() const'
   if (ptr->m_ratio >= 0.5)
^~~
get_ratio()

It feels wrong to be emitting a fix-it hint that doesn't compile, so this
patch adds the accessor fix-it hint logic to this case, so that we directly
offer a valid suggestion:

: In function 'void test(foo*)':
:12:12: error: 'class foo' has no member named 'ratio'; did you mean
'double foo::m_ratio'? (accessible via 'double foo::get_ratio() const')
   if (ptr->ratio >= 0.5)
^
get_ratio()

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
adds 105 PASS results to g++.sum.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/84993
* call.c (enforce_access): Move diagnostics to...
(complain_about_access): ...this new function.
* cp-tree.h (class access_failure_info): Rename split out field
"m_field_decl" into "m_decl" and "m_diag_decl".
(access_failure_info::record_access_failure): Add tree param.
(access_failure_info::was_inaccessible_p): New accessor.
(access_failure_info::get_decl): New accessor.
(access_failure_info::get_diag_decl): New accessor.
(access_failure_info::get_any_accessor): New member function.
(access_failure_info::add_fixit_hint): New static member function.
(complain_about_access): New decl.
* typeck.c (access_failure_info::record_access_failure): Update
for change to fields.
(access_failure_info::maybe_suggest_accessor): Split out into...
(access_failure_info::get_any_accessor): ...this new function...
(access_failure_info::add_fixit_hint): ...and this new function.
(finish_class_member_access_expr): Split out "has no member named"
error-handling into...
(complain_about_unrecognized_member): ...this new function, and
check that the guessed name is accessible along the access path.
Only provide a spell-correction fix-it hint if it is; otherwise,
attempt to issue an accessor fix-it hint.

gcc/testsuite/ChangeLog:
PR c++/84993
* g++.dg/torture/accessor-fixits-9.C: New test.
---
 gcc/cp/call.c|  64 ++
 gcc/cp/cp-tree.h |  17 ++-
 gcc/cp/typeck.c  | 150 +--
 gcc/testsuite/g++.dg/torture/accessor-fixits-9.C | 119 ++
 4 files changed, 282 insertions(+), 68 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/accessor-fixits-9.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 209c1fd..121f0ca 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6456,6 +6456,38 @@ build_op_delete_call (enum tree_code code, tree addr, 
tree size,
   return error_mark_node;
 }
 
+/* Issue diagnostics about a disallowed access of DECL, using DIAG_DECL
+   in the diagnostics.
+
+   If ISSUE_ERROR is true, then issue an error about the
+   access, followed by a note showing the declaration.
+   Otherwise, just show the note.  */
+
+void
+complain_about_access (tree decl, tree diag_decl, bool issue_error)
+{
+  if (TREE_PRIVATE (decl))
+{
+  if (issue_error)
+   error ("%q#D is private within this context", diag_decl);
+  inform (DECL_SOURCE_LOCATION (diag_decl),
+ "declared private here");
+}
+  else if (TREE_PROTECTED (decl))
+{
+  if (issue_error)
+   error ("%q#D is protected within this context", diag_decl);
+  inform (DECL_SOURCE_LOCATION (diag_decl),
+ "declared protected here");
+}
+  else
+{
+  if (issue_error)
+   error ("%q#D is inaccessible within this context", diag_decl);
+  inform (DECL_SOURCE_LOCATION (diag_decl), "declared here");
+}
+}
+
 /* If the current scope isn't allowed to access DECL along
BASETYPE_PATH, give an error.  The most derived class in
BASETYPE_PATH is the one used to qualify DECL. DIAG_DECL is
@@ -6480,34 +6512,12 @@ enforce_access (tree basetype_path, tree decl, tree 
diag_decl,
 
   if (!accessible_p (basetype_path, decl, true))
 {
+  if 

[Bug c++/86503] New: Segmentation fault signal terminated

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86503

Bug ID: 86503
   Summary: Segmentation fault signal terminated
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

The code is as follow:

template< bool c, class T >
struct enable_if {
typedef T type;
};

template< class T >
struct enable_if< true, T > {};

template< class F, int n >
void
ft( F f, typename enable_if< n == 0, int >::type ) {}

template< class F, int n >
typeof( ft< F, n-1 >( F(), 0 ) )
ft( F f, typename enable_if< n != 0, int >::type ) {}

int main() {
ft< struct a*, 2 >( 0, 0 );
}

g++ produces the following error message:

3regressiontestusingG++typeof.cpp: In substitution of 'template
__typeof__ (ft(F(), 0)) ft(F, typename enable_if<(n != 0),
int>::type) [with F = main()::a*; int n = -897]':
3regressiontestusingG++typeof.cpp:14:21:   recursively required by substitution
of 'template __typeof__ (ft(F(), 0)) ft(F, typename
enable_if<(n != 0), int>::type) [with F = main()::a*; int n = 1]'
3regressiontestusingG++typeof.cpp:14:21:   required by substitution of
'template __typeof__ (ft(F(), 0)) ft(F, typename
enable_if<(n != 0), int>::type) [with F = main()::a*; int n = 2]'
3regressiontestusingG++typeof.cpp:18:30:   required from here
3regressiontestusingG++typeof.cpp:14:21: fatal error: template instantiation
depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum)
 typeof( ft< F, n-1 >( F(), 0 ) )
   ~~^~~~
compilation terminated.

So, I change the command line to:
g++ -ftemplate-depth=100 3regressiontestusingG++typeof.cpp 

This time, the error message is as follow:

g++: internal compiler error: Segmentation fault signal terminated program
cc1plus
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

Anyway, clang++ accepts the code.

[Bug c++/86502] friend declaration specifying a default argument must be a definition

2018-07-11 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86502

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 59480.

*** This bug has been marked as a duplicate of bug 59480 ***

[Bug c++/59480] Missing error diagnostic: friend declaration specifying a default argument must be a definition

2018-07-11 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59480

Andrew Pinski  changed:

   What|Removed |Added

 CC||zhonghao at pku dot org.cn

--- Comment #8 from Andrew Pinski  ---
*** Bug 86502 has been marked as a duplicate of this bug. ***

[Bug c++/86502] New: friend declaration specifying a default argument must be a definition

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86502

Bug ID: 86502
   Summary: friend declaration specifying a default argument must
be a definition
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

The code is as follow:

class Test
{
 friend const int getInt(int inInt = 0);
};


g++ accepts it, but clang++ rejects it:

error: friend declaration specifying a default argument must be a definition

[Bug c++/86501] New: shadow template parameter

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86501

Bug ID: 86501
   Summary: shadow template parameter
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

The code is as follows:

template < int V >
struct A
{ 
 struct B
 { 
 template < int > friend struct V; 
 };
};

A < 0 >::B a;

g++ accepts the code, but clang++ rejects it:

error: declaration of 'V' shadows template parameter
 template < int > friend struct V; 
^
note: template parameter is declared here
template < int V >
   ^
1 error generated.

[Bug c++/86500] New: accepts-invalid with :: before decltype

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86500

Bug ID: 86500
   Summary: accepts-invalid with :: before decltype
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

g++ incorrectly accepts this:

struct S { struct T {}; };
::decltype(S())::T st;

clang++ rejects it:

error: expected unqualified-id

[Bug c++/86499] New: lambda-expressions with capture-default are allowed at namespace scope

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86499

Bug ID: 86499
   Summary: lambda-expressions with capture-default are allowed at
namespace scope
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

g++ incorrectly accepts this:

  auto l = [=]{};

Per C++1y [expr.lambda]p9 this is ill-formed: only local lambda expressions may
have capture-defaults (even if they don't actually capture anything).

[Bug c++/86498] New: g++ allows conversion from string literal to non-const char* in C++11 mode

2018-07-11 Thread zhonghao at pku dot org.cn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86498

Bug ID: 86498
   Summary: g++ allows conversion from string literal to non-const
char* in C++11 mode
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhonghao at pku dot org.cn
  Target Milestone: ---

g++ rejects this valid C++11 code:

  void f(char*);
  int (...);
  int  = f("foo");

There is no conversion from string literal to non-const char* in C++11.

[PATCH] doc: update looping constructs

2018-07-11 Thread Paul Koning
This patch removes the obsolete documentation for 
decrement_and_branch_until_zero.  It also adds detail to the description for 
doloop_end.  In particular, it describes the required form of the conditional 
branch part of the pattern.

Ok for trunk?

paul

ChangeLog:

2018-07-11  Paul Koning  

* doc/rtl.texi (REG_NONNEG): Remove decrement and branch until
zero reference, add doloop_end instead.
* doc/md.texi (decrement_and_branch_until_zero): Remove.
(Looping patterns): Remove decrement_and_branch_until_zero.  Add
detail for doloop_end.

Index: doc/md.texi
===
--- doc/md.texi (revision 262562)
+++ doc/md.texi (working copy)
@@ -6681,17 +6681,6 @@ second operand, but you should incorporate it in t
 that the jump optimizer will not delete the table as unreachable code.
 
 
-@cindex @code{decrement_and_branch_until_zero} instruction pattern
-@item @samp{decrement_and_branch_until_zero}
-Conditional branch instruction that decrements a register and
-jumps if the register is nonzero.  Operand 0 is the register to
-decrement and test; operand 1 is the label to jump to if the
-register is nonzero.  @xref{Looping Patterns}.
-
-This optional instruction pattern is only used by the combiner,
-typically for loops reversed by the loop optimizer when strength
-reduction is enabled.
-
 @cindex @code{doloop_end} instruction pattern
 @item @samp{doloop_end}
 Conditional branch instruction that decrements a register and
@@ -7515,67 +7504,12 @@ iterations.  This avoids the need for fetching and
 @samp{dbra}-like instruction and avoids pipeline stalls associated with
 the jump.
 
-GCC has three special named patterns to support low overhead looping.
-They are @samp{decrement_and_branch_until_zero}, @samp{doloop_begin},
-and @samp{doloop_end}.  The first pattern,
-@samp{decrement_and_branch_until_zero}, is not emitted during RTL
-generation but may be emitted during the instruction combination phase.
-This requires the assistance of the loop optimizer, using information
-collected during strength reduction, to reverse a loop to count down to
-zero.  Some targets also require the loop optimizer to add a
-@code{REG_NONNEG} note to indicate that the iteration count is always
-positive.  This is needed if the target performs a signed loop
-termination test.  For example, the 68000 uses a pattern similar to the
-following for its @code{dbra} instruction:
+GCC has two special named patterns to support low overhead looping.
+They are @samp{doloop_begin} and @samp{doloop_end}.  These are emitted
+by the loop optimizer for certain well-behaved loops with a finite
+number of loop iterations using information collected during strength
+reduction.
 
-@smallexample
-@group
-(define_insn "decrement_and_branch_until_zero"
-  [(set (pc)
-(if_then_else
-  (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am")
-   (const_int -1))
-  (const_int 0))
-  (label_ref (match_operand 1 "" ""))
-  (pc)))
-   (set (match_dup 0)
-(plus:SI (match_dup 0)
- (const_int -1)))]
-  "find_reg_note (insn, REG_NONNEG, 0)"
-  "@dots{}")
-@end group
-@end smallexample
-
-Note that since the insn is both a jump insn and has an output, it must
-deal with its own reloads, hence the `m' constraints.  Also note that
-since this insn is generated by the instruction combination phase
-combining two sequential insns together into an implicit parallel insn,
-the iteration counter needs to be biased by the same amount as the
-decrement operation, in this case @minus{}1.  Note that the following similar
-pattern will not be matched by the combiner.
-
-@smallexample
-@group
-(define_insn "decrement_and_branch_until_zero"
-  [(set (pc)
-(if_then_else
-  (ge (match_operand:SI 0 "general_operand" "+d*am")
-  (const_int 1))
-  (label_ref (match_operand 1 "" ""))
-  (pc)))
-   (set (match_dup 0)
-(plus:SI (match_dup 0)
- (const_int -1)))]
-  "find_reg_note (insn, REG_NONNEG, 0)"
-  "@dots{}")
-@end group
-@end smallexample
-
-The other two special looping patterns, @samp{doloop_begin} and
-@samp{doloop_end}, are emitted by the loop optimizer for certain
-well-behaved loops with a finite number of loop iterations using
-information collected during strength reduction.
-
 The @samp{doloop_end} pattern describes the actual looping instruction
 (or the implicit looping operation) and the @samp{doloop_begin} pattern
 is an optional companion pattern that can be used for initialization
@@ -7594,15 +7528,65 @@ desired special iteration counter register was not
 machine dependent reorg pass could emit a traditional compare and jump
 instruction pair.
 
-The essential difference between the
-@samp{decrement_and_branch_until_zero} and the @samp{doloop_end}
-patterns is that the loop optimizer allocates an additional 

[PING][PATCH][Aarch64] v2: Arithmetic overflow addv patterns [Patch 2/4]

2018-07-11 Thread Michael Collison
Ping. Last patch here:

https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00735.html



[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #8 from Alexander Monakov  ---
(In reply to H.J. Lu from comment #7)
> It is to be consistent for common symbol linked against .a or .so.

That seems like a really strange reason because without --whole-archive there
are other ways to arrive at an apparent "inconsistency", while with
--whole-archive there's no need for special treatment as the "consistent"
result is achieved automatically.

Re: [PATCH, rs6000] Add support for gimple folding vec_perm()

2018-07-11 Thread Segher Boessenkool
On Mon, Jul 09, 2018 at 02:08:55PM -0500, Will Schmidt wrote:
>Add support for early gimple folding of vec_perm.   Testcases are already 
> in-tree as
> gcc.target/powerpc/fold-vec-perm-*.c
> 
> OK for trunk?

Looks fine to me.  Okay if no one else complains :-)


Segher


>   * gcc/config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support
>   for folding vec_perm.


[Bug tree-optimization/86489] ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-11 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

--- Comment #1 from kugan at gcc dot gnu.org ---
Sorry about the breakage, I am trying to reproduce it on x86-64. Please let me
know if you have testcase.

Re: [PATCH, rs6000] Testcase adds for vec_unpack

2018-07-11 Thread Segher Boessenkool
Hi Will,


On Mon, Jul 09, 2018 at 02:08:49PM -0500, Will Schmidt wrote:
>   * gcc.target/powerpc/fold-vec-unpack-char.c: New.
>   * gcc.target/powerpc/fold-vec-unpack-float.c: New.
>   * gcc.target/powerpc/fold-vec-unpack-int.c: New.
>   * gcc.target/powerpc/fold-vec-unpack-pixel.c: New.
>   * gcc.target/powerpc/fold-vec-unpack-short.c: New.

This looks fine.  Okay for trunk.  Thanks!


Segher


Re: [PATCH, rs6000] gimple folding support for vec_pack and vec_unpack

2018-07-11 Thread Segher Boessenkool
Hi!

On Mon, Jul 09, 2018 at 02:08:37PM -0500, Will Schmidt wrote:
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin):
>   Add support for gimple-folding of vec_pack() and vec_unpack()
>   intrinsics.

> +case ALTIVEC_BUILTIN_VUPKHPX:
> +case ALTIVEC_BUILTIN_VUPKLPX:
> +  {
> +   return false;
> +  }

A block around a signle statement looks a bit silly (and in the other
cases in your patch it isn't necessary either; it is nice if you use it
to give some scope to a local var, but you don't have that here).

But, patch is fine as far as I can see :-)


Segher


gcc-6-20180711 is now available

2018-07-11 Thread gccadmin
Snapshot gcc-6-20180711 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20180711/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch 
revision 262567

You'll find:

 gcc-6-20180711.tar.xzComplete GCC

  SHA256=f634ad6c731071247ed18e1d043072272121ea876604a73b17060e9be858a570
  SHA1=c09f5418af872a5eb0c9cbaa4855991d2e739db6

Diffs from 6-20180704 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-11 Thread Richard Earnshaw (lists)
On 11/07/18 21:46, Jeff Law wrote:
> On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
>> On 10/07/18 16:42, Jeff Law wrote:
>>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
 On 10/07/18 00:13, Jeff Law wrote:
> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>
>> To address all of the above, these patches adopt a new approach, based
>> in part on a posting by Chandler Carruth to the LLVM developers list
>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
>> but which we have extended to deal with inter-function speculation.
>> The patches divide the problem into two halves.
> We're essentially turning the control dependency into a value that we
> can then use to munge the pointer or the resultant data.
>
>>
>> The first half is some target-specific code to track the speculation
>> condition through the generated code to provide an internal variable
>> which can tell us whether or not the CPU's control flow speculation
>> matches the data flow calculations.  The idea is that the internal
>> variable starts with the value TRUE and if the CPU's control flow
>> speculation ever causes a jump to the wrong block of code the variable
>> becomes false until such time as the incorrect control flow
>> speculation gets unwound.
> Right.
>
> So one of the things that comes immediately to mind is you have to run
> this early enough that you can still get to all the control flow and
> build your predicates.  Otherwise you have do undo stuff like
> conditional move generation.

 No, the opposite, in fact.  We want to run this very late, at least on
 Arm systems (AArch64 or AArch32).  Conditional move instructions are
 fine - they're data-flow operations, not control flow (in fact, that's
 exactly what the control flow tracker instructions are).  By running it
 late we avoid disrupting any of the earlier optimization passes as well.
>>> Ack.  I looked at the aarch64 implementation after sending my message
>>> and it clearly runs very late.
>>>
>>> I haven't convinced myself that all the work generic parts of the
>>> compiler to rewrite and eliminate conditionals is safe.  But even if it
>>> isn't, you're probably getting enough coverage to drastically reduce the
>>> attack surface.  I'm going to have to think about the early
>>> transformations we make and how they interact here harder.  But I think
>>> the general approach can dramatically reduce the attack surface.
>>
>> My argument here would be that we are concerned about speculation that
>> the CPU does with the generated program.  We're not particularly
>> bothered about the abstract machine description it's based upon.  As
>> long as the earlier transforms lead to a valid translation (it hasn't
>> removed a necessary bounds check) then running late is fine.
> I'm thinking about obfuscation of the bounds check or the pointer or
> turning branchy into straightline code, possibly doing some speculation
> in the process, if-conversion and the like.
> 
> For example hoist_adjacent_loads which results in speculative loads and
> likely a conditional move to select between the two loaded values.
> 
> Or what if we've done something like
> 
> if (x < maxval)
>res = *p;
> 
> And we've turned that into
> 
> 
> t = *p;
> res = (x < maxval) ? t : res;

Hmm, interesting.  But for that to be safe, the compiler would have to
be able to prove that dereferencing p was safe even if x >= maxval,
otherwise the run-time code could fault (so if there's any chance that
it could point to something vulnerable, then there must also be a chance
that it points to unmapped memory).  Given that requirement, I don't
think this case can be a specific concern, since the requirement implies
that p must already be within some known bounds for the type of object
it points to.

R.

> 
> 
> That may be implemented as a conditional move at the RTL level, so
> protecting that may be nontrivial.
> 
> In those examples the compiler itself has introduced the speculation.
> 
> I can't find the conditional obfuscation I was looking for, so it's hard
> to rule it in our out as potentially problematical.
> 
> WRT pointer obfuscation, we no longer propagate conditional equivalences
> very agressively, so it may be a non-issue in the end.
> 
> But again, even with these concerns I think what you're doing cuts down
> the attack surface in meaningful ways.
> 
> 
> 
>>
>> I can't currently conceive a situation where the compiler would be able
>> to remove a /necessary/ bounds check that could lead to unsafe
>> speculation later on.  A redundant bounds check removal shouldn't be a
>> problem as the non-redundant check should remain and that will still get
>> tracking code added.
> It's less about removal and more about either compiler-generated
> speculation or obfuscation of the patterns you're looking for.
> 
> 
> jeff
> 
> 
> 
> 



[Bug middle-end/86467] inlining strcmp with small known length array

2018-07-11 Thread qing.zhao at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86467

--- Comment #2 from Qing Zhao  ---
> --- Comment #1 from Richard Biener  ---
> I think there's a duplicate report.

you mean another similar PR existing?

[PATCH] Fix store-merging wrong-code issue (PR tree-optimization/86492)

2018-07-11 Thread Jakub Jelinek
Hi!

The following testcase is a similar issue to PR84503 and the fix is similar,
because coalesce_immediate_stores temporarily sorts the stores on ascending
bitpos and if stores are merged, the merged store is emitted in the location
of the latest of the stores, we need to verify that there is no overlap with
other stores that are originally before that latest store from those we are
considering and overlaps the set of stores we are considering to merge.
In that case we need to punt and not merge (unless we do smarts like prove
overlap between just some of the stores and force reordering).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk and 8.2?

2018-07-11  Jakub Jelinek  

PR tree-optimization/86492
* gimple-ssa-store-merging.c
(imm_store_chain_info::coalesce_immediate_stores): Call
check_no_overlap even for the merge_overlapping case.  Formatting fix.

* gcc.c-torture/execute/pr86492.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2018-06-13 10:05:53.0 +0200
+++ gcc/gimple-ssa-store-merging.c  2018-07-11 19:24:12.084120206 +0200
@@ -2702,7 +2702,12 @@ imm_store_chain_info::coalesce_immediate
{
  /* Only allow overlapping stores of constants.  */
  if (info->rhs_code == INTEGER_CST
- && merged_store->stores[0]->rhs_code == INTEGER_CST)
+ && merged_store->stores[0]->rhs_code == INTEGER_CST
+ && check_no_overlap (m_store_info, i, INTEGER_CST,
+  MAX (merged_store->last_order, info->order),
+  MAX (merged_store->start
+   + merged_store->width,
+   info->bitpos + info->bitsize)))
{
  merged_store->merge_overlapping (info);
  goto done;
@@ -2732,10 +2737,8 @@ imm_store_chain_info::coalesce_immediate
  info->ops_swapped_p = true;
}
  if (check_no_overlap (m_store_info, i, info->rhs_code,
-   MAX (merged_store->last_order,
-info->order),
-   MAX (merged_store->start
-+ merged_store->width,
+   MAX (merged_store->last_order, info->order),
+   MAX (merged_store->start + merged_store->width,
 info->bitpos + info->bitsize)))
{
  /* Turn MEM_REF into BIT_INSERT_EXPR for bit-field stores.  */
--- gcc/testsuite/gcc.c-torture/execute/pr86492.c.jj2018-07-11 
19:40:27.760122514 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr86492.c   2018-07-11 
19:40:13.460107841 +0200
@@ -0,0 +1,34 @@
+/* PR tree-optimization/86492 */
+
+union U
+{
+  unsigned int r;
+  struct S
+  {
+unsigned int a:12;
+unsigned int b:4;
+unsigned int c:16;
+  } f;
+};
+
+__attribute__((noipa)) unsigned int
+foo (unsigned int x)
+{
+  union U u;
+  u.r = 0;
+  u.f.c = x;
+  u.f.b = 0xe;
+  return u.r;
+}
+
+int
+main ()
+{
+  union U u;
+  if (__CHAR_BIT__ * __SIZEOF_INT__ != 32 || sizeof (u.r) != sizeof (u.f))
+return 0;
+  u.r = foo (0x72);
+  if (u.f.a != 0 || u.f.b != 0xe || u.f.c != 0x72)
+__builtin_abort ();
+  return 0;
+}

Jakub


[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

--- Comment #6 from Jason Vas Dias  ---
Thanks Andrew!

But, please explain, why does using a static reference cause
anonymous namespace issues ?
Where is this mandated in the C++ standards ?

I understand that any reference to a static object can violate the ODR
rule IFF multiple translation units that contain references to the "same"
static object are combined. But how does that involve anonymous namespace
issues?

Maybe GCC should provide some mechanism to detect static references,
report them separately as a new  "-Wstatic-reference"? warning ,
and NOT bring in 'anonymous namespace' usage issues at all ?
It is purely a semantic difference - I agree SOME warning should
have been issued - but complaining about 'anonymous namespace'
when it means 'static reference' is really confusing.
There are NO objects named '{anon::}' in the program.
Couldn't GCC have some trigger on the creation of an '{anon}::'
reference, and use it to report the anonymous namespace usage,
and if none existed, complain about the static reference usage ?

Thanks & Best Regards,
Jason


On 11/07/2018, pinskia at gcc dot gnu.org  wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491
>
> Andrew Pinski  changed:
>
>What|Removed |Added
> 
>Keywords||diagnostic
>
> --- Comment #5 from Andrew Pinski  ---
> It is not complaining about d in main but rather N::D.  The warning is done
> before it reaches main.
>
> Really the warning should be clearer as you are not using an anonymous
> namespace rather static linkage which is also causes an anonymous namespace
> issues.

[Bug middle-end/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

Martin Sebor  changed:

   What|Removed |Added

   Keywords||patch
  Component|c   |middle-end

--- Comment #10 from Martin Sebor  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00586.html

[PATCH] reject conflicting attributes before calling handlers (PR 86453)

2018-07-11 Thread Martin Sebor

The attached change set adjusts the attribute exclusion code
to detect and reject incompatible attributes before attribute
handlers are called to have a chance to make changes despite
the exclusions.  The handlers are not run when a conflict is
found.

Tested on x86_64-linux.  I expected the fallout to be bigger
but only a handful of tests needed adjusting and the changes
all look like clear improvements.  I.e., conflicting attributes
that diagnosed as being ignored really are being ignored as one
would expect.

Martin
PR c/86453 - error: type variant differs by TYPE_PACKED in free_lang_data since r255469

gcc/ChangeLog:

	PR c/86453
	* attribs.c (decl_attributes): Reject conflicting attributes before
	calling attribute handlers.

gcc/testsuite/ChangeLog:

	PR c/86453
	* c-c++-common/Wattributes.c: Adjust.
	* gcc.dg/Wattributes-10.c: New test.
	* g++.dg/Wattributes-3.C: Adjust.
	* g++.dg/lto/pr86453_0.C: New test.
	* gcc.dg/Wattributes-6.c: Adjust.
	* gcc.dg/pr18079.c: Adjust.
	* gcc.dg/torture/pr42363.c: Adjust.

Index: gcc/attribs.c
===
--- gcc/attribs.c	(revision 262542)
+++ gcc/attribs.c	(working copy)
@@ -672,6 +672,35 @@ decl_attributes (tree *node, tree attributes, int
 
   bool no_add_attrs = false;
 
+  /* Check for exclusions with other attributes on the current
+	 declation as well as the last declaration of the same
+	 symbol already processed (if one exists).  Detect and
+	 reject incompatible attributes.  */
+  bool built_in = flags & ATTR_FLAG_BUILT_IN;
+  if (spec->exclude
+	  && (flag_checking || !built_in))
+	{
+	  /* Always check attributes on user-defined functions.
+	 Check them on built-ins only when -fchecking is set.
+	 Ignore __builtin_unreachable -- it's both const and
+	 noreturn.  */
+
+	  if (!built_in
+	  || !DECL_P (*anode)
+	  || (DECL_FUNCTION_CODE (*anode) != BUILT_IN_UNREACHABLE
+		  && (DECL_FUNCTION_CODE (*anode)
+		  != BUILT_IN_UBSAN_HANDLE_BUILTIN_UNREACHABLE)))
+	{
+	  bool no_add = diag_attr_exclusions (last_decl, *anode, name, spec);
+	  if (!no_add && anode != node)
+		no_add = diag_attr_exclusions (last_decl, *node, name, spec);
+	  no_add_attrs |= no_add;
+	}
+	}
+
+  if (no_add_attrs)
+	continue;
+
   if (spec->handler != NULL)
 	{
 	  int cxx11_flag =
@@ -695,33 +724,6 @@ decl_attributes (tree *node, tree attributes, int
 	returned_attrs = chainon (ret, returned_attrs);
 	}
 
-  /* If the attribute was successfully handled on its own and is
-	 about to be added check for exclusions with other attributes
-	 on the current declation as well as the last declaration of
-	 the same symbol already processed (if one exists).  */
-  bool built_in = flags & ATTR_FLAG_BUILT_IN;
-  if (spec->exclude
-	  && !no_add_attrs
-	  && (flag_checking || !built_in))
-	{
-	  /* Always check attributes on user-defined functions.
-	 Check them on built-ins only when -fchecking is set.
-	 Ignore __builtin_unreachable -- it's both const and
-	 noreturn.  */
-
-	  if (!built_in
-	  || !DECL_P (*anode)
-	  || (DECL_FUNCTION_CODE (*anode) != BUILT_IN_UNREACHABLE
-		  && (DECL_FUNCTION_CODE (*anode)
-		  != BUILT_IN_UBSAN_HANDLE_BUILTIN_UNREACHABLE)))
-	{
-	  bool no_add = diag_attr_exclusions (last_decl, *anode, name, spec);
-	  if (!no_add && anode != node)
-		no_add = diag_attr_exclusions (last_decl, *node, name, spec);
-	  no_add_attrs |= no_add;
-	}
-	}
-
   /* Layout the decl in case anything changed.  */
   if (spec->type_required && DECL_P (*node)
 	  && (VAR_P (*node)
Index: gcc/testsuite/c-c++-common/Wattributes.c
===
--- gcc/testsuite/c-c++-common/Wattributes.c	(revision 262542)
+++ gcc/testsuite/c-c++-common/Wattributes.c	(working copy)
@@ -39,13 +39,13 @@ PackedPacked { int i; };
aligned and packed on a function declaration.  */
 
 void ATTR ((aligned (8), packed))
-faligned8_1 (void);   /* { dg-warning ".packed. attribute ignored" } */
+faligned8_1 (void);   /* { dg-warning "ignoring attribute .packed. because it conflicts with attribute .aligned." } */
 
 void ATTR ((aligned (8)))
-faligned8_2 (void);   /* { dg-message "previous declaration here" "" { xfail *-*-* } } */
+faligned8_2 (void);   /* { dg-message "previous declaration here" } */
 
 void ATTR ((packed))
-faligned8_2 (void);   /* { dg-warning ".packed. attribute ignored" } */
+faligned8_2 (void);   /* { dg-warning "ignoring attribute .packed. because it conflicts with attribute .aligned." } */
 
 /* Exercise the handling of the mutually exclusive attributes
always_inline and noinline (in that order).  */
Index: gcc/testsuite/g++.dg/Wattributes-3.C
===
--- gcc/testsuite/g++.dg/Wattributes-3.C	(revision 262542)
+++ 

[PATCH][Middle-end][version 3]3rd patch of PR78809

2018-07-11 Thread Qing Zhao
Hi,   This is the 3rd version of the patch for the last part of PR78809.

the major change in this version is to address the following concerns raised by 
Martin:

> One of the basic design principles that I myself have
> accidentally violated in the past is that warning options
> should not impact the emitted object code.  I don't think
> your patch actually does introduce this dependency by having
> the codegen depend on the result of check_access() -- I'm
> pretty sure the function is designed to do the validation
> irrespective of warning options and return based on
> the result of the validation and not based on whether
> a warning was issued.  But the choice of the variable name,
> no_overflow_warn, suggests that it does, in fact, have this
> effect.  So I would suggest to rename the variable and add
> a test that verifies that this dependency does not exist.

I have addressed this concern as following per our discussion:

1. in routine expand_builtin_memcmp, 
* delete the condition if (warn_stringop_overflow) before check_access;
* change the name of the variable that holds the return value of check_access 
to no_overflow

2. in the testsuite, change the new testcase strcmpopt_6.c to inhibit inlining 
when check_access
detects error (Not depend on whether the warning option is ON or not).

the following is the new patch, tested on both X86 and aarch64, no regression.

Okay for thunk?

thanks.

Qing

gcc/ChangeLog:

+2018-07-11  Qing Zhao  
+
+   PR middle-end/78809
+   * builtins.c (expand_builtin_memcmp): Inline the calls first
+   when result_eq is false.
+   (expand_builtin_strcmp): Inline the calls first.
+   (expand_builtin_strncmp): Likewise.
+   (inline_string_cmp): New routine. Expand a string compare 
+   call by using a sequence of char comparison.
+   (inline_expand_builtin_string_cmp): New routine. Inline expansion
+   a call to str(n)cmp/memcmp.
+   * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
option.
+   * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
+

gcc/testsuite/ChangeLog:

+2018-07-11  Qing Zhao  
+
+   PR middle-end/78809
+   * gcc.dg/strcmpopt_5.c: New test.
+   * gcc.dg/strcmpopt_6.c: New test.
+



0001-3nd-Patch-for-PR78009.patch
Description: Binary data


> On Jul 5, 2018, at 10:46 AM, Qing Zhao  wrote:
> 
> Hi,
> 
> I have sent two emails with the updated patches on 7/3:
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00065.html
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00070.html
> 
> however, these 2 emails  were not successfully forwarded to the 
> gcc-patches@gcc.gnu.org mailing list.
> 
> So, I am sending the same email again in this one, hopefully this time it can 
> go through.
> Qing
> 
> Hi, Jeff,
> 
> thanks a lot for your review and comments.
> 
> I have addressed your comments,updated the patch, retested on both
> aarch64 and x86.
> 
> The major changes in this version compared to the previous version are:
> 
>   1. in routine expand_builtin_memcmp:
> * move the inlining transformation AFTER the warning is issues for
> -Wstringop-overflow;
> * only apply inlining when there is No warning is issued.
>   2. in the testsuite, add a new testcase strcmpopt_6.c for this case.
>   3. update comments to:
> * capitalize the first word.
> * capitalize all the arguments.
> 
> NOTE, the routine expand_builtin_strcmp and expand_builtin_strncmp are not 
> changed.
> the reason is:  there is NO overflow checking for these two routines 
> currently.
> if we need overflow checking for these two routines, I think that a separate 
> patch is needed.
> if this is needed, let me know, I can work on this separate patch for issuing 
> warning for strcmp/strncmp when
> -Wstringop-overflow is specified.



Re: RFC: lra-constraints.c and TARGET_HARD_REGNO_CALL_PART_CLOBBERED question/patch

2018-07-11 Thread Jeff Law
On 07/11/2018 02:07 PM, Steve Ellcey wrote:
> I have a reload/register allocation question and possible patch.  While
> working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
> saving and restoring registers that it did not need to.  I tracked it
> down to lra-constraints.c and its use of
> targetm.hard_regno_call_part_clobbered on instructions that are not
> calls.  Specifically need_for_call_save_p would check this macro even
> when the instruction in question (unknown to need_for_call_save_p)
> was not a call instruction.
> 
> This seems wrong to me and I was wondering if anyone more familiar
> with the register allocator and reload could look at this patch and
> tell me if it seems reasonable or not.  It passed bootstrap and I
> am running tests now.  I am just wondering if there is any reason why
> this target function would need to be called on non-call instructions
> or if doing so is just an oversight/bug.
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> [1] https://gcc.gnu.org/ml/gcc/2018-07/msg00012.html
> 
> 
> 2018-07-11  Steve Ellcey  
> 
>   * lra-constraints.c (need_for_call_save_p): Add insn argument
>   and only check targetm.hard_regno_call_part_clobbered on calls.
>   (need_for_split_p): Add insn argument, pass to need_for_call_save_p.
>   (split_reg): Pass insn to need_for_call_save_p.
>   (split_if_necessary): Pass curr_insn to need_for_split_p.
>   (inherit_in_ebb): Ditto.
Various target have calls which are exposed as INSNs rather than as
CALL_INSNs.   So we need to check that hook on all insns.

You can probably see this in action with the TLS insns on aarch64.

jeff


[Bug c++/86497] New: Regression for x!=x

2018-07-11 Thread no...@turm-lahnstein.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86497

Bug ID: 86497
   Summary: Regression for x!=x
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: no...@turm-lahnstein.de
  Target Milestone: ---

When compiling

bool is_nan1(double x){
return x!=x;
}

with g++-8.1  -O3 the resulting assembler (https://godbolt.org/g/BBFM3Q) is 

_Z7is_nan1d:
  ucomisd %xmm0, %xmm0
  movl $1, %edx
  setne %al
  cmovp %edx, %eax
  ret

However, for version 7.3 the result was (https://godbolt.org/g/tR69jf) better:

_Z7is_nan1d:
  ucomisd %xmm0, %xmm0
  setp %al
  ret

Also for 8.1 -Os is the assembler somewhat strange:

_Z7is_nan1d:
  ucomisd %xmm0, %xmm0
  movb $1, %al
  jp .L2
  setne %al

[Bug lto/86496] [9 regression] plugin required to handle lto object

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86496

Martin Sebor  changed:

   What|Removed |Added

 CC||msebor at gcc dot gnu.org

--- Comment #1 from Martin Sebor  ---
I think these are the same failures those in bug 86004.

[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #5 from Andrew Pinski  ---
It is not complaining about d in main but rather N::D.  The warning is done
before it reaches main.

Really the warning should be clearer as you are not using an anonymous
namespace rather static linkage which is also causes an anonymous namespace
issues.

Re: [PATCH][RFC] Make iterating over hash-map elide copying/destructing

2018-07-11 Thread Pedro Alves
On 07/11/2018 12:24 PM, Trevor Saunders wrote:
> However if we went that route we should prevent use of the
> assignment operator by declaring one explicitly and making it private but
> then not implementing it, so it at least fails to link and with some
> macros you can actually tell the compiler in c++11 its deleted and may
> not be used.

The macro already exists --- DISABLE_COPY_AND_ASSIGN in include/ansidecl.h.

Thanks,
Pedro Alves


[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #7 from H.J. Lu  ---
(In reply to Alexander Monakov from comment #6)
> (In reply to H.J. Lu from comment #5)
> > When ld sees a common symbol, it will use a non-common definiton
> > in a library, .a or .so, to override it.
> 
> This is surprising, is it documented somewhere? I don't think the ELF spec
> suggests something like that needs to happen.

It is to be consistent for common symbol linked against .a or .so.

[Patch, Fortran] PR 85599: warn about short-circuiting of logical expressions for non-pure functions

2018-07-11 Thread Janus Weil
Hi all,

after the dust of the heated discussion around this PR has settled a
bit, here is another attempt to implement at least some basic warnings
about compiler-dependent behavior concerning the short-circuiting of
logical expressions.

As a reminder (and recap of the previous discussion), the Fortran
standard unfortunately is a bit sloppy in this area: It allows
compilers to short-circuit the second operand of .AND. / .OR.
operators, but does not require this. As a result, compilers can do
what they want without conflicting with the standard, and they do:
gfortran does short-circuiting (via TRUTH_ANDIF_EXPR/TRUTH_ORIF_EXPR),
ifort does not.

I'm continuing here the least-invasive approach of keeping gfortran's
current behavior, but warning about cases where compilers may produce
different results.

The attached patch is very close to the version I posted previously
(which was already approved by Janne), with the difference that the
warnings are now triggered by -Wextra and not -Wsurprising (which is
included in -Wall), as suggested by Nick Maclaren. I think this is
more reasonable, since not everyone may want to see these warnings.

Note that I don't want to warn about all possible optimizations that
might be allowed by the standard, but only about those that are
actually problematic in practice and result in compiler-dependent
behavior.

The patch regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus


2018-07-11  Thomas Koenig  
Janus Weil  

PR fortran/85599
* dump-parse-tree (show_attr): Add handling of implicit_pure.
* resolve.c (impure_function_callback): New function.
(resolve_operator): Call it vial gfc_expr_walker.


2018-07-11  Janus Weil  

PR fortran/85599
* gfortran.dg/short_circuiting.f90: New test.
Index: gcc/fortran/dump-parse-tree.c
===
--- gcc/fortran/dump-parse-tree.c	(revision 262563)
+++ gcc/fortran/dump-parse-tree.c	(working copy)
@@ -716,6 +716,8 @@ show_attr (symbol_attribute *attr, const char * mo
 fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
 fputs (" PURE", dumpfile);
+  if (attr->implicit_pure)
+fputs (" IMPLICIT_PURE", dumpfile);
   if (attr->recursive)
 fputs (" RECURSIVE", dumpfile);
 
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 262563)
+++ gcc/fortran/resolve.c	(working copy)
@@ -3822,6 +3822,46 @@ lookup_uop_fuzzy (const char *op, gfc_symtree *uop
 }
 
 
+/* Callback finding an impure function as an operand to an .and. or
+   .or.  expression.  Remember the last function warned about to
+   avoid double warnings when recursing.  */
+
+static int
+impure_function_callback (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+			  void *data)
+{
+  gfc_expr *f = *e;
+  const char *name;
+  static gfc_expr *last = NULL;
+  bool *found = (bool *) data;
+
+  if (f->expr_type == EXPR_FUNCTION)
+{
+  *found = 1;
+  if (f != last && !pure_function (f, ))
+	{
+	  /* This could still be a function without side effects, i.e.
+	 implicit pure.  Do not warn for that case.  */
+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	  || !gfc_implicit_pure (f->symtree->n.sym))
+	{
+	  if (name)
+		gfc_warning (OPT_Wextra,
+			 "Function %qs at %L might not be evaluated",
+			 name, >where);
+	  else
+		gfc_warning (OPT_Wextra,
+			 "Function at %L might not be evaluated",
+			 >where);
+	}
+	}
+  last = f;
+}
+
+  return 0;
+}
+
+
 /* Resolve an operator expression node.  This can involve replacing the
operation with a user defined function call.  */
 
@@ -3930,6 +3970,14 @@ resolve_operator (gfc_expr *e)
 	gfc_convert_type (op1, >ts, 2);
 	  else if (op2->ts.kind < e->ts.kind)
 	gfc_convert_type (op2, >ts, 2);
+
+	  if (e->value.op.op == INTRINSIC_AND || e->value.op.op == INTRINSIC_OR)
+	{
+	  /* Warn about short-circuiting
+	 with impure function as second operand.  */
+	  bool op2_f = false;
+	  gfc_expr_walker (, impure_function_callback, _f);
+	}
 	  break;
 	}
 
! { dg-do compile }
! { dg-additional-options "-Wextra" }
!
! PR 85599: warn about short-circuiting of logical expressions for non-pure functions
!
! Contributed by Janus Weil 

program short_circuit

   logical :: flag
   flag = .false.
   flag = check() .and. flag
   flag = flag .and. check()  ! { dg-warning "might not be evaluated" }
   flag = flag .and. pure_check()

contains

   logical function check()
  integer, save :: i = 1
  print *, "check", i
  i = i + 1
  check = .true.
   end function

   logical pure function pure_check()
  pure_check = .true.
   end function

end


Re: [PATCH 6/7] AArch64 - new pass to add conditional-branch speculation tracking

2018-07-11 Thread Jeff Law
On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
> This patch is the main part of the speculation tracking code.  It adds
> a new target-specific pass that is run just before the final branch
> reorg pass (so that it can clean up any new edge insertions we make).
> The pass is only run with -mtrack-speculation is passed on the command
> line.
> 
> One thing that did come to light as part of this was that the stack pointer
> register was not being permitted in comparision instructions.  We rely on
> that for moving the tracking state between SP and the scratch register at
> function call boundaries.
Note that the sp in comparison instructions issue came up with the
improvements to stack-clash that Tamar, Richard S. and you worked on.


> 
>   * config/aarch64/aarch64-speculation.cc: New file.
>   * config/aarch64/aarch64-passes.def (pass_track_speculation): Add before
>   pass_reorder_blocks.
>   * config/aarch64/aarch64-protos.h (make_pass_track_speculation): Add
>   prototype.
>   * config/aarch64/aarch64.c (aarch64_conditional_register_usage): Fix
>   X14 and X15 when tracking speculation.
>   * config/aarch64/aarch64.md (register name constants): Add
>   SPECULATION_TRACKER_REGNUM and SPECULATION_SCRATCH_REGNUM.
>   (unspec): Add UNSPEC_SPECULATION_TRACKER.
>   (speculation_barrier): New insn attribute.
>   (cmp): Allow SP in comparisons.
>   (speculation_tracker): New insn.
>   (speculation_barrier): Add speculation_barrier attribute.
>   * config/aarch64/t-aarch64: Add make rule for aarch64-speculation.o.
>   * config.gcc (aarch64*-*-*): Add aarch64-speculation.o to extra_objs.
>   * doc/invoke.texi (AArch64 Options): Document -mtrack-speculation.
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/aarch64/aarch64-passes.def |   1 +
>  gcc/config/aarch64/aarch64-protos.h   |   3 +-
>  gcc/config/aarch64/aarch64-speculation.cc | 494 
> ++
>  gcc/config/aarch64/aarch64.c  |  13 +
>  gcc/config/aarch64/aarch64.md |  30 +-
>  gcc/config/aarch64/t-aarch64  |  10 +
>  gcc/doc/invoke.texi   |  10 +-
>  8 files changed, 558 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/config/aarch64/aarch64-speculation.cc
Given the consensus forming about using these kind of masking
instructions being the preferred way to mitigate (as opposed to lfence
barriers and the like) I have to ask your opinions about making the bulk
of this a general pass rather than one specific to the aarch backend.
I'd hate to end up duplicating all this stuff across multiple architectures.

I think it all looks pretty reasonable though.

jeff



Re: [PATCH 0/7] Mitigation against unsafe data speculation (CVE-2017-5753)

2018-07-11 Thread Jeff Law
On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote:
> On 10/07/18 16:42, Jeff Law wrote:
>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote:
>>> On 10/07/18 00:13, Jeff Law wrote:
 On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>
> To address all of the above, these patches adopt a new approach, based
> in part on a posting by Chandler Carruth to the LLVM developers list
> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html),
> but which we have extended to deal with inter-function speculation.
> The patches divide the problem into two halves.
 We're essentially turning the control dependency into a value that we
 can then use to munge the pointer or the resultant data.

>
> The first half is some target-specific code to track the speculation
> condition through the generated code to provide an internal variable
> which can tell us whether or not the CPU's control flow speculation
> matches the data flow calculations.  The idea is that the internal
> variable starts with the value TRUE and if the CPU's control flow
> speculation ever causes a jump to the wrong block of code the variable
> becomes false until such time as the incorrect control flow
> speculation gets unwound.
 Right.

 So one of the things that comes immediately to mind is you have to run
 this early enough that you can still get to all the control flow and
 build your predicates.  Otherwise you have do undo stuff like
 conditional move generation.
>>>
>>> No, the opposite, in fact.  We want to run this very late, at least on
>>> Arm systems (AArch64 or AArch32).  Conditional move instructions are
>>> fine - they're data-flow operations, not control flow (in fact, that's
>>> exactly what the control flow tracker instructions are).  By running it
>>> late we avoid disrupting any of the earlier optimization passes as well.
>> Ack.  I looked at the aarch64 implementation after sending my message
>> and it clearly runs very late.
>>
>> I haven't convinced myself that all the work generic parts of the
>> compiler to rewrite and eliminate conditionals is safe.  But even if it
>> isn't, you're probably getting enough coverage to drastically reduce the
>> attack surface.  I'm going to have to think about the early
>> transformations we make and how they interact here harder.  But I think
>> the general approach can dramatically reduce the attack surface.
> 
> My argument here would be that we are concerned about speculation that
> the CPU does with the generated program.  We're not particularly
> bothered about the abstract machine description it's based upon.  As
> long as the earlier transforms lead to a valid translation (it hasn't
> removed a necessary bounds check) then running late is fine.
I'm thinking about obfuscation of the bounds check or the pointer or
turning branchy into straightline code, possibly doing some speculation
in the process, if-conversion and the like.

For example hoist_adjacent_loads which results in speculative loads and
likely a conditional move to select between the two loaded values.

Or what if we've done something like

if (x < maxval)
   res = *p;

And we've turned that into


t = *p;
res = (x < maxval) ? t : res;


That may be implemented as a conditional move at the RTL level, so
protecting that may be nontrivial.

In those examples the compiler itself has introduced the speculation.

I can't find the conditional obfuscation I was looking for, so it's hard
to rule it in our out as potentially problematical.

WRT pointer obfuscation, we no longer propagate conditional equivalences
very agressively, so it may be a non-issue in the end.

But again, even with these concerns I think what you're doing cuts down
the attack surface in meaningful ways.



> 
> I can't currently conceive a situation where the compiler would be able
> to remove a /necessary/ bounds check that could lead to unsafe
> speculation later on.  A redundant bounds check removal shouldn't be a
> problem as the non-redundant check should remain and that will still get
> tracking code added.
It's less about removal and more about either compiler-generated
speculation or obfuscation of the patterns you're looking for.


jeff






RFC: lra-constraints.c and TARGET_HARD_REGNO_CALL_PART_CLOBBERED question/patch

2018-07-11 Thread Steve Ellcey
I have a reload/register allocation question and possible patch.  While
working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
saving and restoring registers that it did not need to.  I tracked it
down to lra-constraints.c and its use of
targetm.hard_regno_call_part_clobbered on instructions that are not
calls.  Specifically need_for_call_save_p would check this macro even
when the instruction in question (unknown to need_for_call_save_p)
was not a call instruction.

This seems wrong to me and I was wondering if anyone more familiar
with the register allocator and reload could look at this patch and
tell me if it seems reasonable or not.  It passed bootstrap and I
am running tests now.  I am just wondering if there is any reason why
this target function would need to be called on non-call instructions
or if doing so is just an oversight/bug.

Steve Ellcey
sell...@cavium.com


[1] https://gcc.gnu.org/ml/gcc/2018-07/msg00012.html


2018-07-11  Steve Ellcey  

* lra-constraints.c (need_for_call_save_p): Add insn argument
and only check targetm.hard_regno_call_part_clobbered on calls.
(need_for_split_p): Add insn argument, pass to need_for_call_save_p.
(split_reg): Pass insn to need_for_call_save_p.
(split_if_necessary): Pass curr_insn to need_for_split_p.
(inherit_in_ebb): Ditto.


diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 7eeec76..7fc8e7f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5344,7 +5344,7 @@ inherit_reload_reg (bool def_p, int original_regno,
 /* Return true if we need a caller save/restore for pseudo REGNO which
was assigned to a hard register.  */
 static inline bool
-need_for_call_save_p (int regno)
+need_for_call_save_p (int regno, rtx_insn *insn)
 {
   lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0);
   return (usage_insns[regno].calls_num < calls_num
@@ -5354,7 +5354,7 @@ need_for_call_save_p (int regno)
       ? lra_reg_info[regno].actual_call_used_reg_set
       : call_used_reg_set,
       PSEUDO_REGNO_MODE (regno), reg_renumber[regno])
-     || (targetm.hard_regno_call_part_clobbered
+     || (CALL_P (insn) && targetm.hard_regno_call_part_clobbered
      (reg_renumber[regno], PSEUDO_REGNO_MODE (regno);
 }
 
@@ -5374,7 +5374,8 @@ static bitmap_head ebb_global_regs;
assignment pass because of too many generated moves which will be
probably removed in the undo pass.  */
 static inline bool
-need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno)
+need_for_split_p (HARD_REG_SET potential_reload_hard_regs,
+     int regno, rtx_insn *insn)
 {
   int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno];
 
@@ -5416,7 +5417,8 @@ need_for_split_p (HARD_REG_SET 
potential_reload_hard_regs, int regno)
       || (regno >= FIRST_PSEUDO_REGISTER
       && lra_reg_info[regno].nrefs > 3
       && bitmap_bit_p (_global_regs, regno
-     || (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno)));
+     || (regno >= FIRST_PSEUDO_REGISTER
+     && need_for_call_save_p (regno, insn)));
 }
 
 /* Return class for the split pseudo created from original pseudo with
@@ -5536,7 +5538,7 @@ split_reg (bool before_p, int original_regno, rtx_insn 
*insn,
   nregs = hard_regno_nregs (hard_regno, mode);
   rclass = lra_get_allocno_class (original_regno);
   original_reg = regno_reg_rtx[original_regno];
-  call_save_p = need_for_call_save_p (original_regno);
+  call_save_p = need_for_call_save_p (original_regno, insn);
 }
   lra_assert (hard_regno >= 0);
   if (lra_dump_file != NULL)
@@ -5759,7 +5761,7 @@ split_if_necessary (int regno, machine_mode mode,
     && INSN_UID (next_usage_insns) < max_uid)
    || (GET_CODE (next_usage_insns) == INSN_LIST
    && (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid)))
-   && need_for_split_p (potential_reload_hard_regs, regno + i)
+   && need_for_split_p (potential_reload_hard_regs, regno + i, insn)
    && split_reg (before_p, regno + i, insn, next_usage_insns, NULL))
 res = true;
   return res;
@@ -6529,7 +6531,8 @@ inherit_in_ebb (rtx_insn *head, rtx_insn *tail)
      && usage_insns[j].check == curr_usage_insns_check
      && (next_usage_insns = usage_insns[j].insns) != NULL_RTX)
    {
-     if (need_for_split_p (potential_reload_hard_regs, j))
+     if (need_for_split_p (potential_reload_hard_regs, j,
+   curr_insn))
    {
      if (lra_dump_file != NULL && head_p)
    {


[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

--- Comment #4 from Jason Vas Dias  ---
Aha! It is simply that the object pointer template parameter cannot
have static (translation unit) linkage here:

  namespace NA 
  {  class C { ... };
 static C c_;
   /*^^*/
  }

If I remove the 'static', no warning is generated .

This -Wsubobject-linkage warning is therefore doubly confusing !

Since I am instantiating an object whose symbol has automatic storage class,
('d' in main()), 
why should gcc complain that I have an object containing a static reference ?

And why does it have anything to do with -Wsubobject-linkage or use of 
anonymous namespaces ?

There really is no anonymous namespace usage going on in this code.

I really do want all users of the 'NT' template to get a pointer to the same
global 'class C' 'c_' object , which is ONLY defined in the one place, 
but multiple files must be able to include the same 't2.h' header .

I see now I should have used 'extern C c_', and defined it in tM.C.

But actually, in the context in which it was used, since there was no
other defining translation unit, and it was only instantiated in a main()
program, I do not think the warning should have been issued. If I was
actually trying to instantiate multiple 'class D' objects from multiple
translation units, there would be a problem, but I was not.

And really, that '-Wsubobject-linkage' should be split into:
  A) Detect genuine anonymous namespace use
  B) Detect usage of static object references in headers that can be
 included by multiple files, and issue a separate warning message like 
  'static object reference may not be to same object\
   if used in multiple translation units'
 or something like that.

It is highly confusing to claim that code uses anonymous namespaces 
when it does not.

Re: [PATCH] Fix __mmask* types on many AVX512 intrinsics

2018-07-11 Thread Jeff Law
On 07/07/2018 02:15 AM, Jakub Jelinek wrote:
> Hi!
> 
> On Fri, Jul 06, 2018 at 12:47:07PM +0200, Jakub Jelinek wrote:
>> On Thu, Jul 05, 2018 at 11:57:26PM +0300, Grazvydas Ignotas wrote:
>>> I think it would be more efficient if you took care of it. I won't
>>> have time for at least a few days anyway.
> 
> Here is the complete patch, I found two further issues where
> the __mmask mismatch was in between the return type and what was used
> in the rest of the intrinsic, so not caught by my earlier greps.
> 
> I've added (except for the avx512bitalg which seems to have no runtime
> test coverage whatsoever) tests that cover the real bugs and further
> fixed the avx512*-vpcmp{,u}b-2.c test because (rel) << i triggered UB
> if i could go up to 63.
> 
> I don't have AVX512* hw, so I've just bootstrapped/regtested the patch
> normally on i686-linux and x86_64-linux AVX2 hw and tried the affected
> tests without the config/i386/ changes and with them under SDE.
> The patch should fix these FAILs:
> 
> FAIL: gcc.target/i386/avx512bw-vpcmpb-2.c execution test
> FAIL: gcc.target/i386/avx512bw-vpcmpub-2.c execution test
> FAIL: gcc.target/i386/avx512f-vinsertf32x4-3.c execution test
> FAIL: gcc.target/i386/avx512f-vinserti32x4-3.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgeuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpgew-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpleuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmplew-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltub-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltuw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpltw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpneqb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpnequb-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpnequw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpneqw-2.c execution test
> FAIL: gcc.target/i386/avx512vl-vpcmpub-2.c execution test
> 
> Ok for trunk?
> 
> I guess we want to backport it soon, but would appreciate somebody testing
> it on real AVX512-{BW,VL} hw before doing the backports.
> 
> Another thing to consider is whether we shouldn't add those grep/sed checks
> I've been doing (at least the easy ones that don't cross-check the
> i386-builtins.def against the uses in the intrin headers) to config/i386/t-*
> some way.
> 
> 2018-07-07  Jakub Jelinek  
> 
>   * config/i386/avx512bitalgintrin.h (_mm512_mask_bitshuffle_epi64_mask):
>   Use __mmask64 type instead of __mmask8 for __M argument.
>   * config/i386/avx512fintrin.h (_mm512_mask_xor_epi64,
>   _mm512_maskz_xor_epi64): Use __mmask8 type instead of __mmask16 for
>   __U argument.
>   (_mm512_mask_cmpneq_epi64_mask): Use __mmask8 type instead of
>   __mmask16 for __M argument.
>   (_mm512_maskz_insertf32x4, _mm512_maskz_inserti32x4,
>   _mm512_mask_insertf32x4, _mm512_mask_inserti32x4): Cast last argument
>   to __mmask16 instead of __mmask8.
>   * config/i386/avx512vlintrin.h (_mm_mask_add_ps, _mm_maskz_add_ps,
>   _mm256_mask_add_ps, _mm256_maskz_add_ps, _mm_mask_sub_ps,
>   _mm_maskz_sub_ps, _mm256_mask_sub_ps, _mm256_maskz_sub_ps,
>   _mm256_maskz_cvtepi32_ps, _mm_maskz_cvtepi32_ps): Use __mmask8 type
>   instead of __mmask16 for __U argument.
>   * config/i386/avx512vlbwintrin.h (_mm_mask_cmp_epi8_mask): Use
>   __mmask16 instead of __mmask8 for __U argument.
>   (_mm256_mask_cmp_epi8_mask): Use __mmask32 instead of __mmask16 for
>   __U argument.
>   (_mm256_cmp_epi8_mask): Use __mmask32 return type instead of
>   __mmask16.
>   (_mm_mask_cmp_epu8_mask): Use __mmask16 instead of __mmask8 for __U
>   argument.
>   (_mm256_mask_cmp_epu8_mask): Use __mmask32 instead of __mmask16 for
>   __U argument.
>   (_mm256_cmp_epu8_mask): Use __mmask32 return type instead of
>   __mmask16.
>   (_mm_mask_cmp_epi16_mask): Cast last argument to __mmask8 instead
>   of __mmask16.
>   (_mm256_mask_cvtepi8_epi16): Use __mmask16 instead of __mmask32 for
>   __U argument.
>   (_mm_mask_cvtepi8_epi16): Use __mmask8 instead of __mmask32 for
>   __U argument.
>   (_mm256_mask_cvtepu8_epi16): Use __mmask16 instead of __mmask32 for
>   __U argument.
>   (_mm_mask_cvtepu8_epi16): Use __mmask8 instead of __mmask32 for
>   __U argument.
>   (_mm256_mask_cmpneq_epu8_mask, _mm256_mask_cmplt_epu8_mask,
>   _mm256_mask_cmpge_epu8_mask, _mm256_mask_cmple_epu8_mask): Change
>   return type as well as __M argument type and all 

[Bug c/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

--- Comment #9 from Martin Sebor  ---
Okay, let me look into making the change.

[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

--- Comment #3 from Jason Vas Dias  ---
Of course, these lines of t2.h from Comment #1 :

template < class _C_, _C_ *_C_OBJ_, void (_C_::*_M_)() >
  class NT
  { static constexpr _C_ *c_ = _C_OBJ_;
 public:
NT()
{ (c_->*_M_)();


could be replaced by:

template < class _C_, _C_ *_C_OBJ_, void (_C_::*_M_)() >
  class NT
  { static constexpr _C_ *c_ = _C_OBJ_;
 public:
NT()
{ (c_->*_M_)();


and the same problem would occur (-Wsubobject-linkage warning) .

(the existence of the 'c_' static const copy of '_C_' is irrelevant.

It is simply that any occurrence of, in file A:
namespace N {
  template   
class T { ... /* something that uses C */}
}
and in file B :
namespace N {
namespace X { 
  class Y  { ... };
  Y y;
}
class C : T< Y, , retval (::a_Y_method) (,..) >
{...};
} // end namespace N

will trigger this bug.

I am trying to figure out why, and how it can be avoided.

I do not want to trigger C++'s special 'anonymous namespace object'
processing in any way here, and I do not see why it is being triggered.

[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #6 from Alexander Monakov  ---
(In reply to H.J. Lu from comment #5)
> When ld sees a common symbol, it will use a non-common definiton
> in a library, .a or .so, to override it.

This is surprising, is it documented somewhere? I don't think the ELF spec
suggests something like that needs to happen.

> Do you have a testcase?

No, it would take some time to prepare.

Re: [PATCH][GCC][front-end][opt-framework] Update options framework for parameters to properly handle and validate configure time params. [Patch (2/3)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:24 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch builds on a previous patch to pass param options down from 
> configure
> by adding more expansive validation and correctness checks.
> 
> These are set very early on and allow the target to validate or reject the
> values as they see fit.
> 
> To do this compiler_param has been extended to hold a value set at configure
> time, this value is used to be able to distinguish between
> 
> 1) default value
> 2) configure value
> 3) back-end default
> 4) user specific value.
> 
> The priority of the values should be 4 > 2 > 3 > 1.  The compiler will now 
> also
> validate the values in params.def after setting them.  This means invalid 
> values
> will no longer be accepted.
> 
> This also changes it so that default parameters are validated during
> initialization. This change is needed to ensure parameters set via configure
> or by the target specific common initialization routines still keep the
> parameters within the valid range.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
> issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  
> 
>   * params.h (struct param_info): Add configure_value.
>   * params.c (DEFPARAMCONF): New.
>   (DEFPARAM, DEFPARAMENUM5): Set configure_value.
>   (validate_param): New.
>   (add_params): Use it.
>   (set_param_value): Refactor param validation into validate_param.
>   (maybe_set_param_value): Don't override value from configure.
>   (diagnostic.h): Include.
>   * params-enum.h (DEFPARAMCONF): New.
>   * params-list.h: Likewise.
>   * params-options.h: Likewise.
>   * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE): Use it.
>   * diagnostic.h (diagnostic_ready_p): New.
Generally OK, though probably should depend on what we decide WRT
configurability.  ie, I'm not convinced we need to be able to set the
default via a configure time option.  And if we don't support that this
patch gets somewhat simpler.

jeff
> 



Re: [PATCH][GCC][front-end][opt-framework] Allow back-ends to be able to do custom validations on params. [Patch (1/3)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:24 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch adds the ability for backends to add custom constrains to the param
> values by defining a new hook option_validate_param.
> 
> This hook is invoked on every set_param_value which allows the back-end to
> ensure that the parameters are always within it's desired state.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
> issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  
> 
>   * params.c (set_param_value):
>   Add index of parameter being validated.
>   * common/common-target.def (option_validate_param): New.
>   * common/common-targhooks.h (default_option_validate_param): New.
>   * common/common-targhooks.c (default_option_validate_param): New.
>   * doc/tm.texi.in (TARGET_OPTION_VALIDATE_PARAM): New.
>   * doc/tm.texi: Regenerate.
> 
OK
jeff


Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:22 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch defines a configure option to allow the setting of the default
> guard size via configure flags when building the target.
> 
> The new flag is:
> 
>  * --with-stack-clash-protection-guard-size=
> 
> The value of configured based params are set very early on and allow the
> target to validate or reject the values as it sees fit.
> 
> To do this the values for the parameter get set by configure through CPP 
> defines.
> In case the back-end wants to know if a value was set or not the original 
> default
> value is also passed down as a define.
> 
> This allows a target to check if a param was changed by the user at configure 
> time.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
> issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  
> 
>   PR target/86486
>   * configure.ac: Add stack-clash-protection-guard-size.
>   * config.in (DEFAULT_STK_CLASH_GUARD_SIZE, STK_CLASH_GUARD_SIZE_DEFAULT,
>   STK_CLASH_GUARD_SIZE_MAX, STK_CLASH_GUARD_SIZE_MIN): New.
>   * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE): Use it.
>   * configure: Regenerate.
>   * Makefile.in (params.list, params.options): Add include dir for CPP.
>   * params-list.h: Include auto-host.h
>   * params-options.h: Likewise.
> 
Something seems wrong here.

What's the purpose of including auto-host in params-list and
params-options?  It seems like you're putting a property of the target
(guard size) into the wrong place (auto-host.h).

It's also a bit unclear to me why this is necessary at all.  Are we
planning to support both the 4k and 64k guards?  My goal (once the guard
was configurable) was never for supporting multiple sizes on a target
but instead to allow experimentation to find the right default.

Jeff


[Bug target/86386] [8/9 Regression] unaligned load from stack with -Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx512bw

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86386

--- Comment #4 from Jakub Jelinek  ---
The A128 MEM is introduced during RA, *.ira still has:
(insn 26 24 28 2 (set (mem/c:QI (reg/f:DI 16 argp) [0 MEM[(char *
{ref-all})]+0 S1 A32])
(vec_select:QI (subreg:V16QI (reg:TI 88 [ _2 ]) 0)
(parallel [
(const_int 2 [0x2])
]))) "pr86386.c":9 3728 {*vec_extractv16qi}
 (nil))
and *.reload replaces it with:
(insn 151 23 24 2 (set (mem/c:TI (plus:DI (reg/f:DI 7 sp)
(const_int -16 [0xfff0])) [3 %sfp+-16 S16 A128])
(reg:TI 0 ax [orig:88 _2 ] [88])) "pr86386.c":8 84 {*movti_internal}
 (nil))
...
(insn 152 24 26 2 (set (reg:V16QI 21 xmm0 [168])
(mem/c:V16QI (plus:DI (reg/f:DI 7 sp)
(const_int -16 [0xfff0])) [3 %sfp+-16 S16 A128]))
"pr86386.c":9 1283 {movv16qi_internal}
 (nil))
(insn 26 152 28 2 (set (mem/c:QI (plus:DI (reg/f:DI 6 bp)
(const_int 16 [0x10])) [0 MEM[(char * {ref-all})]+0 S1 A32])
(vec_select:QI (reg:V16QI 21 xmm0 [168])
(parallel [
(const_int 2 [0x2])
]))) "pr86386.c":9 3728 {*vec_extractv16qi}
 (nil))

Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-07-11 Thread Cesar Philippidis
On 07/02/2018 07:14 AM, Tom de Vries wrote:
> On 06/21/2018 03:58 PM, Cesar Philippidis wrote:
>> On 06/20/2018 03:15 PM, Tom de Vries wrote:
>>> On 06/20/2018 11:59 PM, Cesar Philippidis wrote:
 Now it follows the formula contained in
 the "CUDA Occupancy Calculator" spreadsheet that's distributed with CUDA.
>>>
>>> Any reason we're not using the cuda runtime functions to get the
>>> occupancy (see PR85590 - [nvptx, libgomp, openacc] Use cuda runtime fns
>>> to determine launch configuration in nvptx ) ?
>>
>> There are two reasons:
>>
>>   1) cuda_occupancy.h depends on the CUDA runtime to extract the device
>>  properties instead of the CUDA driver API. However, we can always
>>  teach libgomp how to populate the cudaDeviceProp struct using the
>>  driver API.
>>
>>   2) CUDA is not always present on the build host, and that's why
>>  libgomp maintains its own cuda.h. So at the very least, this
>>  functionality would be good to have in libgomp as a fallback
>>  implementation;
> 
> Libgomp maintains its own cuda.h to "allow building GCC with PTX
> offloading even without CUDA being installed" (
> https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00980.html ).
> 
> The libgomp nvptx plugin however uses the cuda driver API to launch
> kernels etc, so we can assume that's always available at launch time.
> And according to the "CUDA Pro Tip: Occupancy API Simplifies Launch
> Configuration", the occupancy API is also available in the driver API.
> 
> What we cannot assume to be available is the occupancy API pre cuda-6.5.
> So it's fine to have a fallback for that (properly isolated in utility
> functions), but for cuda 6.5 and up we want to use the occupancy API.

Here's revision 2 to the patch. I replaced all of my thread occupancy
heuristics with calls to the CUDA driver as you suggested. The
performance is worse than my heuristics, but that's to be expected
because the CUDA driver only guarantees the minimal launch geometry to
to fully utilize the hardware, and not the optimal value. I'll
reintroduce my heuristics later as a follow up patch. The major
advantage of the CUDA thread occupancy calculator is that it allows the
runtime to select sensible default num_workers to avoid those annoying
runtime failures due to insufficient GPU hardware resources.

One thing that may stick out in this patch is how it probes for the
driver version instead of the API version. It turns out that the API
version corresponds to the SM version declared in the PTX sources,
whereas the driver version corresponds to the latest version of CUDA
supported by the driver. At least that's the case with driver version
396.24.

>>  its not good to have program fail due to
>>  insufficient hardware resources errors when it is avoidable.
>>
> 
> Right, in fact there are two separate things you're trying to address
> here: launch failure and occupancy heuristic, so split the patch.

That hunk was small, so I included it with this patch. Although if you
insist, I can remove it.

Is this patch OK for trunk? I tested it x86_64 with nvptx offloading.

Cesar
2018-07-XX  Cesar Philippidis  
	Tom de Vries  

	gcc/
	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Rename to ...
	(PTX_DEFAULT_RUNTIME_DIM): ... this.
	(nvptx_goacc_validate_dims): Set default worker and gang dims to
	PTX_DEFAULT_RUNTIME_DIM.
	(nvptx_dim_limit): Ignore GOMP_DIM_WORKER;

	libgomp/
	* plugin/cuda/cuda.h (CUoccupancyB2DSize): Declare.
	(cuOccupancyMaxPotentialBlockSizeWithFlags): Likewise.
	* plugin/plugin-nvptx.c (struct ptx_device): Add driver_version member.
	(nvptx_open_device): Set it.
	(nvptx_exec): Use the CUDA driver to both determine default num_gangs
	and num_workers, and error if the hardware doesn't have sufficient
	resources to launch a kernel.


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 5608bee8a8d..c1946e75f42 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -5165,7 +5165,7 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 /* Define dimension sizes for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
+#define PTX_DEFAULT_RUNTIME_DIM 0 /* Defer to runtime.  */
 
 /* Implement TARGET_SIMT_VF target hook: number of threads in a warp.  */
 
@@ -5214,9 +5214,9 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
 {
   dims[GOMP_DIM_VECTOR] = PTX_VECTOR_LENGTH;
   if (dims[GOMP_DIM_WORKER] < 0)
-	dims[GOMP_DIM_WORKER] = PTX_WORKER_LENGTH;
+	dims[GOMP_DIM_WORKER] = PTX_DEFAULT_RUNTIME_DIM;
   if (dims[GOMP_DIM_GANG] < 0)
-	dims[GOMP_DIM_GANG] = PTX_GANG_DEFAULT;
+	dims[GOMP_DIM_GANG] = PTX_DEFAULT_RUNTIME_DIM;
   changed = true;
 }
 
@@ -5230,9 +5230,6 @@ nvptx_dim_limit (int axis)
 {
   switch (axis)
 {
-case GOMP_DIM_WORKER:
-  return PTX_WORKER_LENGTH;
-
 case GOMP_DIM_VECTOR:
   return PTX_VECTOR_LENGTH;

[Bug target/86386] [8/9 Regression] unaligned load from stack with -Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx512bw

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86386

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Started with r250084.

/* PR target/86386 */
/* { dg-do run { target { avx_runtime && int128 } } } */
/* { dg-options "-Os -fno-tree-dce -mstringop-strategy=vector_loop -mavx" } */

unsigned c, d, e, f;

unsigned __attribute__((noipa))
foo (unsigned char g, unsigned short h, unsigned i, unsigned long long j,
 unsigned char k, unsigned short l, unsigned m, unsigned __int128 n)
{
  __builtin_memset (, 0, 3);
  n <<= m;
  __builtin_memcpy (, 2 + (char *) , 1);
  m >>= 0;
  d ^= __builtin_mul_overflow (l, n, );
  return m;
}

int
main ()
{
  unsigned __int128 x = foo (0, 0, 0, 0, 0, 4, 1, 3);
  if (x != 24)
__builtin_abort ();
  return 0;
}

Re: [PATCH][GCC][mid-end] Add a hook to support telling the mid-end when to probe the stack [patch (2/6)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:21 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch adds a hook to tell the mid-end about the probing requirements of 
> the
> target.  On AArch64 we allow a specific range for which no probing needs to
> be done.  This same range is also the amount that will have to be probed up 
> when
> a probe is needed after dropping the stack.
> 
> Defining this probe comes with the extra requirement that the outgoing 
> arguments
> size of any function that uses alloca and stack clash be at the very least 8
> bytes.  With this invariant we can skip doing the zero checks for alloca and
> save some code.
> 
> A simplified version of the AArch64 stack frame is:
> 
>+---+  
>|   | 
>|   |  
>|   |  
>+---+  
>|LR |  
>+---+  
>|FP |  
>+---+  
>|dynamic allocations| -\  probe range hook effects these   
>+---+   --\   and ensures that outgoing stack  
>|padding|  -- args is always > 8 when alloca.  
>+---+  ---/   Which means it's always safe to probe
>|outgoing stack args|-/   at SP
>+---+  
>   
>  
> 
> This allows us to generate better code than without the hook without affecting
> other targets.
> 
> With this patch I am also removing the 
> stack_clash_protection_final_dynamic_probe
> hook which was added specifically for AArch64 but that is no longer needed.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
> issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  
> 
>   PR target/86486
>   * explow.c (anti_adjust_stack_and_probe_stack_clash): Support custom
>   probe ranges.
>   * target.def (stack_clash_protection_alloca_probe_range): New.
>   (stack_clash_protection_final_dynamic_probe): Remove.
>   * targhooks.h (default_stack_clash_protection_alloca_probe_range) New.
>   (default_stack_clash_protection_final_dynamic_probe): Remove.
>   * targhooks.c: Likewise.
>   * doc/tm.texi.in (TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): 
> New.
>   (TARGET_STACK_CLASH_PROTECTION_FINAL_DYNAMIC_PROBE): Remove.
>   * doc/tm.texi: Regenerate.
>
The control flow is a bit convoluted here, but after a few false starts
where I thought this was wrong, I think it's OK.

Jeff












[Bug c/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

--- Comment #8 from rguenther at suse dot de  ---
On July 11, 2018 8:30:43 PM GMT+02:00, "msebor at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453
>
>--- Comment #7 from Martin Sebor  ---
>Right.  The exclusion logic doesn't depend on no_add_attr; it uses it
>for the
>same purpose as attribute handlers do: to prevent the rest of the
>framework
>from applying them.  Maybe the exclusion should be done first, before
>calling
>the handler, and the call to the handler avoided if the exclusion finds
>a
>conflict.

Yes, that would be a possibility. Again, since some attributes are only applied
as tree flags we may miss to diagnose conflicts via this mechanism.

[Bug c++/86485] [7/8/9 Regression] "anonymous" maybe-uninitialized false positive with ternary operator

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86485

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
For the is_really_empty_class assignments, usually the logic in
cp_gimplify_expr to handle simple_empty_class_p cases (or it is already
optimized away earlier).  But for the COND_EXPR, the gimplifier creates iftmp.N
var and adds 2 MODIFY_EXPRs to that and somehow in this case it doesn't trigger
because the second argument to simple_empty_class_p is a TARGET_EXPR, not a
gimple lvalue etc.

Wonder if cp_gimplify_expr couldn't specially gimplify some COND_EXPRs with
is_really_empty_class type by effectively turning it into a VOID_TYPE COND_EXPR
and then just return as result of the whole an uninitialized temporary.

[Bug lto/86496] New: [9 regression] plugin required to handle lto object

2018-07-11 Thread seurer at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86496

Bug ID: 86496
   Summary: [9 regression] plugin required to handle lto object
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Somewhere in the range r260955-r260970 a change was made that is causing a
bunch of the LTO tests to fail.  There were some build errors in that range
which makes it hard to bisect to the exact revision.

FAIL: g++.dg/lto/20091002-1 cp_lto_20091002-1_0.o-cp_lto_20091002-1_0.o link,
-fPIC -flto -Wno-return-type
FAIL: g++.dg/lto/pr64043 cp_lto_pr64043_0.o-cp_lto_pr64043_0.o link,  -flto
-std=c++11 
FAIL: g++.dg/lto/pr65193 cp_lto_pr65193_0.o-cp_lto_pr65193_0.o link, -fPIC -r
-nostdlib -flto -O2 -g -Wno-return-type
FAIL: g++.dg/lto/pr65302 cp_lto_pr65302_0.o-cp_lto_pr65302_1.o link,  -flto -O2
-Wno-return-type 
FAIL: g++.dg/lto/pr65316 cp_lto_pr65316_0.o-cp_lto_pr65316_1.o link,  -flto
-std=c++11 -g2 -fno-lto-odr-type-merging -O2 -Wno-return-type 
FAIL: g++.dg/lto/pr65549 cp_lto_pr65549_0.o-cp_lto_pr65549_0.o link, 
-std=gnu++14 -flto -g -O2 -fno-inline -flto-partition=max -Wno-return-type 
FAIL: g++.dg/lto/pr65549 cp_lto_pr65549_0.o-cp_lto_pr65549_0.o link, 
-std=gnu++14 -flto -g -Wno-return-type 
FAIL: g++.dg/lto/pr66180 cp_lto_pr66180_0.o-cp_lto_pr66180_1.o link,  -flto
-std=c++14 -r -nostdlib 
FAIL: g++.dg/lto/pr66705 cp_lto_pr66705_0.o-cp_lto_pr66705_0.o link,  -O2 -flto
-flto-partition=max -fipa-pta 
FAIL: g++.dg/lto/pr68057 cp_lto_pr68057_0.o-cp_lto_pr68057_1.o link, -O0 -flto
-flto-partition=none -fuse-linker-plugin
FAIL: g++.dg/lto/pr68057 cp_lto_pr68057_0.o-cp_lto_pr68057_1.o link, -O0 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: g++.dg/lto/pr68057 cp_lto_pr68057_0.o-cp_lto_pr68057_1.o link, -O2 -flto
-flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects 
FAIL: g++.dg/lto/pr68057 cp_lto_pr68057_0.o-cp_lto_pr68057_1.o link, -O2 -flto
-fuse-linker-plugin
FAIL: g++.dg/lto/pr69077 cp_lto_pr69077_0.o-cp_lto_pr69077_1.o link,  -O3 -g
-flto 
FAIL: g++.dg/lto/pr69133 cp_lto_pr69133_0.o-cp_lto_pr69133_1.o link,  -flto -O2 
FAIL: g++.dg/lto/pr69137 cp_lto_pr69137_0.o-cp_lto_pr69137_0.o link, 
-std=c++11 -g -flto 
FAIL: g++.dg/lto/pr79000 cp_lto_pr79000_0.o-cp_lto_pr79000_1.o link, -flto -g
FAIL: g++.dg/lto/pr81940 cp_lto_pr81940_0.o-cp_lto_pr81940_0.o link,  -O -flto 
FAIL: g++.dg/lto/pr85176 cp_lto_pr85176_0.o-cp_lto_pr85176_0.o link,  -flto -g1 
FAIL: gfortran.dg/lto/pr79108 f_lto_pr79108_0.o-f_lto_pr79108_0.o link,  -Ofast
-flto --param ggc-min-expand=0 --param ggc-min-heapsize=0 


spawn -ignore SIGHUP
/home/seurer/gcc/build/gcc-test2/gcc/testsuite/g++7/../../xg++
-B/home/seurer/gcc/build/gcc-test2/gcc/testsuite/g++7/../../ cp_lto_pr64043_0.o
-fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++
-I/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu
-I/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/libstdc++-v3/include
-I/home/seurer/gcc/gcc-test2/libstdc++-v3/libsupc++
-I/home/seurer/gcc/gcc-test2/libstdc++-v3/include/backward
-I/home/seurer/gcc/gcc-test2/libstdc++-v3/testsuite/util -fmessage-length=0
-flto -std=c++11 -r -nostdlib -O2
-L/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/./libstdc++-v3/src/.libs
-B/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/./libstdc++-v3/src/.libs
-L/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/./libstdc++-v3/src/.libs
-B/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/./libitm/
-L/home/seurer/gcc/build/gcc-test2/powerpc64le-unknown-linux-gnu/./libitm/.libs
-o g++-dg-lto-pr64043-01.exe
/usr/bin/ld: /tmp/ccXKXOCe.lto.o: plugin needed to handle lto object
FAIL: g++.dg/lto/pr64043 cp_lto_pr64043_0.o-cp_lto_pr64043_0.o link,  -flto
-std=c++11 

ld --version
GNU ld (GNU Binutils for Ubuntu) 2.26.1

Note that with binutils 2.27 it works fine.

IIRC there is some issue with binutils for a few revisions that can trigger
this but is there something that can be done to prevent all these tests from
failing?  Should gcc be dependent on certain builds of binutils?

[Bug c/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

--- Comment #7 from Martin Sebor  ---
Right.  The exclusion logic doesn't depend on no_add_attr; it uses it for the
same purpose as attribute handlers do: to prevent the rest of the framework
from applying them.  Maybe the exclusion should be done first, before calling
the handler, and the call to the handler avoided if the exclusion finds a
conflict.

[Bug c++/86485] [7/8/9 Regression] "anonymous" maybe-uninitialized false positive with ternary operator

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86485

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Started with r246314.

Smaller testcase:
struct E {};
struct S { S () {} E e; };
void foo (S);

void
bar (bool b)
{
  foo (b ? S {} : S {});
}

Re: [PATCH 0/5] [RFC v2] Higher-level reporting of vectorization problems

2018-07-11 Thread David Malcolm
On Wed, 2018-07-11 at 16:56 +0100, Richard Sandiford wrote:
> David Malcolm  writes:
> > On Mon, 2018-06-25 at 11:10 +0200, Richard Biener wrote:
> > > On Fri, 22 Jun 2018, David Malcolm wrote:
> > > 
> > > > NightStrike and I were chatting on IRC last week about
> > > > issues with trying to vectorize the following code:
> > > > 
> > > > #include 
> > > > std::size_t f(std::vector> const & v) {
> > > > std::size_t ret = 0;
> > > > for (auto const & w: v)
> > > > ret += w.size();
> > > > return ret;
> > > > }
> > > > 
> > > > icc could vectorize it, but gcc couldn't, but neither of us
> > > > could
> > > > immediately figure out what the problem was.
> > > > 
> > > > Using -fopt-info leads to a wall of text.
> > > > 
> > > > I tried using my patch here:
> > > > 
> > > >  "[PATCH] v3 of optinfo, remarks and optimization records"
> > > >   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01267.html
> > > > 
> > > > It improved things somewhat, by showing:
> > > > (a) the nesting structure via indentation, and
> > > > (b) the GCC line at which each message is emitted (by using the
> > > > "remark" output)
> > > > 
> > > > but it's still a wall of text:
> > > > 
> > > >   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.rema
> > > > rks.
> > > > html
> > > >   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.d/..
> > > > %7C.
> > > > .%7Csrc%7Ctest.cc.html#line-4
> > > > 
> > > > It doesn't yet provide a simple high-level message to a
> > > > tech-savvy user on what they need to do to get GCC to
> > > > vectorize their loop.
> > > 
> > > Yeah, in particular the vectorizer is way too noisy in its low-
> > > level
> > > functions.  IIRC -fopt-info-vec-missed is "somewhat" better:
> > > 
> > > t.C:4:26: note: step unknown.
> > > t.C:4:26: note: vector alignment may not be reachable
> > > t.C:4:26: note: not ssa-name.
> > > t.C:4:26: note: use not simple.
> > > t.C:4:26: note: not ssa-name.
> > > t.C:4:26: note: use not simple.
> > > t.C:4:26: note: no array mode for V2DI[3]
> > > t.C:4:26: note: Data access with gaps requires scalar epilogue
> > > loop
> > > t.C:4:26: note: can't use a fully-masked loop because the target
> > > doesn't 
> > > have the appropriate masked load or store.
> > > t.C:4:26: note: not ssa-name.
> > > t.C:4:26: note: use not simple.
> > > t.C:4:26: note: not ssa-name.
> > > t.C:4:26: note: use not simple.
> > > t.C:4:26: note: no array mode for V2DI[3]
> > > t.C:4:26: note: Data access with gaps requires scalar epilogue
> > > loop
> > > t.C:4:26: note: op not supported by target.
> > > t.C:4:26: note: not vectorized: relevant stmt not supported: _15
> > > =
> > > _14 
> > > /[ex] 4;
> > > t.C:4:26: note: bad operation or unsupported loop bound.
> > > t.C:4:26: note: not vectorized: no grouped stores in basic block.
> > > t.C:4:26: note: not vectorized: no grouped stores in basic block.
> > > t.C:6:12: note: not vectorized: not enough data-refs in basic
> > > block.
> > > 
> > > 
> > > > The pertinent dump messages are:
> > > > 
> > > > test.cc:4:23: remark: === try_vectorize_loop_1 ===
> > > > [../../src/gcc/tree-vectorizer.c:674:try_vectorize_loop_1]
> > > > cc1plus: remark:
> > > > Analyzing loop at test.cc:4
> > > > [../../src/gcc/dumpfile.c:735:ensure_pending_optinfo]
> > > > test.cc:4:23: remark:  === analyze_loop_nest ===
> > > > [../../src/gcc/tree-vect-loop.c:2299:vect_analyze_loop]
> > > > [...snip...]
> > > > test.cc:4:23: remark:   === vect_analyze_loop_operations ===
> > > > [../../src/gcc/tree-vect-
> > > > loop.c:1520:vect_analyze_loop_operations]
> > > > [...snip...]
> > > > test.cc:4:23: remark:==> examining statement: ‘_15 = _14
> > > > /[ex]
> > > > 4;’ [../../src/gcc/tree-vect-stmts.c:9382:vect_analyze_stmt]
> > > > test.cc:4:23: remark:vect_is_simple_use: operand ‘_14’
> > > > [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> > > > test.cc:4:23: remark:def_stmt: ‘_14 = _8 - _7;’
> > > > [../../src/gcc/tree-vect-stmts.c:10098:vect_is_simple_use]
> > > > test.cc:4:23: remark:type of def: internal
> > > > [../../src/gcc/tree-
> > > > vect-stmts.c:10112:vect_is_simple_use]
> > > > test.cc:4:23: remark:vect_is_simple_use: operand ‘4’
> > > > [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> > > > test.cc:4:23: remark:op not supported by target.
> > > > [../../src/gcc/tree-vect-stmts.c:5932:vectorizable_operation]
> > > > test.cc:4:23: remark:not vectorized: relevant stmt not
> > > > supported: ‘_15 = _14 /[ex] 4;’ [../../src/gcc/tree-vect-
> > > > stmts.c:9565:vect_analyze_stmt]
> > > > test.cc:4:23: remark:   bad operation or unsupported loop
> > > > bound.
> > > > [../../src/gcc/tree-vect-loop.c:2043:vect_analyze_loop_2]
> > > > cc1plus: remark: vectorized 0 loops in function.
> > > > [../../src/gcc/tree-vectorizer.c:904:vectorize_loops]
> > > > 
> > > > In particular, that complaint from
> > > >   [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
> > > > is 

[Bug c/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

--- Comment #6 from rguenther at suse dot de  ---
On July 11, 2018 8:12:17 PM GMT+02:00, "msebor at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453
>
>--- Comment #5 from Martin Sebor  ---
>The attribute exclusion framework only excludes conflicting attributes
>if they
>aren't applied by their handler.  It doesn't know how to undo changes
>that the
>handler makes, like modifying tree nodes in place.  It would need to be
>extended to also let each handler detect the conflicts before making
>these
>kinds of changes.

Note that no_add_attr doesn't mean the attribute isn't applied. Instead some
attributes are optimized and translated to tree flags. There isn't really a way
the handler can signal the attribute wasn't applied.

[Bug c/86453] [8/9 Regression] error: type variant differs by TYPE_PACKED in free_lang_data since r255469

2018-07-11 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86453

--- Comment #5 from Martin Sebor  ---
The attribute exclusion framework only excludes conflicting attributes if they
aren't applied by their handler.  It doesn't know how to undo changes that the
handler makes, like modifying tree nodes in place.  It would need to be
extended to also let each handler detect the conflicts before making these
kinds of changes.

[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #5 from H.J. Lu  ---
(In reply to Alexander Monakov from comment #4)
> (In reply to H.J. Lu from comment #3)
> > It is because gold doesn't check archive for a common definition.
> 
> Please elaborate - does ld.bfd try to extract static archive members when it
> already has a common definition? Why?

When ld sees a common symbol, it will use a non-common definiton
in a library, .a or .so, to override it.

> > Is there a common symbol involved?
> 
> I don't think so, but I'm not sure. We've also seen other pain points like
> the same member extracted and given to the plugin multiple times, even
> though the second extraction cannot possibly satisfy any unresolved
> references.

Do you have a testcase?

[Bug tree-optimization/86492] [8/9 Regression] store-merging wrong-code

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86492

--- Comment #3 from Jakub Jelinek  ---
Created attachment 44385
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44385=edit
gcc9-pr86492.patch

Untested fix.

[Bug c++/86495] New: false no return statement warning in "if constexpr" branch

2018-07-11 Thread tower120 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86495

Bug ID: 86495
   Summary: false no return statement warning in "if constexpr"
branch
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tower120 at gmail dot com
  Target Milestone: ---

The following code produce false "no return statement" warning in gcc 8.1.
Everything fine on 7.x

Live: https://godbolt.org/g/dCuFci

#include 
#include 

template
class variant_w_base{
Base* m_base;
Variant m_variant;

void update_base(){
m_base = std::visit([](auto&& arg) -> Base* {
using Arg = std::decay_t;
// ERRONEOUS WARNING HERE.
if constexpr (std::is_same_v){
return nullptr;
} else {
return static_cast();
}
}, m_variant);
}

public:
variant_w_base(){
update_base();
}
};


int main() {
struct Base{};
struct Data : Base{};
variant_w_base> v;

return 0;
}

Re: abstract wide int binop code from VRP

2018-07-11 Thread Richard Sandiford
Aldy Hernandez  writes:
> On 07/11/2018 08:52 AM, Richard Biener wrote:
>> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez  wrote:
>>>
>>> Hmmm, I think we can do better, and since this hasn't been reviewed yet,
>>> I don't think anyone will mind the adjustment to the patch ;-).
>>>
>>> I really hate int_const_binop_SOME_RANDOM_NUMBER.  We should abstract
>>> them into properly named poly_int_binop, wide_int_binop, and tree_binop,
>>> and then use a default argument for int_const_binop() to get things going.
>>>
>>> Sorry for more changes in flight, but I thought we could benefit from
>>> more cleanups :).
>>>
>>> OK for trunk pending tests?
>> 
>> Much of GCC pre-dates function overloading / default args ;)
>
> Heh...and ANSI C.
>
>> 
>> Looks OK but can you please rename your tree_binop to int_cst_binop?
>> Or maybe inline it into int_const_binop, also sharing the force_fit_type ()
>> tail with poly_int_binop?
>
> I tried both, but inlining looked cleaner :).  Done.
>
>> 
>> What about mixed INTEGER_CST / poly_int constants?  Shouldn't it
>> be
>> 
>>if (neither-poly-nor-integer-cst (arg1 || arg2))
>>  return NULL_TREE;
>>if (poly_int_tree (arg1) || poly_int_tree (arg2))
>>  poly-int-stuff
>>else if (INTEGER_CST && INTEGER_CST)
>>  wide-int-stuff
>> 
>> ?  I see that is a pre-existing issue but if you are at refactoring...
>> wi::to_poly_wide should handle INTEGER_CST operands just fine
>> I hope.
>
> This aborted:
> gcc_assert (NUM_POLY_INT_COEFFS != 1);
>
> but even taking it out made the bootstrap die somewhere else.
>
> If it's ok, I'd rather not tackle this now, as I have some more cleanups 
> that are pending on this.  If you feel strongly, I could do it at a 
> later time.
>
> OK pending tests?

LGTM FWIW, just some nits:

> -/* Subroutine of int_const_binop_1 that handles two INTEGER_CSTs.  */
> +/* Combine two wide ints ARG1 and ARG2 under operation CODE to produce
> +   a new constant in RES.  Return FALSE if we don't know how to
> +   evaluate CODE at compile-time.  */
> 
> -static tree
> -int_const_binop_2 (enum tree_code code, const_tree parg1, const_tree parg2,
> -int overflowable)
> +bool
> +wide_int_binop (enum tree_code code,
> + wide_int , const wide_int , const wide_int ,
> + signop sign, wi::overflow_type )
>  {

IMO we should avoid pass-back by reference like the plague. :-)
It's especially confusing when the code does things like:

case FLOOR_DIV_EXPR:
  if (arg2 == 0)
return false;
  res = wi::div_floor (arg1, arg2, sign, );
  break;

It looked at first like it was taking the address of a local variable
and failing to propagate the information back up.

I think we should stick to using pointers for this kind of thing.

> -/* Combine two integer constants PARG1 and PARG2 under operation CODE
> -   to produce a new constant.  Return NULL_TREE if we don't know how
> +/* Combine two poly int's ARG1 and ARG2 under operation CODE to
> +   produce a new constant in RES.  Return FALSE if we don't know how
> to evaluate CODE at compile-time.  */
> 
> -static tree
> -int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree arg2,
> -int overflowable)
> +static bool
> +poly_int_binop (poly_wide_int , enum tree_code code,
> + const_tree arg1, const_tree arg2,
> + signop sign, wi::overflow_type )
>  {

Would be good to be consistent about the order of the result and code
arguments.  Here it's "result, code" (which seems better IMO),
but in wide_int_binop it's "code, result".

> +/* Combine two integer constants PARG1 and PARG2 under operation CODE
> +   to produce a new constant.  Return NULL_TREE if we don't know how
> +   to evaluate CODE at compile-time.  */
> +
>  tree
> -int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2)
> +int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2,
> +  int overflowable)

s/PARG/ARG/g in comment.

>  {
> -  return int_const_binop_1 (code, arg1, arg2, 1);
> +  bool success = false;
> +  poly_wide_int poly_res;
> +  tree type = TREE_TYPE (arg1);
> +  signop sign = TYPE_SIGN (type);
> +  wi::overflow_type overflow = wi::OVF_NONE;
> +
> +  if (TREE_CODE (arg1) == INTEGER_CST && TREE_CODE (arg2) == INTEGER_CST)
> +{
> +  wide_int warg1 = wi::to_wide (arg1), res;
> +  wide_int warg2 = wi::to_wide (arg2, TYPE_PRECISION (type));
> +  success = wide_int_binop (code, res, warg1, warg2, sign, overflow);
> +  poly_res = res;
> +}
> +  else if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
> +success = poly_int_binop (poly_res, code, arg1, arg2, sign, overflow);
> +  if (success)
> +return force_fit_type (type, poly_res, overflowable,
> +(((sign == SIGNED || overflowable == -1)
> +  && overflow)
> + | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
> +  return NULL_TREE;
>  }
>  

Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:20 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch implements the use of the stack clash mitigation for aarch64.
> In Aarch64 we expect both the probing interval and the guard size to be 64KB
> and we enforce them to always be equal.
> 
> We also probe up by 1024 bytes in the general case when a probe is required.
> 
> AArch64 has the following probing conditions:
> 
>  1) Any allocation less than 63KB requires no probing.  An ABI defined safe
> buffer of 1Kbytes is used and a page size of 64k is assumed.
> 
>  2) Any allocations larger than 1 page size, is done in increments of page 
> size
> and probed up by 1KB leaving the residuals.
> 
>  3a) Any residual for local arguments that is less than 63KB requires no 
> probing.
>  Essentially this is a sliding window.  The probing range determines the 
> ABI
>  safe buffer, and the amount to be probed up.
> 
>   b) Any residual for outgoing arguments that is less than 1KB requires no 
> probing,
>  However to maintain our invariant, anything above or equal to 1KB 
> requires a probe.
> 
> Incrementally allocating less than the probing thresholds, e.g. recursive 
> functions will
> not be an issue as the storing of LR counts as a probe.
> 
> 
> +---+ 
>
> |  ABI SAFE REGION  | 
>
>   +-- 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>  maximum amount   | |   | 
>
>  not needing a| |   | 
>
>  probe| |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   |Probe offset when
>
>   |  probe is required
>
>   | |   | 
>
>   + +---+   Point of first 
> probe 
> |  ABI SAFE REGION  | 
>
> - 
>
> |   | 
>
> |   | 
>
> |   | 
> 
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Target was tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Jeff Law  
>   Richard Sandiford 
>   Tamar Christina  
> 
>   PR target/86486
>   * config/aarch64/aarch64.md (cmp,
>   probe_stack_range): Add k (SP) constraint.
>   * config/aarch64/aarch64.h (STACK_CLASH_CALLER_GUARD,
>   STACK_CLASH_MAX_UNROLL_PAGES): New.
>   * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Emit
>   stack probes for stack clash.
>   (aarch64_allocate_and_probe_stack_space): New.
>   (aarch64_expand_prologue): Use it.
>   (aarch64_expand_epilogue): Likewise and update IP regs re-use criteria.
>   (aarch64_sub_sp): Add emit_move_imm optional param.
> 
> gcc/testsuite/
> 2018-07-11  Jeff Law  
>   Richard Sandiford 
>   Tamar Christina  
> 
>   PR target/86486
>   * gcc.target/aarch64/stack-check-12.c: New.
>   * gcc.target/aarch64/stack-check-13.c: New.
>   * gcc.target/aarch64/stack-check-cfa-1.c: New.
>   * gcc.target/aarch64/stack-check-cfa-2.c: New.
>   * gcc.target/aarch64/stack-check-prologue-1.c: New.
>   * gcc.target/aarch64/stack-check-prologue-10.c: New.
>   * gcc.target/aarch64/stack-check-prologue-11.c: New.
>   * gcc.target/aarch64/stack-check-prologue-2.c: New.
>   * gcc.target/aarch64/stack-check-prologue-3.c: New.
>   * gcc.target/aarch64/stack-check-prologue-4.c: New.
>   

[Bug c++/54080] [C++11] g++ crashes when compiling the following file

2018-07-11 Thread nightstrike at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54080

--- Comment #4 from nightstrike  ---
This still crashes with gcc 8.1.1 20180531

[PATCH, rs6000] Add missing logical-op interfaces to emmintrin.h

2018-07-11 Thread Bill Schmidt
Hi,

It was recently brought to our attention that the existing emmintrin.h
header, which was believed to be feature-complete for SSE2 support, is
actually missing four logical-op interfaces:

 _mm_and_si128
 _mm_andnot_si128
 _mm_or_si128
 _mm_xor_si128

This patch provides those with the obvious implementations, along with
test cases.  I've bootstrapped it on powerpc64le-linux-gnu (P8, P9)
and powerpc64-linux-gnu (P7, P8) and tested it with no regressions.
Is this okay for trunk?

Although this isn't a regression, it is an oversight that leaves the
SSE2 support incomplete.  Thus I'd like to ask permission to also
backport this to gcc-8-branch after a short waiting period.  It's
passed regstrap on P8 and P9 LE, and P7/P8 BE testing is underway.
Is that backport okay if testing succeeds?

[BTW, I'm shepherding this patch on behalf of Steve Munroe.]

Thanks!
Bill


[gcc]

2018-07-10  Bill Schmidt  
Steve Munroe  

* config/rs6000/emmintrin.h (_mm_and_si128): New function.
(_mm_andnot_si128): Likewise.
(_mm_or_si128): Likewise.
(_mm_xor_si128): Likewise.

[gcc/testsuite]

2018-07-10  Bill Schmidt  
Steve Munroe  

* gcc.target/powerpc/sse2-pand-1.c: New file.
* gcc.target/powerpc/sse2-pandn-1.c: Likewise.
* gcc.target/powerpc/sse2-por-1.c: Likewise.
* gcc.target/powerpc/sse2-pxor-1.c: Likewise.


Index: gcc/config/rs6000/emmintrin.h
===
--- gcc/config/rs6000/emmintrin.h   (revision 262235)
+++ gcc/config/rs6000/emmintrin.h   (working copy)
@@ -1884,6 +1884,30 @@
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_and_si128 (__m128i __A, __m128i __B)
+{
+  return (__m128i)vec_and ((__v2di) __A, (__v2di) __B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_andnot_si128 (__m128i __A, __m128i __B)
+{
+  return (__m128i)vec_andc ((__v2di) __B, (__v2di) __A);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_or_si128 (__m128i __A, __m128i __B)
+{
+  return (__m128i)vec_or ((__v2di) __A, (__v2di) __B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_xor_si128 (__m128i __A, __m128i __B)
+{
+  return (__m128i)vec_xor ((__v2di) __A, (__v2di) __B);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_cmpeq_epi8 (__m128i __A, __m128i __B)
 {
   return (__m128i) vec_cmpeq ((__v16qi) __A, (__v16qi)__B);
@@ -2333,3 +2357,4 @@
 }
 
 #endif /* EMMINTRIN_H_ */
+
Index: gcc/testsuite/gcc.target/powerpc/sse2-pand-1.c
===
--- gcc/testsuite/gcc.target/powerpc/sse2-pand-1.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/sse2-pand-1.c  (working copy)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pand_1
+#endif
+
+#include 
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+  return _mm_and_si128 (s1, s2); 
+}
+
+static void
+TEST (void)
+{
+  union128i_b u, s1, s2;
+  char e[16];
+  int i;
+   
+  s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+  s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 
4, 5, 119);
+  u.x = test (s1.x, s2.x); 
+   
+  for (i = 0; i < 16; i++)
+ e[i] = s1.a[i] & s2.a[i];
+
+  if (check_union128i_b (u, e))
+abort ();
+}
Index: gcc/testsuite/gcc.target/powerpc/sse2-pandn-1.c
===
--- gcc/testsuite/gcc.target/powerpc/sse2-pandn-1.c (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/sse2-pandn-1.c (working copy)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mpower8-vector -Wno-psabi" } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target p8vector_hw } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse2-check.h"
+#endif
+
+#include CHECK_H
+
+#ifndef TEST
+#define TEST sse2_test_pandn_1
+#endif
+
+#include 
+
+static __m128i
+__attribute__((noinline, unused))
+test (__m128i s1, __m128i s2)
+{
+  return _mm_andnot_si128 (s1, s2); 
+}
+
+static void
+TEST (void)
+{
+  union128i_b u, s1, s2;
+  char e[16];
+  int i;
+   
+  s1.x = _mm_set_epi8 (1,2,3,4,10,20,30,90,-80,-40,-100,-15,98, 25, 98,7);
+  s2.x = _mm_set_epi8 (88, 44, 33, 22, 11, 98, 76, -100, -34, -78, -39, 6, 3, 
4, 5, 119);
+  u.x = test (s1.x, s2.x); 
+   
+  for (i = 0; i < 16; i++)
+ e[i] = (~s1.a[i]) & s2.a[i];
+
+  if (check_union128i_b (u, e))
+abort ();
+}
Index: 

[Bug tree-optimization/86492] [8/9 Regression] store-merging wrong-code

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86492

--- Comment #2 from Jakub Jelinek  ---
Related to PR84503.

[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #4 from Alexander Monakov  ---
(In reply to H.J. Lu from comment #3)
> It is because gold doesn't check archive for a common definition.

Please elaborate - does ld.bfd try to extract static archive members when it
already has a common definition? Why?

> Is there a common symbol involved?

I don't think so, but I'm not sure. We've also seen other pain points like the
same member extracted and given to the plugin multiple times, even though the
second extraction cannot possibly satisfy any unresolved references.

[Bug c++/86494] New: Usage in unevaluated context causes compile time errors because of implicit deletion

2018-07-11 Thread antoshkka at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86494

Bug ID: 86494
   Summary: Usage in unevaluated context causes compile time
errors because of implicit deletion
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoshkka at gmail dot com
  Target Milestone: ---

Following code:


template
struct pair {
T1 first;

pair() = default;
pair(const pair&) = default;
pair(pair&&) = default;

pair& operator=(pair&& __p);
pair& operator=(const pair& __p);
};

struct Idx : public pair {
using pair::pair;
using pair::operator=;
};

template _Tp&& declval() noexcept;

int main() {
// Comment out the next line and everything will compile
using t = decltype(declval() = declval());

Idx p{};
Idx p2(p);
}


Fails to compile on GCC with the following error:

: In function 'int main()':
:25:13: error: use of deleted function 'constexpr Idx::Idx(const Idx&)'
 Idx p2(p);
 ^
:13:8: note: 'constexpr Idx::Idx(const Idx&)' is implicitly declared as
deleted because 'Idx' declares a move constructor or move assignment operator
 struct Idx : public pair {
^~~

However clang compiles that code well.

With line 'using t =...' commented out everything compiles well on GCC.

[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

--- Comment #2 from Jason Vas Dias  ---
Created attachment 44384
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44384=edit
More readable (diff -ur) patch against 6.4.1's cp/decl2.c

Here is a more readable version of the patch 
to print out information on the exact component
of the base class structure which GCC thinks
uses the anonymous namespace. 
In this case, it is of not much use, but does
inform the user that GCC thinks the whole structure
declaration somehow uses the anonymous namespace,
which obviously it does not, so does signal the
existence of this GCC bug to the user.

Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-11 Thread Jeff Law
On 07/11/2018 05:20 AM, Tamar Christina wrote:
> Hi All,
> 
> This patch implements the use of the stack clash mitigation for aarch64.
> In Aarch64 we expect both the probing interval and the guard size to be 64KB
> and we enforce them to always be equal.
> 
> We also probe up by 1024 bytes in the general case when a probe is required.
> 
> AArch64 has the following probing conditions:
> 
>  1) Any allocation less than 63KB requires no probing.  An ABI defined safe
> buffer of 1Kbytes is used and a page size of 64k is assumed.
> 
>  2) Any allocations larger than 1 page size, is done in increments of page 
> size
> and probed up by 1KB leaving the residuals.
> 
>  3a) Any residual for local arguments that is less than 63KB requires no 
> probing.
>  Essentially this is a sliding window.  The probing range determines the 
> ABI
>  safe buffer, and the amount to be probed up.
> 
>   b) Any residual for outgoing arguments that is less than 1KB requires no 
> probing,
>  However to maintain our invariant, anything above or equal to 1KB 
> requires a probe.
> 
> Incrementally allocating less than the probing thresholds, e.g. recursive 
> functions will
> not be an issue as the storing of LR counts as a probe.
> 
> 
> +---+ 
>
> |  ABI SAFE REGION  | 
>
>   +-- 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>  maximum amount   | |   | 
>
>  not needing a| |   | 
>
>  probe| |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   | 
>
>   | |   |Probe offset when
>
>   |  probe is required
>
>   | |   | 
>
>   + +---+   Point of first 
> probe 
> |  ABI SAFE REGION  | 
>
> - 
>
> |   | 
>
> |   | 
>
> |   | 
> 
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Target was tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Jeff Law  
>   Richard Sandiford 
>   Tamar Christina  
> 
>   PR target/86486
>   * config/aarch64/aarch64.md (cmp,
>   probe_stack_range): Add k (SP) constraint.
>   * config/aarch64/aarch64.h (STACK_CLASH_CALLER_GUARD,
>   STACK_CLASH_MAX_UNROLL_PAGES): New.
>   * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Emit
>   stack probes for stack clash.
>   (aarch64_allocate_and_probe_stack_space): New.
>   (aarch64_expand_prologue): Use it.
>   (aarch64_expand_epilogue): Likewise and update IP regs re-use criteria.
>   (aarch64_sub_sp): Add emit_move_imm optional param.
[ ... ]
I'm going to let the aarch64 maintainers ack/nack the aarch64 specific
bits.  I'll review them to see if there's anything obvious (since I am
familiar with the core issues and the original implementation).

I'm happy to own review work on the target independent chunks.

jeff


Re: abstract wide int binop code from VRP

2018-07-11 Thread Richard Sandiford
Richard Biener  writes:
> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez  wrote:
>>
>> Hmmm, I think we can do better, and since this hasn't been reviewed yet,
>> I don't think anyone will mind the adjustment to the patch ;-).
>>
>> I really hate int_const_binop_SOME_RANDOM_NUMBER.  We should abstract
>> them into properly named poly_int_binop, wide_int_binop, and tree_binop,
>> and then use a default argument for int_const_binop() to get things going.
>>
>> Sorry for more changes in flight, but I thought we could benefit from
>> more cleanups :).
>>
>> OK for trunk pending tests?
>
> Much of GCC pre-dates function overloading / default args ;)
>
> Looks OK but can you please rename your tree_binop to int_cst_binop?
> Or maybe inline it into int_const_binop, also sharing the force_fit_type ()
> tail with poly_int_binop?
>
> What about mixed INTEGER_CST / poly_int constants?  Shouldn't it
> be
>
>   if (neither-poly-nor-integer-cst (arg1 || arg2))
> return NULL_TREE;
>   if (poly_int_tree (arg1) || poly_int_tree (arg2))
> poly-int-stuff
>   else if (INTEGER_CST && INTEGER_CST)
> wide-int-stuff
>
> ?  I see that is a pre-existing issue but if you are at refactoring...
> wi::to_poly_wide should handle INTEGER_CST operands just fine
> I hope.

Don't think it's a preexisting issue.  poly_int_tree_p returns true
for anything that can be represented as a poly_int, i.e. both
INTEGER_CST and POLY_INT_CST.  (It wouldn't really make sense to
ask whether something could *only* be represented as a POLY_INT_CST.)

So:

  if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
{
  poly_wide_int res;
  bool overflow;
  tree type = TREE_TYPE (arg1);
  signop sign = TYPE_SIGN (type);
  switch (code)
{
case PLUS_EXPR:
  res = wi::add (wi::to_poly_wide (arg1),
 wi::to_poly_wide (arg2), sign, );
  break;

handles POLY_INT_CST + POLY_INT_CST, POLY_INT_CST + INTEGER_CST and
INTEGER_CST + POLY_INT_CST.

Thanks,
Richard


Re: abstract wide int binop code from VRP

2018-07-11 Thread Aldy Hernandez



On 07/11/2018 08:52 AM, Richard Biener wrote:

On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez  wrote:


Hmmm, I think we can do better, and since this hasn't been reviewed yet,
I don't think anyone will mind the adjustment to the patch ;-).

I really hate int_const_binop_SOME_RANDOM_NUMBER.  We should abstract
them into properly named poly_int_binop, wide_int_binop, and tree_binop,
and then use a default argument for int_const_binop() to get things going.

Sorry for more changes in flight, but I thought we could benefit from
more cleanups :).

OK for trunk pending tests?


Much of GCC pre-dates function overloading / default args ;)


Heh...and ANSI C.



Looks OK but can you please rename your tree_binop to int_cst_binop?
Or maybe inline it into int_const_binop, also sharing the force_fit_type ()
tail with poly_int_binop?


I tried both, but inlining looked cleaner :).  Done.



What about mixed INTEGER_CST / poly_int constants?  Shouldn't it
be

   if (neither-poly-nor-integer-cst (arg1 || arg2))
 return NULL_TREE;
   if (poly_int_tree (arg1) || poly_int_tree (arg2))
 poly-int-stuff
   else if (INTEGER_CST && INTEGER_CST)
 wide-int-stuff

?  I see that is a pre-existing issue but if you are at refactoring...
wi::to_poly_wide should handle INTEGER_CST operands just fine
I hope.


This aborted:
gcc_assert (NUM_POLY_INT_COEFFS != 1);

but even taking it out made the bootstrap die somewhere else.

If it's ok, I'd rather not tackle this now, as I have some more cleanups 
that are pending on this.  If you feel strongly, I could do it at a 
later time.


OK pending tests?
Aldy
gcc/

* fold-const.c (int_const_binop_1): Abstract...
(wide_int_binop): ...wide int code here.
	(poly_int_binop): ...poly int code here.
	Abstract the rest of int_const_binop_1 into int_const_binop.
* fold-const.h (wide_int_binop): New.
* tree-vrp.c (vrp_int_const_binop): Call wide_int_binop.
	Remove useless PLUS/MINUS_EXPR case.
(zero_nonzero_bits_from_vr): Move wide int code...
(zero_nonzero_bits_from_bounds): ...here.
(extract_range_from_binary_expr_1): Move mask optimization code...
(range_easy_mask_min_max): ...here.
* tree-vrp.h (zero_nonzero_bits_from_bounds): New.
(range_easy_mask_min_max): New.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 97c435fa5e0..ad8c0a69f63 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -966,21 +966,17 @@ int_binop_types_match_p (enum tree_code code, const_tree type1, const_tree type2
 	 && TYPE_MODE (type1) == TYPE_MODE (type2);
 }
 
-/* Subroutine of int_const_binop_1 that handles two INTEGER_CSTs.  */
+/* Combine two wide ints ARG1 and ARG2 under operation CODE to produce
+   a new constant in RES.  Return FALSE if we don't know how to
+   evaluate CODE at compile-time.  */
 
-static tree
-int_const_binop_2 (enum tree_code code, const_tree parg1, const_tree parg2,
-		   int overflowable)
+bool
+wide_int_binop (enum tree_code code,
+		wide_int , const wide_int , const wide_int ,
+		signop sign, wi::overflow_type )
 {
-  wide_int res;
-  tree t;
-  tree type = TREE_TYPE (parg1);
-  signop sign = TYPE_SIGN (type);
-  wi::overflow_type overflow = wi::OVF_NONE;
-
-  wi::tree_to_wide_ref arg1 = wi::to_wide (parg1);
-  wide_int arg2 = wi::to_wide (parg2, TYPE_PRECISION (type));
-
+  wide_int tmp;
+  overflow = wi::OVF_NONE;
   switch (code)
 {
 case BIT_IOR_EXPR:
@@ -999,37 +995,41 @@ int_const_binop_2 (enum tree_code code, const_tree parg1, const_tree parg2,
 case LSHIFT_EXPR:
   if (wi::neg_p (arg2))
 	{
-	  arg2 = -arg2;
+	  tmp = -arg2;
 	  if (code == RSHIFT_EXPR)
 	code = LSHIFT_EXPR;
 	  else
 	code = RSHIFT_EXPR;
 	}
+  else
+tmp = arg2;
 
   if (code == RSHIFT_EXPR)
 	/* It's unclear from the C standard whether shifts can overflow.
 	   The following code ignores overflow; perhaps a C standard
 	   interpretation ruling is needed.  */
-	res = wi::rshift (arg1, arg2, sign);
+	res = wi::rshift (arg1, tmp, sign);
   else
-	res = wi::lshift (arg1, arg2);
+	res = wi::lshift (arg1, tmp);
   break;
 
 case RROTATE_EXPR:
 case LROTATE_EXPR:
   if (wi::neg_p (arg2))
 	{
-	  arg2 = -arg2;
+	  tmp = -arg2;
 	  if (code == RROTATE_EXPR)
 	code = LROTATE_EXPR;
 	  else
 	code = RROTATE_EXPR;
 	}
+  else
+tmp = arg2;
 
   if (code == RROTATE_EXPR)
-	res = wi::rrotate (arg1, arg2);
+	res = wi::rrotate (arg1, tmp);
   else
-	res = wi::lrotate (arg1, arg2);
+	res = wi::lrotate (arg1, tmp);
   break;
 
 case PLUS_EXPR:
@@ -1051,49 +1051,49 @@ int_const_binop_2 (enum tree_code code, const_tree parg1, const_tree parg2,
 case TRUNC_DIV_EXPR:
 case EXACT_DIV_EXPR:
   if (arg2 == 0)
-	return NULL_TREE;
+	return false;
   res = wi::div_trunc (arg1, arg2, sign, );
   break;
 
 case FLOOR_DIV_EXPR:
   if (arg2 == 0)
-	return NULL_TREE;
+	return false;
   res = wi::div_floor (arg1, arg2, 

Re: Fwd: GCC 8.1 :Store Merge pass issue (-fstore-merging).

2018-07-11 Thread Umesh Kalappa
Thank you Jakub and my bad sure next time.

Umesh

On Wed, Jul 11, 2018, 10:17 PM Jakub Jelinek  wrote:

> On Wed, Jul 11, 2018 at 09:48:07PM +0530, Umesh Kalappa wrote:
> > Cc'ed Kyrill.
>
> Mailing list is not the right medium to report bugs.
> I've filed http://gcc.gnu.org/PR86492 for you, but please next time use
> bugzilla.
>
> Jakub
>


[Bug c++/86493] New: [concepts] Hard error for "call to non-'constexpr' function" in a requires expression

2018-07-11 Thread Casey at Carter dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86493

Bug ID: 86493
   Summary: [concepts] Hard error for "call to non-'constexpr'
function" in a requires expression
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Casey at Carter dot net
  Target Milestone: ---

Compiling this well-formed program:

  template 
  concept bool Concept = T::f() == 0;

  struct bad {
  static int f() { return 0; }
  };

  int main() {
  static_assert(!Concept);
  }

with -fconcepts and GCC 6.3/7.3/8.1/trunk diagnoses:

  : In function 'int main()':
  :2:30: error: call to non-'constexpr' function 'static int bad::f()'
 concept bool Concept = T::f() == 0;
^~

instead of correctly failing the concept check.

[Bug tree-optimization/86492] [8/9 Regression] store-merging wrong-code

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86492

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-07-11
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
   Target Milestone|--- |8.2
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with my r254948.

[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

--- Comment #3 from H.J. Lu  ---
(In reply to Alexander Monakov from comment #2)
> Note that Gold does not exhibit this issue. I think ld.bfd is at fault here.

It is because gold doesn't check archive for a common definition.

> We've hit similar issues with some internal plugin development. The main
> issue is, ld.bfd feeds the plugin with objects extracted from static
> archives, but those objects do not satisfy any unresolved references and
> would not be extracted in the first place in non-LTO link. So ld.bfd is
> causing useless extra work both for itself and the compiler plugin.
> 

Is there a common symbol involved?

Re: [AArch64] Generate load-pairs when the last load clobbers the address register [2/2]

2018-07-11 Thread Jackson Woodruff

Hi Sudi,

On 07/10/2018 02:29 PM, Sudakshina Das wrote:

Hi Jackson


On Tuesday 10 July 2018 09:37 AM, Jackson Woodruff wrote:

Hi all,

This patch resolves PR86014.  It does so by noticing that the last 
load may clobber the address register without issue (regardless of 
where it exists in the final ldp/stp sequence). That check has been 
changed so that the last register may be clobbered and the testcase 
(gcc.target/aarch64/ldp_stp_10.c) now passes.


Bootstrap and regtest OK.

OK for trunk?

Jackson

Changelog:

gcc/

2018-06-25  Jackson Woodruff  

    PR target/86014
    * config/aarch64/aarch64.c 
(aarch64_operands_adjust_ok_for_ldpstp):

    Remove address clobber check on last register.

This looks good to me but you will need a maintainer to approve it. 
The only
thing I would add is that if you could move the comment on top of the 
for loop

to this patch. That is, keep the original
/* Check if the addresses are clobbered by load.  */
in your [1/2] and make the comment change in [2/2].

Thanks, change made.  OK for trunk?

Thanks,

Jackson
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index da44b33b2bc12f9aa2122cf5194e244437fb31a5..8a027974e9772cacf5f5cb8ec61e8ef62187e879 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -17071,9 +17071,10 @@ aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
 	return false;
 }
 
-  /* Check if addresses are clobbered by load.  */
+  /* Only the last register in the order in which they occur
+ may be clobbered by the load.  */
   if (load)
-for (int i = 0; i < num_instructions; i++)
+for (int i = 0; i < num_instructions - 1; i++)
   if (reg_mentioned_p (reg[i], mem[i]))
 	return false;
 


Re: [AArch64] Use arrays and loops rather than numbered variables in aarch64_operands_adjust_ok_for_ldpstp [1/2]

2018-07-11 Thread Jackson Woodruff

Hi Sudi,

Thanks for the review.


On 07/10/2018 10:56 AM, Sudakshina wrote:

Hi Jackson


-  if (!MEM_P (mem_1) || aarch64_mem_pair_operand (mem_1, mode))
+  if (!MEM_P (mem[1]) || aarch64_mem_pair_operand (mem[1], mode))

mem_1 == mem[1]?

Oops, yes... That should be mem[0].


 return false;

-  /* The mems cannot be volatile.  */
...

/* If we have SImode and slow unaligned ldp,
  check the alignment to be at least 8 byte. */
   if (mode == SImode
   && (aarch64_tune_params.extra_tuning_flags
-  & AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW)
+      & AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW)
   && !optimize_size
-  && MEM_ALIGN (mem_1) < 8 * BITS_PER_UNIT)
+  && MEM_ALIGN (mem[1]) < 8 * BITS_PER_UNIT)

Likewise

Done

...
   /* Check if the registers are of same class.  */
-  if (rclass_1 != rclass_2 || rclass_2 != rclass_3 || rclass_3 != 
rclass_4)

-    return false;
+  for (int i = 0; i < 3; i++)

num_instructions -1 instead of 3 would be more consistent.

Done


+    if (rclass[i] != rclass[i + 1])
+  return false;

It looks good otherwise.

Thanks
Sudi


Re-regtested and boostrapped.

OK for trunk?

Thanks,

Jackson
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 01f35f8e8525adb455780269757452c8c3eb20be..da44b33b2bc12f9aa2122cf5194e244437fb31a5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -17026,23 +17026,21 @@ bool
 aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
    scalar_mode mode)
 {
-  enum reg_class rclass_1, rclass_2, rclass_3, rclass_4;
-  HOST_WIDE_INT offvals[4], msize;
-  rtx mem_1, mem_2, mem_3, mem_4, reg_1, reg_2, reg_3, reg_4;
-  rtx base_1, base_2, base_3, base_4, offset_1, offset_2, offset_3, offset_4;
+  const int num_instructions = 4;
+  enum reg_class rclass[num_instructions];
+  HOST_WIDE_INT offvals[num_instructions], msize;
+  rtx mem[num_instructions], reg[num_instructions],
+  base[num_instructions], offset[num_instructions];
 
   if (load)
 {
-  reg_1 = operands[0];
-  mem_1 = operands[1];
-  reg_2 = operands[2];
-  mem_2 = operands[3];
-  reg_3 = operands[4];
-  mem_3 = operands[5];
-  reg_4 = operands[6];
-  mem_4 = operands[7];
-  gcc_assert (REG_P (reg_1) && REG_P (reg_2)
-		  && REG_P (reg_3) && REG_P (reg_4));
+  for (int i = 0; i < num_instructions; i++)
+	{
+	  reg[i] = operands[2 * i];
+	  mem[i] = operands[2 * i + 1];
+
+	  gcc_assert (REG_P (reg[i]));
+	}
 
   /* Do not attempt to merge the loads if the loads clobber each other.  */
   for (int i = 0; i < 8; i += 2)
@@ -17051,53 +17049,47 @@ aarch64_operands_adjust_ok_for_ldpstp (rtx *operands, bool load,
 	return false;
 }
   else
-{
-  mem_1 = operands[0];
-  reg_1 = operands[1];
-  mem_2 = operands[2];
-  reg_2 = operands[3];
-  mem_3 = operands[4];
-  reg_3 = operands[5];
-  mem_4 = operands[6];
-  reg_4 = operands[7];
-}
+for (int i = 0; i < num_instructions; i++)
+  {
+	mem[i] = operands[2 * i];
+	reg[i] = operands[2 * i + 1];
+  }
+
   /* Skip if memory operand is by itslef valid for ldp/stp.  */
-  if (!MEM_P (mem_1) || aarch64_mem_pair_operand (mem_1, mode))
+  if (!MEM_P (mem[0]) || aarch64_mem_pair_operand (mem[0], mode))
 return false;
 
-  /* The mems cannot be volatile.  */
-  if (MEM_VOLATILE_P (mem_1) || MEM_VOLATILE_P (mem_2)
-  || MEM_VOLATILE_P (mem_3) ||MEM_VOLATILE_P (mem_4))
-return false;
+  for (int i = 0; i < num_instructions; i++)
+{
+  /* The mems cannot be volatile.  */
+  if (MEM_VOLATILE_P (mem[i]))
+	return false;
 
-  /* Check if the addresses are in the form of [base+offset].  */
-  extract_base_offset_in_addr (mem_1, _1, _1);
-  if (base_1 == NULL_RTX || offset_1 == NULL_RTX)
-return false;
-  extract_base_offset_in_addr (mem_2, _2, _2);
-  if (base_2 == NULL_RTX || offset_2 == NULL_RTX)
-return false;
-  extract_base_offset_in_addr (mem_3, _3, _3);
-  if (base_3 == NULL_RTX || offset_3 == NULL_RTX)
-return false;
-  extract_base_offset_in_addr (mem_4, _4, _4);
-  if (base_4 == NULL_RTX || offset_4 == NULL_RTX)
-return false;
+  /* Check if the addresses are in the form of [base+offset].  */
+  extract_base_offset_in_addr (mem[i], base + i, offset + i);
+  if (base[i] == NULL_RTX || offset[i] == NULL_RTX)
+	return false;
+}
+
+  /* Check if addresses are clobbered by load.  */
+  if (load)
+for (int i = 0; i < num_instructions; i++)
+  if (reg_mentioned_p (reg[i], mem[i]))
+	return false;
 
   /* Check if the bases are same.  */
-  if (!rtx_equal_p (base_1, base_2)
-  || !rtx_equal_p (base_2, base_3)
-  || !rtx_equal_p (base_3, base_4))
-return false;
+  for (int i = 0; i < num_instructions - 1; i++)
+if (!rtx_equal_p (base[i], base[i + 1]))
+  return false;
+
+  for (int i = 0; i < num_instructions; i++)
+offvals[i] = INTVAL (offset[i]);
 
-  offvals[0] = 

Re: Fwd: GCC 8.1 :Store Merge pass issue (-fstore-merging).

2018-07-11 Thread Jakub Jelinek
On Wed, Jul 11, 2018 at 09:48:07PM +0530, Umesh Kalappa wrote:
> Cc'ed Kyrill.

Mailing list is not the right medium to report bugs.
I've filed http://gcc.gnu.org/PR86492 for you, but please next time use
bugzilla.

Jakub


[Bug c++/86491] bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

--- Comment #1 from Jason Vas Dias  ---
In investigating this problem, I actually modified 6.4.1's gcc/cp/decl2.c
with the following patch to print out which component of the
base struct it thinks uses the anonymous namespace:

BEGIN PATCH:
--- decl2.c,r260630 2018-05-18 14:47:27.0 +0100
+++ decl2.c 2018-07-11 16:16:13.816541340 +0100
@@ -2154,0 +2155,5 @@
+struct tree_vis
+{ tree *tp;
+  int  vis;
+};
+
@@ -2158 +2163 @@
-  int *vis_p = (int *)data;
+  struct tree_vis *tree_vis_p = (struct tree_vis *)data;
@@ -2166 +2171,2 @@
-  *vis_p = VISIBILITY_ANON;
+  tree_vis_p->tp  = tp;
+  tree_vis_p->vis = VISIBILITY_ANON;
@@ -2170,2 +2176,4 @@
-  && CLASSTYPE_VISIBILITY (*tp) > *vis_p)
-*vis_p = CLASSTYPE_VISIBILITY (*tp);
+  && CLASSTYPE_VISIBILITY (*tp) > tree_vis_p->vis)
+  { tree_vis_p->vis = CLASSTYPE_VISIBILITY (*tp);
+tree_vis_p->tp = tp;
+  }
@@ -2181,3 +2189,8 @@
-  int vis = VISIBILITY_DEFAULT;
-  cp_walk_tree_without_duplicates (, min_vis_r, );
-  return vis;
+  struct tree_vis tv = { NULL, VISIBILITY_DEFAULT };
+  cp_walk_tree_without_duplicates (, min_vis_r, );
+  return tv.vis;
+}
+
+static void
+tree_type_visibility (tree_vis *tvis)
+{ cp_walk_tree_without_duplicates (tvis->tp, min_vis_r, tvis);
@@ -2607 +2620,4 @@
-  int subvis = type_visibility (TREE_TYPE (t));
+  tree_vis tvis = { &(TREE_TYPE (t)), VISIBILITY_DEFAULT };
+  tree_type_visibility ();
+
+  int subvis = tvis.vis;
@@ -2627,2 +2643,2 @@
-%qT has a base %qT whose type uses the anonymous namespace",
-type, TREE_TYPE (t));
+%qT has a base %qT whose type uses the anonymous namespace because of its
component %qT",
+type, TREE_TYPE (t), *tvis.tp);
:END PATCH


But the offending component is actually the whole base class:

 N::NT<_C_, _C_OBJ_, _M_>::NT() void N::NA::C::m()
In file included from /tmp/tM.C:1:0:
/tmp/t2.H: At global scope:
/tmp/t2.H:14:9: warning: ‘N::D’ has a base ‘N::NT’ whose type uses the anonymous namespace because of its component
‘N::NT’ [-Wsubobject-linkage]
   class D : public NT
 ^
 N::D::D() N::D::D() N::D::D() int main() N::NT<_C_, _C_OBJ_, _M_>::NT() [with
_C_ = N::NA::C; _C_* _C_OBJ_ = (& N::NA::c_); void (_C_::* _M_)() =
::NA::C::m] N::NT<_C_, _C_OBJ_, _M_>::NT() [with _C_ = N::NA::C; _C_* _C_OBJ_
= (& N::NA::c_); void (_C_::* _M_)() = ::NA::C::m] N::NT<_C_, _C_OBJ_,
_M_>::NT() [with _C_ = N::NA::C; _C_* _C_OBJ_ = (& N::NA::c_); void (_C_::*
_M_)() = ::NA::C::m]
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data>   
   Assembling
functions:
  void N::NA::C::m() N::D::D() int main() N::NT<_C_, _C_OBJ_,
_M_>::NT() [with _C_ = N::NA::C; _C_* _C_OBJ_ = (& N::NA::c_); void (_C_::*
_M_)() = ::NA::C::m]
Execution times (seconds)
 phase setup :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall 
  1386 kB (62%) ggc
 phase opt and generate  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 (100%) wall
169 kB ( 8%) ggc
 integrated RA   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 (100%) wall
 96 kB ( 4%) ggc
 TOTAL :   0.00 0.00 0.01  
2223 kB


It would be nice if the fix for this could also print out more information
about precisely which component of the structure the compiler thinks uses
the anonymous namespace - it has taken me 2 days of analysis to get this 
far, which could have been avoided if GCC printed out information like
the above in the first place.

[Bug tree-optimization/86492] New: [8/9 Regression] store-merging wrong-code

2018-07-11 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86492

Bug ID: 86492
   Summary: [8/9 Regression] store-merging wrong-code
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

union U
{
  unsigned int r;
  struct
  {
unsigned int a:12;
unsigned int b:4;
unsigned int c:16;
  } f;
};

__attribute__((noinline, noclone)) unsigned int
foo (unsigned int x)
{
  union U r;
  r.r = 0;
  r.f.c = x;
  r.f.b = 0xe;
  return r.r;
}

int
main ()
{
  volatile unsigned int x;
  x = 0x72;
  x = foo (x);
  union U r;
  r.r = x;
  if (r.f.a != 0 || r.f.b != 0xe || r.f.c != 0x72)
__builtin_abort ();
  return 0;
}

is miscompiled by store-merging at -O2.

[Bug c++/86491] New: bogus and unsuppressible warning: 'YYY' has a base 'ZZZ' whose type uses the anonymous namespace

2018-07-11 Thread jason.vas.dias at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86491

Bug ID: 86491
   Summary: bogus and unsuppressible warning: 'YYY' has a base
'ZZZ' whose type uses the anonymous namespace
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jason.vas.dias at gmail dot com
  Target Milestone: ---

Created attachment 44383
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44383=edit
test code demonstrating problem as described above.

This could be a duplicate of (now closed) bug #57317 , 
or a demonstration that the fix for that bug is not complete.

It occurs with all versions of GCC on Linux x86_64 - tested:
 o RHEL7 system compiler: 4.8.5 (rpm: 4.8.5-28.el7_5.1.x86_64)
My own builds of:
 o gcc 5.4.0
 o gcc 6.4.1 (from gcc-6-branch r260630)
 o gcc 7.3.1 (r260631)
 o gcc 8.1.0 (r261026) 
.

The following test code triggers a -Wsubobject-linkage warning,
with all the above compilers :

./t2.H:14:9: warning: 'N::D' has a base 'N::NT whose type uses the anonymous namespace \
 [enabled by default]
  class D : public NT
^

I have constructed the following test code,
which triggers the same -Wsubobject-linkage warning
I am seeing in a large more complex real-world example.

I am not intentionally using the anonymous namespace
in any of this code, and I cannot for the life of me
see how any part of it uses the anonymous namespace:

Test Code :

File t1.H:

// t1.H :
namespace N
{
  template < class _C_, _C_ *_C_OBJ_, void (_C_::*_M_)() >
  class NT
  { static constexpr _C_ *c_ = _C_OBJ_;
  public:
NT()
{ (c_->*_M_)();
}
  };
}


File t2.H:

// t2.H :
#include 
namespace N
{
  namespace NA
  {
class C
{ public:
  void m()
  {}
};
static C c_;
  }

  class D : public NT
  {public:
typedef NT NT_t;
D(): NT_t()
{}
  };
}
file tM.C (main program):


#include 

extern int main()
{ N::D d;
  (void)d;
  return 0;
}



The above files are included in the attached 
'Wsubobject-linkage-anon-base-bug.tar' file, which includes a make file:

$ tar -xpf Wsubobject-linkage-anon-base-bug.tar
$ cd Wsubobject-linkage-anon-base-bug
$ make
g++ -std=gnu++14 -I. -Wall -Wextra -Werror -o tM tM.C
In file included from tM.C:1:0:
./t2.H:14:9: error: 'N::D' has a base 'N::NT' whose type uses the anonymous namespace [-Werror]
   class D : public NT
 ^
cc1plus: all warnings being treated as errors
make: *** [Makefile:4: tM] Error 1

[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #2 from Alexander Monakov  ---
Note that Gold does not exhibit this issue. I think ld.bfd is at fault here.

We've hit similar issues with some internal plugin development. The main issue
is, ld.bfd feeds the plugin with objects extracted from static archives, but
those objects do not satisfy any unresolved references and would not be
extracted in the first place in non-LTO link. So ld.bfd is causing useless
extra work both for itself and the compiler plugin.

It would be nice to fix this on ld.bfd side so future plugin writers don't need
to wrestle with this issue.

[Bug tree-optimization/80122] __builtin_va_arg_pack() and __builtin_va_arg_pack_len() does not work correctly

2018-07-11 Thread rpirrera at aitek dot it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80122

rpirrera at aitek dot it changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #12 from rpirrera at aitek dot it ---
I have found another issue with __builtin_va_arg_pack_len() that happens when
the functions that uses __builtin_va_arg_pack() calls another function that
uses __builtin_va_arg_pack_len(), the argument len is miscalculated as you can
see in the attached example.
This was working in GCC version 4.4.7.


/*** TESTBUILTIN BEGIN /
static inline __attribute__(( __always_inline__)) int 
funA(unsigned int param, ...) 
{ 
return __builtin_va_arg_pack_len(); 
}

static inline __attribute__(( __always_inline__)) int
funB(unsigned int param, ...)
{ 
return funA(param,  2, 4, __builtin_va_arg_pack()); 
}

int 
testBuiltin(void) 
{ 
printf(ANSI_BOLD "%s" ANSI_RESET " ... ", __FUNCTION__);

int rc = funB(0,1,2); 
if (rc != 4) {
return 1;
}

return 0;
}

Thank you!

Fwd: GCC 8.1 :Store Merge pass issue (-fstore-merging).

2018-07-11 Thread Umesh Kalappa
Cc'ed Kyrill.


-- Forwarded message -
From: Umesh Kalappa 
Date: Wed, Jul 11, 2018, 7:37 PM
Subject: GCC 8.1 :Store Merge pass issue (-fstore-merging).
To: 


Hi Everyone ,

We have the below case ,where store marge pass doing the invalid
optimization (thats our observations on powerpc ) ,i.e

C case :

typedef unsigned int UINT32;

typedef union
{
UINT32 regVal;
struct
{
UINT32 mask:1;
UINT32 a:1;
UINT32 :6;
UINT32 p:1;
UINT32 s:1;
UINT32 :2;
UINT32 priority:4;
UINT32 vector:16;
} field;
} MPIC_IVPR;

UINT32 test(UINT32 vector)
{
MPIC_IVPR   mpicIvpr;

mpicIvpr.regVal = 0;
mpicIvpr.field.vector = vector;
mpicIvpr.field.priority = 0xe;

return mpicIvpr.regVal;
}


  gcc -O2 -S   test.c

  ...
lis 3,0xe   ;; mpicIvpr.field.priority = 15
blr
  ...

the store dump as

Processing basic block <2>:
Starting new chain with statement:
mpicIvpr.regVal = 0;
The base object is:

Recording immediate store from stmt:
mpicIvpr.field.vector = _1;
Recording immediate store from stmt:
mpicIvpr.field.priority = 14;
stmt causes chain termination:
_7 = mpicIvpr.regVal;
Attempting to coalesce 3 stores in chain.
Store 0:
bitsize:32 bitpos:0 val:
0

Store 1:
bitsize:4 bitpos:12 val:
14

Store 2:
bitsize:16 bitpos:16 val:
_1


After writing 0 of size 32 at position 0 the merged region contains:
0 0 0 0 0 0 0 0
After writing 14 of size 4 at position 12 the merged region contains:
0 e 0 0 0 0 0 0
Coalescing successful!
Merged into 1 stores
New sequence of 1 stmts to replace old one of 2 stmts
# .MEM_6 = VDEF <.MEM_5>
MEM[(union  *)] = 917504;
Merging successful!
Volatile access terminates all chains
test (UINT32 vector)
{
  union MPIC_IVPR mpicIvpr;
  short unsigned int _1;
  UINT32 _7;

   [local count: 1073741825]:
  _1 = (short unsigned int) vector_4(D);
  mpicIvpr.field.vector = _1;
  MEM[(union  *)] = 917504;
  _7 = mpicIvpr.regVal;
  mpicIvpr ={v} {CLOBBER};
  return _7;

}


As noticed  from dump ,the store of  .regVal and  priority is folded
to single store ,since the rhs operand is constant in both stmts by
leaving the  above cfg and making  mpicIvpr.field.vector = _1 stmt as
dead code,hence latter DCE deletes the same ,which results with  the
incorrect asm as show above and we are in process of debugging the
store merge pass and we see that "  mpicIvpr.field.vector = _1; "
should clobber the first store "mpicIvpr.regVal = 0;" in the merge
store vector ,but its not clobbering ,by disabling the handling the
overlapping store ,the above case works , i.e
if(0)
{
 /* |---store 1---|
   |---store 2---|
 Overlapping stores.  */
  if (IN_RANGE (info->bitpos, merged_store->start,
merged_store->start + merged_store->width - 1))
{
  if (info->rhs_code == INTEGER_CST
  && merged_store->stores[0]->rhs_code == INTEGER_CST)
{
  merged_store->merge_overlapping (info);
  continue;
}
}

}

before we conclude on the same ,we would like to hear any comments
from community ,which helps us to resolve the issue  and by disabling
the store merge pass as expected the above case works .

Thank you. and looking for any suggestions on the same.
~Umesh


Re: [PATCH] doc: add missing "mode" type attribute

2018-07-11 Thread Jeff Law
On 07/10/2018 10:50 AM, Paul Koning wrote:
> "mode" is documented as a variable attribute but not as a type attribute.  
> This fixes that omission.  I simply copied the other text, it seemed suitable 
> as it stands.
> 
> The attributes are normally listed in alphabetical order but "mode" was out 
> of order in the variable attributes.
> 
> Ok for trunk?
> 
>   paul
> 
> ChangeLog:
> 
> 2018-07-10  Paul Koning  
> 
>   * doc/extend.texi (Common Variable Attributes): Move "mode" into
>   alphabetical order.
>   (Common Type Attributes): Add "mode" attribute.
OK.
jeff


Re: allow thread_through_all_blocks() to start from the same initial BB

2018-07-11 Thread Jeff Law
On 07/10/2018 05:14 AM, Aldy Hernandez wrote:
> I believe I missed this companion patch when I submitted...
> 
>    Subject: jump threading multiple paths that start from the same BB
> 
> The attached patch changes thread_through_all_blocks to allow threads
> that start from the same basic block as another thread.
> 
> OK for trunk?
> 
> curr.patch
> 
> 
> gcc/
> 
> * tree-ssa-threadupdate.c (thread_through_all_blocks): Do not jump
>   thread twice from the same starting edge.
OK
jeff


Re: [PATCH, doc] Small clarification on define_subst

2018-07-11 Thread Jeff Law
On 07/08/2018 06:03 PM, Paul Koning wrote:
> In doing CCmode work I was confused how define_subst handles cases where the 
> same argument appears more than once.  The attached clarifies this.
> 
> Ok for trunk?
> 
>   paul
> 
> ChangeLog:
> 
> 2018-07-08  Paul Koning  
> 
>   * doc/md.texi (define_subst): Document how multiple occurrences of
>   the same argument in the replacement pattern are handled.
OK.
jeff


[Bug lto/86490] lto1: fatal error: multiple prevailing defs

2018-07-11 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-07-11
 CC||hubicka at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Liška  ---
Thanks H.J. for investigation.
Honza can you please take a look?

Re: [PATCH 0/5] [RFC v2] Higher-level reporting of vectorization problems

2018-07-11 Thread Richard Sandiford
David Malcolm  writes:
> On Mon, 2018-06-25 at 11:10 +0200, Richard Biener wrote:
>> On Fri, 22 Jun 2018, David Malcolm wrote:
>> 
>> > NightStrike and I were chatting on IRC last week about
>> > issues with trying to vectorize the following code:
>> > 
>> > #include 
>> > std::size_t f(std::vector> const & v) {
>> >std::size_t ret = 0;
>> >for (auto const & w: v)
>> >ret += w.size();
>> >return ret;
>> > }
>> > 
>> > icc could vectorize it, but gcc couldn't, but neither of us could
>> > immediately figure out what the problem was.
>> > 
>> > Using -fopt-info leads to a wall of text.
>> > 
>> > I tried using my patch here:
>> > 
>> >  "[PATCH] v3 of optinfo, remarks and optimization records"
>> >   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01267.html
>> > 
>> > It improved things somewhat, by showing:
>> > (a) the nesting structure via indentation, and
>> > (b) the GCC line at which each message is emitted (by using the
>> > "remark" output)
>> > 
>> > but it's still a wall of text:
>> > 
>> >   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.remarks.
>> > html
>> >   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.d/..%7C.
>> > .%7Csrc%7Ctest.cc.html#line-4
>> > 
>> > It doesn't yet provide a simple high-level message to a
>> > tech-savvy user on what they need to do to get GCC to
>> > vectorize their loop.
>> 
>> Yeah, in particular the vectorizer is way too noisy in its low-level
>> functions.  IIRC -fopt-info-vec-missed is "somewhat" better:
>> 
>> t.C:4:26: note: step unknown.
>> t.C:4:26: note: vector alignment may not be reachable
>> t.C:4:26: note: not ssa-name.
>> t.C:4:26: note: use not simple.
>> t.C:4:26: note: not ssa-name.
>> t.C:4:26: note: use not simple.
>> t.C:4:26: note: no array mode for V2DI[3]
>> t.C:4:26: note: Data access with gaps requires scalar epilogue loop
>> t.C:4:26: note: can't use a fully-masked loop because the target
>> doesn't 
>> have the appropriate masked load or store.
>> t.C:4:26: note: not ssa-name.
>> t.C:4:26: note: use not simple.
>> t.C:4:26: note: not ssa-name.
>> t.C:4:26: note: use not simple.
>> t.C:4:26: note: no array mode for V2DI[3]
>> t.C:4:26: note: Data access with gaps requires scalar epilogue loop
>> t.C:4:26: note: op not supported by target.
>> t.C:4:26: note: not vectorized: relevant stmt not supported: _15 =
>> _14 
>> /[ex] 4;
>> t.C:4:26: note: bad operation or unsupported loop bound.
>> t.C:4:26: note: not vectorized: no grouped stores in basic block.
>> t.C:4:26: note: not vectorized: no grouped stores in basic block.
>> t.C:6:12: note: not vectorized: not enough data-refs in basic block.
>> 
>> 
>> > The pertinent dump messages are:
>> > 
>> > test.cc:4:23: remark: === try_vectorize_loop_1 ===
>> > [../../src/gcc/tree-vectorizer.c:674:try_vectorize_loop_1]
>> > cc1plus: remark:
>> > Analyzing loop at test.cc:4
>> > [../../src/gcc/dumpfile.c:735:ensure_pending_optinfo]
>> > test.cc:4:23: remark:  === analyze_loop_nest ===
>> > [../../src/gcc/tree-vect-loop.c:2299:vect_analyze_loop]
>> > [...snip...]
>> > test.cc:4:23: remark:   === vect_analyze_loop_operations ===
>> > [../../src/gcc/tree-vect-loop.c:1520:vect_analyze_loop_operations]
>> > [...snip...]
>> > test.cc:4:23: remark:==> examining statement: ‘_15 = _14 /[ex]
>> > 4;’ [../../src/gcc/tree-vect-stmts.c:9382:vect_analyze_stmt]
>> > test.cc:4:23: remark:vect_is_simple_use: operand ‘_14’
>> > [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
>> > test.cc:4:23: remark:def_stmt: ‘_14 = _8 - _7;’
>> > [../../src/gcc/tree-vect-stmts.c:10098:vect_is_simple_use]
>> > test.cc:4:23: remark:type of def: internal [../../src/gcc/tree-
>> > vect-stmts.c:10112:vect_is_simple_use]
>> > test.cc:4:23: remark:vect_is_simple_use: operand ‘4’
>> > [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
>> > test.cc:4:23: remark:op not supported by target.
>> > [../../src/gcc/tree-vect-stmts.c:5932:vectorizable_operation]
>> > test.cc:4:23: remark:not vectorized: relevant stmt not
>> > supported: ‘_15 = _14 /[ex] 4;’ [../../src/gcc/tree-vect-
>> > stmts.c:9565:vect_analyze_stmt]
>> > test.cc:4:23: remark:   bad operation or unsupported loop bound.
>> > [../../src/gcc/tree-vect-loop.c:2043:vect_analyze_loop_2]
>> > cc1plus: remark: vectorized 0 loops in function.
>> > [../../src/gcc/tree-vectorizer.c:904:vectorize_loops]
>> > 
>> > In particular, that complaint from
>> >   [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
>> > is coming from:
>> > 
>> >   if (!ok)
>> > {
>> >   if (dump_enabled_p ())
>> > {
>> >   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >"not vectorized: relevant stmt not ");
>> >   dump_printf (MSG_MISSED_OPTIMIZATION, "supported: ");
>> >   dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
>> > stmt, 0);
>> > }
>> > 
>> >   return false;
>> > }
>> > 
>> > This got me thinking: the user 

[Bug lto/86490] New: lto1: fatal error: multiple prevailing defs

2018-07-11 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86490

Bug ID: 86490
   Summary: lto1: fatal error: multiple prevailing defs
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

[hjl@gnu-cfl-1 pr23350]$ cat foo.c
int foo;
void bar() {}
[hjl@gnu-cfl-1 pr23350]$ cat bar.c
int foo;
void bar() {}
[hjl@gnu-cfl-1 pr23350]$ cat main.c
int foo;
int
main ()
{
  return foo;
}
[hjl@gnu-cfl-1 pr23350]$ make CC=gcc
gcc -O2 -flto   -c -o main.o main.c
gcc -O2 -flto   -c -o foo.o foo.c
ar --plugin `gcc -print-prog-name=liblto_plugin.so` -rusc libfoo.a foo.o
gcc -O2 -flto   -c -o bar.o bar.c
ar --plugin `gcc -print-prog-name=liblto_plugin.so` -rusc libbar.a bar.o
gcc -o x  main.o libfoo.a libbar.a
lto1: fatal error: multiple prevailing defs for ‘bar’
compilation terminated.
lto-wrapper: fatal error: gcc returned 1 exit status
compilation terminated.
/usr/local/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:14: x] Error 1
[hjl@gnu-cfl-1 pr23350]$ 

Linker has:

 if (!blhe)
{
  /* The plugin is called to claim symbols in an archive element
 from plugin_object_p.  But those symbols aren't needed to
 create output.  They are defined and referenced only within
 IR.  */
  switch (syms[n].def)
{
default:
  abort ();
case LDPK_UNDEF:
case LDPK_WEAKUNDEF:
  res = LDPR_UNDEF;
  break;
case LDPK_DEF:
case LDPK_WEAKDEF:
case LDPK_COMMON:
  res = LDPR_PREVAILING_DEF_IRONLY;
  break;
}
  goto report_symbol;
}

lto1 shouldn't issue

lto1: fatal error: multiple prevailing defs for ‘bar’

unless bar will be included in output.

[PATCH, S390] Increase function alignment to 16 bytes

2018-07-11 Thread Robin Dapp
Hi,

the following patch increases the default function alignment to 16
bytes.  This helps get rid of some unwanted performance effects.

I'm unsure whether or when it's necessary to implement
OVERRIDE_OPTIONS_AFTER_CHANGE.
Apparently ia64 did it to set flags that are reset when using
__attribute__((optimize)). i386 calls i386_default_align () and sets
various alignments only when the alignment value is unset but when is
e.g. global_options.x_str_align_functions actually unset except for the
very first call?

Trying simple examples like


 void foo () {};

 __attribute__((optimize("Os")))
 void bar () {};


I did not observe that the default alignment, once set, was reset anywhere.

Regards
 Robin

--

gcc/ChangeLog:

2018-07-11  Robin Dapp  

* config/s390/s390.c (s390_default_align): Set default
function alignment.
(s390_override_options_after_change): New.
(s390_option_override_internal): Call s390_default_align.
(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): New.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 8df195ddd78..eaeba89b321 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -15322,6 +15322,23 @@ s390_function_specific_restore (struct gcc_options *opts,
   opts->x_s390_cost_pointer = (long)processor_table[opts->x_s390_tune].cost;
 }
 
+static void
+s390_default_align (struct gcc_options *opts)
+{
+  /* Set the default function alignment to 16 in order to get rid of
+ some unwanted performance effects. */
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions
+  && opts->x_s390_tune >= PROCESSOR_2964_Z13
+  && !opts->x_optimize_size)
+opts->x_str_align_functions = "16";
+}
+
+static void
+s390_override_options_after_change (void)
+{
+  s390_default_align (_options);
+}
+
 static void
 s390_option_override_internal (bool main_args_p,
 			   struct gcc_options *opts,
@@ -15559,6 +15576,9 @@ s390_option_override_internal (bool main_args_p,
 			 opts->x_param_values,
 			 opts_set->x_param_values);
 
+  /* Set the default alignment.  */
+  s390_default_align (opts);
+
   /* Call target specific restore function to do post-init work.  At the moment,
  this just sets opts->x_s390_cost_pointer.  */
   s390_function_specific_restore (opts, NULL);
@@ -16751,6 +16771,9 @@ s390_case_values_threshold (void)
 #undef TARGET_PASS_BY_REFERENCE
 #define TARGET_PASS_BY_REFERENCE s390_pass_by_reference
 
+#undef  TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE
+#define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE s390_override_options_after_change
+
 #undef TARGET_FUNCTION_OK_FOR_SIBCALL
 #define TARGET_FUNCTION_OK_FOR_SIBCALL s390_function_ok_for_sibcall
 #undef TARGET_FUNCTION_ARG


[arm] Put CPU's FPU capabilities directly in the ISA specification

2018-07-11 Thread Richard Earnshaw (lists)
As part of the transition from the original support for named FPUs to
general FPU properties I defined an entry in the CPU definitions in
arm-cpus.in to use a named FPU.  However, that has now outlived its
usefulness and increasingly we are likely to find that newer cores do
not fit the legacy FPU names very well.  Furthermore it is now possible
to encode all the FPU capatilities directly in the ISA definitions,
often as simply as using +fp or +simd.

So this patch removes the fpu field from the "define cpu" entries and
instead encodes the same information in the isa field.  This also alows
us to remove a bit of now-dead code from parsecpu.awk.

* config/arm/arm-cpus.in: Move information from fpu field of each
cpu definition to the isa field.
* config/arm/parsecpu.awk (fpu): Delete match rule.
(gen_comm_data): Don't add bits from the CPU's FPU entry.

Committed to trunk.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index c2dacda..d6eed2f 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -620,7 +620,6 @@ end arch iwmmxt2
 #   [tune for ]
 #   [tune flags ]
 #   architecture 
-#   [fpu ]
 #   [isa ]
 #   [option  add|remove ]*
 #   [optalias  ]*
@@ -633,7 +632,7 @@ end arch iwmmxt2
 # isa flags are appended to those defined by the architecture.
 # Each add option must have a distinct feature set and each remove
 # option must similarly have a distinct feature set.  Option aliases can be
-# added with the optalias statement
+# added with the optalias statement.
 
 # V4 Architecture Processors
 begin cpu arm8
@@ -778,8 +777,7 @@ end cpu arm1020t
 # V5TE Architecture Processors
 begin cpu arm9e
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm9e
@@ -787,8 +785,7 @@ end cpu arm9e
 begin cpu arm946e-s
  cname arm946es
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm946e-s
@@ -796,8 +793,7 @@ end cpu arm946e-s
 begin cpu arm966e-s
  cname arm966es
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm966e-s
@@ -805,32 +801,28 @@ end cpu arm966e-s
 begin cpu arm968e-s
  cname arm968es
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm968e-s
 
 begin cpu arm10e
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs fastmul
 end cpu arm10e
 
 begin cpu arm1020e
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs fastmul
 end cpu arm1020e
 
 begin cpu arm1022e
  tune flags LDSCHED
- architecture armv5te
- fpu vfpv2
+ architecture armv5te+fp
  option nofp remove ALL_FP
  costs fastmul
 end cpu arm1022e
@@ -883,8 +875,7 @@ end cpu fa726te
 begin cpu arm926ej-s
  cname arm926ejs
  tune flags LDSCHED
- architecture armv5tej
- fpu vfpv2
+ architecture armv5tej+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm926ej-s
@@ -892,8 +883,7 @@ end cpu arm926ej-s
 begin cpu arm1026ej-s
  cname arm1026ejs
  tune flags LDSCHED
- architecture armv5tej
- fpu vfpv2
+ architecture armv5tej+fp
  option nofp remove ALL_FP
  costs 9e
 end cpu arm1026ej-s
@@ -910,8 +900,7 @@ end cpu arm1136j-s
 begin cpu arm1136jf-s
  cname arm1136jfs
  tune flags LDSCHED
- architecture armv6j
- fpu vfpv2
+ architecture armv6j+fp
  costs 9e
 end cpu arm1136jf-s
 
@@ -925,8 +914,7 @@ end cpu arm1176jz-s
 begin cpu arm1176jzf-s
  cname arm1176jzfs
  tune flags LDSCHED
- architecture armv6kz
- fpu vfpv2
+ architecture armv6kz+fp
  costs 9e
 end cpu arm1176jzf-s
 
@@ -938,8 +926,7 @@ end cpu mpcorenovfp
 
 begin cpu mpcore
  tune flags LDSCHED
- architecture armv6k
- fpu vfpv2
+ architecture armv6k+fp
  costs 9e
 end cpu mpcore
 
@@ -953,8 +940,7 @@ end cpu arm1156t2-s
 begin cpu arm1156t2f-s
  cname arm1156t2fs
  tune flags LDSCHED
- architecture armv6t2
- fpu vfpv2
+ architecture armv6t2+fp
  costs v6t2
 end cpu arm1156t2f-s
 
@@ -1012,8 +998,7 @@ end cpu cortex-m0plus.small-multiply
 begin cpu generic-armv7-a
  cname genericv7a
  tune flags LDSCHED
- architecture armv7-a
- fpu vfpv3-d16
+ architecture armv7-a+fp
  option vfpv3-d16 add VFPv3 FP_DBL
  option vfpv3 add VFPv3 FP_D32
  option vfpv3-d16-fp16 add VFPv3 FP_DBL fp16conv
@@ -1033,8 +1018,7 @@ end cpu generic-armv7-a
 begin cpu cortex-a5
  cname cortexa5
  tune flags LDSCHED
- architecture armv7-a
- fpu neon-fp16
+ architecture armv7-a+neon-fp16
  option nosimd remove ALL_SIMD
  option nofp remove ALL_FP
  costs cortex_a5
@@ -1043,8 +1027,7 @@ end cpu cortex-a5
 begin cpu cortex-a7
  cname cortexa7
  tune flags LDSCHED
- architecture armv7ve
- fpu neon-vfpv4
+ architecture armv7ve+simd
  option nosimd remove ALL_SIMD
  option nofp remove ALL_FP
  costs cortex_a7
@@ 

[Bug tree-optimization/86489] New: ICE in gimple_phi_arg starting with r261682 when building 531.deepsjeng_r with FDO + LTO

2018-07-11 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86489

Bug ID: 86489
   Summary: ICE in gimple_phi_arg starting with r261682 when
building 531.deepsjeng_r with FDO + LTO
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pthaugen at gcc dot gnu.org
CC: dje at gcc dot gnu.org, kugan at gcc dot gnu.org,
segher at gcc dot gnu.org, wschmidt at gcc dot gnu.org
  Target Milestone: ---
  Host: powerpc64le-unknown-linux-gnu
Target: powerpc64le-unknown-linux-gnu
 Build: powerpc64le-unknown-linux-gnu

The patch for pr82479 causes an ICE while building CPU2017 531.deepsjeng_r with
FDO and LTO. The ICE occurs during the link step of the -fprofile-use build.

/home/pthaugen/install/gcc/gcc_hunt/bin/g++  -m64 -O3 -mcpu=power9
-fpeel-loops -funroll-loops -ffast-math -mpopcntd -mrecip -flto -DSPEC_LP64
 -m64 -Wl,-q  -Wl,-rpath=/home/pthaugen/install/gcc/gcc_hunt/lib64   attacks.o
bitboard.o bits.o board.o draw.o endgame.o epd.o generate.o initp.o make.o
moves.o neval.o pawn.o preproc.o search.o see.o sjeng.o state.o ttable.o
utils.o  -o deepsjeng_r  
during GIMPLE pass: cunroll
generate.cpp: In function 'gen.constprop':
generate.cpp:159:5: internal compiler error: in gimple_phi_arg, at
gimple.h:4345
 int gen(state_t *s, move_s *moves) {
 ^
0x1013c597 gimple_phi_arg
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/gimple.h:4345
0x1013c5f3 gimple_phi_arg
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/gimple.h:4345
0x1013c5f3 gimple_phi_arg
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/gimple.h:4353
0x10a37607 gimple_phi_arg_def
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/gimple.h:4396
0x10a37607 number_of_iterations_popcount
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:2559
0x10a37607 number_of_iterations_exit_assumptions(loop*, edge_def*,
tree_niter_desc*, gcond**, bool)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:2364
0x10a392eb number_of_iterations_exit_assumptions(loop*, edge_def*,
tree_niter_desc*, gcond**, bool)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:2611
0x10a392eb number_of_iterations_exit(loop*, edge_def*, tree_niter_desc*, bool,
bool)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:2616
0x10a3985f number_of_iterations_exit(loop*, edge_def*, tree_niter_desc*, bool,
bool)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/vec.h:884
0x10a3985f estimate_numbers_of_iterations(loop*)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:4100
0x10a3ce73 estimate_numbers_of_iterations(function*)
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-niter.c:4329
0x10a07ec7 tree_unroll_loops_completely
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-ivcanon.c:1452
0x10a08603 execute
/home/pthaugen/src/gcc/gcc_hunt/gcc/gcc/tree-ssa-loop-ivcanon.c:1612
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.

[Bug c++/86480] [8 Regression] error: parameter packs not expanded with '...' in a recursive variadic lambda

2018-07-11 Thread gufideg at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86480

--- Comment #2 from Guillaume Racicot  ---
Yes of course! I only added the `-std=c++17` flag.

Here's a live example: https://godbolt.org/g/p8KLfE

Go patch committed: Fix evaluation order of LHS index expressions

2018-07-11 Thread Ian Lance Taylor
The Go spec says that when an index expression appears on the left
hand side of an assignment, the operands should be evaluated. The
gofrontend code was assuming that that only referred to the index
operand. But discussion of https://golang.org/issue/23188 has
clarified that this means both the slice/map/string operand and the
index operand. This patch adjusts the gofrontend code accordingly,
fixing the issue.  The test case for this is in
https://golang.org/cl/123115.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 262540)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-8ad67a72a4fa59efffc891e73ecf10020e3c565d
+ea7ac7784791dca517b6681a02c39c11bf136755
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 262540)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -10898,6 +10898,20 @@ Array_index_expression::do_check_types(G
 }
 }
 
+// The subexpressions of an array index must be evaluated in order.
+// If this is indexing into an array, rather than a slice, then only
+// the index should be evaluated.  Since this is called for values on
+// the left hand side of an assigment, evaluating the array, meaning
+// copying the array, will cause a different array to be modified.
+
+bool
+Array_index_expression::do_must_eval_subexpressions_in_order(
+int* skip) const
+{
+  *skip = this->array_->type()->is_slice_type() ? 0 : 1;
+  return true;
+}
+
 // Flatten array indexing by using temporary variables for slices and indexes.
 
 Expression*
Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h (revision 262540)
+++ gcc/go/gofrontend/expressions.h (working copy)
@@ -2771,12 +2771,10 @@ class Index_expression : public Parser_e
this->location());
   }
 
+  // This shouldn't be called--we don't know yet.
   bool
-  do_must_eval_subexpressions_in_order(int* skip) const
-  {
-*skip = 1;
-return true;
-  }
+  do_must_eval_subexpressions_in_order(int*) const
+  { go_unreachable(); }
 
   void
   do_dump_expression(Ast_dump_context*) const;
@@ -2882,11 +2880,7 @@ class Array_index_expression : public Ex
   }
 
   bool
-  do_must_eval_subexpressions_in_order(int* skip) const
-  {
-*skip = 1;
-return true;
-  }
+  do_must_eval_subexpressions_in_order(int* skip) const;
 
   bool
   do_is_addressable() const;
@@ -2965,11 +2959,8 @@ class String_index_expression : public E
   }
 
   bool
-  do_must_eval_subexpressions_in_order(int* skip) const
-  {
-*skip = 1;
-return true;
-  }
+  do_must_eval_subexpressions_in_order(int*) const
+  { return true; }
 
   Bexpression*
   do_get_backend(Translate_context*);
@@ -3052,11 +3043,8 @@ class Map_index_expression : public Expr
   }
 
   bool
-  do_must_eval_subexpressions_in_order(int* skip) const
-  {
-*skip = 1;
-return true;
-  }
+  do_must_eval_subexpressions_in_order(int*) const
+  { return true; }
 
   // A map index expression is an lvalue but it is not addressable.
 


GCC 8.1 :Store Merge pass issue (-fstore-merging).

2018-07-11 Thread Umesh Kalappa
Hi Everyone ,

We have the below case ,where store marge pass doing the invalid
optimization (thats our observations on powerpc ) ,i.e

C case :

typedef unsigned int UINT32;

typedef union
{
UINT32 regVal;
struct
{
UINT32 mask:1;
UINT32 a:1;
UINT32 :6;
UINT32 p:1;
UINT32 s:1;
UINT32 :2;
UINT32 priority:4;
UINT32 vector:16;
} field;
} MPIC_IVPR;

UINT32 test(UINT32 vector)
{
MPIC_IVPR   mpicIvpr;

mpicIvpr.regVal = 0;
mpicIvpr.field.vector = vector;
mpicIvpr.field.priority = 0xe;

return mpicIvpr.regVal;
}


  gcc -O2 -S   test.c

  ...
lis 3,0xe   ;; mpicIvpr.field.priority = 15
blr
  ...

the store dump as

Processing basic block <2>:
Starting new chain with statement:
mpicIvpr.regVal = 0;
The base object is:

Recording immediate store from stmt:
mpicIvpr.field.vector = _1;
Recording immediate store from stmt:
mpicIvpr.field.priority = 14;
stmt causes chain termination:
_7 = mpicIvpr.regVal;
Attempting to coalesce 3 stores in chain.
Store 0:
bitsize:32 bitpos:0 val:
0

Store 1:
bitsize:4 bitpos:12 val:
14

Store 2:
bitsize:16 bitpos:16 val:
_1


After writing 0 of size 32 at position 0 the merged region contains:
0 0 0 0 0 0 0 0
After writing 14 of size 4 at position 12 the merged region contains:
0 e 0 0 0 0 0 0
Coalescing successful!
Merged into 1 stores
New sequence of 1 stmts to replace old one of 2 stmts
# .MEM_6 = VDEF <.MEM_5>
MEM[(union  *)] = 917504;
Merging successful!
Volatile access terminates all chains
test (UINT32 vector)
{
  union MPIC_IVPR mpicIvpr;
  short unsigned int _1;
  UINT32 _7;

   [local count: 1073741825]:
  _1 = (short unsigned int) vector_4(D);
  mpicIvpr.field.vector = _1;
  MEM[(union  *)] = 917504;
  _7 = mpicIvpr.regVal;
  mpicIvpr ={v} {CLOBBER};
  return _7;

}


As noticed  from dump ,the store of  .regVal and  priority is folded
to single store ,since the rhs operand is constant in both stmts by
leaving the  above cfg and making  mpicIvpr.field.vector = _1 stmt as
dead code,hence latter DCE deletes the same ,which results with  the
incorrect asm as show above and we are in process of debugging the
store merge pass and we see that "  mpicIvpr.field.vector = _1; "
should clobber the first store "mpicIvpr.regVal = 0;" in the merge
store vector ,but its not clobbering ,by disabling the handling the
overlapping store ,the above case works , i.e
if(0)
{
 /* |---store 1---|
   |---store 2---|
 Overlapping stores.  */
  if (IN_RANGE (info->bitpos, merged_store->start,
merged_store->start + merged_store->width - 1))
{
  if (info->rhs_code == INTEGER_CST
  && merged_store->stores[0]->rhs_code == INTEGER_CST)
{
  merged_store->merge_overlapping (info);
  continue;
}
}

}

before we conclude on the same ,we would like to hear any comments
from community ,which helps us to resolve the issue  and by disabling
the store merge pass as expected the above case works .

Thank you. and looking for any suggestions on the same.
~Umesh


Re: [PATCH] fold strlen() of aggregate members (PR 77357)

2018-07-11 Thread Martin Sebor

On 07/11/2018 07:50 AM, Andre Vieira (lists) wrote:

On 11/07/18 11:00, Andre Vieira (lists) wrote:

On 09/07/18 22:44, Martin Sebor wrote:

On 07/09/2018 06:40 AM, Richard Biener wrote:

On Sun, Jul 8, 2018 at 4:56 AM Martin Sebor  wrote:


On 07/06/2018 09:52 AM, Richard Biener wrote:

On Fri, Jul 6, 2018 at 1:54 AM Martin Sebor  wrote:


GCC folds accesses to members of constant aggregates except
for character arrays/strings.  For example, the strlen() call
below is not folded:

   const char a[][4] = { "1", "12" };

   int f (void) { retturn strlen (a[1]); }

The attached change set enhances the string_constant() function
to make it possible to extract string constants from aggregate
initializers (CONSTRUCTORS).

The initial solution was much simpler but as is often the case,
MEM_REF made it fail to fold things like:

   int f (void) { retturn strlen (a[1] + 1); }

Handling those made the project a bit more interesting and
the final solution somewhat more involved.

To handle offsets into aggregate string members the patch also
extends the fold_ctor_reference() function to extract entire
string array initializers even if the offset points past
the beginning of the string and even though the size and
exact type of the reference are not known (there isn't enough
information in a MEM_REF to determine that).

Tested along with the patch for PR 86415 on x86_64-linux.


+  if (TREE_CODE (init) == CONSTRUCTOR)
+   {
+ tree type;
+ if (TREE_CODE (arg) == ARRAY_REF
+ || TREE_CODE (arg) == MEM_REF)
+   type = TREE_TYPE (arg);
+ else if (TREE_CODE (arg) == COMPONENT_REF)
+   {
+ tree field = TREE_OPERAND (arg, 1);
+ type = TREE_TYPE (field);
+   }
+ else
+   return NULL_TREE;

what's wrong with just

type = TREE_TYPE (field);


In response to your comment below abut size I simplified things
further so determining the type a priori is no longer necessary.


?

+ base_off *= BITS_PER_UNIT;

poly_uint64 isn't enough for "bits", with wide-int you'd use
offset_int,
for poly you'd then use poly_offset?


Okay, I tried to avoid the overflow.  (Converting between all
these flavors of wide int types is a monumental PITA.)



You extend fold_ctor_reference to treat size == 0 specially but then
bother to compute a size here - that looks unneeded?


Yes, well spotted, thanks!  I simplified the code so this isn't
necessary, and neither is the type.



While the offset of the reference determines the first field in the
CONSTRUCTOR, how do you know the access doesn't touch
adjacent ones?  STRING_CSTs do not have to be '\0' terminated,
so consider

  char x[2][4] = { "abcd", "abcd" };

and MEM[] with a char[8] type?  memcpy "inlining" will create
such MEMs for example.


The code is only used to find string constants in initializer
expressions where I don't think the size of the access comes
into play.  If a memcpy() call results in a MEM_REF[char[8],
, 8] that's fine.  It's a valid reference and we can still
get the underlying character sequence (which is represented
as two STRING_CSTs with the two string literals).  I might
be missing the point of your question.


Maybe irrelevant for strlen folding depending on what you do
for missing '\0' termination.



@@ -6554,8 +6577,16 @@ fold_nonarray_ctor_reference (tree type, tree
ctor,
   tree byte_offset = DECL_FIELD_OFFSET (cfield);
   tree field_offset = DECL_FIELD_BIT_OFFSET (cfield);
   tree field_size = DECL_SIZE (cfield);
-  offset_int bitoffset;
-  offset_int bitoffset_end, access_end;
+
+  if (!field_size && TREE_CODE (cval) == STRING_CST)
+   {
+ /* Determine the size of the flexible array member from
+the size of the string initializer provided for it.  */
+ unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (cval);
+ tree eltype = TREE_TYPE (TREE_TYPE (cval));
+ len *= tree_to_uhwi (TYPE_SIZE (eltype));
+ field_size = build_int_cst (size_type_node, len);
+   }

Why does this only apply to STRING_CST initializers and not
CONSTRUCTORS,
say, for

struct S { int i; int a[]; } s = { 1, { 2, 3, 4, 5, 6 } };


I can't think of a use for it.  Do you have something in mind?


Well, you basically implemented a get-CONSTRUCTOR-elt-at-offset
which is useful in other parts of the compiler.  So I don't see why
it shouldn't work for general flex-arrays.



?  And why not use simply

  field_size = TYPE_SIZE (TREE_TYPE (cval));

like you do in c_strlen?


Yes, that's simpler, thanks.



Otherwise looks reasonable.


Attached is an updated patch.  I also enhanced the handling
of non-constant indices.  They were already handled before
to a smaller extent.  (There may be other opportunities
here.)


Please don't do functional changes to a patch in review, without
exactly pointing out the change.  It makes review inefficent for me.

Looks like it might be the NULL type argument handling?



  1   2   >