[committed] v850, default to bigger switch tables

2018-06-25 Thread Jeff Law
So many many years ago when I first wrote the v850 port I included the
ability to have entries in switch jump tables be either 16 or 32 bits
wide, with the default being 16 bits.

In retrospect that was a mistake.  The silent failures when the entries
overflow are just too painful.  This patch changes the default to 32
bits, so the tables themselves are considerably larger.  The dispatch
code itself is slightly more efficient, as we don't have to deal with
sign extending the 16-bit offset to 32 bits.
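
As a rough illustration of the silent failure described above, here is a
minimal C sketch.  The helper names are hypothetical, and it ignores any
scaling or encoding the real v850 port applies to table entries; it only
shows what happens when a displacement is squeezed through a 16-bit entry
and sign-extended again by the dispatch code.

```c
#include <stdint.h>

/* Hypothetical helper: store DISPLACEMENT in a 16-bit table entry, then
   recover it the way dispatch code would, by sign-extending to 32 bits.  */
int32_t entry16_roundtrip(int32_t displacement)
{
    /* Only the low 16 bits survive in the table.  */
    uint16_t raw = (uint16_t)((uint32_t)displacement & 0xFFFFu);

    /* Sign-extend the 16-bit entry back to 32 bits.  */
    return (raw & 0x8000u) ? (int32_t)raw - 0x10000 : (int32_t)raw;
}

/* Hypothetical helper: a 32-bit entry holds the displacement unchanged
   and needs no sign extension.  */
int32_t entry32_roundtrip(int32_t displacement)
{
    return displacement;
}

/* Nonzero if DISPLACEMENT can be represented in a 16-bit entry.  */
int fits_in_16_bits(int32_t displacement)
{
    return displacement >= INT16_MIN && displacement <= INT16_MAX;
}
```

A displacement that does not fit comes back from the 16-bit path as a
different, still plausible-looking value, which is why nothing complains
at build time.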

Code with different switch table sizes can co-exist.  Someone who really
wanted those smaller tables can use -mno-big-switch to get the prior
behavior.

This fixes a few more testsuite failures.  Committed to the trunk.

Jeff
commit 9038d0f6e0a23e2dfeb27e9cad471e98ebc155da
Author: Jeff Law 
Date:   Mon Jun 25 23:44:30 2018 -0600

* common/config/v850/v850-common.c (TARGET_DEFAULT_TARGET_FLAGS): Turn
on -mbig-switch by default.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b5f073de663..3b1bd94f818 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2018-06-25  Jeff Law  
 
+   * common/config/v850/v850-common.c (TARGET_DEFAULT_TARGET_FLAGS): Turn
+   on -mbig-switch by default.
+
* config/v850/predicates.md (const_float_1_operand): Fix match_code
test.
(const_float_0_operand): Remove unused predicate.
diff --git a/gcc/common/config/v850/v850-common.c b/gcc/common/config/v850/v850-common.c
index 6a93ff08d84..803bc169018 100644
--- a/gcc/common/config/v850/v850-common.c
+++ b/gcc/common/config/v850/v850-common.c
@@ -122,7 +122,8 @@ static const struct default_options v850_option_optimization_table[] =
   };
 
 #undef  TARGET_DEFAULT_TARGET_FLAGS
-#define TARGET_DEFAULT_TARGET_FLAGS (MASK_DEFAULT | MASK_APP_REGS)
+#define TARGET_DEFAULT_TARGET_FLAGS \
+  (MASK_DEFAULT | MASK_APP_REGS | MASK_BIG_SWITCH)
 #undef  TARGET_HANDLE_OPTION
 #define TARGET_HANDLE_OPTION v850_handle_option
 #undef  TARGET_OPTION_OPTIMIZATION_TABLE


Re: fix libcc1 dependencies in toplevel Makefile

2018-06-25 Thread Alexandre Oliva
On Jun 11, 2018, Alexandre Oliva  wrote:

> On Jun  3, 2018, Alexandre Oliva  wrote:
>> On Jun 27, 2017, Alexandre Oliva  wrote:

> 1. address the previously-mentioned fragility in the patch I posted, to
> catch all cases of postbootstrap targets and their deps on
> non-postbootstrap targets.

This turned out to just require some thinking to convince myself it
wouldn't come up.

There was a major problem in the earlier patch, however: @if/@endif
gcc-no-bootstrap wasn't quite what we needed to enclose the preexisting
deps, because that works for cases in which gcc is built but not
bootstrapped, but not cases in which gcc is not built.  I had to
introduce @unless/@endunless to express the desired semantics.


Here's the patch I'll install if nobody objects in the next few days.
Tested on x86_64-linux-gnu with a gcc bootstrap tree, a gcc
non-bootstrap tree, and a binutils+gdb tree.

In the patch below, I've omitted hunks with only whitespace changes to
Makefile.in, so that people can more easily identify how rules are
changing.



Introduce @unless/@endunless and postbootstrap Makefile targets

From: Alexandre Oliva 

This patch turns dependencies of non-bootstrap targets on bootstrap
targets for bootstrap builds into dependencies on stage_last.  This
arrangement gets stage1-bubble to run from stage_last if we haven't
started a bootstrap yet, and to use the current stage otherwise.  This
was already the case of target libs, just not of non-bootstrapped host
modules.

In order to retain preexisting dependencies in non-bootstrap builds,
or in gcc-less builds, this introduces support for @unless/@endunless
pairs in Makefile.in.

There is a remaining possibility of problem if activating, in a tree
configured for bootstrap, a parallel build of two or more modules, at
least one bootstrapped and one not.  In this case, make might decide
to build stage_current and stage_last in parallel, the latter will
start a submake to build stage1 while the initial make, having
satisfied stage_current, proceeds to build the bootstrapped module in
non-bootstrapped configurations.  The two builds will overlap and will
likely conflict.  This situation does NOT arise in normal settings,
however: a post-bootstrap build of all-host all-target will indeed
activate such targets concurrently, but only after building all
bootstrapped modules successfully, and it will have both stage_last
and stage_current targets already satisfied, so the potential race
between builds will not arise.

Another remaining problem, that is slightly expanded with this patch,
is that of an interrupted build in a tree configured for bootstrap,
continued with a non-bootstrapped target.  Target modules that were
not bootstrapped would already fail to complete the current stage when
activated explicitly in the command line for a retry; host modules,
however, would attempt to build their bootstrapped dependencies, which
is what led to the problem of concurrent builds addressed with this
patch.  An interrupted or failed build might still recover correctly,
if the non-bootstrapped target is activated in both builds, because
then make will remove stage_last when its build command is
interrupted, so that it will attempt to recreate it with stage1-bubble
in the second try.  A bootstrap build, however, will not be attempting
to build stage_last, so the file will remain and the retry won't go
through stage1-bubble.  We have lived with that for target modules, so
we can probably live with that for host modules too.

Another undesirable consequence of this change is that non-bootstrapped
host modules, in a tree configured for bootstrap, when activated as
make all-, will build all of stage1 instead of only the
module's usual dependencies.  This is intentional and necessary to fix
the parallel-build problem.  If it's not desirable, disabling the
unnecessary bootstrap configuration will suffice to restore the
original set of dependencies.


for  ChangeLog

* configure.ac: Introduce support for @unless/@endunless.
* Makefile.tpl (dep-kind): Rewrite with cond; return
postbootstrap in some cases.
(make-postboot-dep, postboot-targets): New.
(dependencies): Do not output postbootstrap dependencies at
first.  Output non-target ones changed for configure to depend
on stage_last @if gcc-bootstrap, and the original deps @unless
gcc-bootstrap.
* configure.in, Makefile.in: Rebuilt.
---
 Makefile.in  |  181 +-
 Makefile.tpl |   78 +
 configure|   20 +-
 configure.ac |   20 +-
 4 files changed, 134 insertions(+), 165 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 32a92a6bcd17..e0dfad337a6c 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -56060,9 +56018,7 @@ all-stagefeedback-fixincludes: maybe-all-stagefeedback-libiberty
 all-stageautoprofile-fixincludes: maybe-all-stageautoprofile-libiberty
 

[committed] v850 logical_op_short_circuit

2018-06-25 Thread Jeff Law
v850 should be marked as logical_op_short_circuit for testing purposes
but wasn't.  This fixes about a dozen testsuite failures per multilib.

Installed on the trunk.

Jeff
commit 2224de3ea8b9b4381b07fcb96a2cde9266ead611
Author: law 
Date:   Tue Jun 26 05:19:15 2018 +

* lib/target-supports.exp
(check_effective_target_logical_op_short_circuit): Add v850.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@262129 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 4a90a33986d..5c0de04c4cb 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2018-06-25  Jeff Law  
+
+   * lib/target-supports.exp
+   (check_effective_target_logical_op_short_circuit): Add v850.
+
 2018-06-25  Martin Sebor  
 
PR tree-optimization/86204
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 3c2c62a5800..ffbc882b07d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8801,6 +8801,7 @@ proc check_effective_target_logical_op_short_circuit {} {
 || [istarget powerpc*-*-*]
 || [istarget nios2*-*-*]
 || [istarget riscv*-*-*]
+|| [istarget v850*-*-*]
 || [istarget visium-*-*]
 || [check_effective_target_arm_cortex_m] } {
return 1


Re: Invert sense of NO_IMPLICIT_EXTERN_C

2018-06-25 Thread David Edelsohn
On Mon, Jun 25, 2018 at 12:48 PM Nathan Sidwell  wrote:
>
> NO_IMPLICIT_EXTERN_C was introduced to tell the compiler that it didn't
> need to fake up 'extern "C" { ... }' around system header files.  Over
> the years more and more system headers have become C++-aware, leading to
> more targets defining this macro.
>
> Unfortunately because of the sense of this macro, and that the
> requirement is based on the target-OS, whereas we partition the config
> directory by target-ARCH, it's become hard to know which targets still
> require the older functionality.
>
> There have been a few questions over the past 2 decades to figure this
> out, but they didn't progress.
>
> This patch replaces the negative NO_IMPLICIT_EXTERN_C with the positive
> SYSTEM_IMPLICIT_EXTERN_C.  Targets that previously did not define
> NO_IMPLICIT_EXTERN_C now need to define SYSTEM_IMPLICIT_EXTERN_C.  I
> know of one such target -- AIX, and I'd be grateful this patch could be
> tried there.
>
> Going through the config files was tricky, and I may well have missed
> something.  One suspicious file is config/sparc/openbsd64.h which did
> explicitly undef the macro, with the comment:
>
>/* Inherited from sp64-elf.  */
>
> sp64-elf.h does define the macro, but the other bsd's also define it,
> which leaves me wondering if openbsd.h has bit rotted here.  Which leads
> me to another observation:
>
> It's quite possible the extern "C" functionality is enabled on targets
> that no longer need it, because their observed behaviour would not be
> broken.  On the other hand, the failure mode of not defining its
> replacement (or alternatively mistakenly defining NO_IMPLICIT_EXTERN_C),
> would be immediate and obvious.  And the fix is also simple.
>
> So, if you have a target that you think has C++-unaware system headers,
> please give this patch a spin and report.  Blessing from a GM after a
> few days out there would be nice :)
>
> The lesson here is that when one has a transition, choose an enablement
> mechanism that makes it easy to tell when the transition is complete.

I tried the subset of the patch that directly affects AIX and saw no
ill effects.

Thanks, David
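
For readers unfamiliar with the term, a "C++-aware" system header in the
sense used in the quoted message is one that wraps its own declarations
in extern "C" when compiled as C++, so the compiler does not have to fake
that wrapper up.  A minimal sketch, with a hypothetical function name
standing in for a real system declaration:

```c
/* Hypothetical header contents.  When compiled as C++ the declarations
   get C linkage; when compiled as C the guards expand to nothing.  */
#ifdef __cplusplus
extern "C" {
#endif

int legacy_add(int a, int b);   /* hypothetical C library function */

#ifdef __cplusplus
}
#endif

/* Definition, which would normally live in the C library itself.  */
int legacy_add(int a, int b)
{
    return a + b;
}
```

Headers without such guards are the ones that would still need the
implicit wrapper on targets defining SYSTEM_IMPLICIT_EXTERN_C.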


Re: [PR Fortran/83183] Fix infinite recursion (ICE) with -finit-derived when initializing allocatable BT_DERIVED components

2018-06-25 Thread Steve Kargl
On Mon, Jun 25, 2018 at 06:06:13PM -0400, Fritz Reese wrote:
> 
> The attached patch fixes PR 83183. Previously, attempting to generate
> initializers for allocatable components of the same derived type
> triggered infinite recursion. Passes regression tests. OK for trunk
> and gcc-8-branch?
> 
>if (c->initializer || !generate
>|| (ts->type == BT_CLASS && !c->attr.allocatable)
> +  || (ts->type == BT_DERIVED && c->attr.allocatable)

Fritz,

The patch looks simple enough.  It does seem odd to me
that BT_CLASS has !c->attr.allocatable and BT_DERIVED
is c->attr.allocatable, i.e., bang vs no bang.  Is this
because class is not affected by -finit-derived?

-- 
Steve


[PATCH] Fix bootstrap failure in vect_analyze_loop

2018-06-25 Thread David Malcolm
I ran into this bootstrap failure (with r262092):

../../../src/gcc/tree-vect-loop.c: In function ‘_loop_vec_info* vect_analyze_loop(loop*, loop_vec_info, vec_info_shared*)’:
../../../src/gcc/tree-vect-loop.c:1946:25: error: ‘n_stmts’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   ok = vect_analyze_slp (loop_vinfo, *n_stmts);
~^~
../../../src/gcc/tree-vect-loop.c:2318:12: note: ‘n_stmts’ was declared here
   unsigned n_stmts;
^~~
cc1plus: all warnings being treated as errors

It looks like it's inlining vect_analyze_loop_2 into vect_analyze_loop,
passing &n_stmts in by pointer.

Normally, vect_get_datarefs_in_loop writes:
  *n_stmts = 0;
when
  if (!LOOP_VINFO_DATAREFS (loop_vinfo).exists ())
but not in the "else" path, and then, after lots of complex logic:

  ok = vect_analyze_slp (loop_vinfo, *n_stmts);

it uses the value in vect_analyze_loop_2, passed in via &n_stmts.

So it's not at all clear to me (or the compiler) that the value is
initialized in all paths, and an initialization to zero seems a
reasonable fix.
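
The shape of the problem can be sketched in isolation as follows.  This
is only an illustration with hypothetical function names, not the
vectorizer code: a callee writes through the pointer on one path only,
and the value is read afterwards regardless of which path was taken.

```c
/* Hypothetical stand-in for the inlined analysis: *N_STMTS is written
   only when there are no existing data references, but it is read at
   the end on every path.  */
unsigned analyze_once(int have_datarefs, unsigned stmts_in_loop,
                      unsigned *n_stmts)
{
    if (!have_datarefs) {
        *n_stmts = 0;
        for (unsigned i = 0; i < stmts_in_loop; i++)
            ++*n_stmts;
    }

    /* On the have_datarefs path this reads whatever the caller stored.  */
    return *n_stmts;
}

/* Hypothetical stand-in for the caller.  Initializing the local to zero
   gives the read above a defined value on every path.  */
unsigned analyze_loop(int have_datarefs, unsigned stmts_in_loop)
{
    unsigned n_stmts = 0;
    return analyze_once(have_datarefs, stmts_in_loop, &n_stmts);
}
```

Without the "= 0" in the caller, the second path would read an
indeterminate value, which is what the warning is pointing at.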

OK for trunk, assuming the bootstrap succeeds this time?

gcc/ChangeLog:
* tree-vect-loop.c (vect_analyze_loop): Initialize n_stmts.
---
 gcc/tree-vect-loop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index dacc881..2b3ced2 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2315,7 +2315,7 @@ vect_analyze_loop (struct loop *loop, loop_vec_info orig_loop_vinfo,
   return NULL;
 }
 
-  unsigned n_stmts;
+  unsigned n_stmts = 0;
   poly_uint64 autodetected_vector_size = 0;
   while (1)
 {
-- 
1.8.5.3



[committed] Fix v850e3v5 recipf and rsqrt issues

2018-06-25 Thread Jeff Law

Code in optabs.c for directly expanding some operations checks the
number of operands in the expander/named pattern against the expected
number of operands.  If they do not match an ICE is triggered.


/* Try to generate instruction ICODE, using operands [OPS, OPS + NOPS)
   as its operands.  Return the instruction pattern on success,
   and emit any necessary set-up code.  Return null and emit no
   code on failure.  */

rtx_insn *
maybe_gen_insn (enum insn_code icode, unsigned int nops,
struct expand_operand *ops)
{
  gcc_assert (nops == (unsigned int) insn_data[(int) icode].n_generator_args);


[ ... ]


On a target with recip or rsqrt patterns written in the expected way
we'll trigger the ICE because those patterns expose the recip as a
division with 1.0 as an operand, thus creating a second, unexpected
operand.  This showed up testing v850e3v5.
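
A toy model of that operand-count check, assuming purely illustrative
pattern names and counts (the table below is hypothetical and is not the
generated insn_data):

```c
/* Hypothetical, simplified stand-in for per-pattern data.  */
struct pattern_info {
    const char *name;
    unsigned n_generator_args;
};

enum { RECIP_AS_DIV, RECIP_AS_UNSPEC };

/* A reciprocal written as (div 1.0 x) exposes the constant as an extra
   operand; hiding it behind an unspec leaves destination and source.  */
static const struct pattern_info patterns[] = {
    { "recip_as_div",    3 },   /* destination, constant 1.0, source */
    { "recip_as_unspec", 2 }    /* destination, source */
};

/* Mirrors the spirit of the gcc_assert above: the caller's operand count
   must equal what the pattern's generator expects.  */
int operand_count_matches(int pattern, unsigned nops)
{
    return patterns[pattern].n_generator_args == nops;
}
```

With generic code supplying two operands for a unary operation, only the
unspec form passes the check in this model.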

It looks like some targets hide things a bit behind an unspec, possibly
to avoid this problem.  This patch does something very similar for the v850:


(define_expand "rsqrtsf2"
  [(set (match_operand:SF 0 "register_operand" "=")
(unspec:SF [(match_operand:SF 1 "register_operand" "")]
   UNSPEC_RSQRT))]

We then generate a more normal looking pattern from the expander which
matches:


(define_insn "rsqrtsf2_insn"
  [(set (match_operand:SF 0 "register_operand" "=r")
(div:SF (match_operand:SF 1 "const_float_1_operand" "")
(sqrt:SF (match_operand:SF 2 "register_operand" "r"))))]


While looking at this I noticed that const_float_1_operand would
essentially reject everything.  It had a toplevel check that the operand
was a CONST_INT.  But in the body of the predicate it would reject
anything that was not a CONST_DOUBLE.

The right code is CONST_DOUBLE.  So this patch also fixes
const_float_1_operand.
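
To see why the old predicate could never succeed, here is a small C model
of the two checks.  The enum and struct are simplified, hypothetical
stand-ins for RTL, not the real GCC types.

```c
/* Simplified, hypothetical operand representation.  */
enum rtx_code { CONST_INT, CONST_DOUBLE };

struct operand {
    enum rtx_code code;
    double value;
};

/* Old behaviour: the top-level match_code test demands CONST_INT, then
   the body demands CONST_DOUBLE.  No operand satisfies both.  */
int const_float_1_old(const struct operand *op)
{
    if (op->code != CONST_INT)      /* (match_code "const_int") */
        return 0;
    if (op->code != CONST_DOUBLE)   /* check in the predicate body */
        return 0;
    return op->value == 1.0;
}

/* Fixed behaviour: both tests agree on CONST_DOUBLE.  */
int const_float_1_fixed(const struct operand *op)
{
    if (op->code != CONST_DOUBLE)   /* (match_code "const_double") */
        return 0;
    return op->value == 1.0;
}
```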

This fixes a handful of ICEs when testing for v850e3v5 (pr41963,
pr46728-9).  It also results in many recipf instructions being generated
for newlib v850e3v5 multilib whereas none were generated before.
There are no uses of rsqrt in newlib, but they do show up in the gcc
testsuite.


Installing on the trunk,

Jeff


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f74b2ecaa4d..b5f073de663 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2018-06-25  Jeff Law  
+
+   * config/v850/predicates.md (const_float_1_operand): Fix match_code
+   test.
+   (const_float_0_operand): Remove unused predicate.
+   * config/v850/v850.md (define_constants): Remove UNSPEC_LOOP.
+   (define_c_enum unspec): Add LOOP, RCP and RSQRT constants.
+   (recipsf2): New expander.  Original pattern now called
+   (recipsf2_insn).
+   (recipdf2, recipdf2_insn): Similarly.
+   (rsqrtsf2, rsqrtsf2_insn): Similarly
+   (rsqrtdf2, rsqrtdf2_insn): Similarly
+
 2018-06-26  Gerald Pfeifer  
 
* ginclude/stddef.h: Remove an obsolete comment on FreeBSD 5.
diff --git a/gcc/config/v850/predicates.md b/gcc/config/v850/predicates.md
index 1b50e50b8c3..68390a23eb7 100644
--- a/gcc/config/v850/predicates.md
+++ b/gcc/config/v850/predicates.md
@@ -475,7 +475,7 @@
 ;; Return true if OP is a float value operand with value as 1.
 
 (define_predicate "const_float_1_operand"
-  (match_code "const_int")
+  (match_code "const_double")
 {
   if (GET_CODE (op) != CONST_DOUBLE
   || mode != GET_MODE (op)
@@ -485,19 +485,6 @@
   return op == CONST1_RTX(mode);
 })
 
-;; Return true if OP is a float value operand with value as 0.
-
-(define_predicate "const_float_0_operand"
-  (match_code "const_int")
-{
-  if (GET_CODE (op) != CONST_DOUBLE
-  || mode != GET_MODE (op)
-  || (mode != DFmode && mode != SFmode))
-return 0;
-
-  return op == CONST0_RTX(mode);
-})
-
 (define_predicate "label_ref_operand"
   (match_code "label_ref")
 )
diff --git a/gcc/config/v850/v850.md b/gcc/config/v850/v850.md
index e01a3102c31..67d906329e8 100644
--- a/gcc/config/v850/v850.md
+++ b/gcc/config/v850/v850.md
@@ -44,10 +44,15 @@
(LP_REGNUM  31) ; Return address register
(CC_REGNUM  32) ; Condition code pseudo register
(FCC_REGNUM 33) ; Floating Condition code pseudo register
-   (UNSPEC_LOOP200) ; loop counter
   ]
 )
 
+(define_c_enum "unspec" [
+  UNSPEC_LOOP
+  UNSPEC_RCP
+  UNSPEC_RSQRT
+])
+
 (define_attr "length" ""
   (const_int 4))
 
@@ -2450,8 +2455,22 @@
 ;;  special insns
 ;;
 
-;;; reciprocal
-(define_insn "recipsf2"
+;; Generic code demands that the recip and rsqrt named patterns
+;; have precisely one operand.  So that's what we expose in the
+;; expander via the strange UNSPEC.  However, those expanders
+;; generate normal looking recip and rsqrt patterns.
+
+(define_expand "recipsf2"
+  [(set (match_operand:SF 0 "register_operand" "")
+   (unspec:SF [(match_operand:SF 1 "register_operand" "")]
+  UNSPEC_RCP))]
+  "TARGET_USE_FPU"
+  {
+emit_insn 

[C++ Patch] More location fixes to grokdeclarator

2018-06-25 Thread Paolo Carlini

Hi,

this includes straightforward tweaks to check_concept_fn and quite a bit
of additional work on grokdeclarator, most of which is also rather
straightforward. In a few places there is a subtlety: we want to handle
ds_storage_class and ds_thread together, using whichever location is
the smallest but != UNKNOWN_LOCATION (UNKNOWN_LOCATION meaning that the
issue is with the other one), or use the biggest location when, say,
ds_virtual and ds_storage_class conflict, because - I believe - we want
to point to the place where we give up. Thus I added the min_location
and max_location helpers. In one place - storage class specified for
parameter - I also changed grokdeclarator to immediately return
error_mark_node upon error, consistent with the other cases nearby,
which avoids a redundant diagnostic, that is, two errors for a single
issue. Tested x86_64-linux.
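
For reference, the behaviour of the two helpers can be tried outside the
compiler with the standalone sketch below.  The typedef and macro are
simplified stand-ins so the block compiles on its own; the function
bodies follow the logic in the patch.

```c
/* Simplified stand-ins for the compiler's own definitions.  */
typedef unsigned int location_t;
#define UNKNOWN_LOCATION ((location_t) 0)

/* Returns the smallest location, ignoring UNKNOWN_LOCATION when the
   other argument is a known location.  */
location_t min_location(location_t loca, location_t locb)
{
    if (loca == UNKNOWN_LOCATION
        || (locb != UNKNOWN_LOCATION && locb < loca))
        return locb;
    return loca;
}

/* Returns the biggest location.  */
location_t max_location(location_t loca, location_t locb)
{
    if (loca < locb)
        return locb;
    return loca;
}
```

So when only one of the two specifiers is present, min_location yields
that specifier's location rather than UNKNOWN_LOCATION, and max_location
yields the later of the two conflicting specifiers.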


Thanks, Paolo.

PS: In Rapperswil we approved P1064R0, thus the limitation that a 
constexpr function can't be virtual will not exist in C++20 mode.




/cp
2018-06-25  Paolo Carlini  

* decl.c (min_location, max_location): New.
(check_concept_fn): Use  DECL_SOURCE_LOCATION.
(grokdeclarator): Use accurate locations in a number of error
messages involving ds_thread, ds_storage_class, ds_virtual,
ds_constexpr, ds_typedef and ds_friend; exploit min_location
and max_location.

/testsuite
2018-06-25  Paolo Carlini  

* g++.dg/other/locations1.C: New.
* g++.dg/tls/locations1.C: Likewise.
* g++.dg/concepts/fn-concept2.C: Test the locations too.
* g++.dg/cpp0x/constexpr-virtual5.C: Likewise.
* g++.dg/cpp0x/pr51463.C: Likewise.
* g++.dg/other/typedef1.C: Likewise.
* g++.dg/parse/dtor13.C: Likewise.
* g++.dg/template/error44.C: Likewise.
* g++.dg/template/typedef4.C: Likewise.
* g++.dg/template/typedef5.C: Likewise.
* g++.dg/tls/diag-2.C: Likewise.
* g++.old-deja/g++.brendan/crash11.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 262005)
+++ cp/decl.c   (working copy)
@@ -8545,15 +8545,18 @@ check_concept_fn (tree fn)
 {
   // A constraint is nullary.
   if (DECL_ARGUMENTS (fn))
-error ("concept %q#D declared with function parameters", fn);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D declared with function parameters", fn);
 
   // The declared return type of the concept shall be bool, and
   // it shall not be deduced from it definition.
   tree type = TREE_TYPE (TREE_TYPE (fn));
   if (is_auto (type))
-error ("concept %q#D declared with a deduced return type", fn);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D declared with a deduced return type", fn);
   else if (type != boolean_type_node)
-error ("concept %q#D with non-%<bool%> return type %qT", fn, type);
+error_at (DECL_SOURCE_LOCATION (fn),
+ "concept %q#D with non-%<bool%> return type %qT", fn, type);
 }
 
 /* Helper function.  Replace the temporary this parameter injected
@@ -9818,6 +9821,27 @@ smallest_type_quals_location (int type_quals, cons
   return loc;
 }
 
+/* Returns the smallest location.  */
+
+static location_t
+min_location (location_t loca, location_t locb)
+{
+  if (loca == UNKNOWN_LOCATION
+  || (locb != UNKNOWN_LOCATION && locb < loca))
+return locb;
+  return loca;
+}
+
+/* Returns the biggest location.  */
+
+static location_t
+max_location (location_t loca, location_t locb)
+{
+  if (loca < locb)
+return locb;
+  return loca;
+}
+
 /* Check that it's OK to declare a function with the indicated TYPE
and TYPE_QUALS.  SFK indicates the kind of special function (if any)
that this function is.  OPTYPE is the type given in a conversion
@@ -10710,14 +10734,18 @@ grokdeclarator (const cp_declarator *declarator,
 {
   if (staticp == 2)
{
- error ("member %qD cannot be declared both %<virtual%> "
-"and %<static%>", dname);
+ error_at (max_location (declspecs->locations[ds_virtual],
+ declspecs->locations[ds_storage_class]),
+   "member %qD cannot be declared both %<virtual%> "
+   "and %<static%>", dname);
  storage_class = sc_none;
  staticp = 0;
}
   if (constexpr_p)
-   error ("member %qD cannot be declared both %<virtual%> "
-  "and %<constexpr%>", dname);
+   error_at (max_location (declspecs->locations[ds_virtual],
+   declspecs->locations[ds_constexpr]),
+ "member %qD cannot be declared both %<virtual%> "
+ "and %<constexpr%>", dname);
 }
   friendp = decl_spec_seq_has_spec_p (declspecs, ds_friend);
 
@@ -10726,18 +10754,27 @@ grokdeclarator (const cp_declarator *declarator,
 {
   if (typedef_p)
{
- error ("typedef declaration invalid in parameter declaration");
+ error_at 

Go patch committed: Escape analysis improvements

2018-06-25 Thread Ian Lance Taylor
This patch to the Go frontend by Cherry Zhang adds some escape
analysis improvements to the Go frontend.  These are based on earlier
patches to the gc compiler.

- https://golang.org/cl/99335: unnamed receiver should not escape.

- https://golang.org/cl/105257: propagate loop depth to field. This
  prevents it from escaping when a field's address is taken inside a
  loop (but does not otherwise escape).

- https://golang.org/cl/107597: use element type for "indirection" of
  slice/string. This prevents the slice/string from escaping when only
  the element, in case that it is pointerless, flows to outer scope.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 261980)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7008302f1f0eaa9508b2857185505d4dc7baac1e
+baaaf1e0f1e9a54ea2dfe475154c85c83ec03740
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 261819)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -43,6 +43,12 @@ Node::type() const
 // which may also be pointer. We model it as another void*, so
 // we don't lose pointer-ness.
 return this->child()->type();
+  else if (this->child()->type()->is_slice_type())
+// We model "indirect" of a slice as dereferencing its pointer
+// field (to get element). Use element type here.
+return this->child()->type()->array_type()->element_type();
+  else if (this->child()->type()->is_string_type())
+return Type::lookup_integer_type("uint8");
   else
 return this->child()->type()->deref();
 }
@@ -1811,60 +1817,77 @@ Escape_analysis_assign::expression(Expre
 
 case Expression::EXPRESSION_UNARY:
   {
-   if ((*pexpr)->unary_expression()->op() != OPERATOR_AND)
- break;
+   Expression* operand = (*pexpr)->unary_expression()->operand();
 
-   Node* addr_node = Node::make_node(*pexpr);
-   this->context_->track(addr_node);
+if ((*pexpr)->unary_expression()->op() == OPERATOR_AND)
+  {
+this->context_->track(n);
 
-   Expression* operand = (*pexpr)->unary_expression()->operand();
-   Named_object* var = NULL;
-   if (operand->var_expression() != NULL)
- var = operand->var_expression()->named_object();
-   else if (operand->enclosed_var_expression() != NULL)
- var = operand->enclosed_var_expression()->variable();
+Named_object* var = NULL;
+if (operand->var_expression() != NULL)
+  var = operand->var_expression()->named_object();
+else if (operand->enclosed_var_expression() != NULL)
+  var = operand->enclosed_var_expression()->variable();
+
+if (var != NULL
+&& ((var->is_variable() && var->var_value()->is_parameter())
+|| var->is_result_variable()))
+  {
+Node::Escape_state* addr_state = n->state(this->context_, NULL);
+addr_state->loop_depth = 1;
+break;
+  }
+  }
 
-   if (var == NULL)
- break;
+if ((*pexpr)->unary_expression()->op() != OPERATOR_AND
+&& (*pexpr)->unary_expression()->op() != OPERATOR_MULT)
+  break;
 
-   if (var->is_variable()
-   && !var->var_value()->is_parameter())
- {
-   // For &x, use the loop depth of x if known.
-   Node::Escape_state* addr_state =
- addr_node->state(this->context_, NULL);
-   Node* operand_node = Node::make_node(operand);
-   Node::Escape_state* operand_state =
- operand_node->state(this->context_, NULL);
-   if (operand_state->loop_depth != 0)
- addr_state->loop_depth = operand_state->loop_depth;
- }
-   else if ((var->is_variable()
- && var->var_value()->is_parameter())
-|| var->is_result_variable())
- {
-   Node::Escape_state* addr_state =
- addr_node->state(this->context_, NULL);
-   addr_state->loop_depth = 1;
- }
+// For &x and *x, use the loop depth of x if known.
+Node::Escape_state* expr_state = n->state(this->context_, NULL);
+Node* operand_node = Node::make_node(operand);
+Node::Escape_state* operand_state = operand_node->state(this->context_, NULL);
+if (operand_state->loop_depth != 0)
+  expr_state->loop_depth = operand_state->loop_depth;
   }
   break;
 
 case Expression::EXPRESSION_ARRAY_INDEX:
   {
 Array_index_expression* aie = (*pexpr)->array_index_expression();
+
+// 

[PR Fortran/83183] Fix infinite recursion (ICE) with -finit-derived when initializing allocatable BT_DERIVED components

2018-06-25 Thread Fritz Reese
All,

The attached patch fixes PR 83183. Previously, attempting to generate
initializers for allocatable components of the same derived type
triggered infinite recursion. Passes regression tests. OK for trunk
and gcc-8-branch?
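
The recursion can be modelled outside gfortran with the C sketch below.
All types and names here are hypothetical simplifications, not the
frontend's data structures; the point is only that a generator which
descends into derived-type components needs a stopping rule when a type
contains an allocatable component of its own type.

```c
#include <stddef.h>

/* Simplified, hypothetical model of a derived type and its components.  */
enum basic_type { BT_INTEGER, BT_DERIVED };

struct derived_type;

struct component {
    enum basic_type type;
    int allocatable;
    const struct derived_type *derived;   /* used when type == BT_DERIVED */
};

struct derived_type {
    const struct component *components;
    size_t n_components;
};

/* Counts the initializers that would be generated for DT.  Allocatable
   derived-type components are skipped; without that guard, a type that
   contains an allocatable component of its own type never terminates.  */
size_t count_initializers(const struct derived_type *dt)
{
    size_t total = 0;

    for (size_t i = 0; i < dt->n_components; i++) {
        const struct component *c = &dt->components[i];

        if (c->type == BT_DERIVED && c->allocatable)
            continue;                       /* the guard */
        if (c->type == BT_DERIVED)
            total += count_initializers(c->derived);
        else
            total += 1;
    }
    return total;
}
```

A linked_list type like the one in the new testcase, with one allocatable
self-referential component and one integer, yields a single initializer
in this model instead of recursing forever.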

From 01961cc78b8ecd5272521098ae3166516a49dcd1 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 25 Jun 2018 17:51:00 -0400
Subject: [PATCH] PR fortran/83183

Fix infinite recursion occurring with -finit-derived generating initializers
for allocatable derived-type components.

gcc/fortran/

* expr.c (component_initializer): Do not generate initializers with
allocatable BT_DERIVED components.
---
 gcc/fortran/expr.c |  1 +
 gcc/testsuite/gfortran.dg/init_flag_18.f90 | 19 +++
 2 files changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/init_flag_18.f90

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 5103a5cc990..468f45ec488 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -4422,6 +4422,7 @@ component_initializer (gfc_typespec *ts, gfc_component *c, bool generate)
  Some components should never get initializers.  */
   if (c->initializer || !generate
   || (ts->type == BT_CLASS && !c->attr.allocatable)
+  || (ts->type == BT_DERIVED && c->attr.allocatable)
   || c->attr.pointer
   || c->attr.class_pointer
   || c->attr.proc_pointer)
diff --git a/gcc/testsuite/gfortran.dg/init_flag_18.f90 b/gcc/testsuite/gfortran.dg/init_flag_18.f90
new file mode 100644
index 000..9ab00a9afce
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/init_flag_18.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! { dg-options "-finit-derived" }
+!
+! PR fortran/83183
+!
+! Test a regression where -finit-derived recursed infinitely generating
+! initializers for allocatable components of the same derived type.
+!
+
+program pr83183
+  type :: linked_list
+ type(linked_list), allocatable :: link
+ integer :: value
+  end type
+  type(linked_list) :: test
+  allocate(test % link)
+  print *, test%value
+  print *, test%link%value
+end program
-- 
2.12.2



Re: GCC 6 patch RFA: libgo: Remove syscall.Ustat

2018-06-25 Thread Jakub Jelinek
On Mon, Jun 25, 2018 at 03:02:02PM -0700, Ian Lance Taylor wrote:
> Since it looks like there might be a 6.5 release, here is a backport of

Yes, 6.5 is going to be the last release from the gcc-6-branch.

> https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01420.html
> 
> to GCC 6 branch.  This removes the syscall.Ustat function from libgo,
> since the ustat function is being removed from new versions of glibc.
> 
> Bootstrapped and ran Go tests on x86_64-pc-linux-gnu.  OK for GCC 6 branch?

Yes, thanks.

Jakub


GCC 6 patch RFA: libgo: Remove syscall.Ustat

2018-06-25 Thread Ian Lance Taylor
Since it looks like there might be a 6.5 release, here is a backport of

https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01420.html

to GCC 6 branch.  This removes the syscall.Ustat function from libgo,
since the ustat function is being removed from new versions of glibc.

Bootstrapped and ran Go tests on x86_64-pc-linux-gnu.  OK for GCC 6 branch?

Ian
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 262106)
+++ libgo/Makefile.am   (working copy)
@@ -1989,17 +1989,6 @@ else
 syscall_lsf_file =
 endif
 
-# GNU/Linux specific ustat support.
-if LIBGO_IS_LINUX
-if LIBGO_IS_ARM64
-syscall_ustat_file =
-else
-syscall_ustat_file = go/syscall/libcall_linux_ustat.go
-endif
-else
-syscall_ustat_file =
-endif
-
 # GNU/Linux specific utimesnano support.
 if LIBGO_IS_LINUX
 syscall_utimesnano_file = go/syscall/libcall_linux_utimesnano.go
Index: libgo/configure.ac
===
--- libgo/configure.ac  (revision 262106)
+++ libgo/configure.ac  (working copy)
@@ -519,24 +519,6 @@ AC_CHECK_HEADERS([linux/filter.h linux/i
 #endif
 ])
 
-AC_CACHE_CHECK([whether  can be used],
-[libgo_cv_c_ustat_h],
-[CFLAGS_hold=$CFLAGS
-CFLAGS="$CFLAGS -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE $OSCFLAGS"
-AC_COMPILE_IFELSE(
-[AC_LANG_SOURCE([
-#include 
-#ifdef HAVE_LINUX_FILTER_H
-#include 
-#endif
-#include 
-])], [libgo_cv_c_ustat_h=yes], [libgo_cv_c_ustat_h=no])
-CFLAGS=$CFLAGS_hold])
-if test $libgo_cv_c_ustat_h = yes; then
-  AC_DEFINE(HAVE_USTAT_H, 1,
-[Define to 1 if you have the  header file and it works.])
-fi
-
 AM_CONDITIONAL(HAVE_SYS_MMAN_H, test "$ac_cv_header_sys_mman_h" = yes)
 
 AC_CHECK_FUNCS(strerror_r strsignal wait4 mincore setenv unsetenv dl_iterate_phdr)
Index: libgo/go/syscall/libcall_linux_ustat.go
===
--- libgo/go/syscall/libcall_linux_ustat.go (revision 262106)
+++ libgo/go/syscall/libcall_linux_ustat.go (nonexistent)
@@ -1,11 +0,0 @@
-// Copyright 2015 The Go Authors. All rights reserved.
-// Use of this source code is governed by a BSD-style
-// license that can be found in the LICENSE file.
-
-// GNU/Linux library ustat call.
-// This is not supported on some kernels, such as arm64.
-
-package syscall
-
-//sys  Ustat(dev int, ubuf *Ustat_t) (err error)
-//ustat(dev _dev_t, ubuf *Ustat_t) _C_int
Index: libgo/mksysinfo.sh
===
--- libgo/mksysinfo.sh  (revision 262106)
+++ libgo/mksysinfo.sh  (working copy)
@@ -142,9 +142,6 @@ cat > sysinfo.c <
 #endif
-#if defined(HAVE_USTAT_H)
-#include 
-#endif
 #if defined(HAVE_UTIME_H)
 #include 
 #endif
@@ -1350,20 +1347,6 @@ grep '^type _sysinfo ' gen-sysinfo.go |
   -e 's/mem_unit/Unit/' \
 >> ${OUT}
 
-# The ustat struct.
-grep '^type _ustat ' gen-sysinfo.go | \
-sed -e 's/_ustat/Ustat_t/' \
-  -e 's/f_tfree/Tfree/' \
-  -e 's/f_tinode/Tinoe/' \
-  -e 's/f_fname/Fname/' \
-  -e 's/f_fpack/Fpack/' \
->> ${OUT}
-# Force it to be defined, as on some older GNU/Linux systems the
-# header file fails when using with .
-if ! grep 'type _ustat ' gen-sysinfo.go >/dev/null 2>&1; then
-  echo 'type Ustat_t struct { Tfree int32; Tinoe uint64; Fname [5+1]int8; Fpack [5+1]int8; }' >> ${OUT}
-fi
-
 # The utimbuf struct.
 grep '^type _utimbuf ' gen-sysinfo.go | \
 sed -e 's/_utimbuf/Utimbuf/' \


Re: [PATCH 1/2] Untangle stddef.h a little

2018-06-25 Thread Jeff Law
On 06/24/2018 05:19 PM, Gerald Pfeifer wrote:
> On Tue, 19 Jun 2018, Joseph Myers wrote:
>> These two patches are OK, please commit.
> 
> I created ChangeLog entries and committed both patches (one a few
> days ago, the second just now).  
> 
> Thank you, Maya!  If you have any further clean-ups, I'll be happy
> to help by committing them (once approved).
> 
>> (GCC officially removed support for FreeBSD versions before FreeBSD 5 with
>>
>> r260852 | gerald | 2018-05-28 23:20:15 + (Mon, 28 May 2018) | 5 lines
>>
>> * config.gcc: Identify FreeBSD 3.x and 4.x as unsupported.
>>
>> * config/freebsd-spec.h (FBSD_LIB_SPEC): Only consider FreeBSD 5
>> and later.
>>
>> .)
> 
> I took this as a hint to further simplify ginclude/stddef.h. ;-)
> Thanks for pointing this out, Joseph!
> 
> Okay to apply the following?  Tested on x86_64-unknown-freebsd11.2.
> 
> Gerald
> 
> 2018-06-24  Gerald Pfeifer  
> 
>   * ginclude/stddef.h: Remove an obsolete comment on FreeBSD 5.
>   Simplify logic for FreeBSD (twice).
OK.
jeff


Re: [PATCH 1/4] regcprop: Avoid REG_CFA_REGISTER notes (PR85645)

2018-06-25 Thread Jeff Law
On 06/25/2018 05:53 AM, Segher Boessenkool wrote:
> Hi Eric,
> 
> On Wed, May 09, 2018 at 09:22:47AM +0200, Eric Botcazou wrote:
>>> 2018-05-08  Segher Boessenkool  
>>>
>>> PR rtl-optimization/85645
>>> *  regcprop.c (copyprop_hardreg_forward_1): Don't propagate into an
>>> insn that has a REG_CFA_REGISTER note.
>>
>> OK, thanks.
> 
> Are these patches okay for backport to 8?  At least the first two.
Yes.
jeff


Re: [PATCH/RFC] enable -Wstrict-prototypes (PR 82922)

2018-06-25 Thread Martin Sebor

On 06/22/2018 11:19 AM, Joseph Myers wrote:

On Thu, 21 Jun 2018, Eric Gallager wrote:


On 6/21/18, Jeff Law  wrote:

On 06/12/2018 11:21 AM, Joseph Myers wrote:

On Tue, 12 Jun 2018, Martin Sebor wrote:


The proposal to enable -Wstrict-prototypes discussed below
was considered too late for GCC 8.  I'd like to revive it
now for GCC 9.

  https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00935.html


My point from that discussion stands that () for no arguments should be
considered separately from warning for all the other cases.

There's a lot of legacy code out there...  What's the proposal for
handling the no argument () case?  Are we thinking multiple levels?  And
if so what's the default?


-Wstrict-prototypes and -Wstricter-prototypes for the prototypes case?
And then split -Wold-style-definition into -Wold-style-definition and
-Wc++-style-definition for the equivalent use of () in function
definitions?


I think the existing options, when explicitly used, should keep warning
for all the cases they currently warn for, including (), even if you also
have new warning options available that correspond to a subset of the
existing options.

It's the possible enabled-by-default warnings that I think should be a
subset, as I don't think using () for no-argument functions is such an
obsolescent practice as using old-style definitions, or () declarations,
for functions with arguments (especially since () for no-argument
functions is perfectly idiomatic in C++, and if C obsoletes non-prototype
functions I'd expect it to end up with () having the same meaning as in
C++, rather than being disallowed).


I assume you mean if C removes non-prototype functions (they are
obsolete today and have been since C99).  At that point this will
be moot since GCC will presumably give an error for a declaration
without a prototype (or treat it as (void)).  But that time is
a long ways away (2022 at the earliest).  Until then, one of
the coding mistakes to detect is providing no arguments to
a ()-declared function that's defined to take at least one.
Subsetting the warning and not diagnosing such declarations (or
calls to them) will leave a big gap in the detection.  I'm still
thinking about how to approach this under your constraint but
I don't think it's worth either the effort to implement or
the false negatives that seem unavoidable.


The vast bulk of the places where that previous patch changes testcases
are for (), which would not need changing under my proposal.


Sure.  I think we could easily exempt most of the tests from
diagnosing without compromising the efficacy of the warning
by silently accepting definitions of () functions that take no
arguments (and diagnosing calls to them that pass some).  What
I think is important to preserve is diagnosing () declarations.

It might be possible to try to make the warning less noisy and
only have it complain for declarations actually used to call
functions.  Or try to do something even more clever and look
for call patterns to () functions (e.g., diagnose only those
that are called with different numbers or types of arguments
in the same file).  But as I said, I'm not sure that what is
to be gained by this is worth the effort.

Martin


Re: [PATCH] PR libstdc++/86112 fix printers for Python 2.6

2018-06-25 Thread Jonathan Wakely

On 25/06/18 22:03 +0100, Jonathan Wakely wrote:

Dict comprehensions are only supported since Python 2.7, so use an
alternative syntax that is backwards compatible.

PR libstdc++/86112
* python/libstdcxx/v6/printers.py (add_one_template_type_printer):
Replace dict comprehension.

Tested x86_64-linux, committed to trunk.


Oh, and gcc-8-branch.




[PATCH] PR libstdc++/86112 fix printers for Python 2.6

2018-06-25 Thread Jonathan Wakely

Dict comprehensions are only supported since Python 2.7, so use an
alternative syntax that is backwards compatible.

PR libstdc++/86112
* python/libstdcxx/v6/printers.py (add_one_template_type_printer):
Replace dict comprehension.

Tested x86_64-linux, committed to trunk.


commit dfea2551319c1a473f5d9016ff90be1a157d59e9
Author: Jonathan Wakely 
Date:   Mon Jun 25 21:59:28 2018 +0100

PR libstdc++/86112 fix printers for Python 2.6

Dict comprehensions are only supported since Python 2.7, so use an
alternative syntax that is backwards compatible.

PR libstdc++/86112
* python/libstdcxx/v6/printers.py (add_one_template_type_printer):
Replace dict comprehension.

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 45aaa1211ec..34d8b4e6606 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1438,7 +1438,8 @@ def add_one_template_type_printer(obj, name, defargs):
 if _versioned_namespace:
 # Add second type printer for same type in versioned namespace:
 ns = 'std::' + _versioned_namespace
-defargs = { n: d.replace('std::', ns) for n,d in defargs.items() }
+# PR 86112 Cannot use dict comprehension here:
+defargs = dict((n, d.replace('std::', ns)) for (n,d) in defargs.items())
 printer = TemplateTypePrinter(ns+name, defargs)
 gdb.types.register_type_printer(obj, printer)
 


Re: [PATCH] avoid using strnlen result for late calls to strlen (PR 82604)

2018-06-25 Thread Martin Sebor

On 06/22/2018 04:00 PM, Jeff Law wrote:

On 06/18/2018 01:15 PM, Martin Sebor wrote:

While looking into opportunities to detect strnlen/strlen coding
mistakes (pr86199) I noticed a bug in the strnlen implementation
I committed earlier today that lets a strnlen() result be saved
and used in subsequent calls to strlen() with the same argument.
The attached patch changes the handle_builtin_strlen() function
to discard the strnlen() result unless its bound is greater than
the length of the string.
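
To illustrate the condition described above (an editorial sketch, not code
from the patch): a strnlen() result equals strlen() only when it is strictly
less than the bound.  The names bounded_length, result_is_exact_length and
demo_strnlen_reuse are hypothetical; bounded_length is a portable stand-in
for strnlen() so the example does not depend on POSIX feature macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable stand-in for strnlen (): length of S, capped at BOUND.  */
static size_t
bounded_length (const char *s, size_t bound)
{
  size_t n = 0;
  while (n < bound && s[n] != '\0')
    ++n;
  return n;
}

/* A capped result says nothing about the real length; only a result
   strictly below the bound is known to equal strlen (s).  */
static int
result_is_exact_length (size_t result, size_t bound)
{
  return result < bound;
}

/* Returns 0 when every check holds.  */
static int
demo_strnlen_reuse (void)
{
  const char *s = "hello";

  size_t capped = bounded_length (s, 3);  /* bound below the length */
  size_t full = bounded_length (s, 9);    /* bound above the length */

  /* The capped value must not be reused as the string length.  */
  assert (capped == 3 && !result_is_exact_length (capped, 3));
  /* The uncapped value is safe to reuse for a later strlen (s).  */
  assert (full == strlen (s) && result_is_exact_length (full, 9));
  return 0;
}
```

Calling demo_strnlen_reuse () exercises both cases; the first one is the
situation in which a cached strnlen() result would give a wrong strlen().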

Martin

gcc-86204.diff


PR tree-optimization/86204 -  wrong strlen result after prior strnlen

gcc/ChangeLog:

PR tree-optimization/86204
* tree-ssa-strlen.c (handle_builtin_strlen): Avoid storing
a strnlen result if it's less than the length of the string.

gcc/testsuite/ChangeLog:

PR tree-optimization/86204
* gcc.dg/strlenopt-46.c: New test.

OK.  Though I must admit I don't like having variables "bounded" and
"bound" in the same function.  So consider renaming one to avoid future
confusion.


Done in r262114.

Martin


Re: [PATCH] config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.

2018-06-25 Thread Jonathan Wakely

On 21/06/18 21:39 +0100, Jonathan Wakely wrote:

Apparently we never updated the x86_64-linux-gnu baseline for gcc-8
(so now that I'm trying to add a new symbol version on trunk, I'm
seeing errors for the get_entropy symbol added in the previous symbol
version).


The same change is needed for ppc64 on trunk and gcc-8-branch, so I'm
committing this.

The following targets still need their baseline_symbols files updated
on trunk and gcc-8-branch:

find config/abi/post/ -name baseline_symbols.txt | xargs fgrep -L 3.4.25
config/abi/post/mips64-linux-gnu/32/baseline_symbols.txt
config/abi/post/mips64-linux-gnu/64/baseline_symbols.txt
config/abi/post/mips64-linux-gnu/baseline_symbols.txt
config/abi/post/mips-linux-gnu/baseline_symbols.txt
config/abi/post/sparc-linux-gnu/baseline_symbols.txt
config/abi/post/s390-linux-gnu/baseline_symbols.txt
config/abi/post/s390x-linux-gnu/32/baseline_symbols.txt
config/abi/post/s390x-linux-gnu/baseline_symbols.txt


Target maintainers, this can be done by "make new-abi-baseline" in the
$build/$target/libstdc++-v3/testsuite directory.

The "make check-abi" target will already be failing on trunk (as we've
added new symbols there) and we plan to add new libstdc++ symbols to
gcc-8-branch in the near future, so the baselines should be updated on
the branch too. Thanks in advance.


commit 36086e7d1340e2b9c8b63e8b6e6b0405666f6a0b
Author: Jonathan Wakely 
Date:   Mon Jun 25 21:18:59 2018 +0100

Update powerpc64-linux-gnu/baseline_symbols.txt

PR libstdc++/81092
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.

diff --git a/libstdc++-v3/config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt
index 50734641c5b..9d070517326 100644
--- a/libstdc++-v3/config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt
@@ -444,6 +444,7 @@ FUNC:_ZNKSt13basic_fstreamIwSt11char_traitsIwEE7is_openEv@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6gcountEv@@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_ostreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4
+FUNC:_ZNKSt13random_device13_M_getentropyEv@@GLIBCXX_3.4.25
 FUNC:_ZNKSt13runtime_error4whatEv@@GLIBCXX_3.4
 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE5rdbufEv@@GLIBCXX_3.4
 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE7is_openEv@@GLIBCXX_3.4.5
@@ -4237,6 +4238,8 @@ OBJECT:0:GLIBCXX_3.4.20
 OBJECT:0:GLIBCXX_3.4.21
 OBJECT:0:GLIBCXX_3.4.22
 OBJECT:0:GLIBCXX_3.4.23
+OBJECT:0:GLIBCXX_3.4.24
+OBJECT:0:GLIBCXX_3.4.25
 OBJECT:0:GLIBCXX_3.4.3
 OBJECT:0:GLIBCXX_3.4.4
 OBJECT:0:GLIBCXX_3.4.5


Re: [PATCH 1/4] regcprop: Avoid REG_CFA_REGISTER notes (PR85645)

2018-06-25 Thread Eric Botcazou
> Are these patches okay for backport to 8?  At least the first two.

If they fulfill the criteria for a release branch, no objection by me.

-- 
Eric Botcazou


Re: [patch, fortran] Handling of .and. and .or. expressions

2018-06-25 Thread Janus Weil
Hi Thomas, hi all,

I'm back from holidays and have more time to deal with this now ...

2018-06-17 0:38 GMT+02:00 Janus Weil :
>
> Am 16. Juni 2018 23:14:08 MESZ schrieb Thomas Koenig :
>>In my patch, I have tried to do all three things at the same time, and
>>after this discussion, I still think that this is the right path
>>to follow.
>>
>>So, here is an update on the patch, which also covers ALLOCATED.
>>
>>Regression-tested. OK?
>
> I absolutely welcome the warnings for impure functions as second operand to 
> .and./.or. operators, but (as noted earlier) I disagree with changing the 
> ordering of operands. As our lengthy discussions show, things are already 
> complicated enough, and this additional optimization just adds to the 
> confusion IMHO.

the previously proposed patch by Thomas did various things at once,
not all of which we could agree upon unfortunately.

However, there also were some things that everyone seemed to agree
upon, namely that there should be warnings for the cases where impure
functions are currently short-circuited.

The patch in the attachment contains those parts of Thomas' patch that
implement that warning, but does no further optimizations or
special-casing.
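
As a rough analogy only (an editorial sketch in C, not part of the patch):
C guarantees short-circuit evaluation for &&, whereas Fortran leaves it to
the processor whether the second operand of .and. is evaluated, which is why
a warning is useful there.  The effect being warned about looks like this;
the names check, check_calls and demo_short_circuit are hypothetical and
mirror the test case only loosely.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the impure function actually ran.  */
static int check_calls = 0;

/* Impure: has a visible side effect besides returning a value.  */
static bool
check (void)
{
  ++check_calls;
  return true;
}

/* Returns the number of times check () was really called.  */
static int
demo_short_circuit (void)
{
  bool flag = false;
  check_calls = 0;

  flag = check () && flag;  /* check () runs: it is the first operand */
  flag = flag && check ();  /* check () is skipped: flag is already false */

  (void) flag;
  return check_calls;
}
```

Here demo_short_circuit () reports one call rather than two, so the side
effect of the second call is silently lost.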

Regtests cleanly on x86_64-linux-gnu. Ok for trunk?

Cheers,
Janus


2018-06-25  Thomas Koenig  
Janus Weil  

PR fortran/85599
* dump-parse-tree.c (show_attr): Add handling of implicit_pure.
* resolve.c (impure_function_callback): New function.
(resolve_operator): Call it via gfc_expr_walker.


2018-06-25  Janus Weil  

PR fortran/85599
* gfortran.dg/short_circuiting.f90: New test.
Index: gcc/fortran/dump-parse-tree.c
===
--- gcc/fortran/dump-parse-tree.c	(revision 262024)
+++ gcc/fortran/dump-parse-tree.c	(working copy)
@@ -716,6 +716,8 @@ show_attr (symbol_attribute *attr, const char * mo
 fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
 fputs (" PURE", dumpfile);
+  if (attr->implicit_pure)
+fputs (" IMPLICIT_PURE", dumpfile);
   if (attr->recursive)
 fputs (" RECURSIVE", dumpfile);
 
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 262024)
+++ gcc/fortran/resolve.c	(working copy)
@@ -3821,6 +3821,44 @@ lookup_uop_fuzzy (const char *op, gfc_symtree *uop
 }
 
 
+/* Callback finding an impure function as an operand to an .and. or
+   .or.  expression.  Remember the last function warned about to
+   avoid double warnings when recursing.  */
+
+static int
+impure_function_callback (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+			  void *data)
+{
+  gfc_expr *f = *e;
+  const char *name;
+  static gfc_expr *last = NULL;
+  bool *found = (bool *) data;
+
+  if (f->expr_type == EXPR_FUNCTION)
+{
+  *found = 1;
+  if (f != last && !pure_function (f, &name))
+	{
+	  /* This could still be a function without side effects, i.e.
+	 implicit pure.  Do not warn for that case.  */
+	  if (f->symtree == NULL || f->symtree->n.sym == NULL
+	  || !gfc_implicit_pure (f->symtree->n.sym))
+	{
+	  if (name)
+		gfc_warning (OPT_Wsurprising, "Function %qs at %L "
+			 "might not be evaluated", name, &f->where);
+	  else
+		gfc_warning (OPT_Wsurprising, "Function at %L "
+			 "might not be evaluated", &f->where);
+	}
+	}
+  last = f;
+}
+
+  return 0;
+}
+
+
 /* Resolve an operator expression node.  This can involve replacing the
operation with a user defined function call.  */
 
@@ -3929,6 +3967,14 @@ resolve_operator (gfc_expr *e)
 	gfc_convert_type (op1, &e->ts, 2);
 	  else if (op2->ts.kind < e->ts.kind)
 	gfc_convert_type (op2, &e->ts, 2);
+
+	  if (e->value.op.op == INTRINSIC_AND || e->value.op.op == INTRINSIC_OR)
+	{
+	  /* Warn about short-circuiting
+	 with impure function as second operand.  */
+	  bool op2_f = false;
+	  gfc_expr_walker (&op2, impure_function_callback, &op2_f);
+	}
 	  break;
 	}
 
! { dg-do compile }
! { dg-additional-options "-Wsurprising" }
!
! PR 85599: Prevent short-circuiting of logical expressions for non-pure functions
!
! Contributed by Janus Weil 

program short_circuit

   logical :: flag
   flag = .false.
   flag = check() .and. flag
   flag = flag .and. check()  ! { dg-warning "might not be evaluated" }
   flag = flag .and. pure_check()

contains

   logical function check()
  integer, save :: i = 1
  print *, "check", i
  i = i + 1
  check = .true.
   end function

   logical pure function pure_check()
  pure_check = .true.
   end function

end


Backports to 6.5

2018-06-25 Thread Jakub Jelinek
Hi!

I've backported 75 commits from 7.x branch to 6.x, bootstrapped/regtested
them on x86_64-linux and i686-linux and committed to gcc-6-branch.

Jakub
2018-06-25  Jakub Jelinek  

Backported from mainline
2017-09-27  Jakub Jelinek  

PR c++/82159
* gimplify.c (gimplify_modify_expr): Don't optimize away zero sized
lhs from calls if the lhs has addressable type.

* g++.dg/opt/pr82159.C: New test.

--- gcc/gimplify.c  (revision 253317)
+++ gcc/gimplify.c  (revision 253318)
@@ -5434,7 +5434,12 @@ gimplify_modify_expr (tree *expr_p, gimp
  side as statements and throw away the assignment.  Do this after
  gimplify_modify_expr_rhs so we handle TARGET_EXPRs of addressable
  types properly.  */
-  if (zero_sized_type (TREE_TYPE (*from_p)) && !want_value)
+  if (zero_sized_type (TREE_TYPE (*from_p))
+  && !want_value
+  /* Don't do this for calls that return addressable types, expand_call
+relies on those having a lhs.  */
+  && !(TREE_ADDRESSABLE (TREE_TYPE (*from_p))
+  && TREE_CODE (*from_p) == CALL_EXPR))
 {
   gimplify_stmt (from_p, pre_p);
   gimplify_stmt (to_p, pre_p);
--- gcc/testsuite/g++.dg/opt/pr82159.C  (nonexistent)
+++ gcc/testsuite/g++.dg/opt/pr82159.C  (revision 253318)
@@ -0,0 +1,18 @@
+// PR c++/82159
+// { dg-do compile }
+// { dg-options "" }
+
+template
+struct S
+{
+  ~S () {}
+  template S foo () { return S (); }
+  unsigned char data[N];
+};
+
+int
+main ()
+{
+  S<16> d;
+  S<0> t = d.foo<0> ();
+}
2018-06-25  Jakub Jelinek  

Backported from mainline
2017-09-29  Jakub Jelinek  

PR c/82340
* c-decl.c (build_compound_literal): Use c_apply_type_quals_to_decl
instead of trying to set just TREE_READONLY manually.

* gcc.dg/tree-ssa/pr82340.c: New test.

--- gcc/c/c-decl.c  (revision 253318)
+++ gcc/c/c-decl.c  (revision 253319)
@@ -5092,9 +5092,7 @@ build_compound_literal (location_t loc,
   TREE_USED (decl) = 1;
   DECL_READ_P (decl) = 1;
   TREE_TYPE (decl) = type;
-  TREE_READONLY (decl) = (TYPE_READONLY (type)
- || (TREE_CODE (type) == ARRAY_TYPE
- && TYPE_READONLY (TREE_TYPE (type))));
+  c_apply_type_quals_to_decl (TYPE_QUALS (strip_array_types (type)), decl);
   store_init_value (loc, decl, init, NULL_TREE);
 
   if (TREE_CODE (type) == ARRAY_TYPE && !COMPLETE_TYPE_P (type))
--- gcc/testsuite/gcc.dg/tree-ssa/pr82340.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr82340.c (revision 253319)
@@ -0,0 +1,14 @@
+/* PR c/82340 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ssa" } */
+/* { dg-final { scan-tree-dump "D.\[0-9]*\\\[0\\\] ={v} 77;" "ssa" } } */
+
+int
+foo (void)
+{
+  int i;
+  volatile char *p = (volatile char[1]) { 77 };
+  for (i = 1; i < 10; i++)
+*p = 4;
+  return *p;
+}
2018-06-25  Jakub Jelinek  

Backported from mainline
2017-09-15  Jakub Jelinek  

PR rtl-optimization/82192
* combine.c (make_extraction): Don't look through non-paradoxical
SUBREGs or TRUNCATE if pos + len is or might be bigger than
inner's mode.

* gcc.c-torture/execute/pr82192.c: New test.

--- gcc/combine.c   (revision 254176)
+++ gcc/combine.c   (revision 254177)
@@ -7397,7 +7397,14 @@ make_extraction (machine_mode mode, rtx
   if (pos_rtx && CONST_INT_P (pos_rtx))
 pos = INTVAL (pos_rtx), pos_rtx = 0;
 
-  if (GET_CODE (inner) == SUBREG && subreg_lowpart_p (inner))
+  if (GET_CODE (inner) == SUBREG
+  && subreg_lowpart_p (inner)
+  && (paradoxical_subreg_p (inner)
+ /* If trying or potentionally trying to extract
+bits outside of is_mode, don't look through
+non-paradoxical SUBREGs.  See PR82192.  */
+ || (pos_rtx == NULL_RTX
+ && pos + len <= GET_MODE_PRECISION (is_mode))))
 {
   /* If going from (subreg:SI (mem:QI ...)) to (mem:QI ...),
 consider just the QI as the memory to extract from.
@@ -7423,7 +7430,12 @@ make_extraction (machine_mode mode, rtx
   if (new_rtx != 0)
return gen_rtx_ASHIFT (mode, new_rtx, XEXP (inner, 1));
 }
-  else if (GET_CODE (inner) == TRUNCATE)
+  else if (GET_CODE (inner) == TRUNCATE
+  /* If trying or potentionally trying to extract
+ bits outside of is_mode, don't look through
+ TRUNCATE.  See PR82192.  */
+  && pos_rtx == NULL_RTX
+  && pos + len <= GET_MODE_PRECISION (is_mode))
 inner = XEXP (inner, 0);
 
   inner_mode = GET_MODE (inner);
--- gcc/testsuite/gcc.c-torture/execute/pr82192.c   (nonexistent)
+++ gcc/testsuite/gcc.c-torture/execute/pr82192.c   (revision 254177)
@@ -0,0 +1,22 @@
+/* PR rtl-optimization/82192 */
+
+unsigned long long int a = 0x95dd3d896f7422e2ULL;
+struct S { unsigned int m : 13; } b;
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+  b.m = ((unsigned) a) >> 

Re: [PATCH, rs6000] Fix AIX test case failures

2018-06-25 Thread Segher Boessenkool
On Mon, Jun 25, 2018 at 09:53:17AM -0700, Carl Love wrote:
> On Mon, 2018-06-25 at 04:44 -0500, Segher Boessenkool wrote:
> > On Fri, Jun 22, 2018 at 02:55:44PM -0700, Carl Love wrote:
> > > --- a/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> > > +++ b/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> > > @@ -13,4 +13,5 @@ divide (cld_t *p, cld_t *q, cld_t *r)
> > >    *p = *q / *r;
> > >  }
> > >  
> > > -/* { dg-final { scan-assembler "bl __divkc3" } } */
> > > +/* { dg-final { scan-assembler "bl __divkc3" { target { powerpc*-
> > > *-linux* } } } } */
> > > +/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-
> > > *-aix* } } } } */
> > 
> > Should it be calling __divdc3 on AIX, is that correct?
> 
> I was a bit surprised that it wasn't calling divkc3.  I am guessing
> these are library routines we are calling?  I couldn't find the source
> code for them and don't really know what the difference is between
> divkc3 and divdc3.

divkc3 is for KCmode, that is the complex mode for KFmode (128-bit IEEE).
divdc3 is for DCmode, that is the complex mode for DFmode (64-bit IEEE,
that is, "double").

I think this is the same as PR82625, for which I have a patch in testing.

> So, not sure why AIX and Linux are not calling the name for the
> function or if what is being called is functionally equivalent?

AIX uses 64-bit long double by default, and GCC has a bug with that and
-mabi=ieeelongdouble and __ieee128.

It thinks __ieee128 is the same as long double if it has -mabi=ieeelongdouble,
but that is not always true.  So it ends up using the long double type for
__ieee128, but that is just double precision float in this case.
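
As a rough illustration of the point above (an editorial sketch, not part of
the thread): which libgcc routine a _Complex long double division ends up
calling follows from the format the compiler picks for long double.  The
function names below are hypothetical; the only fact relied on is the one
stated in this thread, that a 64-bit long double makes the division a
DCmode one.

```c
#include <assert.h>
#include <complex.h>
#include <float.h>

typedef _Complex long double cld_t;

/* 1 when long double has the same precision as double, which this
   thread says is the AIX default; in that case a cld_t division is a
   DCmode division.  */
static int
long_double_matches_double (void)
{
  return LDBL_MANT_DIG == DBL_MANT_DIG;
}

/* Same operation as the divide () function in the test case.  */
static cld_t
divide_cld (cld_t q, cld_t r)
{
  return q / r;
}

/* Returns 1 when (4 + 2i) / 2 evaluates to 2 + 1i.  */
static int
demo_complex_divide (void)
{
  cld_t q = 4.0L + 2.0L * I;
  cld_t r = 2.0L;
  return divide_cld (q, r) == 2.0L + 1.0L * I;
}
```

The arithmetic result is the same on every target; only the precision, and
with it the name of the support routine, depends on what long double is.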

So, hang on :-)


Segher


Re: [PATCH] Remove -Wunsafe-loop-optimizations option (PR middle-end/86095)

2018-06-25 Thread NightStrike
On Fri, Jun 15, 2018 at 3:08 PM, Jakub Jelinek  wrote:
> Hi!
>
> As mentioned in the PR, all traces of this warning option except these
> were removed earlier, so the warning option does nothing.

This is unfortunate.  As noted here:

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01057.html

The warning is (or, it used to be) good at finding code areas to
rewrite, and it's MUCH easier to use than the giant walls of text from
-fopt-info.  Is it possible to recreate this functionality in some
other way (or has that already been done, and I missed it)?


Re: [PATCH] gcc_qsort: avoid overlapping memcpy (PR 86311)

2018-06-25 Thread Alexander Monakov
On Mon, 25 Jun 2018, Richard Biener wrote:
>> Sigh - I see GCC optimizes memmove as well as memcpy in this case, so
>> changing the offending memcpy calls to memmoves would be a bit cleaner. OK to
>> go with this instead?
> 
> I think that's better. Or conditionalizing the offending ones on dest != src? 

That would work, but would require an extra comparison ('c->n == 3' test would
need to be kept) and also would introduce a poorly-predictable branch depending
on comparator outcomes - something the design of gcc_qsort strives to avoid in
the first place.
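
For readers less familiar with the issue, a minimal sketch (editorial, not
taken from sort.cc): memcpy is undefined when source and destination
overlap, and that includes the case where they are the very same address,
while memmove is defined for it and needs no extra branch.  The names
place_element and demo_same_address_copy are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Store one element into its output slot.  DST and SRC may be the same
   address when the element is already in place, so memmove is used
   rather than memcpy.  */
static void
place_element (char *dst, const char *src, size_t size)
{
  memmove (dst, src, size);
}

/* Copies the last element of a three-element array onto itself and
   returns its value, which must be unchanged.  */
static int
demo_same_address_copy (void)
{
  int values[3] = { 1, 2, 3 };
  char *base = (char *) values;

  place_element (base + 2 * sizeof (int), base + 2 * sizeof (int),
                 sizeof (int));
  return values[2];
}
```

No comparison of dst against src is needed, which keeps the code free of
the data-dependent branch mentioned above.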

This is the patch I'm going to apply.

PR middle-end/86311
* sort.cc (REORDER_23): Avoid memcpy with same destination and source.
(REORDER_45): Likewise.

diff --git a/gcc/sort.cc b/gcc/sort.cc
index a48a477d4e8..293e2058f89 100644
--- a/gcc/sort.cc
+++ b/gcc/sort.cc
@@ -69,7 +69,7 @@ do { \
   memcpy (, e1 + OFFSET, sizeof (TYPE));  \
   char *out = c->out + OFFSET;   \
   if (likely (c->n == 3))\
-memcpy (out + 2*STRIDE, e2 + OFFSET, sizeof (TYPE)); \
+memmove (out + 2*STRIDE, e2 + OFFSET, sizeof (TYPE));\
   memcpy (out, , sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, , sizeof (TYPE));  \
 } while (0)
@@ -101,7 +101,7 @@ do { \
   memcpy (, e3 + OFFSET, sizeof (TYPE));  \
   char *out = c->out + OFFSET;   \
   if (likely (c->n == 5))\
-memcpy (out + 4*STRIDE, e4 + OFFSET, sizeof (TYPE)); \
+memmove (out + 4*STRIDE, e4 + OFFSET, sizeof (TYPE));\
   memcpy (out, , sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, , sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, , sizeof (TYPE)); out += STRIDE;   \


Re: [PATCH, rs6000] Fix AIX test case failures

2018-06-25 Thread Carl Love
On Mon, 2018-06-25 at 04:44 -0500, Segher Boessenkool wrote:
> Hi Carl,
> 
> On Fri, Jun 22, 2018 at 02:55:44PM -0700, Carl Love wrote:
> > --- a/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> > @@ -13,4 +13,5 @@ divide (cld_t *p, cld_t *q, cld_t *r)
> >    *p = *q / *r;
> >  }
> >  
> > -/* { dg-final { scan-assembler "bl __divkc3" } } */
> > +/* { dg-final { scan-assembler "bl __divkc3" { target { powerpc*-
> > *-linux* } } } } */
> > +/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-
> > *-aix* } } } } */
> 
> Should it be calling __divdc3 on AIX, is that correct?

I was a bit surprised that it wasn't calling divkc3.  I am guessing
these are library routines we are calling?  I couldn't find the source
code for them and don't really know what the difference is between
divkc3 and divdc3.

The source for divkc3-2.c is:

/* { dg-do compile { target { powerpc*-*-* } } } */
/* { dg-require-effective-target powerpc_p8vector_ok } */
/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */

/* Check that complex multiply generates the right call when long double is
   IEEE 128-bit floating point.  */

typedef _Complex long double cld_t;

void
divide (cld_t *p, cld_t *q, cld_t *r)
{
  *p = *q / *r;
}

/* { dg-final { scan-assembler "bl __divkc3" { target { powerpc*-*-linux* } } } } */
/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-*-aix* } } } } */

When compiled as:

gcc -S -c -O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi divkc3-2.c

I get:

.file   "divkc3-2.c"
.toc
.csect .text[PR]
.align 2
.align 4
.globl divide
.globl .divide
.csect divide[DS]
divide:
.long .divide, TOC[tc0], 0
.csect .text[PR]
.divide:
mflr 0
stw 31,-4(1)
lfd 4,8(5)
stw 0,8(1)
lfd 3,0(5)
mr 31,3
stwu 1,-80(1)
lfd 2,8(4)
lfd 1,0(4)
bl .__divdc3
nop
addi 1,1,80
lwz 0,8(1)
stfd 1,0(31)
stfd 2,8(31)
lwz 31,-4(1)
mtlr 0
blr
LT..divide:
.long 0
.byte 0,0,32,65,128,1,3,0
.long 0
.long LT..divide-.divide
.short 6
.byte "divide"
.align 2
_section_.text:
.csect .data[RW],4
.long _section_.text

Again, running the regression test, the test passes with the AIX value.
 
So, not sure why AIX and Linux are not calling the name for the
function or if what is being called is functionally equivalent?
> 
> > --- a/gcc/testsuite/gcc.target/powerpc/divkc3-3.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/divkc3-3.c
> > @@ -13,4 +13,5 @@ divide (cld_t *p, cld_t *q, cld_t *r)
> >    *p = *q / *r;
> >  }
> >  
> > -/* { dg-final { scan-assembler "bl __divtc3" } } */
> > +/* { dg-final { scan-assembler "bl __divtc3" { target { powerpc*-
> > *-linux* } } } } */
> > +/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-
> > *-aix* } } } } */
> 
> Same question here.  If the AIX port cannot handle
> -mabi=ieeelongdouble
> it shouldn't silently accept it, etc.

Ditto above comments, don't know about the -mabi=ieeelongdouble.  

I will play around with and without the mabi=ieeelongdouble on AIX and
Linux to see what happens.  
> 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-
> > double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-
> > double.c
> > index 25f4bc6..403876d 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c
> > @@ -19,7 +19,6 @@ testd_h (vector double vd2, vector double vd3)
> >    return vec_mergeh (vd2, vd3);
> >  }
> >  
> > -/* vec_merge with doubles tend to just use xxpermdi (3 ea for BE,
> > 1 ea for LE).  */
> > -/* { dg-final { scan-assembler-times "xxpermdi" 2  { target {
> > powerpc*le-*-* } }} } */
> > -/* { dg-final { scan-assembler-times "xxpermdi" 6  { target {
> > powerpc-*-* } } } } */
> > +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
> > +
> >  
> 
> Is this change correct?  The test didn't fail on BE Linux (neither
> 32-bit
> nor 64-bit) as far as I know.

As I recall, my testing on the various systems did give me an error
without the change.  I was using Willow 8 (P8 BE), genoa (P8 LE) and
gcc119 for AIX.  I will go back and re-verify.

Carl Love



Invert sense of NO_IMPLICIT_EXTERN_C

2018-06-25 Thread Nathan Sidwell
NO_IMPLICIT_EXTERN_C was introduced to tell the compiler that it didn't 
need to fake up 'extern "C" { ... }' around system header files.  Over 
the years more and more system headers have become C++-aware, leading to 
more targets defining this macro.


Unfortunately because of the sense of this macro, and that the 
requirement is based on the target-OS, whereas we partition the config 
directory by target-ARCH, it's become hard to know which targets still 
require the older functionality.


There have been a few questions over the past 2 decades to figure this 
out, but they didn't progress.


This patch replaces the negative NO_IMPLICIT_EXTERN_C with the positive 
SYSTEM_IMPLICIT_EXTERN_C.  Targets that previously did not define 
NO_IMPLICIT_EXTERN_C now need to define SYSTEM_IMPLICIT_EXTERN_C.  I 
know of one such target -- AIX, and I'd be grateful if this patch could be 
tried there.
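
A small sketch of why the sense matters (editorial; the function names are
hypothetical and only the two macro names come from the patch).  With the
old macro, a target that defines nothing gets the wrapping; with the new
one, a target that defines nothing does not, so every target that still
needs the wrapping has to say so explicitly.

```c
#include <assert.h>

/* A target header would define one of these; neither is defined here,
   which models a target that says nothing at all.  */
/* #define NO_IMPLICIT_EXTERN_C 1 */
/* #define SYSTEM_IMPLICIT_EXTERN_C 1 */

/* Old sense: wrapping is on unless the target opts out.  */
static int
old_scheme_wraps (void)
{
#ifndef NO_IMPLICIT_EXTERN_C
  return 1;
#else
  return 0;
#endif
}

/* New sense: wrapping is off unless the target opts in.  */
static int
new_scheme_wraps (void)
{
#ifdef SYSTEM_IMPLICIT_EXTERN_C
  return 1;
#else
  return 0;
#endif
}
```

For the silent target modelled here, old_scheme_wraps () and
new_scheme_wraps () disagree, and that difference is what has to be
covered by defining SYSTEM_IMPLICIT_EXTERN_C where it is still needed.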


Going through the config files was tricky, and I may well have missed 
something.  One suspicious file is config/sparc/openbsd64.h which did 
explicitly undef the macro, with the comment:


  /* Inherited from sp64-elf.  */

sp64-elf.h does define the macro, but the other bsd's also define it, 
which leaves me wondering if openbsd.h has bit rotted here.  Which leads 
me to another observation:


It's quite possible the extern "C" functionality is enabled on targets 
that no longer need it, because their observed behaviour would not be 
broken.  On the other hand, the failure mode of not defining its 
replacement (or alternatively mistakenly defining NO_IMPLICIT_EXTERN_C), 
would be immediate and obvious.  And the fix is also simple.


So, if you have a target that you think has C++-unaware system headers, 
please give this patch a spin and report.  Blessing from a GM after a 
few days out there would be nice :)


The lesson here is that when one has a transition, choose an enablement 
mechanism that makes it easy to tell when the transition is complete.


nathan

--
Nathan Sidwell
2018-06-25  Nathan Sidwell  

	gcc/c-family/
	* c-lex.c (fe_file_change): Check SYSTEM_IMPLICIT_EXTERN_C not
	NO_IMPLICIT_EXTERN_C.

	gcc/cp/
	* cp/decl.c (decls_match): Check SYSTEM_IMPLICIT_EXTERN_C not
	NO_IMPLICIT_EXTERN_C.
	* cp/parser.c (cp_parser_parameter_declaration_clause): Likewise.

	gcc/
	Replace NO_IMPLICIT_EXTERN_C with SYSTEM_IMPLICIT_EXTERN_C.
	* doc/cpp.texi: Update comment.
	* doc/tm.texi: Rebuilt.
	* doc/tm.texi.in (NO_IMPLICIT_EXTERN_C): Replace with ...
	(SYSTEM_IMPLICIT_EXTERN_C): ... this, opposite sense.
	* doc/extend.texi (Backwards Compatibility): Clarify it is system
	headers affected by extern "C".
	* system.h: Poison NO_IMPLICIT_EXTERN_C.
	* config/alpha/alpha.h, config/arm/uclinux-elf.h,
	config/bfin/elf.h, config/cris/cris.h, config/darwin.h,
	config/dragonfly.h, config/freebsd.h, config/gnu-user.h,
	config/i386/cygming.h, config/i386/djgpp.h, config/i386/nto.h,
	config/ia64/hpux.h, config/lm32/lm32.h, config/lm32/uclinux-elf.h,
	config/lynx.h, config/mips/elf.h, config/mmix/mmix.h,
	config/netbsd.h, config/pa/pa-hpux.h, config/powerpcspe/sysv4.h,
	config/riscv/elf.h, config/rs6000/sysv4.h, config/rtems.h,
	config/s390/tpf.h, config/sh/newlib.h, config/sol2.h,
	config/sparc/openbsd64.h, config/sparc/sp-elf.h,
	config/sparc/sp64-elf.h, config/spu/spu.h,
	config/stormy16/stormy16.h, config/v850/v850.h,
	config/visium/visium.h, config/vx-common.h, config/xtensa/elf.h: Don't
	define NO_IMPLICIT_EXTERN_C.
	* config/rs6000/aix.h: Set SYSTEM_IMPLICIT_EXTERN_C.

Index: c-family/c-lex.c
===
--- c-family/c-lex.c	(revision 262020)
+++ c-family/c-lex.c	(working copy)
@@ -206,7 +206,7 @@ fe_file_change (const line_map_ordinary
 
 	  input_location = new_map->start_location;
 	  (*debug_hooks->start_source_file) (line, LINEMAP_FILE (new_map));
-#ifndef NO_IMPLICIT_EXTERN_C
+#ifdef SYSTEM_IMPLICIT_EXTERN_C
 	  if (c_header_level)
 	++c_header_level;
 	  else if (LINEMAP_SYSP (new_map) == 2)
@@ -219,7 +219,7 @@ fe_file_change (const line_map_ordinary
 }
   else if (new_map->reason == LC_LEAVE)
 {
-#ifndef NO_IMPLICIT_EXTERN_C
+#ifdef SYSTEM_IMPLICIT_EXTERN_C
   if (c_header_level && --c_header_level == 0)
 	{
 	  if (LINEMAP_SYSP (new_map) == 2)
Index: config/alpha/alpha.h
===
--- config/alpha/alpha.h	(revision 262020)
+++ config/alpha/alpha.h	(working copy)
@@ -922,7 +922,4 @@ extern long alpha_auto_offset;
 /* By default, turn on GDB extensions.  */
 #define DEFAULT_GDB_EXTENSIONS 1
 
-/* The system headers under Alpha systems are generally C++-aware.  */
-#define NO_IMPLICIT_EXTERN_C
-
 #define TARGET_SUPPORTS_WIDE_INT 1
Index: config/arm/uclinux-elf.h
===
--- config/arm/uclinux-elf.h	(revision 262020)
+++ config/arm/uclinux-elf.h	(working copy)
@@ -48,9 +48,6 @@
 }		\
   while (false)
 
-/* Do not assume anything 

[PATCH] PR libstdc++/86292 fix exception safety of std::vector constructor

2018-06-25 Thread Jonathan Wakely

PR libstdc++/86292
* include/bits/stl_vector.h (vector::_M_range_initialize):
Add try-catch block.
* testsuite/23_containers/vector/cons/86292.cc: New.

Tested x86_64-linux, committed to trunk.
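
For readers who want to see the property being fixed, here is a rough standalone sketch, not the testsuite file from the patch. Tracked, InputIt and the function name are illustrative. It constructs a std::vector from a single-pass iterator range whose second element copy throws, then reports how many elements were left alive:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Counts live objects; the copy constructor throws once the budget is used up.
struct Tracked {
  static int live;
  static int copies_until_throw;
  Tracked() { ++live; }
  Tracked(const Tracked&) {
    if (--copies_until_throw < 0)
      throw 1;
    ++live;
  }
  ~Tracked() { --live; }
};
int Tracked::live = 0;
int Tracked::copies_until_throw = 0;

// Minimal iterator tagged as input-only, so the vector cannot precompute
// the range length and has to append one element at a time.
struct InputIt {
  using iterator_category = std::input_iterator_tag;
  using value_type = Tracked;
  using difference_type = std::ptrdiff_t;
  using pointer = const Tracked*;
  using reference = const Tracked&;
  const Tracked* p;
  reference operator*() const { return *p; }
  pointer operator->() const { return p; }
  InputIt& operator++() { ++p; return *this; }
  InputIt operator++(int) { InputIt old = *this; ++p; return old; }
  bool operator==(const InputIt& other) const { return p == other.p; }
  bool operator!=(const InputIt& other) const { return p != other.p; }
};

// Returns the number of Tracked objects still alive after the constructor
// has thrown; an exception-safe constructor leaves none behind.
inline int leaked_after_failed_range_construction()
{
  Tracked source[3];
  const int baseline = Tracked::live;
  Tracked::copies_until_throw = 1;  // first copy succeeds, second throws
  bool caught = false;
  try {
    std::vector<Tracked> v(InputIt{source}, InputIt{source + 3});
  } catch (int) {
    caught = true;
  }
  assert(caught);
  return Tracked::live - baseline;
}
```

A result of 0 means the element copied before the exception was destroyed again; without the try-catch block the already-constructed elements would not be destroyed, because the vector's destructor does not run when its constructor throws.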

commit 541662d2c8d2633f2140e434cdaadcf36b8db282
Author: Jonathan Wakely 
Date:   Mon Jun 25 17:24:29 2018 +0100

PR libstdc++/86292 fix exception safety of std::vector 
constructor

PR libstdc++/86292
* include/bits/stl_vector.h 
(vector::_M_range_initialize):
Add try-catch block.
* testsuite/23_containers/vector/cons/86292.cc: New.

diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index acec501bc1b..129d45cd34b 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1440,22 +1440,27 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   // Called by the second initialize_dispatch above
   template<typename _InputIterator>
void
-   _M_range_initialize(_InputIterator __first,
-   _InputIterator __last, std::input_iterator_tag)
+   _M_range_initialize(_InputIterator __first, _InputIterator __last,
+   std::input_iterator_tag)
{
- for (; __first != __last; ++__first)
+ __try {
+   for (; __first != __last; ++__first)
 #if __cplusplus >= 201103L
-   emplace_back(*__first);
+ emplace_back(*__first);
 #else
-   push_back(*__first);
+ push_back(*__first);
 #endif
+ } __catch(...) {
+   clear();
+   __throw_exception_again;
+ }
}
 
   // Called by the second initialize_dispatch above
   template<typename _ForwardIterator>
void
-   _M_range_initialize(_ForwardIterator __first,
-   _ForwardIterator __last, std::forward_iterator_tag)
+   _M_range_initialize(_ForwardIterator __first, _ForwardIterator __last,
+   std::forward_iterator_tag)
{
  const size_type __n = std::distance(__first, __last);
  this->_M_impl._M_start = this->_M_allocate(__n);
diff --git a/libstdc++-v3/testsuite/23_containers/vector/cons/86292.cc 
b/libstdc++-v3/testsuite/23_containers/vector/cons/86292.cc
new file mode 100644
index 000..7103efb82ff
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/cons/86292.cc
@@ -0,0 +1,64 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run }
+
+#include <vector>
+#include <testsuite_hooks.h>
+#include <testsuite_iterators.h>
+
+struct X
+{
+  X() { ++count; }
+  X(const X&) { if (++copies >= max_copies) throw 1; ++count; }
+  ~X() { --count; }
+
+  static int count;
+  static int copies;
+  static int max_copies;
+};
+
+int X::count = 0;
+int X::copies = 0;
+int X::max_copies = 0;
+
+void
+test01()
+{
+  X x[3];
+  const int count = X::count;
+  X::max_copies = 2;
+  __gnu_test::test_container<X, __gnu_test::input_iterator_wrapper>
+    x_input(x, x+3);
+  bool caught = false;
+  try
+  {
+    std::vector<X> v(x_input.begin(), x_input.end());
+  }
+  catch(int)
+  {
+caught = true;
+  }
+  VERIFY( caught );
+  VERIFY( X::count == count );
+}
+
+int
+main()
+{
+  test01();
+}


Re: [PATCH] Add experimental::sample and experimental::shuffle from N4531

2018-06-25 Thread Jonathan Wakely

On 25/06/18 17:23 +0100, Jonathan Wakely wrote:

> The additions to <experimental/random> were added in 2015 but the new
> algorithms in <experimental/algorithm> were not. This adds them.
>
> * include/experimental/algorithm (sample, shuffle): Add new overloads
> using per-thread random number engine.
> * testsuite/experimental/algorithm/sample.cc: Simplify and reduce
> dependencies by using __gnu_test::test_container.
> * testsuite/experimental/algorithm/sample-2.cc: New.
> * testsuite/experimental/algorithm/shuffle.cc: New.
>
> Tested x86_64-linux, committed to trunk.


And this documents it in the manual.


commit 0e350380ee69c6b719362fbd9fbb6a6d0854f6ec
Author: Jonathan Wakely 
Date:   Mon Jun 25 17:42:09 2018 +0100

* doc/xml/manual/status_cxx2017.xml: Document N4531 status.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index aa0914cff72..a77653a3ab4 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -954,6 +954,16 @@ and test for __STDCPP_MATH_SPEC_FUNCS__ >= 201003L.
   Library Fundamentals 2 TS
 
 
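+    <row>
+      <entry>
+	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4531.html">
+	  N4531
+	</link>
+      </entry>
+      <entry>std::rand replacement, revision 3</entry>
+      <entry>Y</entry>
+      <entry>Library Fundamentals 2 TS</entry>
+    </row>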
 
   
 


Re: [PATCH] gcc_qsort: avoid overlapping memcpy (PR 86311)

2018-06-25 Thread Richard Biener
On June 25, 2018 4:52:43 PM GMT+02:00, Alexander Monakov  
wrote:
>On Mon, 25 Jun 2018, Alexander Monakov wrote:
>> 
>> In PR 86311 Valgrind flags a call to memcpy with overlapping buffers.
>This can
>> happen in reorder{23,45} helpers when we're reordering in-place, and
>the 3rd/5th
>> element doesn't need to be moved: in that case the middle memcpy is
>called
>> with source == destination.
>> 
>> The fix is simple: just use a temporary, just like for other
>elements.
>
>Sigh - I see GCC optimizes memmove as well as memcpy in this case, so
>changing
>the offending memcpy calls to memmoves would be a bit cleaner. OK to go
>with
>this instead?

I think that's better. Or conditionalizing the offending ones on dest != src?
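
Both options can be sketched in isolation as follows. The helper names are made up for illustration and do not appear in sort.cc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Option 1: memmove is defined for overlapping buffers, including the
// degenerate case where source and destination are the same address.
inline void copy_allow_overlap(void* dest, const void* src, std::size_t n)
{
  std::memmove(dest, src, n);
}

// Option 2: keep memcpy but skip the call when nothing would move.
// This is only valid if the two buffers are either identical or fully
// disjoint; a partial overlap would still be undefined.
inline void copy_unless_same(void* dest, const void* src, std::size_t n)
{
  if (dest != src)
    std::memcpy(dest, src, n);
}

// Copying an object onto itself must leave it unchanged with either helper.
inline void check_self_copy()
{
  int value = 42;
  copy_allow_overlap(&value, &value, sizeof value);
  assert(value == 42);
  copy_unless_same(&value, &value, sizeof value);
  assert(value == 42);
}
```

The memmove form needs no precondition on the caller, while the guarded memcpy documents the assumption that only exact aliasing can occur.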

Richard. 

>Alexander



[PATCH] Add experimental::sample and experimental::shuffle from N4531

2018-06-25 Thread Jonathan Wakely

The additions to <experimental/random> were added in 2015 but the new
algorithms in <experimental/algorithm> were not. This adds them.

* include/experimental/algorithm (sample, shuffle): Add new overloads
using per-thread random number engine.
* testsuite/experimental/algorithm/sample.cc: Simplify and reduce
dependencies by using __gnu_test::test_container.
* testsuite/experimental/algorithm/sample-2.cc: New.
* testsuite/experimental/algorithm/shuffle.cc: New.

Tested x86_64-linux, committed to trunk.

This would be safe to backport, but nobody has noticed the algos are
missing or complained, so it doesn't seem very important to backport.


commit 3dd31954a53f74b1faa1b5a6dcb0b3d355738931
Author: Jonathan Wakely 
Date:   Mon Jun 25 16:41:13 2018 +0100

Add experimental::sample and experimental::shuffle from N4531

The additions to <experimental/random> were added in 2015 but the new
algorithms in <experimental/algorithm> were not. This adds them.

* include/experimental/algorithm (sample, shuffle): Add new 
overloads
using per-thread random number engine.
* testsuite/experimental/algorithm/sample.cc: Simplify and reduce
dependencies by using __gnu_test::test_container.
* testsuite/experimental/algorithm/sample-2.cc: New.
* testsuite/experimental/algorithm/shuffle.cc: New.

diff --git a/libstdc++-v3/include/experimental/algorithm 
b/libstdc++-v3/include/experimental/algorithm
index fde4f347f88..4c51efb1c97 100644
--- a/libstdc++-v3/include/experimental/algorithm
+++ b/libstdc++-v3/include/experimental/algorithm
@@ -35,6 +35,7 @@
 
 #include <algorithm>
 #include <experimental/bits/lfts_config.h>
+#include <experimental/random>
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -42,7 +43,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 namespace experimental
 {
-inline namespace fundamentals_v1
+inline namespace fundamentals_v2
 {
   template
 inline _ForwardIterator
@@ -79,7 +80,23 @@ inline namespace fundamentals_v1
   __d,
   std::forward<_UniformRandomNumberGenerator>(__g));
 }
-} // namespace fundamentals_v1
+
+  template<typename _PopulationIterator, typename _SampleIterator, typename _Distance>
+inline _SampleIterator
+sample(_PopulationIterator __first, _PopulationIterator __last,
+  _SampleIterator __out, _Distance __n)
+{
+  return experimental::sample(__first, __last, __out, __n,
+ _S_randint_engine());
+}
+
+  template<typename _RandomAccessIterator>
+inline void
+shuffle(_RandomAccessIterator __first, _RandomAccessIterator __last)
+{ return std::shuffle(__first, __last, _S_randint_engine()); }
+
+} // namespace fundamentals_v2
 } // namespace experimental
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/testsuite/experimental/algorithm/sample-2.cc 
b/libstdc++-v3/testsuite/experimental/algorithm/sample-2.cc
new file mode 100644
index 000..4ef9a7c77e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/algorithm/sample-2.cc
@@ -0,0 +1,98 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++14 } }
+
+#include 
+#include 
+#include 
+#include 
+
+using __gnu_test::test_container;
+using __gnu_test::input_iterator_wrapper;
+using __gnu_test::output_iterator_wrapper;
+using __gnu_test::forward_iterator_wrapper;
+
+void
+test01()
+{
+  const int pop[] = { 1, 2 };
+  int samp[10] = { };
+
+  // population smaller than desired sample size
+  auto it = std::experimental::sample(pop, pop + 2, samp, 10);
+  VERIFY( it == samp + 2 );
+  VERIFY( std::accumulate(samp, samp + 10, 0) == 3 );
+}
+
+void
+test02()
+{
+  const int pop[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
+  int samp[10] = { };
+
+  auto it = std::experimental::sample(pop, std::end(pop), samp, 10);
+  VERIFY( it == samp + 10 );
+
+  std::sort(samp, it);
+  auto it2 = std::unique(samp, it);
+  VERIFY( it2 == it );
+}
+
+void
+test03()
+{
+  const int pop[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, };
+  int samp[5] = { };
+
+  // input iterator for population
+  test_container pop_in{pop};
+  auto it = std::experimental::sample(pop_in.begin(), pop_in.end(), samp, 5);
+  VERIFY( it == samp + 5 );
+
+  std::sort(samp, it);
+  auto it2 = std::unique(samp, it);
+  VERIFY( it2 == it );
+}
+
+void
+test04()
+{
+  const int pop[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
+  int samp[5] = 

Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-25 Thread Franz Sirl

On 2018-06-25 at 15:57, Rainer Orth wrote:
> Hi Franz,
>
>> so you are supposed to use "-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64",
>> but at least a quick glance at the Sol10 headers shows that the additional
>> -D_LARGEFILE_SOURCE only makes a difference for fseeko/ftello. That still
>
> right, that's also explained in lfcompile(7).
>
>> doesn't explain -D_LARGEFILE64_SOURCE, does libstdc++ really need to use
>> _LARGEFILE64_SOURCE functions?
>
> Honestly, I hadn't checked, but wondered the same thing.  However, I'm a
> bit wary to simply remove them after years for fear of breaking
> existing user code.

But adding _FILE_OFFSET_BITS=64 is the far bigger change for the user. 
Now suddenly (for 32-bit applications) off_t changes size and thus many 
applications with mixed C/C++-code simply might break. The reason is 
that now (if they didn't take care of _LARGEFILE_SOURCE themselves), for 
example fread() really does a fread64() in the C++ parts and a fread() 
(the 32-bit version) in the C parts. This situation was avoided before 
by enabling _LARGEFILE_SOURCE without _FILE_OFFSET_BITS=64.

>> Re-reading lfcompile(7) again shows that you can use either
>> "-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" (for portable applications) or
>> only "-D_FILE_OFFSET_BITS=64". But in the GCC case we only need it for
>> C++/libstdc++ so it seems "-D_FILE_OFFSET_BITS=64" should be enough. The
>> rest is up to the user's application, isn't it?
>
> One might argue that way, but again it's a bit late to change this now
> for no compelling reason.

The compelling reason is that with _FILE_OFFSET_BITS=64 all bets are off 
anyway IMO, see above.

>> My guess is that without defining _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE
>> the configure check in libstdc++-v3/acinclude.m4 just won't define
>> _GLIBCXX_USE_LFS and everything will fall in place. This would leave HPUX
>> as the last user of _GLIBCXX_USE_LFS.
>
> I don't know about HP-UX, but _GLIBCXX_USE_LFS is defined on Linux/x86_64,
> too.  To me, the meaning seems a bit confused: 27_io/fpos/14775.cc
> suggests that it denotes all systems with largefile support, while
> acinclude.m4 (GLIBCXX_CHECK_LFS) only tests for the transitional
> functions (enabled by _LARGEFILE64_SOURCE on Solaris) while ignoring the
> `transparent' largefile support from _LARGEFILE_SOURCE
> _FILE_OFFSET_BITS=64.
>
> I'd rather not mess with this stuff.


That I can fully understand ;-). Maybe a solution is to define the 
macros also for C, not only for C++. And to conditionalize 
_LARGEFILE_SOURCE on 32-bit compile and _LARGEFILE64_SOURCE on 64-bit 
compile to make it at least less confusing.


Franz


Re: [PATCH][testsuite/guality] Be verbose about gdb version used

2018-06-25 Thread Tom de Vries
On 06/25/2018 04:36 PM, Andreas Schwab wrote:
> On Jun 25 2018, Tom de Vries  wrote:
> 
>> @@ -151,6 +151,9 @@ proc report_gdb { gdb loc } {
>>  }
>>  set gdb [exec which $gdb]
>>  send_log "gdb used in $loc: $gdb\n"
>> -set gdb_version [exec $gdb -v]
>> +if { [catch { set gdb_version [exec $gdb -v] }] } {
>> +   send_log "gdb used in $loc: getting version failed\n"
>> +   return
>> +}
>>  send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
>>  }
> 
> How about this instead:
> 
> diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp 
> b/gcc/testsuite/lib/gcc-gdb-test.exp
> index 9aff6218300..26fb7cd2f4d 100644
> --- a/gcc/testsuite/lib/gcc-gdb-test.exp
> +++ b/gcc/testsuite/lib/gcc-gdb-test.exp
> @@ -151,6 +151,6 @@ proc report_gdb { gdb loc } {
>  }
>  set gdb [exec which $gdb]
>  send_log "gdb used in $loc: $gdb\n"
> -set gdb_version [exec $gdb -v]
> +catch { exec $gdb -v } gdb_version
>  send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
>  }

Hmm, eliminating the set in the catch body is a good idea. But I want to
start with a note that running gdb -v failed.

Committed as attached.

Thanks,
- Tom
[testsuite/guality] Fix tcl error on gdb -v failure

2018-06-25  Tom de Vries  

	* lib/gcc-gdb-test.exp (report_gdb): Handle gdb -v failure.

---
 gcc/testsuite/lib/gcc-gdb-test.exp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp b/gcc/testsuite/lib/gcc-gdb-test.exp
index 9aff6218300..2ef9ca15c12 100644
--- a/gcc/testsuite/lib/gcc-gdb-test.exp
+++ b/gcc/testsuite/lib/gcc-gdb-test.exp
@@ -151,6 +151,12 @@ proc report_gdb { gdb loc } {
 }
 set gdb [exec which $gdb]
 send_log "gdb used in $loc: $gdb\n"
-set gdb_version [exec $gdb -v]
-send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
+
+send_log "gdb used in $loc: "
+if { [catch { exec $gdb -v } gdb_version] } {
+	send_log "getting version failed:\n"
+} else {
+	send_log "version:\n"
+}
+send_log -- "---\n$gdb_version\n---\n"
 }


[committed] Fix v850e divmodhi4 and udivmodhi4

2018-06-25 Thread Jeff Law

Discovered when analyzing testresults for the v850.

There are two bugs lurking in those patterns.  The most serious is the
divmodhi4 pattern.  It takes a 32bit dividend and divides it by the low
16 bits of the divisor.

However, being a HImode pattern, we've assumed the upper 16 bits of the
dividend don't affect the result.  Consider -9216 as the dividend.  We may
have loaded that via a movhi from memory with zero extension resulting
in 0xdc00 as the dividend which looks like 56320 since the dividend
is actually 32 bits.  Oops.

The fix is to sign extend the dividend for divmodhi4 much like we were
already zero extending it for udivmodhi4.  Of course we need to fix the
length while we're at it...
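
The arithmetic can be checked with a small model of the instruction, sketched here in plain C++ with illustrative helper names (a full 32-bit dividend divided by a 16-bit divisor):

```cpp
#include <cassert>
#include <cstdint>

// A 16-bit value widened to 32 bits with the upper half cleared.
constexpr std::int32_t zero_extend_half(std::int16_t value)
{
  return static_cast<std::uint16_t>(value);
}

// A 16-bit value widened to 32 bits with the sign bit copied upwards.
constexpr std::int32_t sign_extend_half(std::int16_t value)
{
  return value;
}

static_assert(zero_extend_half(-9216) == 56320, "0xdc00 read as unsigned");
static_assert(sign_extend_half(-9216) == -9216, "value preserved");

// Model of the division: all 32 bits of the dividend take part.
constexpr std::int32_t divide_by_half(std::int32_t dividend,
                                      std::int16_t divisor)
{
  return dividend / divisor;
}

inline void check_division_example()
{
  // Zero-extended register contents give the wrong signed quotient...
  assert(divide_by_half(zero_extend_half(-9216), 3) == 18773);
  // ...while sign extension first gives the expected one.
  assert(divide_by_half(sign_extend_half(-9216), 3) == -3072);
}
```

So a signed HImode division only yields the right answer if the register holding the dividend has been sign extended first.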

Which brings us to the second bug.  The udivmodhi4 pattern claimed to
have a length of "4", but that's not right.  The div instruction alone
has a length of "4", but we're also emitting a zero-extension, so the
total length is actually 6.

Rather than use

extend ; div

In the output template, I'm now using

extend\n\tdiv

The latter gives better results when fed into the assembler with -al to
verify lengths.

This fixes a handful of execution failures on the v850e3v5 tests.
Installing on the trunk.

Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 754b5f10063..ceec833bb70 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2018-06-25  Jeff Law  
+
+   * config/v850/v850.md (divmodhi4): Make sure to sign extend the
+   dividend to 32 bits.  Adjust length.
+   (udivmodhi4): Cleanup output template.  Fix length.
+
 2018-06-25  Carl Love  
 
* config/rs6000/vsx.md: Change word selector to prefered location.
diff --git a/gcc/config/v850/v850.md b/gcc/config/v850/v850.md
index 2656e90c90b..e01a3102c31 100644
--- a/gcc/config/v850/v850.md
+++ b/gcc/config/v850/v850.md
@@ -738,13 +738,13 @@
(match_dup 2)))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_V850E_UP"
-  "divh %2,%0,%3"
-  [(set_attr "length" "4")
+  "sxh %0\n\tdivh %2,%0,%3"
+  [(set_attr "length" "6")
(set_attr "cc" "clobber")
(set_attr "type" "div")])
 
-;; Half-words are sign-extended by default, so we must zero extend to a word
-;; here before doing the divide.
+;; The half word needs to be zero/sign extended to 32 bits before doing
+;; the division/modulo operation.
 
 (define_insn "udivmodhi4"
   [(set (match_operand:HI 0 "register_operand" "=r")
@@ -755,8 +755,8 @@
 (match_dup 2)))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_V850E_UP"
-  "zxh %0 ; divhu %2,%0,%3"
-  [(set_attr "length" "4")
+  "zxh %0\n\tdivhu %2,%0,%3"
+  [(set_attr "length" "6")
(set_attr "cc" "clobber")
(set_attr "type" "div")])
 


[PATCH] Fix to [-Wmissing-field-initializers] in libatomic/config/posix/lock.c:58

2018-06-25 Thread U.Mutlu
When building with "... -g0 -DNDEBUG -Wall -Wextra -Wpedantic", the 
following warning is emitted:


../../../gcc_trunk/libatomic/config/posix/lock.c: At top level:
../../../gcc_trunk/libatomic/config/posix/lock.c:58:3: warning: missing 
initializer for field 'pad' of 'struct lock' [-Wmissing-field-initializers]

   [0 ... NLOCKS-1].mutex = PTHREAD_MUTEX_INITIALIZER
   ^


The following patch fixes it.
Regtested on x86_64-linux.
NOT committed because I don't have write access yet.


2018-06-25  U.Mutlu  

* config/posix/lock.c: fix to [-Wmissing-field-initializers] when
building with "... -g0 -DNDEBUG -Wall -Wextra -Wpedantic"

Index: libatomic/config/posix/lock.c
===
--- libatomic/config/posix/lock.c   (revision 261994)
+++ libatomic/config/posix/lock.c   (working copy)
@@ -55,7 +55,8 @@

 #define NLOCKS (PAGE_SIZE / WATCH_SIZE)
 static struct lock locks[NLOCKS] = {
-  [0 ... NLOCKS-1].mutex = PTHREAD_MUTEX_INITIALIZER
+  [0 ... NLOCKS-1] = (struct lock)
+   { .mutex = PTHREAD_MUTEX_INITIALIZER, .pad[0] = 0 }
 };

 static inline uintptr_t




[PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes

2018-06-25 Thread Aaron Sawdey
In gcc 8 I added support for unaligned vsx in the builtin expansion of
memset(x,0,y). Turns out that for memset of less than 32 bytes, this
doesn't really help much, and it also runs into an egregious load-hit-
store case in CPU2006 components gcc and hmmer.

This patch reverts to the previous (gcc 7) behavior for memset of 16-31 
bytes, which is to use vsx stores only if the target is 16 byte
aligned. For 32 bytes or more, unaligned vsx stores will still be used.
  Performance testing of the memset expansion shows that not much is
given up by using scalar stores for 16-31 bytes, and CPU2006 runs show
the performance regression is fixed.
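
The change in the condition can be summarized as two predicates, sketched here outside of GCC. The struct target_flags and the function names are illustrative; align_bits stands for the known alignment of the destination in bits, as in the patched code:

```cpp
#include <cassert>

// Illustrative stand-in for the two target properties the condition tests.
struct target_flags {
  bool altivec;
  bool efficient_unaligned_vsx;
};

// gcc 8 behaviour: any clear of 16 bytes or more may use vector stores
// when unaligned VSX is efficient.
constexpr bool use_vector_clear_gcc8(unsigned bytes, unsigned align_bits,
                                     target_flags t)
{
  return bytes >= 16 && t.altivec
         && (align_bits >= 128 || t.efficient_unaligned_vsx);
}

// Patched behaviour: 16-31 bytes need 16-byte alignment again; unaligned
// vector stores are only used from 32 bytes upwards.
constexpr bool use_vector_clear_patched(unsigned bytes, unsigned align_bits,
                                        target_flags t)
{
  return t.altivec
         && ((bytes >= 16 && align_bits >= 128)
             || (bytes >= 32 && t.efficient_unaligned_vsx));
}

inline void check_clear_conditions()
{
  const target_flags vsx{true, true};
  // A 24-byte clear with only 8-byte alignment changes behaviour.
  assert(use_vector_clear_gcc8(24, 64, vsx));
  assert(!use_vector_clear_patched(24, 64, vsx));
  // 32 bytes or more is unaffected by the patch.
  assert(use_vector_clear_patched(32, 64, vsx));
}
```

Only the 16-31 byte, under-aligned case moves back to scalar stores; every other combination gives the same answer before and after.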

Regstrap passes on powerpc64le, ok for trunk and backport to 8?

Thanks,
   Aaron

2018-06-25  Aaron Sawdey  

* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
unaligned vsx for 16B memset.


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c	(revision 261808)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -90,7 +90,9 @@
   machine_mode mode = BLKmode;
   rtx dest;
 
-  if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
+  if (TARGET_ALTIVEC
+	  && ((bytes >= 16 && align >= 128)
+	  || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;


Re: [PATCH][Middle-end][version 3]2nd patch of PR78809 and PR83026

2018-06-25 Thread Qing Zhao


> On Jun 22, 2018, at 11:49 PM, Jeff Law  wrote:
> 
> On 05/29/2018 06:08 PM, Qing Zhao wrote:
>> Hi, Jeff,
>> 
>> Thanks a lot for your review and comments.
>> 
>> I have updated my patch based on your suggestion, and retested this whole 
>> patch on both X86 and aarch64.
>> 
>> please take a look at the patch again.
>> 
>> thanks.
>> 
>> Qing
>> 
>>> On May 25, 2018, at 3:38 PM, Jeff Law  wrote:
>>> So I originally thought you had the core logic wrong in the immediate
>>> uses loop.  But it's actually the case that the return value is the
>>> exact opposite of what I expected.
>>> 
>>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>>> was not transformed.
>>> 
>>> Can you fix that so it's not so confusing?
>>> 
>>> I think with that change we'll be good to go, but please repost for a
>>> final looksie.
>>> 
>>> THanks,
>>> Jeff
>> 
>> 
>> 0001-2nd-Patch-for-PR78009.patch
>> 
>> 
>> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
>> From: qing zhao mailto:qing.z...@oracle.com>>
>> Date: Tue, 29 May 2018 16:15:21 -0400
>> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 
>> 
>> Inline strcmp with small constant strings
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 
>> 
>> missing strlen optimization for strcmp of unequal strings
>> 
>> The design doc for PR78809 is at:
>> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html 
>> 
>> 
>> this patch is for the second part of change of PR78809 and PR83026:
>> 
>> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>> 
>>   B.1. (PR83026) When the lengths of both arguments are constant and
>>it's a strcmp:
>>  * if the lengths are NOT equal, we can safely fold the call
>>to a non-zero value.
>>  * otherwise, do nothing now.
>> 
>>   B.2. (PR78809) When the length of one argument is constant, try to replace
>>   the call with a __builtin_str(n)cmp_eq call where possible, i.e:
>> 
>>   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length, C is a constant.
>> if (C <= strlen(STR) && sizeof_array(s) > C)
>>   {
>> replace this call with
>> __builtin_strncmp_eq (s, STR, C) (!)= 0
>>   }
>> if (C > strlen(STR))
>>   {
>> it can be safely treated as a call to strcmp (s, STR) (!)= 0
>> and can be handled by the following strcmp case.
>>   }
>> 
>>   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length.
>> if  (sizeof_array(s) > strlen(STR))
>>   {
>> replace this call with
>> __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>>   }
>> 
>>   later when expanding the new __builtin_str(n)cmp_eq calls, first expand 
>> them
>>   as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>>   to call to __builtin_str(n)cmp.
>> 
>> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
>> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
>> 
>> bootstrapped and tested on both X86 and Aarch64. no regression.
>> ---
>> gcc/builtins.c |  33 
>> gcc/builtins.def   |   5 +
>> gcc/gimple-fold.c  |   5 +
>> gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 
>> gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 
>> gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
>> gcc/tree-ssa-strlen.c  | 304 
>> ++---
>> gcc/tree-ssa-structalias.c |   2 +
>> gcc/tree.c |   8 +
>> 9 files changed, 454 insertions(+), 17 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c
> Sorry for the long delay.  This needs a ChangeLog.  With the ChangeLog
> it is OK for the trunk.

Hi, Jeff,

the patch has been committed with ChangeLogs, as follows:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039 
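
For readers skimming the quoted description, the strcmp half of case B.2 can be pictured with the following sketch. It is written as an ordinary template with an illustrative name, not as the GCC transformation itself, and it assumes the second argument is a string literal without embedded NULs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Equality test of a char array against a string literal.  LitSize counts
// the terminating NUL, so LitSize == strlen(lit) + 1 under the assumption
// above.
template<std::size_t ArraySize, std::size_t LitSize>
bool equals_literal(const char (&s)[ArraySize], const char (&lit)[LitSize])
{
  // sizeof_array(s) > strlen(STR): reading LitSize bytes from s stays in
  // bounds, so a fixed-length memcmp gives the same equality answer.
  if (ArraySize >= LitSize)
    return std::memcmp(s, lit, LitSize) == 0;
  // Otherwise fall back to the ordinary string comparison.
  return std::strcmp(s, lit) == 0;
}

inline void check_equals_literal()
{
  char buffer[8] = "abc";
  assert(equals_literal(buffer, "abc"));
  assert(!equals_literal(buffer, "abd"));
  assert(!equals_literal(buffer, "ab"));
}
```

Including the terminating NUL in the compared length is what makes a longer string in s compare unequal, which is why the length passed is strlen(STR)+1.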


thanks.

Qing

> 
> Jeff



Re: [PATCH] gcc_qsort: avoid overlapping memcpy (PR 86311)

2018-06-25 Thread Alexander Monakov
On Mon, 25 Jun 2018, Alexander Monakov wrote:
> 
> In PR 86311 Valgrind flags a call to memcpy with overlapping buffers. This can
> happen in reorder{23,45} helpers when we're reordering in-place, and the 
> 3rd/5th
> element doesn't need to be moved: in that case the middle memcpy is called
> with source == destination.
> 
> The fix is simple: just use a temporary, just like for other elements.

Sigh - I see GCC optimizes memmove as well as memcpy in this case, so changing
the offending memcpy calls to memmoves would be a bit cleaner. OK to go with
this instead?

Alexander


[PATCH] gcc_qsort: avoid overlapping memcpy (PR 86311)

2018-06-25 Thread Alexander Monakov
Hi,

In PR 86311 Valgrind flags a call to memcpy with overlapping buffers. This can
happen in reorder{23,45} helpers when we're reordering in-place, and the 3rd/5th
element doesn't need to be moved: in that case the middle memcpy is called
with source == destination.

The fix is simple: just use a temporary, just like for other elements.
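
In isolation the idea looks like the sketch below, a simplified three-element version with an illustrative name rather than the macro from sort.cc. Every input is read into a local before any output is written, so no memcpy ever sees overlapping buffers even when an input already sits in its final slot:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Writes *e0, *e1, *e2 to out[0], out[1], out[2].  The inputs may point
// into out itself (in-place reordering).
template<typename T>
void place3(T* out, const T* e0, const T* e1, const T* e2)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy requires a trivially copyable type");
  T t0, t1, t2;
  std::memcpy(&t0, e0, sizeof(T));
  std::memcpy(&t1, e1, sizeof(T));
  std::memcpy(&t2, e2, sizeof(T));  // the temporary that the fix adds
  std::memcpy(out + 2, &t2, sizeof(T));
  std::memcpy(out, &t0, sizeof(T));
  std::memcpy(out + 1, &t1, sizeof(T));
}

inline void check_place3_in_place()
{
  // The third element is already in position, so e2 == out + 2; copying
  // it directly would call memcpy with identical source and destination.
  int a[3] = {2, 1, 3};
  place3(a, &a[1], &a[0], &a[2]);
  assert(a[0] == 1 && a[1] == 2 && a[2] == 3);
}
```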

I'm a bit surprised at this though: when I looked, gcc turned those fixed-size
memcpy calls to MEMs even at -O0, so Valgrind wouldn't see them.

Bootstrapped on x86_64, OK to apply?

Thanks.
Alexander

PR middle-end/86311
* sort.cc (REORDER_23): Avoid memcpy with same destination and source.
(REORDER_45): Likewise.

diff --git a/gcc/sort.cc b/gcc/sort.cc
index a48a477d4e8..baabc39044f 100644
--- a/gcc/sort.cc
+++ b/gcc/sort.cc
@@ -64,12 +64,15 @@ reorder23 (sort_ctx *c, char *e0, char *e1, char *e2)
 {
 #define REORDER_23(TYPE, STRIDE, OFFSET) \
 do { \
-  TYPE t0, t1;   \
+  TYPE t0, t1, t2;   \
   memcpy (&t0, e0 + OFFSET, sizeof (TYPE));  \
   memcpy (&t1, e1 + OFFSET, sizeof (TYPE));  \
   char *out = c->out + OFFSET;   \
   if (likely (c->n == 3))\
-memcpy (out + 2*STRIDE, e2 + OFFSET, sizeof (TYPE)); \
+{\
+  memcpy (&t2, e2 + OFFSET, sizeof (TYPE));  \
+  memcpy (out + 2*STRIDE, &t2, sizeof (TYPE));   \
+}\
   memcpy (out, &t0, sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, &t1, sizeof (TYPE));  \
 } while (0)
@@ -94,14 +97,17 @@ reorder45 (sort_ctx *c, char *e0, char *e1, char *e2, char 
*e3, char *e4)
 {
 #define REORDER_45(TYPE, STRIDE, OFFSET) \
 do { \
-  TYPE t0, t1, t2, t3;   \
+  TYPE t0, t1, t2, t3, t4;   \
   memcpy (&t0, e0 + OFFSET, sizeof (TYPE));  \
   memcpy (&t1, e1 + OFFSET, sizeof (TYPE));  \
   memcpy (&t2, e2 + OFFSET, sizeof (TYPE));  \
   memcpy (&t3, e3 + OFFSET, sizeof (TYPE));  \
   char *out = c->out + OFFSET;   \
   if (likely (c->n == 5))\
-memcpy (out + 4*STRIDE, e4 + OFFSET, sizeof (TYPE)); \
+{\
+  memcpy (&t4, e4 + OFFSET, sizeof (TYPE));  \
+  memcpy (out + 4*STRIDE, &t4, sizeof (TYPE));   \
+}\
   memcpy (out, &t0, sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, &t1, sizeof (TYPE)); out += STRIDE;   \
   memcpy (out, &t2, sizeof (TYPE)); out += STRIDE;   \


Re: [PATCH][testsuite/guality] Be verbose about gdb version used

2018-06-25 Thread Andreas Schwab
On Jun 25 2018, Tom de Vries  wrote:

> @@ -151,6 +151,9 @@ proc report_gdb { gdb loc } {
>  }
>  set gdb [exec which $gdb]
>  send_log "gdb used in $loc: $gdb\n"
> -set gdb_version [exec $gdb -v]
> +if { [catch { set gdb_version [exec $gdb -v] }] } {
> +   send_log "gdb used in $loc: getting version failed\n"
> +   return
> +}
>  send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
>  }

How about this instead:

diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp 
b/gcc/testsuite/lib/gcc-gdb-test.exp
index 9aff6218300..26fb7cd2f4d 100644
--- a/gcc/testsuite/lib/gcc-gdb-test.exp
+++ b/gcc/testsuite/lib/gcc-gdb-test.exp
@@ -151,6 +151,6 @@ proc report_gdb { gdb loc } {
 }
 set gdb [exec which $gdb]
 send_log "gdb used in $loc: $gdb\n"
-set gdb_version [exec $gdb -v]
+catch { exec $gdb -v } gdb_version
 send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
 }

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] Fix PR86271

2018-06-25 Thread Richard Biener


The following fixes the bogus IL that does pointer extension
from inlining of mismatched function args (testcase with -m32).

The solution as implemented is to not inline by making
fold_convertible_p reject the required conversion.
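
A much simplified model of the changed case, using made-up types instead of GCC trees (type_kind, type_model and the function name are all illustrative), might look like this:

```cpp
#include <cassert>

// Illustrative stand-ins for the few type properties the check looks at.
enum class type_kind { integer, pointer, offset, other };

struct type_model {
  type_kind kind;
  unsigned precision;  // width in bits
};

// Models the integer/pointer/offset target case of the patched check:
// integers and offsets convert as before, but a pointer source is only
// accepted when the target is not wider, i.e. no pointer extension.
constexpr bool convertible_to_integral_like(type_model to, type_model from)
{
  return from.kind == type_kind::integer
         || (from.kind == type_kind::pointer
             && to.precision <= from.precision)
         || from.kind == type_kind::offset;
}

inline void check_pointer_extension_rejected()
{
  const type_model pointer32{type_kind::pointer, 32};
  const type_model long_long64{type_kind::integer, 64};
  const type_model int32{type_kind::integer, 32};
  // The -m32 testcase: a 32-bit pointer passed where long long is expected.
  assert(!convertible_to_integral_like(long_long64, pointer32));
  // Same-width pointer-to-integer conversion is still allowed.
  assert(convertible_to_integral_like(int32, pointer32));
}
```

Since the inliner relies on this predicate to decide whether a mismatched argument can be converted, rejecting the widening case means the call is left alone instead of producing invalid IL.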

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2018-06-25  Richard Biener  

PR middle-end/86271
* fold-const.c (fold_convertible_p): Pointer extension
isn't valid.

* gcc.dg/pr86271.c: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 262006)
+++ gcc/fold-const.c(working copy)
@@ -2358,7 +2358,9 @@ fold_convertible_p (const_tree type, con
 case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE:
 case POINTER_TYPE: case REFERENCE_TYPE:
 case OFFSET_TYPE:
-  return (INTEGRAL_TYPE_P (orig) || POINTER_TYPE_P (orig)
+  return (INTEGRAL_TYPE_P (orig)
+ || (POINTER_TYPE_P (orig)
+ && TYPE_PRECISION (type) <= TYPE_PRECISION (orig))
  || TREE_CODE (orig) == OFFSET_TYPE);
 
 case REAL_TYPE:
Index: gcc/testsuite/gcc.dg/pr86271.c
===
--- gcc/testsuite/gcc.dg/pr86271.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr86271.c  (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int main ()
+{
+  int i;
+  foobar (i, ); /* { dg-warning "implicit declaration" } */
+}
+
+int foobar (int a, long long b)
+{
+  int c;
+
+  c = a % b;
+  a = a / b;
+  return a + b;
+}


[PATCH] Fix PR86287

2018-06-25 Thread Richard Biener


Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2018-06-25  Richard Biener  

PR tree-optimization/86287
* tree-vect-loop.c (vect_transform_loop_stmt): Fix read-after-free.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 8e3f5f550b0..bd549566245 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -8380,8 +8380,9 @@ vect_transform_loop_stmt (loop_vec_info loop_vinfo, 
gimple *stmt,
 
   /* SLP.  Schedule all the SLP instances when the first SLP stmt is
  reached.  */
-  if (STMT_SLP_TYPE (stmt_info))
+  if (slp_vect_type slptype = STMT_SLP_TYPE (stmt_info))
 {
+
   if (!*slp_scheduled)
{
  *slp_scheduled = true;
@@ -8392,7 +8393,7 @@ vect_transform_loop_stmt (loop_vec_info loop_vinfo, 
gimple *stmt,
}
 
   /* Hybrid SLP stmts must be vectorized in addition to SLP.  */
-  if (PURE_SLP_STMT (stmt_info))
+  if (slptype == pure_slp)
return;
 }
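
The pattern behind the fix above, reading a field into a local before a call
that may free the object, can be sketched outside GCC.  Everything here is a
hypothetical model: stmt_info_model, schedule and transform are illustrative
names, and the assumption that scheduling releases the statement info is
inferred from the ChangeLog wording only.

```cpp
#include <cassert>
#include <memory>

// Simplified model of the SLP classification; zero means "not SLP".
enum slp_kind { loop_vect = 0, pure_slp, hybrid };

struct stmt_info_model
{
  slp_kind slp_type;
};

// Hypothetical stand-in for the scheduling step, which is assumed to
// release the statement info it was handed.
static void
schedule (std::unique_ptr<stmt_info_model> &info)
{
  info.reset ();
}

// Returns true when the caller should stop after SLP scheduling.
static bool
transform (std::unique_ptr<stmt_info_model> &info)
{
  // Read the field once, before anything can free the object.
  if (slp_kind kind = info->slp_type)
    {
      schedule (info);
      // Reading info->slp_type here would be a read after free;
      // the cached copy is still valid.
      return kind == pure_slp;
    }
  return false;
}
```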
 


Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-25 Thread Rainer Orth
Hi Franz,

> so you are supposed to use "-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64",
> but at least a quick glance at the Sol10 headers shows that the additional
> -D_LARGEFILE_SOURCE only makes a difference for fseeko/ftello. That still

right, that's also explained in lfcompile(7).

> doesn't explain -D_LARGEFILE64_SOURCE, does libstdc++ really need to use
> _LARGEFILE64_SOURCE functions?

Honestly, I hadn't checked, but wondered the same thing.  However, I'm a
bit wary of simply removing them after years for fear of breaking
existing user code.

> Re-reading lfcompile(7) again shows that you can use either
> "-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" (for portable applications) or
> only "-D_FILE_OFFSET_BITS=64". But in the GCC case we only need it for
> C++/libstdc++ so it seems "-D_FILE_OFFSET_BITS=64" should be enough. The
> rest is up to the users application, or?

One might argue that way, but again it's a bit late to change this now
for no compelling reason.

> My guess is that without defining _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE
> the configure check in libstdc++-v3/acinclude.m4 just won't define
> _GLIBCXX_USE_LFS and everything will fall in place. This would leave HPUX
> as the last user of _GLIBCXX_USE_LFS.

I don't know about HP-UX, but _GLIBCXX_USE_LFS is defined on Linux/x86_64,
too.  To me, the meaning seems a bit confused: 27_io/fpos/14775.cc
suggests that it denotes all systems with largefile support, while
acinclude.m4 (GLIBCXX_CHECK_LFS) only tests for the transitional
functions (enabled by _LARGEFILE64_SOURCE on Solaris) while ignoring the
`transparent' largefile support from _LARGEFILE_SOURCE
_FILE_OFFSET_BITS=64.

I'd rather not mess with this stuff.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] v3 of optinfo, remarks and optimization records

2018-06-25 Thread Richard Biener
On Wed, Jun 20, 2018 at 6:34 PM David Malcolm  wrote:
>
> Here's v3 of the patch (one big patch this time, rather than a kit).
>
> Like the v2 patch kit, this patch reuses the existing dump API,
> rather than inventing its own.
>
> Specifically, it uses the dump_* functions in dumpfile.h that don't
> take a FILE *, the ones that implicitly write to dump_file and/or
> alt_dump_file.  I needed a name for them, so I've taken to calling
> them the "structured dump API" (better name ideas welcome).
>
> v3 eliminates v2's optinfo_guard class, instead using "dump_*_loc"
> calls as delimiters when consolidating "dump_*" calls.  There's a
> new dump_context class which has responsibility for consolidating
> them into optimization records.
>
> The dump_*_loc calls now capture more than just a location_t: they
> capture the profile_count and the location in GCC's own sources where
> the dump is being emitted from.
>
> This works by introducing a new "dump_location_t" class as the
> argument of those dump_*_loc calls.  The dump_location_t can
> be constructed from a gimple * or from an rtx_insn *, so that
> rather than writing:
>
>   dump_printf_loc (MSG_NOTE, gimple_location (stmt),
>"some message: %i", 42);
>
> you can write:
>
>   dump_printf_loc (MSG_NOTE, stmt,
>"some message: %i", 42);
>
> and the dump_location_t constructor will grab the location_t and
> profile_count of stmt, and the location of the "dump_printf_loc"
> callsite (and gracefully handle "stmt" being NULL).
>
> Earlier versions of the patch captured the location of the
> dump_*_loc call via preprocessor hacks, or didn't work properly;
> this version of the patch works more cleanly: internally,
> dump_location_t is split into two new classes:
>   * dump_user_location_t: the location_t and profile_count within
> the *user's code*, and
>   * dump_impl_location_t: the __builtin_FILE/LINE/FUNCTION within
> the *implementation* code (i.e. GCC or a plugin), captured
> "automagically" via default params
>
> These classes are sometimes used elsewhere in the code.  For
> example, "vect_location" becomes a dump_user_location_t
> (location_t and profile_count), so that in e.g:
>
>   vect_location = find_loop_location (loop);
>
> it's capturing the location_t and profile_count, and then when
> it's used here:
>
>   dump_printf_loc (MSG_NOTE, vect_location, "foo");
>
> the dump_location_t is constructed from the vect_location
> plus the dump_impl_location_t at that callsite.
>
> In contrast, loop-unroll.c's report_unroll's "locus" param
> becomes a dump_location_t: we're interested in where it was
> called from, not in the locations of the various dump_*_loc calls
> within it.
>
> Previous versions of the patch captured a gimple *, and needed
> GTY markers; in this patch, the dump_user_location_t is now just a
> location_t and a profile_count.
>
> The v2 patch added an overload for dump_printf_loc so that you
> could pass in either a location_t, or the new type; this version
> of the patch eliminates that: they all now take dump_location_t.
>
> Doing so required adding support for rtx_insn *, so that one can
> write this kind of thing in RTL passes:
>
>   dump_printf_loc (MSG_NOTE, insn, "foo");
>
> One knock-on effect is that get_loop_location now returns a
> dump_user_location_t rather than a location_t, so that it has
> hotness information.
>
> Richi: would you like me to split out this location-handling
> code into a separate patch?  (It's kind of redundant without
> adding the remarks and optimization records work, but if that's
> easier I can do it)

I think that would be easier because it doesn't require the JSON
stuff and so I'll happily approve it.
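
A minimal sketch of the default-parameter capture described in the quoted
text, for readers who have not seen the trick.  The class names below are
hypothetical reductions of dump_impl_location_t, dump_user_location_t and
dump_location_t, not the patch's actual definitions; only the
__builtin_FILE/LINE/FUNCTION builtins are real.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for dump_impl_location_t: where in the
// implementation's own sources the dump call was made.  The builtins
// are evaluated at the point where the default arguments are used.
struct impl_location
{
  impl_location (const char *file = __builtin_FILE (),
                 int line = __builtin_LINE (),
                 const char *function = __builtin_FUNCTION ())
    : m_file (file), m_line (line), m_function (function) {}

  const char *m_file;
  int m_line;
  const char *m_function;
};

// Hypothetical stand-in for dump_user_location_t, reduced to a number.
struct user_location
{
  unsigned m_loc;
};

// Hypothetical stand-in for dump_location_t: the user location plus
// the implementation location captured at the call site.
struct dump_location
{
  dump_location (user_location user,
                 const impl_location &impl = impl_location ())
    : m_user (user), m_impl (impl) {}

  user_location m_user;
  impl_location m_impl;
};

// Formats a note, prefixed by the function that emitted it.
static std::string
format_note (const dump_location &loc, const char *msg)
{
  return std::string (loc.m_impl.m_function) + ": " + msg
         + " (user location " + std::to_string (loc.m_user.m_loc) + ")";
}

// A pretend pass: the user_location converts implicitly, and the
// emitting function is recorded without any macro at the call site.
static std::string
some_pass ()
{
  user_location where = { 42 };
  return format_note (where, "some message");
}
```

The call in some_pass passes only the user-side location; the conversion to
dump_location picks up the emitting function through the default arguments.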

Thus - trying to review those bits (and sorry for the delay).

+  location_t srcloc = loc.get_location_t ();
+
   if (dump_file && (dump_kind & pflags))
 {
-  dump_loc (dump_kind, dump_file, loc);
+  dump_loc (dump_kind, dump_file, srcloc);
   print_gimple_stmt (dump_file, gs, spc, dump_flags | extra_dump_flags);
 }

   if (alt_dump_file && (dump_kind & alt_flags))
 {
-  dump_loc (dump_kind, alt_dump_file, loc);
+  dump_loc (dump_kind, alt_dump_file, srcloc);
   print_gimple_stmt (alt_dump_file, gs, spc, dump_flags |
extra_dump_flags);
 }
+
+  if (optinfo_enabled_p ())
+{
+  optinfo &info = begin_next_optinfo (loc);
+  info.handle_dump_file_kind (dump_kind);
+  info.add_stmt (gs, extra_dump_flags);
+}

seeing this in multiple places.  I seem to remember that
dump_file / alt_dump_file was supposed to handle dumping
into two locations - a dump file and optinfo (or stdout).  This looks
like the optinfo "stream" is even more separate.  Could that
obsolete the alt_dump_file stream?  I'd need to review existing stuff
in more detail to answer but maybe you already know from recently
digging into this.

Oh, and all the if (optinfo_enabled_p ()) stuff is for the followup then, right?

I like the boiler-plate 

Re: [PATCH, rs6000] Change word selector to prefered location for vec_insert builtin

2018-06-25 Thread Bill Schmidt


> On Jun 22, 2018, at 7:56 PM, Segher Boessenkool  
> wrote:
> 
> Hi Carl,
> 
> On Fri, Jun 22, 2018 at 07:32:47AM -0700, Carl Love wrote:
>> The following patch changes the word selected when extracting the word
>> from the second vector to insert into the first vector by the
>> vec_insert() builtin.
>> 
>> Specifically, the test case
>> 
>> vector float
>> fn2 (float a, vector float b)
>> {
>>   return vec_insert (a, b, 1);
>> }
>> 
>> without the patch  generates the code sequence 
>> 
>>  xscvdpspn vs0,vs1
>>  xxextractuw vs0,vs0,4
>>  xxinsertw vs34,vs0,8
> 
> [ For -mcpu=power9 -mlittle ]
> 
> Is this from something in the testsuite?  I can't find it.

It's not from the testsuite.  I found the problem while working on some
vector intrinsic documentation and alerted Carl.

Thanks,
Bill
> 
>> The xscvdpspn places the extracted word into words 0 and 1 of the
>> destination.  The xxextractuw extracts word 1 (offset of 4 bytes) from
>> the source.  The patch changes the offset so that the xxextractuw will
>> extract word 0 (offset 0 bytes) instead of word 1.  The values are the
>> same so there is no functional change. But it was decided that using
>> word 0 was the preferred choice.
> 
> The patch looks fine.  Okay for trunk.  Thanks!
> 
> 
> Segher
> 



Re: [PATCH][testsuite/guality] Be verbose about gdb version used

2018-06-25 Thread Tom de Vries
On 06/25/2018 02:32 PM, Andreas Schwab wrote:
> I'm still getting this error:
> 
> Running /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp 
> ...
> gdb used in 
> /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp: 
> /usr/bin/gdb
> ERROR: tcl error sourcing 
> /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp.
> ERROR: GNU gdb (GDB; SUSE Linux Enterprise 11) 7.9.1
> Copyright (C) 2015 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "ia64-suse-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://bugs.opensuse.org/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word".
> [GDB failed to load libunwind-ia64.so.8: libunwind-ia64.so.8: cannot open 
> shared object file: No such file or directory]
> [GDB failed to load libunwind-ia64.so.7: /usr/lib/libunwind-ia64.so.7: 
> undefined symbol: _Uelf64_get_proc_name]
> while executing
> "exec $gdb -v"
> (procedure "report_gdb" line 8)
> invoked from within
> "report_gdb $::env(GUALITY_GDB_NAME) [info script]"
> (file 
> "/usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp" line 
> 49)
> invoked from within
> "source /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp"
> ("uplevel" body line 1)
> invoked from within
> "uplevel #0 source 
> /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp"
> invoked from within
> "catch "uplevel #0 source $test_file_name""
> testcase /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp 
> completed in 0 seconds

Hi,

Can you try this patch to see if it fixes the error?

Thanks,
- Tom

diff --git a/gcc/testsuite/lib/gcc-gdb-test.exp
b/gcc/testsuite/lib/gcc-gdb-test.exp
index 9aff6218300..a83c4ca04c7 100644
--- a/gcc/testsuite/lib/gcc-gdb-test.exp
+++ b/gcc/testsuite/lib/gcc-gdb-test.exp
@@ -151,6 +151,9 @@ proc report_gdb { gdb loc } {
 }
 set gdb [exec which $gdb]
 send_log "gdb used in $loc: $gdb\n"
-set gdb_version [exec $gdb -v]
+if { [catch { set gdb_version [exec $gdb -v] }] } {
+   send_log "gdb used in $loc: getting version failed\n"
+   return
+}
 send_log "gdb used in $loc: version:\n---\n$gdb_version\n---\n"
 }



Re: [PATCH, PR86257, i386/debug] Fix insn prefix in tls_global_dynamic_64_

2018-06-25 Thread Tom de Vries
On 06/25/2018 02:45 PM, Nathan Sidwell wrote:
> On 06/25/2018 08:25 AM, Tom de Vries wrote:
> 
>> If we'd implemented something like this in gas:
>> ...
>> .insn
>> .byte 0x66
>> .endinsn
>> ...
>> we could fix this more generically.
> 
> Doesn't arm gas provide this functionality with some target-specific
> pseudo?  It'd be good to copy that.
> 
> ah, inst:
> 
> @cindex @code{.inst} directive, ARM
> @item .inst @var{opcode} [ , @dots{} ]
> @itemx .inst.n @var{opcode} [ , @dots{} ]
> @itemx .inst.w @var{opcode} [ , @dots{} ]
> Generates the instruction corresponding to the numerical value
> @var{opcode}.
> @code{.inst.n} and @code{.inst.w} allow the Thumb instruction size to be
> specified explicitly, overriding the normal encoding rules.
> 
> 
> (ARM needs to distinguish data from insns to emit the mapping symbols
> needed for be8 endianness xforms).
> 

Hmm, thanks for pointing that out. There's also something similar for
s390 and riscv.

For mips there's another form of .insn (
https://sourceware.org/binutils/docs/as/MIPS-insn.html#index-_002einsn
), which is similar to what I was proposing:
...
9.27.8 Directive to mark data as an instruction

The .insn directive tells as that the following data is actually
instructions. This makes a difference in MIPS 16 and microMIPS modes:
when loading the address of a label which precedes instructions, as
automatically adds 1 to the value, so that jumping to the loaded address
will do the right thing.
...

Thanks,
- Tom


[PATCH] Fix PR86304

2018-06-25 Thread Richard Biener


Committed as obvious.

Richard.

2018-06-25  Richard Biener  

PR tree-optimization/86304
* tree-vectorizer.c (vectorize_loops): Walk over new possibly
epilogue-if-converted loops as well.

Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   (revision 262012)
+++ gcc/tree-vectorizer.c   (working copy)
@@ -929,7 +929,7 @@ vectorize_loops (void)
   /*  --- Finalize. ---  */
 
   if (any_ifcvt_loops)
-for (i = 1; i < vect_loops_num; i++)
+for (i = 1; i < number_of_loops (cfun); i++)
   {
loop = get_loop (cfun, i);
if (loop && loop->dont_vectorize)
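
Why the loop bound matters can be shown with a small model.  The names are
hypothetical (loop_model, finalize); the only point carried over from the
patch is that a count taken before the pass ran misses loops created later.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a loop with its dont_vectorize flag.
struct loop_model
{
  bool dont_vectorize;
};

// Clears the flag on loops [0, bound) and returns how many were
// cleared.  Loops at or beyond 'bound' are never looked at.
static std::size_t
finalize (std::vector<loop_model> &loops, std::size_t bound)
{
  std::size_t cleared = 0;
  for (std::size_t i = 0; i < bound && i < loops.size (); i++)
    if (loops[i].dont_vectorize)
      {
        loops[i].dont_vectorize = false;
        cleared++;
      }
  return cleared;
}
```

Passing a count recorded before new loops were appended leaves those loops
flagged; passing the current loops.size () visits them as well.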


Re: [PATCH, PR86257, i386/debug] Fix insn prefix in tls_global_dynamic_64_

2018-06-25 Thread Nathan Sidwell

On 06/25/2018 08:25 AM, Tom de Vries wrote:


If we'd implemented something like this in gas:
...
.insn
.byte 0x66
.endinsn
...
we could fix this more generically.


Doesn't arm gas provide this functionality with some target-specific 
pseudo?  It'd be good to copy that.


ah, inst:

@cindex @code{.inst} directive, ARM
@item .inst @var{opcode} [ , @dots{} ]
@itemx .inst.n @var{opcode} [ , @dots{} ]
@itemx .inst.w @var{opcode} [ , @dots{} ]
Generates the instruction corresponding to the numerical value @var{opcode}.
@code{.inst.n} and @code{.inst.w} allow the Thumb instruction size to be
specified explicitly, overriding the normal encoding rules.


(ARM needs to distinguish data from insns to emit the mapping symbols 
needed for be8 endianness xforms).


nathan

--
Nathan Sidwell


Fix noaddr testcase

2018-06-25 Thread Jan Hubicka
Hi,
this patch prevents lto-section-out from dumping the section name into the
dump file when noaddr or unnumbered is used, because the name contains
a random seed.
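
As a rough illustration of the intended behaviour (a standalone model, not
the GCC function; dump_flags_model and section_message are hypothetical
names, and the section name used in any call is made up):

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two dump flags consulted by the patch.
struct dump_flags_model
{
  bool unnumbered;
  bool noaddr;
};

// Builds the dump line.  The section name is left out whenever the
// output is supposed to be stable across runs.
static std::string
section_message (const dump_flags_model &flags, bool compress,
                 const std::string &name)
{
  std::string msg = "Creating ";
  if (compress)
    msg += "compressed ";
  msg += "section";
  if (!flags.unnumbered && !flags.noaddr)
    msg += " " + name;
  return msg + "\n";
}
```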

regtested x86_64-linux, committed.

Honza

* lto-section-out.c (lto_begin_section): Do not print section
name for noaddr and unnumbered dumps.
Index: lto-section-out.c
===
--- lto-section-out.c   (revision 261995)
+++ lto-section-out.c   (working copy)
@@ -68,8 +68,14 @@ lto_begin_section (const char *name, boo
   lang_hooks.lto.begin_section (name);
 
   if (streamer_dump_file)
-fprintf (streamer_dump_file, "Creating %ssection %s\n",
-compress ? "compressed " : "", name);
+{
+  if (flag_dump_unnumbered || flag_dump_noaddr)
+ fprintf (streamer_dump_file, "Creating %ssection\n",
+  compress ? "compressed " : "");
+   else
+ fprintf (streamer_dump_file, "Creating %ssection %s\n",
+  compress ? "compressed " : "", name);
+}
   gcc_assert (compression_stream == NULL);
   if (compress)
 compression_stream = lto_start_compression (lto_append_data, NULL);


Re: [PATCH, PR86257, i386/debug] Fix insn prefix in tls_global_dynamic_64_

2018-06-25 Thread Tom de Vries
On 06/24/2018 11:59 PM, Jan Hubicka wrote:
> Hi,
> searching for other occurrences I see:
> jan@skylake:~/trunk/gcc/config/i386> grep ASM_BYTE *md *.c
> i386.md:return ASM_BYTE "0x9e";
> i386.md:fputs (ASM_BYTE "0x66\n", asm_out_file);
> i386.md:fputs (ASM_BYTE "0x66\n", asm_out_file);
> i386.c:   fputs (ASM_BYTE "0x48, 0x8d, 0xa4, 0x24, 0x00, 0x00, 0x00, 0x00\n",
> i386.c:   fputs (ASM_BYTE "0x8b, 0xff, 0x55, 0x8b, 0xec\n", asm_out_file);
> i386.c: fputs ("\n" ASM_BYTE "0xf2\n\t", file);
> i386.c: fputs ("\n" ASM_BYTE "0xf3\n\t", file);
> i386.c:fprintf (file, "1:" ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
> i386.c:#undef TARGET_ASM_BYTE_OP
> i386.c:#define TARGET_ASM_BYTE_OP ASM_BYTE
> 

I've just reviewed all these occurrences, as well as the bigger-sized
ASM_ ones, and indeed in a couple of places we get a breakpoint at
an incorrect location, but AFAIU not in the middle of an insn as is the
case here.

> Perhaps we want to add new macro like ASM_INSN_BYTE which is used to output
> such prefixes...

Maybe. And I suspect that with the .insn/.endinsn construct I mentioned
earlier in this thread, this new macro would be easier to implement.

Thanks,
- Tom


Re: [PATCH][testsuite/guality] Be verbose about gdb version used

2018-06-25 Thread Andreas Schwab
I'm still getting this error:

Running /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp ...
gdb used in 
/usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp: 
/usr/bin/gdb
ERROR: tcl error sourcing 
/usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp.
ERROR: GNU gdb (GDB; SUSE Linux Enterprise 11) 7.9.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ia64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
[GDB failed to load libunwind-ia64.so.8: libunwind-ia64.so.8: cannot open 
shared object file: No such file or directory]
[GDB failed to load libunwind-ia64.so.7: /usr/lib/libunwind-ia64.so.7: 
undefined symbol: _Uelf64_get_proc_name]
while executing
"exec $gdb -v"
(procedure "report_gdb" line 8)
invoked from within
"report_gdb $::env(GUALITY_GDB_NAME) [info script]"
(file 
"/usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp" line 49)
invoked from within
"source /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp"
("uplevel" body line 1)
invoked from within
"uplevel #0 source 
/usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp"
invoked from within
"catch "uplevel #0 source $test_file_name""
testcase /usr/local/gcc/gcc-20180625/gcc/testsuite/gcc.dg/guality/guality.exp 
completed in 0 seconds

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH, PR86257, i386/debug] Fix insn prefix in tls_global_dynamic_64_

2018-06-25 Thread Tom de Vries
On 06/24/2018 11:56 PM, Jan Hubicka wrote:
>> Hi,
>>
>> [ The analysis of this PR was done at https://stackoverflow.com/a/33557963 ,
>> November 2015. ]
>>
>> This tls sequence:
>> ...
>> 0x00 .byte 0x66
>> 0x01 leaq  x@tlsgd(%rip),%rdi
>> 0x08 .word 0x
>> 0x0a rex64
>> 0x0b call __tls_get_addr@plt
>> ...
>> starts with an insn prefix, produced using a .byte.
>>
>> When using a .loc before the sequence, it's associated with the next assembly
>> instruction, which is the leaq, so the .loc will end up pointing inside the
>> sequence rather than to the start of the sequence.  And when the linker
>> relaxes the sequence, the .loc may end up pointing inside an insn.  This will
>> cause an executable containing such a misplaced .loc to crash when gdb
>> continues from the associated breakpoint.
> 
> Hmm, I am not sure why .byte should be non-instruction and .data should be 
> instruction,
> but I assume gas simply behaves this way.
> 

Well, I suppose when encountering .byte/.value/.long/.quad, gas has no
way of knowing whether it's dealing with data or instructions, even in
the text section. And even if it knew it was dealing with
instructions, it wouldn't know where an instruction begins or ends. So
to me the behaviour seems reasonable.

If we'd implemented something like this in gas:
...
.insn
.byte 0x66
.endinsn
...
we could fix this more generically.

Or maybe we'd want this, which allows us to express that the .byte is a
prefix to an existing insn:
...
.insn
.byte 0x66
leaq  x@tlsgd(%rip),%rdi
.endinsn
...

> Don't we also have other cases where .byte is used to output instructions?
> 
> Patch is OK (and probably should be backported after some soaking in mainline)
> 

Committed (after moving the testcase to gcc.target/i386).

Thanks,
- Tom

> Honza
>>
>> This patch fixes the problem by using data16 to generate the prefix.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for trunk?
>>
>> Thanks,
>> - Tom
>>
>> [i386/debug] Fix insn prefix in tls_global_dynamic_64_
>>
>> 2018-06-22  Tom de Vries  
>>
>>  PR debug/86257
>>  * config/i386/i386.md (define_insn "*tls_global_dynamic_64_"):
>>  Use data16 instead of .byte for insn prefix.
>>
>>  * gcc.dg/pr86257.c: New test.
>>
>> ---
>>  gcc/config/i386/i386.md| 13 -
>>  gcc/testsuite/gcc.dg/pr86257.c | 14 ++
>>  2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index eb77ef3c08f..6f2300057aa 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -14733,7 +14733,18 @@
>>"TARGET_64BIT"
>>  {
>>if (!TARGET_X32)
>> -fputs (ASM_BYTE "0x66\n", asm_out_file);
>> +/* The .loc directive has effect for 'the immediately following assembly
>> +   instruction'.  So for a sequence:
>> + .loc f l
>> + .byte x
>> + insn1
>> +   the 'immediately following assembly instruction' is insn1.
>> +   We want to emit an insn prefix here, but if we use .byte (as shown in
>> +   'ELF Handling For Thread-Local Storage'), a preceding .loc will point
>> +   inside the insn sequence, rather than to the start.  After relaxation
>> +   of the sequence by the linker, the .loc might point inside an insn.
>> +   Use data16 prefix instead, which doesn't have this problem.  */
>> +fputs ("\tdata16", asm_out_file);
>>output_asm_insn
>>  ("lea{q}\t{%E1@tlsgd(%%rip), %%rdi|rdi, %E1@tlsgd[rip]}", operands);
>>if (TARGET_SUN_TLS || flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT)
>> diff --git a/gcc/testsuite/gcc.dg/pr86257.c b/gcc/testsuite/gcc.dg/pr86257.c
>> new file mode 100644
>> index 000..3287c190d36
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr86257.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-require-effective-target fpic } */
>> +/* { dg-require-effective-target tls } */
>> +/* { dg-options "-g -fPIC" } */
>> +
>> +__thread int i;
>> +
>> +void
>> +foo(void)
>> +{
>> +  i = 0;
>> +}
>> +
>> +/* { dg-final { scan-assembler "data16\[ \t\]*leaq" } } */
>> +/* { dg-final { scan-assembler-not "\.byte\[ \t\]*0x66\n\[ \t\]*leaq" } } */


Re: [PATCH 1/4] regcprop: Avoid REG_CFA_REGISTER notes (PR85645)

2018-06-25 Thread Segher Boessenkool
Hi Eric,

On Wed, May 09, 2018 at 09:22:47AM +0200, Eric Botcazou wrote:
> > 2018-05-08  Segher Boessenkool  
> > 
> > PR rtl-optimization/85645
> > *  regcprop.c (copyprop_hardreg_forward_1): Don't propagate into an
> > insn that has a REG_CFA_REGISTER note.
> 
> OK, thanks.

Are these patches okay for backport to 8?  At least the first two.


Segher


Re: [PATCH] rs6000: Fix absif2

2018-06-25 Thread Segher Boessenkool
On Mon, Jun 04, 2018 at 04:30:36PM +, Segher Boessenkool wrote:
> Without this patch absif2 always FAILs.  There is no testcase for
> that, nor do we see it during bootstrap, but it is obvious.
> 
> Bootstrapped and tested on powerpc64-linux {-m32,-m64}; committing
> to trunk.

Backported to 8 now.


Segher


[PATCH] libtool: Sort output of 'find' to enable deterministic builds.

2018-06-25 Thread Bernhard M. Wiedemann
so that gcc builds in a reproducible way
in spite of indeterministic filesystem readdir order

See https://reproducible-builds.org/ for why this is good.

While working on the reproducible builds effort, I found that
when building the gcc8 package for openSUSE, there were differences
between each build in resulting binaries like gccgo, cc1obj and cpp
because the order of objects in libstdc++.a varied based on
the order of entries returned by the filesystem.
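
The effect of the added "sort" can be sketched as follows.  This is only a
model of the idea: join_sorted is a hypothetical helper, and it sorts
byte-wise, whereas the shell's sort follows the current locale.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Joins object names into one space-separated fragment after sorting
// them, so the result does not depend on the order in which the
// filesystem happened to return the entries.
static std::string
join_sorted (std::vector<std::string> objects)
{
  std::sort (objects.begin (), objects.end ());
  std::string out;
  for (const std::string &obj : objects)
    {
      if (!out.empty ())
        out += ' ';
      out += obj;
    }
  return out;
}
```

Two directory listings that differ only in order now give the same archive
member order.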

Two remaining issues are with timestamps in the ada build
and with profiledbootstrap that only is reproducible if all inputs
in the profiling run remain constant (and make -j breaks it too)

Testcases:
  none included because patch is trivial and it would need to compare builds on 
2 filesystems.

Bootstrapping and testing:
  tested successfully with gcc8 on x86_64

[gcc]
2018-06-19  Bernhard M. Wiedemann  

libtool: Sort output of 'find' to enable deterministic builds.

---
pulled in libtool commit 74c8993c178a1386ea5e2363a01d919738402f30
because a full update appears to be too troublesome after 8+ years
of divergence, but we still really want that fix.

See also https://gcc.gnu.org/ml/gcc/2017-10/msg00060.html
---
 libtool.m4 | 8 
 ltmain.sh  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libtool.m4 b/libtool.m4
index 24d13f344..940faaa16 100644
--- a/libtool.m4
+++ b/libtool.m4
@@ -6005,20 +6005,20 @@ if test "$_lt_caught_CXX_error" != yes; then
  _LT_TAGVAR(prelink_cmds, $1)='tpldir=Template.dir~
rm -rf $tpldir~
$CC --prelink_objects --instantiation_dir $tpldir $objs 
$libobjs $compile_deplibs~
-   compile_command="$compile_command `find $tpldir -name \*.o | 
$NL2SP`"'
+   compile_command="$compile_command `find $tpldir -name \*.o | 
sort | $NL2SP`"'
  _LT_TAGVAR(old_archive_cmds, $1)='tpldir=Template.dir~
rm -rf $tpldir~
$CC --prelink_objects --instantiation_dir $tpldir 
$oldobjs$old_deplibs~
-   $AR $AR_FLAGS $oldlib$oldobjs$old_deplibs `find $tpldir -name 
\*.o | $NL2SP`~
+   $AR $AR_FLAGS $oldlib$oldobjs$old_deplibs `find $tpldir -name 
\*.o | sort | $NL2SP`~
$RANLIB $oldlib'
  _LT_TAGVAR(archive_cmds, $1)='tpldir=Template.dir~
rm -rf $tpldir~
$CC --prelink_objects --instantiation_dir $tpldir 
$predep_objects $libobjs $deplibs $convenience $postdep_objects~
-   $CC -shared $pic_flag $predep_objects $libobjs $deplibs `find 
$tpldir -name \*.o | $NL2SP` $postdep_objects $compiler_flags ${wl}-soname 
${wl}$soname -o $lib'
+   $CC -shared $pic_flag $predep_objects $libobjs $deplibs `find 
$tpldir -name \*.o | sort | $NL2SP` $postdep_objects $compiler_flags 
${wl}-soname ${wl}$soname -o $lib'
  _LT_TAGVAR(archive_expsym_cmds, $1)='tpldir=Template.dir~
rm -rf $tpldir~
$CC --prelink_objects --instantiation_dir $tpldir 
$predep_objects $libobjs $deplibs $convenience $postdep_objects~
-   $CC -shared $pic_flag $predep_objects $libobjs $deplibs `find 
$tpldir -name \*.o | $NL2SP` $postdep_objects $compiler_flags ${wl}-soname 
${wl}$soname ${wl}-retain-symbols-file ${wl}$export_symbols -o $lib'
+   $CC -shared $pic_flag $predep_objects $libobjs $deplibs `find 
$tpldir -name \*.o | sort | $NL2SP` $postdep_objects $compiler_flags 
${wl}-soname ${wl}$soname ${wl}-retain-symbols-file ${wl}$export_symbols -o 
$lib'
  ;;
*) # Version 6 and above use weak symbols
  _LT_TAGVAR(archive_cmds, $1)='$CC -shared $pic_flag 
$predep_objects $libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname 
${wl}$soname -o $lib'
diff --git a/ltmain.sh b/ltmain.sh
index 9503ec85d..79f9ba89a 100644
--- a/ltmain.sh
+++ b/ltmain.sh
@@ -2917,7 +2917,7 @@ func_extract_archives ()
darwin_file=
darwin_files=
for darwin_file in $darwin_filelist; do
- darwin_files=`find unfat-$$ -name $darwin_file -print | $NL2SP`
+ darwin_files=`find unfat-$$ -name $darwin_file -print | sort | 
$NL2SP`
  $LIPO -create -output "$darwin_file" $darwin_files
done # $darwin_filelist
$RM -rf unfat-$$
@@ -2932,7 +2932,7 @@ func_extract_archives ()
 func_extract_an_archive "$my_xdir" "$my_xabs"
;;
   esac
-  my_oldobjs="$my_oldobjs "`find $my_xdir -name \*.$objext -print -o -name 
\*.lo -print | $NL2SP`
+  my_oldobjs="$my_oldobjs "`find $my_xdir -name \*.$objext -print -o -name 
\*.lo -print | sort | $NL2SP`
 done
 
 func_extract_archives_result="$my_oldobjs"
-- 
2.13.7



Re: [PATCH] rs6000: Fix vector homogeneous aggregates (PR86197)

2018-06-25 Thread Segher Boessenkool
On Tue, Jun 19, 2018 at 10:45:59AM +, Segher Boessenkool wrote:
> The existing code allows only 4 vectors worth of ieee128 homogeneous
> aggregates, but it should be 8.  This happens because at one spot it
> is mistakenly qualified as being passed in floating point registers.
> 
> This patch fixes it and makes the code easier to read.  Committing to
> trunk; needs backports too.

Backported to 8 now.


Segher


Re: [PATCH] Fix up AVX512F 128/256-bit shifts from using EVEX counts (PR target/84786)

2018-06-25 Thread Kirill Yukhin
Hello Jakub,
On 22 Jun 23:47, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase got fixed in 8/trunk with r253924 part of
> PR82370 enhancements, but that is not IMHO something we should backport.
> So instead the following patch adds something simpler, use Yv constraint
> for the DImode shift count in instructions that need AVX512VL when EVEX
> encoded if AVX512VL is not enabled instead of v.  Bootstrapped/regtested
> on 7.x branch on x86_64-linux and i686-linux, ok for 7.x?
> 
> Is the testcase alone ok also for trunk/8.2?
Patch is ok for trunk and ports.

--
Regards, Kirill Yukhin


Re: [PATCH][3/3] Consider multiple vector sizes for vectorization based on cost

2018-06-25 Thread Richard Biener
On Fri, 22 Jun 2018, Richard Biener wrote:

> 
> The following makes the vectorizer consider all vector sizes as advertised
> by targetm.vectorize.autovectorize_vector_sizes and decide on which
> vector size to use based on costs.
> 
> Given comparing costs is difficult if you do not know the number of
> scalar iterations the patch simply uses the cost of a single vector
> iteration (weighted by vectorization factor) to decide which variant
> is better.  For this we compute this cost also for -fno-vect-cost-model
> (so even with that you'll get the vectorizer choose between sizes)
> and store it in a new LOOP_VINFO_SINGLE_VECTOR_ITERATION_COST.
> 
> Otherwise this is straight-forward and doesn't really depend on
> dataref/ddr analysis sharing (but that makes it less costly).
> 
> Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

I have committed 1/3 and 2/3 now, for the following I have done
benchmarks on Haswell with SPEC CPU 2006 and results are in the
noise (as expected).  So with -Ofast -march=native I see the
following number of vectorized loops:

            AVX256   AVX128
unpatched    13124     1367
patched      12893     1598

which means a slight shift towards AVX128 for whatever
(unanalyzed) reason.

As an amendment to the patch I am considering keeping the
-fno-vect-cost-model behavior the same as now: choose the first
successful vectorization.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 0c2ab57bb20..8e3f5f550b0 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2336,6 +2336,10 @@ vect_analyze_loop (struct loop *loop, loop_vec_info orig_loop_vinfo,
 
  loop_vinfos.safe_push (std::make_pair (loop_vinfo,
 current_vector_size));
+ /* For -fno-vect-cost-model do as in the past, choose the
+first successful vectorization.  */
+ if (unlimited_cost_model (loop))
+   break;
}
   else
delete loop_vinfo;

the patch already bootstrapped and tested fine on x86_64-unknown-linux-gnu
without the above hunk (the vectorizer testsuite mostly uses 
-fno-vect-cost-model).
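
The selection rule described above can be sketched in C roughly as follows.
This is only an illustration, assuming that "weighted by vectorization
factor" means comparing the cost per scalar iteration; struct candidate and
choose_candidate are hypothetical names and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one successful analysis of a loop at a given
   vector size; this is not GCC's loop_vec_info.  */
struct candidate
{
  unsigned vector_size;     /* vector size in bytes */
  unsigned vf;              /* scalar iterations per vector iteration */
  unsigned iteration_cost;  /* cost of a single vector iteration */
};

/* Return the index of the chosen candidate, or -1 if there is none.
   Candidates are assumed to be in the order the target advertises them.  */
int
choose_candidate (const struct candidate *c, size_t n,
                  bool unlimited_cost_model)
{
  if (n == 0)
    return -1;

  /* With -fno-vect-cost-model keep the old behavior: first success wins.  */
  if (unlimited_cost_model)
    return 0;

  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    {
      assert (c[i].vf != 0 && c[best].vf != 0);
      /* Compare cost / vf of both candidates, cross-multiplied so the
         comparison stays in integer arithmetic.  */
      unsigned long long lhs
        = (unsigned long long) c[i].iteration_cost * c[best].vf;
      unsigned long long rhs
        = (unsigned long long) c[best].iteration_cost * c[i].vf;
      if (lhs < rhs)
        best = i;
    }
  return (int) best;
}
```

With a 32-byte candidate costing 20 at VF 8 and a 16-byte candidate costing
8 at VF 4, the sketch picks the second one (2.0 versus 2.5 per scalar
iteration) unless the unlimited cost model is in effect.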

Richard.

> Richard.
> 
> From 4234213fcd5e6fe844bc1171938e12ab189ef290 Mon Sep 17 00:00:00 2001
> From: Richard Guenther 
> Date: Thu, 21 Jun 2018 13:40:14 +0200
> Subject: [PATCH] try-multiple-vector-sizes-and-compare-costs
> 
> 2018-06-22  Richard Biener  
> 
>   * tree-vectorizer.h (struct _loop_vec_info): Add
>   single_vector_iteration_cost member.
>   (LOOP_VINFO_SINGLE_VECTOR_ITERATION_COST): New.
>   * tree-vect-loop.c (_loop_vec_info::~_loop_vec_info): Do not
>   reset loop->aux here.
>   (vect_analyze_loop_form): Do not assert or set loop->aux here.
>   (vect_analyze_loop): Iterate over all vector sizes and decide
>   based on the vector iteration cost which one to use.
>   (vect_estimate_min_profitable_iters): Move check for
>   unlimited cost model later to not skip cost computation or
>   dumping.  Set LOOP_VINFO_SINGLE_VECTOR_ITERATION_COST.
>   (vect_create_epilog_for_reduction): Use loop_vinfo rather
>   than loop_vec_info_for_loop.
> 
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index dacc8811636..0c2ab57bb20 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -947,8 +947,6 @@ _loop_vec_info::~_loop_vec_info ()
>  
>release_vec_loop_masks ();
>delete ivexpr_map;
> -
> -  loop->aux = NULL;
>  }
>  
>  /* Return an invariant or register for EXPR and emit necessary
> @@ -1395,8 +1393,6 @@ vect_analyze_loop_form (struct loop *loop, 
> vec_info_shared *shared)
>  STMT_VINFO_TYPE (vinfo_for_stmt (inner_loop_cond))
>= loop_exit_ctrl_vec_info_type;
>  
> -  gcc_assert (!loop->aux);
> -  loop->aux = loop_vinfo;
>return loop_vinfo;
>  }
>  
> @@ -2285,7 +2281,6 @@ loop_vec_info
>  vect_analyze_loop (struct loop *loop, loop_vec_info orig_loop_vinfo,
>  vec_info_shared *shared)
>  {
> -  loop_vec_info loop_vinfo;
>auto_vector_sizes vector_sizes;
>  
>/* Autodetect first vector size we try.  */
> @@ -2316,11 +2311,12 @@ vect_analyze_loop (struct loop *loop, loop_vec_info 
> orig_loop_vinfo,
>  }
>  
>unsigned n_stmts;
> +  auto_vec > loop_vinfos;
>poly_uint64 autodetected_vector_size = 0;
>while (1)
>  {
>/* Check the CFG characteristics of the loop (nesting, entry/exit).  */
> -  loop_vinfo = vect_analyze_loop_form (loop, shared);
> +  loop_vec_info loop_vinfo = vect_analyze_loop_form (loop, shared);
>if (!loop_vinfo)
>   {
> if (dump_enabled_p ())
> @@ -2338,10 +2334,24 @@ vect_analyze_loop (struct loop *loop, loop_vec_info 
> orig_loop_vinfo,
>   {
> LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
>  
> -   return loop_vinfo;
> +   loop_vinfos.safe_push (std::make_pair (loop_vinfo,
> +  current_vector_size));
>   }
> +  else
> + 

Re: [PATCH 3/3][POPCOUNT] Remove unnecessary if condition in phiopt

2018-06-25 Thread Richard Biener
On Fri, Jun 22, 2018 at 11:16 AM Kugan Vivekanandarajah
 wrote:
>
> gcc/ChangeLog:

@@ -1516,6 +1521,114 @@ minmax_replacement (basic_block cond_bb,
basic_block middle_bb,

   return true;
 }
+/* Convert
+
+   
+   if (b_4(D) != 0)
+   goto 

vertical space before the comment.

+ edge e0 ATTRIBUTE_UNUSED, edge e1
ATTRIBUTE_UNUSED,

why pass them if they are unused?

+  if (stmt_count != 2)
+return false;

so what about the case when there is no conversion?

+  /* Check that we have a popcount builtin.  */
+  if (!is_gimple_call (popcount)
+  || !gimple_call_builtin_p (popcount, BUILT_IN_NORMAL))
+return false;
+  tree fndecl = gimple_call_fndecl (popcount);
+  if ((DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNT)
+  && (DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNTL)
+  && (DECL_FUNCTION_CODE (fndecl) != BUILT_IN_POPCOUNTLL))
+return false;

look at popcount handling in tree-vrp.c how to properly also handle
IFN_POPCOUNT.
(CASE_CFN_POPCOUNT)

+  /* Cond_bb has a check for b_4 != 0 before calling the popcount
+ builtin.  */
+  if (gimple_code (cond) != GIMPLE_COND
+  || gimple_cond_code (cond) != NE_EXPR
+  || TREE_CODE (gimple_cond_lhs (cond)) != SSA_NAME
+  || rhs != gimple_cond_lhs (cond))
+return false;

The check for SSA_NAME is redundant.
You fail to check that gimple_cond_rhs is zero.

+  /* Remove the popcount builtin and cast stmt.  */
+  gsi = gsi_for_stmt (popcount);
+  gsi_remove (&gsi, true);
+  gsi = gsi_for_stmt (cast);
+  gsi_remove (&gsi, true);
+
+  /* And insert the popcount builtin and cast stmt before the cond_bb.  */
+  gsi = gsi_last_bb (cond_bb);
+  gsi_insert_before (&gsi, popcount, GSI_NEW_STMT);
+  gsi_insert_before (&gsi, cast, GSI_NEW_STMT);

use gsi_move_before ().  You need to reset flow sensitive info on the
LHS of the popcount call as well as on the LHS of the cast.

You fail to check the PHI operand on the false edge.  Consider

 if (b != 0)
   res = __builtin_popcount (b);
 else
   res = 1;

You fail to check the PHI operand on the true edge.  Consider

 res = 0;
 if (b != 0)
   {
  __builtin_popcount (b);
  res = 2;
   }

and using -fno-tree-dce and whatever you need to keep the
popcount call in the IL.  A gimple testcase for phiopt will do.

Your testcase relies on popcount detection.  Please write it
using __builtin_popcount instead.  Write one with a cast and
one without.
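
For reference, the three shapes discussed above can be written out as plain
C.  This is a sketch for illustration only, not the requested gimple
testcase; the function names are made up here.

```c
#include <assert.h>

/* The shape the transformation targets: the guard is redundant because
   __builtin_popcount (0) is 0, which is what the else arm supplies.  */
int
guarded_popcount (unsigned b)
{
  int res;
  if (b != 0)
    res = __builtin_popcount (b);
  else
    res = 0;
  return res;
}

/* Must not be transformed: the PHI operand on the false edge is 1.  */
int
guarded_popcount_else_one (unsigned b)
{
  int res;
  if (b != 0)
    res = __builtin_popcount (b);
  else
    res = 1;
  return res;
}

/* Must not be transformed either: the PHI operand on the true edge is
   not the popcount result.  */
int
popcount_result_unused (unsigned b)
{
  int res = 0;
  if (b != 0)
    {
      (void) __builtin_popcount (b);
      res = 2;
    }
  return res;
}

/* Only the first shape agrees with an unconditional popcount.  */
void
check_guard_is_redundant (unsigned b)
{
  assert (guarded_popcount (b) == __builtin_popcount (b));
}
```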

Thanks,
Richard.


> 2018-06-22  Kugan Vivekanandarajah  
>
> * tree-ssa-phiopt.c (cond_removal_in_popcount_pattern): New.
> (tree_ssa_phiopt_worker): Call cond_removal_in_popcount_pattern.
>
> gcc/testsuite/ChangeLog:
>
> 2018-06-22  Kugan Vivekanandarajah  
>
> * gcc.dg/tree-ssa/popcount3.c: New test.


Re: [committed] Update OpenACC testcases

2018-06-25 Thread Rainer Orth
Hi Thomas,

> On our development branch(es) we had accumulated a bunch of testcases
> (updates) that should have been part of earlier patch submissions, or
> were not yet pushed for unknown reasons.  ... until now; in r261884, I
> just committed the following to trunk:
>
> commit e342f300e74ee68bc48ccfdb6ee202da6ca99e9e
> Author: tschwinge 
> Date:   Fri Jun 22 10:04:14 2018 +
>
> Update OpenACC testcases
[...]
> libgomp/
[...]
> * testsuite/libgomp.oacc-c++/non-scalar-data.C: New file.

this test ...

> diff --git libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C 
> libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C
> new file mode 100644
> index 000..8e4b296
> --- /dev/null
> +++ libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C
> @@ -0,0 +1,110 @@
> +// Ensure that a non-scalar dummy arguments which are implicitly used inside
> +// offloaded regions are properly mapped using present_or_copy semantics.
> +
> +// { dg-xfail-if "TODO" { *-*-* } }
> +// { dg-excess-errors "ICE" }

comes up as UNRESOLVED everywhere:

UNRESOLVED: libgomp.oacc-c++/non-scalar-data.C -DACC_DEVICE_TYPE_host=1 
-DACC_MEM_SHARED=1  -O2  compilation failed to produce executable

Unless you plan to fix the ICE soon, please either remove the test or
dg-skip-if it to avoid unnecessary testsuite noise.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH v2, rs6000] Backport Fix implementation of vec_pack (vector double, vector double) built-in function

2018-06-25 Thread Segher Boessenkool
On Fri, Jun 22, 2018 at 05:36:24PM -0500, Kelvin Nilsen wrote:
> After waiting a few days for this newly committed patch to settle, is it ok 
> to backport to gcc 6, gcc 7, and gcc 8?

Sure.  Thanks!


Segher


Re: [PATCH 2/3][POPCOUNT] Check if zero check is done before entering the loop

2018-06-25 Thread Richard Biener
On Fri, Jun 22, 2018 at 11:14 AM Kugan Vivekanandarajah
 wrote:
>
> gcc/ChangeLog:

The canonical way is calling simplify_using_initial_conditions on the
may_be_zero condition.

Richard.

> 2018-06-22  Kugan Vivekanandarajah  
>
> * tree-ssa-loop-niter.c (number_of_iterations_popcount): If popcount
> argument is checked for zero before entering loop, avoid checking again.


Re: [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p

2018-06-25 Thread Richard Biener
On Fri, Jun 22, 2018 at 11:13 AM Kugan Vivekanandarajah
 wrote:
>
> [PATCH 1/3][POPCOUNT] Handle COND_EXPR in expression_expensive_p

This says that COND_EXPR itself isn't expensive.  I think we should
constrain that a bit.  I think a good default would be to only allow a
single COND_EXPR, which you can achieve by adding a
bool in_cond_expr_p = false argument to the function, passing
in_cond_expr_p down, and passing true down from the COND_EXPR handling
itself.

I'm not sure if we should require either COND_EXPR arm (operand 1 or 2)
to be constant or !EXPR_P (then multiple COND_EXPRs might be OK).

The main idea is to avoid evaluating many expressions but only
choosing one in the end.

The simplest patch achieving that is sth like

+  if (code == COND_EXPR)
+    return (expression_expensive_p (TREE_OPERAND (expr, 0))
+            || (EXPR_P (TREE_OPERAND (expr, 1))
+                && EXPR_P (TREE_OPERAND (expr, 2)))
+            || expression_expensive_p (TREE_OPERAND (expr, 1))
+            || expression_expensive_p (TREE_OPERAND (expr, 2)));

OK with that change.
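
To see what that rule accepts and rejects, here is a toy model of it in C.
It is only a sketch: struct toy_expr, toy_expr_p and toy_expensive_p are
hypothetical stand-ins for tree, EXPR_P and expression_expensive_p, and
treating division as the only costly operator is an assumption of the toy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy expression tree; GCC's real tree is far richer.  */
enum toy_code { TOY_CONST, TOY_VAR, TOY_PLUS, TOY_DIV, TOY_COND };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op[3];   /* unused operands are NULL */
};

/* Leaves (constants, variables) are not expressions in the EXPR_P sense.  */
bool
toy_expr_p (const struct toy_expr *e)
{
  assert (e != NULL);
  return e->code != TOY_CONST && e->code != TOY_VAR;
}

bool
toy_expensive_p (const struct toy_expr *e)
{
  if (!toy_expr_p (e))
    return false;

  /* Assumption for the toy: division is the only costly operator.  */
  if (e->code == TOY_DIV)
    return true;

  /* A COND_EXPR is cheap only if at most one arm is itself an expression
     and neither the condition nor an arm is expensive.  */
  if (e->code == TOY_COND)
    return (toy_expensive_p (e->op[0])
            || (toy_expr_p (e->op[1]) && toy_expr_p (e->op[2]))
            || toy_expensive_p (e->op[1])
            || toy_expensive_p (e->op[2]));

  /* Any other operator is as expensive as its operands.  */
  for (int i = 0; i < 3; i++)
    if (e->op[i] != NULL && toy_expensive_p (e->op[i]))
      return true;
  return false;
}
```

In this model x ? x + 1 : 1 is cheap, while x ? x + 1 : x + 1 is rejected
because both arms would have to be evaluated.  A COND_EXPR nested inside
another one is cheap only when the other arm is a leaf.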

Richard.

> gcc/ChangeLog:
>
> 2018-06-22  Kugan Vivekanandarajah  
>
> * tree-scalar-evolution.c (expression_expensive_p): Handle COND_EXPR.


Re: [PATCH 0/3][POPCOUNT]

2018-06-25 Thread Richard Biener
On Mon, Jun 25, 2018 at 11:30 AM Bin.Cheng  wrote:
>
> On Mon, Jun 25, 2018 at 1:50 PM, Kugan Vivekanandarajah
>  wrote:
> > Hi Bin,
> >
> > On 25 June 2018 at 13:56, Bin.Cheng  wrote:
> >> On Mon, Jun 25, 2018 at 11:37 AM, Kugan Vivekanandarajah
> >>  wrote:
> >>> Hi Bin,
> >>>
> >>> Thanks for your comments.
> >>>
> >>> On 25 June 2018 at 11:15, Bin.Cheng  wrote:
>  On Fri, Jun 22, 2018 at 5:11 PM, Kugan Vivekanandarajah
>   wrote:
> > > When we set niter with maybe_zero, currently final_value_replacement
> > > will not happen due to expression_expensive_p not handling it. Patch 1
> > adds this.
> >
> > With that we have the following optimized gimple.
> >
> >[local count: 118111601]:
> >   if (b_4(D) != 0)
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >   _2 = (unsigned long) b_4(D);
> >   _9 = __builtin_popcountl (_2);
> >   c_3 = b_4(D) != 0 ? _9 : 1;
> >
> >[local count: 118111601]:
> >   # c_12 = PHI 
> >
> > I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when the
>  No, it doesn't make much sense.  when b_4(D) == 0, the popcount
>  computed should be 0.  Point is you can never get b_4(D) == 0 with
>  guard condition in basic block 2.  So the result should simply be:
> >>>
> >>> When we do  calculate niter (for the copy header case), we set
> >>> may_be_zero as (which I think is correct)
> >>> niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
> >>>   build_zero_cst
> >>>   (TREE_TYPE (src)));
> >>>
> >>> Then in final_value_replacement_loop (struct loop *loop)
> >>>
> >>> for the PHI stmt for which we are going to do the final value replacement,
> >>> we analyze_scalar_evolution_in_loop which is POLYNOMIAL_CHREC.
> >>>
> >>> then we do
> >>> compute_overall_effect_of_inner_loop (struct loop *loop, tree 
> >>> evolution_fn)
> >>>
> >>> where when we do chrec_apply to the polynomial_chrec with niter from
> >>> popcount which also has the may_be_zero, we end up with the 1.
> >>> Looking at this, I am not sure if this is wrong. May be I am missing 
> >>> something.
> >> I think it is wrong.  How could you get popcount == 1 when b_4(D) ==
> >> 0?  Though it never happens in this case.
> >
> > > We don't set popcount = 1. When we set niter for popcount pattern with
> > niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
> >   build_zero_cst (TREE_TYPE (src)));
> Hmm, I think this is unnecessary and causing the weird cond_expr in
> following optimization.  What happens if you simply set it to false?

It is surely not unnecessary because we set it to non-NULL only when
the result is _not_ simply popcount() but popcount()-1.

All the above suggests that SCEV does sth wrong.  Dumps show

  (set_nb_iterations_in_loop = b_4(D) != 0 ? (unsigned long)
__builtin_popcountl ((unsigned long) b_4(D)) + 18446744073709551615 :
0))

(chrec_apply
  (varying_loop = 1
)
  (chrec = {1, +, 1}_1)
  (x = b_4(D) != 0 ? (int) ((unsigned int) __builtin_popcountl
((unsigned long) b_4(D)) + 4294967295) : 0)
  (res = b_4(D) != 0 ? __builtin_popcountl ((unsigned long) b_4(D)) : 1))
)

this is because the analysis works on a header-copied loop and thus
the IV for C is {1, +, 1}, so this is all correct and is to be simply
optimized by following passes factoring in the range of b_4(D), which
was tested to be _not_ zero before.
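
The arithmetic in the dump can be replayed in plain C.  This is only a
sketch: the helper names are made up, and it assumes the source loop is the
usual clear-lowest-set-bit popcount loop, which the thread does not show.

```c
#include <assert.h>

/* Value of the affine chrec {base, +, step} after n latch executions.  */
long
chrec_apply_affine (long base, long step, unsigned long n)
{
  return base + step * (long) n;
}

/* Latch execution count as in the set_nb_iterations_in_loop dump:
   popcount (b) - 1 when b != 0, otherwise 0 (the may_be_zero arm).  */
unsigned long
popcount_niter (unsigned long b)
{
  return b != 0 ? (unsigned long) __builtin_popcountl (b) - 1 : 0;
}

/* Final value of c for the header-copied loop: {1, +, 1} applied at niter.  */
long
final_value (unsigned long b)
{
  return chrec_apply_affine (1, 1, popcount_niter (b));
}

/* Assumed source-level loop that the popcount pattern recognizes.  */
long
popcount_loop (unsigned long b)
{
  long c = 0;
  while (b != 0)
    {
      b &= b - 1;   /* clear the lowest set bit */
      c++;
    }
  return c;
}

/* Inside the guard (b != 0) the replaced final value matches the loop.  */
void
check_final_value (unsigned long b)
{
  if (b != 0)
    assert (final_value (b) == popcount_loop (b));
}
```

For b == 0 the formula yields 1 while the loop yields 0, but that arm is
only reachable without the b_4(D) != 0 guard, which is why later passes can
drop it.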

Richard.

> Thanks,
> bin
> >
> > Because of which, we have an niter in the final_value_replacement, we have
> > (gdb) p debug_tree (niter)
> >   > type  > size 
> > unit-size 
> > align:64 warn_if_not_align:0 symtab:0 alias-set -1
> > canonical-type 0x7694d1f8 precision:64 min  > 0x7694a120 0> max  > 18446744073709551615>>
> >
> > arg:0  > type  > size 
> > unit-size 
> > align:8 warn_if_not_align:0 symtab:0 alias-set -1
> > canonical-type 0x76945b28 precision:1 min  > 0x7694a048 0> max >
> >
> > arg:0  > 0x76945738 long int>
> > visited var 
> > def_stmt GIMPLE_NOPvolu
> > version:4>
> > arg:1 >
> > arg:1 
> >
> > arg:0 
> >
> > arg:0  > 0x769455e8 int>
> >
> > fn  > 0x76a55888>
> > readonly constant arg:0  > 0x769ff600 __builtin_popcountl>>
> > arg:0  > 0x7694d1f8>
> > arg:0 >>>
> > arg:1 >
> > arg:2  > 0x7694d1f8> constant 0>>
> >
> > Then from there then we do compute_overall_effect_of_inner_loop for
> > scalar evolution of PHI with niter we get the 1.
> >
> >>>
> >>> In this testcase, before we enter the loop we have a check for (b_4(D)
>  0). Thus, setting niter->may_be_zero is not strictly necessary but
> >>> conservatively correct (?).
> >> Yes, but not necessarily.  Setting maybe_zero could 

Re: [PATCH, rs6000] Fix AIX test case failures

2018-06-25 Thread Segher Boessenkool
Hi Carl,

On Fri, Jun 22, 2018 at 02:55:44PM -0700, Carl Love wrote:
> --- a/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/divkc3-2.c
> @@ -13,4 +13,5 @@ divide (cld_t *p, cld_t *q, cld_t *r)
>*p = *q / *r;
>  }
>  
> -/* { dg-final { scan-assembler "bl __divkc3" } } */
> +/* { dg-final { scan-assembler "bl __divkc3" { target { powerpc*-*-linux* } 
> } } } */
> +/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-*-aix* } } 
> } } */

Should it be calling __divdc3 on AIX, is that correct?

> --- a/gcc/testsuite/gcc.target/powerpc/divkc3-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/divkc3-3.c
> @@ -13,4 +13,5 @@ divide (cld_t *p, cld_t *q, cld_t *r)
>*p = *q / *r;
>  }
>  
> -/* { dg-final { scan-assembler "bl __divtc3" } } */
> +/* { dg-final { scan-assembler "bl __divtc3" { target { powerpc*-*-linux* } 
> } } } */
> +/* { dg-final { scan-assembler "bl .__divdc3" { target { powerpc*-*-aix* } } 
> } } */

Same question here.  If the AIX port cannot handle -mabi=ieeelongdouble
it shouldn't silently accept it, etc.

> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c 
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c
> index 25f4bc6..403876d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-mergehl-double.c
> @@ -19,7 +19,6 @@ testd_h (vector double vd2, vector double vd3)
>return vec_mergeh (vd2, vd3);
>  }
>  
> -/* vec_merge with doubles tend to just use xxpermdi (3 ea for BE, 1 ea for 
> LE).  */
> -/* { dg-final { scan-assembler-times "xxpermdi" 2  { target { powerpc*le-*-* 
> } }} } */
> -/* { dg-final { scan-assembler-times "xxpermdi" 6  { target { powerpc-*-* } 
> } } } */
> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
> +
>  

Is this change correct?  The test didn't fail on BE Linux (neither 32-bit
nor 64-bit) as far as I know.


Segher


Re: [PATCH], PowerPC long double transition patches, v2, Patch #1 (disable long double multilib)

2018-06-25 Thread Segher Boessenkool
On Fri, Jun 22, 2018 at 05:04:14PM -0400, Michael Meissner wrote:
> Here is the patch redone to disable multilib support altogether.  I verified
> that without --{enable,disable}-multilib that it no longer builds a multilib
> compiler.  Can I install this into the trunk and to the GCC 8.x branch?

Okay for trunk and 8.  Thanks!


Segher


> 2018-06-22  Michael Meissner  
> 
>   * config.gcc (powerpc64le*): Revert January 16th, 2018 patch that
>   added IEEE/IBM long double multilib support on PowerPC little
>   endian Linux systems.
>   * config/rs6000/linux64.h (MULTILIB_DEFAULTS_IEEE): Likewise.
>   (MULTILIB_DEFAULTS): Likewise.
>   * config/rs6000/rs6000.c (rs6000_option_override_internal):
>   Likewise.
>   * config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Likewise.
>   * config/rs6000/t-ldouble-linux64le-ibm: Delete, no longer used.
>   * config/rs6000/t-ldouble-linux64le-ieee: Delete, no longer used.


Re: [PATCH 0/3][POPCOUNT]

2018-06-25 Thread Bin.Cheng
On Mon, Jun 25, 2018 at 1:50 PM, Kugan Vivekanandarajah
 wrote:
> Hi Bin,
>
> On 25 June 2018 at 13:56, Bin.Cheng  wrote:
>> On Mon, Jun 25, 2018 at 11:37 AM, Kugan Vivekanandarajah
>>  wrote:
>>> Hi Bin,
>>>
>>> Thanks for your comments.
>>>
>>> On 25 June 2018 at 11:15, Bin.Cheng  wrote:
 On Fri, Jun 22, 2018 at 5:11 PM, Kugan Vivekanandarajah
  wrote:
> When we set niter with maybe_zero, currently final_value_replacement
> will not happen due to expression_expensive_p not handling it. Patch 1
> adds this.
>
> With that we have the following optimized gimple.
>
>[local count: 118111601]:
>   if (b_4(D) != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   _9 = __builtin_popcountl (_2);
>   c_3 = b_4(D) != 0 ? _9 : 1;
>
>[local count: 118111601]:
>   # c_12 = PHI 
>
> I assume that 1 in  b_4(D) != 0 ? _9 : 1; is OK (?) because when the
 No, it doesn't make much sense.  when b_4(D) == 0, the popcount
 computed should be 0.  Point is you can never get b_4(D) == 0 with
 guard condition in basic block 2.  So the result should simply be:
>>>
>>> When we do  calculate niter (for the copy header case), we set
>>> may_be_zero as (which I think is correct)
>>> niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
>>>   build_zero_cst
>>>   (TREE_TYPE (src)));
>>>
>>> Then in final_value_replacement_loop (struct loop *loop)
>>>
>>> for the PHI stmt for which we are going to do the final value replacement,
>>> we analyze_scalar_evolution_in_loop which is POLYNOMIAL_CHREC.
>>>
>>> then we do
>>> compute_overall_effect_of_inner_loop (struct loop *loop, tree evolution_fn)
>>>
>>> where when we do chrec_apply to the polynomial_chrec with niter from
>>> popcount which also has the may_be_zero, we end up with the 1.
>>> Looking at this, I am not sure if this is wrong. May be I am missing 
>>> something.
>> I think it is wrong.  How could you get popcount == 1 when b_4(D) ==
>> 0?  Though it never happens in this case.
>
> We don't set popcount = 1. When we set niter for popcount pattern with
> niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
>   build_zero_cst (TREE_TYPE (src)));
Hmm, I think this is unnecessary and causing the weird cond_expr in
following optimization.  What happens if you simply set it to false?

Thanks,
bin
>
> Because of which, we have an niter in the final_value_replacement, we have
> (gdb) p debug_tree (niter)
>   type  size 
> unit-size 
> align:64 warn_if_not_align:0 symtab:0 alias-set -1
> canonical-type 0x7694d1f8 precision:64 min  0x7694a120 0> max  18446744073709551615>>
>
> arg:0  type  size 
> unit-size 
> align:8 warn_if_not_align:0 symtab:0 alias-set -1
> canonical-type 0x76945b28 precision:1 min  0x7694a048 0> max >
>
> arg:0  0x76945738 long int>
> visited var 
> def_stmt GIMPLE_NOPvolu
> version:4>
> arg:1 >
> arg:1 
>
> arg:0 
>
> arg:0  0x769455e8 int>
>
> fn  0x76a55888>
> readonly constant arg:0  0x769ff600 __builtin_popcountl>>
> arg:0  0x7694d1f8>
> arg:0 >>>
> arg:1 >
> arg:2  0x7694d1f8> constant 0>>
>
> Then from there then we do compute_overall_effect_of_inner_loop for
> scalar evolution of PHI with niter we get the 1.
>
>>>
>>> In this testcase, before we enter the loop we have a check for (b_4(D)
 0). Thus, setting niter->may_be_zero is not strictly necessary but
>>> conservatively correct (?).
>> Yes, but not necessarily.  Setting maybe_zero could confuse following
>> optimizations and we should avoid doing that whenever possible.  If
>> any pass goes wrong because it's not set conservatively, it is that
>> pass' responsibility and should be fixed accordingly.  Here IMHO, we
>> don't need to set it.
>
> My patch 2 is for not setting this when we know a_4(D) is not
> zero in this path.
>
> Thanks,
> Kugan
>
>
>
>
>>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> Kugan
>>>

>[local count: 118111601]:
>   if (b_4(D) != 0)
> goto ; [89.00%]
>   else
> goto ; [11.00%]
>
>[local count: 105119324]:
>   _2 = (unsigned long) b_4(D);
>   c_3 = __builtin_popcountl (_2);
>
>[local count: 118111601]:
>   # c_12 = PHI 

 I think this is the code generated if maybe_zero is not set?  which it
 should not be set here.
 For the same reason, it can be further optimized into:

>[local count: 118111601]:
>   _2 = (unsigned long) b_4(D);
>   c_12 = __builtin_popcountl (_2);
>

> latch execute zero times for b_4 == 0 means that the body will execute

Re: [PATCH] Avoid changing DR for scatter/gather refs

2018-06-25 Thread Richard Biener
On Sun, 24 Jun 2018, Eric Botcazou wrote:

> > 2018-06-21  Richard Biener  
> > 
> > * tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
> > * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
> > Avoid calling vect_mark_for_runtime_alias_test with gathers or scatters.
> > (vect_analyze_data_ref_dependence): Re-order checks to deal with
> > NULL DR_STEP.
> > (vect_record_base_alignments): Do not record base alignment
> > for gathers or scatters.
> > (vect_compute_data_ref_alignment): Drop return value that is always
> > true.  Bail out early for gathers or scatters.
> > (vect_enhance_data_refs_alignment): Bail out early for gathers
> > or scatters.
> > (vect_find_same_alignment_drs): Likewise.
> > (vect_analyze_data_refs_alignment): Remove dead code.
> > (vect_slp_analyze_and_verify_node_alignment): Likewise.
> > (vect_analyze_data_refs): For possible gathers or scatters do
> > not create an alternate DR, just check their possible validity
> > and mark them.  Adjust DECL_NONALIASED handling to not rely
> > on DR_BASE_ADDRESS.
> > * tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
> > update inits of gathers or scatters.
> > * tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
> > Also copy gather/scatter flag to pattern vinfo.
> 
> This breaks the attached testcase sso9.adb compiled at -O3:
> 
> +===GNAT BUG DETECTED==+
> | 9.0.0 20180621 (experimental) [trunk revision 261832] (x86_64-suse-linux) 
> GCC error:|
> | in vect_check_gather_scatter, at tree-vect-data-refs.c:3733  |
> | Error detected around /home/eric/svn/gcc/gcc/testsuite/gnat.dg/sso9.adb:6:1|
> | Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |

Ah, ok.  So currently data-ref analysis in dr_analyze_innermost punts
on reverse storage order accesses.  Before this patch the scatter/gather
case re-analyzed the ref w/o loop context and then of course failed
again because of that.  Now we treat it as possibly scatter/gather
and ask vect_check_gather_scatter whether it is really ok.  That does

  base = get_inner_reference (base, , , , ,
  , , );
  gcc_assert (base && !reversep);

but of course reversep can now happen.  (I think get_inner_reference
can never return NULL.)

So the following should fix it, testing in progress.

Richard.

2018-06-25  Richard Biener  

* tree-vect-data-refs.c (vect_check_gather_scatter): Fail
for reverse storage order accesses rather than asserting
they cannot happen here.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 262005)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -3730,7 +3730,9 @@ vect_check_gather_scatter (gimple *stmt,
  that can be gimplified before the loop.  */
   base = get_inner_reference (base, , , , ,
  , , );
-  gcc_assert (base && !reversep);
+  if (reversep)
+return false;
+
   poly_int64 pbytepos = exact_div (pbitpos, BITS_PER_UNIT);
 
   if (TREE_CODE (base) == MEM_REF)


Re: [PATCH, rs6000] Add vec_extract builtin tests, fix arguments on existing tests

2018-06-25 Thread Segher Boessenkool
Hi!

On Fri, Jun 22, 2018 at 08:21:35AM -0700, Carl Love wrote:
> GCC Maintainers:
> 
> The following patch adds tests for the vec_extract builtin.  It also
> adjusts the second argument on the existing tests so they match the
> ABI, specifically an integer not a const integer.

Does that make a difference anywhere?  It shouldn't as far as I know.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p9-extract-4.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */

powerpc*-*-* && lp64

> +/* This file tests the extraction of 64-bit values.  The direct move is
> +   prefered for the  64-bit extract as it is either lower latency or the same
> +   latency as the extract instruction depending on the Endianess of the 
> system.
> +   Furthermore, there can be up to four move instructions in flight at a time
> +   versus only two extract intructions at a time.  */

Please say this is for Power9?  It won't stay true forever.

Okay for trunk with those trivial changes.  Thanks!


Segher


Re: [PATCH][AARCH64] PR target/84521 Fix frame pointer corruption with -fomit-frame-pointer with __builtin_setjmp

2018-06-25 Thread Sudakshina Das

PING!

On 14/06/18 12:10, Sudakshina Das wrote:

Hi Eric

On 07/06/18 16:33, Eric Botcazou wrote:

Sorry this fell off my radar. I have reg-tested it on x86 and tried it
on the sparc machine from the gcc farm but I think I couldn't finish
the run and now it's showing as unreachable.

The patch is a no-op for SPARC because it defines the nonlocal_goto
pattern.

But I would nevertheless strongly suggest _not_ fiddling with the
generic code like that and just defining the nonlocal_goto pattern for
AArch64 instead.



Thank you for the suggestion, I have edited the patch accordingly and
defined the nonlocal_goto pattern for AArch64. This has also helped take
care of the issue with __builtin_longjmp that Wilco had mentioned in his
comment on the PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521#c19).

I have also modified the test case according to Wilco's comment to add
an extra jump buffer. This test case passes with AArch64 but fails on
x86 trunk as follows (it may fail on other targets as well):

FAIL: gcc.c-torture/execute/pr84521.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -Os  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr84521.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test

Testing: Bootstrapped and regtested on aarch64-none-linux-gnu.

Is this ok for trunk?

Sudi

*** gcc/ChangeLog ***

2018-06-14  Sudakshina Das  

 PR target/84521
 * config/aarch64/aarch64.h (DONT_USE_BUILTIN_SETJMP): Update comment.
 * config/aarch64/aarch64.c (aarch64_needs_frame_chain): Add
 cfun->has_nonlocal_label to force frame chain.
 (aarch64_builtin_setjmp_frame_value): New.
 (TARGET_BUILTIN_SETJMP_FRAME_VALUE): Define.
 * config/aarch64/aarch64.md (nonlocal_goto): New.

*** gcc/testsuite/ChangeLog ***

2018-06-14  Sudakshina Das  

 PR target/84521
 * gcc.c-torture/execute/pr84521.c: New test.




Re: [PATCH] [RFC] Higher-level reporting of vectorization problems

2018-06-25 Thread Richard Biener
On Fri, 22 Jun 2018, David Malcolm wrote:

> NightStrike and I were chatting on IRC last week about
> issues with trying to vectorize the following code:
> 
> #include 
> std::size_t f(std::vector> const & v) {
>   std::size_t ret = 0;
>   for (auto const & w: v)
>   ret += w.size();
>   return ret;
> }
> 
> icc could vectorize it, but gcc couldn't, but neither of us could
> immediately figure out what the problem was.
> 
> Using -fopt-info leads to a wall of text.
> 
> I tried using my patch here:
> 
>  "[PATCH] v3 of optinfo, remarks and optimization records"
>   https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01267.html
> 
> It improved things somewhat, by showing:
> (a) the nesting structure via indentation, and
> (b) the GCC line at which each message is emitted (by using the
> "remark" output)
> 
> but it's still a wall of text:
> 
>   https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.remarks.html
>   
> https://dmalcolm.fedorapeople.org/gcc/2018-06-18/test.cc.d/..%7C..%7Csrc%7Ctest.cc.html#line-4
> 
> It doesn't yet provide a simple high-level message to a
> tech-savvy user on what they need to do to get GCC to
> vectorize their loop.

Yeah, in particular the vectorizer is way too noisy in its low-level
functions.  IIRC -fopt-info-vec-missed is "somewhat" better:

t.C:4:26: note: step unknown.
t.C:4:26: note: vector alignment may not be reachable
t.C:4:26: note: not ssa-name.
t.C:4:26: note: use not simple.
t.C:4:26: note: not ssa-name.
t.C:4:26: note: use not simple.
t.C:4:26: note: no array mode for V2DI[3]
t.C:4:26: note: Data access with gaps requires scalar epilogue loop
t.C:4:26: note: can't use a fully-masked loop because the target doesn't 
have the appropriate masked load or store.
t.C:4:26: note: not ssa-name.
t.C:4:26: note: use not simple.
t.C:4:26: note: not ssa-name.
t.C:4:26: note: use not simple.
t.C:4:26: note: no array mode for V2DI[3]
t.C:4:26: note: Data access with gaps requires scalar epilogue loop
t.C:4:26: note: op not supported by target.
t.C:4:26: note: not vectorized: relevant stmt not supported: _15 = _14 
/[ex] 4;
t.C:4:26: note: bad operation or unsupported loop bound.
t.C:4:26: note: not vectorized: no grouped stores in basic block.
t.C:4:26: note: not vectorized: no grouped stores in basic block.
t.C:6:12: note: not vectorized: not enough data-refs in basic block.


> The pertinent dump messages are:
> 
> test.cc:4:23: remark: === try_vectorize_loop_1 === 
> [../../src/gcc/tree-vectorizer.c:674:try_vectorize_loop_1]
> cc1plus: remark:
> Analyzing loop at test.cc:4 
> [../../src/gcc/dumpfile.c:735:ensure_pending_optinfo]
> test.cc:4:23: remark:  === analyze_loop_nest === 
> [../../src/gcc/tree-vect-loop.c:2299:vect_analyze_loop]
> [...snip...]
> test.cc:4:23: remark:   === vect_analyze_loop_operations === 
> [../../src/gcc/tree-vect-loop.c:1520:vect_analyze_loop_operations]
> [...snip...]
> test.cc:4:23: remark:==> examining statement: ‘_15 = _14 /[ex] 4;’ 
> [../../src/gcc/tree-vect-stmts.c:9382:vect_analyze_stmt]
> test.cc:4:23: remark:vect_is_simple_use: operand ‘_14’ 
> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> test.cc:4:23: remark:def_stmt: ‘_14 = _8 - _7;’ 
> [../../src/gcc/tree-vect-stmts.c:10098:vect_is_simple_use]
> test.cc:4:23: remark:type of def: internal 
> [../../src/gcc/tree-vect-stmts.c:10112:vect_is_simple_use]
> test.cc:4:23: remark:vect_is_simple_use: operand ‘4’ 
> [../../src/gcc/tree-vect-stmts.c:10064:vect_is_simple_use]
> test.cc:4:23: remark:op not supported by target. 
> [../../src/gcc/tree-vect-stmts.c:5932:vectorizable_operation]
> test.cc:4:23: remark:not vectorized: relevant stmt not supported: ‘_15 = 
> _14 /[ex] 4;’ [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
> test.cc:4:23: remark:   bad operation or unsupported loop bound. 
> [../../src/gcc/tree-vect-loop.c:2043:vect_analyze_loop_2]
> cc1plus: remark: vectorized 0 loops in function. 
> [../../src/gcc/tree-vectorizer.c:904:vectorize_loops]
> 
> In particular, that complaint from
>   [../../src/gcc/tree-vect-stmts.c:9565:vect_analyze_stmt]
> is coming from:
> 
>   if (!ok)
>     {
>       if (dump_enabled_p ())
>         {
>           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                            "not vectorized: relevant stmt not ");
>           dump_printf (MSG_MISSED_OPTIMIZATION, "supported: ");
>           dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
>         }
>
>       return false;
>     }
> 
> This got me thinking: the user presumably wants to know several
> things:
> 
> * the location of the loop that can't be vectorized (vect_location
>   captures this)
> * location of the problematic statement
> * why it's problematic
> * the problematic statement itself.
> 
> The following is an experiment at capturing that information, by
> recording an "opt_problem" instance describing what the optimization
> problem is, created deep in the callstack when 
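
To make the four items in the list above concrete, here is a minimal sketch of a record that carries them together. It is an illustration only and not the code from the patch: the names src_loc, opt_problem_sketch and format_opt_problem are hypothetical, and the message layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical source position; not a GCC type.  */
struct src_loc
{
  const char *file;
  int line;
  int column;
};

/* Hypothetical record bundling what the user wants to know.  */
struct opt_problem_sketch
{
  struct src_loc loop_loc;  /* Loop that could not be vectorized.  */
  struct src_loc stmt_loc;  /* Location of the problematic statement.  */
  const char *reason;       /* Why the statement is problematic.  */
  const char *stmt_text;    /* The problematic statement itself.  */
};

/* Format PROBLEM as a single message into BUF (SIZE bytes).
   Returns the value of snprintf: the length the full message needs.  */
static int
format_opt_problem (const struct opt_problem_sketch *problem,
                    char *buf, size_t size)
{
  assert (problem != NULL && buf != NULL);
  return snprintf (buf, size,
                   "%s:%d:%d: loop not vectorized: %s: '%s' (at %s:%d:%d)",
                   problem->loop_loc.file, problem->loop_loc.line,
                   problem->loop_loc.column,
                   problem->reason, problem->stmt_text,
                   problem->stmt_loc.file, problem->stmt_loc.line,
                   problem->stmt_loc.column);
}
```

The point of the single record is that the reason is captured where the failure is detected and reported once at the top level, instead of being spread over several low-level dump lines.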

Re: [AArch64][PATCH 1/2] Fix addressing printing of LDP/STP

2018-06-25 Thread Andre Simoes Dias Vieira
On 18/06/18 09:08, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Sorry for the delay I have been on holidays.  I had a look and I think you 
> are right.  With these changes Umq and Uml seem to have the same 
> functionality though, so I would suggest using only one.  Maybe use a 
> different name for both, removing both Umq and Uml in favour of Umn, where 
> the n indicates it narrows the addressing mode.  How does that sound to you?
> 
> I also had a look at Ump, but that one is used in the parallel pattern for 
> STP/LDP which does not use this "narrowing". So we should leave that one as 
> is.
> 
> Cheers,
> Andre
> 
> 
> From: Richard Sandiford 
> Sent: Thursday, June 14, 2018 12:28:16 PM
> To: Andre Simoes Dias Vieira
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [AArch64][PATCH 1/2] Fix addressing printing of LDP/STP
> 
> Andre Simoes Dias Vieira  writes:
>> @@ -5716,10 +5717,17 @@ aarch64_classify_address (struct 
>> aarch64_address_info *info,
>>unsigned int vec_flags = aarch64_classify_vector_mode (mode);
>>bool advsimd_struct_p = (vec_flags == (VEC_ADVSIMD | VEC_STRUCT));
>>bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
>> + || type == ADDR_QUERY_LDP_STP_N
>>   || mode == TImode
>>   || mode == TFmode
>>   || (BYTES_BIG_ENDIAN && advsimd_struct_p));
>>
>> +  /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming 
>> mode
>> + corresponds to the actual size of the memory being loaded/stored and 
>> the
>> + mode of the corresponding addressing mode is half of that.  */
>> +  if (type == ADDR_QUERY_LDP_STP_N && known_eq (GET_MODE_SIZE (mode), 16))
>> +mode = DFmode;
>> +
>>bool allow_reg_index_p = (!load_store_pair_p
>>   && (known_lt (GET_MODE_SIZE (mode), 16)
>>   || vec_flags == VEC_ADVSIMD
> 
> I don't know whether it matters in practice, but that description also
> applies to Umq, not just Uml.  It might be worth changing it too so
> that things stay consistent.
> 
> Thanks,
> Richard
> 
Hi all,

This is a reworked patch, replacing Umq and Uml with Umn.

Bootstrapped and tested on aarch64-none-linux-gnu.

Is this OK for trunk?
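
For readers following along, a minimal sketch of why the narrowing matters. It assumes the architectural LDP/STP encoding (a signed 7-bit immediate scaled by the size of each transferred register), which is not spelled out in this thread. The helper names are hypothetical and this is not the code in aarch64_classify_address.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  Return true if OFFSET (in bytes)
   fits the signed 7-bit scaled immediate of an LDP/STP whose registers
   are ELEM_SIZE bytes each.  */
static bool
ldp_stp_offset_ok (long offset, long elem_size)
{
  assert (elem_size == 4 || elem_size == 8 || elem_size == 16);
  if (offset % elem_size != 0)
    return false;
  long scaled = offset / elem_size;
  return scaled >= -64 && scaled <= 63;
}

/* Model of the narrowing step: ACCESS_SIZE is the size of the whole
   memory access, but it is stored as a pair of registers, so the
   offset is checked against half that size.  */
static bool
pair_lanes_offset_ok (long offset, long access_size)
{
  return ldp_stp_offset_ok (offset, access_size / 2);
}
```

With these definitions an offset of 1008 is accepted for a pair of 16-byte registers but rejected for a 16-byte access stored as two 8-byte registers, where the largest accepted offset is 504.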

gcc
2018-06-25  Andre Vieira  

* config/aarch64/aarch64-simd.md (aarch64_simd_mov): Replace
Umq with Umn.
(store_pair_lanes): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_addr_query_type): Add new
enum value 'ADDR_QUERY_LDP_STP_N'.
* config/aarch64/aarch64.c (aarch64_addr_query_type): Likewise.
(aarch64_print_address_internal): Add declaration.
(aarch64_print_ldpstp_address): Remove.
(aarch64_classify_address): Adapt mode for 'ADDR_QUERY_LDP_STP_N'.
(aarch64_print_operand): Change printing of 'y'.
* config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Use
new enum value 'ADDR_QUERY_LDP_STP_N', don't hardcode mode and use
'true' rather than '1'.
* gcc/config/aarch64/constraints.md (Uml): Likewise.
(Uml): Rename to Umn.
(Umq): Remove.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 4ea50acaa59c0b58a213bd1f27fb78b6d8deee96..c03a442107815eed44a3b6bceb386d78a6615483 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -120,6 +120,10 @@ enum aarch64_symbol_type
ADDR_QUERY_LDP_STP
   Query what is valid for a load/store pair.
 
+   ADDR_QUERY_LDP_STP_N
+  Query what is valid for a load/store pair, but narrow the incoming mode
+  for address checking.  This is used for the store_pair_lanes patterns.
+
ADDR_QUERY_ANY
   Query what is valid for at least one memory constraint, which may
   allow things that "m" doesn't.  For example, the SVE LDR and STR
@@ -128,6 +132,7 @@ enum aarch64_symbol_type
 enum aarch64_addr_query_type {
   ADDR_QUERY_M,
   ADDR_QUERY_LDP_STP,
+  ADDR_QUERY_LDP_STP_N,
   ADDR_QUERY_ANY
 };
 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4e5c42b0f15b863f3088dba4aac450f31ca309bb..7936581947b360a6d6a88ce6523bbb804c3eb89c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -131,7 +131,7 @@
 
 (define_insn "*aarch64_simd_mov"
   [(set (match_operand:VQ 0 "nonimmediate_operand"
-		"=w, Umq,  m,  w, ?r, ?w, ?r, w")
+		"=w, Umn,  m,  w, ?r, ?w, ?r, w")
 	(match_operand:VQ 1 "general_operand"
 		"m,  Dz, w,  w,  w,  r,  r, Dn"))]
   "TARGET_SIMD
@@ -3059,7 +3059,7 @@
 )
 
 (define_insn "store_pair_lanes"
-  [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand" "=Uml, Uml")
+  [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand" "=Umn, Umn")
 	(vec_concat:
 	   (match_operand:VDC 1 "register_operand" "w, r")
 	   (match_operand:VDC 2 "register_operand" "w, r")))]
diff 

Re: [AArch64][PATCH 2/2] PR target/83009: Relax strict address checking for store pair lanes

2018-06-25 Thread Andre Simoes Dias Vieira
On 14/06/18 12:47, Richard Sandiford wrote:
> Kyrill  Tkachov  writes:
>> Hi Andre,
>> On 07/06/18 18:02, Andre Simoes Dias Vieira wrote:
>>> Hi,
>>>
>>> See below a patch to address PR 83009.
>>>
>>> Tested with aarch64-linux-gnu bootstrap and regtests for c, c++ and fortran.
>>> Ran the adjusted testcase for -mabi=ilp32.
>>>
>>> Is this OK for trunk?
>>>
>>> Cheers,
>>> Andre
>>>
>>> PR target/83009: Relax strict address checking for store pair lanes
>>>
>>> The operand constraint for the memory address of store/load pair lanes was
>>> enforcing strictly hardware registers be allowed as memory addresses.  We 
>>> want
>>> to relax that such that these patterns can be used by combine, prior
>>> to reload.
>>> During register allocation the register constraint will enforce the correct
>>> register is chosen.
>>>
>>> gcc
>>> 2018-06-07  Andre Vieira 
>>>
>>> PR target/83009
>>> * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): 
>>> Make
>>> address check not strict prior to reload.
>>>
>>> gcc/testsuite
>>> 2018-06-07 Andre Vieira 
>>>
>>> PR target/83009
>>> * gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests.
>>
>> diff --git a/gcc/config/aarch64/predicates.md 
>> b/gcc/config/aarch64/predicates.md
>> index 
>> f0917af8b4cec945ba4e38e4dc670200f8812983..30aa88838671bf343a883077c2b606a035c030dd
>>  100644
>> --- a/gcc/config/aarch64/predicates.md
>> +++ b/gcc/config/aarch64/predicates.md
>> @@ -227,7 +227,7 @@
>>   (define_predicate "aarch64_mem_pair_lanes_operand"
>> (and (match_code "mem")
>>  (match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 
>> 0),
>> -  true,
>> +  reload_completed,
>>ADDR_QUERY_LDP_STP_N)")))
>>   
>>
>> If you want to enforce strict checking during reload and later then I think 
>> you need to use reload_in_progress || reload_completed ?
> 
> That was the old way, but it would be lra_in_progress now.
> However...
> 
>> I guess that would be equivalent to !can_create_pseudo ().
> 
> We should never see pseudos when reload_completed, so the choice
> shouldn't really matter then.  And I don't think we should use
> lra_in_progress either, since that would make the checks stricter
> before RA has actually happened, which would likely lead to an
> unrecognisable insn ICE if recog is called during one of the LRA
> subpasses.
> 
> So unless we know a reason otherwise, I think this should just
> be "false" (like it already is for aarch64_mem_pair_operand).
> 
> Thanks,
> Richard
> 
Changed it to false.
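
As a rough illustration of what the strict flag changes, here is a minimal sketch. The register numbering and the names below are hypothetical and this is not the aarch64 implementation. It only models the usual rule that a strict check rejects pseudo registers as base registers, while a non-strict check accepts them because register allocation will assign a hard register later.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering: registers at or above SKETCH_FIRST_PSEUDO are
   pseudos; hard registers 0..SKETCH_LAST_BASE_HARD_REG may be used as
   an address base.  */
enum { SKETCH_LAST_BASE_HARD_REG = 30, SKETCH_FIRST_PSEUDO = 64 };

/* Return true if REGNO is acceptable as a base register.  */
static bool
base_reg_ok (int regno, bool strict_p)
{
  assert (regno >= 0);
  if (regno >= SKETCH_FIRST_PSEUDO)
    return !strict_p;  /* Pseudo: acceptable only when not strict.  */
  return regno <= SKETCH_LAST_BASE_HARD_REG;
}
```

Passing false lets combine match the pattern while the address is still held in a pseudo, which is the case PR 83009 is about.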

Bootstrapped and regression tested on aarch64-none-linux-gnu.

Is this OK for trunk?

Cheers,
Andre

gcc
2018-06-25  Andre Vieira  

PR target/83009
* config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Make
address check not strict.

gcc/testsuite
2018-06-25  Andre Vieira  

PR target/83009
* gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index f0917af8b4cec945ba4e38e4dc670200f8812983..e1a022b5c5a371230c71cd1dd944f5b0d4f4dc4c 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -227,7 +227,7 @@
 (define_predicate "aarch64_mem_pair_lanes_operand"
   (and (match_code "mem")
(match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 0),
-		  true,
+		  false,
 		  ADDR_QUERY_LDP_STP_N)")))
 
 (define_predicate "aarch64_prefetch_operand"
diff --git a/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c b/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c
index 990aea32de6f8239effa95a081950684c6e11386..3296d04da14149d26d19da785663b87bd5ad8994 100644
--- a/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c
+++ b/gcc/testsuite/gcc.target/aarch64/store_v2vec_lanes.c
@@ -22,10 +22,32 @@ construct_lane_2 (long long *y, v2di *z)
   z[2] = x;
 }
 
+void
+construct_lane_3 (double **py, v2df **pz)
+{
+  double *y = *py;
+  v2df *z = *pz;
+  double y0 = y[0] + 1;
+  double y1 = y[1] + 2;
+  v2df x = {y0, y1};
+  z[2] = x;
+}
+
+void
+construct_lane_4 (long long **py, v2di **pz)
+{
+  long long *y = *py;
+  v2di *z = *pz;
+  long long y0 = y[0] + 1;
+  long long y1 = y[1] + 2;
+  v2di x = {y0, y1};
+  z[2] = x;
+}
+
 /* We can use the load_pair_lanes pattern to vec_concat two DI/DF
values from consecutive memory into a 2-element vector by using
a Q-reg LDR.  */
 
-/* { dg-final { scan-assembler-times "stp\td\[0-9\]+, d\[0-9\]+" 1 { xfail ilp32 } } } */
-/* { dg-final { scan-assembler-times "stp\tx\[0-9\]+, x\[0-9\]+" 1 { xfail ilp32 } } } */
-/* { dg-final { scan-assembler-not "ins\t" { xfail ilp32 } } } */
+/* { dg-final { scan-assembler-times "stp\td\[0-9\]+, d\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-times "stp\tx\[0-9\]+, x\[0-9\]+" 2 } } */
+/* { dg-final 

Re: [RFC] fixincludes: vxworks: add hack around ioLib.h/unistd.h mutual inclusion

2018-06-25 Thread Rasmus Villemoes
On 2018-06-20 14:35, Martin Liška wrote:
> On 06/20/2018 02:19 PM, Tom de Vries wrote:
>> Hi,
>>
>> for make check-fixincludes I'm seeing:
>> ...
>> cmp: EOF on
>> /home/vries/gcc_versions/devel/src/fixincludes/tests/base/ioLib.h
>> *** ioLib.h 2018-06-20 14:14:40.035956737 +0200
>> --- /home/vries/gcc_versions/devel/src/fixincludes/tests/base/ioLib.h
>> 2018-06-20 14:14:28.183925247 +0200
>> ***
>> *** 17,24 
>>   #if defined( VXWORKS_WRITE_CONST_CHECK )
>>   extern int  write (int, const char*, size_t);
>>   #endif  /* VXWORKS_WRITE_CONST_CHECK */
>> -
>> -
>> - #if defined( VXWORKS_IOLIB_INCLUDE_UNISTD_CHECK )
>> - #include "unistd.h"
>> - #endif  /* VXWORKS_IOLIB_INCLUDE_UNISTD_CHECK */
>> --- 17,19 
>>
>> There were fixinclude test FAILURES
>> Makefile:176: recipe for target 'check' failed
>> make: *** [check] Error 1
>> ...
>>
>> Thanks,
>> - Tom
>>
> 
> I can confirm that.

Sorry about that. I completely missed that part of the README. I also
get the test failure now that I run make check. But the hunk I'm
seeing has <unistd.h>, not "unistd.h", which is also what I'd expect,
since that's how it should look after fixing.

Ah, it seems that fixincl.x was not regenerated when the inclhack.def
patch was applied, which is probably my fault as well, since I hadn't
included a proper changelog fragment.

Fortunately it seems that the fix is simply to add that hunk (with the
angle brackets) to tests/base/ioLib.h, now that fixincl.x has been
regenerated due to a later patch.

Again, sorry about all this (and for not responding sooner; I have
been on vacation).

Rasmus