Re: [PATCH] Remove unused libgfortran functions

2016-12-20 Thread Janne Blomqvist
On Tue, Dec 20, 2016 at 11:31 AM, FX  wrote:
>> I don't understand. Why would it imply a 1:1 mapping of release series
>> with major ABI versions?
>
> OK, I thought you meant to map libgfortran version numbers (libgfortran.so.7 
> with GCC 7). If it’s the gfortran.map node names, I’m happy with that indeed.
>
> Attached patch regtested on x86_64-apple-darwin16.3.0.
> OK to commit?

Ok, thanks!

>
>
>> Currently we have _gfortran_, that is with a single underscore in the
>> beginning, so it's not in the "C/POSIX reserved for the implementation
>> namespace". But yes, I agree that at least those functions documented
>> under the non-Fortran main program section in the manual should be
>> kept as is.
>
> Then, if we keep some functions under _gfortran_, I say let’s keep them all 
> there. It’s not hurting, and the few users who care have come to expect it.

Fair enough, works for me.




-- 
Janne Blomqvist


Re: [C PATCH] Fix handling side-effects of parameters (PR c/77767)

2016-12-20 Thread Joseph Myers
On Tue, 20 Dec 2016, Jakub Jelinek wrote:

> Hi!
> 
> We only record side-effects from the last parameter, the following patch
> fixes that by accumulating them from all the parameters.  I've checked all
> the callers of grokdeclarator and grokdeclarator is always called with expr
> either pointing to NULL_TREE, or being NULL, or where we want the
> side-effects to be accumulated.  Creating a STATEMENT_LIST instead of
> COMPOUND_EXPRs doesn't work because we want to mark_exp_read the resulting
> expression, which doesn't work for a statement list.  COMPOUND_EXPR is what
> is used in the two other places where side-effect expressions are
> accumulated - one in grokdeclarator, another one in declspecs_add_type.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v4] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 10:26:13PM +0100, Dominik Vogt wrote:
> On Tue, Dec 20, 2016 at 11:57:52AM -0800, Mike Stump wrote:
> > On Dec 20, 2016, at 6:10 AM, Dominik Vogt  wrote:
> > > Right, it gets called even more often than one would think, and
> > > even with empty torture_current_options.  The attached new patch
> > > (v3) removes -Ox options and superflous whitespace and caches that
> > > between calls if it's not empty.  There's another, permanent cache
> > > for calls without any flags.  With proper ordering of the torture
> > > options, the test program is built only a couple of times.
> > 
> > Seems fine to me, but most other cases use the postfix _hw.  Any
> > reason not use use _hw (and not _runable) on these?  If not,
> > could you please use _hw instead.
> 
> No specific reason other than lack of imagination.  "s390_hw" is a
> bit too generic in my eyes -> the new names are:
> 
> v4:
> 
>   * Renamed "s390_runnable" to "s390_useable_hw".
>   * Renamed "z900_runnable" to "s390_z900_hw",
> Renamed "z10_runnable" to "s390_z10_hw",
> etc.

Grepping for _hw in target-supports.exp reveals that usually the
effective target predicates are called _hw or _hw_available,
__hw only if it is too ambiguous (e.g. alpha_max_hw or
ppc_float128_hw_available).  So I think z900_hw, z10_hw etc. is good
enough (as long as it does not clash with some other target isa name),
s390_usable_hw or s390_hw_available is fine.

Jakub


[i386] New lea patterns for PR71321

2016-12-20 Thread Bernd Schmidt
The problem here is that we don't have complete coverage of lea patterns 
for HImode/QImode: the combiner can't recognize a (plus (ashift reg 2) 
reg) pattern it builds.


My first idea was to canonicalize ASHIFT by constant inside PLUS to 
MULT. The docs say that this is done only inside a MEM context, but that 
seems misguided considering the existence of lea patterns. Maybe that's 
something to revisit for gcc-8. In the meantime, the patch below is more 
like a minimal fix, and it seems to produce better results at the moment 
anyway.


Bootstrapped and tested on x86_64-linux. Ok?


Bernd
	PR target/71321
	* config/i386/i386.md (lea_general_2b, lea_general_3b): New
	patterns.
	* config/i386/predicates.md (const123_operand): New.

	PR target/71321
	* gcc.target/i386/pr71321.c: New test.

Index: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md	(revision 243548)
+++ gcc/config/i386/i386.md	(working copy)
@@ -6266,6 +6266,27 @@ (define_insn_and_split "*lea_gener
   [(set_attr "type" "lea")
(set_attr "mode" "SI")])
 
+(define_insn_and_split "*lea_general_2b"
+  [(set (match_operand:SWI12 0 "register_operand" "=r")
+	(plus:SWI12
+	  (ashift:SWI12 (match_operand:SWI12 1 "index_register_operand" "l")
+			(match_operand 2 "const123_operand" "n"))
+	  (match_operand:SWI12 3 "nonmemory_operand" "ri")))]
+  "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+	(plus:SI
+	  (ashift:SI (match_dup 1) (match_dup 2))
+	  (match_dup 3)))]
+{
+  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (SImode, operands[1]);
+  operands[3] = gen_lowpart (SImode, operands[3]);
+}
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
 (define_insn_and_split "*lea_general_3"
   [(set (match_operand:SWI12 0 "register_operand" "=r")
 	(plus:SWI12
@@ -6284,6 +6305,32 @@ (define_insn_and_split "*lea_gener
 	(match_dup 3))
 	  (match_dup 4)))]
 {
+  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (SImode, operands[1]);
+  operands[3] = gen_lowpart (SImode, operands[3]);
+  operands[4] = gen_lowpart (SImode, operands[4]);
+}
+  [(set_attr "type" "lea")
+   (set_attr "mode" "SI")])
+
+(define_insn_and_split "*lea_general_3b"
+  [(set (match_operand:SWI12 0 "register_operand" "=r")
+	(plus:SWI12
+	  (plus:SWI12
+	(ashift:SWI12 (match_operand:SWI12 1 "index_register_operand" "l")
+			  (match_operand 2 "const123_operand" "n"))
+	(match_operand:SWI12 3 "register_operand" "r"))
+	  (match_operand:SWI12 4 "immediate_operand" "i")))]
+  "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+	(plus:SI
+	  (plus:SI
+	(ashift:SI (match_dup 1) (match_dup 2))
+	(match_dup 3))
+	  (match_dup 4)))]
+{
   operands[0] = gen_lowpart (SImode, operands[0]);
   operands[1] = gen_lowpart (SImode, operands[1]);
   operands[3] = gen_lowpart (SImode, operands[3]);
Index: gcc/config/i386/predicates.md
===
--- gcc/config/i386/predicates.md	(revision 243548)
+++ gcc/config/i386/predicates.md	(working copy)
@@ -765,6 +765,14 @@ (define_predicate "const248_operand"
   return i == 2 || i == 4 || i == 8;
 })
 
+;; Match 1, 2, or 3.  Used for lea shift amounts.
+(define_predicate "const123_operand"
+  (match_code "const_int")
+{
+  HOST_WIDE_INT i = INTVAL (op);
+  return i == 1 || i == 2 || i == 3;
+})
+
 ;; Match 2, 3, 6, or 7
 (define_predicate "const2367_operand"
   (match_code "const_int")
Index: gcc/testsuite/gcc.target/i386/pr71321.c
===
--- gcc/testsuite/gcc.target/i386/pr71321.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr71321.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned char uint8_t;
+typedef unsigned int uint32_t;
+
+unsigned cvt_to_2digit(uint8_t i, uint8_t base)
+{
+  return ((i / base) | (uint32_t)(i % base)<<8);
+}
+unsigned cvt_to_2digit_ascii(uint8_t i)
+{
+  return cvt_to_2digit(i, 10) + 0x0a3030;
+}
+/* { dg-final { scan-assembler-times "lea.\t\\(%\[0-9a-z\]+,%\[0-9a-z\]+,4" 3 } } */
+/* { dg-final { scan-assembler-times "lea.\t\\(%\[0-9a-z\]+,%\[0-9a-z\]+,8" 1 } } */


Re: [PATCH v4] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Dominik Vogt
On Tue, Dec 20, 2016 at 11:57:52AM -0800, Mike Stump wrote:
> On Dec 20, 2016, at 6:10 AM, Dominik Vogt  wrote:
> > Right, it gets called even more often than one would think, and
> > even with empty torture_current_options.  The attached new patch
> > (v3) removes -Ox options and superflous whitespace and caches that
> > between calls if it's not empty.  There's another, permanent cache
> > for calls without any flags.  With proper ordering of the torture
> > options, the test program is built only a couple of times.
> 
> Seems fine to me, but most other cases use the postfix _hw.  Any
> reason not use use _hw (and not _runable) on these?  If not,
> could you please use _hw instead.

No specific reason other than lack of imagination.  "s390_hw" is a
bit too generic in my eyes -> the new names are:

v4:

  * Renamed "s390_runnable" to "s390_useable_hw".
  * Renamed "z900_runnable" to "s390_z900_hw",
Renamed "z10_runnable" to "s390_z10_hw",
etc.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog-archlevel

* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
__S390_ARCH_LEVEL__.
gcc/testsuite/ChangeLog-setmem

* gcc.target/s390/md/setmem_long-1.c: Use "s390_useable_hw".
* gcc.target/s390/md/rXsbg_mode_sXl.c: Likewise.
* gcc.target/s390/md/andc-splitter-1.c: Likewise.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
* lib/gcc-dg.exp (gcc-dg-runtest): Export torture_current_flags.
* gcc.target/s390/s390.exp: Import torture_current_flags.
(check_effective_target_s390_useable_hw): New.
(check_effective_target_s390_z900_hw): New.
(check_effective_target_s390_z990_hw): New.
(check_effective_target_s390_z9_ec_hw): New.
(check_effective_target_s390_z10_hw): New.
(check_effective_target_s390_z196_hw): New.
(check_effective_target_s390_zEC12_hw): New.
(check_effective_target_s390_z13_hw): New.
(check_effective_target_z10_instructions): Removed.
(torture tests): Add optimization level without -march=.
Reorder torture tests for good cache usage.
>From 8ae8d6f48fbede4188a7e02c835fcf9d0b535da7 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 13 Dec 2016 10:21:08 +0100
Subject: [PATCH] S/390: Run md tests only if the machine supports the
 instruction set.

---
 gcc/config/s390/s390-c.c   |  17 +++
 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c |  19 +--
 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c |  19 +--
 gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c  |   4 +-
 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c   |   7 +-
 gcc/testsuite/gcc.target/s390/s390.exp | 170 ++---
 gcc/testsuite/lib/gcc-dg.exp   |   2 +
 7 files changed, 198 insertions(+), 40 deletions(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index fcf7477..e841365 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -320,6 +320,8 @@ s390_cpu_cpp_builtins_internal (cpp_reader *pfile,
 {
   s390_def_or_undef_macro (pfile, MASK_OPT_HTM, old_opts, opts,
   "__HTM__", "__HTM__");
+  s390_def_or_undef_macro (pfile, MASK_OPT_VX, old_opts, opts,
+  "__S390_VX__", "__S390_VX__");
   s390_def_or_undef_macro (pfile, MASK_ZVECTOR, old_opts, opts,
   "__VEC__=10301", "__VEC__");
   s390_def_or_undef_macro (pfile, MASK_ZVECTOR, old_opts, opts,
@@ -328,6 +330,21 @@ s390_cpu_cpp_builtins_internal (cpp_reader *pfile,
   s390_def_or_undef_macro (pfile, MASK_ZVECTOR, old_opts, opts,
   "__bool=__attribute__((s390_vector_bool)) unsigned",
   "__bool");
+  {
+char macro_def[64];
+int arch_level;
+gcc_assert (s390_arch != PROCESSOR_NATIVE);
+arch_level = (int)s390_arch + 3;
+if (s390_arch >= PROCESSOR_2094_Z9_EC)
+  /* Z9_EC has the same level as Z9_109.  */
+  arch_level--;
+/* Review when a new arch is added and increase the value.  */
+char dummy[23 - 2 * PROCESSOR_max] __attribute__((unused));
+sprintf (macro_def, "__S390_ARCH_LEVEL__=%d", arch_level);
+cpp_undef (pfile, "__S390_ARCH_LEVEL__");
+cpp_define (pfile, macro_def);
+  }
+
   if (!flag_iso)
 {
   s390_def_or_undef_macro (pfile, MASK_ZVECTOR, old_opts, opts,
diff --git a/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c 
b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
index ed78921..3f0677c 100644
--- a/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
+++ b/gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
@@ -1,7 +1,8 @@
 /* Machine description pattern tests.  */
 
-/* { dg-do run { target { lp64 } } } */
+/* { dg-do compile { target { lp64 } } } */
 /* { dg-options "-mzarch -save-temps -dP" } */
+/* { dg-do run { target { lp64 && s390_useable_hw } } } */
 /* 

[C PATCH] Fix handling side-effects of parameters (PR c/77767)

2016-12-20 Thread Jakub Jelinek
Hi!

We only record side-effects from the last parameter, the following patch
fixes that by accumulating them from all the parameters.  I've checked all
the callers of grokdeclarator and grokdeclarator is always called with expr
either pointing to NULL_TREE, or being NULL, or where we want the
side-effects to be accumulated.  Creating a STATEMENT_LIST instead of
COMPOUND_EXPRs doesn't work because we want to mark_exp_read the resulting
expression, which doesn't work for a statement list.  COMPOUND_EXPR is what
is used in the two other places where side-effect expressions are
accumulated - one in grokdeclarator, another one in declspecs_add_type.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

PR c/77767
* c-decl.c (grokdeclarator): If *expr is non-NULL, append expression
to *expr instead of overwriting it.

* gcc.c-torture/execute/pr77767.c: New test.

--- gcc/c/c-decl.c.jj   2016-11-21 19:47:01.0 +0100
+++ gcc/c/c-decl.c  2016-12-20 12:01:14.518748131 +0100
@@ -5580,11 +5580,21 @@ grokdeclarator (const struct c_declarato
   if (TREE_CODE (type) == ERROR_MARK)
 return error_mark_node;
   if (expr == NULL)
-expr = _dummy;
+{
+  expr = _dummy;
+  expr_dummy = NULL_TREE;
+}
   if (expr_const_operands == NULL)
 expr_const_operands = _const_operands_dummy;
 
-  *expr = declspecs->expr;
+  if (declspecs->expr)
+{
+  if (*expr)
+   *expr = build2 (COMPOUND_EXPR, TREE_TYPE (declspecs->expr), *expr,
+   declspecs->expr);
+  else
+   *expr = declspecs->expr;
+}
   *expr_const_operands = declspecs->expr_const_operands;
 
   if (decl_context == FUNCDEF)
--- gcc/testsuite/gcc.c-torture/execute/pr77767.c.jj2016-12-20 
12:08:22.569163979 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr77767.c   2016-12-20 
11:45:03.0 +0100
@@ -0,0 +1,16 @@
+/* PR c/77767 */
+
+void
+foo (int a, int b[a++], int c, int d[c++])
+{
+  if (a != 2 || c != 2)
+__builtin_abort ();
+}
+
+int
+main ()
+{
+  int e[10];
+  foo (1, e, 1, e);
+  return 0;
+}

Jakub


[v3 PATCH] Implement 2801, Default-constructibility of unique_ptr.

2016-12-20 Thread Ville Voutilainen
Tested on Linux-x64.

The issue doesn't have a proposed resolution yet, so we can certainly wait
with this, but I have an inkling that this implementation is what the proposed
resolution must say. :)

2016-12-20  Ville Voutilainen  

Implement 2801, Default-constructibility of unique_ptr.
* include/bits/unique_ptr.h (unique_ptr()): Constrain.
(unique_ptr(pointer)): Likewise.
(unique_ptr(nullptr_t)): Likewise.
(unique_ptr<_Tp[], _Dp>::unique_ptr()): Likewise.
(unique_ptr<_Tp[], _Dp>::unique_ptr(_Up)): Likewise.
(unique_ptr<_Tp[], _Dp>::unique_ptr(nullptr_t)): Likewise.
* testsuite/20_util/unique_ptr/assign/48635_neg.cc: Adjust.
* testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc: Likewise.
* testsuite/20_util/unique_ptr/cons/default.cc: New.
* testsuite/20_util/unique_ptr/cons/ptr_deleter_neg.cc: Adjust.
diff --git a/libstdc++-v3/include/bits/unique_ptr.h 
b/libstdc++-v3/include/bits/unique_ptr.h
index 56e6ec0..63dff37 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -175,10 +175,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Constructors.
 
   /// Default constructor, creates a unique_ptr that owns nothing.
+  template >,
+is_default_constructible<_Up>>::value,
+ bool>::type = false>
   constexpr unique_ptr() noexcept
   : _M_t()
-  { static_assert(!is_pointer::value,
-"constructed with null function pointer deleter"); }
+  { }
 
   /** Takes ownership of a pointer.
*
@@ -186,11 +190,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*
* The deleter will be value-initialized.
*/
+  template >,
+is_default_constructible<_Up>>::value,
+ bool>::type = false>
   explicit
   unique_ptr(pointer __p) noexcept
   : _M_t(__p)
-  { static_assert(!is_pointer::value,
-"constructed with null function pointer deleter"); }
+  { }
 
   /** Takes ownership of a pointer.
*
@@ -218,6 +226,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  "rvalue deleter bound to reference"); }
 
   /// Creates a unique_ptr that owns nothing.
+  template >,
+is_default_constructible<_Up>>::value,
+ bool>::type = false>
   constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { }
 
   // Move constructors.
@@ -432,10 +445,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Constructors.
 
   /// Default constructor, creates a unique_ptr that owns nothing.
+  template >,
+is_default_constructible<_Up>>::value,
+ bool>::type = false>
   constexpr unique_ptr() noexcept
   : _M_t()
-  { static_assert(!std::is_pointer::value,
- "constructed with null function pointer deleter"); }
+  { }
 
   /** Takes ownership of a pointer.
*
@@ -445,13 +462,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* The deleter will be value-initialized.
*/
   template>,
+   is_default_constructible<_Vp>>::value,
+bool>::type = false,
+  typename = typename enable_if<
  __safe_conversion_raw<_Up>::value, bool>::type>
   explicit
   unique_ptr(_Up __p) noexcept
   : _M_t(__p)
-  { static_assert(!is_pointer::value,
- "constructed with null function pointer deleter"); }
+  { }
 
   /** Takes ownership of a pointer.
*
@@ -491,6 +512,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_t(__u.release(), std::forward(__u.get_deleter())) { }
 
   /// Creates a unique_ptr that owns nothing.
+  template >,
+is_default_constructible<_Up>>::value,
+ bool>::type = false>
   constexpr unique_ptr(nullptr_t) noexcept : unique_ptr() { }
 
   template ud(nullptr, d);
   ub = std::move(ud); // { dg-error "no match" }
   ub2 = ud; // { dg-error "no match" }
-// { dg-error "no type" "" { target *-*-* } 289 }
+// { dg-error "no type" "" { target *-*-* } 302 }
 
   std::unique_ptr uba(nullptr, b);
   std::unique_ptr uda(nullptr, d);
   uba = std::move(uda); // { dg-error "no match" }
-// { dg-error "no type" "" { target *-*-* } 540 }
+// { dg-error "no type" "" { target *-*-* } 566 }
 }
diff --git a/libstdc++-v3/testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc 
b/libstdc++-v3/testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc
index 3e6f41b..824f3c3 100644
--- a/libstdc++-v3/testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/unique_ptr/cons/cv_qual_neg.cc
@@ -39,7 +39,7 @@ test07()
   std::unique_ptr cA3(p); // { dg-error "no matching function" }
   std::unique_ptr vA3(p); // { dg-error "no matching function" }
   std::unique_ptr cvA3(p); // { dg-error "no matching 
function" }
-  

Re: [PATCH] Optimize X << Y with low bits of Y known to be 0 (PR tree-optimization/71563)

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 09:30:17PM +0100, Marc Glisse wrote:
> would it make sense to extend it to rotates later?

I wasn't 100% sure if rotates also require 0..prec-1 rotate counts, or
if they accept arbitrary ones.

> Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of a
> separate test.

That is what I tried first, but there is some bug in genmatch.c that
prevents it.  The:
 (for vec (VECTOR_CST CONSTRUCTOR)
  (simplify
   (shiftrotate @0 vec@1)
results in case SSA_NAME: being added to a switch:
case SSA_NAME:
  if (do_valueize (valueize, op1) != NULL_TREE)
{
  gimple *def_stmt = SSA_NAME_DEF_STMT (op1);
  if (gassign *def = dyn_cast  (def_stmt))
switch (gimple_assign_rhs_code (def))
  {
  case CONSTRUCTOR:
and the SSA_NAME@1 in another simplification resulted in another
case SSA_NAME:
into the same switch (rather than appending to the case SSA_NAME).

Jakub


Re: [PATCH] Optimize X << Y with low bits of Y known to be 0 (PR tree-optimization/71563)

2016-12-20 Thread Marc Glisse

On Tue, 20 Dec 2016, Jakub Jelinek wrote:


If the shift count has enough known zero low bits (non-zero bits only
above the ceil_log2 (precision)), then the only valid shift count that
is not out of bounds is 0, so we can as well fold it into the first
argument of the shift.  This resolves a regression introduced by partly
optimizing it at the gimple level, which results in it not being optimized
at the RTL level that managed to do that completely.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

PR tree-optimization/71563
* match.pd: Simplify X << Y into X if Y is known to be 0 or
out of range value - has low bits known to be zero.


Hello,

would it make sense to extend it to rotates later?

Note that you can write (shift @0 SSA_NAME@1) in the pattern instead of a 
separate test.


--
Marc Glisse


Re: [PR c++/78572] handle array self references in intializers

2016-12-20 Thread Nathan Sidwell

On 12/20/2016 01:52 PM, Aldy Hernandez wrote:


int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };

(gdb) x/4x 
0x601040 :   0x0005  0x0111  0x0222
0x0005

That is, the array[3]=5 overwrites the last 0x333.  I would expect that...


That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the 
initializer-list of a braced-init-list, the initializer-clauses, 
including any that result from pack expansions (14.5.3), are evaluated 
in the order in which they appear.'
It goes on to mention that there are sequence points, plus that the 
ordering holds regardless of the semantics of initialization.


So by my reading, the effects of the above list are:
a) assign 5 to array[3]
b) assign 5 to array[0] -- consume the first initializer
c) assign x111 to array[1] -- consume the second initializer
d) assign 0x222 to array[2] -- consume the third initializer
e) assign 0x333 to array[3] -- consume the fourth intializer,
   overwrite #a

nathan
--
Nathan Sidwell


Re: [PATCH v3] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Mike Stump
On Dec 20, 2016, at 6:10 AM, Dominik Vogt  wrote:
> Right, it gets called even more often than one would think, and
> even with empty torture_current_options.  The attached new patch
> (v3) removes -Ox options and superflous whitespace and caches that
> between calls if it's not empty.  There's another, permanent cache
> for calls without any flags.  With proper ordering of the torture
> options, the test program is built only a couple of times.

Seems fine to me, but most other cases use the postfix _hw.  Any reason not use 
use _hw (and not _runable) on these?  If not, could you please use _hw instead.



[PATCH] Optimize X << Y with low bits of Y known to be 0 (PR tree-optimization/71563)

2016-12-20 Thread Jakub Jelinek
Hi!

If the shift count has enough known zero low bits (non-zero bits only
above the ceil_log2 (precision)), then the only valid shift count that
is not out of bounds is 0, so we can as well fold it into the first
argument of the shift.  This resolves a regression introduced by partly
optimizing it at the gimple level, which results in it not being optimized
at the RTL level that managed to do that completely.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

PR tree-optimization/71563
* match.pd: Simplify X << Y into X if Y is known to be 0 or
out of range value - has low bits known to be zero.

* gcc.dg/tree-ssa/pr71563.c: New test.

--- gcc/match.pd.jj 2016-12-10 13:05:39.0 +0100
+++ gcc/match.pd2016-12-20 15:44:30.892704283 +0100
@@ -1497,6 +1497,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (tem)
  (shiftrotate @0 { tem; }))
 
+/* Simplify X << Y where Y's low width bits are 0 to X, as only valid
+   Y is 0.  Similarly for X >> Y.  */
+#if GIMPLE
+(for shift (lshift rshift)
+ (simplify
+  (shift @0 @1)
+   (if (TREE_CODE (@1) == SSA_NAME && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
+(with {
+  int width = ceil_log2 (element_precision (TREE_TYPE (@0)));
+  int prec = TYPE_PRECISION (TREE_TYPE (@1));
+ }
+ (if ((get_nonzero_bits (@1) & wi::mask (width, false, prec)) == 0)
+  @0)
+#endif
+
 /* Rewrite an LROTATE_EXPR by a constant into an
RROTATE_EXPR by a new constant.  */
 (simplify
--- gcc/testsuite/gcc.dg/tree-ssa/pr71563.c.jj  2016-12-20 15:57:16.624722177 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr71563.c 2016-12-20 15:57:01.0 
+0100
@@ -0,0 +1,23 @@
+/* PR tree-optimization/71563 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void link_error (void);
+
+void
+foo (int k)
+{
+  int t = 1 << ((1 / k) << 8);
+  if (t != 1)
+link_error ();
+}
+
+void
+bar (int k, int l)
+{
+  int t = l << (k << 8);
+  if (t != l)
+link_error ();
+}
+
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */

Jakub


[C++ PATCH] Error on shadowing a parameter by anon union member in the same scope (PR c++/72707)

2016-12-20 Thread Jakub Jelinek
Hi!

DECL_ANON_UNION_VAR_P vars are DECL_ARTIFICIAL, but we still to diagnose
them if they shadow something.  The DECL_ARTIFICIAL (x) check has been
missing in older gcc releases, so we diagnosed that properly.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

PR c++/72707
* name-lookup.c (pushdecl_maybe_friend_1): Do check shadowing of
artificial x if it is an anonymous union variable.

* g++.dg/warn/Wshadow-12.C: New test.

--- gcc/cp/name-lookup.c.jj 2016-11-30 08:57:21.0 +0100
+++ gcc/cp/name-lookup.c2016-12-20 14:42:35.482232062 +0100
@@ -,8 +,10 @@ pushdecl_maybe_friend_1 (tree x, bool is
|| TREE_CODE (x) == TYPE_DECL)))
/* Don't check for internally generated vars unless
   it's an implicit typedef (see create_implicit_typedef
-  in decl.c).  */
-  && (!DECL_ARTIFICIAL (x) || DECL_IMPLICIT_TYPEDEF_P (x)))
+  in decl.c) or anonymous union variable.  */
+  && (!DECL_ARTIFICIAL (x)
+  || DECL_IMPLICIT_TYPEDEF_P (x)
+  || (VAR_P (x) && DECL_ANON_UNION_VAR_P (x
{
  bool nowarn = false;
 
--- gcc/testsuite/g++.dg/warn/Wshadow-12.C.jj   2016-12-20 14:41:28.083114644 
+0100
+++ gcc/testsuite/g++.dg/warn/Wshadow-12.C  2016-12-20 14:40:19.0 
+0100
@@ -0,0 +1,9 @@
+// PR c++/72707
+// { dg-do compile }
+
+void
+foo (double x)
+{
+  union { int x; };// { dg-error "shadows a parameter" }
+  x = 0;
+}

Jakub


[PATCH] Don't bootstrap libmpx unless needed

2016-12-20 Thread Jakub Jelinek
Hi!

Similarly how we deal with bootstrapping libsanitizer only when
doing bootstrap-{a,u}san bootstrap, this avoids bootstrapping libmpx
if we don't need it for bootstrapping.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

* configure.ac: Don't bootstrap libmpx unless --with-build-config
includes bootstrap-mpx.
* configure: Regenerated.

--- configure.ac.jj12016-12-01 23:24:53.0 +0100
+++ configure.ac2016-12-20 10:50:08.715213438 +0100
@@ -2643,9 +2643,14 @@ if echo " ${target_configdirs} " | grep
   bootstrap_target_libs=${bootstrap_target_libs}target-libvtv,
 fi
 
-# If we are building libmpx, bootstrap it.
+# If we are building libmpx and $BUILD_CONFIG contains bootstrap-mpx,
+# bootstrap it.
 if echo " ${target_configdirs} " | grep " libmpx " > /dev/null 2>&1; then
-  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
+  case "$BUILD_CONFIG" in
+*bootstrap-mpx* )
+  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
+  ;;
+  esac
 fi
 
 # Determine whether gdb needs tk/tcl or not.
--- configure.jj1   2016-12-02 00:15:10.0 +0100
+++ configure   2016-12-20 10:50:22.503034682 +0100
@@ -7057,9 +7057,14 @@ if echo " ${target_configdirs} " | grep
   bootstrap_target_libs=${bootstrap_target_libs}target-libvtv,
 fi
 
-# If we are building libmpx, bootstrap it.
+# If we are building libmpx and $BUILD_CONFIG contains bootstrap-mpx,
+# bootstrap it.
 if echo " ${target_configdirs} " | grep " libmpx " > /dev/null 2>&1; then
-  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
+  case "$BUILD_CONFIG" in
+*bootstrap-mpx* )
+  bootstrap_target_libs=${bootstrap_target_libs}target-libmpx,
+  ;;
+  esac
 fi
 
 # Determine whether gdb needs tk/tcl or not.

Jakub


[PATCH] Replace DW_FORM_ref_sup with DW_FORM_ref_sup{4,8}

2016-12-20 Thread Jakub Jelinek
Hi!

Recently DW_FORM_ref_sup (which is meant e.g. for dwz, gcc doesn't emit it)
has been renamed to DW_FORM_ref_sup4 (and changed so that it is always 4
byte) and DW_FORM_ref_sup8 (always 8 byte) has been added.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-12-20  Jakub Jelinek  

* dwarf2.def (DW_FORM_ref_sup): Renamed to ...
(DW_FORM_ref_sup4): ... this.  New form.
(DW_FORM_ref_sup8): New form.

--- include/dwarf2.def.jj   2016-10-31 13:28:05.0 +0100
+++ include/dwarf2.def  2016-12-19 11:40:07.303525953 +0100
@@ -212,13 +212,14 @@ DW_FORM (DW_FORM_ref_sig8, 0x20)
 /* DWARF 5.  */
 DW_FORM (DW_FORM_strx, 0x1a)
 DW_FORM (DW_FORM_addrx, 0x1b)
-DW_FORM (DW_FORM_ref_sup, 0x1c)
+DW_FORM (DW_FORM_ref_sup4, 0x1c)
 DW_FORM (DW_FORM_strp_sup, 0x1d)
 DW_FORM (DW_FORM_data16, 0x1e)
 DW_FORM (DW_FORM_line_strp, 0x1f)
 DW_FORM (DW_FORM_implicit_const, 0x21)
 DW_FORM (DW_FORM_loclistx, 0x22)
 DW_FORM (DW_FORM_rnglistx, 0x23)
+DW_FORM (DW_FORM_ref_sup8, 0x24)
 /* Extensions for Fission.  See http://gcc.gnu.org/wiki/DebugFission.  */
 DW_FORM (DW_FORM_GNU_addr_index, 0x1f01)
 DW_FORM (DW_FORM_GNU_str_index, 0x1f02)

Jakub


Re: [PR c++/78572] handle array self references in intializers

2016-12-20 Thread Aldy Hernandez

On 12/20/2016 11:48 AM, Nathan Sidwell wrote:

On 12/20/2016 11:25 AM, Aldy Hernandez wrote:

The problem in this PR is that we're trying to initialize an array with
members of itself:



Jakub has even gone further to show that for the following:

... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };

things get even worse, because we generate code to write twice into [3]:

{[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}


I looked at this a couple of weeks ago, and got confused.  It wasn't
very clear to me how the side-effect assignments interact with any
implict zero assignments.  (Especially if one then tries to support
C-style designated initializers).

IIUC the side effects of an initializer are evaluated before the storage
of the initializer itself (separated by a sequence point).  further it
seems that default zero initialization of non-explicitly initialized
elements happens after any side-effected store to that element (and
hence zaps the sideeffect).  The rules for non-aggregates appear to be
as-if one builds a temporary object and then copy-constructs it into the
object -- clearly zapping any and all side-effects.  However for
aggregate initializations it looked like elements were initialized in
order -- so the side effect on a later element could overwrite the
initialization of an earlier element.

It wasn't entirely clear.


Looks like things are happening in the right order for this testcase. 
That is, the zeroing happens by virtue of it being in the global 
section, and the rest of the runtime initialization happening afterwards:


abulafia:/build/t/gcc$ cat a.cc
int array[10] = { array[3]=5, array[7]=3 };

int main()
{
  return 0;
}

assembly:
-
.size   array, 40
array:
.zero   40

...
...

abulafia:/build/t/gcc$ gdb -n a.out -q
Reading symbols from a.out...(no debugging symbols found)...done.
(gdb) b main
Breakpoint 1 at 0x40055b
(gdb) r
Starting program: /home/build/t/gcc/a.out

Breakpoint 1, 0x0040055b in main ()
(gdb) x/4x 
0x601080 :   0x0005  0x0003  0x 
0x0005

(gdb)

The above looks correct.

Now for the next example below I get what I intuitively expect, though I 
don't know if that's how it's defined in the standard:


int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };

(gdb) x/4x 
0x601040 :   0x0005  0x0111  0x0222 
0x0005


That is, the array[3]=5 overwrites the last 0x333.  I would expect that...

Is any of this incorrect?

Aldy


Re: [PATCH, ARM] Further improve stack usage in sha512, part 2 (PR 77308)

2016-12-20 Thread Bernd Edlinger
On 12/20/16 16:09, Wilco Dijkstra wrote:
> Bernd Edlinger wrote:
>> this splits the *arm_negdi2, *arm_cmpdi_insn and *arm_cmpdi_unsigned
>> also at split1 except for TARGET_NEON and TARGET_IWMMXT.
>>
>> In the new test case the stack is reduced to about 270 bytes, except
>> for neon and iwmmxt, where this does not change anything.
>
> This looks odd:
>
> -operands[2] = gen_lowpart (SImode, operands[2]);
> +if (can_create_pseudo_p ())
> +  operands[2] = gen_reg_rtx (SImode);
> +else
> +  operands[2] = gen_lowpart (SImode, operands[2]);
>
> Given this is an SI mode scratch, do we need the else part at all? It seems 
> wrong
> to ask for the low part of an SI mode operand...
>

Yes, I think that is correct.

> Other than that it looks good to me, but I can't approve.
>
> As a result of your patches a few patterns are unused now. All the Thumb-2 
> iordi_notdi*
> patterns cannot be used anymore. Also I think arm_cmpdi_zero never gets used 
> - a DI
> mode compare with zero is always split into ORR during expand.
>

I did not change anything for -mthumb -mfpu=neon for instance.
Do you think that iordi_notdi* is never used also for that
configuration?

And if the arm_cmpdi_zero is never expanded, isn't it already
unused before my patch?


Bernd.


Re: [PATCH, testsuite] Add -fno-sched-pressure to a couple sms-*.c tests for powerpc

2016-12-20 Thread Segher Boessenkool
On Tue, Dec 20, 2016 at 11:26:02AM -0600, Pat Haugen wrote:
> gcc.dg/sms-3.c and gcc.dg/sms-6.c fail on powerpc when -fsched-pressure is 
> used. The -fsched-pressure option changes the value returned by 
> rs6000_issue_rate which in turn affects the computed initiation interval in 
> the SMS code and leads to failure to modulo schedule the single loop in 
> sms-3.c and 1 of the 3 loops in sms-6.c. This patch adds that option for 
> powerpc and is a precursor to a patch I'll submit shortly that enables 
> -fsched-pressure by default for the rs6000 port.
> 
> I have verified the updated tests pass both before and after my patch to 
> enable -fsched-pressure is applied. Ok for trunk?

Yes please.  Thanks,


Segher


> testsuite/ChangeLog:
> 2016-12-20  Pat Haugen  
> 
>   * gcc.dg/sms-3.c: Add -fno-sched-pressure for powerpc.
>   * gcc.dg/sms-6.c: Likewise.


Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2016-12-20 Thread Richard Biener
On December 20, 2016 4:50:25 PM GMT+01:00, Jeff Law  wrote:
>On 12/20/2016 07:16 AM, Richard Biener wrote:
>>
>> it should be already set as we use visited_ssa as "was it ever on the
>worklist",
>> so maybe renaming it would be a good thing as well.
>>
>> + if (TREE_CODE (name) == SSA_NAME)
>> +   {
>> + /* If an SSA has already been seen, this may be a
>loop.
>> +Fail conservatively.  */
>> + if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION
>(name)))
>> +   return false;
>>
>> so to me "conservative" is returning true, not false.
>Agreed.
>
>>
>> + bitmap_set_bit (visited_ssa, SSA_NAME_VERSION
>(name));
>> + worklist.safe_push (name);
>>
>> but for loops we can just continue and ignore this use.  And
>bitmap_set_bit
>> returns whether it set a bit, thus
>>
>> if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION
>(name)))
>>   worklist.safe_push (name);
>>
>> should work?
>What if we're in an irreducible region?

Handling all back edges by deferring to their defs should work.  At least I 
can't see how it would not.

>
>>
>> In principle the thing is sound but I'd like to be able to pass in a
>bitmap of
>> known maybe-undefined/must-defined SSA names to have a cache for
>> multiple invocations from within a pass (like this unswitching case).
>So that means keeping another bitmap for things positively identified
>as 
>defined, then saving it for later invocations.

We eventually need a tristate here for maximum caching.  And as the result 
depends on the dominating BB of the postdom region the savings might be 
questionable.

>  So maybe some of my
>work 
>+ some of his melded together.  (mine computes once for the entire FN 
>and saves the result, so converting it to on-demand saving the result 
>shouldn't be terrible).
>
>
>>
>> Only unswitching on conditions that post-dominate the loop entry
>might be a
>> simpler solution for the PR in question.  OTOH this may disable too
>much
>> unswitching in practice.
>It might be worth experimentation.
>
>Jeff




Re: [PATCH][tree-ssa-address] Use simplify_gen_binary in gen_addr_rtx

2016-12-20 Thread Richard Biener
On December 20, 2016 5:01:19 PM GMT+01:00, Kyrill Tkachov 
 wrote:
>Hi all,
>
>The testcase in this patch generates bogus assembly for arm with -O1
>-mfloat-abi=soft:
>strdr4, [#0, r3]
>
>This is due to non-canonical RTL being generated during expansion:
>(set (mem:DI (plus:SI (const_int 0 [0])
>   (reg/f:SI 153)) [0 MEM[symbol: a, index: _26, offset: 0B]+0 S8 A64])
> (reg:DI 154))
>
>Note the (plus (const_int 0) (reg)). This is being generated in
>gen_addr_rtx in tree-ssa-address.c
>where it creates an explicit PLUS rtx through gen_rtx_PLUS, which
>doesn't try to canonicalise its arguments
>or simplify. The correct thing to do is to use simplify_gen_binary that
>will handle all this properly.

But it has to match up the validity check which passes down exactly the same 
RTL(?)  Or does this stem from propagation simplifying a MEM after IVOPTs?

>I didn't change the other gen_rtx_PLUS calls in this function as their
>results is later used in XEXP operations
>that seem to rely on a PLUS expression being explicitly produced, but
>this particular call doesn't, so it's okay
>to change it. With this patch the sane assembly is generated:
> strdr4, [r3]
>
>Bootstrapped and tested on arm-none-linux-gnueabihf, x86_64,
>aarch64-none-linux-gnu.
>
>Ok for trunk?
>
>Thanks,
>Kyrill
>
>2016-12-20  Kyrylo Tkachov  
>
>* tree-ssa-address.c (gen_addr_rtx): Use simplify_gen_binary to add
> *addr to act_elem.
>
>2016-12-20  Kyrylo Tkachov  
>
> * gcc.dg/20161219.c: New test.




[PATCH,rs6000] Fix PR11488 for rs6000 target

2016-12-20 Thread Pat Haugen
This patch attempts to fix problems with the first scheduling pass creating too 
much register pressure. It does this by enabling the target hook to compute the 
pressure classes for rs6000 target since the first thing I observed while 
investigating the testcase in the subject PR is that IRA was picking 
NON_SPECIAL_REGS as a pressure class which led to the sched-pressure code 
computing too high of a value for number of regs available for pseudos 
preferring GENERAL_REGS. It also enables -fsched-pressure by default, using the 
'model' algorithm.

I ran various runs of cpu20006 to determine the set of pressure classes and 
which sched-pressure algorithm to use. Net result is that with these patches I 
see 6 benchmarks improve in the 2.4-6% range but there are also a couple 2% 
degradations which will need follow up in GCC 8. There was also one benchmark 
that showed a much bigger improvement with the 'weighted' sched-pressure 
algorithm that also needs follow up ('weighted' was not chosen as default since 
it showed more degradations).

Bootstrap/regtest on powerpc64/powerpc64le. There were 2 testcases that failed 
(sms-3.c/sms-6.c) but I have submitted a separate patch to fix those. Ok for 
trunk?

-Pat


2016-12-20  Pat Haugen  

PR rtl-optimization/11488
* common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table): Enable -fsched-pressure.
* config/rs6000/rs6000.c (TARGET_COMPUTE_PRESSURE_CLASSES): Define
target hook.
(rs6000_option_override_internal): Set default -fsched-pressure 
algorithm.
(rs6000_compute_pressure_classes): Implement target hook.
Index: common/config/rs6000/rs6000-common.c
===
--- common/config/rs6000/rs6000-common.c	(revision 243651)
+++ common/config/rs6000/rs6000-common.c	(working copy)
@@ -32,6 +32,8 @@
 static const struct default_options rs6000_option_optimization_table[] =
   {
 { OPT_LEVELS_1_PLUS, OPT_fomit_frame_pointer, NULL, 1 },
+/* Enable -fsched-pressure for first pass instruction scheduling.  */
+{ OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 
Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 243651)
+++ config/rs6000/rs6000.c	(working copy)
@@ -1807,6 +1807,9 @@ static const struct attribute_spec rs600
 #undef TARGET_LRA_P
 #define TARGET_LRA_P rs6000_lra_p
 
+#undef TARGET_COMPUTE_PRESSURE_CLASSES
+#define TARGET_COMPUTE_PRESSURE_CLASSES rs6000_compute_pressure_classes
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE rs6000_can_eliminate
 
@@ -5100,6 +5103,12 @@ rs6000_option_override_internal (bool gl
 			 global_options.x_param_values,
 			 global_options_set.x_param_values);
 
+  /* Use the 'model' -fsched-pressure algorithm by default.  */
+  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM,
+			 SCHED_PRESSURE_MODEL,
+			 global_options.x_param_values,
+			 global_options_set.x_param_values);
+
   /* If using typedef char *va_list, signal that
 	 __builtin_va_start (, 0) can be optimized to
 	 ap = __builtin_next_arg (0).  */
@@ -37450,6 +37459,32 @@ rs6000_lra_p (void)
   return TARGET_LRA;
 }
 
+/* Compute register pressure classes.  We implement the target hook to avoid
+   IRA picking something like NON_SPECIAL_REGS as a pressure class, which can
+   lead to incorrect estimates of number of available registers and therefor
+   increased register pressure/spill.   */
+static int
+rs6000_compute_pressure_classes (enum reg_class *pressure_classes)
+{
+  int n;
+
+  n = 0;
+  pressure_classes[n++] = GENERAL_REGS;
+  if (TARGET_VSX)
+pressure_classes[n++] = VSX_REGS;
+  else
+{
+  if (TARGET_ALTIVEC)
+	pressure_classes[n++] = ALTIVEC_REGS;
+  if (TARGET_HARD_FLOAT && TARGET_FPRS)
+	pressure_classes[n++] = FLOAT_REGS;
+}
+  pressure_classes[n++] = CR_REGS;
+  pressure_classes[n++] = SPECIAL_REGS;
+
+  return n;
+}
+
 /* Given FROM and TO register numbers, say whether this elimination is allowed.
Frame pointer elimination is automatically handled.
 


[PATCH, testsuite] Add -fno-sched-pressure to a couple sms-*.c tests for powerpc

2016-12-20 Thread Pat Haugen
gcc.dg/sms-3.c and gcc.dg/sms-6.c fail on powerpc when -fsched-pressure is 
used. The -fsched-pressure option changes the value returned by 
rs6000_issue_rate which in turn affects the computed initiation interval in the 
SMS code and leads to failure to modulo schedule the single loop in sms-3.c and 
1 of the 3 loops in sms-6.c. This patch adds that option for powerpc and is a 
precursor to a patch I'll submit shortly that enables -fsched-pressure by 
default for the rs6000 port.

I have verified the updated tests pass both before and after my patch to enable 
-fsched-pressure is applied. Ok for trunk?

-Pat


testsuite/ChangeLog:
2016-12-20  Pat Haugen  

* gcc.dg/sms-3.c: Add -fno-sched-pressure for powerpc.
* gcc.dg/sms-6.c: Likewise.
Index: gcc.dg/sms-3.c
===
--- gcc.dg/sms-3.c	(revision 243651)
+++ gcc.dg/sms-3.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves" } */
+/* { dg-options "-O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves -fno-sched-pressure" { target powerpc*-*-* } } */
 
 extern void abort (void);
 
Index: gcc.dg/sms-6.c
===
--- gcc.dg/sms-6.c	(revision 243651)
+++ gcc.dg/sms-6.c	(working copy)
@@ -1,7 +1,7 @@
 /* { dg-do run } */
 /* { dg-require-effective-target size32plus } */
 /* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms --param sms-min-sc=1" } */
-/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves" { target powerpc*-*-* } } */
+/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves -fno-sched-pressure" { target powerpc*-*-* } } */
 
 extern void abort (void);
 


Re: [PATCH][gimplefe] Improve error recovery

2016-12-20 Thread Joseph Myers
On Tue, 20 Dec 2016, Richard Biener wrote:

> Just noticed a few issues when feeding the GIMPLE FE random -gimple
> dumps.  On errors not skipping to expected tokens leads to a load
> of strange followup parsing errors and worse, to endless parsing
> attempts in one case.
> 
> Fixed with the following.
> 
> Bootstrap / regtest running together with the pass manager change
> posted in the other thread.
> 
> I consider gimple-parser.c changes like this "middle-end", Joseph,
> are you fine with that?

I'm fine with that.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PR c++/78572] handle array self references in intializers

2016-12-20 Thread Nathan Sidwell

On 12/20/2016 11:25 AM, Aldy Hernandez wrote:

The problem in this PR is that we're trying to initialize an array with
members of itself:



Jakub has even gone further to show that for the following:

... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };

things get even worse, because we generate code to write twice into [3]:

{[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}


I looked at this a couple of weeks ago, and got confused.  It wasn't 
very clear to me how the side-effect assignments interact with any 
implict zero assignments.  (Especially if one then tries to support 
C-style designated initializers).


IIUC the side effects of an initializer are evaluated before the storage 
of the initializer itself (separated by a sequence point).  further it 
seems that default zero initialization of non-explicitly initialized 
elements happens after any side-effected store to that element (and 
hence zaps the sideeffect).  The rules for non-aggregates appear to be 
as-if one builds a temporary object and then copy-constructs it into the 
object -- clearly zapping any and all side-effects.  However for 
aggregate initializations it looked like elements were initialized in 
order -- so the side effect on a later element could overwrite the 
initialization of an earlier element.


It wasn't entirely clear.

nathan
--
Nathan Sidwell


[ARM][committed] Fix for PR78255-2.c testism for targets that do not optimize for tailcall

2016-12-20 Thread Andre Vieira (lists)
On 12/12/16 09:04, Christophe Lyon wrote:
>>
> 
> The new test (gcc.target/arm/pr78255-2.c scan-assembler b\\s+bar)
> added at r243494 fails on old arm architectures, such as:
> * arm-none-linux-gnueabi, forcing -march=armv5t in runtestflags
> * arm-none-eabi with default cpu/fpu/mode
> 
> Christophe
Hi,

Thank you for reporting this Christophe and sorry for the delay. The
scan check obviously will not work for targets that do not optimize
tailcalls. So I applied this patch as obvious in revision r243826, such
that the test also accepts direct non-tailcalls, i.e. 'bl?' rather than 'b'.

The test is really to make sure a direct call is not turned into an
indirect call.

gcc/testsuite/ChangeLog:
2016-12-20 Andre Vieira 

* gcc.target/arm/pr78255-2.c: Fix to work for targets
that do not optimize for tailcall.

diff --git a/gcc/testsuite/gcc.target/arm/pr78255-2.c 
b/gcc/testsuite/gcc.target/arm/pr78255-2.c
index 
efa01e750b3962497cccb05ab9862fd3935397a3..cc1c1801c37ee103da90df940a673ceeac2772ed
 100644
--- a/gcc/testsuite/gcc.target/arm/pr78255-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr78255-2.c
@@ -9,4 +9,4 @@ foo (void)
   return bar ((void*)bar);
 }
 
-/* { dg-final { scan-assembler "b\\s+bar" } } */
+/* { dg-final { scan-assembler "bl?\\s+bar" } } */


[PR c++/78572] handle array self references in intializers

2016-12-20 Thread Aldy Hernandez
The problem in this PR is that we're trying to initialize an array with 
members of itself:


int array[10] = { array[3]=5, array[7]=3 };

The C++ front-end, in store_init_value() via cxx_constant_value() and 
friends will transform:


{array[3] = 5, array[7] = 3}

into:

{[3]=5, [0]=5, [7]=3, [1]=3}

which looks invalid to output_constructor*(), presumably because it 
isn't sorted by increasing index.


Jakub has even gone further to show that for the following:

... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };

things get even worse, because we generate code to write twice into [3]:

{[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}

I took the easy way in cxx_eval_store_expression() and marked the 
expression as non_constant_p if we're trying to set an array from 
members of itself.  This causes us to to generate the initialization of 
the self-referencing array[sub] elements as a CLEANUP_POINT_EXPR:


<;
<;

Ultimately this yields correct code:

array:
.zero   40
...
...

_GLOBAL__sub_I_array:
.LFB1:
.cfi_startproc
movl$5, array+12(%rip)
movl$5, array(%rip)
movl$3, array+28(%rip)
movl$3, array+4(%rip)
ret

Is this approach acceptable, or should we be marking this as 
non-constant somewhere else in the chain?  Or perhaps another approach 
would be to handle such constructor holes in the in the varasm code?


Aldy
commit 295d93f60bcbec5b9959a7b3656f10aa0df71c9f
Author: Aldy Hernandez 
Date:   Tue Dec 20 06:13:26 2016 -0500

PR c++/78572
* constexpr.c (cxx_eval_store_expression): Avoid array self
references in initialization.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index aedd004..ac279ad 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3313,6 +3313,15 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
}
 }
 
+  /* Initializing an array from a variant of itself is a non-constant.
+ This avoids attemps at generating incorrect self references in
+ something like: int foo[10] = { stuff[3]=8 }.  */
+  if (TREE_CODE (target) == ARRAY_REF && object == ctx->object)
+{
+  *non_constant_p = true;
+  return t;
+}
+
   /* And then find/build up our initializer for the path to the subobject
  we're initializing.  */
   tree *valp;
diff --git a/gcc/testsuite/g++.dg/pr78572.C b/gcc/testsuite/g++.dg/pr78572.C
new file mode 100644
index 000..82ab4e3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr78572.C
@@ -0,0 +1,9 @@
+// { dg-do compile } */
+
+static int array[10] = { array[3]=5, array[7]=3 };
+
+int
+main ()
+{
+  return 0;
+}


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 05:04:54PM +0100, Andre Vehreschild wrote:
> Well, then how about:
> 
> #define gfc_size_t_zero_node build_int_cst (size_type_node, 0)
> 
> We can't get any faster and for new and old gfortran-hackers one identifier's
> meaning is faster to grasp than two's.

Such macros can cause various maintenance issues, so I'm not in favor of
that.  But if you as fortran maintainers want it, I won't object strongly.

Jakub


Re: [PATCH, Fortran, alloc_poly, v2] Fix allocation of memory for polymorphic assignment

2016-12-20 Thread Andre Vehreschild
Hi Janus,

> 1) After adding that code block in gfc_trans_assignment_1, it seems
> like the comment above is outdated, right?

Thanks for noting.

> 2) Wouldn't it be better to move this block, which does the correct
> allocation for CLASS variables, into
> "alloc_scalar_allocatable_for_assignment", where the allocation for
> all other cases is done?

I tried to, but that would have meant to extend the interface of
alloc_scalar_allocatable_for_assignment significantly, while at the location
where I finally added the code, I could use the data available. Secondly
putting the malloc at the correct location is not possible at alloc_scalar_...
because the pre-blocks have already been joined to the body. That way the
malloc was always placed either before even the vptr was set, or after the data
was copied. Both options were quite hazardous. 

I now went to add the allocation into trans_class_assignment (). This allows
even more reuse of already present and needed data, e.g., the vptr.

Bootstrapped and regtested ok on x86_64-linux/f23. Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/testsuite/ChangeLog:

2016-12-20  Andre Vehreschild  

* gfortran.dg/class_assign_1.f08: New test.


gcc/fortran/ChangeLog:

2016-12-20  Andre Vehreschild  

* trans-expr.c (trans_class_assignment): Allocate memory of _vptr->size
before assigning an allocatable class object.
(gfc_trans_assignment_1): Flag that (re-)alloc of the class object
shall be done.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index cbff9ae..ce7927c 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -9635,17 +9635,38 @@ is_runtime_conformable (gfc_expr *expr1, gfc_expr *expr2)
 
 static tree
 trans_class_assignment (stmtblock_t *block, gfc_expr *lhs, gfc_expr *rhs,
-			gfc_se *lse, gfc_se *rse, bool use_vptr_copy)
+			gfc_se *lse, gfc_se *rse, bool use_vptr_copy,
+			bool class_realloc)
 {
-  tree tmp;
-  tree fcn;
-  tree stdcopy, to_len, from_len;
+  tree tmp, fcn, stdcopy, to_len, from_len, vptr;
   vec *args = NULL;
 
-  tmp = trans_class_vptr_len_assignment (block, lhs, rhs, rse, _len,
+  vptr = trans_class_vptr_len_assignment (block, lhs, rhs, rse, _len,
 	 _len);
 
-  fcn = gfc_vptr_copy_get (tmp);
+  /* Generate allocation of the lhs.  */
+  if (class_realloc)
+{
+  stmtblock_t alloc;
+  tree class_han;
+
+  tmp = gfc_vptr_size_get (vptr);
+  class_han = GFC_CLASS_TYPE_P (TREE_TYPE (lse->expr))
+	  ? gfc_class_data_get (lse->expr) : lse->expr;
+  gfc_init_block ();
+  gfc_allocate_using_malloc (, class_han, tmp, NULL_TREE);
+  tmp = fold_build2_loc (input_location, EQ_EXPR,
+			 boolean_type_node, class_han,
+			 build_int_cst (prvoid_type_node, 0));
+  tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node,
+			 gfc_unlikely (tmp,
+	   PRED_FORTRAN_FAIL_ALLOC),
+			 gfc_finish_block (),
+			 build_empty_stmt (input_location));
+  gfc_add_expr_to_block (>pre, tmp);
+}
+
+  fcn = gfc_vptr_copy_get (vptr);
 
   tmp = GFC_CLASS_TYPE_P (TREE_TYPE (rse->expr))
   ? gfc_class_data_get (rse->expr) : rse->expr;
@@ -9971,15 +9992,10 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
 }
 
   if (is_poly_assign)
-{
-  tmp = trans_class_assignment (, expr1, expr2, , ,
-use_vptr_copy || (lhs_attr.allocatable
-		  && !lhs_attr.dimension));
-  /* Modify the expr1 after the assignment, to allow the realloc below.
-	 Therefore only needed, when realloc_lhs is enabled.  */
-  if (flag_realloc_lhs && !lhs_attr.pointer)
-	gfc_add_data_component (expr1);
-}
+tmp = trans_class_assignment (, expr1, expr2, , ,
+  use_vptr_copy || (lhs_attr.allocatable
+		&& !lhs_attr.dimension),
+  flag_realloc_lhs && !lhs_attr.pointer);
   else if (flag_coarray == GFC_FCOARRAY_LIB
 	   && lhs_caf_attr.codimension && rhs_caf_attr.codimension
 	   && ((lhs_caf_attr.allocatable && lhs_refs_comp)
@@ -10021,7 +10037,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
   if (lss == gfc_ss_terminator)
 {
   /* F2003: Add the code for reallocation on assignment.  */
-  if (flag_realloc_lhs && is_scalar_reallocatable_lhs (expr1))
+  if (flag_realloc_lhs && is_scalar_reallocatable_lhs (expr1)
+	  && !is_poly_assign)
 	alloc_scalar_allocatable_for_assignment (, string_length,
 		 expr1, expr2);
 
diff --git a/gcc/testsuite/gfortran.dg/class_assign_1.f08 b/gcc/testsuite/gfortran.dg/class_assign_1.f08
new file mode 100644
index 000..fb1f655
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/class_assign_1.f08
@@ -0,0 +1,71 @@
+! { dg-do run }
+!
+! Check that reallocation of the lhs is done with the correct memory size.
+
+
+module base_mod
+
+  type, abstract :: base
+  contains
+procedure(base_add), 

Re: [Patch] Turn -fexcess-precision=fast on when in -ffast-math

2016-12-20 Thread Sandra Loosemore

On 12/20/2016 05:14 AM, James Greenhalgh wrote:


On Tue, Dec 20, 2016 at 11:48:26AM +0100, Richard Biener wrote:

On Mon, Dec 19, 2016 at 6:58 PM, James Greenhalgh
 wrote:



On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak  wrote:

Hello!

Attached patch fixes fall-out from excess-precision improvements
patch. As shown in the PR, the code throughout the compiler assumes
FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in
effect. The patch puts back two lines, removed by excess-precision
improvements patch.

2016-12-08  Uros Bizjak  

 PR middle-end/78738
 * toplev.c (init_excess_precision): Initialize flag_excess_precision
 to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations.

testsuite/ChangeLog:

2016-12-08  Uros Bizjak  

 PR middle-end/78738
 * gcc.target/i386/pr78738.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?


Hmm, I think it belongs to set_unsafe_math_optimization_flags instead
(and be consistent if -fexcess-precision was manually specified).


I think it would be better if this were implied by -ffast-math/-Ofast
than by -funsafe-math-optimizations . That's what I've implemented here,
and tagged the option as SetByCombined to allow us to honour what the
user requests.

This should give us the behaviour you were looking for Uros.

I've bootstrapped and tested the behaviour on x86_64, and I've hacked up
the AArch64 backend to validate that we're setting the flag in the right
circumstances (but that meant changing the AArch64 behaviour, so isn't
something we'd want on trunk, and therefore I can't write a testcase for
this patch).

OK?


Looks good to me, but please also adjust invoke.texi to list -fexcess-precision
as affected by -ffast-math.

Ok with that change.


Thanks, I've modified the affected portions of the documentation.

As I've made a few mistakes in this area recently, I'll hold off on
committing the patch until these documentation changes have been looked
at by Sandra.

OK?


I only have one tiny nit, in this snippet:


 semantics apply without excess precision, and in the latter, rounding
 is unpredictable.

+
 @item -ffast-math
 @opindex ffast-math
 Sets the options @option{-fno-math-errno}, 
@option{-funsafe-math-optimizations},


Please don't introduce unnecessary whitespace changes, or mix them with 
changes to technical content.  The doc parts are OK with that repaired.


-Sandra



Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild


On Tue, 20 Dec 2016 16:40:13 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 04:29:07PM +0100, Andre Vehreschild wrote:
> > > The first one is GCC internal type for representing sizes, the latter is
> > > the C size_t (usually they have the same precision, they always have the
> > > same signedness (unsigned)).
> > > In the past sizetype actually has been a signed type with very special
> > > behavior.  
> > 
> > I am still wondering if it does not make sense to have something like
> > gfc_size_t_zero_node to prevent us from repeating build_zero_cst
> > (size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
> > code for building a zero size type node is generated instead of a reference
> > to a "constant". And I don't want to know how often size_zero_node is used
> > in the wrong location.  
> 
> built_int_cst (size_type_node, 0) is actually faster than build_zero_cst,
> one fewer level of indirection.
> The 0 constant is cached in the type itself, so it actually in the end
> is basically just:
> return TREE_VEC_ELT (TYPE_CACHED_VALUES (type), 1);
> 
> Adding a variable to hold gfc_size_t_zero_node would mean you'd need to add
> a GC root to hold it.
> 
> Note, sizetype should be still what is usually used, only if you in the ABI
> have something declared as C size_t, then size_type_node should be used.

Well, then how about:

#define gfc_size_t_zero_node build_int_cst (size_type_node, 0)

We can't get any faster and for new and old gfortran-hackers one identifier's
meaning is faster to grasp than two's.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


[PATCH][tree-ssa-address] Use simplify_gen_binary in gen_addr_rtx

2016-12-20 Thread Kyrill Tkachov

Hi all,

The testcase in this patch generates bogus assembly for arm with -O1 
-mfloat-abi=soft:
   strdr4, [#0, r3]

This is due to non-canonical RTL being generated during expansion:
(set (mem:DI (plus:SI (const_int 0 [0])
(reg/f:SI 153)) [0 MEM[symbol: a, index: _26, offset: 0B]+0 S8 
A64])
(reg:DI 154))

Note the (plus (const_int 0) (reg)). This is being generated in gen_addr_rtx in 
tree-ssa-address.c
where it creates an explicit PLUS rtx through gen_rtx_PLUS, which doesn't try 
to canonicalise its arguments
or simplify. The correct thing to do is to use simplify_gen_binary that will 
handle all this properly.

I didn't change the other gen_rtx_PLUS calls in this function as their results 
is later used in XEXP operations
that seem to rely on a PLUS expression being explicitly produced, but this 
particular call doesn't, so it's okay
to change it. With this patch the sane assembly is generated:
strdr4, [r3]

Bootstrapped and tested on arm-none-linux-gnueabihf, x86_64, 
aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-12-20  Kyrylo Tkachov  

* tree-ssa-address.c (gen_addr_rtx): Use simplify_gen_binary to add
*addr to act_elem.

2016-12-20  Kyrylo Tkachov  

* gcc.dg/20161219.c: New test.
diff --git a/gcc/testsuite/gcc.dg/20161219.c b/gcc/testsuite/gcc.dg/20161219.c
new file mode 100644
index ..93ea8d2364d9ab54704a84e6c0bff0427df82db8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/20161219.c
@@ -0,0 +1,30 @@
+/* { dg-do assemble } */
+/* { dg-options "-O1 -w" } */
+
+static long long a[9];
+int b, c, d, e, g;
+
+static int
+fn1 (int *p1)
+{
+  b = 1;
+  for (; b >= 0; b--)
+{
+  d = 0;
+  for (;; d++)
+	{
+	  e && (a[d] = 0);
+	  if (*p1)
+	break;
+	  c = (int) a;
+	}
+}
+  return 0;
+}
+
+int
+main ()
+{
+  int f = fn1 ((int *) f);
+  return f;
+}
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index a53ade0600d01c9d48f6c789b78fd42d7a06a95b..0c7c5902cebb0196c8e27f4eba45ce176cc3f0a5 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -154,7 +154,7 @@ gen_addr_rtx (machine_mode address_mode,
 	}
 
   if (*addr)
-	*addr = gen_rtx_PLUS (address_mode, *addr, act_elem);
+	*addr = simplify_gen_binary (PLUS, address_mode, *addr, act_elem);
   else
 	*addr = act_elem;
 }


Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2016-12-20 Thread Jeff Law

On 12/20/2016 07:16 AM, Richard Biener wrote:


it should be already set as we use visited_ssa as "was it ever on the worklist",
so maybe renaming it would be a good thing as well.

+ if (TREE_CODE (name) == SSA_NAME)
+   {
+ /* If an SSA has already been seen, this may be a loop.
+Fail conservatively.  */
+ if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
+   return false;

so to me "conservative" is returning true, not false.

Agreed.



+ bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
+ worklist.safe_push (name);

but for loops we can just continue and ignore this use.  And bitmap_set_bit
returns whether it set a bit, thus

if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name)))
  worklist.safe_push (name);

should work?

What if we're in an irreducible region?




In principle the thing is sound but I'd like to be able to pass in a bitmap of
known maybe-undefined/must-defined SSA names to have a cache for
multiple invocations from within a pass (like this unswitching case).
So that means keeping another bitmap for things positively identified as 
defined, then saving it for later invocations.  So maybe some of my work 
+ some of his melded together.  (mine computes once for the entire FN 
and saves the result, so converting it to on-demand saving the result 
shouldn't be terrible).





Only unswitching on conditions that post-dominate the loop entry might be a
simpler solution for the PR in question.  OTOH this may disable too much
unswitching in practice.

It might be worth experimentation.

Jeff


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 04:29:07PM +0100, Andre Vehreschild wrote:
> > The first one is GCC internal type for representing sizes, the latter is
> > the C size_t (usually they have the same precision, they always have the
> > same signedness (unsigned)).
> > In the past sizetype actually has been a signed type with very special
> > behavior.
> 
> I am still wondering if it does not make sense to have something like
> gfc_size_t_zero_node to prevent us from repeating build_zero_cst
> (size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
> code for building a zero size type node is generated instead of a reference to
> a "constant". And I don't want to know how often size_zero_node is used in the
> wrong location.

built_int_cst (size_type_node, 0) is actually faster than build_zero_cst,
one fewer level of indirection.
The 0 constant is cached in the type itself, so it actually in the end
is basically just:
return TREE_VEC_ELT (TYPE_CACHED_VALUES (type), 1);

Adding a variable to hold gfc_size_t_zero_node would mean you'd need to add
a GC root to hold it.

Note, sizetype should be still what is usually used, only if you in the ABI
have something declared as C size_t, then size_type_node should be used.

Jakub


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild


On Tue, 20 Dec 2016 16:00:19 +0100
Jakub Jelinek  wrote:

> On Tue, Dec 20, 2016 at 04:55:29PM +0200, Janne Blomqvist wrote:
> > On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:  
> > > Hi all,
> > >  
> > >> I think you should use build_zero_cst(size_type_node) instead of
> > >> size_zero_node as size_zero_node is of type sizetype which is not the
> > >> same as size_type_node. Otherwise looks good.  
> > >
> > > In the software design classes I took this was called a design error: Not
> > > choosing sufficiently different names for different artifacts. It was
> > > considered a beginner's error.  
> > 
> > Yeah, sizetype vs. size_type_node is confusing, to say the least..  
> 
> The first one is GCC internal type for representing sizes, the latter is
> the C size_t (usually they have the same precision, they always have the
> same signedness (unsigned)).
> In the past sizetype actually has been a signed type with very special
> behavior.

I am still wondering if it does not make sense to have something like
gfc_size_t_zero_node to prevent us from repeating build_zero_cst
(size_type_node) all the time. I had to use it 16 times, i.e., 16 times the
code for building a zero size type node is generated instead of a reference to
a "constant". And I don't want to know how often size_zero_node is used in the
wrong location.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH, ARM] Further improve stack usage in sha512, part 2 (PR 77308)

2016-12-20 Thread Wilco Dijkstra
Bernd Edlinger wrote:
> this splits the *arm_negdi2, *arm_cmpdi_insn and *arm_cmpdi_unsigned
> also at split1 except for TARGET_NEON and TARGET_IWMMXT.
>
> In the new test case the stack is reduced to about 270 bytes, except
> for neon and iwmmxt, where this does not change anything.

This looks odd:

-operands[2] = gen_lowpart (SImode, operands[2]);
+if (can_create_pseudo_p ())
+  operands[2] = gen_reg_rtx (SImode);
+else
+  operands[2] = gen_lowpart (SImode, operands[2]);

Given this is an SI mode scratch, do we need the else part at all? It seems 
wrong
to ask for the low part of an SI mode operand...

Other than that it looks good to me, but I can't approve.

As a result of your patches a few patterns are unused now. All the Thumb-2 
iordi_notdi*
patterns cannot be used anymore. Also I think arm_cmpdi_zero never gets used - 
a DI
mode compare with zero is always split into ORR during expand.

Wilco




Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 04:55:29PM +0200, Janne Blomqvist wrote:
> On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:
> > Hi all,
> >
> >> I think you should use build_zero_cst(size_type_node) instead of
> >> size_zero_node as size_zero_node is of type sizetype which is not the
> >> same as size_type_node. Otherwise looks good.
> >
> > In the software design classes I took this was called a design error: Not
> > choosing sufficiently different names for different artifacts. It was
> > considered a beginner's error.
> 
> Yeah, sizetype vs. size_type_node is confusing, to say the least..

The first one is GCC internal type for representing sizes, the latter is
the C size_t (usually they have the same precision, they always have the
same signedness (unsigned)).
In the past sizetype actually has been a signed type with very special
behavior.

Jakub


Re: [PATCH] Use the middle-end boolean_type_node

2016-12-20 Thread Steve Kargl
Thought I gave an 'ok', but apparently never sent email.
Sorry about that.  Yes, ok to commit.

-- 
steve

On Tue, Dec 20, 2016 at 04:56:51PM +0200, Janne Blomqvist wrote:
> PING!
> 
> On Tue, Dec 13, 2016 at 10:59 PM, Janne Blomqvist
>  wrote:
> > Use the boolean_type_node setup by the middle-end instead of
> > redefining it. boolean_type_node is not used in GFortran for any
> > ABI-visible stuff, only internally as the type of boolean
> > expressions. There appears to be one exception to this, namely the
> > caf_get* and caf_send* calls which have boolean_type_node
> > arguments. However, on the library side they seem to use C _Bool, so I
> > suspect this might be a case of a argument mismatch that hasn't
> > affected anything so far.
> >
> > The practical effect of this is that the size of such variables will
> > be the same as a C _Bool or C++ bool, that is, on most targets a
> > single byte. Previously we redefined boolean_type_node to be a Fortran
> > default logical kind sized variable, that is 4 or 8 bytes depending on
> > compile options. This might enable slightly more compact code, in case
> > the optimizer determines that the result of such a generated
> > comparison expression needs to be stored in some temporary location
> > rather than being used immediately.
> >
> > Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
> >
> > 2016-12-13  Janne Blomqvist  
> >
> > * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> > ---
> >  gcc/fortran/trans-types.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> > index 354308f..e8dafa0 100644
> > --- a/gcc/fortran/trans-types.c
> > +++ b/gcc/fortran/trans-types.c
> > @@ -961,10 +961,6 @@ gfc_init_types (void)
> > wi::mask (n, UNSIGNED,
> >   TYPE_PRECISION (size_type_node)));
> >
> > -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> > -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> > -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> > -
> >/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
> >gfc_charlen_int_kind = 4;
> >gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> > --
> > 2.7.4
> >
> 
> 
> 
> -- 
> Janne Blomqvist

-- 
Steve


Re: [PATCH] Use the middle-end boolean_type_node

2016-12-20 Thread Janne Blomqvist
PING!

On Tue, Dec 13, 2016 at 10:59 PM, Janne Blomqvist
 wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of a argument mismatch that hasn't
> affected anything so far.
>
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.
>
> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
>
> 2016-12-13  Janne Blomqvist  
>
> * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
> wi::mask (n, UNSIGNED,
>   TYPE_PRECISION (size_type_node)));
>
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>gfc_charlen_int_kind = 4;
>gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> --
> 2.7.4
>



-- 
Janne Blomqvist


Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Janne Blomqvist
On Tue, Dec 20, 2016 at 3:42 PM, Andre Vehreschild  wrote:
> Hi all,
>
>> I think you should use build_zero_cst(size_type_node) instead of
>> size_zero_node as size_zero_node is of type sizetype which is not the
>> same as size_type_node. Otherwise looks good.
>
> In the software design classes I took this was called a design error: Not
> choosing sufficiently different names for different artifacts. It was
> considered a beginner's error.

Yeah, sizetype vs. size_type_node is confusing, to say the least..

> So now I have to repeat myself 16 times only to work around this b***. Nothing
> that will improve gfortran's maintainability.
>
> Second version of the changes needed for caf attached. Bootstrapped and
> regtested fine besides prior known
>
> FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03 -O1  (test for excess
> errors)
>
> on x86_64-linux/f23.

Ok, looks good.



-- 
Janne Blomqvist


Re: [PATCH] Fix IPA CP where it forgot to add a reference in cgraph

2016-12-20 Thread Martin Liška
On 12/20/2016 11:06 AM, Martin Jambor wrote:
> ...this test should be for ADDR_EXPR here.  Or you could switch the
> IPA_REF_* constants the other way round which I bet is going to have
> the same effect in practice, but personally, I'd test for ADDR_EXPR.

Thanks for the note, fixed (and tested in second version of the patch).

Martin

> 
> Thanks,
> 
> Martin

>From 2e29080e44cce899f9d5181185aba0a8a8791a9a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 19 Dec 2016 11:03:34 +0100
Subject: [PATCH] Fix IPA CP where it forgot to add a reference in cgraph

gcc/ChangeLog:

2016-12-19  Martin Liska  

	* cgraphclones.c (cgraph_node::create_virtual_clone):
	Create either IPA_REF_LOAD of IPA_REF_READ depending on
	whether new_tree is a VAR_DECL or an ADDR_EXPR.
	* ipa-cp.c (create_specialized_node): Add reference just for
	ADDR_EXPRs.
	* symtab.c (symtab_node::maybe_create_reference): Remove guard
	as it's guarded in callers.
---
 gcc/cgraphclones.c | 6 +-
 gcc/ipa-cp.c   | 3 ++-
 gcc/symtab.c   | 2 --
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index 349892dab67..6c8fe156f23 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -624,7 +624,11 @@ cgraph_node::create_virtual_clone (vec redirect_callers,
   || in_lto_p)
 new_node->unique_name = true;
   FOR_EACH_VEC_SAFE_ELT (tree_map, i, map)
-new_node->maybe_create_reference (map->new_tree, IPA_REF_ADDR, NULL);
+{
+  ipa_ref_use use_type
+	= TREE_CODE (map->new_tree) == ADDR_EXPR ? IPA_REF_ADDR : IPA_REF_LOAD;
+  new_node->maybe_create_reference (map->new_tree, use_type, NULL);
+}
 
   if (ipa_transforms_to_apply.exists ())
 new_node->ipa_transforms_to_apply
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index d3b50524457..fd312b56fde 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3787,7 +3787,8 @@ create_specialized_node (struct cgraph_node *node,
 	 args_to_skip, "constprop");
   ipa_set_node_agg_value_chain (new_node, aggvals);
   for (av = aggvals; av; av = av->next)
-new_node->maybe_create_reference (av->value, IPA_REF_ADDR, NULL);
+if (TREE_CODE (av->value) == ADDR_EXPR)
+  new_node->maybe_create_reference (av->value, IPA_REF_ADDR, NULL);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 73168a8db09..562a4a2f6a6 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -598,8 +598,6 @@ symtab_node::maybe_create_reference (tree val, enum ipa_ref_use use_type,
  gimple *stmt)
 {
   STRIP_NOPS (val);
-  if (TREE_CODE (val) != ADDR_EXPR)
-return NULL;
   val = get_base_var (val);
   if (val && VAR_OR_FUNCTION_DECL_P (val))
 {
-- 
2.11.0



Re: Pointer Bounds Checker and trailing arrays (PR68270)

2016-12-20 Thread Alexander Ivchenko
2016-11-26 0:28 GMT+03:00 Ilya Enkovich :
> 2016-11-25 15:47 GMT+03:00 Alexander Ivchenko :
>> Hi,
>>
>> The patch below addresses PR68270. could you please take a look?
>>
>> 2016-11-25  Alexander Ivchenko  
>>
>>* c-family/c.opt (flag_chkp_flexible_struct_trailing_arrays):
>>Add new option.
>>* tree-chkp.c (chkp_parse_array_and_component_ref): Forbid
>>narrowing when chkp_parse_array_and_component_ref is used and
>>the ARRAY_REF points to an array in the end of the struct.
>>
>>
>>
>> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
>> index 7d8a726..e45d6a2 100644
>> --- a/gcc/c-family/c.opt
>> +++ b/gcc/c-family/c.opt
>> @@ -1166,6 +1166,11 @@ C ObjC C++ ObjC++ LTO RejectNegative Report
>> Var(flag_chkp_narrow_to_innermost_ar
>>  Forces Pointer Bounds Checker to use bounds of the innermost arrays in case 
>> of
>>  nested static arryas access.  By default outermost array is used.
>>
>> +fchkp-flexible-struct-trailing-arrays
>> +C ObjC C++ ObjC++ LTO RejectNegative Report
>> Var(flag_chkp_flexible_struct_trailing_arrays)
>> +Allow Pointer Bounds Checker to treat all trailing arrays in structures as
>> +possibly flexible.
>
> Words 'allow' and 'possibly' are confusing here. This option is about to force
> checker to do something, not to give him a choice.

Fixed

> New option has to be documented in invoke.texi. It would also be nice to 
> reflect
> changes on GCC MPX wiki page.

Done
> We have an attribute to change compiler behavior when this option is not set.
> But we have no way to make exceptions when this option is used. Should we
> add one?
Something like "bnd_fixed_size" ? Could work. Although the customer
request did not mention the need for that.
Can I add it in a separate patch?


>> +
>>  fchkp-optimize
>>  C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
>>  Allow Pointer Bounds Checker optimizations.  By default allowed
>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>> index 2769682..40f99c3 100644
>> --- a/gcc/tree-chkp.c
>> +++ b/gcc/tree-chkp.c
>> @@ -3425,7 +3425,9 @@ chkp_parse_array_and_component_ref (tree node, tree 
>> *ptr,
>>if (flag_chkp_narrow_bounds
>>&& !flag_chkp_narrow_to_innermost_arrray
>>&& (!last_comp
>> -  || chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1
>> +  || (chkp_may_narrow_to_field (TREE_OPERAND (last_comp, 1))
>> +  && !(flag_chkp_flexible_struct_trailing_arrays
>> +   && array_at_struct_end_p (var)
>
> This is incorrect place for fix. Consider code
>
> struct S {
>   int a;
>   int b[10];
> };
>
> struct S s;
> int *p = s.b;
>
> Here you need to compute bounds for p and you want your option to take effect
> but in this case you won't event reach your new check because there is no
> ARRAY_REF. And even if we change it to
>
> int *p = s.b[5];
>
> then it still would be narrowed because s.b would still be written
> into 'comp_to_narrow'
> variable. Correct place for fix is in chkp_may_narrow_to_field.

Done

> Also you should consider fchkp-narrow-to-innermost-arrray option. Should it
> be more powerfull or not? I think fchkp-narrow-to-innermost-arrray shouldn't
> narrow to variable sized fields. BTW looks like right now bnd_variable_size
> attribute is ignored by fchkp-narrow-to-innermost-arrray. This is another
> problem and may be fixed in another patch though.
The way code works in chkp_parse_array_and_component_ref seems to be
exactly like you say:  fchkp-narrow-to-innermost-arrray won't narrow
to variable sized fields. I will create a separate bug for
bnd_variable_size+ fchkp-narrow-to-innermost-arrray.

> Also patch lacks tests for various situations (with option and without, with
> ARRAY_REF and without). In case of new attribute and fix for
> fchkp-narrow-to-innermost-arrray behavior additional tests are required.
I added three tests for flag_chkp_flexible_struct_trailing_arrays



The patch is below:

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 2d47d54..21ad6aa 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1188,7 +1188,13 @@ narrowing is on, field bounds are used.
Otherwise full object bounds are used.
 fchkp-narrow-to-innermost-array
 C ObjC C++ ObjC++ LTO RejectNegative Report
Var(flag_chkp_narrow_to_innermost_arrray)
 Forces Pointer Bounds Checker to use bounds of the innermost arrays in case of
-nested static arryas access.  By default outermost array is used.
+nested static arrays access.  By default outermost array is used.
+
+fchkp-flexible-struct-trailing-arrays
+C ObjC C++ ObjC++ LTO RejectNegative Report
Var(flag_chkp_flexible_struct_trailing_arrays)
+Forces Pointer Bounds Checker to treat all trailing arrays in structures as
+possibly flexible.  By default only arrays fields with zero length or that are
+marked with attribute bnd_variable_size are treated as flexible.

 fchkp-optimize
 C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
diff 

Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2016-12-20 Thread James Cowgill
Hi,

On 19/12/16 21:43, Jeff Law wrote:
> On 12/19/2016 08:44 AM, James Cowgill wrote:
>> 2016-12-16  James Cowgill  
>>
>> PR rtl-optimization/65618
>> * emit-rtl.c (try_split): Update "after" when moving a
>> NOTE_INSN_CALL_ARG_LOCATION.
>>
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 7de17454037..6be124ac038 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -3742,6 +3742,11 @@ try_split (rtx pat, rtx_insn *trial, int last)
>> next = NEXT_INSN (next))
>>  if (NOTE_KIND (next) == NOTE_INSN_CALL_ARG_LOCATION)
>>{
>> +/* Advance after to the next instruction if it is about to
>> +   be removed.  */
>> +if (after == next)
>> +  after = NEXT_INSN (after);
>> +
>>  remove_insn (next);
>>  add_insn_after (next, insn, NULL);
>>  break;
>>
> So the thing I don't like when looking at this code is we set AFTER
> immediately upon entry to try_split.  But we don't use it until near the
> very end of try_split.  That's just asking for trouble.
> 
> Can we reasonably initialize AFTER just before it's used?

I wasn't sure but looking closer I think that would be fine. This patch
also works and does what Richard Sandiford suggested in the PR.

2016-12-20  James Cowgill  

PR rtl-optimization/65618
* emit-rtl.c (try_split): Move initialization of "before" and
"after" to just before the call to emit_insn_after_setloc.

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 7de17454037..bdc984c65cf 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3643,8 +3643,7 @@ mark_label_nuses (rtx x)
 rtx_insn *
 try_split (rtx pat, rtx_insn *trial, int last)
 {
-  rtx_insn *before = PREV_INSN (trial);
-  rtx_insn *after = NEXT_INSN (trial);
+  rtx_insn *before, *after;
   rtx note;
   rtx_insn *seq, *tem;
   int probability;
@@ -3818,6 +3817,9 @@ try_split (rtx pat, rtx_insn *trial, int last)
}
 }
 
+  before = PREV_INSN (trial);
+  after = NEXT_INSN (trial);
+
   tem = emit_insn_after_setloc (seq, trial, INSN_LOCATION (trial));
 
   delete_insn (trial);

Thanks,
James


Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2016-12-20 Thread Richard Biener
On Fri, Dec 16, 2016 at 3:41 PM, Aldy Hernandez  wrote:
> Hi folks.
>
> This is a follow-up on Jeff and Richi's interaction on the aforementioned PR
> here:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01397.html
>
> I decided to explore the idea of analyzing may-undefness on-demand, which
> actually looks rather cheap.
>
> BTW, I don't understand why we don't have auto_bitmap's, as we already have
> auto_sbitmap's.  I've implemented the former based on auto_sbitmap's code we
> already have.
>
> The attached patch fixes the bug without introducing any regressions.
>
> I also tested the patch by compiling 242 .ii files with -O3.  These were
> gathered from a stage1 build with -save-temps.  There is a slight time
> degradation of 4 seconds within 27 minutes of user time:
>
> tainted:26:52
> orig:   26:48
>
> This was the average aggregate time of two runs compiling all 242 .ii files.
> IMO, this looks reasonable.  It is after all, -O3.Is it acceptable?

+  while (!worklist.is_empty ())
+{
+  name = worklist.pop ();
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+
+  if (ssa_undefined_value_p (name, true))
+   return true;
+
+  bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));

it should be already set as we use visited_ssa as "was it ever on the worklist",
so maybe renaming it would be a good thing as well.

+ if (TREE_CODE (name) == SSA_NAME)
+   {
+ /* If an SSA has already been seen, this may be a loop.
+Fail conservatively.  */
+ if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
+   return false;

so to me "conservative" is returning true, not false.

+ bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
+ worklist.safe_push (name);

but for loops we can just continue and ignore this use.  And bitmap_set_bit
returns whether it set a bit, thus

if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name)))
  worklist.safe_push (name);

should work?

+  /* Check that any SSA names used to define NAME is also fully
+defined.  */
+  use_operand_p use_p;
+  ssa_op_iter iter;
+  FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE)
+   {
+ name = USE_FROM_PTR (use_p);
+ if (TREE_CODE (name) == SSA_NAME)

always true.

+   {
+ /* If an SSA has already been seen, this may be a loop.
+Fail conservatively.  */
+ if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
+   return false;
+ bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
+ worklist.safe_push (name);

See above.

In principle the thing is sound but I'd like to be able to pass in a bitmap of
known maybe-undefined/must-defined SSA names to have a cache for
multiple invocations from within a pass (like this unswitching case).

Also once you hit defs that are in a post-dominated region of the loop entry
you can treat them as not undefined (as their use invokes undefined
behavior).  This is also how you treat function parameters (well,
ssa_undefined_value_p does), where the call site invokes undefined behavior
when passing in undefined values.  So we need an extra parameter specifying
the post-dominance region.

You do not handle memory or calls conservatively which means the existing
testcase only needs some obfuscation to become a problem again.  To fix
that before /* Check that any SSA names used to define NAME is also fully
defined.  */ bail out conservatively, like

   if (! is_gimple_assign (def)
  || gimple_assign_single_p (def))
return true;

Only unswitching on conditions that post-dominate the loop entry might be a
simpler solution for the PR in question.  OTOH this may disable too much
unswitching in practice.

Richard.

> Aldy


Re: [PATCH v3] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Dominik Vogt
On Tue, Dec 20, 2016 at 11:32:58AM +0100, Jakub Jelinek wrote:
> On Tue, Dec 20, 2016 at 11:22:47AM +0100, Dominik Vogt wrote:
> > On Mon, Dec 19, 2016 at 06:00:21PM +0100, Jakub Jelinek wrote:
> > > On Mon, Dec 19, 2016 at 05:50:40PM +0100, Dominik Vogt wrote:
> > > > * config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
> > > > __S390_ARCH_LEVEL__.
> > > > gcc/testsuite/ChangeLog-setmem
> > > > 
> > > > * gcc.target/s390/md/setmem_long-1.c: Use "runnable".
> > > > * gcc.target/s390/md/rXsbg_mode_sXl.c: Likewise.
> > > > * gcc.target/s390/md/andc-splitter-1.c: Likewise.
> > > > * gcc.target/s390/md/andc-splitter-2.c: Likewise.
> > > > * lib/gcc-dg.exp (gcc-dg-runtest): Export torture_current_flags.
> > > > * gcc.target/s390/s390.exp: Import torture_current_flags.
> > > > (check_effective_target_runnable): New.
> > > 
> > > Unless you want to add support for all targets in the runnable
> > > effective target, I think it would be better to call it less generically,
> > > s390_runnable or similar.

Done.

> > What do you think about the change in gcc-dg.exp?
> > 
> > We couldn't
> > decide whether it's a valid way of retrieving the flags needed for
> > compiling s390_check_runnable or not.  It would be nice to get all
> > options that are relevant for the test case, including the ones
> > from "dg-options" (etc.), but that probably requires larger
> > changes to lib/*.exp.  (The target specific check functions could
> > be removed then.)
> 
> I'm not a testsuite maintainer nor very good in tcl, so I think you want a
> testsuite maintainer to ack it in any case.
> But, I'd say you want something that will not be terribly expensive.
> If you have an effective target that happens to get flags from the
> current test, then that is necessarily non-cacheable, which would mean
> in addition to compiling every test you also compile another proglet for it.
> I think your current patch does that too, there is no caching, so it would
> be desirable to cache the results; you should invalidate those caches when
> torture_current_flags change.  See e.g. et_cache uses in
> target-supports.exp.  You want to remember torture_current_flags for which
> you've computed the s390_runnable et and if it changes, reset the cache.
> The other *_runnable flags can be probably just normally cached (and you do,
> by using check_runtime rather than check_runtime_nocache).

Right, it gets called even more often than one would think, and
even with empty torture_current_options.  The attached new patch
(v3) removes -Ox options and superflous whitespace and caches that
between calls if it's not empty.  There's another, permanent cache
for calls without any flags.  With proper ordering of the torture
options, the test program is built only a couple of times.

v3:

  * Cache test results.
  * Reorder torture tests for better caching.
  * Add ".machinemode zarch" to assembly file because the $flags
are overridden by the board options.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog-archlevel

* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
__S390_ARCH_LEVEL__.
gcc/testsuite/ChangeLog-setmem

* gcc.target/s390/md/setmem_long-1.c: Use "s390_runnable".
* gcc.target/s390/md/rXsbg_mode_sXl.c: Likewise.
* gcc.target/s390/md/andc-splitter-1.c: Likewise.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
* lib/gcc-dg.exp (gcc-dg-runtest): Export torture_current_flags.
* gcc.target/s390/s390.exp: Import torture_current_flags.
(check_effective_target_s390_runnable): New.
(check_effective_target_z900_runnable): New.
(check_effective_target_z990_runnable): New.
(check_effective_target_z9_ec_runnable): New.
(check_effective_target_z10_runnable): New.
(check_effective_target_z196_runnable): New.
(check_effective_target_zEC12_runnable): New.
(check_effective_target_z13_runnable): New.
(check_effective_target_z10_instructions): Removed.
(MD_TEST_OPTS): Add optimization level without -march=.
>From 4bd55b91b0487590bc9c9bde60664e5d94d2dc08 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 13 Dec 2016 10:21:08 +0100
Subject: [PATCH] S/390: Run md tests only if the machine supports the
 instruction set.

---
 gcc/config/s390/s390-c.c   |  17 +++
 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c |  19 +--
 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c |  19 +--
 gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c  |   4 +-
 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c   |   7 +-
 gcc/testsuite/gcc.target/s390/s390.exp | 170 ++---
 gcc/testsuite/lib/gcc-dg.exp   |   2 +
 7 files changed, 198 insertions(+), 40 deletions(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 

Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread Andre Vehreschild
Hi all,

> I think you should use build_zero_cst(size_type_node) instead of
> size_zero_node as size_zero_node is of type sizetype which is not the
> same as size_type_node. Otherwise looks good.

In the software design classes I took this was called a design error: Not
choosing sufficiently different names for different artifacts. It was
considered a beginner's error.

So now I have to repeat myself 16 times only to work around this b***. Nothing
that will improve gfortran's maintainability.

Second version of the changes needed for caf attached. Bootstrapped and
regtested fine besides prior known

FAIL: gfortran.dg/allocate_deferred_char_scalar_1.f03 -O1  (test for excess
errors)

on x86_64-linux/f23.

- Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr78534_caf_v2.clog
Description: Binary data
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 5b05a3d..1604bc8 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -4211,7 +4211,7 @@ size or one for a scalar.
 
 @item @emph{Syntax}:
 @code{void caf_register (size_t size, caf_register_t type, caf_token_t *token,
-gfc_descriptor_t *desc, int *stat, char *errmsg, int errmsg_len)}
+gfc_descriptor_t *desc, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4263,7 +4263,7 @@ library is only expected to free memory it allocated itself during a call to
 
 @item @emph{Syntax}:
 @code{void caf_deregister (caf_token_t *token, caf_deregister_t type,
-int *stat, char *errmsg, int errmsg_len)}
+int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4322,7 +4322,8 @@ to a remote image identified by the image_index.
 @item @emph{Syntax}:
 @code{void _gfortran_caf_send (caf_token_t token, size_t offset,
 int image_index, gfc_descriptor_t *dest, caf_vector_t *dst_vector,
-gfc_descriptor_t *src, int dst_kind, int src_kind, bool may_require_tmp)}
+gfc_descriptor_t *src, int dst_kind, int src_kind, bool may_require_tmp,
+int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4345,6 +4346,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4375,7 +4379,8 @@ image identified by the image_index.
 @item @emph{Syntax}:
 @code{void _gfortran_caf_get (caf_token_t token, size_t offset,
 int image_index, gfc_descriptor_t *src, caf_vector_t *src_vector,
-gfc_descriptor_t *dest, int src_kind, int dst_kind, bool may_require_tmp)}
+gfc_descriptor_t *dest, int src_kind, int dst_kind, bool may_require_tmp,
+int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4398,6 +4403,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4430,7 +4438,7 @@ dst_image_index.
 int dst_image_index, gfc_descriptor_t *dest, caf_vector_t *dst_vector,
 caf_token_t src_token, size_t src_offset, int src_image_index,
 gfc_descriptor_t *src, caf_vector_t *src_vector, int dst_kind, int src_kind,
-bool may_require_tmp)}
+bool may_require_tmp, int *stat)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4461,6 +4469,9 @@ time that the @var{dest} and @var{src} either cannot overlap or overlap (fully
 or partially) such that walking @var{src} and @var{dest} in element wise
 element order (honoring the stride value) will not lead to wrong results.
 Otherwise, the value is true.
+@item @var{stat} @tab intent(out) when non-NULL give the result of the
+operation, i.e., zero on success and non-zero on error.  When NULL and error
+occurs, then an error message is printed and the program is terminated.
 @end multitable
 
 @item @emph{NOTES}
@@ -4673,7 +4684,7 @@ been locked by the same image is an error.
 
 @item @emph{Syntax}:
 @code{void _gfortran_caf_lock (caf_token_t token, size_t index, int image_index,
-int *aquired_lock, int *stat, char *errmsg, int errmsg_len)}
+int *aquired_lock, int *stat, char *errmsg, size_t errmsg_len)}
 
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
@@ -4708,7 +4719,7 @@ which is unlocked or has 

Re: [PATCH v3] add -fprolog-pad=N,M option

2016-12-20 Thread Maxim Kuvyrkov
> On Dec 19, 2016, at 6:04 PM, Bernd Schmidt  wrote:
> 
> I'll consider myself agnostic as to whether this is a feature we want or need,

Hi Bernd, thanks for reviewing this!

Regarding the usefulness of this feature, it has been discussed here (2 years 
ago): 
http://gcc.gcc.gnu.narkive.com/JfWUDn8Y/rfc-kernel-livepatching-support-in-gcc .

Kernel live-patching is of interest to several major distros, and it already 
supported by GCC via architecture-specific implementations (x86, s390, sparc).  
The -fprolog-pad= option is an architecture-neutral equivalent that can be used 
for porting kernel live-patching to new architectures.

Existing support for kernel live-patching in x86, s390 and sparc backends can 
be redirected to use -fprolog-pad= functionality and, thus, simplified.

--
Maxim Kuvyrkov
www.linaro.org


> so I'll just comment on some style questions. There's a fair amount of coding 
> style violations, I'll point some of them out but please read the documents 
> we have linked on this page:
>  https://gcc.gnu.org/contribute.html
> 
> On 12/16/2016 03:14 PM, Torsten Duwe wrote:
>> Signed-off-by: Torsten Duwe 
> 
> This is meaningless for the GCC project. We require a copyright assignment; I 
> assume SuSE has a blanket one that covers you. You should also write a 
> ChangeLog entry.
> 
>> diff --git a/gcc/attribs.c b/gcc/attribs.c
>> index e66349a..6ff81a8 100644
>> --- a/gcc/attribs.c
>> +++ b/gcc/attribs.c
>> @@ -365,6 +365,28 @@ decl_attributes (tree *node, tree attributes, int flags)
>>   if (!attributes_initialized)
>> init_attributes ();
>> 
>> +  /* If we're building NOP pads because of a command line arg, note the size
>> + for LTO builds, unless the attribute has already been overridden. */
> 
> Two spaces at the end of a sentence, including at the end of a comment.
> 
>> +  if (TREE_CODE (*node) == FUNCTION_DECL &&
>> +  prolog_nop_pad_size > 0)
> 
> Operators go on the next line when wrapping.
> 
>> + ),
> 
> Don't put closing parens on their own line.
> 
>> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
>> index db293fe..7f3e558 100644
>> --- a/gcc/c/c-decl.c
>> +++ b/gcc/c/c-decl.c
>> @@ -2322,6 +2322,34 @@ merge_decls (tree newdecl, tree olddecl, tree 
>> newtype, tree oldtype)
>>   TREE_ASM_WRITTEN (olddecl) = 0;
>> }
>> 
>> +  /* Prolog pad size may be set wrongly by a forward declaration;
>> + fix it up by pulling the final value in front.
>> +  */
> 
> The "*/" should go on the previous line.
> 
>> +  for (it = _ATTRIBUTES (newdecl); *it; it = _CHAIN(*it))
> 
> Space before opening parentheses (many occurrences).
>> 
>> +void
>> +default_print_prolog_pad (FILE *file, unsigned HOST_WIDE_INT pad_size,
>> +  bool record_p)
> 
> Every function needs a comment describing its purpose and that of its 
> arguments. Look around for examples. There are cases in this patch of hook 
> implementations which ignore all their args, there I believe we can relax 
> this rule a little (but a comment saying which hook is implemented would 
> still be good).
> 
>> 
>> +extern void default_print_prolog_pad (FILE *, unsigned HOST_WIDE_INT , 
>> bool);
> 
> Careful with stray whitespace.
>> +  if (tree_fits_uhwi_p (prolog_pad_value1) )
> 
> Here too.
>> +{
>> +  pad_size = tree_to_uhwi (prolog_pad_value1);
>> +}
> 
> Lose { } braces around single statements. Several cases.
> 
> Am I missing it or is there no actual implementation of the target hook 
> anywhere?
> 
> 
> Bernd



Re: [Patch] Turn -fexcess-precision=fast on when in -ffast-math

2016-12-20 Thread James Greenhalgh

On Tue, Dec 20, 2016 at 11:48:26AM +0100, Richard Biener wrote:
> On Mon, Dec 19, 2016 at 6:58 PM, James Greenhalgh
>  wrote:
> >
> >> On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak  wrote:
> >> > Hello!
> >> >
> >> > Attached patch fixes fall-out from excess-precision improvements
> >> > patch. As shown in the PR, the code throughout the compiler assumes
> >> > FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in
> >> > effect. The patch puts back two lines, removed by excess-precision
> >> > improvements patch.
> >> >
> >> > 2016-12-08  Uros Bizjak  
> >> >
> >> > PR middle-end/78738
> >> > * toplev.c (init_excess_precision): Initialize flag_excess_precision
> >> > to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations.
> >> >
> >> > testsuite/ChangeLog:
> >> >
> >> > 2016-12-08  Uros Bizjak  
> >> >
> >> > PR middle-end/78738
> >> > * gcc.target/i386/pr78738.c: New test.
> >> >
> >> > Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >> >
> >> > OK for mainline?
> >>
> >> Hmm, I think it belongs to set_unsafe_math_optimization_flags instead
> >> (and be consistent if -fexcess-precision was manually specified).
> >
> > I think it would be better if this were implied by -ffast-math/-Ofast
> > than by -funsafe-math-optimizations . That's what I've implemented here,
> > and tagged the option as SetByCombined to allow us to honour what the
> > user requests.
> >
> > This should give us the behaviour you were looking for Uros.
> >
> > I've bootstrapped and tested the behaviour on x86_64, and I've hacked up
> > the AArch64 backend to validate that we're setting the flag in the right
> > circumstances (but that meant changing the AArch64 behaviour, so isn't
> > something we'd want on trunk, and therefore I can't write a testcase for
> > this patch).
> >
> > OK?
>
> Looks good to me, but please also adjust invoke.texi to list 
> -fexcess-precision
> as affected by -ffast-math.
>
> Ok with that change.

Thanks, I've modified the affected portions of the documentation.

As I've made a few mistakes in this area recently, I'll hold off on
committing the patch until these documentation changes have been looked
at by Sandra.

OK?

Thanks,
James

---
2016-12-20  James Greenhalgh  

* common.opt (excess_precision): Tag as SetByCombined.
* opts.c (set_fast_math_flags): Also set
flag_excess_precision_cmdline.
(fast_math_flags_set_p): Also check flag_excess_precision_cmdline.
* doc/invoke.texi (-fexcess-precision): Drop text saying the
option has no effect under -ffast-math, make it clear that
-ffast-math will cause -fexcess-precision=fast by default even for
standards compliant modes.
(-ffast-math): Document that this sets -fexcess-precision=fast.

diff --git a/gcc/common.opt b/gcc/common.opt
index de06844..6ebaf9c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1317,7 +1317,7 @@ Common Report Var(flag_expensive_optimizations) Optimization
 Perform a number of minor, expensive optimizations.
 
 fexcess-precision=
-Common Joined RejectNegative Enum(excess_precision) Var(flag_excess_precision_cmdline) Init(EXCESS_PRECISION_DEFAULT)
+Common Joined RejectNegative Enum(excess_precision) Var(flag_excess_precision_cmdline) Init(EXCESS_PRECISION_DEFAULT) SetByCombined
 -fexcess-precision=[fast|standard]	Specify handling of excess floating-point precision.
 
 Enum
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b729964..8c5308e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9109,21 +9109,23 @@ both casts and assignments cause values to be rounded to their
 semantic types (whereas @option{-ffloat-store} only affects
 assignments).  This option is enabled by default for C if a strict
 conformance option such as @option{-std=c99} is used.
+@option{-ffast-math} enables @option{-fexcess-precision=fast} by default
+regardless of whether a strict conformance option is used.
 
 @opindex mfpmath
 @option{-fexcess-precision=standard} is not implemented for languages
-other than C, and has no effect if
-@option{-funsafe-math-optimizations} or @option{-ffast-math} is
-specified.  On the x86, it also has no effect if @option{-mfpmath=sse}
+other than C.  On the x86, it has no effect if @option{-mfpmath=sse}
 or @option{-mfpmath=sse+387} is specified; in the former case, IEEE
 semantics apply without excess precision, and in the latter, rounding
 is unpredictable.
 
+
 @item -ffast-math
 @opindex ffast-math
 Sets the options @option{-fno-math-errno}, @option{-funsafe-math-optimizations},
 @option{-ffinite-math-only}, @option{-fno-rounding-math},
-@option{-fno-signaling-nans} and @option{-fcx-limited-range}.
+@option{-fno-signaling-nans}, @option{-fcx-limited-range} and
+@option{-fexcess-precision=fast}.
 
 This option causes the preprocessor macro @code{__FAST_MATH__} 

Re: [Ada] Fix PR ada/78845

2016-12-20 Thread Arnaud Charlet
> The function Ada.Numerics.Generic_Real_Arrays.Inverse is required
> (ARM G.3.1(72)) to return a matrix with the bounds of the dimension indices
> swapped, i.e. result'Range(1) == input'Range(2) and vice versa. The
> present code gets result'Range(1) correct, but result'Range(2) always
> starts at 1.
> 
> Of course, many users would always use ranges starting at 1 and wouldn't see a
> problem.
> 
> 2016-12-17  Simon Wright  
> 
>   PR ada/78845
>   * a-ngrear.adb (Inverse): call Unit_Matrix with First_1 set to
>   A'First(2)
>   and vice versa.

Patch is OK, thanks.


[PATCH] Speed-up use-after-scope (re-writing to SSA) (version 2)

2016-12-20 Thread Martin Liška
On 11/16/2016 05:28 PM, Jakub Jelinek wrote:
> Otherwise LGTM, but please post the asan patch to llvm-commits
> or through their web review interface.
> 
>   Jakub

Ok, llvm folks are unwilling to accept the new API function, thus I've decided 
to come up
with approach suggested by Jakub. Briefly, when expanding ASAN_POISON internal 
function,
we create a new variable (with the same name as the original one). The variable 
is poisoned
at the location of the ASAN_POISON and all usages just call ASAN_CHECK that 
would trigger
use-after-scope run-time error. Situation where ASAN_POISON has a LHS is very 
rare and
is very likely to be a bug. Thus suggested not super-optimized approach should 
not be
problematic.

I'm not sure about the introduction of 'create_var' function, maybe we would 
need some
refactoring. Thoughts?

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin
>From 03c21ac90503aea7af4d74ea8c34f94efde782b6 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 19 Dec 2016 15:36:11 +0100
Subject: [PATCH] Speed up use-after-scope (v2): rewrite into SSA

gcc/ChangeLog:

2016-12-19  Martin Liska  

	* asan.c (asan_expand_poison_ifn): New function.
	* asan.h (asan_expand_poison_ifn):  Likewise.
	* gimple-expr.c (create_var): Likewise.
	* gimple-expr.h (create_var): Likewise.
	* internal-fn.c (expand_ASAN_POISON): Likewise.
	* internal-fn.def (ASAN_POISON): New builtin.
	* sanopt.c (pass_sanopt::execute): Expand
	asan_expand_poison_ifn.
	* tree-ssa.c (is_asan_mark_p): New function.
	(execute_update_addresses_taken): Rewrite local variables
	(identified just by use-after-scope as addressable) into SSA.

gcc/testsuite/ChangeLog:

2016-12-19  Martin Liska  

	* gcc.dg/asan/use-after-scope-3.c: Add additional flags.
	* gcc.dg/asan/use-after-scope-9.c: Likewise and grep for
	sanopt optimization for ASAN_POISON.
---
 gcc/asan.c| 84 ++-
 gcc/asan.h|  1 +
 gcc/gimple-expr.c | 27 +
 gcc/gimple-expr.h |  1 +
 gcc/internal-fn.c |  7 +++
 gcc/internal-fn.def   |  1 +
 gcc/sanopt.c  |  9 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c |  1 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-9.c |  2 +
 gcc/tree-ssa.c| 69 ++
 10 files changed, 191 insertions(+), 11 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 53acff0a2fb..de8ce12f818 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -32,8 +32,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "memmodel.h"
 #include "tm_p.h"
+#include "ssa.h"
 #include "stringpool.h"
-#include "tree-vrp.h"
 #include "tree-ssanames.h"
 #include "optabs.h"
 #include "emit-rtl.h"
@@ -3055,6 +3055,88 @@ asan_expand_check_ifn (gimple_stmt_iterator *iter, bool use_calls)
   return true;
 }
 
+
+/* Expand the ASAN_POISON builtins.  */
+
+bool
+asan_expand_poison_ifn (gimple_stmt_iterator *iter,
+			bool *need_commit_edge_insert)
+{
+  gimple *g = gsi_stmt (*iter);
+  tree poisoned_var = gimple_call_lhs (g);
+  if (!poisoned_var)
+{
+  gsi_remove (iter, true);
+  return true;
+}
+
+  tree var_decl = SSA_NAME_VAR (poisoned_var);
+
+  bool recover_p;
+  if (flag_sanitize & SANITIZE_USER_ADDRESS)
+recover_p = (flag_sanitize_recover & SANITIZE_USER_ADDRESS) != 0;
+  else
+recover_p = (flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
+
+  tree shadow_var = create_var (TREE_TYPE (poisoned_var),
+IDENTIFIER_POINTER (DECL_NAME (var_decl)));
+
+  tree size = DECL_SIZE_UNIT (shadow_var);
+  gimple *poison_call
+= gimple_build_call_internal (IFN_ASAN_MARK, 3,
+  build_int_cst (integer_type_node,
+		 ASAN_MARK_POISON),
+  build_fold_addr_expr (shadow_var), size);
+
+  use_operand_p use_p;
+  imm_use_iterator imm_iter;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, poisoned_var)
+{
+  gimple *use = USE_STMT (use_p);
+  if (is_gimple_debug (use))
+	continue;
+
+  int nargs;
+  tree fun = report_error_func (false, recover_p, tree_to_uhwi (size),
+);
+
+  gcall *call = gimple_build_call (fun, 1,
+   build_fold_addr_expr (shadow_var));
+  gimple_set_location (call, gimple_location (use));
+  gimple *call_to_insert = call;
+
+  /* The USE can be a gimple PHI node.  If so, insert the call on
+	 all edges leading to the PHI node.  */
+  if (is_a  (use))
+	{
+	  gphi *phi = dyn_cast (use);
+	  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+	if (gimple_phi_arg_def (phi, i) == poisoned_var)
+	  {
+		edge e = gimple_phi_arg_edge (phi, i);
+
+		if (call_to_insert == NULL)
+		  call_to_insert = gimple_copy (call);
+
+		gsi_insert_seq_on_edge (e, call_to_insert);
+		*need_commit_edge_insert 

Re: [PATCH] Externalize definition of create_tmp_reg_or_ssa_name

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 11:45:18AM +0100, Richard Biener wrote:
> > I've bootstrapped and make check with this patch applied in conjunction with
> > other patches.
> >
> > OK for trunk?
> 
> Ok.
> 
> Richard.
> 
> >
> > [gcc]
> >
> > 2016-12-19  Will Schmidt  
> >
> > *  gimple-fold.c (create_tmp_reg_or_ssa_name): remove static 
> > declaration.
> > *  gimple-fold.h: add prototype.

Just ChangeLog nit.  Only one space after *, not two, and capital letter
after : .
Also, the second line should be
* gimple-fold.h (create_tmp_reg_or_ssa_name): New prototype.
or so.

Jakub


Re: [PATCH] Fix assertions along default switch labels (PR tree-optimization/78819)

2016-12-20 Thread Richard Biener
On Fri, Dec 16, 2016 at 3:16 PM, Marek Polacek  wrote:
> On Fri, Dec 16, 2016 at 02:58:59PM +0100, Richard Biener wrote:
>> On Fri, Dec 16, 2016 at 2:03 PM, Bernd Schmidt  wrote:
>> > On 12/16/2016 12:49 PM, Marek Polacek wrote:
>> >
>> >> But as this testcase shows, this breaks when the default label shares a
>> >> label
>> >> with another case.  On this testcase, when we reach the switch, we know
>> >> that
>> >> argc is either 1, 2, or 3.  So by the time we reach vrp2, the IR will have
>> >> been optimized to
>> >>
>> >>   switch (argc) {
>> >> default:
>> >> case 3:
>> >>   // argc will be considered 1 despite the case 3
>> >>   break;
>> >> case 2:
>> >>   ...
>> >>}
>> >
>> >
>> > Shouldn't we just remove the "case 3:" from the switch in this case? Would
>> > that fix things?
>>
>> We probably should indeed.  But can we rely on this?
>
> I think we should do both -- apply my fix + investigated why we kept case 3
> around.  I'm willing to look into this.

Agreed.  (my original approval still holds)

Richard.

> Marek


Re: [Patch] Turn -fexcess-precision=fast on when in -ffast-math

2016-12-20 Thread Richard Biener
On Mon, Dec 19, 2016 at 6:58 PM, James Greenhalgh
 wrote:
>
>> On Thu, Dec 8, 2016 at 10:44 PM, Uros Bizjak  wrote:
>> > Hello!
>> >
>> > Attached patch fixes fall-out from excess-precision improvements
>> > patch. As shown in the PR, the code throughout the compiler assumes
>> > FLAG_PRECISION_FAST when flag_unsafe_math_optimizations flag is in
>> > effect. The patch puts back two lines, removed by excess-precision
>> > improvements patch.
>> >
>> > 2016-12-08  Uros Bizjak  
>> >
>> > PR middle-end/78738
>> > * toplev.c (init_excess_precision): Initialize flag_excess_precision
>> > to EXCESS_PRECISION_FAST for flag_unsafe_math_optimizations.
>> >
>> > testsuite/ChangeLog:
>> >
>> > 2016-12-08  Uros Bizjak  
>> >
>> > PR middle-end/78738
>> > * gcc.target/i386/pr78738.c: New test.
>> >
>> > Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>> >
>> > OK for mainline?
>>
>> Hmm, I think it belongs to set_unsafe_math_optimization_flags instead
>> (and be consistent if -fexcess-precision was manually specified).
>
> I think it would be better if this were implied by -ffast-math/-Ofast
> than by -funsafe-math-optimizations . That's what I've implemented here,
> and tagged the option as SetByCombined to allow us to honour what the
> user requests.
>
> This should give us the behaviour you were looking for Uros.
>
> I've bootstrapped and tested the behaviour on x86_64, and I've hacked up
> the AArch64 backend to validate that we're setting the flag in the right
> circumstances (but that meant changing the AArch64 behaviour, so isn't
> something we'd want on trunk, and therefore I can't write a testcase for
> this patch).
>
> OK?

Looks good to me, but please also adjust invoke.texi to list -fexcess-precision
as affected by -ffast-math.

Ok with that change.
Richard.

> Thanks,
> James
>
> ---
> 2016-12-19  James Greenhalgh  
>
> * common.opt (excess_precision): Tag as SetByCombined.
> * opts.c (set_fast_math_flags): Also set
> flag_excess_precision_cmdline.
> (fast_math_flags_set_p): Also check flag_excess_precision_cmdline.
>


Re: [PATCH] Externalize definition of create_tmp_reg_or_ssa_name

2016-12-20 Thread Richard Biener
On Mon, Dec 19, 2016 at 6:27 PM, Will Schmidt  wrote:
> Hi,
>
> For some future rs6000 vector folding patches, I will be needing
> access to the create_tmp_reg_or_ssa_name() function in rs6000.c.
> Thus...
> Externalize the definition of create_tmp_reg_or_ssa_name
> for use in rs6000.c.  The actual usage will show up in later patches.
>
> I'll note that I do not have any 'usage' patches quite ready to go, so
> I may just sit on committing this until one of those usage patches is ready.
>
> I've bootstrapped and make check with this patch applied in conjunction with
> other patches.
>
> OK for trunk?

Ok.

Richard.

>
> [gcc]
>
> 2016-12-19  Will Schmidt  
>
> *  gimple-fold.c (create_tmp_reg_or_ssa_name): remove static 
> declaration.
> *  gimple-fold.h: add prototype.
>
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index d00625b..ac4498f 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -161,8 +161,8 @@ can_refer_decl_in_current_unit_p (tree decl, tree 
> from_decl)
> is in SSA form, a SSA name is created.  Otherwise a temporary register
> is made.  */
>
> -static tree
> -create_tmp_reg_or_ssa_name (tree type, gimple *stmt = NULL)
> +tree
> +create_tmp_reg_or_ssa_name (tree type, gimple *stmt)
>  {
>if (gimple_in_ssa_p (cfun))
>  return make_ssa_name (type, stmt);
> diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
> index 5add30c..e7f8fe2 100644
> --- a/gcc/gimple-fold.h
> +++ b/gcc/gimple-fold.h
> @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef GCC_GIMPLE_FOLD_H
>  #define GCC_GIMPLE_FOLD_H
>
> +extern tree create_tmp_reg_or_ssa_name (tree, gimple *stmt = NULL);
>  extern tree canonicalize_constructor_val (tree, tree);
>  extern tree get_symbol_constant_value (tree);
>  extern void get_range_strlen (tree, tree[2]);
>
>


Re: [PATCH] varasm: Propagate litpool decl alignment to generated RTX.

2016-12-20 Thread Richard Biener
On Fri, Dec 16, 2016 at 9:29 PM, Andreas Krebbel
 wrote:
> When pushing a value into the literal pool the resulting decl might
> get a higher alignment than the original expression depending on how a
> target defines CONSTANT_ALIGNMENT.  Generating an RTX for the constant
> pool access we currently use the alignment from the original
> expression.  Changed with the attached patch.

And it might be even smaller alignment...  or do we not allow that?

> This fixes a GCC 6 regression for S/390.  For arrays of string
> constants as in the attached testcase encode_section_info is not able
> to figure out that the constant pool slot is already properly aligned
> since the mem_align field in the rtx is not set properly.
>
> Another question is why (at the end of build_constant_desc) we invoke
> the encode_section_info hook with the original expression instead of
> the newly generated var_decl?  I think the hook is supposed to be
> invoked with a decl?!

No idea...

>   /* Set flags or add text to the name to record information, such as
>  that it is a local symbol.  If the name is changed, the macro
>  ASM_OUTPUT_LABELREF will have to know how to strip this
>  information.  This call might invalidate our local variable
>  SYMBOL; we can't use it afterward.  */
>   targetm.encode_section_info (exp, rtl, true);
>
>   desc->rtl = rtl;
>
>   return desc;
> }
>
> Bootstrapped and regtested on x86_64 and s390x.
>
> No regressions.
>
> Ok?

Ok.

Richard.

> -Andreas-
>
> gcc/ChangeLog:
>
> 2016-12-16  Andreas Krebbel  
>
> * varasm.c (build_constant_desc): Use the alignment of the var
> decl instead of the original expression.
>
> gcc/testsuite/ChangeLog:
>
> 2016-12-16  Andreas Krebbel  
>
> * gcc.target/s390/litpool-str-1.c: New test.
> ---
>  gcc/testsuite/gcc.target/s390/litpool-str-1.c | 22 ++
>  gcc/varasm.c  |  4 
>  2 files changed, 26 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/s390/litpool-str-1.c
>
> diff --git a/gcc/testsuite/gcc.target/s390/litpool-str-1.c 
> b/gcc/testsuite/gcc.target/s390/litpool-str-1.c
> new file mode 100644
> index 000..cd921d2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/litpool-str-1.c
> @@ -0,0 +1,22 @@
> +/* Make sure strings are recognized as being accessible through larl.
> +   This requires the symbol ref alignment properly propagated to
> +   encode_section_info.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-march=z900 -O2 -fpic" } */
> +
> +
> +extern void foo(const char*, const char*, const char*);
> +
> +void bar(int i)
> +{
> +  const char t1[10] = "test";
> +  const char t2[10] = "test2";
> +  const char t3[2][10] = {
> +   "foofoofoo",
> +   "barbarbar",
> +};
> +  foo(t1, t2, t3[i]);
> +}
> +
> +/* { dg-final { scan-assembler-not "GOTOFF" } } */
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 5b15847..68a7ba2 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -3296,6 +3296,10 @@ build_constant_desc (tree exp)
>set_mem_attributes (rtl, exp, 1);
>set_mem_alias_set (rtl, 0);
>
> +  /* Putting EXP into the literal pool might have imposed a larger
> + alignment which should be visible in the RTX as well.  */
> +  set_mem_align (rtl, DECL_ALIGN (decl));
> +
>/* We cannot share RTX'es in pool entries.
>   Mark this piece of RTL as required for unsharing.  */
>RTX_FLAG (rtl, used) = 1;
> --
> 2.9.1
>


Re: [PATCH v2] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Jakub Jelinek
On Tue, Dec 20, 2016 at 11:22:47AM +0100, Dominik Vogt wrote:
> On Mon, Dec 19, 2016 at 06:00:21PM +0100, Jakub Jelinek wrote:
> > On Mon, Dec 19, 2016 at 05:50:40PM +0100, Dominik Vogt wrote:
> > >   * config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
> > >   __S390_ARCH_LEVEL__.
> > > gcc/testsuite/ChangeLog-setmem
> > > 
> > >   * gcc.target/s390/md/setmem_long-1.c: Use "runnable".
> > >   * gcc.target/s390/md/rXsbg_mode_sXl.c: Likewise.
> > >   * gcc.target/s390/md/andc-splitter-1.c: Likewise.
> > >   * gcc.target/s390/md/andc-splitter-2.c: Likewise.
> > >   * lib/gcc-dg.exp (gcc-dg-runtest): Export torture_current_flags.
> > >   * gcc.target/s390/s390.exp: Import torture_current_flags.
> > >   (check_effective_target_runnable): New.
> > 
> > Unless you want to add support for all targets in the runnable
> > effective target, I think it would be better to call it less generically,
> > s390_runnable or similar.
> 
> Fair enough.
> 
> What do you think about the change in gcc-dg.exp?
> 
> We couldn't
> decide whether it's a valid way of retrieving the flags needed for
> compiling s390_check_runnable or not.  It would be nice to get all
> options that are relevant for the test case, including the ones
> from "dg-options" (etc.), but that probably requires larger
> changes to lib/*.exp.  (The target specific check functions could
> be removed then.)

I'm not a testsuite maintainer nor very good in tcl, so I think you want a
testsuite maintainer to ack it in any case.
But, I'd say you want something that will not be terribly expensive.
If you have an effective target that happens to get flags from the
current test, then that is necessarily non-cacheable, which would mean
in addition to compiling every test you also compile another proglet for it.
I think your current patch does that too, there is no caching, so it would
be desirable to cache the results; you should invalidate those caches when
torture_current_flags change.  See e.g. et_cache uses in
target-supports.exp.  You want to remember torture_current_flags for which
you've computed the s390_runnable et and if it changes, reset the cache.
The other *_runnable flags can be probably just normally cached (and you do,
by using check_runtime rather than check_runtime_nocache).

Jakub


Re: [PATCH v2] Run tests only if the machine supports the instruction set.

2016-12-20 Thread Dominik Vogt
On Mon, Dec 19, 2016 at 06:00:21PM +0100, Jakub Jelinek wrote:
> On Mon, Dec 19, 2016 at 05:50:40PM +0100, Dominik Vogt wrote:
> > * config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
> > __S390_ARCH_LEVEL__.
> > gcc/testsuite/ChangeLog-setmem
> > 
> > * gcc.target/s390/md/setmem_long-1.c: Use "runnable".
> > * gcc.target/s390/md/rXsbg_mode_sXl.c: Likewise.
> > * gcc.target/s390/md/andc-splitter-1.c: Likewise.
> > * gcc.target/s390/md/andc-splitter-2.c: Likewise.
> > * lib/gcc-dg.exp (gcc-dg-runtest): Export torture_current_flags.
> > * gcc.target/s390/s390.exp: Import torture_current_flags.
> > (check_effective_target_runnable): New.
> 
> Unless you want to add support for all targets in the runnable
> effective target, I think it would be better to call it less generically,
> s390_runnable or similar.

Fair enough.

What do you think about the change in gcc-dg.exp?

We couldn't
decide whether it's a valid way of retrieving the flags needed for
compiling s390_check_runnable or not.  It would be nice to get all
options that are relevant for the test case, including the ones
from "dg-options" (etc.), but that probably requires larger
changes to lib/*.exp.  (The target specific check functions could
be removed then.)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] Fix IPA CP where it forgot to add a reference in cgraph

2016-12-20 Thread Martin Jambor
Hi,

On Mon, Dec 19, 2016 at 11:09:52AM +0100, Martin Liska wrote:
> Hello.
> 
> Building mariadb with -flto exposes a bug which I also used to see
> in Firefox. It's caused by IPA CP starting from r236418, where the
> pass started to propagate const VAR_DECLs. Problem is that the pass
> does not update call graph by adding IPA_REF_READ of the propagated
> variable.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?

Honza needs to have a look at this but since I have suggested this
approach, I am of course fine with it, except that...


> Martin

> From 477e81fde08d0520ce552ec8baa0349590dc683c Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 19 Dec 2016 11:03:34 +0100
> Subject: [PATCH] Fix IPA CP where it forgot to add a reference in cgraph
> 
> gcc/ChangeLog:
> 
> 2016-12-19  Martin Liska  
> 
>   * cgraphclones.c (cgraph_node::create_virtual_clone):
>   Create either IPA_REF_LOAD of IPA_REF_READ depending on
>   whether new_tree is a VAR_DECL or an ADDR_EXPR.
>   * ipa-cp.c (create_specialized_node): Add reference just for
>   ADDR_EXPRs.
>   * symtab.c (symtab_node::maybe_create_reference): Remove guard
>   as it's guarded in callers.
> ---
>  gcc/cgraphclones.c | 6 +-
>  gcc/ipa-cp.c   | 3 ++-
>  gcc/symtab.c   | 2 --
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index 349892dab67..93c86e6a1cc 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -624,7 +624,11 @@ cgraph_node::create_virtual_clone (vec 
> redirect_callers,
>|| in_lto_p)
>  new_node->unique_name = true;
>FOR_EACH_VEC_SAFE_ELT (tree_map, i, map)
> -new_node->maybe_create_reference (map->new_tree, IPA_REF_ADDR, NULL);
> +{
> +  ipa_ref_use use_type
> + = TREE_CODE (map->new_tree) == VAR_DECL ? IPA_REF_ADDR : IPA_REF_LOAD;

...this test should be for ADDR_EXPR here.  Or you could switch the
IPA_REF_* constants the other way round which I bet is going to have
the same effect in practice, but personally, I'd test for ADDR_EXPR.

Thanks,

Martin


Re: [PATCH][ARM] PR target/78694: Avoid invalid RTL sharing in minipool code

2016-12-20 Thread Ramana Radhakrishnan

On 09/12/16 14:03, Kyrill Tkachov wrote:

Hi all,

In this ICE GCC reports invalid RTL sharing in the pattern:
(insn 955 954 956 (unspec_volatile [
(const:SI (unspec:SI [
(symbol_ref:SI ("a") [flags 0xe8] )
(const_int 4 [0x4])
] UNSPEC_TLS))
] VUNSPEC_POOL_4) -1
 (nil))
seb.c:116:1: error: shared rtx
(const:SI (unspec:SI [
(symbol_ref:SI ("a") [flags 0xe8] )
(const_int 4 [0x4])
] UNSPEC_TLS))

This is the consttable_4 pattern emitted by dump_minipool and the shared
rtx that's causing
the problem is the operand to that pattern that comes from the 'value'
field of Mnode.

I'm not very familiar with the exact restrictions on sharing RTL but I
believe the right way
to fix these is to do a copy_rtx of the value we want to use. So this
patch does that when the
pool entries are emitted.

The ICE is quite fragile. See the PR for more details, but I've been
able to reliably reproduce it
on an arm-none-linux-gnueabihf target with -mcpu=arm7tdmi
-mfloat-abi=soft -mfpu=vfp -mthumb and this
patch fixes that there.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?


Sorry about the time it's taken to review this - I'm way behind on my 
gcc-patches mailbox.


Ok.

Thanks,
Ramana


Thanks,
Kyrill

2016-12-09  Kyrylo Tkachov  

PR target/78694
* config/arm/arm.c (dump_minipool): Copy mp->value before emitting it
in the minipool to avoid invalid RTL sharing.

2016-12-09  Kyrylo Tkachov  

PR target/78694
* gcc.c-torture/compile/pr78694.c: New test.




Re: [PATCH] Remove unused libgfortran functions

2016-12-20 Thread FX
> I don't understand. Why would it imply a 1:1 mapping of release series
> with major ABI versions?

OK, I thought you meant to map libgfortran version numbers (libgfortran.so.7 
with GCC 7). If it’s the gfortran.map node names, I’m happy with that indeed. 

Attached patch regtested on x86_64-apple-darwin16.3.0.
OK to commit?


> Currently we have _gfortran_, that is with a single underscore in the
> beginning, so it's not in the "C/POSIX reserved for the implementation
> namespace". But yes, I agree that at least those functions documented
> under the non-Fortran main program section in the manual should be
> kept as is.

Then, if we keep some functions under _gfortran_, I say let’s keep them all 
there. It’s not hurting, and the few users who care have come to expect it.


Cheers,
FX



map.ChangeLog
Description: Binary data


map.diff
Description: Binary data


Re: [PATCH][ARM] PR target/78694: Avoid invalid RTL sharing in minipool code

2016-12-20 Thread Sebastian Huber

Hello,

it would be quite nice if someone could have a look at this since this 
breaks the GCC build with libgomp enabled for all Thumb-1 targets.


On 16/12/16 16:20, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00849.html

Thanks,
Kyrill


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.



Re: [PATCH] PR 78534 Change character length from int to size_t

2016-12-20 Thread FX
Dear Bob,

First, regarding the ABI vs. API question: there is no consistent API for how 
to pass between Fortran and C strings, unless one uses Fortran 2003’s 
ISO_C_BINDING. It’s an ABI detail, in the sense that every compiler will choose 
to do things their own way: most compilers who pass a hidden length parameter, 
although its size (32-bit or 64-bit or size_t) and position (either after the 
char pointer, or at the end of the argument list) are variable between 
compilers. So, any code that does this is already compiler-specific.

Second, there are good reasons we might want to change this. One is possible 
use cases (although there are few, by definition, because we simply don’t 
support those right now). The second one is compatibility with C 
string-handling functions, who operate on size_t arguments, which means we can 
now use those functions without casting types around all the time.

Finally, if we’re making this change, we welcome any feedback on how to make it 
as easy as possible to handle in user code. Documentation, preprocessor macros, 
etc.

In particular, one of the things we will need to address is on helping widely 
used code to adapt to the change, so that. One example I am thinking of, that 
uses old-style C/Fortran interfaces, is MPI libraries (openmpi & mpich). We 
definitely need to test those to make sure nothing breaks if we are going to 
proceed — or they need to be fixed upstream well before we release, and with 
due note of the incompatibility in our release notes.


Cheers,
FX

Re: [gimplefe] reject invalid pass name in startwith

2016-12-20 Thread Richard Biener
On Sun, 18 Dec 2016, Prathamesh Kulkarni wrote:

> Hi Richard,
> The attached patch attempts to reject invalid pass-name in startwith
> and verified gimplefe tests pass with the patch (not sure if bootstrap
> is required?)
> Does it look OK ?

No - get_pass_by_name works on dump file names while the startwith
machinery counts the actual invocations of the pass (and supports
"ccp" as alias to "ccp1" for example).  So there is no 1:1 correspondence
between startwith names and names get_pass_by_name expects.

I'd say we instead want to diagnose when the pass pipeline execution
fails to find "startwith".  Thus ontop of my earlier patch which
ends up with

  /* For skipping passes until startwith pass */
  if (cfun
  && cfun->pass_startwith
  /* But we can't skip the lowering phase yet -- ideally we'd
 drive that phase fully via properties.  */
  && (cfun->curr_properties & PROP_ssa))
{
  size_t namelen = strlen (pass->name);
  /* We have to at least start when we leave SSA.  */
  if (pass->properties_destroyed & PROP_ssa)
cfun->pass_startwith = NULL;
  else if (! strncmp (pass->name, cfun->pass_startwith, namelen))
{

in the & PROP_ssa case, if pass->name doesn't match, diagnose a
"failed to start pass execution at %s, starting with RTL expansion".

Richard.


[PATCH][gimplefe] Improve error recovery

2016-12-20 Thread Richard Biener

Just noticed a few issues when feeding the GIMPLE FE random -gimple
dumps.  On errors not skipping to expected tokens leads to a load
of strange followup parsing errors and worse, to endless parsing
attempts in one case.

Fixed with the following.

Bootstrap / regtest running together with the pass manager change
posted in the other thread.

I consider gimple-parser.c changes like this "middle-end", Joseph,
are you fine with that?

Richard.

2016-12-20  Richard Biener  

c/
* gimple-parser.c (c_parser_gimple_compound_statement): Improve
error recovery.
(c_parser_gimple_statement): Only build assigns for non-error
stmts.
(c_parser_gimple_postfix_expression_after): Improve error recovery.

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 243738)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -215,7 +215,7 @@ c_parser_gimple_compound_statement (c_pa
 expr_stmt:
  c_parser_gimple_statement (parser, seq);
  if (! c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
-   return return_p;
+   c_parser_skip_until_found (parser, CPP_SEMICOLON, NULL);
}
 }
   c_parser_consume_token (parser);
@@ -327,9 +327,12 @@ c_parser_gimple_statement (c_parser *par
 case CPP_NOT:
 case CPP_MULT: /* pointer deref */
   rhs = c_parser_gimple_unary_expression (parser);
-  assign = gimple_build_assign (lhs.value, rhs.value);
-  gimple_set_location (assign, loc);
-  gimple_seq_add_stmt (seq, assign);
+  if (rhs.value != error_mark_node)
+   {
+ assign = gimple_build_assign (lhs.value, rhs.value);
+ gimple_set_location (assign, loc);
+ gimple_seq_add_stmt (seq, assign);
+   }
   return;
 
 default:;
@@ -385,10 +388,13 @@ c_parser_gimple_statement (c_parser *par
   && lookup_name (c_parser_peek_token (parser)->value))
 {
   rhs = c_parser_gimple_unary_expression (parser);
-  gimple *call = gimple_build_call_from_tree (rhs.value);
-  gimple_call_set_lhs (call, lhs.value);
-  gimple_seq_add_stmt (seq, call);
-  gimple_set_location (call, loc);
+  if (rhs.value != error_mark_node)
+   {
+ gimple *call = gimple_build_call_from_tree (rhs.value);
+ gimple_call_set_lhs (call, lhs.value);
+ gimple_seq_add_stmt (seq, call);
+ gimple_set_location (call, loc);
+   }
   return;
 }
 
@@ -802,7 +808,10 @@ c_parser_gimple_postfix_expression_after
tree idx = c_parser_gimple_unary_expression (parser).value;
 
if (! c_parser_require (parser, CPP_CLOSE_SQUARE, "expected %<]%>"))
- break;
+ {
+   c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE, NULL);
+   break;
+ }
 
start = expr.get_start ();
finish = c_parser_tokens_buf (parser, 0)->location;


Fix PR testsuite/71237

2016-12-20 Thread Eric Botcazou
This adds -fno-vect-cost-model to the 6 relevant testcases.

Tested on x86_64-suse-linux, applied on the mainline.


2016-12-20  Eric Botcazou  

PR testsuite/71237
* gnat.dg/vect1.adb: Add -fno-vect-cost-model to dg-options.
* gnat.dg/vect2.adb: Likewise.
* gnat.dg/vect3.adb: Likewise.
* gnat.dg/vect4.adb: Likewise.
* gnat.dg/vect5.adb: Likewise.
* gnat.dg/vect6.adb: Likewise.

-- 
Eric BotcazouIndex: gnat.dg/vect1.adb
===
--- gnat.dg/vect1.adb	(revision 243782)
+++ gnat.dg/vect1.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect1 is
 
Index: gnat.dg/vect2.adb
===
--- gnat.dg/vect2.adb	(revision 243782)
+++ gnat.dg/vect2.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect2 is
 
Index: gnat.dg/vect3.adb
===
--- gnat.dg/vect3.adb	(revision 243782)
+++ gnat.dg/vect3.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect3 is
 
Index: gnat.dg/vect4.adb
===
--- gnat.dg/vect4.adb	(revision 243782)
+++ gnat.dg/vect4.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect4 is
 
Index: gnat.dg/vect5.adb
===
--- gnat.dg/vect5.adb	(revision 243782)
+++ gnat.dg/vect5.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect5 is
 
Index: gnat.dg/vect6.adb
===
--- gnat.dg/vect6.adb	(revision 243782)
+++ gnat.dg/vect6.adb	(working copy)
@@ -1,5 +1,5 @@
 -- { dg-do compile { target i?86-*-* x86_64-*-* } }
--- { dg-options "-O3 -msse2 -fdump-tree-vect-details" }
+-- { dg-options "-O3 -msse2 -fno-vect-cost-model -fdump-tree-vect-details" }
 
 package body Vect6 is