date:20161126

[Patches] Add variant constexpr support for visit, comparisons and get

2016-11-26 Thread Tim Shen

This 4-patch series contains the following in order:

a.diff: Remove uses-allocator ctors. They are going away, and removing
them now reduces the maintenance burden.

b.diff: Add constexpr support for get<> and comparisons. This patch
also involves small refactoring of _Variant_storage.

c.diff: Fix some libc++ test failures.

d.diff: Add constexpr support for visit. This patch also removes
__storage, __get_alternative, and __reserved_type_map, since we don't
need to support reference/void types for now.
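
As a rough illustration of what constexpr support for get<>, comparisons and visit means for users, here is a minimal sketch. It assumes a C++17 standard library; the names IntOrChar, kNumber, kLetter and as_int are made up for this example and are not part of the patches.

```cpp
#include <cassert>
#include <variant>

// Hypothetical two-alternative variant used only for this sketch.
using IntOrChar = std::variant<int, char>;

constexpr IntOrChar kNumber{42};   // holds the int alternative (index 0)
constexpr IntOrChar kLetter{'a'};  // holds the char alternative (index 1)

// get<> by type and by index, evaluated at compile time.
static_assert(std::get<int>(kNumber) == 42, "get<T> is usable in a constant expression");
static_assert(std::get<1>(kLetter) == 'a', "get<N> is usable in a constant expression");

// Comparisons: a lower index orders before a higher one.
static_assert(kNumber < kLetter, "operator< is usable in a constant expression");
static_assert(kNumber == IntOrChar{42}, "operator== is usable in a constant expression");

// visit with a generic lambda inside a constexpr function.
constexpr int as_int(const IntOrChar& v) {
  return std::visit([](auto alt) { return static_cast<int>(alt); }, v);
}
static_assert(as_int(kLetter) == static_cast<int>('a'), "visit is usable in a constant expression");
```

If any of these operations were not constexpr, the static_assert lines would fail to compile rather than fail at run time.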

The underlying design doesn't change - we still use the vtable
approach to achieve O(1) runtime cost even under -O0.
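
The vtable idea can be sketched outside the library as follows. This is a simplified model, not the libstdc++ implementation: call_as and visit_erased are hypothetical names, the storage is a plain void pointer, and the visitor is assumed to return int for every alternative.

```cpp
#include <cassert>
#include <cstddef>

// Type-erased trampoline: reinterpret the storage as T and call the visitor.
template <typename T, typename Visitor>
int call_as(Visitor& vis, const void* storage) {
  return vis(*static_cast<const T*>(storage));
}

// One table entry per alternative; dispatch is a single indexed call,
// so the cost does not depend on the number of alternatives or on inlining.
template <typename... Ts, typename Visitor>
int visit_erased(std::size_t index, const void* storage, Visitor& vis) {
  using Fn = int (*)(Visitor&, const void*);
  static constexpr Fn table[] = { &call_as<Ts, Visitor>... };
  return table[index](vis, storage);
}

// Hypothetical visitor used to exercise the table.
struct ToInt {
  int operator()(int v) const { return v; }
  int operator()(double v) const { return static_cast<int>(v); }
};
```

Calling visit_erased<int, double>(1, &d, vis) picks the second table entry and invokes the double overload without any chain of comparisons.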

Bootstrapped and tested for each of them.

Thanks!


-- 
Regards,
Tim Shen
commit 638ecd4cf354d853bb12b089a356df99531f9afa
Author: Tim Shen 
Date:   Thu Nov 24 00:56:08 2016 -0800

2016-11-26  Tim Shen  

* include/std/variant (__erased_use_alloc_ctor,
_Variant_base::_Variant_base, variant::variant): Remove uses-allocator
related functions.
* testsuite/20_util/variant/compile.cc: Remove related tests.
* testsuite/20_util/variant/run.cc: Remove related tests.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 34ad3fd..2d9303a 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -202,14 +202,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __erased_ctor(void* __lhs, void* __rhs)
 { ::new (__lhs) decay_t<_Lhs>(__get_alternative<_Rhs>(__rhs)); }
 
-  template
-constexpr void
-__erased_use_alloc_ctor(const _Alloc& __a, void* __lhs, void* __rhs)
-{
-  __uses_allocator_construct(__a, static_cast*>(__lhs),
- __get_alternative<_Rhs>(__rhs));
-}
-
   // TODO: Find a potential chance to reuse this accross the project.
   template
 constexpr void
@@ -353,47 +345,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	: _Storage(__i, std::forward<_Args>(__args)...), _M_index(_Np)
 	{ }
 
-  template
-	_Variant_base(const _Alloc& __a, const _Variant_base& __rhs)
-	: _Storage(), _M_index(__rhs._M_index)
-	{
-	  if (__rhs._M_valid())
-	{
-	  static constexpr void
-	  (*_S_vtable[])(const _Alloc&, void*, void*) =
-		{ &__erased_use_alloc_ctor<_Alloc, __storage<_Types>&,
-	   const __storage<_Types>&>... };
-	  _S_vtable[__rhs._M_index](__a, _M_storage(), __rhs._M_storage());
-	}
-	}
-
-  template
-	_Variant_base(const _Alloc& __a, _Variant_base&& __rhs)
-	: _Storage(), _M_index(__rhs._M_index)
-	{
-	  if (__rhs._M_valid())
-	{
-	  static constexpr void
-	  (*_S_vtable[])(const _Alloc&, void*, void*) =
-		{ &__erased_use_alloc_ctor<_Alloc, __storage<_Types>&,
-	   __storage<_Types>&&>... };
-	  _S_vtable[__rhs._M_index](__a, _M_storage(), __rhs._M_storage());
-	}
-	}
-
-  template
-	constexpr explicit
-	_Variant_base(const _Alloc& __a, in_place_index_t<_Np>,
-		  _Args&&... __args)
-	: _Storage(), _M_index(_Np)
-	{
-	  using _Storage =
-	__storage>>;
-	  __uses_allocator_construct(__a, static_cast<_Storage*>(_M_storage()),
- std::forward<_Args>(__args)...);
-	  __glibcxx_assert(_M_index == _Np);
-	}
-
   _Variant_base&
   operator=(const _Variant_base& __rhs)
   {
@@ -1026,84 +977,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_Default_ctor_enabler(_Enable_default_constructor_tag{})
 	{ __glibcxx_assert(index() == _Np); }
 
-  template, _Alloc>>>
-	variant(allocator_arg_t, const _Alloc& __a)
-	: variant(allocator_arg, __a, in_place_index<0>)
-	{ }
-
-  template>>...>::value>>
-	variant(allocator_arg_t, const _Alloc& __a, const variant& __rhs)
-	: _Base(__a, __rhs),
-	_Default_ctor_enabler(_Enable_default_constructor_tag{})
-	{ }
-
-  template>...>::value>>
-	variant(allocator_arg_t, const _Alloc& __a, variant&& __rhs)
-	: _Base(__a, std::move(__rhs)),
-	_Default_ctor_enabler(_Enable_default_constructor_tag{})
-	{ }
-
-  template>
-		 && __is_uses_allocator_constructible_v<
-		   __accepted_type<_Tp&&>, _Alloc, _Tp&&>
-		 && !is_same_v, variant>, variant&>>
-	variant(allocator_arg_t, const _Alloc& __a, _Tp&& __t)
-	: variant(allocator_arg, __a, in_place_index<__accepted_index<_Tp&&>>,
-		  std::forward<_Tp>(__t))
-	{ __glibcxx_assert(holds_alternative<__accepted_type<_Tp&&>>(*this)); }
-
-  template
-		 && __is_uses_allocator_constructible_v<
-		   _Tp, _Alloc, _Args&&...>>>
-	variant(allocator_arg_t, const _Alloc& __a, in_place_type_t<_Tp>,
-		_Args&&... __args)
-	: variant(allocator_arg, __a, in_place_index<__index_of<_Tp>>,
-		  std::forward<_Args>(__args)...)
-	{ __glibcxx_assert(holds_alternative<_Tp>(*this)); }
-
-  template
-		 && __is_uses_allocator_constructible_v<
-		   _Tp, _Alloc, initializer_list<_Up>&, _Args&&...>>>
-	variant(allocator_arg_t, const _Alloc& __a, in_place_type_t<_Tp>,
-		initializer_list<_Up> __il, _Args&&...

Re: [PATCH] avoid calling alloca(0)

2016-11-26 Thread Martin Sebor


On 11/25/2016 12:51 PM, Jeff Law wrote:

On 11/23/2016 06:15 PM, Martin Sebor wrote:


gcc_assert works only in some instances (e.g., in c-ada-spec.c:191)
but not in others because some actually do make the alloca(0) call
at runtime: at a minimum, lto.c:3285, reg-stack.c:2008, and
tree-ssa-threadedge.c:344 assert during bootstrap.

You might have the wrong line numbers for reg-stack.c and lto.c.  You've
pointed to the start of subst_asm_stack_regs and lto_main respectively.
It'd probably be better if you posted the line with a bit of context.


I must have copied the wrong line numbers or had stale sources
in my tree.  Sorry about that.  In lto.c, there are two calls
to XALLOCAVEC.  I believe the first one is the one where the
alloca(0) call takes place:

  1580
  1581    tree *map = XALLOCAVEC (tree, 2 * len);
  1582    for (tree_scc *pscc = *slot; pscc; pscc = pscc->next)
--
  1610    {
  1611      tree *map2 = XALLOCAVEC (tree, 2 * len);
  1612      for (unsigned i = 0; i < len; ++i)

In reg-stack.c it's these three:

  2052
  2053    note_reg = XALLOCAVEC (rtx, i);
  2054    note_loc = XALLOCAVEC (rtx *, i);
  2055    note_kind = XALLOCAVEC (enum reg_note, i);
  2056

To find all such calls I modified GCC to emit an inform call for
every XALLOCAVEC invocation with a zero argument, configured the
patched GCC on x86_64 with all languages (including lto),
bootstrapped it, ran the full test suite, and extracted the set
of unique notes from the logs.  Attached in the .log file is
the output along with counts of each.  Curiously, neither of
the two above shows up, even though adding asserts for them
broke bootstrap.  I haven't investigated why.
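
The kind of instrumentation described above can be sketched in isolation like this. It is not the attached patch: CHECKED_ALLOCAVEC, note_alloca_count and g_zero_size_requests are hypothetical names, the message goes to stderr instead of through inform, and __builtin_alloca is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Running count of zero-element requests seen so far (hypothetical helper state).
static unsigned long g_zero_size_requests = 0;

// Report a zero element count together with the call site, then pass it through.
inline std::size_t note_alloca_count(std::size_t n, const char* file, int line) {
  if (n == 0) {
    ++g_zero_size_requests;
    std::fprintf(stderr, "%s:%d: alloca called with a zero argument\n", file, line);
  }
  return n;
}

// Same shape as an XALLOCAVEC-style macro, with the count routed through the
// checker. The allocation itself is unchanged, so a zero request still
// reaches alloca.
#define CHECKED_ALLOCAVEC(T, N) \
  (static_cast<T*>(__builtin_alloca(sizeof(T) * note_alloca_count((N), __FILE__, __LINE__))))
```

Collecting the unique file:line pairs from the resulting messages gives a list of the same form as the log that follows.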

Martin

PS The patch I used to get the output is in the attached .diff
file.
gcc/ada/gcc-interface/utils.c:5623: void def_fn_type(builtin_type, builtin_type, bool, int, ...): alloca called with a zero argument
gcc/calls.c:3260: rtx_def* expand_call(tree, rtx, int): alloca called with a zero argument
gcc/c-family/c-common.c:3914: void def_fn_type(builtin_type, builtin_type, bool, int, ...): alloca called with a zero argument
gcc/cp/call.c:3141: z_candidate* add_template_candidate_real(z_candidate**, tree, tree, tree, tree, const vec*, tree, tree, tree, int, tree, unification_kind_t, tsubst_flags_t): alloca called with a zero argument
gcc/cp/pt.c:11362: tree_node* tsubst_template_args(tree, tree, tsubst_flags_t, tree): alloca called with a zero argument
gcc/cp/semantics.c:1444: tree_node* finish_asm_stmt(int, tree, tree, tree, tree, tree): alloca called with a zero argument
gcc/final.c:2632: rtx_insn* final_scan_insn(rtx_insn*, FILE*, int, int, int*): alloca called with a zero argument
gcc/gimple-fold.c:4346: bool fold_stmt_1(gimple_stmt_iterator*, bool, tree_node* (*)(tree)): alloca called with a zero argument
gcc/gimple-fold.c:5852: tree_node* gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree), tree_node* (*)(tree)): alloca called with a zero argument
gcc/gimple-walk.c:844: bool walk_stmt_load_store_addr_ops(gimple*, void*, walk_stmt_load_store_addr_fn, walk_stmt_load_store_addr_fn, walk_stmt_load_store_addr_fn): alloca called with a zero argument
gcc/tree.c:11260: tree_node* build_call_expr_loc(location_t, tree, int, ...): alloca called with a zero argument
gcc/tree.c:11277: tree_node* build_call_expr(tree, int, ...): alloca called with a zero argument
gcc/tree.c:11312: tree_node* build_call_expr_internal_loc(location_t, internal_fn, tree, int, ...): alloca called with a zero argument
gcc/tree-ssa-structalias.c:4930: void find_func_aliases(function*, gimple*): alloca called with a zero argument
gcc/tree-ssa-threadedge.c:343: gimple* record_temporary_equivalences_from_stmts_at_dest(edge, const_and_copies*, avail_exprs_stack*, tree_node* (*)(gimple*, gimple*, avail_exprs_stack*)): alloca called with a zero argument


   CALLS   LOCATION
  ------   --------
  226872   /src/gcc/78284/gcc/c-family/c-common.c:3914
  117141   /src/gcc/78284/gcc/calls.c:3260
  183040   /src/gcc/78284/gcc/cp/call.c:3141
   80400   /src/gcc/78284/gcc/ada/gcc-interface/utils.c:5623
   17671   /src/gcc/78284/gcc/gimple-fold.c:4346
   65348   /src/gcc/78284/gcc/gimple-fold.c:5852
   63468   /src/gcc/78284/gcc/tree-ssa-threadedge.c:343
   32886   /src/gcc/78284/gcc/gimple-walk.c:844
    6578   /src/gcc/78284/gcc/tree-ssa-structalias.c:4930
    4046   /src/gcc/78284/gcc/cp/pt.c:11362
    4866   /src/gcc/78284/gcc/cp/semantics.c:1444
    1484   /src/gcc/78284/gcc/final.c:2632
      80   /src/gcc/78284/gcc/tree.c:11312
      26   /src/gcc/78284/gcc/tree.c:11260
       4   /src/gcc/78284/gcc/tree.c:11277
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 3e3f31e..24c8c32 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -22,6 +22,12 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "symtab.h"
 
+extern void inform (location_t, const char *, ...);
+
+#undef WARN_ALLOCA_ZERO
+#define WARN_ALLOCA_ZERO() \
+

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Segher Boessenkool

On Sat, Nov 26, 2016 at 11:22:44PM +0100, Paolo Bonzini wrote:
> The combine.c hunk instead is needed to simplify cases that do not use the
> ternary operator (the "h" and "i" functions in the testcases) like this:
> 
>   return ((x >> 9) & 1) << 7;
> 
> Normally this is simplified just fine to a single shift and an AND.
> Here, however, the bit to preserve after (x >> 9 << 7) is the QImode
> sign bit, and if_then_else_cond produces a complicated concoction
> involving (ne:SI (subreg:QI ...)).  simplify_if_then_else cannot then
> reduce it back to the original.  In fact, simplify_if_then_else does
> have a similar pattern, but it cannot deal with the subreg.  This is
> easily done by ZERO_EXTENDing the result from QImode back to the
> comparison's mode.  The shift/shift/and or shift/and/shift combination
> can then be reduced to shift+and just like for any other bit position.

The combine part is fine, thanks for the patch.


Segher


> 2016-11-26  Paolo Bonzini  
> 
>   * combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE
>   that isolates a single bit, even if the condition involves
>   subregs.
>   * match.pd: Simplify X ? C : 0 where C is a power of 2 and
>   X tests a single bit.
> 
> 2016-11-26  Paolo Bonzini  
> 
>   * gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c,
>   gcc.dg/fold-and-rshift-2.c: New testcases.

[PATCH] Partial solution to LWG 523

2016-11-26 Thread Tim Shen

Also see discussions from libstdc++/71500.

Bootstrapped and tested on x86_64-linux-gnu.

Thanks!


-- 
Regards,
Tim Shen
commit 6c862a2b84578a651d458b09572551c8391082e4
Author: Tim Shen 
Date:   Sat Nov 26 12:36:20 2016 -0800

2016-11-26  Tim Shen  

PR libstdc++/71500
* include/bits/regex.h (basic_regex::basic_regex): Use ECMAScript
when the syntax is not specified.
* include/bits/regex_compiler.h (_RegexTranslator,
_RegexTranslatorBase): Partially support icase in ranges.
* include/bits/regex_compiler.tcc (_BracketMatcher::_M_apply):
Refactor _M_apply to make the control flow easier to follow, and
call _M_translator._M_match_range as added previously.
* include/bits/shared_ptr_base.h: Fix a typo that causes many
debug check failures.
* testsuite/28_regex/traits/char/icase.cc: Add new tests.
* testsuite/28_regex/traits/char/user_defined.cc: Add new tests.
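
A small user-level check of the two behaviours named in the ChangeLog might look like the sketch below. It assumes a standard library where a regex built with only icase falls back to the ECMAScript grammar and where icase applies to bracket ranges; matches_icase is a hypothetical helper, not one of the new tests.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Only icase is passed, so no grammar flag is specified explicitly.
inline bool matches_icase(const std::string& text) {
  static const std::regex re("[a-c]+", std::regex_constants::icase);
  return std::regex_match(text, re);
}
```

With case-insensitive range matching, "ABC" is expected to match the range a-c just as "abc" does, while "xyz" falls outside it in either case.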

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index aadf312..224d3db 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -762,7 +762,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   template
basic_regex(_FwdIter __first, _FwdIter __last, locale_type __loc,
flag_type __f)
-   : _M_flags(__f), _M_loc(std::move(__loc)),
+   : _M_flags((__f & (ECMAScript | basic | extended | awk | grep | egrep))
+  ? __f : (__f | ECMAScript)),
+   _M_loc(std::move(__loc)),
_M_automaton(__detail::__compile_nfa<_FwdIter, _Rx_traits>(
  std::move(__first), std::move(__last), _M_loc, _M_flags))
{ }
diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index 410d61b..964fb28 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -30,6 +30,15 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+_GLIBCXX_BEGIN_NAMESPACE_CXX11
+
+  template
+class regex_traits;
+
+_GLIBCXX_END_NAMESPACE_CXX11
+_GLIBCXX_END_NAMESPACE_VERSION
+
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -207,17 +216,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // [28.13.14]
   template
-class _RegexTranslator
+class _RegexTranslatorBase
 {
 public:
   typedef typename _TraitsT::char_type   _CharT;
   typedef typename _TraitsT::string_type _StringT;
-  typedef typename std::conditional<__collate,
-   _StringT,
-   _CharT>::type _StrTransT;
+  typedef _StringT _StrTransT;
 
   explicit
-  _RegexTranslator(const _TraitsT& __traits)
+  _RegexTranslatorBase(const _TraitsT& __traits)
   : _M_traits(__traits)
   { }
 
@@ -235,23 +242,86 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _StrTransT
   _M_transform(_CharT __ch) const
   {
-   return _M_transform_impl(__ch, typename integral_constant::type());
+   _StrTransT __str = _StrTransT(1, __ch);
+   return _M_traits.transform(__str.begin(), __str.end());
   }
 
-private:
+  // See LWG 523. It's not efficiently implementable when _TraitsT is not
+  // std::regex_traits<>, and __collate is true. See specializations for
+  // implementations of other cases.
+  bool
+  _M_match_range(const _StrTransT& __first, const _StrTransT& __last,
+const _StrTransT& __s) const
+  { return __first <= __s && __s <= __last; }
+
+protected:
+  bool _M_in_range_icase(_CharT __first, _CharT __last, _CharT __ch) const
+  {
+   typedef std::ctype<_CharT> __ctype_type;
+   const auto& __fctyp = use_facet<__ctype_type>(this->_M_traits.getloc());
+   auto __lower = __fctyp.tolower(__ch);
+   auto __upper = __fctyp.toupper(__ch);
+   return (__first <= __lower && __lower <= __last)
+ || (__first <= __upper && __upper <= __last);
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template
+class _RegexTranslator
+: public _RegexTranslatorBase<_TraitsT, __icase, __collate>
+{
+public:
+  typedef _RegexTranslatorBase<_TraitsT, __icase, __collate> _Base;
+  using _Base::_Base;
+};
+
+  template
+class _RegexTranslator<_TraitsT, __icase, false>
+: public _RegexTranslatorBase<_TraitsT, __icase, false>
+{
+public:
+  typedef _RegexTranslatorBase<_TraitsT, __icase, false> _Base;
+  typedef typename _Base::_CharT _CharT;
+  typedef _CharT _StrTransT;
+
+  using _Base::_Base;
+
   _StrTransT
-  _M_transform_impl(_CharT __ch, false_type) const
+  _M_transform(_CharT __ch) const
   { return __ch; }
 
-  _StrTransT
-  _M_transform_impl(_CharT

Re: [PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Marc Glisse


On Sat, 26 Nov 2016, Paolo Bonzini wrote:


--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
  (cmp (bit_and@2 @0 integer_pow2p@1) @1)
  (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))

+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift); })


What happens if @1 is the sign bit, in a signed type? Do we get an 
arithmetic shift right?
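
To make the concern concrete, here is a sketch with plain integers rather than trees. The source bit is assumed to be the sign bit of a 32-bit int and the destination bit is 8, so the shift count is 28; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What the source expression means: 8 if the sign bit is set, else 0.
constexpr std::int32_t expected_result(std::int32_t a) {
  return (a & INT32_MIN) != 0 ? 8 : 0;
}

// Shifting in an unsigned type gives a logical shift, so only bit 3 can be set.
constexpr std::uint32_t shift_as_unsigned(std::int32_t a) {
  return (static_cast<std::uint32_t>(a) & 0x80000000u) >> 28;
}

// By contrast, shifting the masked value as a signed int is an arithmetic
// shift on GCC: the sign bit is replicated and the result is -8, not 8.
```

Under these assumptions the right-shift branch would need either an unsigned conversion before the shift or a mask after it to preserve the original value.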


--
Marc Glisse

[PATCH] simplify-rtx: Handle truncate of extract

2016-11-26 Thread Segher Boessenkool

simplify_truncation changes the truncation of many operations into
the operation on the truncation.  This patch makes this code also
handle extracts.
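
The identity behind this can be modelled with ordinary integers. The sketch below assumes little-endian bit numbering (the !BITS_BIG_ENDIAN case), a 64-bit source mode and a 32-bit target mode; zero_extract64, zero_extract32 and truncate32 are hypothetical helpers, not RTL functions.

```cpp
#include <cassert>
#include <cstdint>

// Extract `len` bits starting at bit `pos` (pos counted from the least
// significant bit); callers must keep pos below the width of the type.
constexpr std::uint64_t zero_extract64(std::uint64_t x, unsigned len, unsigned pos) {
  return (x >> pos) & (len >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << len) - 1));
}

constexpr std::uint32_t zero_extract32(std::uint32_t x, unsigned len, unsigned pos) {
  return (x >> pos) & (len >= 32 ? ~std::uint32_t{0} : ((std::uint32_t{1} << len) - 1));
}

// Truncation keeps the low 32 bits.
constexpr std::uint32_t truncate32(std::uint64_t x) {
  return static_cast<std::uint32_t>(x);
}

// When len + pos <= 32 the field lies entirely in the low half, so
// truncating after the extract equals extracting after the truncate.
static_assert(truncate32(zero_extract64(0x123456789abcdef0ull, 8, 4))
                  == zero_extract32(truncate32(0x123456789abcdef0ull), 8, 4),
              "extract and truncate commute for a field inside the low 32 bits");
```

A field that reaches above bit 31 does not satisfy the condition, which is why the transformation is guarded by a check on len + pos.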

Tested on powerpc64-linux.  With this patch the rlwimi testcases work.

Is this okay for trunk?


Segher


---
 gcc/simplify-rtx.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index fde2443..e8c142c 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -742,6 +742,36 @@ simplify_truncation (machine_mode mode, rtx op,
}
 }
 
+  /* Turn (truncate:M1 (*_extract:M2 (reg:M2) (len) (pos))) into
+ (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
+ changing len.  */
+  if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
+  && REG_P (XEXP (op, 0))
+  && CONST_INT_P (XEXP (op, 1))
+  && CONST_INT_P (XEXP (op, 2)))
+{
+  rtx op0 = XEXP (op, 0);
+  unsigned HOST_WIDE_INT len = UINTVAL (XEXP (op, 1));
+  unsigned HOST_WIDE_INT pos = UINTVAL (XEXP (op, 2));
+  if (BITS_BIG_ENDIAN && pos >= op_precision - precision)
+   {
+ op0 = simplify_gen_unary (TRUNCATE, mode, op0, GET_MODE (op0));
+ if (op0)
+   {
+ pos -= op_precision - precision;
+ return simplify_gen_ternary (GET_CODE (op), mode, mode, op0,
+  XEXP (op, 1), GEN_INT (pos));
+   }
+   }
+  else if (!BITS_BIG_ENDIAN && precision >= len + pos)
+   {
+ op0 = simplify_gen_unary (TRUNCATE, mode, op0, GET_MODE (op0));
+ if (op0)
+   return simplify_gen_ternary (GET_CODE (op), mode, mode, op0,
+XEXP (op, 1), XEXP (op, 2));
+   }
+}
+
   /* Recognize a word extraction from a multi-word subreg.  */
   if ((GET_CODE (op) == LSHIFTRT
|| GET_CODE (op) == ASHIFTRT)
-- 
1.9.3

Re: [Patch][i386] PR 70118: Fix ubsan warning on SSE2 loadl_epi64 and storel_epi64

2016-11-26 Thread Marc Glisse


On Sat, 26 Nov 2016, Allan Sandfeld Jensen wrote:


Use the recently introduced unaligned variant of __m128i and add a similar
__m64 and use those to make it clear these two intrinsics require neither 128-
bit nor 64-bit alignment.


Thanks for doing this. You'll want Uros or Kirill to review your patch.
There are probably several more places that could do with an unaligned 
fix, but we don't have to find them all at once.
At first I found it strange to use __m64, but then it actually seems like a 
good call to use a type that is not just aligned(1) but also may_alias.


+  *(__m64_u *)__P = __m64(((__v2di)__B)[0]);

gcc complains about this syntax for me, it wants parentheses around 
__m64... Did it pass the testsuite for you?

On the other hand, this seems less complicated:

  *(__m64_u *)__P = *(__m64*)&__B;

I am now wondering if we are not using the __v2di-like types too much, in 
places where the lack of may_alias might be an issue... Or maybe I am 
afraid for no reason and even here the may_alias is unnecessary. Looking 
at dumps also makes me wonder if we could simplify 
view_convert_expr(bit_field_expr) to just bit_field_expr when it is the 
only use.


--
Marc Glisse

[PATCH] combine: Tweak change_zero_ext

2016-11-26 Thread Segher Boessenkool

change_zero_ext handles (zero_extend:M1 (subreg:M2 (reg:M1) ...))
already; this patch extends it to also deal with any
(zero_extend:M1 (subreg:M2 (reg:M3) ...)) where the subreg is not
paradoxical.
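
In integer terms, and assuming the rewritten form is an AND of the low part with a mask, the equivalence being relied on looks like the sketch below. The modes are stand-ins chosen for illustration: M3 is 64 bits, M2 is 8 bits and M1 is 32 bits; via_zero_extend and via_lowpart_and are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// (zero_extend:M1 (subreg:M2 (reg:M3))): take the low 8 bits, widen to 32.
constexpr std::uint32_t via_zero_extend(std::uint64_t reg) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(reg));
}

// Lowpart of the register in M1, masked to the width of M2.
constexpr std::uint32_t via_lowpart_and(std::uint64_t reg) {
  return static_cast<std::uint32_t>(reg) & 0xffu;
}

static_assert(via_zero_extend(0x1122334455667788ull) == via_lowpart_and(0x1122334455667788ull),
              "both forms keep only the low 8 bits");
```

The register no longer has to have mode M1 itself; taking its low part in M1 first gives the same value.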

Tested on powerpc64-linux.  This is needed for some of the rlwimi
testcases to be optimised properly.


Segher


2016-11-26  Segher Boessenkool  

* combine.c (change_zero_ext): Also handle extends from a subreg
to a mode bigger than that of the operand of the subreg.

---
 gcc/combine.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 4b3496b..2c3bcf1 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11275,11 +11275,13 @@ change_zero_ext (rtx pat)
   else if (GET_CODE (x) == ZERO_EXTEND
   && SCALAR_INT_MODE_P (mode)
   && GET_CODE (XEXP (x, 0)) == SUBREG
-  && GET_MODE (SUBREG_REG (XEXP (x, 0))) == mode
+  && !paradoxical_subreg_p (XEXP (x, 0))
   && subreg_lowpart_p (XEXP (x, 0)))
{
  size = GET_MODE_PRECISION (GET_MODE (XEXP (x, 0)));
  x = SUBREG_REG (XEXP (x, 0));
+ if (GET_MODE (x) != mode)
+   x = gen_lowpart_SUBREG (mode, x);
}
   else if (GET_CODE (x) == ZERO_EXTEND
   && SCALAR_INT_MODE_P (mode)
-- 
1.9.3

[PATCH] improve folding of expressions that move a single bit around

2016-11-26 Thread Paolo Bonzini

In code like the following from KVM:

/* it is a read fault? */
error_code = (exit_qualification << 2) & PFERR_FETCH_MASK;

it would be nicer to write

/* it is a read fault? */
error_code = (exit_qualification & VMX_EPT_READ_FAULT_MASK) ? PFERR_FETCH_MASK : 0;

instead of having to know the difference between the positions of the
source and destination bits.  LLVM catches the latter just fine (which
is why I am sending this in stage 3...), but GCC does not, so this
patch adds two patterns to catch it.
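
The equivalence of the two spellings can be checked with a short sketch. The masks here are hypothetical (source bit 2, destination bit 4) and are not the KVM constants; with_shift and with_ternary are made-up names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSrcMask = 1u << 2;  // hypothetical source bit
constexpr std::uint32_t kDstMask = 1u << 4;  // hypothetical destination bit

// First form: the author has to know the bits are two positions apart.
constexpr std::uint32_t with_shift(std::uint32_t x) {
  return (x << 2) & kDstMask;
}

// Second form: only the two mask names are needed.
constexpr std::uint32_t with_ternary(std::uint32_t x) {
  return (x & kSrcMask) ? kDstMask : 0u;
}

static_assert(with_shift(kSrcMask) == with_ternary(kSrcMask), "bit set");
static_assert(with_shift(0u) == with_ternary(0u), "bit clear");
```

Both functions compute the same value for every input, so a compiler is free to turn the ternary into the shift-and-mask form.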

The combine.c hunk instead is needed to simplify cases that do not use the
ternary operator (the "h" and "i" functions in the testcases) like this:

return ((x >> 9) & 1) << 7;

Normally this is simplified just fine to a single shift and an AND.
Here, however, the bit to preserve after (x >> 9 << 7) is the QImode
sign bit, and if_then_else_cond produces a complicated concoction
involving (ne:SI (subreg:QI ...)).  simplify_if_then_else cannot then
reduce it back to the original.  In fact, simplify_if_then_else does
have a similar pattern, but it cannot deal with the subreg.  This is
easily done by ZERO_EXTENDing the result from QImode back to the
comparison's mode.  The shift/shift/and or shift/and/shift combination
can then be reduced to shift+and just like for any other bit position.
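
The reduction mentioned here can be sketched on unsigned integers: moving bit 9 to bit 7 is a net right shift by 2 followed by a mask. The function names are hypothetical.

```cpp
#include <cassert>

// Original spelling: shift down, isolate the bit, shift back up.
constexpr unsigned shift_and_shift(unsigned x) {
  return ((x >> 9) & 1u) << 7;
}

// Reduced spelling: one shift and one AND.
constexpr unsigned shift_and(unsigned x) {
  return (x >> 2) & 0x80u;
}

static_assert(shift_and_shift(1u << 9) == 0x80u, "bit 9 lands on bit 7");
static_assert(shift_and(1u << 9) == 0x80u, "bit 9 lands on bit 7");
```

The result bit is 0x80, which is the sign bit of an 8-bit value; that is the case the paragraph above singles out.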

These forms are not included in the fold-and-rshift-2.c testcase,
because in this case a shift+shift (without the following AND) is a
valid alternative too; and at least on x86 it has the same cost as
shift+and.  Compare:

    movl    %edi, %eax
    sarl    $24, %eax
    andl    $128, %eax
    ret

and

    movl    %edi, %eax
    shrl    $31, %eax
    sall    $7, %eax

Bootstrapped/regtested x86_64-pc-linux-gnu, ok?

Paolo

2016-11-26  Paolo Bonzini  

* combine.c (simplify_if_then_else): Simplify IF_THEN_ELSE
that isolates a single bit, even if the condition involves
subregs.
* match.pd: Simplify X ? C : 0 where C is a power of 2 and
X tests a single bit.

2016-11-26  Paolo Bonzini  

* gcc.dg/fold-and-lshift.c, gcc.dg/fold-and-rshift-1.c,
gcc.dg/fold-and-rshift-2.c: New testcases.

Index: combine.c
===
--- combine.c   (revision 242742)
+++ combine.c   (working copy)
@@ -6522,14 +6522,22 @@
   simplify_shift_const (NULL_RTX, ASHIFT, mode,
gen_lowpart (mode, XEXP (cond, 0)), i);
 
-  /* (IF_THEN_ELSE (NE REG 0) (0) (8)) is REG for nonzero_bits (REG) == 8.  */
+  /* (IF_THEN_ELSE (NE A 0) C1 0) is A or a zero-extend of A if the only
+ non-zero bit in A is C1.  */
   if (true_code == NE && XEXP (cond, 1) == const0_rtx
   && false_rtx == const0_rtx && CONST_INT_P (true_rtx)
-  && GET_MODE (XEXP (cond, 0)) == mode
+  && INTEGRAL_MODE_P (GET_MODE (XEXP (cond, 0)))
   && (UINTVAL (true_rtx) & GET_MODE_MASK (mode))
- == nonzero_bits (XEXP (cond, 0), mode)
+ == nonzero_bits (XEXP (cond, 0), GET_MODE (XEXP (cond, 0)))
   && (i = exact_log2 (UINTVAL (true_rtx) & GET_MODE_MASK (mode))) >= 0)
-return XEXP (cond, 0);
+{
+  rtx val = XEXP (cond, 0);
+  enum machine_mode val_mode = GET_MODE (val);
+  if (val_mode == mode)
+return val;
+  else if (GET_MODE_PRECISION (val_mode) < GET_MODE_PRECISION (mode))
+return simplify_gen_unary (ZERO_EXTEND, mode, val, val_mode);
+}
 
   return x;
 }
Index: match.pd
===
--- match.pd(revision 242742)
+++ match.pd(working copy)
@@ -2554,6 +2554,19 @@
   (cmp (bit_and@2 @0 integer_pow2p@1) @1)
   (icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
  
+/* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
+   convert this into a shift of (A & C).  */
+(simplify
+ (cond
+  (ne (bit_and@2 @0 integer_pow2p@1) integer_zerop)
+  integer_pow2p@3 integer_zerop)
+ (with {
+int shift = wi::exact_log2 (@3) - wi::exact_log2 (@1);
+  }
+  (if (shift > 0)
+   (lshift (convert @2) { build_int_cst (integer_type_node, shift); })
+   (convert (rshift @2 { build_int_cst (integer_type_node, -shift); })
+
 /* If we have (A & C) != 0 where C is the sign bit of A, convert
this into A < 0.  Similarly for (A & C) == 0 into A >= 0.  */
 (for cmp (eq ne)
@@ -2568,6 +2581,19 @@
(with { tree stype = signed_type_for (TREE_TYPE (@0)); }
 (ncmp (convert:stype @0) { build_zero_cst (stype); })
 
+/* If we have A < 0 ? C : 0 where C is a power of 2,
+   convert this into a right shift and AND.  */
+(simplify
+ (cond
+  (lt @0 integer_zerop)
+  integer_pow2p@1 integer_zerop)
+ (with {
+int shift = element_precision (@0) - wi::exact_log2 (@1) - 1;
+  }
+  (bit_and
+   (convert (rshift @0 { build_int_cst (integer_type_node, shift); }))
+   @1)))
+
 /* When the addresses are not directly of decls

[Patch][i386] PR 70118: Fix ubsan warning on SSE2 loadl_epi64 and storel_epi64

2016-11-26 Thread Allan Sandfeld Jensen

Use the recently introduced unaligned variant of __m128i and add a similar 
__m64 and use those to make it clear these two intrinsics require neither 128-
bit nor 64-bit alignment.
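
The effect of such a type can be sketched without the intrinsics headers. The example assumes GCC or Clang, since it relies on the __may_alias__ and __aligned__ attributes; int64_unaligned, load64_unaligned and load64_memcpy are hypothetical names, and a scalar 64-bit integer stands in for the vector type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// A 64-bit integer type that may alias anything and promises only
// byte alignment, in the spirit of the unaligned typedefs in the patch.
typedef std::int64_t int64_unaligned __attribute__((__may_alias__, __aligned__(1)));

// Load through the attributed type: no alignment is assumed for p.
inline std::int64_t load64_unaligned(const void* p) {
  return *static_cast<const int64_unaligned*>(p);
}

// Portable reference implementation for comparison.
inline std::int64_t load64_memcpy(const void* p) {
  std::int64_t value;
  std::memcpy(&value, p, sizeof value);
  return value;
}
```

Dereferencing a plain 64-bit pointer at an odd address is what a sanitizer reports as a misaligned access; going through the attributed type tells the compiler that such an address is expected.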

Allan
Index: gcc/config/i386/emmintrin.h
===
--- gcc/config/i386/emmintrin.h	(revision 242753)
+++ gcc/config/i386/emmintrin.h	(working copy)
@@ -703,9 +703,9 @@
 }
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_loadl_epi64 (__m128i const *__P)
+_mm_loadl_epi64 (__m128i_u const *__P)
 {
-  return _mm_set_epi64 ((__m64)0LL, *(__m64 *)__P);
+  return _mm_set_epi64 ((__m64)0LL, *(__m64_u *)__P);
 }
 
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -721,9 +721,9 @@
 }
 
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_storel_epi64 (__m128i *__P, __m128i __B)
+_mm_storel_epi64 (__m128i_u *__P, __m128i __B)
 {
-  *(long long *)__P = ((__v2di)__B)[0];
+  *(__m64_u *)__P = __m64(((__v2di)__B)[0]);
 }
 
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
Index: gcc/config/i386/mmintrin.h
===
--- gcc/config/i386/mmintrin.h	(revision 242753)
+++ gcc/config/i386/mmintrin.h	(working copy)
@@ -37,6 +37,9 @@
vector types, and their scalar components.  */
 typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
 
+/* Unaligned version of the same type  */
+typedef int __m64_u __attribute__ ((__vector_size__ (8), __may_alias__, __aligned__ (1)));
+
 /* Internal data types for implementing the intrinsics.  */
 typedef int __v2si __attribute__ ((__vector_size__ (8)));
 typedef short __v4hi __attribute__ ((__vector_size__ (8)));

Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.

2016-11-26 Thread Segher Boessenkool

On Sat, Nov 26, 2016 at 09:15:48AM -0700, Jeff Law wrote:
> On 11/26/2016 04:11 AM, Eric Botcazou wrote:
> >> From my investigations on the m68k, the effects on the IL are minimal
> >>with a slight bias towards better code (by suppressing if-conversions of
> >>some now more costly blocks).  *But* the size of the resulting code was
> >>all over the place -- sometimes it was better, others worse.  From
> >>looking at the assembly we seemingly are copying blocks that aren't
> >>strictly necessary.
> >
> >I'm seeing essentially the same thing on SPARC, probably because of the
> >ifcvt change; the rtlanal change seems to be neutral for the architecture.
> Just to be clear, I was only testing the rtlanal change, not the ifcvt 
> change.
> 
> I repeated my test on the GCC runtime libraries for m68k-elf.  Bernd's 
> rtlanal change +.03%, the goof in STC, +9.4%.  So the STC goof still 
> dwarfs the impact to Bernd's change, but not as badly as I saw in the 
> newlib codebase.

orig, i386+rtlanal, i386+rtlanal+ifcvt:

worse:
     alpha   5439003   5455979   5455979
       c6x   2107939   2108931   2108931
      cris   2189380   2193836   2193836
      m32r   3427409   3427541   3427453
      m68k   3228408   3230978   3230978
      mips   4286748   4286964   4286692
    mips64   5564819   5565643   5565291
    parisc   8278881   8289977   8289573
  parisc64   7234619   7249187   7249139
   powerpc   8438949   8440005   8440005
 powerpc64  14499969  14508689  14508689
      s390  12778748  12779228  12779220
   shnommu   1369868   1371020   1371020
   sparc64   5921556   5922172   5922172
    tilegx  12297581  12307461  12307461
   tilepro  11215603  11227339  11227339
    xtensa   1776196   1779152   1779152

better:
  blackfin   1973931   1973867   1973867
       frv   3638192   3637792   3637792
     h8300   1060172   1059976   1059976
      i386   9742984   9742463   9742463
      ia64  15402035  15396171  15396171
   mn10300   2360025   2358201   2358201
     nios2   3185625   3176693   3176693
    x86_64  10360418  10359588  10359588

did not build:
       arc         0         0         0
       arm         0         0         0
     arm64         0         0         0
microblaze         0         0         0
        sh         0         0         0
     sparc         0         0         0


tl;dr: The ifcvt change doesn't do much, but the cost change does.


Segher

[PATCHv3] [AARCH64] Add variant support to -m="native" and add thunderxt88p1.

2016-11-26 Thread Andrew Pinski

On Tue, Nov 1, 2016 at 11:08 AM, Andrew Pinski  wrote:
> On Tue, Nov 17, 2015 at 2:10 PM, Andrew Pinski  wrote:
>> Since ThunderX T88 pass 1 (variant 0) is an ARMv8 part while pass 2
>> (variant 1) is an ARMv8.1 part, I needed to add detection of the variant
>> to tell them apart. I also simplified the code a little and combined the
>> single-core and arch-detecting cases so it would be easier to add the
>> variant.
>
> Actually it is a bit more complex than what I said here; see below for
> the full table of options and what is enabled/disabled now.
>
>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>> Tested -mcpu=native on both T88 pass 1 and T88 pass 2 to make sure it is
>> detecting the two separately.
>
>
> Here is the final patch in this series updated; I changed the cpu name
> slightly and made sure I updated invoke.texi too.
>
> The names are going to match the names in LLVM (worked with our LLVM
> engineer here at Cavium about the names).
> Here are the names recorded and what each one enables:
> -mcpu=thunderx:
> *Matches part num 0xA0 (reserved for ThunderX 8x series)
> *T88 Pass 2 scheduling
> *Hardware prefetching (software prefetching disabled)
> *LSE enabled
> *no v8.1
>
> -mcpu=thunderxt88:
> *Matches part num 0xA1
> *T88 Pass 2 scheduling
> *software prefetching enabled
> *LSE enabled
> *no v8.1
>
> -mcpu=thunderxt88p1 (only for GCC):
> *Matches part num 0xA1, variant 0
> *T88 Pass 1 scheduling
> *software prefetching enabled
> *no LSE enabled
> *no v8.1
>
> -mcpu=thunderxt81 and -mcpu=thunderxt83:
> *Matches part num 0xA2/0xA3
> *T88 Pass 2 scheduling
> *Hardware prefetching (software prefetching disabled)
> *LSE enabled
> *v8.1
>
>
> I have not hooked up software vs hardware prefetching and the
> scheduler parts (the next patch will do part of that); both ARMv8.1-a
> and LSE parts are hooked up as those parts are only in
> aarch64-cores.def.
>
> OK?  Bootstrapped and tested on ThunderX T88 and ThunderX T81
> (aarch64-linux-gnu).

Here is the latest version of the patch.  Updated for the latest
additions of "falkor".   Also added a comment about the order of
"thunderxt88p1" and "thunderxt88".

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
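
Why the order of "thunderxt88p1" and "thunderxt88" matters can be seen in a small lookup sketch. It is not the driver code: CoreEntry, kCores and find_core are hypothetical, the table holds only three entries, and the implementer value 0x43 is an assumption for illustration. The first matching entry wins, and a variant of -1 matches any variant.

```cpp
#include <cassert>
#include <cstring>

constexpr int kAllVariants = -1;  // wildcard: entry applies to every variant

struct CoreEntry {
  const char* name;
  unsigned implementer;
  unsigned part;
  int variant;
};

// The variant-specific entry must come before the wildcard entry for the
// same part, otherwise the wildcard would always be chosen first.
constexpr CoreEntry kCores[] = {
  {"thunderx",      0x43, 0xa0, kAllVariants},
  {"thunderxt88p1", 0x43, 0xa1, 0},
  {"thunderxt88",   0x43, 0xa1, kAllVariants},
};

// Return the first entry matching implementer, part and variant, or nullptr.
inline const char* find_core(unsigned implementer, unsigned part, int variant) {
  for (const CoreEntry& core : kCores) {
    if (core.implementer == implementer && core.part == part
        && (core.variant == kAllVariants || core.variant == variant))
      return core.name;
  }
  return nullptr;
}
```

With this table, part 0xa1 variant 0 resolves to thunderxt88p1 and any other variant of the same part resolves to thunderxt88.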

Thanks,
Andrew Pinski

ChangeLog:

* config/aarch64/aarch64-cores.def: Add -1 as the variant to all of the cores.
(thunderx): Update to include LSE by default.
(thunderxt88p1): New core.
(thunderxt88): New core.
(thunderxt81): New core.
(thunderxt83): New core.
* config/aarch64/driver-aarch64.c (struct aarch64_core_data): Add variant field.
(ALL_VARIANTS): New define.
(AARCH64_CORE): Support VARIANT operand.
(cpu_data): Likewise.
(host_detect_local_cpu): Parse variant field of /proc/cpuinfo.  Combine the arch
and single core case and support variant searching.
* common/config/aarch64/aarch64-common.c (AARCH64_CORE): Add VARIANT operand.
* config/aarch64/aarch64-opts.h (AARCH64_CORE): Likewise.
* config/aarch64/aarch64.c (AARCH64_CORE): Likewise.
* config/aarch64/aarch64.h (AARCH64_CORE): Likewise.
* config/aarch64/aarch64-tune.md: Regenerate.

* doc/invoke.texi (AARCH64/mtune): Document thunderxt88,
thunderxt88p1, thunderxt81, thunderxt83 as available options.
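
As a rough illustration of the variant matching described in the ChangeLog
above, the lookup could be sketched as below.  This is not the code from
driver-aarch64.c: CoreEntry, the cores table and find_core are hypothetical
names, ALL_VARIANTS is assumed to be the -1 wildcard mentioned for
aarch64-cores.def, and 0x43 is assumed as the Cavium implementer ID.  The
part numbers follow the table earlier in this mail.

```cpp
#include <cassert>
#include <cstring>

// Assumed wildcard value: an entry with this variant matches any variant.
constexpr int ALL_VARIANTS = -1;

// Hypothetical table entry; the real macro carries more fields.
struct CoreEntry {
  const char *name;
  unsigned implementer;
  unsigned part;
  int variant;
};

// Order matters: the exact-variant entry (thunderxt88p1) must come
// before the wildcard entry for the same part (thunderxt88).
static const CoreEntry cores[] = {
  {"thunderx",      0x43, 0xA0, ALL_VARIANTS},
  {"thunderxt88p1", 0x43, 0xA1, 0},
  {"thunderxt88",   0x43, 0xA1, ALL_VARIANTS},
  {"thunderxt81",   0x43, 0xA2, ALL_VARIANTS},
  {"thunderxt83",   0x43, 0xA3, ALL_VARIANTS},
};

// Return the first entry matching implementer, part and variant,
// or nullptr when nothing matches.
const char *find_core(unsigned implementer, unsigned part, int variant)
{
  for (const CoreEntry &c : cores)
    if (c.implementer == implementer && c.part == part
        && (c.variant == ALL_VARIANTS || c.variant == variant))
      return c.name;
  return nullptr;
}
```

With this ordering a part 0xA1 reporting variant 0 resolves to thunderxt88p1
and any other variant falls through to thunderxt88; swapping the two entries
would let the wildcard shadow the pass 1 entry.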
Index: common/config/aarch64/aarch64-common.c
===
--- common/config/aarch64/aarch64-common.c  (revision 242888)
+++ common/config/aarch64/aarch64-common.c  (working copy)
@@ -145,7 +145,7 @@ struct arch_to_arch_name
the default set of architectural feature flags they support.  */
 static const struct processor_name_to_arch all_cores[] =
 {
-#define AARCH64_CORE(NAME, X, IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART) \
+#define AARCH64_CORE(NAME, X, IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART, VARIANT) \
   {NAME, AARCH64_ARCH_##ARCH_IDENT, FLAGS},
 #include "config/aarch64/aarch64-cores.def"
   {"generic", AARCH64_ARCH_8A, AARCH64_FL_FOR_ARCH8},
Index: config/aarch64/aarch64-cores.def
===
--- config/aarch64/aarch64-cores.def  (revision 242888)
+++ config/aarch64/aarch64-cores.def  (working copy)
@@ -21,7 +21,7 @@
 
Before using #include to read this file, define a macro:
 
-  AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART)
+  AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH_IDENT, FLAGS, COSTS, IMP, PART, VARIANT)
 
The CORE_NAME is the name of the core, represented as a string constant.
The CORE_IDENT is the name of the core, represented as an identifier.
@@ -39,40 +39,47 @@
PART is the part number of the CPU.  On a GNU/Linux system it can be
found in /proc/cpuinfo.  For big.LITTLE systems this should use the
macro AARCH64_BIG_LITTLE where the big part

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

>> If not, we definitely need to fix the documentation of LTIME, since
>> the current version simply does not work with TIME8(), unless one uses
>> -fdefault-integer-8 (which is not mentioned in the docu).
>
> What about the attached patch

Yes, looks good to me. Ok to commit!

One minor nit (optional):

@@ -9635,10 +9650,15 @@ To stat an open file: @ref{FSTAT}, to st

 @table @asis
 @item @emph{Description}:
-Given a system time value @var{TIME} (as provided by the @code{TIME8}
+Given a system time value @var{TIME} (as provided by the @code{TIME}
 intrinsic), fills @var{VALUES} with values extracted from it appropriate
 to the local time zone using @code{localtime(3)}.

I would use @ref{TIME} here (there are two places where this occurs).
Also I would add @ref{DATE_AND_TIME} to the "see also" section.

Thanks,
Janus

Re: Documentation of LTIME

2016-11-26 Thread Dominique d'Humières



> Still, since we internally already have both implementations for
> kind=4 and kind=8, we could as well make use of them, I guess.

I leave this to people who understand what to do.

> If not, we definitely need to fix the documentation of LTIME, since
> the current version simply does not work with TIME8(), unless one uses
> -fdefault-integer-8 (which is not mentioned in the docu).

What about the attached patch?



patch-78545
Description: Binary data


Dominique

> 
> Cheers,
> Janus

Re: [PATCH] Fix a couple of issues in gimple-ssa-sprintf.c

2016-11-26 Thread Martin Sebor


On 11/25/2016 10:20 AM, Jakub Jelinek wrote:

Hi!

Here is an attempt to fix a couple of bugs in gimple-ssa-sprintf.c.
First of all, it assumes size_t is always the same as uintmax_t, which
is not necessarily the case.
Second, it uses static tree {,u}intmax_type_node; variables for caching
those types, but doesn't register them with GC; but their computation
is quite cheap, so I think it isn't worth wasting a GC root for those,
especially if we compute it only in the very rare case when somebody
uses the j modifier.
Third, the code assumes that ptrdiff_t is the signed type for size_t.
E.g. vms is one port where that isn't true, ptrdiff_t can be 64-bit,
while size_t is 32-bit.
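
To make the third point concrete, here is a small sketch of which argument
types the z, t and j length modifiers pair with.  The aliases signed_size and
unsigned_ptrdiff and the function format_z_t_j are hypothetical names used
only for illustration; the point is that "%zd" takes the signed counterpart
of size_t (not ptrdiff_t) and "%tu" the unsigned counterpart of ptrdiff_t
(not size_t), which is the distinction the patch restores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <type_traits>

// Signed type with the same width as size_t: what "%zd" expects.
using signed_size = std::make_signed_t<std::size_t>;
// Unsigned type with the same width as ptrdiff_t: what "%tu" expects.
using unsigned_ptrdiff = std::make_unsigned_t<std::ptrdiff_t>;

static_assert(sizeof(signed_size) == sizeof(std::size_t),
              "z modifier keeps the width of size_t");
static_assert(sizeof(unsigned_ptrdiff) == sizeof(std::ptrdiff_t),
              "t modifier keeps the width of ptrdiff_t");

// Format one value with each of the three length modifiers.
std::string format_z_t_j(signed_size z, unsigned_ptrdiff t, std::intmax_t j)
{
  char buf[96];
  std::snprintf(buf, sizeof buf, "%zd %tu %jd", z, t, j);
  return buf;
}
```

Nothing here requires size_t, ptrdiff_t and intmax_t to share a width, which
is why each modifier has to be resolved from its own base type.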

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


It looks fine to me.  Thanks for fixing these corner cases!

Martin

PS As the comment above the build_intmax_type_node function
mentions, its body was copied from lto/lto-lang.c.  It would be
useful not to have to duplicate the code in the middle-end and
instead provide a shared definition of each of the nodes so that
they could be used everywhere.  Ditto for ptrdiff_type_node.
It would make it less likely for patches to break when someone
forgets to bootstrap and test Ada, for instance, or where the
use of the type isn't exercised by bootstrap or the test suite
due to gaps in coverage.



2016-11-25  Jakub Jelinek  

* gimple-ssa-sprintf.c (build_intmax_type_nodes): Look at
UINTMAX_TYPE rather than SIZE_TYPE.  Add gcc_unreachable if
intmax_t couldn't be determined.
(format_integer): Make {,u}intmax_type_node no longer static,
initialize them only when needed.  For z and t use
signed_or_unsigned_type_for instead of assuming size_t and
ptrdiff_t have the same precision.

--- gcc/gimple-ssa-sprintf.c.jj 2016-11-25 09:49:47.0 +0100
+++ gcc/gimple-ssa-sprintf.c 2016-11-25 10:26:58.763114194 +0100
@@ -733,23 +733,23 @@ format_percent (const conversion_spec &,
 }


-/* Ugh.  Compute intmax_type_node and uintmax_type_node the same way
-   lto/lto-lang.c does it.  This should be available in tree.h.  */
+/* Compute intmax_type_node and uintmax_type_node similarly to how
+   tree.c builds size_type_node.  */

 static void
 build_intmax_type_nodes (tree *pintmax, tree *puintmax)
 {
-  if (strcmp (SIZE_TYPE, "unsigned int") == 0)
+  if (strcmp (UINTMAX_TYPE, "unsigned int") == 0)
 {
   *pintmax = integer_type_node;
   *puintmax = unsigned_type_node;
 }
-  else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
+  else if (strcmp (UINTMAX_TYPE, "long unsigned int") == 0)
 {
   *pintmax = long_integer_type_node;
   *puintmax = long_unsigned_type_node;
 }
-  else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
+  else if (strcmp (UINTMAX_TYPE, "long long unsigned int") == 0)
 {
   *pintmax = long_long_integer_type_node;
   *puintmax = long_long_unsigned_type_node;
@@ -762,12 +762,14 @@ build_intmax_type_nodes (tree *pintmax,
char name[50];
sprintf (name, "__int%d unsigned", int_n_data[i].bitsize);

-   if (strcmp (name, SIZE_TYPE) == 0)
+   if (strcmp (name, UINTMAX_TYPE) == 0)
  {
*pintmax = int_n_trees[i].signed_type;
*puintmax = int_n_trees[i].unsigned_type;
+   return;
  }
  }
+  gcc_unreachable ();
 }
 }

@@ -851,15 +853,8 @@ format_pointer (const conversion_spec 
 static fmtresult
 format_integer (const conversion_spec &spec, tree arg)
 {
-  /* These are available as macros in the C and C++ front ends but,
- sadly, not here.  */
-  static tree intmax_type_node;
-  static tree uintmax_type_node;
-
-  /* Initialize the intmax nodes above the first time through here.  */
-  if (!intmax_type_node)
-    build_intmax_type_nodes (&intmax_type_node, &uintmax_type_node);
-
+  tree intmax_type_node;
+  tree uintmax_type_node;
   /* Set WIDTH and PRECISION to either the values in the format
  specification or to zero.  */
   int width = spec.have_width ? spec.width : 0;
@@ -909,19 +904,20 @@ format_integer (const conversion_spec 
   break;

 case FMT_LEN_z:
-  dirtype = sign ? ptrdiff_type_node : size_type_node;
+  dirtype = signed_or_unsigned_type_for (!sign, size_type_node);
   break;

 case FMT_LEN_t:
-  dirtype = sign ? ptrdiff_type_node : size_type_node;
+  dirtype = signed_or_unsigned_type_for (!sign, ptrdiff_type_node);
   break;

 case FMT_LEN_j:
+  build_intmax_type_nodes (&intmax_type_node, &uintmax_type_node);
   dirtype = sign ? intmax_type_node : uintmax_type_node;
   break;

 default:
-   return fmtresult ();
+  return fmtresult ();
 }

   /* The type of the argument to the directive, either deduced from

Jakub
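
The control flow that the patch gives build_intmax_type_nodes (match the
standard names first, then the __intN names, return on the first hit, and
treat "no match" as unreachable) can be sketched outside of GCC as follows.
This is only an illustration: int_n_bits is a made-up stand-in for GCC's
int_n_data[], precision_for_type_name is a hypothetical function, and
std::abort stands in for gcc_unreachable ().

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical list of enabled __intN widths (GCC keeps these in int_n_data[]).
static const int int_n_bits[] = {20, 128};

// Map a type-name string to its precision in bits.
int precision_for_type_name(const char *type_name)
{
  if (std::strcmp(type_name, "unsigned int") == 0)
    return static_cast<int>(sizeof(unsigned int) * CHAR_BIT);
  if (std::strcmp(type_name, "long unsigned int") == 0)
    return static_cast<int>(sizeof(unsigned long) * CHAR_BIT);
  if (std::strcmp(type_name, "long long unsigned int") == 0)
    return static_cast<int>(sizeof(unsigned long long) * CHAR_BIT);

  for (int bits : int_n_bits)
    {
      char name[50];
      std::snprintf(name, sizeof name, "__int%d unsigned", bits);
      if (std::strcmp(name, type_name) == 0)
        return bits;  // early exit, like the `return;` the patch adds
    }

  std::abort();  // no match: stands in for gcc_unreachable ()
}
```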

Re: [Patch, Fortran] PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979

2016-11-26 Thread Janus Weil

2016-11-26 17:37 GMT+01:00 Dominique d'Humières :
>
>> Le 26 nov. 2016 à 10:45, Janus Weil  a écrit :
>>
>> ping!
>>
> The patch is working as expected. Note that the removed block was introduced 
> by Daniel Franke at r126826.

Right, thanks for the reference. I think that commit is plain wrong,
at least the part that says "Specification functions are constant".

One can easily construct a specification function that is not a
compile-time constant. For example, just take the module function
"get_i" in the test case and have it depend on a variable declared in
the module header.

module mytypes
   implicit none
   integer, save :: i = 13
 contains
   pure integer function get_i ()
 get_i = i
   end function
  subroutine set_i (j)
integer, intent(in) :: j
i = j
  end subroutine
end module

Cheers,
Janus

Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.

2016-11-26 Thread Segher Boessenkool

On Sat, Nov 26, 2016 at 03:44:22AM -0700, Jeff Law wrote:
> On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> >On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
> >>On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> >>>On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> 
> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.
> >>>But the problems with the costing system exhibit themselves as a code
> >>>quality regression.  In the end that's what the end-users see -- a
> >>>regression in the quality of the code GCC generates.
> >>
> >>Yes, exactly -- and I fear this all-encompassing change will cause just
> >>such a regression for many users.  Tests are running, will know more
> >>later today (or tomorrow).
> >>
> >>The PR is about a very specific problem; the patch is not.  The patch
> >>is not a bug fix.  If we allow anything that "makes things better" in
> >>stage 3, what makes it different from stage 1 then?
> >
> >Here are results of testing with trunk right before the three patches,
> >compared with the three patches.  This lists the sizes of the vmlinux
> >of a Linux kernel build for that arch.
> Thanks.  While I question how much emphasis we should put on code sizes 
> as a way to measure this change, it can still point out interesting 
> effects, positive and negative.

Code size I can test "easily" for many archs (it still takes almost a
full day), and it does correlate well with local optimisations on most
archs.  I have looked at the actual differences on some archs (which
takes a lot more time still), and the differences are all over the place.
Which suggests changing the costs is a big change for most of those
archs; and they all have been tuned for the *old* situation, so this
makes things worse in the short run, whether the new costs are better
or not.

Not a change for stage 3, and not something *I* should need to analyse
anyway; this analysis needs to be done *before* the patch goes in.

> From my investigations on the m68k, the effects on the IL are minimal 
> with a slight bias towards better code (by suppressing if-conversions of 
> some now more costly blocks).  *But* the size of the resulting code was 
> all over the place -- sometimes it was better, others worse.  From 
> looking at the assembly we seemingly are copying blocks that aren't 
> strictly necessary.
> 
> Enter bb-reorder and the STC algorithm.  It is copying blocks *very* 
> aggressively, like absurdly aggressively on the m68k.  Of course it 
> doesn't help that the m68k doesn't define a length attribute and as a 
> result STC thinks every insn has size 0 and thus block copying is zero cost.
> 
> I want to verify the #s, so take this with a slight grain of salt.  The 
> net changes to newlib's .o's for Bernd's work -- +30 bytes.  The effect 
> of the STC issue above -- +1115586 bytes.  Or to put it another way, 
> Bernd's changes, +.0003% change.  STC, +13.8%.

STC wasn't changed in the patch.  Maybe interactions with STC are what
cause all the problems, but that is an argument *against* doing this
after stage 1.


Segher

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

>>>> Well LTIME cannot accept both kind(4) and kind(8) arguments. The reference 
>>>> to TIME8 looks like a mistake, isn’t it?
>>>
>>> Huh, in libgfortran I see two versions with different kinds (ltime_i4
>>> and ltime_i8), but in my tests I never get LTIME to work with kind=8
>>> arguments. I guess that is the real bug here ...
>>
>> I think the origin of the bug is:
>>
>> void
>> gfc_resolve_ltime (gfc_code *c)
>> {
>>   c->resolved_sym
>> = gfc_get_intrinsic_sub_symbol (gfc_get_string (PREFIX ("ltime_i%d"),
>> gfc_default_integer_kind));
>> }
>>
>>
>> This always uses the ITIME version corresponding to
>> gfc_default_integer_kind, disregarding the actual kind of the
>> arguments.
>
> LTIME, ITIME, TIME, TIME8, IDATE are g77 intrinsics, from back before
> newfangled things like kinds. So there are versions for integer kind=4
> and kind=8 due to -fdefault-integer-8.

Ok, I see. Thanks for the comment.

Still, since we internally already have both implementations for
kind=4 and kind=8, we could as well make use of them, I guess.

If not, we definitely need to fix the documentation of LTIME, since
the current version simply does not work with TIME8(), unless one uses
-fdefault-integer-8 (which is not mentioned in the docu).

Cheers,
Janus

Re: Documentation of LTIME

2016-11-26 Thread Janne Blomqvist

On Sat, Nov 26, 2016 at 7:03 PM, Janus Weil  wrote:
> 2016-11-26 17:58 GMT+01:00 Janus Weil :
>>>> * possibly add some more cross-links to intrinsic.texi
>>>
>>> Could you please elaborate?
>>
>> I just mean it might be useful to add some more links from LTIME to
>> ITIME, IDATE and DATE_AND_DATE (and back?). They are are all very
>> similar.
>
> Sorry, I meant DATE_AND_TIME here.
>
>
>>> Well LTIME cannot accept both kind(4) and kind(8) arguments. The reference 
>>> to TIME8 looks like a mistake, isn’t it?
>>
>> Huh, in libgfortran I see two versions with different kinds (ltime_i4
>> and ltime_i8), but in my tests I never get LTIME to work with kind=8
>> arguments. I guess that is the real bug here ...
>
> I think the origin of the bug is:
>
> void
> gfc_resolve_ltime (gfc_code *c)
> {
>   c->resolved_sym
> = gfc_get_intrinsic_sub_symbol (gfc_get_string (PREFIX ("ltime_i%d"),
> gfc_default_integer_kind));
> }
>
>
> This always uses the ITIME version corresponding to
> gfc_default_integer_kind, disregarding the actual kind of the
> arguments.
>
> Cheers,
> Janus

LTIME, ITIME, TIME, TIME8, IDATE are g77 intrinsics, from back before
newfangled things like kinds. So there are versions for integer kind=4
and kind=8 due to -fdefault-integer-8.

DATE_AND_TIME is different since it's in the current standard and IIRC
is specified to work with any integer kind.

-- 
Janne Blomqvist

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

2016-11-26 17:58 GMT+01:00 Janus Weil :
>>> * possibly add some more cross-links to intrinsic.texi
>>
>> Could you please elaborate?
>
> I just mean it might be useful to add some more links from LTIME to
> ITIME, IDATE and DATE_AND_DATE (and back?). They are are all very
> similar.

Sorry, I meant DATE_AND_TIME here.


>> Well LTIME cannot accept both kind(4) and kind(8) arguments. The reference 
>> to TIME8 looks like a mistake, isn’t it?
>
> Huh, in libgfortran I see two versions with different kinds (ltime_i4
> and ltime_i8), but in my tests I never get LTIME to work with kind=8
> arguments. I guess that is the real bug here ...

I think the origin of the bug is:

void
gfc_resolve_ltime (gfc_code *c)
{
  c->resolved_sym
= gfc_get_intrinsic_sub_symbol (gfc_get_string (PREFIX ("ltime_i%d"),
gfc_default_integer_kind));
}


This always uses the LTIME version corresponding to
gfc_default_integer_kind, disregarding the actual kind of the
arguments.
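
To spell out the difference, here is a stand-alone sketch that models only
the choice of library symbol name.  It is not gfortran code: ltime_symbol,
resolve_ltime_current and resolve_ltime_by_arg_kind are hypothetical helpers,
and it assumes that PREFIX prepends "_gfortran_" to the name.  It also leaves
out the separate argument check that rejects kind=8 arguments before
resolution is reached.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build the library symbol name for a given integer kind.
// Assumption: PREFIX ("ltime_i%d") expands to "_gfortran_ltime_i%d".
std::string ltime_symbol(int kind)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "_gfortran_ltime_i%d", kind);
  return buf;
}

// Models the current behaviour: the kind of the argument is ignored.
std::string resolve_ltime_current(int default_kind, int /* arg_kind */)
{
  return ltime_symbol(default_kind);
}

// Models the alternative: pick the variant from the kind of the argument.
std::string resolve_ltime_by_arg_kind(int /* default_kind */, int arg_kind)
{
  return ltime_symbol(arg_kind);
}
```

With a default integer kind of 4 and a kind=8 argument, the first helper
yields the _i4 name and the second the _i8 name.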

Cheers,
Janus

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

>> * possibly add some more cross-links to intrinsic.texi
>
> Could you please elaborate?

I just mean it might be useful to add some more links from LTIME to
ITIME, IDATE and DATE_AND_DATE (and back?). They are all very
similar.



>> And one last point: The description of LTIME mentions TIME8, but if I try 
>> this:
>>
>>  call ltime(time8(), values)
>>
>> I get:
>>
>> Error: ‘time’ argument of ‘ltime’ intrinsic at (1) must be of kind 4
>
> Well LTIME cannot accept both kind(4) and kind(8) arguments. The reference to 
> TIME8 looks like a mistake, isn’t it?

Huh, in libgfortran I see two versions with different kinds (ltime_i4
and ltime_i8), but in my tests I never get LTIME to work with kind=8
arguments. I guess that is the real bug here ...

Maybe one can at least linkify the TIME8 in the documentation? (Or
mention both intrinsics, TIME and TIME8).

Cheers,
Janus

Re: [Patch, Fortran] PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979

2016-11-26 Thread Dominique d'Humières


> Le 26 nov. 2016 à 10:45, Janus Weil  a écrit :
> 
> ping!
> 
The patch is working as expected. Note that the removed block was introduced 
by Daniel Franke at r126826.

Dominique.

Re: Documentation of LTIME

2016-11-26 Thread Dominique d'Humières


> Le 26 nov. 2016 à 17:00, Janus Weil  a écrit :
> 
> And one last point: The description of LTIME mentions TIME8, but if I try 
> this:
> 
>  call ltime(time8(), values)
> 
> I get:
> 
> Error: ‘time’ argument of ‘ltime’ intrinsic at (1) must be of kind 4

Well, LTIME cannot accept both kind(4) and kind(8) arguments. The reference to 
TIME8 looks like a mistake, doesn’t it?

Dominique
 
> 
> So maybe the reference to TIME8 should be replaced by TIME?
> 
> Cheers,
> Janus
>

Re: Documentation of LTIME

2016-11-26 Thread Dominique d'Humières


> Le 26 nov. 2016 à 16:54, Janus Weil  a écrit :
> 
> 2016-11-26 16:49 GMT+01:00 Dominique d'Humières :
>> If there is no objection, I’ll commit the following patch
> 
> Looks good to me!
> 
> If you want, you could also fix the remaining items I mentioned in
> comment 1 in the PR:
> * apply the same fix to GMTIME

Done

> * fix it in the libgfortran sources as well

Done

> * possibly add some more cross-links to intrinsic.texi

Could you please elaborate?

Dominique

> 
> Thanks,
> Janus

Re: Documentation of LTIME

2016-11-26 Thread Francisco Pena

Dominique,

your patch sounds good to me.

2016-11-26 16:49 GMT+01:00 Dominique d'Humières :
> If there is no objection, I’ll commit the following patch
>
> --- ../_clean/gcc/fortran/intrinsic.texi  2016-11-25 22:03:20.0 +0100
> +++ gcc/fortran/intrinsic.texi  2016-11-26 16:47:12.0 +0100
> @@ -9663,11 +9663,11 @@ The elements of @var{VALUES} are assigne
>  seconds
>  @item Minutes after the hour, range 0--59
>  @item Hours past midnight, range 0--23
> -@item Day of month, range 0--31
> -@item Number of months since January, range 0--12
> +@item Day of month, range 1--31
> +@item Number of months since January, range 0--11
>  @item Years since 1900
>  @item Number of days since Sunday, range 0--6
> -@item Days since January 1
> +@item Days since January 1, range 0--365
>  @item Daylight savings indicator: positive if daylight savings is in
>  effect, zero if not, and negative if the information is not available.
>  @end enumerate
>
> Dominique

Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.

2016-11-26 Thread Jeff Law


On 11/26/2016 04:11 AM, Eric Botcazou wrote:

>> From my investigations on the m68k, the effects on the IL are minimal
>> with a slight bias towards better code (by suppressing if-conversions of
>> some now more costly blocks).  *But* the size of the resulting code was
>> all over the place -- sometimes it was better, others worse.  From
>> looking at the assembly we seemingly are copying blocks that aren't
>> strictly necessary.
>
> I'm seeing essentially the same thing on SPARC, probably because of the ifcvt
> change; the rtlanal change seems to be neutral for the architecture.

Just to be clear, I was only testing the rtlanal change, not the ifcvt 
change.


I repeated my test on the GCC runtime libraries for m68k-elf.  Bernd's 
rtlanal change +.03%, the goof in STC, +9.4%.  So the STC goof still 
dwarfs the impact to Bernd's change, but not as badly as I saw in the 
newlib codebase.



Jeff

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

And one last point: The description of LTIME mentions TIME8, but if I try this:

  call ltime(time8(), values)

I get:

Error: ‘time’ argument of ‘ltime’ intrinsic at (1) must be of kind 4

So maybe the reference to TIME8 should be replaced by TIME?

Cheers,
Janus



2016-11-26 16:54 GMT+01:00 Janus Weil :
> 2016-11-26 16:49 GMT+01:00 Dominique d'Humières :
>> If there is no objection, I’ll commit the following patch
>
> Looks good to me!
>
> If you want, you could also fix the remaining items I mentioned in
> comment 1 in the PR:
> * apply the same fix to GMTIME
> * fix it in the libgfortran sources as well
> * possibly add some more cross-links to intrinsic.texi
>
> Thanks,
> Janus
>
>
>
>> --- ../_clean/gcc/fortran/intrinsic.texi  2016-11-25 22:03:20.0 +0100
>> +++ gcc/fortran/intrinsic.texi  2016-11-26 16:47:12.0 +0100
>> @@ -9663,11 +9663,11 @@ The elements of @var{VALUES} are assigne
>>  seconds
>>  @item Minutes after the hour, range 0--59
>>  @item Hours past midnight, range 0--23
>> -@item Day of month, range 0--31
>> -@item Number of months since January, range 0--12
>> +@item Day of month, range 1--31
>> +@item Number of months since January, range 0--11
>>  @item Years since 1900
>>  @item Number of days since Sunday, range 0--6
>> -@item Days since January 1
>> +@item Days since January 1, range 0--365
>>  @item Daylight savings indicator: positive if daylight savings is in
>>  effect, zero if not, and negative if the information is not available.
>>  @end enumerate
>>
>> Dominique
>>
>>> Le 26 nov. 2016 à 16:02, Francisco Pena  a écrit :
>>>
>>> Thank you guys,
>>>
>>> I have reported the documentation bug,
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78545
>>

Re: Documentation of LTIME

2016-11-26 Thread Janus Weil

2016-11-26 16:49 GMT+01:00 Dominique d'Humières :
> If there is no objection, I’ll commit the following patch

Looks good to me!

If you want, you could also fix the remaining items I mentioned in
comment 1 in the PR:
* apply the same fix to GMTIME
* fix it in the libgfortran sources as well
* possibly add some more cross-links to intrinsic.texi

Thanks,
Janus



> --- ../_clean/gcc/fortran/intrinsic.texi  2016-11-25 22:03:20.0 +0100
> +++ gcc/fortran/intrinsic.texi  2016-11-26 16:47:12.0 +0100
> @@ -9663,11 +9663,11 @@ The elements of @var{VALUES} are assigne
>  seconds
>  @item Minutes after the hour, range 0--59
>  @item Hours past midnight, range 0--23
> -@item Day of month, range 0--31
> -@item Number of months since January, range 0--12
> +@item Day of month, range 1--31
> +@item Number of months since January, range 0--11
>  @item Years since 1900
>  @item Number of days since Sunday, range 0--6
> -@item Days since January 1
> +@item Days since January 1, range 0--365
>  @item Daylight savings indicator: positive if daylight savings is in
>  effect, zero if not, and negative if the information is not available.
>  @end enumerate
>
> Dominique
>
>> Le 26 nov. 2016 à 16:02, Francisco Pena  a écrit :
>>
>> Thank you guys,
>>
>> I have reported the documentation bug,
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78545
>

Re: Documentation of LTIME

2016-11-26 Thread Dominique d'Humières

If there is no objection, I’ll commit the following patch

--- ../_clean/gcc/fortran/intrinsic.texi  2016-11-25 22:03:20.0 +0100
+++ gcc/fortran/intrinsic.texi  2016-11-26 16:47:12.0 +0100
@@ -9663,11 +9663,11 @@ The elements of @var{VALUES} are assigne
 seconds
 @item Minutes after the hour, range 0--59
 @item Hours past midnight, range 0--23
-@item Day of month, range 0--31
-@item Number of months since January, range 0--12
+@item Day of month, range 1--31
+@item Number of months since January, range 0--11
 @item Years since 1900
 @item Number of days since Sunday, range 0--6
-@item Days since January 1
+@item Days since January 1, range 0--365
 @item Daylight savings indicator: positive if daylight savings is in
 effect, zero if not, and negative if the information is not available.
 @end enumerate

Dominique
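
For reference, the corrected ranges match the fields of the C library's
struct tm, which is what localtime(3) and gmtime(3) fill in.  A minimal
sketch, using gmtime so that the result does not depend on the local time
zone (gmtime_values is a hypothetical helper, not part of libgfortran, and
it assumes time_t counts seconds since 1970-01-01 UTC):

```cpp
#include <array>
#include <cassert>
#include <ctime>

// Return the nine values in the order used by the VALUES array:
// seconds, minutes, hours, day of month, months since January,
// years since 1900, days since Sunday, days since January 1, DST flag.
std::array<int, 9> gmtime_values(std::time_t t)
{
  const std::tm *p = std::gmtime(&t);
  assert(p != nullptr);  // gmtime returns null if t cannot be represented
  return {p->tm_sec, p->tm_min, p->tm_hour, p->tm_mday, p->tm_mon,
          p->tm_year, p->tm_wday, p->tm_yday, p->tm_isdst};
}
```

Under that assumption, time 0 gives day of month 1 and month 0, and
1483142400 (2016-12-31 00:00:00 UTC, in a leap year) gives day of month 31,
month 11 and day-of-year 365, i.e. the upper ends of the ranges in the patch.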

> Le 26 nov. 2016 à 16:02, Francisco Pena  a écrit :
> 
> Thank you guys,
> 
> I have reported the documentation bug,
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78545

Re: [v3 PATCH] LWG 2766, LWG 2749

2016-11-26 Thread Ville Voutilainen

On 22 November 2016 at 17:06, Jonathan Wakely  wrote:
>> so I can certainly change all these swaps to use operator! rather than
>> __not_. Is the
>> patch otherwise ok for trunk? What about the tuple part?
>
>
> Yes, OK changing the top-level __not_s to operator!
> I haven't reviewed the tuple part fully yet.

Updated patches attached, and tested with the full testsuite on Linux-PPC64.
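
For readers who have not seen the idiom, the patch that follows relies on a
constrained, deleted swap overload so that the generic std::swap is not
silently picked for a wrapper whose element type is not swappable.  A minimal
stand-alone sketch of that idea, assuming C++17; demo::box is a hypothetical
wrapper standing in for pair/optional/variant and is not libstdc++ code:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace demo {

// Hypothetical wrapper type.
template<typename T>
struct box {
  T value;
  void swap(box& other) {
    using std::swap;  // allow ADL to pick a custom swap for T
    swap(value, other.value);
  }
};

// Participates only when T itself is swappable.
template<typename T>
std::enable_if_t<std::is_swappable_v<T>>
swap(box<T>& a, box<T>& b) { a.swap(b); }

// Otherwise this deleted overload wins over the generic std::swap,
// because box<T>& is more specialized than a plain T&.
template<typename T>
std::enable_if_t<!std::is_swappable_v<T>>
swap(box<T>&, box<T>&) = delete;

// A type that explicitly opts out of swapping.
struct C { };
void swap(C&, C&) = delete;

} // namespace demo

static_assert(std::is_swappable_v<demo::box<int>>);
static_assert(!std::is_swappable_v<demo::C>);
static_assert(!std::is_swappable_v<demo::box<demo::C>>);
```

Without the deleted overload, box<C> would still be move-constructible and
move-assignable, so the generic std::swap would apply and the last assertion
would fail.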
diff --git a/libstdc++-v3/include/bits/stl_pair.h b/libstdc++-v3/include/bits/stl_pair.h
index ef52538..981dbeb 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -478,6 +478,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 swap(pair<_T1, _T2>& __x, pair<_T1, _T2>& __y)
 noexcept(noexcept(__x.swap(__y)))
 { __x.swap(__y); }
+
+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template<typename _T1, typename _T2>
+inline
+typename enable_if<!__and_<__is_swappable<_T1>,
+  __is_swappable<_T2>>::value>::type
+swap(pair<_T1, _T2>&, pair<_T1, _T2>&) = delete;
+#endif
 #endif // __cplusplus >= 201103L
 
   /**
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index f9ec60f..21b0bac 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -649,6 +649,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 swap(unique_ptr<_Tp, _Dp>& __x,
 unique_ptr<_Tp, _Dp>& __y) noexcept
 { __x.swap(__y); }
+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if::value>::type
+swap(unique_ptr<_Tp, _Dp>&,
+unique_ptr<_Tp, _Dp>&) = delete;
+#endif
 
   template
diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index 3ab0355..86100b5 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -287,6 +287,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 swap(array<_Tp, _Nm>& __one, array<_Tp, _Nm>& __two)
 noexcept(noexcept(__one.swap(__two)))
 { __one.swap(__two); }
+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if<
+  !_GLIBCXX_STD_C::__array_traits<_Tp, _Nm>::_Is_swappable::value>::type
+swap(array<_Tp, _Nm>&, array<_Tp, _Nm>&) = delete;
+#endif
 
   template
 constexpr _Tp&
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index ea673cc..191d64b 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -930,6 +930,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { __lhs.swap(__rhs); }
 
   template
+inline enable_if_t && is_swappable_v<_Tp>)>
+swap(optional<_Tp>&, optional<_Tp>&) = delete;
+
+  template
 constexpr optional>
 make_optional(_Tp&& __t)
 { return optional> { std::forward<_Tp>(__t) }; }
diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 7d93575..c87d83d 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -877,10 +877,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return false; }
 
   template
-inline auto swap(variant<_Types...>& __lhs, variant<_Types...>& __rhs)
-noexcept(noexcept(__lhs.swap(__rhs))) -> decltype(__lhs.swap(__rhs))
+inline enable_if_t<__and_...,
+ is_swappable<_Types>...>::value>
+swap(variant<_Types...>& __lhs, variant<_Types...>& __rhs)
+noexcept(noexcept(__lhs.swap(__rhs)))
 { __lhs.swap(__rhs); }
 
+  template
+inline enable_if_t...,
+  is_swappable<_Types>...>::value>
+swap(variant<_Types...>&, variant<_Types...>&) = delete;
+
   class bad_variant_access : public exception
   {
   public:
diff --git a/libstdc++-v3/testsuite/20_util/optional/swap/2.cc b/libstdc++-v3/testsuite/20_util/optional/swap/2.cc
index 5793488..cb9291a 100644
--- a/libstdc++-v3/testsuite/20_util/optional/swap/2.cc
+++ b/libstdc++-v3/testsuite/20_util/optional/swap/2.cc
@@ -33,11 +33,11 @@ void swap(B&, B&) noexcept(false);
 static_assert( std::is_swappable_v );
 static_assert( !std::is_nothrow_swappable_v );
 
-// Not swappable, but optional is swappable via the generic std::swap.
+// Not swappable, and optional not swappable via the generic std::swap.
 struct C { };
 void swap(C&, C&) = delete;
 
-static_assert( std::is_swappable_v );
+static_assert( !std::is_swappable_v );
 
 // Not swappable, and optional not swappable via the generic std::swap.
 struct D { D(D&&) = delete; };
diff --git a/libstdc++-v3/testsuite/20_util/pair/swap_cxx17.cc b/libstdc++-v3/testsuite/20_util/pair/swap_cxx17.cc
new file mode 100644
index 000..6b09f42
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/pair/swap_cxx17.cc
@@ -0,0 +1,35 @@
+// { dg-options "-std=gnu++17" }
+// { dg-do compile }
+
+// Copyright (C) 2016 Free

Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.

2016-11-26 Thread Eric Botcazou

>  From my investigations on the m68k, the effects on the IL are minimal
> with a slight bias towards better code (by suppressing if-conversions of
> some now more costly blocks).  *But* the size of the resulting code was
> all over the place -- sometimes it was better, others worse.  From
> looking at the assembly we seemingly are copying blocks that aren't
> strictly necessary.

I'm seeing essentially the same thing on SPARC, probably because of the ifcvt 
change; the rtlanal change seems to be neutral for the architecture.

-- 
Eric Botcazou

Re: change initialization of ptrdiff_type_node

2016-11-26 Thread Prathamesh Kulkarni

On 25 November 2016 at 14:48, Richard Biener  wrote:
> On Fri, 25 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 25 November 2016 at 13:43, Richard Biener  wrote:
>> > On Fri, 25 Nov 2016, Jakub Jelinek wrote:
>> >
>> >> On Fri, Nov 25, 2016 at 01:28:06PM +0530, Prathamesh Kulkarni wrote:
>> >> > --- a/gcc/lto/lto-lang.c
>> >> > +++ b/gcc/lto/lto-lang.c
>> >> > @@ -1271,8 +1271,30 @@ lto_init (void)
>> >> >gcc_assert (TYPE_MAIN_VARIANT (const_tm_ptr_type_node)
>> >> >   == const_ptr_type_node);
>> >> >
>> >> > -  ptrdiff_type_node = integer_type_node;
>> >> > +  if (strcmp (PTRDIFF_TYPE, "int") == 0)
>> >> > +ptrdiff_type_node = integer_type_node;
>> >> > +  else if (strcmp (PTRDIFF_TYPE, "long int") == 0)
>> >> > +ptrdiff_type_node = long_integer_type_node;
>> >> > +  else if (strcmp (PTRDIFF_TYPE, "long long int") == 0)
>> >> > +ptrdiff_type_node = long_long_integer_type_node;
>> >> > +  else if (strcmp (PTRDIFF_TYPE, "short int") == 0)
>> >> > +ptrdiff_type_node = short_integer_type_node;
>> >> > +  else
>> >> > +{
>> >> > +  ptrdiff_type_node = NULL_TREE;
>> >> > +  for (int i = 0; i < NUM_INT_N_ENTS; i++)
>> >> > +   if (int_n_enabled_p[i])
>> >> > + {
>> >> > +   char name[50];
>> >> > +   sprintf (name, "__int%d", int_n_data[i].bitsize);
>> >> > +   if (strcmp (name, PTRDIFF_TYPE) == 0)
>> >> > + ptrdiff_type_node = int_n_trees[i].signed_type;
>> >> > + }
>> >> > +  if (ptrdiff_type_node == NULL_TREE)
>> >> > +   gcc_unreachable ();
>> >> > +}
>> >>
>> >> This looks ok to me.
>> >
>> > But I'd like to see this in build_common_tree_nodes alongside
>> > the initialization of size_type_node (and thus removed from
>> > c_common_nodes_and_builtins).  This way you can simply remove
>> > the lto-lang.c code as well.
>> >
>> > Please then also remove the ptrdiff_type_node re-set from
>> > free_lang_data ().
>> Hi Richard,
>> Does this version look OK ?
>> Validation in progress.
>
> Yes, patch is ok if testing succeeds.
Thanks, the patch passes bootstrap+test on x86_64-unknown-linux-gnu
with --enable-languages=all,ada
and was cross-tested on arm*-*-*, aarch64*-*-* with
--enable-languages=c,c++,fortran.

However, an LTO bootstrap configured with --disable-werror
--enable-stage1-checking=release --with-build-config=bootstrap-lto
fails with miscompares (attached).
I verified that the same miscompares happen without the patch too, and
have committed it as r242888.

Thanks,
Prathamesh
>
> Thanks,
> Richard.
gcc/tree-ssa-phiopt.o differs
gcc/sanopt.o differs
gcc/tree-ssa-loop-ivcanon.o differs
gcc/gcc.o differs
gcc/lra.o differs
gcc/tree-ssa-loop-manip.o differs
gcc/tree.o differs
gcc/tree-ssa-dce.o differs
gcc/gcse.o differs
gcc/gimple-ssa-strength-reduction.o differs
gcc/ipa-split.o differs
gcc/ipa.o differs
gcc/cfgexpand.o differs
gcc/recog.o differs
gcc/tree-ssa-loop-niter.o differs
gcc/loop-doloop.o differs
gcc/combine.o differs
gcc/predict.o differs
gcc/dce.o differs
gcc/graphds.o differs
gcc/asan.o differs
gcc/tree-ssa.o differs
gcc/tree-ssa-loop-im.o differs
gcc/ipa-devirt.o differs
gcc/dbxout.o differs
gcc/combine-stack-adj.o differs
gcc/tree-ssa-live.o differs
gcc/sched-rgn.o differs
gcc/trans-mem.o differs
gcc/tree-ssa-loop-unswitch.o differs
gcc/haifa-sched.o differs
gcc/tree-diagnostic.o differs
gcc/tree-vect-stmts.o differs
gcc/collect2.o differs
gcc/tree-vect-data-refs.o differs
gcc/tree-ssa-operands.o differs
gcc/ipa-icf.o differs
gcc/tree-ssa-sccvn.o differs
gcc/tree-ssa-forwprop.o differs
gcc/tsan.o differs
gcc/gimple-ssa-store-merging.o differs
gcc/tree-parloops.o differs
gcc/tree-complex.o differs
gcc/tracer.o differs
gcc/tree-vect-slp.o differs
gcc/diagnostic-show-locus.o differs
gcc/hsa-gen.o differs
gcc/hsa.o differs
gcc/df-scan.o differs
gcc/gcse-common.o differs
gcc/tree-object-size.o differs
gcc/build/genextract.o differs
gcc/build/genpreds.o differs
gcc/build/read-rtl.o differs
gcc/build/gengtype-state.o differs
gcc/build/genattr.o differs
gcc/build/genopinit.o differs
gcc/build/genmatch.o differs
gcc/build/genrecog.o differs
gcc/build/gensupport.o differs
gcc/build/genautomata.o differs
gcc/tree-loop-distribution.o differs
gcc/gimplify.o differs
gcc/symtab.o differs
gcc/lto-wrapper.o differs
gcc/rtlanal.o differs
gcc/dse.o differs
gcc/cfgrtl.o differs
gcc/dwarf2out.o differs
gcc/ifcvt.o differs
gcc/ipa-inline-analysis.o differs
gcc/ira.o differs
gcc/hsa-brig.o differs
gcc/dwarf2cfi.o differs
gcc/gimple.o differs
gcc/sel-sched-dump.o differs
gcc/tree-ssa-uncprop.o differs
gcc/tree-eh.o differs
gcc/reg-stack.o differs
gcc/ggc-none.o differs
gcc/tree-ssa-tail-merge.o differs
gcc/loop-init.o differs
gcc/tree-vrp.o differs
gcc/emit-rtl.o differs
gcc/tree-vect-patterns.o differs
gcc/ipa-pure-const.o differs
gcc/ipa-prop.o differs
gcc/function.o differs
gcc/tree-vect-loop-manip.o differs
gcc/vtable-verify.o differs
gcc/tree-outof-ssa.o differs
gcc/edit-context.o differs
gcc/dominance.o
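
For readers following the lto_init hunk quoted earlier in this message:
the patch selects ptrdiff_type_node by comparing the PTRDIFF_TYPE string
first against the standard integer type names and then against generated
"__intN" names. Below is a minimal standalone sketch of that lookup
pattern. The enum, the tables and lookup_type are hypothetical stand-ins
for illustration, not GCC's tree API, and the list of __intN bit sizes is
an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's integer type nodes.  */
enum type_id
{
  TYPE_NONE, TYPE_INT, TYPE_LONG, TYPE_LONG_LONG, TYPE_SHORT, TYPE_INT_N
};

struct named_type
{
  const char *name;
  enum type_id id;
};

/* The four standard spellings the quoted patch checks first.  */
static const struct named_type standard_types[] = {
  { "int", TYPE_INT },
  { "long int", TYPE_LONG },
  { "long long int", TYPE_LONG_LONG },
  { "short int", TYPE_SHORT },
};

/* Assumed bit sizes of the enabled __intN types (illustrative only).  */
static const int int_n_bitsizes[] = { 20, 128 };

/* Map a type-name string to a type id.  For an __intN match, store the
   bit size in *bitsize_out when it is non-NULL.  Return TYPE_NONE when
   nothing matches; the quoted patch calls gcc_unreachable () there.  */
enum type_id
lookup_type (const char *name, int *bitsize_out)
{
  size_t i;

  for (i = 0; i < sizeof standard_types / sizeof standard_types[0]; i++)
    if (strcmp (name, standard_types[i].name) == 0)
      return standard_types[i].id;

  for (i = 0; i < sizeof int_n_bitsizes / sizeof int_n_bitsizes[0]; i++)
    {
      char buf[50];
      snprintf (buf, sizeof buf, "__int%d", int_n_bitsizes[i]);
      if (strcmp (buf, name) == 0)
        {
          if (bitsize_out)
            *bitsize_out = int_n_bitsizes[i];
          return TYPE_INT_N;
        }
    }

  return TYPE_NONE;
}
```

The sketch uses snprintf where the quoted hunk uses sprintf; that is a
choice made here for the example, not something the patch does.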

Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.

2016-11-26 Thread Jeff Law


On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
>> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>>>> That we compare different kinds of costs (which really has no meaning at
>>>> all, it's a heuristic at best) in various places is a known problem, not
>>>> a regression.
>>>
>>> But the problems with the costing system exhibit themselves as a code
>>> quality regression.  In the end that's what the end-users see -- a
>>> regression in the quality of the code GCC generates.
>>
>> Yes, exactly -- and I fear this all-encompassing change will cause just
>> such a regression for many users.  Tests are running, will know more
>> later today (or tomorrow).
>>
>> The PR is about a very specific problem; the patch is not.  The patch
>> is not a bug fix.  If we allow anything that "makes things better" in
>> stage 3, what makes it different from stage 1 then?
>
> Here are results of testing with trunk right before the three patches,
> compared with the three patches.  This lists the sizes of the vmlinux
> of a Linux kernel build for that arch.

Thanks.  While I question how much emphasis we should put on code sizes
as a way to measure this change, it can still point out interesting
effects, positive and negative.


From my investigations on the m68k, the effects on the IL are minimal 
with a slight bias towards better code (by suppressing if-conversions of 
some now more costly blocks).  *But* the size of the resulting code was 
all over the place -- sometimes it was better, others worse.  From 
looking at the assembly we seemingly are copying blocks that aren't 
strictly necessary.


Enter bb-reorder and the STC algorithm.  It is copying blocks *very* 
aggressively, like absurdly aggressively on the m68k.  Of course it 
doesn't help that the m68k doesn't define a length attribute and as a 
result STC thinks every insn has size 0 and thus block copying is zero cost.


I want to verify the #s, so take this with a slight grain of salt.  The 
net changes to newlib's .o's for Bernd's work -- +30 bytes.  The effect 
of the STC issue above -- +1115586 bytes.  Or to put it another way, 
Bernd's changes, +.0003% change.  STC, +13.8%.



jeff
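
A rough consistency check of the two percentages quoted above can be
sketched in a few lines of C. The total size of newlib's .o files is not
stated in the message; the sketch infers it from the "+1115586 bytes"
and "+13.8%" pair, which is an assumption, and the function names are
made up for the example.

```c
/* Baseline size implied by a byte delta and the percentage growth
   that delta is said to represent.  */
double
implied_baseline (double delta_bytes, double percent)
{
  return delta_bytes / (percent / 100.0);
}

/* Percentage growth that delta_bytes represents relative to
   baseline_bytes.  */
double
percent_of (double delta_bytes, double baseline_bytes)
{
  return 100.0 * delta_bytes / baseline_bytes;
}
```

Under that assumption, implied_baseline (1115586, 13.8) comes to roughly
8.08 million bytes, and percent_of (30, that baseline) is about 0.00037%,
which is in line with the quoted "+.0003%".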

Re: [Patch, Fortran] PR 78392: ICE in gfc_trans_auto_array_allocation, at fortran/trans-array.c:5979

2016-11-26 Thread Janus Weil

ping!


2016-11-19 10:12 GMT+01:00 Janus Weil :
> Hi all,
>
>> I previously assumed that the test case for this PR would be legal,
>> but by now I think that's wrong. The test case should be rejected, and
>> we already have checking mechanisms for this (see
>> resolve_fl_variable), but apparently they are not working.
>>
>> My current suspicion is that 'gfc_is_constant_expr' has a bug, because
>> it claims the call to the function 'get_i' to be a constant
>> expression. This is not true, because get_i() can not be reduced to a
>> compile-time constant.
>
> some more reading in the standard confirms this suspicion: In
> gfc_is_constant_expr there is a piece of code which claims that
> specification functions are constant. That is certainly not true, and
> so what I'm doing in the attached fix is to remove that code and add
> some references to the standard to make things clearer.
>
> The code that I'm removing has last been touched in this commit by
> Jerry six years ago:
>
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=166520
>
> However, this did not introduce the bug in the first place (not sure
> when that happened).
>
> In any case the new patch in the attachment regtests cleanly and
> correctly rejects the original test case as well as one of the cases
> mentioned by Dominique. Ok for trunk?
>
> Cheers,
> Janus
>
>
>
> 2016-11-19  Janus Weil  
>
> PR fortran/78392
> * expr.c (gfc_is_constant_expr): Specification functions are not
> compile-time constants. Update documentation (add reference to F08
> standard), add a FIXME.
> (external_spec_function): Add reference to F08 standard.
> * resolve.c (resolve_fl_variable): Ditto.
>
> 2016-11-19  Janus Weil  
>
> PR fortran/78392
> * gfortran.dg/constant_shape.f90: New test case.

[PATCH v2] libgcc/mkmap-symver: support skip_underscore (PR74748)

2016-11-26 Thread Waldemar Brodkorb

Hi,

Some platforms, such as Blackfin, have a special prefix for assembly
symbols as opposed to C symbols. For this reason, a function named
"foo()" in C will in fact be visible as a symbol called "_foo" in the
ELF binary.

The current linker version script logic in libgcc doesn't take into
account this situation properly. The Blackfin specific
libgcc/config/bfin/libgcc-glibc.ver has an additional "_" in front of
every symbol so that it matches the output of "nm" (which gets parsed to
produce the final linker version script). But because of this additional
"_", ld no longer matches the symbols, since ld does the matching
against the original symbol name, not the one prefixed with "_".

As a result, none of the symbols in libgcc/config/bfin/libgcc-glibc.ver
are actually matched with symbols in libgcc. All libgcc symbols are
therefore left as "LOCAL", which causes lots of "undefined reference"
errors whenever C or C++ code that calls a libgcc function is
linked.

To address this, this commit introduces a "skip_underscore" variable to
the mkmap-symver script. It tells mkmap-symver to ignore the leading
underscore from the "nm" output.

Note that this new argument is different from the existing
"leading_underscore" argument, which *adds* an additional underscore to
the generated linker version script.

Having this functionality paves the way to using the generic linker
version information for Blackfin, instead of using a custom one.

Signed-off-by: Thomas Petazzoni 
Tested-by: Waldemar Brodkorb 

2016-11-26  Thomas Petazzoni 

PR gcc/74748
* libgcc/mkmap-symver.awk: Add support for skip_underscore.


diff --git a/libgcc/mkmap-symver.awk b/libgcc/mkmap-symver.awk
index 266832a..0a57d31 100644
--- a/libgcc/mkmap-symver.awk
+++ b/libgcc/mkmap-symver.awk
@@ -47,7 +47,11 @@ state == "nm" && ($1 == "U" || $2 == "U") {
 
 state == "nm" && NF == 3 {
   split ($3, s, "@")
-  def[s[1]] = 1;
+  if (skip_underscore && substr(s[1], 1, 1) == "_")
+  symname = substr(s[1], 2);
+  else
+  symname = s[1];
+  def[symname] = 1;
   sawsymbol = 1;
   next;
 }

Thanks in advance,
 Waldemar
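
To illustrate what the skip_underscore branch in the diff above does to a
symbol name, here is a minimal sketch of the same rule in C. The function
normalize_symbol is a hypothetical name used only for this example; the
actual change is the awk code in the patch, which also splits the name at
"@" first (not modelled here).

```c
/* Return the name under which a symbol from "nm" output would be
   recorded.  When skip_underscore is nonzero and the name starts with
   '_', exactly one leading underscore is dropped, mirroring
   substr(s[1], 2) in the awk script.  Otherwise the name is returned
   unchanged.  The result points into the caller's string.  */
const char *
normalize_symbol (const char *name, int skip_underscore)
{
  if (skip_underscore && name[0] == '_')
    return name + 1;
  return name;
}
```

With skip_underscore set, "_foo" is recorded as "foo", so it can match an
entry written without the prefix in a generic version script. Only one
underscore is removed, so a name such as "__divsi3" becomes "_divsi3".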

