date:20161130

Re: [C++/78252] libiberty demangler crash with lambda (auto)

2016-11-30 Thread Markus Trippelsdorf

On 2016.11.30 at 14:06 -0500, Nathan Sidwell wrote:
> This patch fixes a problem in libiberty's symbol demangler.  With a
> templated forwarding function such as std::forward, we can end up emitting
> mangled function names that encode lambda information.  Lambdas with auto
> argument types have a synthesized templated operator(), and g++ uses that
> when mangling the lambda.
> 
> Unfortunately g++ doesn't notice the template parameters there mean 'auto'
> and emits regular template parameter references. (This is a bug, see below.)
> 
> But, as the forwarding function itself is a template, and the lambda is part
> of a template parameter substitution, we can end up with the demangler
> recursing unboundedly.  In other cases we can fail to demangle (returning
> null), or demangle to an unexpected type (substituting the current template
> parameter type into the place of the 'auto').
> 
> This patch fixes the demangler by noting when it's printing the argument
> types of a lambda.  In that case whenever we encounter a template parameter
> reference we emit 'auto', and also inhibit some &/&& smushing that needs
> checking.  AFAICT, once inside a lambda argument list we cannot encounter
> template parameter references that actually refer to an enclosing template
> argument list. That means we don't have the problem of disabling this
> additional check within the argument list printing.  I don't think we can
> meet a nested lambda type either, but the ++...-- idiom seemed safer to me.
> 
> We cannot do this substitution when parsing the mangled name, because g++
> applies the usual squangling back references as-if there really was a
> template parameter reference.  Later squangling references to the type
> containing the lambda argument may or may not require the reference to be to
> an enclosing template argument, or be auto, depending on the context of the
> squangle reference.
> 
> I've also included a c++ testcase to check the mangling of the lambdas that
> cause this.  While this is a g++ bug, it's an ABI-affecting one, and we
> shouldn't change the behaviour unintentionally.  I've not investigated why
> the mangler's failing to check is_auto, and will look at that later.  I
> imagine a fix will be -fabi-version dependent. I have filed 78621 to track
> it.

Thanks. This patch also fixes:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70909

-- 
Markus
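
To make the scenario concrete, here is a minimal sketch (not taken from the patch or its testcase) of a generic lambda passed through a forwarding function template. The names call_forwarded, twice_fn and lambda_demo are made up for illustration; twice_fn spells out roughly what the compiler synthesizes for such a lambda, namely a closure class whose operator() is a member template. A C++14 compiler is assumed.

```cpp
#include <cassert>
#include <utility>

// Hypothetical forwarding helper: F and Arg are deduced, so when F is the
// closure type of a lambda, that type becomes a template argument of the
// instantiation and therefore part of its mangled name.
template <typename F, typename Arg>
constexpr auto
call_forwarded (F &&f, Arg &&arg)
  -> decltype (std::forward<F> (f) (std::forward<Arg> (arg)))
{
  return std::forward<F> (f) (std::forward<Arg> (arg));
}

// Roughly what '[] (auto x) { return x + x; }' stands for: a class whose
// call operator is a member template (illustrative name).
struct twice_fn
{
  template <typename T>
  constexpr T operator() (T x) const { return x + x; }
};

// Illustrative driver: the lambda and the hand-written closure agree.
inline int
lambda_demo ()
{
  auto twice = [] (auto x) { return x + x; };
  int result = call_forwarded (twice, 21);
  assert (result == call_forwarded (twice_fn {}, 21));
  return result;
}
```

The sketch only shows where the 'auto' parameter turns into a template parameter of the call operator; it does not reproduce the demangler crash itself.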

Re: [PATCH, libstdc++] Fix detection of posix_memalign for cross-builds

2016-11-30 Thread Christophe Lyon

On 1 December 2016 at 01:18, Bernd Edlinger wrote:
> On 12/01/16 00:10, Jonathan Wakely wrote:
>> On 30/11/16 23:06 +0100, Christophe Lyon wrote:
>>> On 30 November 2016 at 22:51, Jonathan Wakely wrote:
>>>> On 30/11/16 22:32 +0100, Christophe Lyon wrote:
>>>>>
>>>>> On 30 November 2016 at 20:00, Bernd Edlinger wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I noticed that a cross-compiler produces an unusable libstdc++.so
>>>>>> that contains an unresolved reference to aligned_alloc instead of
>>>>>> posix_memalign, or whatever is actually available.
>>>>>>
>>>>>> Therefore it is impossible to link any C++ programs against the
>>>>>> libstdc++.so that comes with the cross-compiler.
>>>>>>
>>>>>> That happens for instance in the following configuration:
>>>>>> --target=arm-linux-gnueabihf.
>>>>>>
>>>>>
>>>>> How could this be unnoticed so far?
>>>>
>>>> I did wonder that.
>>>>
>>>> The newlib config is hardcoded, which probably covers a lot of the
>>>> cross builds in regular use.
>>>>
>>> The config mentioned by Bernd (arm-linux-gnueabihf) does not use newlib.
>>> I checked my libstdc++.log files, there's no -static option in use, I
>>> don't use --disable-shared, so I'm not sure why there's a problem?
>>
>> Then you probably have a newer glibc that defines aligned_alloc, and
>> Bernd is using an older one that doesn't define it. Bernd?
>>
>
> Yes.
>
> It is from 2011, glibc-2.15 by the look of it.
>
> I never had any issues with that, because it is supposed to be upward
> compatible with newer glibc.  I did recently update the glibc on the
> target system to glibc-2.23, though, and had not noticed any issues
> before.
>

OK, it makes sense: I'm using glibc-2.20.

>
> Thanks
> Bernd.

Re: [PATCH] handle integer overflow/wrapping in printf directives (PR 78622)

2016-11-30 Thread Jakub Jelinek

On Wed, Nov 30, 2016 at 08:26:04PM -0700, Martin Sebor wrote:
> @@ -795,6 +795,43 @@ get_width_and_precision (const conversion_spec &spec,
>    *pprec = prec;
>  }
>  
> +/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual
> +   argument, due to the conversion from either *ARGMIN or *ARGMAX to
> +   the type of the directive's formal argument it's possible for both
> +   to result in the same number of bytes or a range of bytes that's
> +   less than the number of bytes that would result from formatting
> +   some other value in the range [*ARGMIN, *ARGMAX].  This can be
> +   determined by checking for the actual argument being in the range
> +   of the type of the directive.  If it isn't it must be assumed to
> +   take on the full range of the directive's type.
> +   Return true when the range has been adjusted, false otherwise.  */
> +
> +static bool
> +adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
> +{
> +  tree dirmin = TYPE_MIN_VALUE (dirtype);
> +  tree dirmax = TYPE_MAX_VALUE (dirtype);
> +
> +  if (tree_int_cst_lt (*argmin, dirmin)
> +      || tree_int_cst_lt (dirmax, *argmin)
> +      || tree_int_cst_lt (*argmax, dirmin)
> +      || tree_int_cst_lt (dirmax, *argmax))
> +    {
> +      if (TYPE_UNSIGNED (dirtype))
> +	{
> +	  *argmin = dirmin;
> +	  *argmax = dirmax;
> +	}
> +      else
> +	{
> +	  *argmin = integer_zero_node;
> +	  *argmax = dirmin;
> +	}
> +      return true;
> +    }

Isn't this too simplistic?  I mean, if you have say dirtype of signed char
and argmin say 4096 + 32 and argmax say 4096 + 64, (signed char) arg
has range 32, 64, while I think your routine will yield -128, 127 (well,
0 as min and -128 as max as that is what you return for signed type).

Can't you subtract argmax - argmin (best just in wide_int, no need to build
trees), and use what you have only for the case where that number doesn't
fit into the narrower precision?  Otherwise, if argmin - (dirtype) argmin
== argmax - (dirtype) argmax, just use (dirtype) argmin and (dirtype) argmax
as the range, and in case it crosses a boundary, figure out whether you can
do anything other than the above.  I guess all cases of signed/unsigned
dirtype and/or argtype need to be considered.
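
The suggestion above can be sketched in plain C++ for the signed char case. This is only an illustration under stated assumptions: it uses long long instead of wide_int, assumes an 8-bit two's-complement signed char, ignores overflow at the extremes of long long, and the names value_range, wrap_to_schar and narrow_to_schar are invented here rather than taken from gimple-ssa-sprintf.c.

```cpp
#include <cassert>

// Illustrative closed range of argument values.
struct value_range
{
  long long min;
  long long max;
};

// Value of V after conversion to an 8-bit two's-complement signed char,
// computed arithmetically so no implementation-defined cast is involved.
inline long long
wrap_to_schar (long long v)
{
  long long m = ((v % 256) + 256) % 256;   // now in [0, 255]
  return m >= 128 ? m - 256 : m;           // fold into [-128, 127]
}

// Sketch of the suggested refinement for a signed char directive.
inline value_range
narrow_to_schar (value_range arg)
{
  assert (arg.min <= arg.max);
  const value_range full = { -128, 127 };

  // The span does not fit in 8 bits: assume the full range of the type.
  if (arg.max - arg.min >= 256)
    return full;

  long long lo = wrap_to_schar (arg.min);
  long long hi = wrap_to_schar (arg.max);

  // Both bounds moved by the same multiple of 256, so the range did not
  // cross a wrap-around boundary and maps onto [lo, hi] exactly.
  if (arg.min - lo == arg.max - hi)
    return value_range { lo, hi };

  // The range straddles a boundary; this sketch stays conservative.
  return full;
}
```

With the values from the example above, narrow_to_schar (value_range { 4096 + 32, 4096 + 64 }) gives the range 32 to 64, whereas a range such as 120 to 130 crosses the boundary at 127 and falls back to the full range.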

Also, are argmin and argmax in this case the actual range (what should go
into res.arg{min,max}), or the values with shortest/longest representation?
Wouldn't it be better to always compute the range of values that can be
printed and only later on (after all VR_RANGE and VR_VARYING handling)
transform that into the number with shortest/longest representation in that
range?  Perhaps even using different variable names for the latter would
make things clearer (argshortest, arglongest or whatever).

Jakub

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Ville Voutilainen

On 1 December 2016 at 09:14, Ville Voutilainen wrote:
>> Yes it does. Thank you.
> Committed as obvious-enough.

Also, if this change causes one more problem I'm reverting all of it,
we _are_ in stage 3 and this sort of churn
is not exactly desirable. I'm currently hiding in a basement so as to
avoid the wrath of release managers anyway. :)

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Ville Voutilainen

On 1 December 2016 at 08:45, Markus Trippelsdorf wrote:
> On 2016.12.01 at 08:11 +0200, Ville Voutilainen wrote:
>> On 1 December 2016 at 07:38, Markus Trippelsdorf wrote:
>> > It breaks building Firefox:
>>
>> Sigh, when writing a trait, write a proper trait. Does this patch fix
>> the problem?
>
> Yes it does. Thank you.


Committed as obvious-enough.

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Markus Trippelsdorf

On 2016.12.01 at 08:11 +0200, Ville Voutilainen wrote:
> On 1 December 2016 at 07:38, Markus Trippelsdorf wrote:
> > It breaks building Firefox:
> 
> Sigh, when writing a trait, write a proper trait. Does this patch fix
> the problem?

Yes it does. Thank you.

-- 
Markus

Re: [PATCH] Reenable RTL sharing verification

2016-11-30 Thread Steven Bosscher

On Wed, Nov 30, 2016 at 1:08 PM, Jakub Jelinek wrote:
> Hi!
>
> The http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01055.html
> change broke all RTL sharing verification; even with --enable-checking=rtl
> we haven't verified anything for the last 3.5 years.

Eh, I guess "oops!" doesn't quite cover that error. Sorry!

Ciao!
Steven

Re: Go patch committed: Merge to gccgo branch

2016-11-30 Thread Ian Lance Taylor

Now I've merged GCC trunk revision 243094 to the gccgo branch.

Ian

[RS6000] fix rtl checking internal compiler error

2016-11-30 Thread Alan Modra

I'm committing this one as obvious once my powerpc64le-linux bootstrap
and regression check completes.  It fixes hundreds of rtl checking
testsuite errors like the following:

gcc.c-torture/compile/pr39943.c:6:1: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'E' (rtx unspec) in insn_is_swappable_p, at config/rs6000/rs6000.c:40678

* gcc/config/rs6000/rs6000.c (insn_is_swappable_p): Properly
look inside UNSPEC_VSX_XXSPLTW vec.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9fe98b7..7f307b1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -40675,7 +40675,7 @@ insn_is_swappable_p (swap_web_entry *insn_entry, rtx insn,
if (GET_CODE (use_body) != SET
|| GET_CODE (SET_SRC (use_body)) != UNSPEC
|| XINT (SET_SRC (use_body), 1) != UNSPEC_VSX_XXSPLTW
-   || XEXP (XEXP (SET_SRC (use_body), 0), 1) != const0_rtx)
+   || XVECEXP (SET_SRC (use_body), 0, 1) != const0_rtx)
  return 0;
  }
}

-- 
Alan Modra
Australia Development Lab, IBM

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Ville Voutilainen

On 1 December 2016 at 07:38, Markus Trippelsdorf wrote:
> It breaks building Firefox:


Sigh, when writing a trait, write a proper trait. Does this patch fix
the problem?

2016-12-01  Ville Voutilainen  

The convertible_to traits need to use a variadic catch-all for the
false-cases.
* include/std/istream (__is_convertible_to_basic_istream):
Change the parameter of the false-case of __check to a variadic.
* include/std/ostream (__is_convertible_to_basic_ostream):
Likewise.
diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index 319e226..1d77d30 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -915,7 +915,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Ch, typename _Up>
   static basic_istream<_Ch, _Up>& __check(basic_istream<_Ch, _Up>*);
 
-  static void __check(void*);
+  static void __check(...);
 public:
   using istream_type =
 	decltype(__check(declval<typename remove_reference<_Tp>::type*>()));
diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index 70fd10b..9dea778 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -619,7 +619,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template<typename _Ch, typename _Up>
 static basic_ostream<_Ch, _Up>& __check(basic_ostream<_Ch, _Up>*);
 
-static void __check(void*);
+static void __check(...);
   public:
 using ostream_type =
   decltype(__check(declval<typename remove_reference<_Tp>::type*>()));
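
Why the variadic false-case matters can be shown with a small stand-alone trait. This is a sketch with made-up names (stream_base, my_stream, unrelated, is_convertible_to_stream_base), not the libstdc++ code: a catch-all taking void* is not viable for a pointer to a const-qualified type, so the trait itself fails to compile, whereas an ellipsis parameter accepts any argument and always loses to a better match.

```cpp
#include <type_traits>
#include <utility>

// Illustrative types; none of these names come from libstdc++.
struct stream_base {};
struct my_stream : stream_base {};
struct unrelated {};

template <typename T>
class is_convertible_to_stream_base
{
  // Chosen when a pointer to the referred-to type converts to stream_base*.
  static std::true_type check (stream_base *);

  // Variadic catch-all: viable for any argument, including pointers to
  // const-qualified types, and the worst match in overload resolution.
  static std::false_type check (...);

public:
  static constexpr bool value
    = decltype (check (std::declval<
        typename std::remove_reference<T>::type *> ()))::value;
};
```

With a false-case declared as check (void *), instantiating the trait for const unrelated& would have no viable overload, because const unrelated* does not convert to void*; that is the shape of the failure reported for Firefox.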

libgo patch committed: Set initarchive in initsig

2016-11-30 Thread Ian Lance Taylor

The library initialization code in go-libmain.c sets the C variable
runtime_isarchive but failed to set the Go variable runtime.isarchive.
We don't currently have a way to let C code access an unexported Go
variable, but fortunately the only time the Go function initsig is
called with an argument of true is exactly where we want to set
isarchive.  So let initsig do it.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 243084)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9be198d960e4bc46e21e4da1e3d4a1619266b8ab
+97b949f249515a61d3c09e9e06f08c8af189e967
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/signal1_unix.go
===
--- libgo/go/runtime/signal1_unix.go(revision 243084)
+++ libgo/go/runtime/signal1_unix.go(working copy)
@@ -65,6 +65,11 @@ var signalsOK bool
 //go:nosplit
 //go:nowritebarrierrec
 func initsig(preinit bool) {
+   if preinit {
+   // preinit is only passed as true if isarchive should be true.
+   isarchive = true
+   }
+
if !preinit {
// It's now OK for signal handlers to run.
signalsOK = true

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Markus Trippelsdorf

On 2016.11.30 at 16:25 +0000, Jonathan Wakely wrote:
> On 30/11/16 17:58 +0200, Ville Voutilainen wrote:
> >Fix testsuite failures caused by the patch implementing LWG 2534.
> >* include/std/istream (__is_convertible_to_basic_istream):
> >Change the return types of __check, introduce stream_type.
> >(operator>>(_Istream&&, _Tp&&)):
> >Use __is_convertible_to_basic_istream::stream_type as the return type.
> >* include/std/ostream (__is_convertible_to_basic_ostream):
> >Change the return types of __check, introduce stream_type.
> >(operator<<(_Ostream&&, const _Tp&)):
> >Use __is_convertible_to_basic_ostream::stream_type as the return type.
> 
> As discussed on IRC, please change "stream_type" to istream_type and
> ostream_type, as appropriate, because those names are already used by
> stream iterators, so users can't define them as macros.
> 
> And you could make the remove_reference happen inside the
> __is_convertible_to_basic_[io]stream trait, since it's only ever used
> with references, but that seems stylistic.
> 
> OK with the stream_type renaming.

It breaks building Firefox:

In file included from ../../dist/system_wrappers/ostream:4:0,
                 from ../../dist/stl_wrappers/ostream:55,
                 from /home/trippels/gcc_test/usr/local/include/c++/7.0.0/iterator:64,
                 from ../../dist/system_wrappers/iterator:4,
                 from ../../dist/stl_wrappers/iterator:55,
                 from /home/trippels/gcc_test/usr/local/include/c++/7.0.0/backward/hashtable.h:63,
                 from /home/trippels/gcc_test/usr/local/include/c++/7.0.0/ext/hash_map:64,
                 from /home/trippels/gecko-dev/ipc/chromium/src/base/hash_tables.h:43,
                 from /home/trippels/gecko-dev/ipc/chromium/src/base/file_path.h:72,
                 from /home/trippels/gecko-dev/ipc/chromium/src/chrome/common/ipc_message_utils.h:12,
                 from ../../dist/include/ipc/IPCMessageUtils.h:11,
                 from ../../dist/include/mozilla/ipc/Transport_posix.h:11,
                 from ../../dist/include/mozilla/ipc/Transport.h:15,
                 from /home/trippels/gecko-dev/ipc/glue/BackgroundChild.h:10,
                 from /home/trippels/gecko-dev/ipc/glue/BackgroundImpl.cpp:5,
                 from /home/trippels/moz-build-dir/ipc/glue/Unified_cpp_ipc_glue0.cpp:2:
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream: In instantiation 
of ‘struct std::__is_convertible_to_basic_ostream’:
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:656:5:   required 
by substitution of ‘template typename 
std::enable_if >, 
std::__is_convertible_to_basic_ostream<_Ostream>, 
std::__is_insertable<_Ostream&, const _Tp&, void> >::value, typename 
std::__is_convertible_to_basic_ostream<_Tp>::ostream_type>::type 
std::operator<<(_Ostream&&, const _Tp&) [with _Ostream = const 
mozilla::unused_t&; _Tp = {anonymous}::ParentImpl*]’
../../dist/include/nsCOMPtr.h:185:13:   required from ‘void operator<<(const 
mozilla::unused_t&, const already_AddRefed<{anonymous}::ParentImpl>&)’
/home/trippels/gecko-dev/ipc/glue/BackgroundImpl.cpp:1981:32:   required from 
here
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:625:23: error: no 
matching function for call to ‘std::__is_convertible_to_basic_ostream::__check(const mozilla::unused_t*)’
   decltype(__check(declval::type*>()));
~~~^~
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:620:37: note: 
candidate: template static std::basic_ostream<_Ch, _Up>& 
std::__is_convertible_to_basic_ostream<_Tp>::__check(std::basic_ostream<_Ch, 
_Up>*) [with _Ch = _Ch; _Up = _Up; _Tp = const mozilla::unused_t&]
 static basic_ostream<_Ch, _Up>& __check(basic_ostream<_Ch, _Up>*);
 ^~~
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:620:37: note:   
template argument deduction/substitution failed:
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:625:23: note:   
types ‘std::basic_ostream<_CharT, _Traits>’ and ‘const mozilla::unused_t’ have 
incompatible cv-qualifiers
   decltype(__check(declval::type*>()));
~~~^~
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:622:17: note: 
candidate: static void 
std::__is_convertible_to_basic_ostream<_Tp>::__check(void*) [with _Tp = const 
mozilla::unused_t&] 
 static void __check(void*);
 ^~~
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:622:17: note:   
conversion of argument 1 would be ill-formed:
/home/trippels/gcc_test/usr/local/include/c++/7.0.0/ostream:625:23: error: 
invalid conversion from ‘const void*’ to ‘void*’ [-fpermissive]
   decltype(__check(declval::type*>()));

Re: [PATCH] avoid calling alloca(0)

2016-11-30 Thread Martin Sebor


> What I think this tells us is that we're not at a place where we're
> clean.  But we can incrementally get there.  The warning is only
> catching a fairly small subset of the cases AFAICT.  That's not unusual
> and analyzing why it didn't trigger on those cases might be useful as well.

The warning has no smarts.  It relies on constant propagation and
won't find a call unless it sees it's being made with a constant
zero.  Looking at the top two on the list the calls are in extern
functions not called from the same source file, so it probably just
doesn't see that the functions are being called from another file
with a zero.  Building GCC with LTO might perhaps help.

> So where does this leave us for gcc-7?  I'm wondering if we drop the
> warning in, but not enable it by default anywhere.  We fix the cases we
> can (such as reg-stack.c, tree-ssa-threadedge.c, maybe others) before
> stage3 closes, and shoot for the rest in gcc-8, including improving the
> warning (if there's something we can clearly improve), and enabling the
> warning in -Wall or -Wextra.


I'm fine with deferring the GCC fixes and working on the cleanup
over time but I don't think that needs to gate enabling the option
with -Wextra.  The warnings can be suppressed or prevented from
causing errors during a GCC build either via a command line option
or by pragma in the code.  AFAICT, from the other warnings I see
go by, this is what has been done for -Wno-implicit-fallthrough
while those warnings are being cleaned up.  Why not take the same
approach here?

As much as I would like to improve the warning itself I'm also not
sure I see much of an opportunity for it.  It's not prone to high
rates of false positives (hardly any in fact) and the cases it
misses are those where it simply doesn't see the argument value
because it's not been made available by constant propagation.

That said, I consider the -Walloc-size-larger-than warning to be
the more important part of the patch by far.  I'd hate a lack of
consensus on how to deal with GCC's handful of instances of
alloca(0) to stall the rest of the patch.

Thanks
Martin

Re: [Patches] Add variant constexpr support for visit, comparisons and get

2016-11-30 Thread Tim Shen

On Wed, Nov 30, 2016 at 8:27 AM, Jonathan Wakely wrote:
> On 26/11/16 21:38 -0800, Tim Shen wrote:
>> +  template<typename _Type, bool = std::is_literal_type_v<_Type>>
>> struct _Uninitialized;
>
>
> I'm still unsure that is_literal_type is the right trait here. If it's
> definitely right then we should probably *not* deprecate it in C++17!

No, it's not right. We need this only because [basic.types]p10.5.3 (in n4606):

  if it (a literal type) is a union, at least one of its non-static
data members is of non-volatile literal type, ...

is not implemented. In the current GCC implementation, however, all
non-static data members need to be literal types, in order to create a
literal union.

With the current GCC implementation, to keep our goal, which is to
make _Variadic_union a literal type, we need to ensure that
_Uninitialized is a literal type, by specializing on T:
1) If is_literal_type_v<T>, store a T;
2) otherwise, store a raw buffer of T.

In the future, when [basic.types]p10.5.3 is implemented, we don't need
is_literal_type_v.
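
The two-way specialization described above might look roughly like the following. This is a hedged sketch, not the libstdc++ _Uninitialized: the name uninitialized and its members are invented, and std::is_trivially_destructible is swapped in as a stand-in selector because std::is_literal_type is deprecated in C++17; the two traits are not equivalent. A C++14 compiler is assumed.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Primary template; the bool selects the storage strategy.  The selector is
// a stand-in for the literal-type check discussed in the thread.
template <typename T, bool = std::is_trivially_destructible<T>::value>
struct uninitialized;

// Case 1: store a T directly, which keeps the wrapper usable in constant
// expressions when T itself is.
template <typename T>
struct uninitialized<T, true>
{
  template <typename... Args>
  constexpr explicit uninitialized (Args &&... args)
    : storage (std::forward<Args> (args)...) {}

  constexpr const T &get () const { return storage; }

  T storage;
};

// Case 2: store a raw buffer and construct the T in place; the owner is
// responsible for calling destroy ().
template <typename T>
struct uninitialized<T, false>
{
  template <typename... Args>
  explicit uninitialized (Args &&... args)
    : ptr (::new (static_cast<void *> (buffer))
             T (std::forward<Args> (args)...)) {}

  T &get () { assert (ptr != nullptr); return *ptr; }
  void destroy () { ptr->~T (); ptr = nullptr; }

  alignas (T) unsigned char buffer[sizeof (T)];
  T *ptr;
};
```

Only the first specialization can appear in a constexpr variable; the second trades that away so that the wrapper has a trivial destructor regardless of T.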

I'll add a comment here.

I didn't check for other compilers.

-- 
Regards,
Tim Shen

Re: [PATCH] Fix rtl sharing bug in rs6000_frame_related (PR target/78614)

2016-11-30 Thread Alan Modra

On Thu, Dec 01, 2016 at 12:36:49PM +1030, Alan Modra wrote:
> On Wed, Nov 30, 2016 at 11:27:40PM +0100, Jakub Jelinek wrote:
> > Markus said he has bootstrapped this patch with rtl checking on powerpc64.
> 
> I repeated the exercise and found a stage1 bootstrap failure due to
> invalid rtl sharing on powerpc64le-linux, using

Apologies for the noise.  I wasn't testing your patch.  Checking
again with it applied properly, I'm past stage1 where the failures
happened.

-- 
Alan Modra
Australia Development Lab, IBM

[PATCH] handle integer overflow/wrapping in printf directives (PR 78622)

2016-11-30 Thread Martin Sebor


In the gimple-ssa-sprintf pass I made the incorrect assumption that
a wider integer argument in some range [X, Y] to a narrower directive
(such as int to %hhi) results in the number of bytes corresponding to
the range bounded by the number of bytes resulting from formatting X
and Y converted to the directive's type (how's that for a clear
description?)

Basically, given an int A in [X, Y], and sprintf("%hhi", A) the logic
was to transform the range to [X', Y'] where X' = (signed char)X and
Y' = (signed char)Y, compute a range of bytes [B, C] that X' and Y'
would format to, and use that as the range for A.  That's wrong when
X or Y are outside the range of the directive's type because of
overflow or wrapping.  It's possible to find B in [X, Y] such that
(signed char)B is outside the range of [X', Y'].
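
The problem described above can be observed directly with snprintf. The sketch below is illustrative only: hhi_length and endpoints_bound_lengths are invented names, and it assumes an 8-bit two's-complement signed char and a printf implementation that converts the argument of %hhi to signed char as the C standard specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>

// Number of bytes "%hhi" produces for A, excluding the terminating NUL.
inline int
hhi_length (int a)
{
  return std::snprintf (nullptr, 0, "%hhi", a);
}

// True if the formatted length of every value in [x, y] lies between the
// formatted lengths of the two endpoints, which is what the pass assumed.
inline bool
endpoints_bound_lengths (int x, int y)
{
  assert (x <= y);
  int lo = std::min (hhi_length (x), hhi_length (y));
  int hi = std::max (hhi_length (x), hhi_length (y));
  for (int a = x; a <= y; ++a)
    {
      int n = hhi_length (a);
      if (n < lo || n > hi)
        return false;
    }
  return true;
}
```

For the range [0, 256] both endpoints convert to 0 and format as one byte, yet 200 converts to -56 and needs three bytes, so endpoints_bound_lengths (0, 256) is false.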

Bug 78622 - [7 Regression] -Wformat-length/-fprintf-return-value
incorrect with overflow/wrapping, derived from a comment on bug
78586 has a test case.

The attached patch fixes this problem.  A happy consequence of
the fix is that it also resolves bug 77721 - -Wformat-length not
uses arg range for converted vars (or at least makes the test
case in the bug pass; there are outstanding limitations due
to poor range info in the pass).

While there, I also fixed a couple of minor cosmetic issues with
the phrasing and formatting of diagnostics (unrelated to the main
problem).

Tested on x86-64.

Thanks
Martin
PR middle-end/78622 - [7 Regression] -Wformat-length/-fprintf-return-value incorrect with overflow/wrapping

gcc/ChangeLog:

	PR middle-end/78622
	* gimple-ssa-sprintf.c (min_bytes_remaining): Use res.knownrange
	rather than res.bounded.
	(adjust_range_for_overflow): New function.
	(format_integer): Always set res.bounded to true unless either
	precision or width is specified and unknown.
	Call adjust_range_for_overflow.
	(format_directive): Remove vestigial quoting.
	(add_bytes): Correct the computation of boundrange used to
	decide whether a warning is of a "maybe" or "definitely" kind.

gcc/testsuite/ChangeLog:

	PR middle-end/78622
	* gcc.c-torture/execute/pr78622.c: New test.
	* gcc.dg/tree-ssa/builtin-sprintf-2.c: Remove "benign" undefined
	behavior inadvertently introduced in a previous commit.  Tighten
	up final checking.
	* gcc.dg/tree-ssa/builtin-sprintf-6.c: Add test cases.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-6.c: Remove xfails and
	add a final optimization check.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 99a635a..eceed3e 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -637,7 +637,7 @@ min_bytes_remaining (unsigned HOST_WIDE_INT navail, const format_result &res)
   if (HOST_WIDE_INT_MAX <= navail)
     return navail;
 
-  if (1 < warn_format_length || res.bounded)
+  if (1 < warn_format_length || res.knownrange)
     {
       /* At level 2, or when all directives output an exact number
 	 of bytes or when their arguments were bounded by known
@@ -795,6 +795,43 @@ get_width_and_precision (const conversion_spec &spec,
   *pprec = prec;
 }
 
+/* With the range [*ARGMIN, *ARGMAX] of an integer directive's actual
+   argument, due to the conversion from either *ARGMIN or *ARGMAX to
+   the type of the directive's formal argument it's possible for both
+   to result in the same number of bytes or a range of bytes that's
+   less than the number of bytes that would result from formatting
+   some other value in the range [*ARGMIN, *ARGMAX].  This can be
+   determined by checking for the actual argument being in the range
+   of the type of the directive.  If it isn't it must be assumed to
+   take on the full range of the directive's type.
+   Return true when the range has been adjusted, false otherwise.  */
+
+static bool
+adjust_range_for_overflow (tree dirtype, tree *argmin, tree *argmax)
+{
+  tree dirmin = TYPE_MIN_VALUE (dirtype);
+  tree dirmax = TYPE_MAX_VALUE (dirtype);
+
+  if (tree_int_cst_lt (*argmin, dirmin)
+      || tree_int_cst_lt (dirmax, *argmin)
+      || tree_int_cst_lt (*argmax, dirmin)
+      || tree_int_cst_lt (dirmax, *argmax))
+    {
+      if (TYPE_UNSIGNED (dirtype))
+	{
+	  *argmin = dirmin;
+	  *argmax = dirmax;
+	}
+      else
+	{
+	  *argmin = integer_zero_node;
+	  *argmax = dirmin;
+	}
+      return true;
+    }
+  return false;
+}
+
 /* Return a range representing the minimum and maximum number of bytes
    that the conversion specification SPEC will write on output for the
    integer argument ARG when non-null.  ARG may be null (for vararg
@@ -989,6 +1026,10 @@ format_integer (const conversion_spec &spec, tree arg)
 
   fmtresult res;
 
+  /* The result is bounded unless width or precision has been specified
+     whose value is unknown.  */
+  res.bounded = width != HOST_WIDE_INT_MIN && prec != HOST_WIDE_INT_MIN;
+
   /* Using either the range the non-constant argument is in, or its
      type (either "formal" or actual), create a range of values that
      constrain the

Re: [PATCH] Fix rtl sharing bug in rs6000_frame_related (PR target/78614)

2016-11-30 Thread Alan Modra

On Wed, Nov 30, 2016 at 11:27:40PM +0100, Jakub Jelinek wrote:
> Markus said he has bootstrapped this patch with rtl checking on powerpc64.

I repeated the exercise and found a stage1 bootstrap failure due to
invalid rtl sharing on powerpc64le-linux, using

AS="/home/amodra/gnu/bin/as" LD="/home/amodra/gnu/bin/ld" \
~/src/gcc-pr78614/configure \
--build=powerpc64le-linux --prefix=/home/amodra/gnu \
--enable-targets=powerpc64-linux,powerpc-linux,powerpcle-linux --disable-multilib \
--enable-valgrind-annotations \
--disable-nls --with-cpu=power8 --enable-languages=all,go --enable-lto \
--enable-checking=yes,rtl

> 2016-11-30  Jakub Jelinek  
> 
>   PR target/78614
>   * config/rs6000/rs6000.c (rs6000_frame_related): Call
>   set_used_flags (pat) before any simplifications.  Clear used flag on
>   PARALLEL copy.  Don't guard add_reg_note call.  Call
>   copy_rtx_if_shared on pat before storing it into
>   REG_FRAME_RELATED_EXPR.

-- 
Alan Modra
Australia Development Lab, IBM

[PATCH] combine: Emit a barrier after unconditional trap (PR78607)

2016-11-30 Thread Segher Boessenkool

After an unconditional trap there should be a barrier.  In most cases
one is automatically inserted, but not if the trap is the final insn in
the instruction stream.  We need to emit one explicitly.

Tested on powerpc64-linux {-m64,-m32}, committing to trunk.


Segher


2016-12-01  Segher Boessenkool  

PR rtl-optimization/78607
* combine.c (try_combine): Emit a barrier after an unconditional trap.

gcc/testsuite/
PR rtl-optimization/78607
* gcc.c-torture/compile/pr78607.c: New testcase.

---
 gcc/combine.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr78607.c | 12 
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr78607.c

diff --git a/gcc/combine.c b/gcc/combine.c
index fd33a4d..e48b6c9 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -4655,6 +4655,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   basic_block bb = BLOCK_FOR_INSN (i3);
   gcc_assert (bb);
   remove_edge (split_block (bb, i3));
+  emit_barrier_after_bb (bb);
   *new_direct_jump_p = 1;
 }
 
@@ -4665,6 +4666,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   basic_block bb = BLOCK_FOR_INSN (undobuf.other_insn);
   gcc_assert (bb);
   remove_edge (split_block (bb, undobuf.other_insn));
+  emit_barrier_after_bb (bb);
   *new_direct_jump_p = 1;
 }
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78607.c b/gcc/testsuite/gcc.c-torture/compile/pr78607.c
new file mode 100644
index 0000000..2c5420d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr78607.c
@@ -0,0 +1,12 @@
+/* PR rtl-optimization/78607 */
+
+void
+rc (int cx)
+{
+  int mq;
+
+  if (mq == 0 && (cx / 0) != 0)
+for (;;)
+  {
+  }
+}
-- 
1.9.3

Re: [PATCH] Fix minor nits in gimple-ssa-sprintf.c (PR tree-optimization/78586)

2016-11-30 Thread Martin Sebor


On 11/30/2016 12:01 PM, Jakub Jelinek wrote:
> Hi!
>
> This patch fixes some minor nits I've raised in the PR, more severe issues
> left unresolved there.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Thank you.  One comment below.

> @@ -1059,7 +1048,12 @@ format_integer (const conversion_spec 
> }
> 
>   if (code == NOP_EXPR)
> -   argtype = TREE_TYPE (gimple_assign_rhs1 (def));
> +   {
> + tree type = TREE_TYPE (gimple_assign_rhs1 (def));
> + if (TREE_CODE (type) == INTEGER_TYPE
> + || TREE_CODE (type) == POINTER_TYPE)
> +   argtype = type;


As I replied in my comment #6 on the bug, I'm not sure I see what
is wrong with the original code, and I haven't been able to come
up with a test case that demonstrates a problem because of it with
any of the types you mentioned (bool, enum, or floating).

I trust you when you say the change is necessary but I would like
to see a test case.  I can add it myself if you can sketch it out.

Martin

Re: [PATCH] Fix rtl sharing bug in rs6000_frame_related (PR target/78614)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 11:27:40PM +0100, Jakub Jelinek wrote:
> As mentioned in the PR, the rs6000_frame_related rewrite broke rtl sharing
> whose --enable-checking=rtl verification has been broken for the last 3.5
> years until today.

Great eh!  I am very scared.

> Markus said he has bootstrapped this patch with rtl checking on powerpc64.
> 
> Ok for trunk?

Yes please.  Thanks,


Segher


> 2016-11-30  Jakub Jelinek  
> 
>   PR target/78614
>   * config/rs6000/rs6000.c (rs6000_frame_related): Call
>   set_used_flags (pat) before any simplifications.  Clear used flag on
>   PARALLEL copy.  Don't guard add_reg_note call.  Call
>   copy_rtx_if_shared on pat before storing it into
>   REG_FRAME_RELATED_EXPR.

Re: [PATCH, libstdc++] Fix detection of posix_memalign for cross-builds

2016-11-30 Thread Bernd Edlinger

On 12/01/16 00:10, Jonathan Wakely wrote:
> On 30/11/16 23:06 +0100, Christophe Lyon wrote:
>> On 30 November 2016 at 22:51, Jonathan Wakely  wrote:
>>> On 30/11/16 22:32 +0100, Christophe Lyon wrote:

 On 30 November 2016 at 20:00, Bernd Edlinger
 
 wrote:
>
> Hi,
>
> I noticed that a cross-compiler produces an unusable libstdc++.so
> that contains an unresolved reference to aligned_alloc instead of
> posix_memalign, or whatever is actually available.
>
> Therefore it is impossible to link any C++ programs against the
> libstdc++.so that comes with the cross-compiler.
>
> That happens for instance in the following configuration:
> --target=arm-linux-gnueabihf.
>

 How could this be unnoticed so far?
>>>
>>>
>>> I did wonder that.
>>>
>>> The newlib config is hardcoded, which probably covers a lot of the
>>> cross builds in regular use.
>>>
>> The config mentioned by Bernd (arm-linux-gnueabihf) does not use newlib.
>> I checked my libstdc++.log files, there's no -static option in use, I don't
>> use --disable-shared, so I'm not sure why there's a problem?
>
> Then you probably have a newer glibc that defines aligned_alloc, and
> Bernd is using an older one that doesn't define it. Bernd?
>

Yes.

It is from 2011; glibc-2.15, by the look of it.

I never had any issues with that, because it is supposed to be upward
compatible with newer glibc.  I did recently update the glibc on the
target system to glibc-2.23, though, and had not noticed any issues
before.


Thanks
Bernd.
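
For readers following the thread, here is a minimal sketch of the kind of
fallback that depends on configure detecting the right function.  The
HAVE_ALIGNED_ALLOC and HAVE_POSIX_MEMALIGN macros and the alloc_aligned helper
are hypothetical stand-ins for illustration, not the actual libstdc++ names
or code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask for the posix_memalign declaration */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical configure result: pretend the link test found
   posix_memalign but not aligned_alloc (as with an older C library).  */
#define HAVE_POSIX_MEMALIGN 1

/* Return SIZE bytes aligned to ALIGN (a power of two that is a
   multiple of sizeof (void *)), or NULL on failure.  */
static void *
alloc_aligned (size_t align, size_t size)
{
#if defined HAVE_ALIGNED_ALLOC
  /* C11 aligned_alloc; SIZE should be a multiple of ALIGN.  */
  return aligned_alloc (align, size);
#elif defined HAVE_POSIX_MEMALIGN
  void *p = NULL;
  if (posix_memalign (&p, align, size) != 0)
    return NULL;
  return p;
#else
  /* No aligned allocator was detected.  */
  (void) align;
  (void) size;
  return NULL;
#endif
}
```

If the detection picks aligned_alloc on a system whose C library lacks it, the
first branch is compiled in and the unresolved reference described above is
the result.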

Re: [PATCH] Fix rtl sharing bug in rs6000_frame_related (PR target/78614)

2016-11-30 Thread Alan Modra

On Wed, Nov 30, 2016 at 11:27:40PM +0100, Jakub Jelinek wrote:
> The last hunk just removes unnecessary condition, if the condition is not
> true, we return from the function already earlier and don't do any
> replacements.

Yeah, that was a leftover from an earlier revision of the patch that
removed the "if (!repl && !reg2)" block.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [RFA] Handle target with no length attributes sanely in bb-reorder.c

2016-11-30 Thread Jeff Law


On 11/30/2016 01:38 AM, Richard Biener wrote:
> On Tue, Nov 29, 2016 at 5:07 PM, Jeff Law wrote:
>> On 11/29/2016 03:23 AM, Richard Biener wrote:
>>> On Mon, Nov 28, 2016 at 10:23 PM, Jeff Law wrote:
>>>>
>>>> I was digging into issues around the patches for 78120 when I stumbled
>>>> upon undesirable bb copying in bb-reorder.c on the m68k.
>>>>
>>>> The core issue is that the m68k does not define a length attribute and
>>>> therefore generic code assumes that the length of all insns is 0 bytes.
>>>
>>> What other targets behave like this?
>>
>> ft32, nvptx, mmix, mn10300, m68k, c6x, rl78, vax, ia64, m32c
>
> Ok.
>
>> cris has a hack to define a length, even though no attempt is made to make
>> it accurate.  The hack specifically calls out that it's to make bb-reorder
>> happy.
>>
>>>> That in turn makes bb-reorder think it is infinitely cheap to copy basic
>>>> blocks.  In the two codebases I looked at (GCC's runtime libraries and
>>>> newlib) this leads to a 10% and 15% undesirable increase in code size.
>>>>
>>>> I've taken a slight variant of this patch and bootstrapped/regression
>>>> tested it on x86_64-linux-gnu to verify sanity as well as built the m68k
>>>> target libraries noted above.
>>>>
>>>> OK for the trunk?
>>>
>>> I wonder if it isn't better to default to a length of 1 instead of zero
>>> when there is no length attribute.  There are more users of the length
>>> attribute in bb-reorder.c (and elsewhere as well I suppose).
>>
>> I pondered that as well, but felt it was riskier given we've had a default
>> length of 0 for ports that don't define lengths since the early 90s.  It's
>> certainly easy enough to change that default if you'd prefer.  I don't have
>> a strong preference either way.
>
> Thinking about this again maybe targets w/o insn-length should simply
> always use the 'simple' algorithm instead of the STC one?  At least that
> might be what your change effectively does in some way?

From reading the comments I don't think STC will collapse down into the
simple algorithm if block copying is disabled.  But Segher would know
for sure.


WRT the choice of simple vs STC, I doubt it matters much for the 
processors in question.
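
To make the zero-length problem concrete, here is a toy sketch.  The struct
and the toy_* functions are invented for illustration and are not the
bb-reorder.c code; whether the actual patch takes this exact form is not shown
in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a basic block: just the per-insn length estimates.
   A target without a length attribute reports 0 for every insn.  */
struct toy_block
{
  const int *insn_lengths;
  size_t n_insns;
};

/* Sum of the insn lengths, the size estimate a copy heuristic would use.  */
static int
toy_block_size (const struct toy_block *bb)
{
  int size = 0;
  for (size_t i = 0; i < bb->n_insns; i++)
    size += bb->insn_lengths[i];
  return size;
}

/* Decide whether copying BB is considered cheap enough.  With
   HAVE_LENGTHS false the estimate is meaningless, so refuse to copy
   instead of treating the block as free.  */
static bool
toy_copy_bb_p (const struct toy_block *bb, int max_size, bool have_lengths)
{
  if (!have_lengths)
    return false;
  return toy_block_size (bb) <= max_size;
}
```

With all lengths reported as 0, any block passes the size test no matter how
many insns it holds, which is the "infinitely cheap" behaviour described
above.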


Jeff

Re: [PATCH, libstdc++] Fix detection of posix_memalig for cross-builds

2016-11-30 Thread Jonathan Wakely


On 30/11/16 23:06 +0100, Christophe Lyon wrote:
> On 30 November 2016 at 22:51, Jonathan Wakely wrote:
>> On 30/11/16 22:32 +0100, Christophe Lyon wrote:
>>> On 30 November 2016 at 20:00, Bernd Edlinger wrote:
>>>> Hi,
>>>>
>>>> I noticed that a cross-compiler produces an unusable libstdc++.so
>>>> that contains an unresolved reference to aligned_alloc instead of
>>>> posix_memalign, or whatever is actually available.
>>>>
>>>> Therefore it is impossible to link any C++ programs against the
>>>> libstdc++.so that comes with the cross-compiler.
>>>>
>>>> That happens for instance in the following configuration:
>>>> --target=arm-linux-gnueabihf.
>>>
>>> How could this be unnoticed so far?
>>
>> I did wonder that.
>>
>> The newlib config is hardcoded, which probably covers a lot of the
>> cross builds in regular use.
>
> The config mentioned by Bernd (arm-linux-gnueabihf) does not use newlib.
> I checked my libstdc++.log files, there's no -static option in use, I don't
> use --disable-shared, so I'm not sure why there's a problem?

Then you probably have a newer glibc that defines aligned_alloc, and
Bernd is using an older one that doesn't define it. Bernd?

[PATCH][Aarch64] Add support for overflow add and sub operations

2016-11-30 Thread Michael Collison

Hi,

This patch improves code generation for builtin arithmetic overflow operations
for the aarch64 backend. As an example, for a simple test case such as:

int
f (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}

Current trunk at -O2 generates

f:
        mov     w3, w0
        mov     w4, 0
        add     w0, w0, w1
        tbnz    w1, #31, .L4
        cmp     w0, w3
        blt     .L3
.L2:
        str     w4, [x2]
        ret
        .p2align 3
.L4:
        cmp     w0, w3
        ble     .L2
.L3:
        mov     w4, 1
        b       .L2


With the patch this now generates:

f:
        adds    w0, w0, w1
        cset    w1, vs
        str     w1, [x2]
        ret

Tested on aarch64-linux-gnu with no regressions. Okay for trunk?
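
As a reminder of what the builtin computes, here is a small sketch.  The
add_checked wrapper name is made up for illustration;
__builtin_sadd_overflow itself is the GCC builtin used in the test case above
(also available in Clang).

```c
#include <limits.h>

/* Add X and Y.  Store the (wrapped) sum in *RES and return nonzero
   if the mathematically exact sum does not fit in an int.  */
static int
add_checked (int x, int y, int *res)
{
  return __builtin_sadd_overflow (x, y, res);
}
```

For example, add_checked (INT_MAX, 1, &r) reports overflow and leaves the
wrapped value INT_MIN in r, while add_checked (1, 2, &r) returns 0 with r
equal to 3.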


2016-11-30  Michael Collison  
Richard Henderson 

* config/aarch64/aarch64-modes.def (CC_V): New.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
for signed overflow using CC_Vmode.
(aarch64_get_condition_code_1): Handle CC_Vmode.
* config/aarch64/aarch64.md (addv4, uaddv4): New.
(addti3): Create simpler code if low part is already known to be 0.
(addvti4, uaddvti4): New.
(*add3_compareC_cconly_imm): New.
(*add3_compareC_cconly): New.
(*add3_compareC_imm): New.
(*add3_compareC): Rename from add3_compare1; do not
handle constants within this pattern.
(*add3_compareV_cconly_imm): New.
(*add3_compareV_cconly): New.
(*add3_compareV_imm): New.
(add3_compareV): New.
(add3_carryinC, add3_carryinV): New.
(*add3_carryinC_zero, *add3_carryinV_zero): New.
(*add3_carryinC, *add3_carryinV): New.
(subv4, usubv4): New.
(subti): Handle op1 zero.
(subvti4, usub4ti4): New.
(*sub3_compare1_imm): New.
(sub3_carryinCV): New.
(*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
(*sub3_carryinCV_z2, *sub3_carryinCV): New


rth_overflow_ipreview1.patch
Description: rth_overflow_ipreview1.patch

Re: [PATCH] Dump probability for edges a frequency for BBs

2016-11-30 Thread Martin Sebor


On 11/24/2016 05:59 AM, Martin Liška wrote:
> On 11/24/2016 09:29 AM, Richard Biener wrote:
>> Please guard with ! TDF_GIMPLE, otherwise the output will not be parseable
>> with the GIMPLE FE.
>>
>> Richard.
>
> Done and verified that and it provides equal dumps for -fdump*-gimple.
> Installed as r242837.


Hi Martin,

I'm trying to understand how to interpret the probabilities (to
make sure one of my tests, builtin-sprintf-2.c, is testing what
it's supposed to be testing).

With this example:

  char d2[2];

  void f (void)
  {
    if (2 != __builtin_sprintf (d2, "%i", 12))
      __builtin_abort ();
  }

the probability of the branch to abort is 0%:

  f1 ()
  {
    int _1;

    <bb 2> [100.0%]:
    _1 = __builtin_sprintf (&d2, "%i", 12);
    if (_1 != 2)
      goto <bb 3>; [0.0%]
    else
      goto <bb 4>; [100.0%]

    <bb 3> [0.0%]:
    __builtin_abort ();

    <bb 4> [100.0%]:
    return;
  }

Yet the call to abort is in the assembly so I would expect its
probability to be more than zero.  So my question is: is it safe
to be testing for calls to abort in the optimized dump as a way
of verifying that the call has not been eliminated from the program
regardless of their probabilities?

For reference, the directive the test uses since this change was
committed looks like this:

{ dg-final { scan-tree-dump-times "> \\\[\[0-9.\]+%\\\]:\n *__builtin_abort" 114 "optimized" } }


If I'm reading the heavily escaped regex right it matches any
percentage, even 0.0% (and the test passes).
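
One way to sanity-check that reading outside DejaGnu is sketched below.  It
uses a POSIX extended regex as an approximation of the Tcl pattern (the two
syntaxes agree for this simple case, as far as I can tell), and the helper
name and sample strings are made up for illustration.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* POSIX ERE approximation of the Tcl pattern in the directive:
   a '>' and a space, a bracketed percentage, a colon, a newline,
   optional spaces, then __builtin_abort.  */
static const char abort_block_re[] =
  "> \\[[0-9.]+%]:\n *__builtin_abort";

/* Return true if DUMP contains a block header followed directly by
   a call to __builtin_abort.  */
static bool
dump_has_abort_block (const char *dump)
{
  regex_t re;
  if (regcomp (&re, abort_block_re, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool found = regexec (&re, dump, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

A header ending in "> [0.0%]:" followed by the abort call matches just as one
with "[12.5%]" does, since [0-9.]+ places no constraint on the value.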

Thanks
Martin

[PATCH] Fix rtl sharing bug in rs6000_frame_related (PR target/78614)

2016-11-30 Thread Jakub Jelinek

Hi!

As mentioned in the PR, the rs6000_frame_related rewrite broke rtl sharing
whose --enable-checking=rtl verification has been broken for the last 3.5
years until today.
The problem is that simplify_replace_rtx doesn't unshare everything, only
the minimum needed to replace what is needed without changing original
expression.  But as we replace something from PATTERN of the insn and
want to stick that into REG_FRAME_RELATED_EXPR note in the insn stream
next to the PATTERN, there can't be any sharing except for shareable
rtxes.

The following patch is the more memory-friendly version.  It marks everything
in the original PATTERN as used, then does all the simplify_replace_rtx calls,
which usually (more often with the patch I've just posted) leave the used bit
unset on any rtxes they unshare or create, and finally calls
copy_rtx_if_shared, which should unshare only whatever simplify_replace_rtx
has not already unshared.
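
The mark / replace / copy-if-shared sequence can be modelled on a toy tree.
Everything below (struct toy_rtx and the toy_* functions) is a simplified
illustration written for this note, not the GCC implementation.

```c
#include <stdlib.h>

/* Toy expression node standing in for an rtx: two operands and a
   "used" mark like the one the sharing machinery relies on.  */
struct toy_rtx
{
  int used;
  struct toy_rtx *op[2];
};

/* Allocate an unmarked node with operands A and B.  */
static struct toy_rtx *
toy_new (struct toy_rtx *a, struct toy_rtx *b)
{
  struct toy_rtx *x = calloc (1, sizeof *x);
  if (!x)
    abort ();
  x->op[0] = a;
  x->op[1] = b;
  return x;
}

/* Mark X and everything below it as used.  */
static void
toy_set_used_flags (struct toy_rtx *x)
{
  if (!x)
    return;
  x->used = 1;
  toy_set_used_flags (x->op[0]);
  toy_set_used_flags (x->op[1]);
}

/* Copy X without copying its operands; the copy starts unmarked.  */
static struct toy_rtx *
toy_shallow_copy_unmarked (const struct toy_rtx *x)
{
  return toy_new (x->op[0], x->op[1]);
}

/* Walk X: any node that is already marked is duplicated, anything
   unmarked is kept in place and marked.  */
static struct toy_rtx *
toy_copy_if_shared (struct toy_rtx *x)
{
  if (!x)
    return NULL;
  if (x->used)
    x = toy_shallow_copy_unmarked (x);
  x->used = 1;
  x->op[0] = toy_copy_if_shared (x->op[0]);
  x->op[1] = toy_copy_if_shared (x->op[1]);
  return x;
}
```

After marking the original pattern, shallow-copying its root unmarked and
swapping in one fresh operand, toy_copy_if_shared duplicates only the operand
still shared with the original; the fresh node and the new root are kept
as-is.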

The last hunk just removes an unnecessary condition: if the condition is not
true, we have already returned from the function earlier and don't do any
replacements.

Markus said he has bootstrapped this patch with rtl checking on powerpc64.

Ok for trunk?

2016-11-30  Jakub Jelinek  

PR target/78614
* config/rs6000/rs6000.c (rs6000_frame_related): Call
set_used_flags (pat) before any simplifications.  Clear used flag on
PARALLEL copy.  Don't guard add_reg_note call.  Call
copy_rtx_if_shared on pat before storing it into
REG_FRAME_RELATED_EXPR.

--- gcc/config/rs6000/rs6000.c.jj   2016-11-29 07:31:02.0 +0100
+++ gcc/config/rs6000/rs6000.c  2016-11-30 17:10:40.805842306 +0100
@@ -27170,6 +27170,7 @@ rs6000_frame_related (rtx_insn *insn, rt
  Call simplify_replace_rtx on the SETs rather than the whole insn
  so as to leave the other stuff alone (for example USE of r12).  */
 
+  set_used_flags (pat);
   if (GET_CODE (pat) == SET)
 {
   if (repl)
@@ -27181,6 +27182,7 @@ rs6000_frame_related (rtx_insn *insn, rt
 {
   pat = shallow_copy_rtx (pat);
   XVEC (pat, 0) = shallow_copy_rtvec (XVEC (pat, 0));
+  RTX_FLAG (pat, used) = 0;
 
   for (int i = 0; i < XVECLEN (pat, 0); i++)
if (GET_CODE (XVECEXP (pat, 0, i)) == SET)
@@ -27203,8 +27205,7 @@ rs6000_frame_related (rtx_insn *insn, rt
 gcc_unreachable ();
 
   RTX_FRAME_RELATED_P (insn) = 1;
-  if (repl || reg2)
-add_reg_note (insn, REG_FRAME_RELATED_EXPR, pat);
+  add_reg_note (insn, REG_FRAME_RELATED_EXPR, copy_rtx_if_shared (pat));
 
   return insn;
 }

Jakub

[PATCH] Unset used bit in simplify_replace_* on newly copied rtxs (PR target/78614)

2016-11-30 Thread Jakub Jelinek

Hi!

Instead of a simple approach to fix PR78614 (a rs6000 backend bug) by adding:
pat = copy_rtx (pat);
before
  XVECEXP (pat, ...) = simplify_replace_rtx (XVECEXP (pat, ...), x, y);
because simplify_replace_rtx doesn't unshare all rtxes, just those required
not to modify the original expression (i.e. whenever changing x to copy_rtx
(y) somewhere also all expressions containing those), I've tried to avoid
too much GC creation by doing:
set_used_flags (pat);
  pat = shallow_copy_rtx (pat);
  XVEC (pat, 0) = shallow_copy_rtvec (XVEC (pat, 0));
  RTX_FLAG (pat, used) = 0;
...
  XVECEXP (pat, ...) = simplify_replace_rtx (XVECEXP (pat, ...), x, y);
...
  pat = copy_rtx_if_shared (pat);
I ran into a problem: when simplify_replace_rtx actually does copy_rtx
(for y), or simplifies something through
simplify_gen_{unary,binary,relational,ternary}, the used bits on the newly
created rtxes are cleared; but when we fall through into the fallback
simplify_replace_fn_rtx handling, it calls shallow_copy_rtx, which copies the
set used bit, and thus copy_rtx_if_shared copies the rtx again.

The following patch improves it by resetting the used bit, similarly to how
copy_rtx resets it.  In addition, I've noticed that copy_rtx_if_shared
documents that if some rtx with Es in format is unshared, it is required
that all the rtvecs it contains are unshared as well, partial unsharing
is not valid.  Most of rtxes contain at most a single E or are just used
in gen* programs, ASM_OPERANDS is an exception that contains 3.  So the
patch also ensures that if we copy ASM_OPERANDS, we unshare all the rtvecs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Now that I think about it, the second hunk should also probably contain
the for (int k = 0; fmt[k]; k++) loop, but as the only rtx with more than
one E is ssiEEEi this isn't needed with the current rtx.def.

2016-11-30  Jakub Jelinek  

PR target/78614
* simplify-rtx.c (simplify_replace_fn_rtx): When copying at least
one 'E' format vector, copy all of them.  Clear used flag after
shallow_copy_rtx.

--- gcc/simplify-rtx.c.jj   2016-11-30 13:57:12.0 +0100
+++ gcc/simplify-rtx.c  2016-11-30 17:42:08.606817145 +0100
@@ -547,13 +547,20 @@ simplify_replace_fn_rtx (rtx x, const_rt
  old_rtx, fn, data);
if (op != RTVEC_ELT (vec, j))
  {
-   if (newvec == vec)
+   if (x == newx)
  {
-   newvec = shallow_copy_rtvec (vec);
-   if (x == newx)
- newx = shallow_copy_rtx (x);
-   XVEC (newx, i) = newvec;
+   newx = shallow_copy_rtx (x);
+   RTX_FLAG (newx, used) = 0;
+   /* If we copy X, we need to copy also all
+  vectors in it, rather than copy only
+  a subset of them and share the rest.  */
+   for (int k = 0; fmt[k]; k++)
+ if (fmt[k] == 'E')
+   XVEC (newx, k) = shallow_copy_rtvec (XVEC (x, k));
+   newvec = XVEC (newx, i);
  }
+   else
+ gcc_checking_assert (vec != newvec);
RTVEC_ELT (newvec, j) = op;
  }
  }
@@ -566,7 +573,10 @@ simplify_replace_fn_rtx (rtx x, const_rt
if (op != XEXP (x, i))
  {
if (x == newx)
- newx = shallow_copy_rtx (x);
+ {
+   newx = shallow_copy_rtx (x);
+   RTX_FLAG (newx, used) = 0;
+ }
XEXP (newx, i) = op;
  }
  }

Jakub

Re: [PATCH, libstdc++] Fix detection of posix_memalig for cross-builds

2016-11-30 Thread Christophe Lyon

On 30 November 2016 at 22:51, Jonathan Wakely  wrote:
> On 30/11/16 22:32 +0100, Christophe Lyon wrote:
>>
>> On 30 November 2016 at 20:00, Bernd Edlinger 
>> wrote:
>>>
>>> Hi,
>>>
>>> I noticed that a cross-compiler produces an unusable libstdc++.so
>>> that contains an unresolved reference to aligned_alloc instead of
>>> posix_memalign, or whatever is actually available.
>>>
>>> Therefore it is impossible to link any C++ programs against the
>>> libstdc++.so that comes with the cross-compiler.
>>>
>>> That happens for instance in the following configuration:
>>> --target=arm-linux-gnueabihf.
>>>
>>
>> How could this be unnoticed so far?
>
>
> I did wonder that.
>
> The newlib config is hardcoded, which probably covers a lot of the
> cross builds in regular use.
>
The config mentioned by Bernd (arm-linux-gnueabihf) does not use newlib.
I checked my libstdc++.log files, there's no -static option in use, I don't
use --disable-shared, so I'm not sure why there's a problem?


>
>>> The attached patch adds a link test for the memalign function
>>> and fixes the cross-build for me.
>>>
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.

New Spanish PO file for 'gcc' (version 6.2.0)

2016-11-30 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-6.2.0.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH, libstdc++] Fix detection of posix_memalig for cross-builds

2016-11-30 Thread Jonathan Wakely


On 30/11/16 22:32 +0100, Christophe Lyon wrote:
> On 30 November 2016 at 20:00, Bernd Edlinger wrote:
>> Hi,
>>
>> I noticed that a cross-compiler produces an unusable libstdc++.so
>> that contains an unresolved reference to aligned_alloc instead of
>> posix_memalign, or whatever is actually available.
>>
>> Therefore it is impossible to link any C++ programs against the
>> libstdc++.so that comes with the cross-compiler.
>>
>> That happens for instance in the following configuration:
>> --target=arm-linux-gnueabihf.
>
> How could this be unnoticed so far?

I did wonder that.

The newlib config is hardcoded, which probably covers a lot of the
cross builds in regular use.

>> The attached patch adds a link test for the memalign function
>> and fixes the cross-build for me.
>>
>> Is it OK for trunk?
>>
>> Thanks
>> Bernd.

Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-11-30 Thread Cary Coutant

How about if instead of special DW_OP codes, you instead define a new
virtual register that contains the mangled return address? If the rule
for that virtual register is anything other than DW_CFA_undefined,
you'd expect to find the mangled return address using that rule;
otherwise, you would use the rule for LR instead and expect an
unmangled return address. The earlier example would become (picking an
arbitrary value of 120 for the new virtual register number):

.cfi_startproc
   0x0  paciasp (this instruction signs the return address register LR/X30)
.cfi_val 120, DW_OP_reg30
   0x4  stp x29, x30, [sp, -32]!
.cfi_offset 120, -16
.cfi_offset 29, -32
.cfi_def_cfa_offset 32
   0x8  add x29, sp, 0

Just a suggestion...

-cary
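
Read as unwinder logic, the suggestion above might look roughly like the
sketch below.  The toy_* types, the way rules are represented, and the
signature-stripping mask are all assumptions made for illustration; a real
unwinder would evaluate CFI rules against the CFA and authenticate the pointer
with the proper key.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register numbers: 30 is LR; 120 is the arbitrary virtual register
   number picked in the example above.  */
enum { TOY_REG_LR = 30, TOY_REG_SIGNED_RA = 120, TOY_NREGS = 128 };

/* One unwind rule per register: either undefined, or "the saved value
   is VALUE" (a real unwinder would compute it from the CFA).  */
struct toy_rule
{
  bool defined;
  uint64_t value;
};

/* Hypothetical stand-in for authenticating/stripping the signature;
   here the "signature" is simply the top 16 bits.  */
static uint64_t
toy_strip_signature (uint64_t signed_ra)
{
  return signed_ra & 0x0000ffffffffffffULL;
}

/* Pick the return address: prefer the virtual register when it has a
   rule, otherwise fall back to LR, which is then taken as unmangled.  */
static uint64_t
toy_return_address (const struct toy_rule rules[TOY_NREGS])
{
  if (rules[TOY_REG_SIGNED_RA].defined)
    return toy_strip_signature (rules[TOY_REG_SIGNED_RA].value);
  return rules[TOY_REG_LR].value;
}
```

The point of the design is that no new DW_OP is needed: the presence or
absence of a rule for the virtual register carries the "is it mangled"
information.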


On Wed, Nov 16, 2016 at 6:02 AM, Jakub Jelinek  wrote:
> On Wed, Nov 16, 2016 at 02:54:56PM +0100, Mark Wielaard wrote:
>> On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote:
>> >   The two operations DW_OP_AARCH64_paciasp and DW_OP_AARCH64_paciasp_deref
>> > were designed as shortcut operations when LR is signed with A key and using
>> > function's CFA as salt.  This is the default behaviour of return address
>> > signing so is expected to be used for most of the time.  DW_OP_AARCH64_pauth
>> > is designed as a generic operation that allow describing pointer signing on
>> > any value using any salt and key in case we can't use the shortcut operations
>> > we can use this.
>>
>> I admit to not fully understand the salting/keying involved. But given
>> that the DW_OP space is really tiny, so we would like to not eat up too
>> many of them for new opcodes. And given that introducing any new DW_OPs
>> using for CFI unwinding will break any unwinder anyway causing us to
>> update them all for this new feature. Have you thought about using a new
>> CIE augmentation string character for describing that the return
>> address/link register used by a function/frame is salted/keyed?
>>
>> This seems a good description of CIE records and augmentation
>> characters: http://www.airs.com/blog/archives/460
>>
>> It obviously also involves updating all unwinders to understand the new
>> augmentation character (and possible arguments). But it might be more
>> generic and saves us from using up too many DW_OPs.
>
> From what I understood, the return address is not always scrambled, so
> it doesn't apply to the whole function, just to most of it (except for
> an insn in the prologue and some in the epilogue).  So I think one op is
> needed.  But can't it be just a toggleable flag whether the return address
> is scrambled + some arguments to it?
> Thus DW_OP_AARCH64_scramble .uleb128 0 would mean that the default
> way of scrambling starts here (if not already active) or any kind of
> scrambling ends here (if already active), and
> DW_OP_AARCH64_scramble .uleb128 non-zero would be whatever encoding you need
> to represent details of the less common variants with details what to do.
> Then you'd just hook through some MD_* macro in the unwinder the
> descrambling operation if the scrambling is active at the insns you unwind
> on.
>
> Jakub

Re: Go patch committed: Merge to gccgo branch

2016-11-30 Thread Ian Lance Taylor

Now I've merged GCC trunk revision 243083 to the gccgo branch.

Ian

Re: [PATCH, libstdc++] Fix detection of posix_memalig for cross-builds

2016-11-30 Thread Christophe Lyon

On 30 November 2016 at 20:00, Bernd Edlinger  wrote:
> Hi,
>
> I noticed that a cross-compiler produces an unusable libstdc++.so
> that contains an unresolved reference to aligned_alloc instead of
> posix_memalign, or whatever is actually available.
>
> Therefore it is impossible to link any C++ programs against the
> libstdc++.so that comes with the cross-compiler.
>
> That happens for instance in the following configuration:
> --target=arm-linux-gnueabihf.
>

How could this be unnoticed so far?

> The attached patch adds a link test for the memalign function
> and fixes the cross-build for me.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.

Re: [Fortran, Patch, PR{43366, 57117, 61337, 61376}, v1] Assign to polymorphic objects.

2016-11-30 Thread Dominique d'Humières

If I compile the test with an instrumented gfortran, I get

../../work/gcc/fortran/interface.c:2948:33: runtime error: load of value 1818451807, which is not a valid value for type 'expr_t'

Dominique

> On 30 Nov 2016, at 21:06, David Edelsohn wrote:
> 
> Hi, Andre
> 
> I have noticed that the alloc_comp_class_5.f03 testcase fails on AIX.
> Annotating the testcase a little, shows that the failure is at
> 
>  if (any(x /= ["foo", "bar", "baz"])) call abort()
> 
> write (*,*) any
> 
> at the point of failure produces
> 
> "foobarba"
> 
> - David

libgo patch committed: Print C functions in traceback

2016-11-30 Thread Ian Lance Taylor

This patch to libgo prints C functions when doing a stack traceback.

Since gccgo can trace back through C code as easily as Go code, we
should print C functions in the traceback.

This worked before https://golang.org/cl/31230 for a dumb reason.  The
default value for runtime.traceback_cache was, and is, 2 << 2, meaning
to print all functions.  The old C code for runtime_parsedebugvars
would return immediately and do nothing if the environment variable
GODEBUG was not set (if GODEBUG was set it would later call
setTraceback).  The new Go code for runtime.parsedebugvars does not
return immediately if GODEBUG is not set, and always calls
setTraceback.  Either way, if GOTRACEBACK is not set, setTraceback
would set traceback_cache to 1 << 2, meaning to only print non-runtime
functions and having the effect of not printing plain C functions.

This patch keeps the current handling of GODEBUG/GOTRACEBACK, which
matches the gc library, but adds an extra check to print C functions by
default.
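
In C terms, the extra check amounts to something like the sketch below.  The
function names here are made up for illustration; the real check is the Go
code in showframe in the patch, and this only mirrors its name test.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if NAME starts with PREFIX.  */
static bool
has_prefix (const char *name, const char *prefix)
{
  return strncmp (name, prefix, strlen (prefix)) == 0;
}

/* Mirror of the new test: a name with no '.' is taken to be a plain C
   function, and is shown unless it is an internal __go_ helper or a
   cgo-generated _cgo_ function.  */
static bool
looks_like_plain_c_function (const char *name)
{
  return strchr (name, '.') == NULL
	 && !has_prefix (name, "__go_")
	 && !has_prefix (name, "_cgo_");
}
```

So a frame named qsort would be printed, while runtime.gopanic, __go_panic and
_cgo_wrapper (the last two being hypothetical names) fall through to the
existing GOTRACEBACK level checks.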

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 242992)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1d3e0ceee45012a1c3b4ff7f5119a72f90bfcf6a
+9be198d960e4bc46e21e4da1e3d4a1619266b8ab
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/traceback_gccgo.go
===
--- libgo/go/runtime/traceback_gccgo.go (revision 242724)
+++ libgo/go/runtime/traceback_gccgo.go (working copy)
@@ -89,6 +89,15 @@ func showframe(name string, gp *g) bool
if g.m.throwing > 0 && gp != nil && (gp == g.m.curg || gp == g.m.caughtsig.ptr()) {
return true
}
+
+   // Gccgo can trace back through C functions called via cgo.
+   // We want to print those in the traceback.
+   // But unless GOTRACEBACK > 1 (checked below), still skip
+   // internal C functions and cgo-generated functions.
+   if !contains(name, ".") && !hasprefix(name, "__go_") && !hasprefix(name, "_cgo_") {
+   return true
+   }
+
level, _, _ := gotraceback()
 
// Special case: always show runtime.gopanic frame, so that we can

Re: [PATCH] [PR78112] Remove the g++.dg/pr78112.C testcase

2016-11-30 Thread Mike Stump

On Nov 30, 2016, at 5:04 AM, Pierre-Marie de Rodat  wrote:
> I recently added a testcase for PR78112's resolution. Unfortunately,
> what it tests changes from one platform to another and I even get
> different results for a single platform (see
> ). As multiple
> developers reported these errors and as the testcase relies on a
> compiler behavior that still looks bogus to me, I suggest to remove the
> testcase for now.
> 
> Ok to commit?

So, I noticed this and didn't see who you wanted to review it so, since it was 
C++, I thought I'd take a look at it.  Ick.  Complex issue.

One way to test this would be to have an internal check in the compiler, as an
assert, for the thing you don't want to happen, and then have the unpatched
compiler abort when given the test case, before the work that fixed the
original PR.  The test case then shows the failure, and should anyone break it,
the internal check will catch the problem and abort, thus causing the test case
to fail (again).  Then you only need to compile the test case and expect
a non-zero output from the compilation.  On darwin, the excess message from
dsymutil should be findable by dejagnu and should also be able to fail the test
case on darwin.

If you like that design (and a dwarf maintainer likes that design), then you 
can fix this test case by removing:

> -/* { dg-final { scan-assembler-times DW_AT_inline 6 { xfail *-*-aix* } } } */
> -/* { dg-final { scan-assembler-times DW_AT_object_pointer 37 { xfail *-*-aix* } } } */

alone and otherwise keep the full test case.  If that isn't done, this can be 
fixed by making the test a darwin only test case, as on darwin I'm led to
believe that the dsymutil output will cause the test case to fail.  For that,
just add a target darwin* to the test case and then you can either remove the 
two above lines or update them to be what is seen on darwin.  And lastly, I'm 
fine with you just punting the issue and removing the entire test case if you 
feel it is best.

When looking at the test case, I wonder just how reduced it really was.  The 
last possible option would be to reduce the test case further and see if the 
problem can be eliminated that way.  Again, I'll pre-approve the test suite 
part of any of those solutions you like best.

Re: [Fortran, Patch, PR{43366, 57117, 61337, 61376}, v1] Assign to polymorphic objects.

2016-11-30 Thread David Edelsohn

Hi, Andre

I have noticed that the alloc_comp_class_5.f03 testcase fails on AIX.
Annotating the testcase a little, shows that the failure is at

  if (any(x /= ["foo", "bar", "baz"])) call abort()

write (*,*) any

at the point of failure produces

"foobarba"

- David

Re: [PATCH, libstdc++] Fix detection of posix_memalig for cross-builds

2016-11-30 Thread Jonathan Wakely


On 30/11/16 19:00 +, Bernd Edlinger wrote:
> Hi,
>
> I noticed that a cross-compiler produces an unusable libstdc++.so
> that contains an unresolved reference to aligned_alloc instead of
> posix_memalign, or whatever is actually available.
>
> Therefore it is impossible to link any C++ programs against the
> libstdc++.so that comes with the cross-compiler.
>
> That happens for instance in the following configuration:
> --target=arm-linux-gnueabihf.
>
> The attached patch adds a link test for the memalign function
> and fixes the cross-build for me.
>
> Is it OK for trunk?

OK.

Presumably we should have this for other cross targets too.

[PATCH] Minimal reimplementation of errors.c within read-md.c

2016-11-30 Thread David Malcolm

On Wed, 2016-11-30 at 17:18 +0100, Bernd Schmidt wrote:
> On 11/29/2016 10:13 PM, Bernd Schmidt wrote:
> > On 11/29/2016 07:53 PM, David Malcolm wrote:
> > 
> > > Would you prefer that I went with approach (B), or is approach
> > > (A)
> > > acceptable?
> > 
> > Well, I was hoping there'd be an approach (C) where the read-rtl
> > code
> > uses whatever diagnostics framework that is available. Maybe it'll
> > turn
> > out that's too hard. Somehow the current patch looked strange to
> > me, but
> > if there's no easy alternative maybe we'll have to go with it.
> 
> So, I've tried to build patches 1-6 + 8, without #7. It looks like
> the
> differences are as follows:
> 
> - A lack of seen_error in errors.[ch], could be easily added, and
>basically a spelling mismatch between have_error and errorcount.
> - A lack of fatal in diagnostics.c. Could maybe be added to just call
>fatal_error?
> 
> All this seems simpler and cleaner to fix than linking two different
> error handling frameworks into one binary. Do you see any other
> difficulties?
> 
> 
> Bernd

Thanks.  Here's an implementation of that idea.

Given that fatal_error doesn't expose a way to accept va_args, it
seemed simplest to just copy the implementation from errors.c,
and conditionalize it with #ifndef GENERATOR_FILE.

Only lightly tested so far, but it builds (stage 1) and passes the
new rtl.exp test suite from patch 9 (which includes an error-handling
test).

Is this OK, assuming it passes regular testing?

gcc/ChangeLog:
* read-md.c (have_error): New global, copied from errors.c.
(fatal): New function, copied from errors.c.
* selftest-rtl.c: Include "diagnostic-core.h".
---
 gcc/read-md.c  | 25 +
 gcc/selftest-rtl.c |  1 +
 2 files changed, 26 insertions(+)

diff --git a/gcc/read-md.c b/gcc/read-md.c
index 25bc3c4..ce99473 100644
--- a/gcc/read-md.c
+++ b/gcc/read-md.c
@@ -31,6 +31,31 @@ along with GCC; see the file COPYING3.  If not see
 #include "vec.h"
 #include "read-md.h"
 
+#ifndef GENERATOR_FILE
+
+/* Minimal reimplementation of errors.c for use by RTL frontend
+   within cc1.  */
+
+int have_error = 0;
+
+/* Fatal error - terminate execution immediately.  Does not return.  */
+
+void
+fatal (const char *format, ...)
+{
+  va_list ap;
+
+  va_start (ap, format);
+  fprintf (stderr, "%s: ", progname);
+  vfprintf (stderr, format, ap);
+  va_end (ap);
+  fputc ('\n', stderr);
+  exit (FATAL_EXIT_CODE);
+}
+
+#endif /* #ifndef GENERATOR_FILE */
+
+
 /* Associates PTR (which can be a string, etc.) with the file location
specified by FILENAME and LINENO.  */
 struct ptr_loc {
diff --git a/gcc/selftest-rtl.c b/gcc/selftest-rtl.c
index 8f3c976..c5ab216 100644
--- a/gcc/selftest-rtl.c
+++ b/gcc/selftest-rtl.c
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "memmodel.h"
 #include "emit-rtl.h"
 #include "selftest-rtl.h"
+#include "diagnostic-core.h"
 
 #if CHECKING_P
 
-- 
1.8.5.3

Re: Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-30 Thread Denis Chertykov

2016-11-30 18:09 GMT+03:00 Georg-Johann Lay :
> On 30.11.2016 07:27, Pitchumani Sivanupandi wrote:
>>
>> On Tuesday 29 November 2016 10:06 PM, Denis Chertykov wrote:
>>>
>>> 2016-11-28 10:17 GMT+03:00 Pitchumani Sivanupandi:
>>>>
>>>> On Saturday 26 November 2016 12:11 AM, Denis Chertykov wrote:
>>>>>
>>>>> I'm sorry for delay.
>>>>>
>>>>> I have a problem with the patch:
>>>>> (Stripping trailing CRs from patch; use --binary to disable.)
>>>>> patching file avr-arch.h
>>>>> (Stripping trailing CRs from patch; use --binary to disable.)
>>>>> patching file avr-devices.c
>>>>> (Stripping trailing CRs from patch; use --binary to disable.)
>>>>> patching file avr-mcus.def
>>>>> Hunk #1 FAILED at 62.
>>>>> 1 out of 1 hunk FAILED -- saving rejects to file avr-mcus.def.rej
>>>>> (Stripping trailing CRs from patch; use --binary to disable.)
>>>>> patching file gen-avr-mmcu-specs.c
>>>>> Hunk #1 succeeded at 215 (offset 5 lines).
>>>>> (Stripping trailing CRs from patch; use --binary to disable.)
>>>>> patching file specs.h
>>>>> Hunk #1 succeeded at 58 (offset 1 line).
>>>>> Hunk #2 succeeded at 66 (offset 1 line).
>>>>
>>>>
>>>> There are changes in avr-mcus.def after this patch is submitted.
>>>> Now, I have incorporated the changes and attached the resolved patch.
>>>>
>>>> Regards,
>>>> Pitchumani
>>>>
>>>> gcc/ChangeLog
>>>>
>>>> 2016-11-09  Pitchumani Sivanupandi
>>>>
>>>>  * config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
>>>>  * config/avr/avr-devices.c(avr_mcu_types): Add flash size info.
>>>>  * config/avr/avr-mcu.def: Likewise.
>>>>  * config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded prefix
>>>>  check to find wrap-around value, instead use MCU flash size. For 8k flash
>>>>  devices, update link_pmem_wrap spec string to add --pmem-wrap-around=8k.
>>>>  * config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC and
>>>>  add to linker specs (LINK_SPEC) directly.
>>>
>>> Committed.
>>
>> It looks like only avr-mcus.def and ChangeLog are committed.
>> Without the other changes trunk build is broken.
>>
>> Regards,
>> Pitchumani
>
>
> Hi, I took the liberty of committing the missing files.
>
> http://gcc.gnu.org/r243033
>

Thank you.

[C++/78252] libiberty demangler crash with lambda (auto)

2016-11-30 Thread Nathan Sidwell

This patch fixes a problem in libiberty's symbol demangler.  With a 
templated forwarding function such as std::forward, we can end up 
emitting mangled function names that encode lambda information.  Lambdas 
with auto argument types have a synthesized templated operator(), and 
g++ uses that when mangling the lambda.


Unfortunately g++ doesn't notice the template parameters there mean 
'auto' and emits regular template parameter references. (This is a bug, 
see below.)


But, as the forwarding function itself is a template, and the lambda is 
part of a template parameter substitution, we can end up with the 
demangler recursing unboundedly.  In other cases we can fail to demangle 
(returning null), or demangle to an unexpected type (substituting the 
current template parameter type into the place of the 'auto').


This patch fixes the demangler by noting when it's printing the argument 
types of a lambda.  In that case whenever we encounter a template 
parameter reference we emit 'auto', and also inhibit some &/&& smushing 
that needs checking.  AFAICT, once inside a lambda argument list we 
cannot encounter template parameter references that actually refer to an 
enclosing template argument list. That means we don't have the problem 
of disabling this additional check within the argument list printing.  I 
don't think we can meet a nested lambda type either, but the ++...-- 
idiom seemed safer to me.
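
The counter idea described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct print_state`, `append`, `print_template_param` and `print_lambda` are hypothetical names, not libiberty's actual data structures or functions, and the output format is simplified. It shows a depth counter that is incremented around lambda argument printing and decremented afterwards, so a template parameter reference prints as 'auto' only inside a lambda argument list.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified printer state; not libiberty's
   struct d_print_info.  */
struct print_state
{
  char buf[128];
  /* Non-zero while printing the argument types of a lambda.  */
  int is_lambda_arg;
};

/* Append S to the output buffer, which must have room for it.  */
static void
append (struct print_state *ps, const char *s)
{
  assert (strlen (ps->buf) + strlen (s) < sizeof ps->buf);
  strcat (ps->buf, s);
}

/* Print a template parameter reference.  Inside a lambda argument list
   it really means 'auto', so SUBSTITUTION is not used there.  */
static void
print_template_param (struct print_state *ps, const char *substitution)
{
  append (ps, ps->is_lambda_arg ? "auto" : substitution);
}

/* Print a lambda taking one reference parameter.  The counter is bumped
   only around the argument list and restored afterwards, so nesting
   would also work.  */
static void
print_lambda (struct print_state *ps, const char *substitution)
{
  append (ps, "{lambda(");
  ps->is_lambda_arg++;
  print_template_param (ps, substitution);
  append (ps, "&");
  ps->is_lambda_arg--;
  append (ps, ")#1}");
}
```

Using a counter rather than a flag means the state is restored correctly even if a lambda type were ever printed from within another lambda's argument list.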


We cannot do this substitution when parsing the mangled name, because 
g++ applies the usual squangling back references as-if there really was 
a template parameter reference.  Later squangling references to the type 
containing the lambda argument may or may not require the reference to 
be to an enclosing template argument, or be auto, depending on the 
context of the squangle reference.


I've also included a C++ testcase to check the mangling of the lambdas 
that cause this.  While this is a g++ bug, it's an ABI-affecting one, 
and we shouldn't change the behaviour unintentionally.  I've not 
investigated why the mangler's failing to check is_auto, and will look 
at that later.  I imagine a fix will be -fabi-version dependent. I have 
filed 78621 to track it.


ok?

Nick, we originally found this when GDB exploded. If you're ok with it, 
I'll commit to binutils/gdb when approved for gcc.


nathan
--
Nathan Sidwell
2016-11-30  Nathan Sidwell  

	gcc/testsuite/
	* g++.dg/cpp1y/lambda-mangle-1.C: New.

	libiberty/
	* cp-demangle.c (struct d_print_info): Add is_lambda_arg field.
	(d_print_init): Initialize it.
	(d_print_comp_inner) : Check
	is_lambda_arg for auto.
	: Skip smashing check when
	is_lambda_arg.
	: Increment is_lambda_arg around arg
	printing.
	* testsuite/demangle-expected: Add lambda auto mangling cases.

Index: gcc/testsuite/g++.dg/cpp1y/lambda-mangle-1.C
===
--- gcc/testsuite/g++.dg/cpp1y/lambda-mangle-1.C	(nonexistent)
+++ gcc/testsuite/g++.dg/cpp1y/lambda-mangle-1.C	(working copy)
@@ -0,0 +1,47 @@
+// { dg-do compile { target c++14 } }
+
+// PR 78252
+
+// We erroneously mangle lambda auto parms as if template parameters (T_),
+// rather than auto (Da).  While that's unfortunate, it'd be best if
+// we didn't accidentally change that.
+
+template class X;
+
+template
+T & (T )
+{
+  return static_cast (v);
+}
+
+template
+void eat (T )
+{
+}
+
+void Foo ()
+{
+  auto lam = [](auto &) { };
+  auto lam_1 = [](int &, auto &) { };
+  auto lam_2 = [](auto &, X &) { };
+  auto lam_3 = [](auto (*)[5]) { };
+
+  forward (lam);
+  forward (lam_1);
+  forward (lam_2);
+  forward (lam_3);
+
+  eat (lam);
+  eat (lam_1);
+  eat (lam_2);
+  eat (lam_3);
+}
+
+// { dg-final { scan-assembler "_Z7forwardIZ3FoovEUlRT_E_EOS0_S1_:" } }
+// { dg-final { scan-assembler "_Z7forwardIZ3FoovEUlRiRT_E0_EOS1_S2_:" } }
+// { dg-final { scan-assembler "_Z7forwardIZ3FoovEUlRT_R1XIiEE1_EOS0_S1_:" } }
+// { dg-final { scan-assembler "_Z7forwardIZ3FoovEUlPA5_T_E2_EOS0_RS0_:" } }
+// { dg-final { scan-assembler "_Z3eatIZ3FoovEUlRT_E_EvS1_:" } }
+// { dg-final { scan-assembler "_Z3eatIZ3FoovEUlRiRT_E0_EvS2_:" } }
+// { dg-final { scan-assembler "_Z3eatIZ3FoovEUlRT_R1XIiEE1_EvS1_:" } }
+// { dg-final { scan-assembler "_Z3eatIZ3FoovEUlPA5_T_E2_EvRS0_:" } }
Index: libiberty/cp-demangle.c
===
--- libiberty/cp-demangle.c	(revision 243016)
+++ libiberty/cp-demangle.c	(working copy)
@@ -343,6 +343,12 @@ struct d_print_info
   struct d_print_mod *modifiers;
   /* Set to 1 if we saw a demangling error.  */
   int demangle_failure;
+  /* Non-zero if we're printing a lambda argument. A template
+ parameter reference actually means 'auto'.  This is a bug in name
+ mangling, and will demangle to something confusing.
+ Unfortunately it can also cause infinite recursion, if we don't
+ interpret this as 'auto'.

[PATCH] Fix UB in dwarf2out.c (PR debug/78587)

2016-11-30 Thread Jakub Jelinek

Hi!

This patch fixes 3 spots with UB in dwarf2out.c; furthermore, the fix for
the first spot results in smaller/better debug info.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-30  Jakub Jelinek  

PR debug/78587
* dwarf2out.c (loc_descr_plus_const): For negative offset use
uint_loc_descriptor instead of int_loc_descriptor and perform negation
in unsigned HOST_WIDE_INT type.
(scompare_loc_descriptor): Shift UINTVAL left instead of INTVAL.

* gcc.dg/debug/pr78587.c: New test.
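
The two kinds of fix named in the ChangeLog can be illustrated outside of GCC with a minimal sketch. The assumptions here are that `long long` stands in for HOST_WIDE_INT, and that `sketch_negate` and `sketch_shift_left` are hypothetical helpers, not functions from dwarf2out.c. Negating the most negative signed value and left-shifting a negative signed value are both undefined in C, while the same operations on the unsigned type simply wrap.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for HOST_WIDE_INT and its unsigned variant.  */
typedef long long sketch_hwi;
typedef unsigned long long sketch_uhwi;

/* Return the magnitude of a negative OFFSET.  Writing -offset would
   overflow (UB) for the most negative value; converting to the unsigned
   type first makes the negation wrap instead.  */
static sketch_uhwi
sketch_negate (sketch_hwi offset)
{
  assert (offset < 0);
  return -(sketch_uhwi) offset;
}

/* Shift VALUE left by SHIFT bits.  Left-shifting a negative signed value
   is UB in C; shifting its unsigned representation is well defined.  */
static sketch_uhwi
sketch_shift_left (sketch_hwi value, unsigned int shift)
{
  assert (shift < sizeof (sketch_uhwi) * CHAR_BIT);
  return (sketch_uhwi) value << shift;
}
```

The first helper mirrors the loc_descr_plus_const change in spirit, the second the INTVAL to UINTVAL change in scompare_loc_descriptor.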

--- gcc/dwarf2out.c.jj  2016-11-18 22:55:19.0 +0100
+++ gcc/dwarf2out.c 2016-11-30 15:16:39.402673343 +0100
@@ -1514,7 +1514,8 @@ loc_descr_plus_const (dw_loc_descr_ref *
 
   else
 {
-  loc->dw_loc_next = int_loc_descriptor (-offset);
+  loc->dw_loc_next
+   = uint_loc_descriptor (-(unsigned HOST_WIDE_INT) offset);
   add_loc_descr (>dw_loc_next, new_loc_descr (DW_OP_minus, 0, 0));
 }
 }
@@ -13837,7 +13838,7 @@ scompare_loc_descriptor (enum dwarf_loca
   if (CONST_INT_P (XEXP (rtl, 1))
  && GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT
  && (size_of_int_loc_descriptor (shift) + 1
- + size_of_int_loc_descriptor (INTVAL (XEXP (rtl, 1)) << shift)
+ + size_of_int_loc_descriptor (UINTVAL (XEXP (rtl, 1)) << shift)
  >= size_of_int_loc_descriptor (GET_MODE_MASK (op_mode)) + 1
 + size_of_int_loc_descriptor (INTVAL (XEXP (rtl, 1))
   & GET_MODE_MASK (op_mode
@@ -13852,7 +13853,7 @@ scompare_loc_descriptor (enum dwarf_loca
   add_loc_descr (, int_loc_descriptor (shift));
   add_loc_descr (, new_loc_descr (DW_OP_shl, 0, 0));
   if (CONST_INT_P (XEXP (rtl, 1)))
-op1 = int_loc_descriptor (INTVAL (XEXP (rtl, 1)) << shift);
+op1 = int_loc_descriptor (UINTVAL (XEXP (rtl, 1)) << shift);
   else
 {
   add_loc_descr (, int_loc_descriptor (shift));
--- gcc/testsuite/gcc.dg/debug/pr78587.c.jj 2016-11-30 15:01:08.855153232 +0100
+++ gcc/testsuite/gcc.dg/debug/pr78587.c 2016-11-30 15:20:22.0 +0100
@@ -0,0 +1,23 @@
+/* PR debug/78587 */
+/* { dg-do compile } */
+/* { dg-additional-options "-w" } */
+
+extern void bar (void);
+
+void
+foo (long long x)
+{
+  x ^= 9223372036854775808ULL;
+  bar ();
+}
+
+struct S { int w[4]; } a[1], b;
+
+void
+baz ()
+{
+  int e = (int) baz;
+  if (e <= -80)
+e = 0;
+  b = a[e];
+}

Jakub

[PATCH] Fix minor nits in gimple-ssa-sprintf.c (PR tree-optimization/78586)

2016-11-30 Thread Jakub Jelinek

Hi!

This patch fixes some minor nits I raised in the PR; the more severe issues
are left unresolved there.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-30  Jakub Jelinek  

PR tree-optimization/78586
* gimple-ssa-sprintf.c (format_integer): Don't handle NOP_EXPR,
CONVERT_EXPR or COMPONENT_REF here.  Formatting fix.  For
SSA_NAME_DEF_STMT with NOP_EXPR only change argtype if the rhs1's
type is INTEGER_TYPE or POINTER_TYPE.

--- gcc/gimple-ssa-sprintf.c.jj 2016-11-30 09:00:42.0 +0100
+++ gcc/gimple-ssa-sprintf.c 2016-11-30 12:57:05.996480633 +0100
@@ -968,24 +968,13 @@ format_integer (const conversion_spec 
 }
   else if (TREE_CODE (TREE_TYPE (arg)) == INTEGER_TYPE
   || TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE)
-{
-  /* Determine the type of the provided non-constant argument.  */
-  if (TREE_CODE (arg) == NOP_EXPR)
-   arg = TREE_OPERAND (arg, 0);
-  else if (TREE_CODE (arg) == CONVERT_EXPR)
-   arg = TREE_OPERAND (arg, 0);
-  if (TREE_CODE (arg) == COMPONENT_REF)
-   arg = TREE_OPERAND (arg, 1);
-
-  argtype = TREE_TYPE (arg);
-}
+/* Determine the type of the provided non-constant argument.  */
+argtype = TREE_TYPE (arg);
   else
-{
-  /* Don't bother with invalid arguments since they likely would
-have already been diagnosed, and disable any further checking
-of the format string by returning [-1, -1].  */
-  return fmtresult ();
-}
+/* Don't bother with invalid arguments since they likely would
+   have already been diagnosed, and disable any further checking
+   of the format string by returning [-1, -1].  */
+return fmtresult ();
 
   fmtresult res;
 
@@ -1059,7 +1048,12 @@ format_integer (const conversion_spec 
}
 
  if (code == NOP_EXPR)
-   argtype = TREE_TYPE (gimple_assign_rhs1 (def));
+   {
+ tree type = TREE_TYPE (gimple_assign_rhs1 (def));
+ if (TREE_CODE (type) == INTEGER_TYPE
+ || TREE_CODE (type) == POINTER_TYPE)
+   argtype = type;
+   }
}
}
 }

Jakub

[PATCH, libstdc++] Fix detection of posix_memalign for cross-builds

2016-11-30 Thread Bernd Edlinger

Hi,

I noticed that a cross-compiler produces an unusable libstdc++.so
that contains an unresolved reference to aligned_alloc instead of
posix_memalign, or whatever is actually available.

Therefore it is impossible to link any C++ programs against the
libstdc++.so that comes with the cross-compiler.

That happens for instance in the following configuration:
--target=arm-linux-gnueabihf.

The attached patch adds a link test for the memalign function
and fixes the cross-build for me.
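
To show why the configure result matters, here is a minimal sketch of code that picks an aligned allocator based on a HAVE_* macro. It is an assumption-laden illustration, not libstdc++'s implementation: `sketch_aligned_alloc` and `sketch_aligned_free` are hypothetical names, and only one HAVE_* macro is consulted. If the macro is wrongly left undefined or wrongly defined for the target, the library ends up referencing a function that may not exist there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate SIZE bytes aligned to ALIGN, which must be a power of two
   and at least sizeof (void *).  Returns NULL on failure.  */
static void *
sketch_aligned_alloc (size_t align, size_t size)
{
  assert (align >= sizeof (void *) && (align & (align - 1)) == 0);
#if defined (HAVE_POSIX_MEMALIGN)
  /* Use the system function when configure found it.  */
  void *p;
  return posix_memalign (&p, align, size) == 0 ? p : NULL;
#else
  /* Portable fallback: over-allocate, round up, and remember the
     pointer malloc returned just below the aligned block.  */
  void *raw = malloc (size + align + sizeof (void *));
  if (raw == NULL)
    return NULL;
  uintptr_t start = (uintptr_t) raw + sizeof (void *);
  uintptr_t aligned = (start + align - 1) & ~(uintptr_t) (align - 1);
  ((void **) aligned)[-1] = raw;
  return (void *) aligned;
#endif
}

/* Release memory obtained from sketch_aligned_alloc.  */
static void
sketch_aligned_free (void *p)
{
#if defined (HAVE_POSIX_MEMALIGN)
  free (p);
#else
  if (p != NULL)
    free (((void **) p)[-1]);
#endif
}
```

Built without any HAVE_* macro, the fallback path is used, which is the behaviour one would want from a cross-build where no link test succeeded.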

Is it OK for trunk?


Thanks
Bernd.
2016-11-30  Bernd Edlinger  

	* crossconfig.m4 (*-linux*): Add link-check for memalign.
	* configure: Regenerated.

Index: libstdc++-v3/configure
===
--- libstdc++-v3/configure	(revision 242960)
+++ libstdc++-v3/configure	(working copy)
@@ -59826,11 +59826,24 @@ _ACEOF
 fi
 done
 
+for ac_func in aligned_alloc posix_memalign memalign _aligned_malloc
+do :
+  as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
+ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
+eval as_val=\$$as_ac_var
+   if test "x$as_val" = x""yes; then :
+  cat >>confdefs.h <<_ACEOF
+#define `$as_echo "HAVE_$ac_func" | $as_tr_cpp` 1
+_ACEOF
 
+fi
+done
 
 
 
 
+
+
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for iconv" >&5
 $as_echo_n "checking for iconv... " >&6; }
 if test "${am_cv_func_iconv+set}" = set; then :
Index: libstdc++-v3/crossconfig.m4
===
--- libstdc++-v3/crossconfig.m4	(revision 242960)
+++ libstdc++-v3/crossconfig.m4	(working copy)
@@ -157,6 +157,7 @@ case "${host}" in
 AC_DEFINE(_GLIBCXX_USE_RANDOM_TR1)
 GCC_CHECK_TLS
 AC_CHECK_FUNCS(__cxa_thread_atexit_impl)
+AC_CHECK_FUNCS(aligned_alloc posix_memalign memalign _aligned_malloc)
 AM_ICONV
 ;;
   *-mingw32*)

Re: [PATCH] Another debug info stv fix (PR rtl-optimization/78547)

2016-11-30 Thread Jakub Jelinek

On Wed, Nov 30, 2016 at 08:01:11AM +0100, Uros Bizjak wrote:
> On Tue, Nov 29, 2016 at 8:44 PM, Jakub Jelinek  wrote:
> > Hi!
> >
> > The following testcase ICEs because DECL_RTL/DECL_INCOMING_RTL are adjusted
> > by the stv pass through the PUT_MODE modifications, which means that for
> > var-tracking.c they contain a bogus mode.
> >
> > Fixed by wrapping those into TImode subreg or adjusting the MEMs to have the
> > correct mode.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2016-11-29  Jakub Jelinek  
> >
> > PR rtl-optimization/78547
> > * config/i386/i386.c (convert_scalars_to_vector): If any
> > insns have been converted, adjust all parameters' DECL_RTL and
> > DECL_INCOMING_RTL back from V1TImode to TImode if the parameters
> > have TImode.
> 
> LGTM.

IMHO this patch has actually been working around broken RTL sharing of a MEM
between DECL_INCOMING_RTL and some REG_EQUAL note.

Here is an updated patch that avoids this sharing (the middle-end part) and
thus can remove the MEM handling and just keep REG handling in
convert_scalars_to_vector.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-29  Jakub Jelinek  

PR rtl-optimization/78547
* emit-rtl.c (unshare_all_rtl): Make sure DECL_RTL and
DECL_INCOMING_RTL are not shared.
* config/i386/i386.c (convert_scalars_to_vector): If any
insns have been converted, adjust all parameters' DECL_RTL and
DECL_INCOMING_RTL back from V1TImode to TImode if the parameters have
TImode.

--- gcc/emit-rtl.c.jj   2016-11-30 14:02:28.0 +0100
+++ gcc/emit-rtl.c  2016-11-30 14:27:35.860625382 +0100
@@ -2668,6 +2668,14 @@ unsigned int
 unshare_all_rtl (void)
 {
   unshare_all_rtl_1 (get_insns ());
+
+  for (tree decl = DECL_ARGUMENTS (cfun->decl); decl; decl = DECL_CHAIN (decl))
+{
+  if (DECL_RTL_SET_P (decl))
+   SET_DECL_RTL (decl, copy_rtx_if_shared (DECL_RTL (decl)));
+  DECL_INCOMING_RTL (decl) = copy_rtx_if_shared (DECL_INCOMING_RTL (decl));
+}
+
   return 0;
 }
 
--- gcc/config/i386/i386.c.jj   2016-11-30 14:01:49.835432636 +0100
+++ gcc/config/i386/i386.c  2016-11-30 14:28:37.072841688 +0100
@@ -4073,6 +4073,28 @@ convert_scalars_to_vector ()
crtl->stack_alignment_needed = 128;
   if (crtl->stack_alignment_estimated < 128)
crtl->stack_alignment_estimated = 128;
+  /* Fix up DECL_RTL/DECL_INCOMING_RTL of arguments.  */
+  if (TARGET_64BIT)
+   for (tree parm = DECL_ARGUMENTS (current_function_decl);
+parm; parm = DECL_CHAIN (parm))
+ {
+   if (TYPE_MODE (TREE_TYPE (parm)) != TImode)
+ continue;
+   if (DECL_RTL_SET_P (parm)
+   && GET_MODE (DECL_RTL (parm)) == V1TImode)
+ {
+   rtx r = DECL_RTL (parm);
+   if (REG_P (r))
+ SET_DECL_RTL (parm, gen_rtx_SUBREG (TImode, r, 0));
+ }
+   if (DECL_INCOMING_RTL (parm)
+   && GET_MODE (DECL_INCOMING_RTL (parm)) == V1TImode)
+ {
+   rtx r = DECL_INCOMING_RTL (parm);
+   if (REG_P (r))
+ DECL_INCOMING_RTL (parm) = gen_rtx_SUBREG (TImode, r, 0);
+ }
+ }
 }
 
   return 0;
--- gcc/testsuite/gcc.dg/pr78547.c.jj   2016-11-30 14:28:02.260287390 +0100
+++ gcc/testsuite/gcc.dg/pr78547.c  2016-11-30 14:28:02.260287390 +0100
@@ -0,0 +1,18 @@
+/* PR rtl-optimization/78547 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Os -g -freorder-blocks-algorithm=simple -Wno-psabi" } */
+/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* x86_64-*-* } } */
+
+typedef unsigned __int128 u128;
+typedef unsigned __int128 V __attribute__ ((vector_size (64)));
+
+V
+foo (u128 a, u128 b, u128 c, V d)
+{
+  V e = (V) {a};
+  V f = e & 1;
+  e = 0 != e;
+  c = c;
+  f = f << ((V) {c} & 7);
+  return f + e;
+}

Jakub

Re: [PATCH] Fix x86_64 fix_debug_reg_uses (PR rtl-optimization/78575)

2016-11-30 Thread Jakub Jelinek

On Tue, Nov 29, 2016 at 03:20:08PM -0700, Jeff Law wrote:
> On 11/29/2016 12:41 PM, Jakub Jelinek wrote:
> >Hi!
> >
> >The x86_64 stv pass uses PUT_MODE to change REGs and MEMs in place to affect
> >all setters and users, but that is undesirable in debug insns which are
> >intentionally ignored during the analysis and we should keep using correct
> >modes (TImode) instead of the new one (V1TImode).
> Note that MEMs are not shared, so twiddling the mode on any given MEM
> changes one and only one object.

Note that this patch isn't trying to work around any wrong MEM sharing,
while the other patch has been.  So, is the PR78575 patch ok as is?
I'll post the other updated patch in the corresponding thread.

Jakub

Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-11-30 Thread Yao Qi

On Wed, Nov 30, 2016 at 11:15:22AM +, Jiong Wang wrote:
> 
> Hi GDB, Binutils maintainer:
> 
>   OK on this proposal and install this patch to binutils-gdb master?
>

This proposal is fine for GDB, as long as you add a gdbarch hook and move
the code handling DW_CFA_GNU_window_save in
gdb/dwarf2-frame.c:execute_cfa_program to sparc-tdep.c and/or
sparc64-tdep.c.

-- 
Yao (齐尧)

Re: [Patch Doc] Update documentation for __fp16 type

2016-11-30 Thread Joseph Myers

On Wed, 30 Nov 2016, James Greenhalgh wrote:

> +@code{_Float16} type defined by ISO/IEC TS18661:3-2005

Add a space after "TS", and it's -3:2015 not :3-2005.

I think the -mfp16-format documentation in invoke.texi should also be 
updated to reflect that it affects availability of _Float16.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 3/3] Move data definitions from icv.c back to env.c

2016-11-30 Thread Jakub Jelinek

On Wed, Nov 30, 2016 at 08:36:27PM +0300, Alexander Monakov wrote:
> env.c contains a static constructor that would initialize various global libgomp
> data such as members of gomp_global_icv.  Therefore it's not ok to define them
> in a separate translation unit: under static linking this results in env.o not
> linked in (unless an incremental link on icv.o+env.o is performed when building
> libgomp.a).  Move definitions of global data from icv.c back to env.c, remove
> empty config/nvptx/env.c, and guard environment access on NVPTX using the new
> LIBGOMP_OFFLOADED_ONLY macro.
> 
>   * config/nvptx/env.c: Delete.
>   * icv.c: Move definitions of ICV variables back ...
>   * env.c: ...here.  Do not compile environment-related functionality if
>   LIBGOMP_OFFLOADED_ONLY is set.

Can you please move the ICVs after all the (especially system) headers are
included, even when it means 2 separate #ifndef LIBGOMP_OFFLOADED_ONLY instead
of just one?  Ok with that change.

Jakub

patch to fix PR77856

2016-11-30 Thread Vladimir N Makarov


The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77856

The bug was in new code for invariant inheritance that I added this summer.

The patch was successfully bootstrapped and tested on x86-64.

Committed as rev. 243038.


Index: ChangeLog
===
--- ChangeLog	(revision 243034)
+++ ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2016-11-30  Vladimir Makarov  
+
+	PR tree-optimization/77856
+	* lra-constraints.c (inherit_in_ebb): Check original regno for
+	invalid invariant regs too.  Set only clobbered hard regs for the
+	invalid invariant regs.
+
 2016-11-30  Pitchumani Sivanupandi  
 
 	Commit files forgotten in r242966.
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 242848)
+++ lra-constraints.c	(working copy)
@@ -5886,7 +5886,9 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 	   && dst_regno >= lra_constraint_new_regno_start
 	   && invariant_p (SET_SRC (curr_set))
 	   && (cl = lra_get_allocno_class (dst_regno)) != NO_REGS
-	   && ! bitmap_bit_p (_invariant_regs, dst_regno))
+	   && ! bitmap_bit_p (_invariant_regs, dst_regno)
+	   && ! bitmap_bit_p (_invariant_regs,
+  ORIGINAL_REGNO(regno_reg_rtx[dst_regno])))
 	{
 	  /* 'reload_pseudo <- invariant'.  */
 	  if (ira_class_hard_regs_num[cl] <= max_small_class_regs_num)
@@ -6157,16 +6159,20 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 	  curr_id = lra_get_insn_recog_data (curr_insn);
 	  for (reg = curr_id->regs; reg != NULL; reg = reg->next)
 	if (reg->type != OP_IN)
-	  bitmap_set_bit (_invariant_regs, reg->regno);
+	  {
+		bitmap_set_bit (_invariant_regs, reg->regno);
+		bitmap_set_bit (_invariant_regs,
+ORIGINAL_REGNO (regno_reg_rtx[reg->regno]));
+	  }
 	  curr_static_id = curr_id->insn_static_data;
 	  for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next)
 	if (reg->type != OP_IN)
 	  bitmap_set_bit (_invariant_regs, reg->regno);
 	  if (curr_id->arg_hard_regs != NULL)
 	for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++)
+	  if (regno >= FIRST_PSEUDO_REGISTER)
 		bitmap_set_bit (_invariant_regs,
-regno >= FIRST_PSEUDO_REGISTER
-? regno : regno - FIRST_PSEUDO_REGISTER);
+regno - FIRST_PSEUDO_REGISTER);
 	}
   /* We reached the start of the current basic block.  */
   if (prev_insn == NULL_RTX || prev_insn == PREV_INSN (head)
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 243034)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2016-11-30  Vladimir Makarov  
+
+	PR tree-optimization/77856
+	* gcc.target/i386/pr77856.c: New.
+
 2016-11-30  Andre Vehreschild  
 
 	Now really add the file.
Index: testsuite/gcc.target/i386/pr77856.c
===
--- testsuite/gcc.target/i386/pr77856.c	(revision 0)
+++ testsuite/gcc.target/i386/pr77856.c	(working copy)
@@ -0,0 +1,83 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+extern void abort (void);
+
+unsigned a, e;
+
+struct S0
+{
+  int f1;
+  int f8;
+} c = {4, 6};
+
+int b, f, g, h, i, j, l, p;
+short d, o = -7;
+char k, n = 5;
+
+unsigned fn1 (int p1, int p2)
+{
+  return p2 >= 2 || p1 >> p2 ? p1 : p1 << p2;
+}
+
+static short fn2 (struct S0 p1)
+{
+  int *q = 0;
+  int r = 7;
+  if (!a)
+{
+  c.f1 = 1;
+  for (; c.f1; c.f1--)
+	if (fn1 (10 != 0, p1.f8))
+	  {
+	short s = 9 << ~o % (d ^ n);
+	int t = s >> h % ~d;
+	p = r;
+	r = s | p * (d && 9) | t;
+	int u = i & c.f1;
+	unsigned v = ~(~(u & h) | (~(8 >> c.f1) & i));
+	int w = v;
+	if (u < 9)
+	  w = c.f1;
+	if (i > h && u)
+	  {
+		__builtin_printf ("%d\n", c.f1);
+		continue;
+	  }
+	c.f1 = w;
+	if (!p)
+	  continue;
+	return 0;
+	  }
+  for (;;)
+	*q = 0;
+}
+  return 0;
+}
+
+static void fn3 ()
+{
+  fn2 (c);
+  l = c.f1 < b;
+  if (l)
+{
+L1:
+  e = l | j / e % ~f;
+  j = f - 4 % k < c.f1 / e / b - j - 1;
+  if (l)
+	{
+	  __builtin_printf ("%d\n", b);
+	  goto L1;
+	}
+  int m[245];
+  g = m[2];
+}
+}
+
+int main ()
+{
+  fn3 ();
+  if (c.f1 != 1)
+abort ();
+  return 0;
+}

[PATCH][ARM] Merge negdi2 patterns

2016-11-30 Thread Wilco Dijkstra

The negdi2 patterns for ARM and Thumb-2 are duplicated because Thumb-2
doesn't support RSC with an immediate.  We can however emulate RSC with
zero using a shifted SBC.  If we add this to subsi3_carryin the negdi
patterns can be merged, simplifying things a bit (e.g. if changing when to
split for PR77308).  This should generate identical code in all cases.
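
The arithmetic behind the shifted-SBC trick can be checked with a small C model. This is a sketch under stated assumptions: `sketch_negdi2` is a hypothetical helper that models `negs lo; sbc hi, hi, hi, lsl #1` on 32-bit halves, with the ARM carry flag meaning "no borrow". Since hi - (hi << 1) equals -hi modulo 2^32, the SBC computes -hi minus the borrow, which is what RSC with #0 would give.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit negate done on 32-bit halves the Thumb-2 way.  */
static uint64_t
sketch_negdi2 (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);

  /* negs: low = 0 - lo; carry is set when no borrow occurred,
     i.e. only when lo is zero.  */
  uint32_t res_lo = 0u - lo;
  uint32_t carry = (lo == 0);

  /* sbc hi, hi, hi, lsl #1: hi - (hi << 1) - !carry == -hi - borrow.  */
  uint32_t res_hi = hi - (uint32_t) (hi << 1) - (1u - carry);

  return ((uint64_t) res_hi << 32) | res_lo;
}

/* Small self-check comparing the model against plain 64-bit negation.  */
static void
sketch_negdi2_selfcheck (void)
{
  assert (sketch_negdi2 (0) == 0);
  assert (sketch_negdi2 (1) == UINT64_MAX);
}
```

The model only demonstrates that the two-instruction sequence yields the same value as a full 64-bit negation; it says nothing about scheduling or code size.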

ChangeLog:
2016-11-30  Wilco Dijkstra  

* gcc/config/arm/arm.md (subsi3_carryin): Add Thumb-2 RSC #0.
(arm_negdi2): Rename to negdi2, allow on Thumb-2.
* gcc/config/arm/thumb2.md (thumb2_negdi2): Remove pattern.

--
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2035fa5861d876771aef9fb391bcb01b877cf148..eb79d1376e1fb3df1eabddde22aa93ab6fec94ea 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1128,19 +1128,20 @@
 )
 
 (define_insn "*subsi3_carryin"
-  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
-(minus:SI (minus:SI (match_operand:SI 1 "reg_or_int_operand" "r,I")
-(match_operand:SI 2 "s_register_operand" "r,r"))
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
+(minus:SI (minus:SI (match_operand:SI 1 "reg_or_int_operand" "r,I,Pz")
+(match_operand:SI 2 "s_register_operand" "r,r,r"))
   (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0]
   "TARGET_32BIT"
   "@
sbc%?\\t%0, %1, %2
-   rsc%?\\t%0, %2, %1"
+   rsc%?\\t%0, %2, %1
+   sbc%?\\t%0, %2, %2, lsl #1"
   [(set_attr "conds" "use")
-   (set_attr "arch" "*,a")
+   (set_attr "arch" "*,a,t2")
(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")
-   (set_attr "type" "adc_reg,adc_imm")]
+   (set_attr "type" "adc_reg,adc_imm,alu_shift_imm")]
 )
 
 (define_insn "*subsi3_carryin_const"
@@ -4731,12 +4732,13 @@
 
 ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
 ;; The first alternative allows the common case of a *full* overlap.
-(define_insn_and_split "*arm_negdi2"
+(define_insn_and_split "*negdi2"
   [(set (match_operand:DI 0 "s_register_operand" "=r,")
(neg:DI (match_operand:DI 1 "s_register_operand"  "0,r")))
(clobber (reg:CC CC_REGNUM))]
-  "TARGET_ARM"
-  "#"   ; "rsbs\\t%Q0, %Q1, #0\;rsc\\t%R0, %R1, #0"
+  "TARGET_32BIT"
+  "#"   ; rsbs %Q0, %Q1, #0; rsc %R0, %R1, #0  (ARM)
+   ; negs %Q0, %Q1; sbc %R0, %R1, %R1, lsl #1 (Thumb-2)
   "&& reload_completed"
   [(parallel [(set (reg:CC CC_REGNUM)
   (compare:CC (const_int 0) (match_dup 1)))
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index affcd832b72b7d358347e7370265be492866bb90..d9c530a48878923683485933c5640ffe80908401 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -125,32 +125,6 @@
(set_attr "type" "multiple")]
 )
 
-;; Thumb-2 does not have rsc, so use a clever trick with shifter operands.
-(define_insn_and_split "*thumb2_negdi2"
-  [(set (match_operand:DI 0 "s_register_operand" "=,r")
-   (neg:DI (match_operand:DI 1 "s_register_operand"  "?r,0")))
-   (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2"
-  "#" ; negs\\t%Q0, %Q1\;sbc\\t%R0, %R1, %R1, lsl #1
-  "&& reload_completed"
-  [(parallel [(set (reg:CC CC_REGNUM)
-  (compare:CC (const_int 0) (match_dup 1)))
- (set (match_dup 0) (minus:SI (const_int 0) (match_dup 1)))])
-   (set (match_dup 2) (minus:SI (minus:SI (match_dup 3)
-  (ashift:SI (match_dup 3)
- (const_int 1)))
-(ltu:SI (reg:CC_C CC_REGNUM) (const_int 0]
-  {
-operands[2] = gen_highpart (SImode, operands[0]);
-operands[0] = gen_lowpart (SImode, operands[0]);
-operands[3] = gen_highpart (SImode, operands[1]);
-operands[1] = gen_lowpart (SImode, operands[1]);
-  }
-  [(set_attr "conds" "clob")
-   (set_attr "length" "8")
-   (set_attr "type" "multiple")]
-)

[PATCH 3/3] Move data definitions from icv.c back to env.c

2016-11-30 Thread Alexander Monakov

env.c contains a static constructor that would initialize various global libgomp
data such as members of gomp_global_icv.  Therefore it's not ok to define them
in a separate translation unit: under static linking this results in env.o not
linked in (unless an incremental link on icv.o+env.o is performed when building
libgomp.a).  Move definitions of global data from icv.c back to env.c, remove
empty config/nvptx/env.c, and guard environment access on NVPTX using the new
LIBGOMP_OFFLOADED_ONLY macro.

* config/nvptx/env.c: Delete.
* icv.c: Move definitions of ICV variables back ...
* env.c: ...here.  Do not compile environment-related functionality if
LIBGOMP_OFFLOADED_ONLY is set.
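
The layout being described can be sketched as a single translation unit in which the data definition is always compiled and only the environment handling is guarded. This is a simplified illustration, not libgomp code: `SKETCH_OFFLOADED_ONLY`, `sketch_nthreads_var`, `sketch_parse_unsigned_long` and `sketch_initialize_env` are hypothetical names, and the initializer takes the text as a parameter instead of being a constructor that calls getenv.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Always compiled: the data lives in the same object file as the code
   that initializes it, so a static link that pulls in the data also
   pulls in the initializer.  */
unsigned long sketch_nthreads_var = 1;

#ifndef SKETCH_OFFLOADED_ONLY
/* Parse TEXT as a positive decimal number into *OUT.
   Return false and leave *OUT alone if TEXT is not one.  */
static bool
sketch_parse_unsigned_long (const char *text, unsigned long *out)
{
  char *end;
  unsigned long value;

  assert (out != NULL);
  if (text == NULL || *text < '0' || *text > '9')
    return false;
  errno = 0;
  value = strtoul (text, &end, 10);
  if (errno != 0 || *end != '\0' || value == 0)
    return false;
  *out = value;
  return true;
}

/* Stand-in for the environment-driven initializer; only built when the
   target actually has an environment to read.  */
static void
sketch_initialize_env (const char *text)
{
  unsigned long value;

  if (sketch_parse_unsigned_long (text, &value))
    sketch_nthreads_var = value;
}
#endif /* SKETCH_OFFLOADED_ONLY */
```

With the guard macro defined, the variable keeps its static initializer and none of the parsing code is compiled, which matches the intent stated for the accelerator-only build.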

diff --git a/libgomp/config/nvptx/env.c b/libgomp/config/nvptx/env.c
deleted file mode 100644
index e69de29..000
diff --git a/libgomp/env.c b/libgomp/env.c
index 7ba7663..d601e19 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -23,13 +23,46 @@
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
-/* This file arranges for OpenMP internal control variables to be initialized
-   from environment variables at startup.  */
+/* This file defines the OpenMP internal control variables and arranges
+   for them to be initialized from environment variables at startup.  */
 
 #include "libgomp.h"
+#include "gomp-constants.h"
+#include 
+
+struct gomp_task_icv gomp_global_icv = {
+  .nthreads_var = 1,
+  .thread_limit_var = UINT_MAX,
+  .run_sched_var = GFS_DYNAMIC,
+  .run_sched_chunk_size = 1,
+  .default_device_var = 0,
+  .dyn_var = false,
+  .nest_var = false,
+  .bind_var = omp_proc_bind_false,
+  .target_data = NULL
+};
+
+unsigned long gomp_max_active_levels_var = INT_MAX;
+bool gomp_cancel_var = false;
+int gomp_max_task_priority_var = 0;
+#ifndef HAVE_SYNC_BUILTINS
+gomp_mutex_t gomp_managed_threads_lock;
+#endif
+unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1;
+unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var;
+unsigned long *gomp_nthreads_var_list, gomp_nthreads_var_list_len;
+char *gomp_bind_var_list;
+unsigned long gomp_bind_var_list_len;
+void **gomp_places_list;
+unsigned long gomp_places_list_len;
+int gomp_debug_var;
+unsigned int gomp_num_teams_var;
+char *goacc_device_type;
+int goacc_device_num;
+
+#ifndef LIBGOMP_OFFLOADED_ONLY
 #include "libgomp_f.h"
 #include "oacc-int.h"
-#include "gomp-constants.h"
 #include 
 #include 
 #include 
@@ -48,7 +81,6 @@
 #  endif
 # endif
 #endif
-#include 
 #include 
 
 #ifndef HAVE_STRTOULL
@@ -1273,3 +1305,4 @@ initialize_env (void)
 
   goacc_runtime_initialize ();
 }
+#endif /* LIBGOMP_OFFLOADED_ONLY */
diff --git a/libgomp/icv.c b/libgomp/icv.c
index e58b961..cf00e24 100644
--- a/libgomp/icv.c
+++ b/libgomp/icv.c
@@ -23,43 +23,13 @@
see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
.  */
 
-/* This file defines the OpenMP internal control variables and associated
-   OpenMP API entry points.  */
+/* This file defines the OpenMP API entry points that operate on internal
+   control variables.  */
 
 #include "libgomp.h"
 #include "gomp-constants.h"
 #include 
 
-struct gomp_task_icv gomp_global_icv = {
-  .nthreads_var = 1,
-  .thread_limit_var = UINT_MAX,
-  .run_sched_var = GFS_DYNAMIC,
-  .run_sched_chunk_size = 1,
-  .default_device_var = 0,
-  .dyn_var = false,
-  .nest_var = false,
-  .bind_var = omp_proc_bind_false,
-  .target_data = NULL
-};
-
-unsigned long gomp_max_active_levels_var = INT_MAX;
-bool gomp_cancel_var = false;
-int gomp_max_task_priority_var = 0;
-#ifndef HAVE_SYNC_BUILTINS
-gomp_mutex_t gomp_managed_threads_lock;
-#endif
-unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1;
-unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var;
-unsigned long *gomp_nthreads_var_list, gomp_nthreads_var_list_len;
-char *gomp_bind_var_list;
-unsigned long gomp_bind_var_list_len;
-void **gomp_places_list;
-unsigned long gomp_places_list_len;
-int gomp_debug_var;
-unsigned int gomp_num_teams_var;
-char *goacc_device_type;
-int goacc_device_num;
-
 void
 omp_set_num_threads (int n)
 {

[PATCH][ARM] Improve Thumb allocation order

2016-11-30 Thread Wilco Dijkstra

Thumb uses a special register allocation order to increase the use of low
registers.  Oddly enough, LR appears before R12, which means that LR must
be saved and restored even if R12 is available.  Swapping R12 and LR means
this simple example now uses R12 as a temporary (just like ARM):

int f(long long a, long long b)
{
  if (a < b) return 1;
  return a + b;
}

        cmp     r0, r2
        sbcs    ip, r1, r3
        ite     ge
        addge   r0, r0, r2
        movlt   r0, #1
        bx      lr

Bootstrap OK. CSibe benchmarks unchanged.

ChangeLog:
2016-11-30  Wilco Dijkstra  

* gcc/config/arm/arm.c (thumb_core_reg_alloc_order): Swap R12 and R14.

--
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 43c78f6148a5306fb0079ee2eba12f3763652bcc..29dcefd23762ba861b458b8860eb4b4856a9cb02 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -26455,7 +26455,7 @@ arm_mangle_type (const_tree type)
 static const int thumb_core_reg_alloc_order[] =
 {
3,  2,  1,  0,  4,  5,  6,  7,
-  14, 12,  8,  9, 10, 11
+  12, 14,  8,  9, 10, 11
 };
 
 /* Adjust register allocation order when compiling for Thumb.  */

Re: [PATCH 2/3] Introduce LIBGOMP_OFFLOADED_ONLY macro

2016-11-30 Thread Jakub Jelinek

On Wed, Nov 30, 2016 at 08:28:05PM +0300, Alexander Monakov wrote:
> Introduce LIBGOMP_OFFLOADED_ONLY macro to indicate that libgomp is being built
> for an accelerator-only target.
> 
>   * configure.ac [nvptx*-*-*] (libgomp_offloaded_only): Set and use it...
>   (LIBGOMP_OFFLOADED_ONLY): ...here; new define.
>   * configure: Regenerate.
>   * config.h.in: Likewise.

Ok.

Jakub

Re: [PATCH 1/3] libgomp: regenerate with automake-1.11.6

2016-11-30 Thread Jakub Jelinek

On Wed, Nov 30, 2016 at 08:24:40PM +0300, Alexander Monakov wrote:
> A recent libgomp commit (svn rev. 242749) appears to have regenerated some
> auto-files using automake-1.11.1 instead of 1.11.6:
> 
> Author: jamborm 
> Date:   Wed Nov 23 12:27:13 2016 +
> 
> Remove build dependence on HSA run-time
> 
> [...]
> * Makefile.in: Regenerated.
> * aclocal.m4: Likewise.
> * config.h.in: Likewise.
> * configure: Likewise.
> * testsuite/Makefile.in: Likewise.
> 
> This patch regenerates them again using automake-1.11.6 and autoconf-2.64.  
> 
> libgomp/
>   * Makefile.in: Regenerate with automake-1.11.6.
>   * aclocal.m4: Likewise.
>   * configure: Likewise.
>   * testsuite/Makefile.in: Likewise.

Ok, thanks.

Jakub

[PATCH 2/3] Introduce LIBGOMP_OFFLOADED_ONLY macro

2016-11-30 Thread Alexander Monakov

Introduce LIBGOMP_OFFLOADED_ONLY macro to indicate that libgomp is being built
for an accelerator-only target.

* configure.ac [nvptx*-*-*] (libgomp_offloaded_only): Set and use it...
(LIBGOMP_OFFLOADED_ONLY): ...here; new define.
* configure: Regenerate.
* config.h.in: Likewise.
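
A minimal sketch of how library code might consume the new define, assuming config.h sets LIBGOMP_OFFLOADED_ONLY to 1 only on accelerator-only targets. The function `sketch_default_threads` is hypothetical and not part of libgomp; it only shows the guard pattern.

```cpp
#include <cstdlib>

// Hypothetical helper, not libgomp code.  On an accelerator-only build
// (LIBGOMP_OFFLOADED_ONLY defined) there is no host environment to consult,
// so the fallback is returned unconditionally.
unsigned long sketch_default_threads(unsigned long fallback)
{
#ifdef LIBGOMP_OFFLOADED_ONLY
  return fallback;
#else
  const char *s = std::getenv("OMP_NUM_THREADS");
  if (s == nullptr || *s == '\0')
    return fallback;
  char *end = nullptr;
  unsigned long v = std::strtoul(s, &end, 10);
  // Accept only a single positive number; anything else uses the fallback.
  return (*end == '\0' && v > 0) ? v : fallback;
#endif
}
```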

diff --git a/libgomp/config.h.in b/libgomp/config.h.in
index b54dd87..583b9b4 100644
--- a/libgomp/config.h.in
+++ b/libgomp/config.h.in
@@ -115,6 +115,9 @@
 /* Define to 1 if GNU symbol versioning is used for libgomp. */
 #undef LIBGOMP_GNU_SYMBOL_VERSIONING
 
+/* Define to 1 if building libgomp for an accelerator-only target. */
+#undef LIBGOMP_OFFLOADED_ONLY
+
 /* Define to 1 if libgomp should use POSIX threads. */
 #undef LIBGOMP_USE_PTHREADS
 
diff --git a/libgomp/configure b/libgomp/configure
index cfce560..6355ad9 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -15074,6 +15074,8 @@ case "$host" in
   nvptx*-*-*)
 # NVPTX does not support Pthreads, has its own code replacement.
 libgomp_use_pthreads=no
+# NVPTX is an accelerator-only target
+libgomp_offloaded_only=yes
 ;;
   *)
 # Check to see if -pthread or -lpthread is needed.  Prefer the former.
@@ -15125,6 +15127,12 @@ $as_echo "#define LIBGOMP_USE_PTHREADS 1" >>confdefs.h
 
 fi
 
+if test x$libgomp_offloaded_only = xyes; then
+
+$as_echo "#define LIBGOMP_OFFLOADED_ONLY 1" >>confdefs.h
+
+fi
+
 # Plugins for offload execution, configure.ac fragment.  -*- mode: autoconf -*-
 #
 # Copyright (C) 2014-2016 Free Software Foundation, Inc.
diff --git a/libgomp/configure.ac b/libgomp/configure.ac
index 5f1db7e..4086d3f 100644
--- a/libgomp/configure.ac
+++ b/libgomp/configure.ac
@@ -182,6 +182,8 @@ case "$host" in
   nvptx*-*-*)
 # NVPTX does not support Pthreads, has its own code replacement.
 libgomp_use_pthreads=no
+# NVPTX is an accelerator-only target
+libgomp_offloaded_only=yes
 ;;
   *)
 # Check to see if -pthread or -lpthread is needed.  Prefer the former.
@@ -208,6 +210,11 @@ if test x$libgomp_use_pthreads != xno; then
 [Define to 1 if libgomp should use POSIX threads.])
 fi
 
+if test x$libgomp_offloaded_only = xyes; then
+  AC_DEFINE(LIBGOMP_OFFLOADED_ONLY, 1,
+[Define to 1 if building libgomp for an accelerator-only target.])
+fi
+
 m4_include([plugin/configfrag.ac])
 
 # Check for functions needed.

Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Jeff Law


On 11/30/2016 09:53 AM, Matthias Klose wrote:
> On 30.11.2016 12:38, Richard Biener wrote:
>> On Wed, Nov 30, 2016 at 12:30 PM, Matthias Klose  wrote:
>>> There's one more fix needed for the case of only having the pkg-config
>>> module installed when configuring with --enable-objc-gc. We can't use
>>> PKG_CHECK_MODULES directly because the pkg.m4 macros choke on the dash in
>>> the module name. Thus setting the CFLAGS and LIBS directly. Ok to install?
>>
>> Why not fix pkg.m4?
>>
>> Richard.
>
> Jakub suggested to avoid using pkg-config at all, so we can get rid of this
> code altogether.

I thought we'd OK'd pkg-config (for JIT) which is why I didn't call it out.

Looking now, pkg-config got NAKd there and was removed.

Jeff

[PATCH 1/3] libgomp: regenerate with automake-1.11.6

2016-11-30 Thread Alexander Monakov

A recent libgomp commit (svn rev. 242749) appears to have regenerated some
auto-files using automake-1.11.1 instead of 1.11.6:

Author: jamborm 
Date:   Wed Nov 23 12:27:13 2016 +

Remove build dependence on HSA run-time

[...]
* Makefile.in: Regenerated.
* aclocal.m4: Likewise.
* config.h.in: Likewise.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.

This patch regenerates them again using automake-1.11.6 and autoconf-2.64.  

libgomp/
* Makefile.in: Regenerate with automake-1.11.6.
* aclocal.m4: Likewise.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.

diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index e62daec..92c087f 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -1,9 +1,9 @@
-# Makefile.in generated by automake 1.11.1 from Makefile.am.
+# Makefile.in generated by automake 1.11.6 from Makefile.am.
 # @configure_input@
 
 # Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-# 2003, 2004, 2005, 2006, 2007, 2008, 2009  Free Software Foundation,
-# Inc.
+# 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software
+# Foundation, Inc.
 # This Makefile.in is free software; the Free Software Foundation
 # gives unlimited permission to copy and/or distribute it,
 # with or without modifications, as long as this notice is preserved.
@@ -45,6 +45,23 @@
 
 
 VPATH = @srcdir@
+am__make_dryrun = \
+  { \
+am__dry=no; \
+case $$MAKEFLAGS in \
+  *\\[\ \	]*) \
+echo 'am--echo: ; @echo "AM"  OK' | $(MAKE) -f - 2>/dev/null \
+  | grep '^AM OK$$' >/dev/null || am__dry=yes;; \
+  *) \
+for am__flg in $$MAKEFLAGS; do \
+  case $$am__flg in \
+*=*|--*) ;; \
+*n*) am__dry=yes; break;; \
+  esac; \
+done;; \
+esac; \
+test $$am__dry = yes; \
+  }
 pkgdatadir = $(datadir)/@PACKAGE@
 pkgincludedir = $(includedir)/@PACKAGE@
 pkglibdir = $(libdir)/@PACKAGE@
@@ -120,6 +137,12 @@ am__nobase_list = $(am__nobase_strip_setup); \
 am__base_list = \
   sed '$$!N;$$!N;$$!N;$$!N;$$!N;$$!N;$$!N;s/\n/ /g' | \
   sed '$$!N;$$!N;$$!N;$$!N;s/\n/ /g'
+am__uninstall_files_from_dir = { \
+  test -z "$$files" \
+|| { test ! -d "$$dir" && test ! -f "$$dir" && test ! -r "$$dir"; } \
+|| { echo " ( cd '$$dir' && rm -f" $$files ")"; \
+ $(am__cd) "$$dir" && rm -f $$files; }; \
+  }
 am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
 	"$(DESTDIR)$(fincludedir)" "$(DESTDIR)$(libsubincludedir)" \
 	"$(DESTDIR)$(toolexeclibdir)"
@@ -203,6 +226,11 @@ RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
 	install-pdf-recursive install-ps-recursive install-recursive \
 	installcheck-recursive installdirs-recursive pdf-recursive \
 	ps-recursive uninstall-recursive
+am__can_run_installinfo = \
+  case $$AM_UPDATE_INFO_DIR in \
+n|no|NO) false;; \
+*) (install-info --version) >/dev/null 2>&1;; \
+  esac
 HEADERS = $(nodist_finclude_HEADERS) $(nodist_libsubinclude_HEADERS) \
 	$(nodist_noinst_HEADERS) $(nodist_toolexeclib_HEADERS)
 RECURSIVE_CLEAN_TARGETS = mostlyclean-recursive clean-recursive	\
@@ -465,7 +493,7 @@ all: config.h
 
 .SUFFIXES:
 .SUFFIXES: .c .dvi .f90 .lo .o .obj .ps
-am--refresh:
+am--refresh: Makefile
 	@:
 $(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(top_srcdir)/plugin/Makefrag.am $(am__configure_deps)
 	@for dep in $?; do \
@@ -490,6 +518,7 @@ Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
 	echo ' cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe)'; \
 	cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe);; \
 	esac;
+$(top_srcdir)/plugin/Makefrag.am:
 
 $(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
 	$(SHELL) ./config.status --recheck
@@ -501,10 +530,8 @@ $(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
 $(am__aclocal_m4_deps):
 
 config.h: stamp-h1
-	@if test ! -f $@; then \
-	  rm -f stamp-h1; \
-	  $(MAKE) $(AM_MAKEFLAGS) stamp-h1; \
-	else :; fi
+	@if test ! -f $@; then rm -f stamp-h1; else :; fi
+	@if test ! -f $@; then $(MAKE) $(AM_MAKEFLAGS) stamp-h1; else :; fi
 
 stamp-h1: $(srcdir)/config.h.in $(top_builddir)/config.status
 	@rm -f stamp-h1
@@ -528,7 +555,6 @@ libgomp.spec: $(top_builddir)/config.status $(srcdir)/libgomp.spec.in
 	cd $(top_builddir) && $(SHELL) ./config.status $@
 install-toolexeclibLTLIBRARIES: $(toolexeclib_LTLIBRARIES)
 	@$(NORMAL_INSTALL)
-	test -z "$(toolexeclibdir)" || $(MKDIR_P) "$(DESTDIR)$(toolexeclibdir)"
 	@list='$(toolexeclib_LTLIBRARIES)'; test -n "$(toolexeclibdir)" || list=; \
 	list2=; for p in $$list; do \
 	  if test -f $$p; then \
@@ -536,6 +562,8 @@ install-toolexeclibLTLIBRARIES: $(toolexeclib_LTLIBRARIES)
 	  else :; fi; \
 	done; \
 	test -z "$$list2" || { \
+	  echo " $(MKDIR_P)

Re: [PATCHv3 5/7, GCC, ARM, V8M] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute

2016-11-30 Thread Kyrill Tkachov



On 30/11/16 12:05, Andre Vieira (lists) wrote:

Hi,

I got a bug report against the old version of this patch and fixed it
here. This had to do with GCC optimizations sharing types with and
without the 'cmse_nonsecure_call' attribute.  The patch no longer
sets the main variant, as this didn't seem to do what I thought it did.
Instead the patch now creates distinct type copies for every declared
pointer that eventually points to the function type with the attribute;
it also creates a distinct copy for the function type itself.
Another change in this patch was to make 'arm_comp_type_attributes', the
ARM implementation of TARGET_COMP_TYPE_ATTRIBUTES, deny compatibility
between function types with the attribute and without.

I added a test case to test the issue solved with these changes.


Ok.
Thanks,
Kyrill


*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (gimplify.h): New include.
 (arm_handle_cmse_nonsecure_call): New.
 (arm_attribute_table): Added cmse_nonsecure_call.
 (arm_comp_type_attributes): Deny compatibility of function types with
 and without the cmse_nonsecure_call attribute.
 * doc/extend.texi (ARM ARMv8-M Security Extensions): New attribute.

*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-3.c: Add tests.
 * gcc.target/arm/cmse/cmse-4.c: Add tests.
 * gcc.target/arm/cmse/cmse-15.c: New.


Cheers,
Andre

Re: [PATCHv2 4/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-11-30 Thread Kyrill Tkachov



On 30/11/16 15:32, Andre Vieira (lists) wrote:

On 23/11/16 11:52, Andre Vieira (lists) wrote:

Hi,

After some extra testing I realized there was an issue with the way we
were clearing registers when returning from a cmse_nonsecure_entry
function for ARMv8-M Baseline.  This patch fixes that and changes the
testcase to catch the issue.

The problem was I was always using LR to clear the registers, however,
due to the way the Thumb-1 backend works, we can't guarantee LR will
contain the address to which we will be returning at the time of
clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
not use r0 to return a value, we clear r0 with 0 before using it to
clear everything else. As for LR, we move the value of the register used
to return into it prior to returning.

This satisfies the requirements of not leaking secure information since
all registers hold either:
- values to return
- 0
- return address

No changes to ChangeLog.

Cheers,
Andre


Hi,

So I seem to have forgotten to address two of your comments earlier;
this version addresses them.

To reiterate:
After some extra testing I realized there was an issue with the way we
were clearing registers when returning from a cmse_nonsecure_entry
function for ARMv8-M Baseline.  This patch fixes that and changes the
testcase to catch the issue.

The problem was I was always using LR to clear the registers, however,
due to the way the Thumb-1 backend works, we can't guarantee LR will
contain the address to which we will be returning at the time of
clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
not use r0 to return a value, we clear r0 with 0 before using it to
clear everything else. As for LR, we move the value of the register used
to return into it prior to returning.

This satisfies the requirements of not leaking secure information since
all registers hold either:
- values to return
- 0
- return address
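
The clearing policy described above can be modelled in a few lines. This is only a sketch under stated assumptions: the struct and function names are hypothetical, the return value (if any) is assumed to live in r0 alone, and the model says nothing about the instructions GCC actually emits.

```cpp
#include <cstdint>

// Simplified model of the core registers touched on return.
struct sketch_regs {
  std::uint32_t r[4];  // r0-r3
  std::uint32_t ip;    // r12
  std::uint32_t lr;    // r14
};

// Hypothetical model, not GCC code.  r0 either holds the return value or is
// zeroed first; its value is then copied over r1-r3 and IP, so those hold
// nothing beyond what the caller is allowed to see.  LR receives the return
// address.
void sketch_clear_before_return(sketch_regs &regs, bool r0_holds_result,
                                std::uint32_t return_address)
{
  if (!r0_holds_result)
    regs.r[0] = 0;
  for (int i = 1; i < 4; ++i)
    regs.r[i] = regs.r[0];
  regs.ip = regs.r[0];
  regs.lr = return_address;
}
```

After the call every modelled register holds the return value, 0, or the return address, matching the three cases listed above.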

*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
  Thomas Preud'homme  

  * config/arm/arm.c (output_return_instruction): Clear
  registers.
  (thumb2_expand_return): Likewise.
  (thumb1_expand_epilogue): Likewise.
  (thumb_exit): Likewise.
  (arm_expand_epilogue): Likewise.
  (cmse_nonsecure_entry_clear_before_return): New.
  (comp_not_to_clear_mask_str_un): New.
  (compute_not_to_clear_mask): New.
  * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
  * config/arm/thumb2.md (*thumb2_cmse_entry_return): Duplicate
  thumb2_return pattern for cmse_nonsecure_entry functions.

*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
  Thomas Preud'homme  

  * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
  * gcc.target/arm/cmse/struct-1.c: New.
  * gcc.target/arm/cmse/bitfield-1.c: New.
  * gcc.target/arm/cmse/bitfield-2.c: New.
  * gcc.target/arm/cmse/bitfield-3.c: New.
  * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
cleared.
  * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
  * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
  * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
  * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
  * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.


Ok, thanks for addressing the issues.
Kyrill


Cheers,
Andre

Re: [libgomp] No references to env.c -> no libgomp construction

2016-11-30 Thread Alexander Monakov

[redirecting from gcc@ to gcc-patches@ for patch submission]

On Wed, 30 Nov 2016, Sebastian Huber wrote:
> > On Tue, 29 Nov 2016, Sebastian Huber wrote:
> > > * env.c: Split out ICV definitions into...
> > > * icv.c: ...here (new file) and...
> > > * icv-device.c: ...here. New file.
> > > 
> > > the env.c contains now only local symbols (at least for target *-rtems*-*):
> > > 
> > [...]
> > > 
> > > Thus the libgomp constructor is not linked in into executables.
> > 
> > Thanks for the report.  This issue affects only static libgomp.a (and not on
> > NVPTX where env.c is deliberately empty).
> > 
> > I think the minimal solution here is to #include <env.c> from icv.c instead
> > of compiling it separately (using <> inclusion rather than "" so in case of
> > NVPTX we pick up the empty config/nvptx/env.c from toplevel icv.c).
> > 
> > A slightly more involved but perhaps a preferable approach is to remove
> > config/nvptx/env.c, introduce LIBGOMP_OFFLOADED_ONLY macro, and use it to
> > guard inclusion of env.c from icv.c (which then can use the #include "env.c"
> > form).
> 
> I guess it's sufficient to move
> 
> pthread_attr_t gomp_thread_attr;
> 
> from team.c (NVPTX seems to provide its own team.c) to env.c.  This generates
> a reference from team.c to env.c and the constructor is pulled in.

Well, yes, generally definitions of variables must be in the same translation
unit as the constructor that would initialize them -- so at the moment it's
wrong for both the ICV definitions and gomp_thread_attr to be defined elsewhere.
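
A minimal sketch of the arrangement described here: the variable definition and the constructor that initializes it share one translation unit, so any reference to the variable from a static archive also pulls in the constructor. The names are hypothetical stand-ins rather than libgomp symbols, and `__attribute__((constructor))` is a GNU extension (GCC and Clang).

```cpp
// Hypothetical stand-in for an ICV defined in env.c.
unsigned long sketch_nthreads_var;

// Runs before main.  Because it lives in the same object file as the
// definition above, linking that object for the variable links this too.
__attribute__((constructor))
static void sketch_initialize_env()
{
  sketch_nthreads_var = 4;
}

// A user of the variable; the reference is what drags the object in.
unsigned long sketch_get_nthreads()
{
  return sketch_nthreads_var;
}
```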

In reply to this message I'm posting 3 patches that solve the issue by moving
ICV definitions back to env.c and using new LIBGOMP_OFFLOADED_ONLY macro to
avoid environment-related functionality on nvptx.  I think it would be good to
move gomp_thread_attr to env.c too, but that's not a part of this patchset.

Thanks.
Alexander

Re: [PATCH 2/4] [ARC] Cleanup implementation.

2016-11-30 Thread Andrew Burgess

* Claudiu Zissulescu  [2016-11-16 11:17:59 +0100]:

> gcc/
> 2016-06-30  Claudiu Zissulescu  

There seem to be two sets of changes here:

> 
>   * config/arc/arc-protos.h (insn_is_tls_gd_dispatch): Remove.
>   * config/arc/arc.c (arc_unspec_offset): New function.
>   (arc_finalize_pic): Change.
>   (arc_emit_call_tls_get_addr): Likewise.
>   (arc_legitimize_tls_address): Likewise.
>   (arc_legitimize_pic_address): Likewise.
>   (insn_is_tls_gd_dispatch): Remove.
>   * config/arc/arc.h (INSN_REFERENCES_ARE_DELAYED): Change.

This lot seems pretty straightforward.  As always I cringe a little
when I see "Change" as the description, but given that the actual change
is straightforward and small it's not that bad.

>   * config/arc/arc.md (ls_gd_load): Remove.
>   (tls_gd_dispatch): Likewise.

I don't see the connection between these two parts?  Plus it would be
nice to have some more words _somewhere_ for why these are being
removed.  The commit message is probably the right place I'd have
thought.

But assuming your reason for removing the patterns is solid this patch
looks fine.  You should commit with an extended description.

Thanks,
Andrew


> ---
>  gcc/config/arc/arc-protos.h |  1 -
>  gcc/config/arc/arc.c| 41 ++---
>  gcc/config/arc/arc.h|  2 +-
>  gcc/config/arc/arc.md   | 34 --
>  4 files changed, 19 insertions(+), 59 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 6008744..bda3d46 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -118,6 +118,5 @@ extern bool arc_eh_uses (int regno);
>  extern int regno_clobbered_p (unsigned int, rtx_insn *, machine_mode, int);
>  extern bool arc_legitimize_reload_address (rtx *, machine_mode, int, int);
>  extern void arc_secondary_reload_conv (rtx, rtx, rtx, bool);
> -extern bool insn_is_tls_gd_dispatch (rtx_insn *);
>  extern void arc_cpu_cpp_builtins (cpp_reader *);
>  extern rtx arc_eh_return_address_location (void);
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 428676f..7eadb3c 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -2893,6 +2893,15 @@ arc_eh_return_address_location (void)
>  
>  /* PIC */
>  
> +/* Helper to generate unspec constant.  */
> +
> +static rtx
> +arc_unspec_offset (rtx loc, int unspec)
> +{
> +  return gen_rtx_CONST (Pmode, gen_rtx_UNSPEC (Pmode, gen_rtvec (1, loc),
> +unspec));
> +}
> +
>  /* Emit special PIC prologues and epilogues.  */
>  /* If the function has any GOTOFF relocations, then the GOTBASE
> register has to be setup in the prologue
> @@ -2918,9 +2927,7 @@ arc_finalize_pic (void)
>gcc_assert (flag_pic != 0);
>  
>pat = gen_rtx_SYMBOL_REF (Pmode, "_DYNAMIC");
> -  pat = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, pat), ARC_UNSPEC_GOT);
> -  pat = gen_rtx_CONST (Pmode, pat);
> -
> +  pat = arc_unspec_offset (pat, ARC_UNSPEC_GOT);
>pat = gen_rtx_SET (baseptr_rtx, pat);
>  
>emit_insn (pat);
> @@ -4989,8 +4996,7 @@ arc_emit_call_tls_get_addr (rtx sym, int reloc, rtx eqv)
>  
>start_sequence ();
>  
> -  rtx x = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, sym), reloc);
> -  x = gen_rtx_CONST (Pmode, x);
> +  rtx x = arc_unspec_offset (sym, reloc);
>emit_move_insn (r0, x);
>use_reg (&call_fusage, r0);
>  
> @@ -5046,17 +5052,18 @@ arc_legitimize_tls_address (rtx addr, enum tls_model model)
>addr = gen_rtx_CONST (Pmode, addr);
>base = arc_legitimize_tls_address (base, TLS_MODEL_GLOBAL_DYNAMIC);
>return gen_rtx_PLUS (Pmode, force_reg (Pmode, base), addr);
> +
>  case TLS_MODEL_GLOBAL_DYNAMIC:
>return arc_emit_call_tls_get_addr (addr, UNSPEC_TLS_GD, addr);
> +
>  case TLS_MODEL_INITIAL_EXEC:
> -  addr = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), UNSPEC_TLS_IE);
> -  addr = gen_rtx_CONST (Pmode, addr);
> +  addr = arc_unspec_offset (addr, UNSPEC_TLS_IE);
>addr = copy_to_mode_reg (Pmode, gen_const_mem (Pmode, addr));
>return gen_rtx_PLUS (Pmode, arc_get_tp (), addr);
> +
>  case TLS_MODEL_LOCAL_EXEC:
>  local_exec:
> -  addr = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), UNSPEC_TLS_OFF);
> -  addr = gen_rtx_CONST (Pmode, addr);
> +  addr = arc_unspec_offset (addr, UNSPEC_TLS_OFF);
>return gen_rtx_PLUS (Pmode, arc_get_tp (), addr);
>  default:
>gcc_unreachable ();
> @@ -5087,14 +5094,11 @@ arc_legitimize_pic_address (rtx orig, rtx oldx)
>else if (!flag_pic)
>   return orig;
>else if (CONSTANT_POOL_ADDRESS_P (addr) || SYMBOL_REF_LOCAL_P (addr))
> - return gen_rtx_CONST (Pmode,
> -   gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr),
> -   ARC_UNSPEC_GOTOFFPC));
> + return arc_unspec_offset (addr,

Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2016-11-30 Thread Bernd Edlinger

On 11/30/16 13:01, Wilco Dijkstra wrote:
> Bernd Edlinger wrote:
>> On 11/29/16 16:06, Wilco Dijkstra wrote:
>>> Bernd Edlinger wrote:
>>>
>>> -  "TARGET_32BIT && reload_completed
>>> +  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)
>>>  && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"
>>>
>>> This is equivalent to "&& (!TARGET_IWMMXT || reload_completed)" since we're
>>> already excluding NEON.
>>
>> Aehm, no.  This would split the addi_neon insn before it is clear
>> if the reload pass will assign a VFP register.
>
> Hmm that's strange... This instruction shouldn't be used to also split some
> random Neon pattern - for example arm_subdi3 doesn't do the same. To
> understand and reason about any of these complex patterns they should all
> work in the same way...
>
I was a bit surprised as well, when I saw that happen.
But subdi3 is different:
   "TARGET_32BIT && !TARGET_NEON"
   "#"  ; "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2"
   "&& reload_completed"

so this never splits anything if TARGET_NEON.
But adddi3 cannot expand if TARGET_NEON, yet its pattern simply
looks exactly like adddi3_neon:

(define_insn_and_split "*arm_adddi3"
   [(set (match_operand:DI  0 "s_register_operand" "=&r,&r,&r,&r,&r")
 (plus:DI (match_operand:DI 1 "s_register_operand" "%0, 0, r, 0, r")
  (match_operand:DI 2 "arm_adddi_operand"  "r,  0, r, Dd, Dd")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT && !TARGET_NEON"
   "#"
   "TARGET_32BIT && reload_completed
&& ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0])))"

(define_insn "adddi3_neon"
   [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w,?&r,?&r,?&r")
 (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w,r,0,r")
  (match_operand:DI 2 "arm_adddi_operand" "w,r,0,w,r,Dd,Dd")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
{
   switch (which_alternative)
 {
 case 0: /* fall through */
 case 3: return "vadd.i64\t%P0, %P1, %P2";
 case 1: return "#";
 case 2: return "#";
 case 4: return "#";
 case 5: return "#";
 case 6: return "#";
 default: gcc_unreachable ();
 }

Even the return "#" explicitly invokes the former pattern.
So I think the author knew that, and did it on purpose.


>> But when I make *arm_cmpdi_insn split early, it ICEs:
>
> (insn 4870 4869 1636 87 (set (scratch:SI)
>  (minus:SI (minus:SI (subreg:SI (reg:DI 2261) 4)
>  (subreg:SI (reg:DI 473 [ X$14 ]) 4))
>  (ltu:SI (reg:CC_C 100 cc)
>  (const_int 0 [0] "pr77308-2.c":140 -1
>   (nil))
>
> That's easy, we don't have a sbcs <scratch>, r1, r2 pattern. A quick
> workaround is to create a temporary for operand[2] (if before reload) so it
> will match the standard sbcs pattern, and then the split works fine.
>
>> So it is certainly possible, but not really simple to improve the
>> stack size even further.  But I would prefer to do that in a
>> separate patch.
>
> Yes separate patches would be fine. However there is a lot of scope to
> improve this further. For example after your patch shifts and logical
> operations are expanded in expand, add/sub are in split1 after combine runs
> and everything else is split after reload. It doesn't make sense to split
> different operations at different times - it means you're still going to get
> the bad DImode subregs and miss lots of optimization opportunities due to the
> mix of partly split and partly not-yet-split operations.
>

Yes.  I did the add/sub differently because it was easier this way,
and it was simply sufficient to make the existing test cases happy.

Also, the biggest benefit was IIRC from the very early splitting
of the anddi/iordi/xordi patterns, because they have completely
separate data flow in low and high parts.  That is not
the case for the arithmetic patterns, but nevertheless they
can still be optimized, preferably when a new test case
is found that can demonstrate an improvement.
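
To make the data-flow difference concrete, here is a small sketch of 64-bit AND and ADD written over explicit 32-bit halves. The type and function names are made up for illustration; this is plain C++ arithmetic, not what the ARM backend emits.

```cpp
#include <cstdint>

struct sketch_halves { std::uint32_t lo, hi; };

constexpr sketch_halves sketch_split(std::uint64_t x)
{
  return {static_cast<std::uint32_t>(x), static_cast<std::uint32_t>(x >> 32)};
}

constexpr std::uint64_t sketch_join(sketch_halves p)
{
  return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Logical ops: the low and high results depend only on their own halves.
constexpr sketch_halves sketch_and(sketch_halves a, sketch_halves b)
{
  return {a.lo & b.lo, a.hi & b.hi};
}

// Arithmetic: the high half depends on the carry out of the low half.
constexpr sketch_halves sketch_add(sketch_halves a, sketch_halves b)
{
  std::uint32_t lo = a.lo + b.lo;
  std::uint32_t carry = lo < a.lo ? 1u : 0u;
  return {lo, a.hi + b.hi + carry};
}
```

In `sketch_and` the two halves can be allocated and scheduled independently, whereas `sketch_add` ties them together through the carry.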

I am not sure why the cmpdi pattern has an influence at all,
because from the data flow you need all 64 bits of both sides.
Nevertheless it is a fact: With the modified test case I
get 264 bytes frame size, and that was 1920 before.

I attached the completely untested follow-up patch now, but I would
like to post that one again for review, after I applied my current
patch, which is still waiting for final review (please feel pinged!).


This is really exciting...


Thanks
Bernd.
--- gcc/config/arm/arm.md.orig	2016-11-27 09:22:41.794790123 +0100
+++ gcc/config/arm/arm.md	2016-11-30 16:40:30.140532737 +0100
@@ -4738,7 +4738,7 @@
(clobber (reg:CC CC_REGNUM))]
   "TARGET_ARM"
   "#"   ; "rsbs\\t%Q0, %Q1, #0\;rsc\\t%R0, %R1, #0"
-  "&& reload_completed"
+  "&& ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)"
   [(parallel [(set (reg:CC CC_REGNUM)
 		   (compare:CC (const_int 0) (match_dup 1)))
 	  (set (match_dup 0) (minus:SI

Re: [patch] boehm-gc removal and libobjc changes to build with an external bdw-gc

2016-11-30 Thread Matthias Klose

On 30.11.2016 12:38, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 12:30 PM, Matthias Klose  wrote:
>> There's one more fix needed for the case of only having the pkg-config module
>> installed when configuring with --enable-objc-gc. We can't use
>> PKG_CHECK_MODULES directly because the pkg.m4 macros choke on the dash in the
>> module name. Thus setting the CFLAGS and LIBS directly. Ok to install?
> 
> Why not fix pkg.m4?
>
> Richard.

Jakub suggested to avoid using pkg-config at all, so we can get rid of this
code altogether.

To re-enable bootstrap with --enable-objc-gc enabled, I checked in this change
as a stop-gap:

>> * configure.ac: Set BDW_GC_CFLAGS and BDW_GC_LIBS after checking
>> for the existence of the pkg-config modules.
>> * Regenerate.

Matthias

[PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA

2016-11-30 Thread Kyrill Tkachov


Hi all,

In this awkward ICE we have a *load_multiple pattern that is being
transformed in reload from:
(insn 55 67 151 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 158 [ c+4 ])
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) arm-crash.c:25 393 {*load_multiple}
 (expr_list:REG_UNUSED (reg:SI 0 r0)
(nil)))


into the invalid:
(insn 55 67 70 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
(set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
(const_int -4 [0xfffc])) [4 %sfp+-12 S4 A32])
(mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) arm-crash.c:25 393 {*load_multiple}
 (nil))

The operands of *load_multiple are not validated through constraints like LRA
is used to, but rather through a match_parallel predicate which ends up calling
ldm_stm_operation_p to validate the multiple sets.
But this means that LRA cannot reason about the constraints properly.
This two-register load should not have used *load_multiple anyway, it should
have used *ldm2_ from ldmstm.md, and indeed it did until the loop2_invariant
pass, which copied the ldm2_ pattern:
(insn 27 23 28 4 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 1 r1)
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) "ldm.c":25 385 {*ldm2_}
 (nil))

into:
(insn 55 19 67 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 158)
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) "ldm.c":25 404 {*load_multiple}
 (expr_list:REG_UNUSED (reg:SI 0 r0)
(nil)))

Note that it now got recognised as load_multiple because the second register is
not a hard register but the pseudo 158.
In any case, the solution suggested in the PR (and I agree with it) is to
restrict *load_multiple to after reload.
The similar pattern *load_multiple_with_writeback also has a similar condition
and the comment above *load_multiple says that it's used to generate epilogues,
which is done after reload anyway. For pre-reload load-multiples the patterns
in ldmstm.md should do just fine.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2016-11-30  Kyrylo Tkachov  

PR target/71436
* config/arm/arm.md (*load_multiple): Add reload_completed to
matching condition.

2016-11-30  Kyrylo Tkachov  

PR target/71436
* gcc.c-torture/compile/pr71436.c: New test.
commit 996d28e2353badd1b29ef000f94d40c7dab9010f
Author: Kyrylo Tkachov 
Date:   Tue Nov 29 15:07:30 2016 +

[ARM] Restrict *load_multiple pattern till after LRA

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 74c44f3..22d2a84 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11807,12 +11807,15 @@ (define_insn ""
 
 ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
 ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
+;; The operands are validated through the load_multiple_operation
+;; match_parallel predicate rather than through constraints so enable it only
+;; after reload.
 (define_insn "*load_multiple"
   [(match_parallel 0 "load_multiple_operation"
 [(set (match_operand:SI 2 "s_register_operand" "=rk")
   (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
 ])]
-  "TARGET_32BIT"
+  "TARGET_32BIT && reload_completed"
   "*
   {
 arm_output_multireg_pop (operands, /*return_pc=*/false,
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
new file mode 100644
index 0000000..ab08d5d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
@@ -0,0 +1,35 @@
+/* PR target/71436.  */
+
+#pragma pack(1)
+struct S0
+{
+  volatile int f0;
+  short f2;
+};
+
+void foo (struct S0 *);
+int a, d;
+static struct S0 b[5];
+static struct S0 c;
+void fn1 ();
+void
+main ()
+{
+  {
+struct S0 e;
+for (; d; fn1 ())
+  {
+{
+  a = 3;
+  for (; a >= 0; a -= 1)
+{
+  {
+e = c;
+  }
+  b[a] = e;
+}
+}
+  }
+  }
+  foo (b);
+}

Re: [Patches] Add variant constexpr support for visit, comparisons and get

2016-11-30 Thread Jonathan Wakely


On 26/11/16 21:38 -0800, Tim Shen wrote:

This 4-patch series contains the following in order:

a.diff: Remove uses-allocator ctors. They are going away, and removing
it reduces the maintenance burden from now on.


Yay! less code.


b.diff: Add constexpr support for get<> and comparisons. This patch
also involves small refactoring of _Variant_storage.

c.diff: Fix some libc++ test failures.

d.diff: Add constexpr support for visit. This patch also removes
__storage, __get_alternative, and __reserved_type_map, since we don't
need to support reference/void types for now.

The underlying design doesn't change - we still use the vtable
approach to achieve O(1) runtime cost even under -O0.


Great stuff.




   * include/std/variant: Implement constexpr comparison and get<>.
   * testsuite/20_util/variant/compile.cc: Tests.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 2d9303a..a913074 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -154,31 +154,63 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  template
using __storage = _Alternative;

-  template>
+  // _Uninitialized nullify the destructor calls.


This wording confused me slightly. How about:

 "_Uninitialized makes destructors trivial"


+  // This is necessary, since we define _Variadic_union as a recursive union,
+  // and we don't want to inspect the union members one by one in its dtor,
+  // it's slow.


Please change "it's slow" to "that's slow".
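
To illustrate why a wrapper with a trivial destructor helps here, the following sketch keeps the object in raw aligned bytes, so the wrapper stays trivially destructible even when the wrapped type is not. `sketch_uninitialized` is a made-up name that only loosely mirrors the patch; the owner has to call `destroy()` explicitly, which is the role the variant's index-based dispatch would play.

```cpp
#include <new>
#include <string>
#include <type_traits>
#include <utility>

template <typename T>
struct sketch_uninitialized {
  // Construct a T in place inside the raw storage.
  template <typename... Args>
  explicit sketch_uninitialized(Args&&... args)
  { ::new (static_cast<void*>(&storage)) T(std::forward<Args>(args)...); }

  // Access via the same cast chain the patch uses.
  T& get() { return *static_cast<T*>(static_cast<void*>(&storage)); }

  // Nothing is destroyed automatically; the owner must call this.
  void destroy() { get().~T(); }

  alignas(T) unsigned char storage[sizeof(T)];
};

// No user-declared destructor and only a byte array member, so the wrapper
// is trivially destructible regardless of T.
static_assert(
    std::is_trivially_destructible<sketch_uninitialized<std::string>>::value,
    "wrapper destructor should be trivial");
```

A union of such wrappers needs no per-member inspection in its own destructor, which is the cost the quoted comment wants to avoid.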


+  template>
struct _Uninitialized;


I'm still unsure that is_literal_type is the right trait here. If it's
definitely right then we should probably *not* deprecate it in C++17!


  template
struct _Uninitialized<_Type, false>
{
-  constexpr _Uninitialized() = default;
-
  template
  constexpr _Uninitialized(in_place_index_t<0>, _Args&&... __args)
  { ::new (&_M_storage) _Type(std::forward<_Args>(__args)...); }

+  const _Type& _M_get() const &
+  {
+   return *static_cast(
+   static_cast(&_M_storage));
+  }
+
+  _Type& _M_get() &
+  { return *static_cast<_Type*>(static_cast(&_M_storage)); }
+
+  const _Type&& _M_get() const &&
+  {
+   return std::move(*static_cast(
+   static_cast(&_M_storage)));
+  }
+
+  _Type&& _M_get() &&
+  {
+   return std::move(*static_cast<_Type*>(static_cast(&_M_storage)));
+  }
+
  typename std::aligned_storage::type
  _M_storage;


I think this could use __aligned_membuf, which would reduce the
alignment requirements for some types (e.g. long long on x86-32).

That would also mean you get the _M_ptr() member so don't need all the
casts.
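
A rough sketch of the idea behind a member buffer, under stated assumptions:
this is a simplified stand-in and not libstdc++'s __aligned_membuf. The names
member_buf, wrapper, addr, ptr and construct are hypothetical. The buffer asks
for the alignment the type has as a class member and hands out a typed pointer
so callers do not repeat the casts.

```cpp
#include <new>
#include <utility>

template <typename T>
struct member_buf
{
  // Only used to query the alignment T gets when it is a class member.
  struct wrapper { T t; };

  alignas(alignof(wrapper)) unsigned char storage[sizeof(T)];

  // Untyped and typed access to the raw storage.
  void* addr() noexcept { return static_cast<void*>(&storage); }
  T* ptr() noexcept { return static_cast<T*>(addr()); }

  // Constructs a T in the buffer.  The caller is responsible for destroying
  // it; the buffer itself stays trivially destructible.
  template <typename... Args>
  T& construct(Args&&... args)
  { return *::new (addr()) T(std::forward<Args>(args)...); }
};
```

Whether the member alignment is actually smaller than the standalone one
depends on the target ABI, so treat any size or alignment saving as something
to measure rather than assume.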


+  ~_Variant_storage()
+  { _M_destroy_impl(std::make_index_sequence{}); }


You can use index_sequence_for<_Types...> here.
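
A small illustration of why the substitution is safe, assuming the template
argument that was lost from the quoted line was sizeof...(_Types). The helper
same_sequences is a hypothetical name; the two standard aliases name the same
type for any pack.

```cpp
#include <type_traits>
#include <utility>

// True when both spellings produce the same index_sequence for the pack.
template <typename... Types>
struct same_sequences
: std::is_same<std::make_index_sequence<sizeof...(Types)>,
               std::index_sequence_for<Types...>>
{ };

static_assert(same_sequences<int, char, double>::value,
              "three types give index_sequence<0, 1, 2> either way");
static_assert(same_sequences<>::value,
              "an empty pack gives index_sequence<> either way");
```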


@@ -598,9 +645,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_S_apply_all_alts(_Array_type& __vtable, index_sequence<__indices...>)
{ (_S_apply_single_alt<__indices>(__vtable._M_arr[__indices]), ...); }

-  template
+  template


This needs to be _Tp not T


+  return __lhs._M_equal_to(__rhs,
+  std::make_index_sequence{});


Another one that could use index_sequence_for<_Types...>


+  return __lhs._M_less_than(__rhs,
+   std::make_index_sequence{});


Same again.



   * include/bits/enable_special_members.h: Make
   _Enable_default_constructor constexpr.
   * include/std/variant (variant::emplace, variant::swap, std::swap,
   std::hash): Sfinae on emplace and std::swap; handle __poison_hash
   bases of duplicated types.

diff --git a/libstdc++-v3/include/bits/enable_special_members.h b/libstdc++-v3/include/bits/enable_special_members.h
index 07c6c99..4f4477b 100644
--- a/libstdc++-v3/include/bits/enable_special_members.h
+++ b/libstdc++-v3/include/bits/enable_special_members.h
@@ -118,7 +118,8 @@ template
operator=(_Enable_default_constructor&&) noexcept = default;

// Can be used in other ctors.
-explicit _Enable_default_constructor(_Enable_default_constructor_tag) { }
+constexpr explicit
+_Enable_default_constructor(_Enable_default_constructor_tag) { }
  };

+  void _M_reset()
+  {
+   _M_reset_impl(std::make_index_sequence{});
+   _M_index = variant_npos;
+  }
+
  ~_Variant_storage()
-  { _M_destroy_impl(std::make_index_sequence{}); }
+  { _M_reset(); }


These can also use index_sequence_for<_Types...>


@@ -1253,14 +1285,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  template
struct hash>
-: private __poison_hash>...
+: private __detail::__variant::_Variant_hash_base<
+   variant<_Types...>, std::make_index_sequence>

Re: [v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Jonathan Wakely


On 30/11/16 17:58 +0200, Ville Voutilainen wrote:

   Fix testsuite failures caused by the patch implementing LWG 2534.
   * include/std/istream (__is_convertible_to_basic_istream):
   Change the return types of __check, introduce stream_type.
   (operator>>(_Istream&&, _Tp&&)):
   Use __is_convertible_to_basic_istream::stream_type as the return type.
   * include/std/ostream (__is_convertible_to_basic_ostream):
   Change the return types of __check, introduce stream_type.
   (operator<<(_Ostream&&, const _Tp&)):
   Use __is_convertible_to_basic_ostream::stream_type as the return type.


As discussed on IRC, please change "stream_type" to istream_type and
ostream_type, as appropriate, because those names are already used by
stream iterators, so users can't define them as macros.

And you could make the remove_reference happen inside the
__is_convertible_to_basic_[io]stream trait, since it's only ever used
with references, but that seems stylistic.

OK with the stream_type renaming.
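
A minimal sketch of the trait shape being discussed, written with ordinary
names instead of the reserved ones. It is an approximation of the patch, not a
copy: convertible_to_basic_istream and check are hypothetical names, and the
definition of value via is_void is an assumption about the line that was
garbled in the archive.

```cpp
#include <istream>
#include <sstream>
#include <type_traits>
#include <utility>

template <typename T>
class convertible_to_basic_istream
{
  // Chosen when T* converts to a pointer to some basic_istream<Ch, Tr>;
  // the return type records which stream that is.
  template <typename Ch, typename Tr>
  static std::basic_istream<Ch, Tr>& check(std::basic_istream<Ch, Tr>*);

  // Fallback for every other type.
  static void check(void*);

public:
  // Named istream_type, as requested in the review, instead of stream_type.
  using istream_type = decltype(check(std::declval<T*>()));
  static constexpr bool value = !std::is_void<istream_type>::value;
};

static_assert(convertible_to_basic_istream<std::istringstream>::value,
              "a derived stream is detected");
static_assert(!convertible_to_basic_istream<int>::value,
              "a non-stream type falls back to void");
```

Returning the base stream type from the trait lets the rvalue operator return
a reference to the basic_istream base rather than to the derived stream type.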

Re: [PATCH] Fix prs 78602 & 78560 on PowerPC (vec_extract/vec_set)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 12:55:29AM -0500, Michael Meissner wrote:
> I have done full bootstraps and make check with no regressions on a little
> endian power8 (64-bit only), a big endian power8 (64-bit only), and a big
> endian power7 (both 32-bit and 64-bit).  Can I install both patches to the
> trunk?

Yes please.  Thanks,


Segher


> 2016-11-29  Michael Meissner  
> 
>   PR target/78602
>   * config/rs6000/rs6000.c (rs6000_expand_vector_extract): If the
>   element is not a constant or in a register, force it to a
>   register.
> 
> 2016-11-29  Michael Meissner  
> 
>   PR target/78560
>   * config/rs6000/rs6000.c (rs6000_expand_vector_set): Force value
>   that will be set to a vector element to be in a register.
>   * config/rs6000/vsx.md (vsx_set__p9): Fix thinko that used
>   the wrong multiplier to convert the element number to a byte
>   offset.

Re: [PATCH,rs6000] Correct mode of operand 2 in vector extract half-word and word instruction patterns

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 08:35:08AM -0700, Kelvin Nilsen wrote:
> This patch corrects an error in a patch committed on 2016-10-18 to add
> built-in function support for Power9 string operations.  In that
> original patch, the mode for operand 2 of the newly added vector
> extract half-word and full-word instruction patterns was described as
> V16QI, even though those instruction patterns were conceptually
> operating on V8HI and V4SI operands respectively.
> 
> This patch changes the modes of the operands for these instruction
> patterns to better represent the intended types.  This patch improves
> readability and maintainability of code.  It does not affect
> correctness of generated code, since the existing implementation
> implicitly coerces the operand types to the declared type.
> 
> The patch has been bootstrapped and tested on powerpc64le-unknown-linux
> without regressions.
> 
> Is this ok for the trunk?

Okay.  Thanks,


Segher


> 2016-11-30  Kelvin Nilsen  
> 
>   PR target/78577
>   * config/rs6000/vsx.md (vextuhlx): Revise mode of operand 2.
>   (vextuhrx): Likewise.
>   (vextuwlx): Likewise.
>   (vextuwrx): Likewise.

Re: [PATCH 7/9] Add RTL-error-handling to host

2016-11-30 Thread Bernd Schmidt


On 11/29/2016 10:13 PM, Bernd Schmidt wrote:

On 11/29/2016 07:53 PM, David Malcolm wrote:


Would you prefer that I went with approach (B), or is approach (A)
acceptable?


Well, I was hoping there'd be an approach (C) where the read-rtl code
uses whatever diagnostics framework that is available. Maybe it'll turn
out that's too hard. Somehow the current patch looked strange to me, but
if there's no easy alternative maybe we'll have to go with it.


So, I've tried to build patches 1-6 + 8, without #7. It looks like the 
differences are as follows:


- A lack of seen_error in errors.[ch], could be easily added, and
  basically a spelling mismatch between have_error and errorcount.
- A lack of fatal in diagnostics.c. Could maybe be added to just call
  fatal_error?

All this seems simpler and cleaner to fix than linking two different 
error handling frameworks into one binary. Do you see any other 
difficulties?



Bernd

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Dominik Vogt

On Wed, Nov 30, 2016 at 03:40:32PM +0100, Michael Matz wrote:
> Hi,
> 
> On Wed, 30 Nov 2016, Segher Boessenkool wrote:
> 
> > > I don't think mode-changing _extracts are valid in this context.  From
> > > the docu:
> > > 
> > >   `(sign_extract:M LOC SIZE POS)'
> > >   ...
> > >  The mode M is the same as the mode that would be used for LOC if
> > >  it were a register.
> > > 
> > > Probably it could be made to work just fine, but I'm not sure it'd be 
> > > worth much, as then the targets would need to care for mode-changes 
> > > occurring not just through subregs as usual, but also through extracts.
> > 
> > The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
> > submitted yesterday deals with this same issue, FWIW -- some ports
> > apparently already do mode-changing extracts.
> 
> Yeah, saw that a bit later.  So, hmmm.  I'm not sure what to make of it, 
> if the targets choose to use mode-changing extracts I guess that's fine, 
> as they presumably will have written patterns that recognize them.  But I 
> don't think we should willy-nilly generate such patterns as we can't know 
> if the target deals with them or not.

Just working on such a pattern for s390x, I had the impression
that combine uses the automatic mode change when it's there, and
otherwise it does things differently, that is, it tries different
combinations when it has the pattern than without.  There seems to
be at least some kind of detection.

> We could of course always generate 
> both variants: (subreg:M1 (extract:M2 (object:M2))) and
> (extract:M1 (object:M2)) and see if either matches, but that seems a bit 
> too much work.

--

The insns that combine tried are:

  (insn 7 4 8 2 (set (reg:DI 69)
(lshiftrt:DI (reg/v:DI 66 [ v_x ])
(const_int 48 [0x30])))

  (insn 9 8 10 2 (parallel [
(set (reg:SI 68 [ v_or ])
(ior:SI (reg:SI 70 [ v_and1 ])
(subreg:SI (reg:DI 69) 4)))
(clobber (reg:CC 33 %cc))
])

A while ago combine handled the situation well, resulting in the
new "risbg" instruction, but for a while it's not been working.
It's a bit difficult to track that down to a specific commit
because of the broken "combine"-patch that took a while to fix.
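
To make the arithmetic behind the two insns concrete, here is a sketch in C++
rather than RTL. It assumes a big-endian target where subreg byte offset 4 of
a DImode register selects the low 32 bits, and it counts bit positions from
the least significant bit, which is a simplification of the RTL conventions.
low_word_of_shift and zero_extract are hypothetical helper names.

```cpp
#include <cstdint>

// (subreg:SI (lshiftrt:DI x 48) 4): shift right, then keep the low word.
constexpr std::uint32_t low_word_of_shift(std::uint64_t x)
{ return static_cast<std::uint32_t>(x >> 48); }

// A zero_extract of `size` bits starting at bit `pos` (size must be < 64).
constexpr std::uint64_t zero_extract(std::uint64_t x, unsigned size, unsigned pos)
{ return (x >> pos) & ((std::uint64_t{1} << size) - 1); }

// Shifting a 64-bit value right by 48 leaves only 16 significant bits, so the
// low word of the shift equals a 16-bit extract at position 48.
static_assert(low_word_of_shift(0xABCD000012345678ULL) == 0xABCDu,
              "top 16 bits survive the shift");
static_assert(low_word_of_shift(0xABCD000012345678ULL)
                == zero_extract(0xABCD000012345678ULL, 16, 48),
              "both forms select the same bits");
```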

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild

Fixed -> r243034.

- Andre

On Wed, 30 Nov 2016 15:53:39 +0100
Janus Weil  wrote:

> Hi,
> 
> > on IRC:
> > 15:28:22 dominiq:  vehre: add /* FALLTHROUGH */
> >
> > Done and committed as obvious as r243023.  
> 
> thanks. However, I still see these two:
> 
> 
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> >> > ‘_gfortran_caf_get_by_ref’:
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> >> > ‘src_size’ may be used uninitialized in this function
> >> > [-Wmaybe-uninitialized]
> >> >if (size == 0 || src_size == 0)
> >> > ~^~~~
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> >> > ‘_gfortran_caf_send_by_ref’:
> >> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> >> > ‘src_size’ may be used uninitialized in this function
> >> > [-Wmaybe-uninitialized]
> >> >if (size == 0 || src_size == 0)
> >> > ~^~~~  
> 
> Can you please fix them as well?
> 
> Thanks,
> Janus
> 
> 
> 
> 
> >> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :  
> >> > > Hi Paul,
> >> > >
> >> > > thanks for the review. Committed with the changes requested and the one
> >> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> >> > > -m32 as r243021.
> >> > >
> >> > > Thanks for the review and tests.
> >> > >
> >> > > Regards,
> >> > > Andre
> >> > >
> >> > > On Wed, 30 Nov 2016 07:49:13 +0100
> >> > > Paul Richard Thomas  wrote:
> >> > >  
> >> > >> Dear Andre,
> >> > >>
> >> > >> This all looks OK to me. The only comment that I have that you might
> >> > >> deal with before committing is that some of the Boolean expressions,
> >> > >> eg:
> >> > >> +  int caf_dereg_mode
> >> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> >> > >> +  || c->attr.codimension)
> >> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> >> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> >> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
> >> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> >> > >>
> >> > >> are getting to be sufficiently convoluted that a small, appropriately
> >> > >> named, helper function might be clearer. Of course, this is true of
> >> > >> many parts of gfortran but it is not too late to start making the code
> >> > >> a bit clearer.
> >> > >>
> >> > >> You can commit to the present trunk as far as I am concerned. I know
> >> > >> that the caf enthusiasts will test it to bits before release!
> >> > >>
> >> > >> Regards
> >> > >>
> >> > >> Paul
> >> > >>
> >> > >>
> >> > >> On 28 November 2016 at 19:33, Andre Vehreschild 
> >> > >> wrote:  
> >> > >> > PING!
> >> > >> >
> >> > >> > I know it's a lengthy patch, but comments would be nice anyway.
> >> > >> >
> >> > >> > - Andre
> >> > >> >
> >> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
> >> > >> > Andre Vehreschild  wrote:
> >> > >> >  
> >> > >> >> Hi all,
> >> > >> >>
> >> > >> >> the attached patch extends the API of the caf-libs to enable
> >> > >> >> asynchronous allocation of allocatable components.
> >> > >> >> Allocatable components in derived type coarrays are different from
> >> > >> >> regular coarrays or coarrayed components. The latter have to be
> >> > >> >> allocated on all images or on none. Furthermore, the allocation is
> >> > >> >> a point of synchronisation.
> >> > >> >>
> >> > >> >> For allocatable components, F2008 allows some to be allocated
> >> > >> >> on some images and not on others. Furthermore, registering with
> >> > >> >> the caf-lib that an allocatable component is present in a
> >> > >> >> derived type coarray is no longer a synchronisation point. To
> >> > >> >> implement these features, two new types of coarray registration
> >> > >> >> have been introduced: the first just registers the component
> >> > >> >> with the caf-lib, and the second does the allocation. Furthermore,
> >> > >> >> the caf-API has been extended to provide a query function to learn
> >> > >> >> about the allocation status of a component on a remote image.
> >> > >> >>
> >> > >> >> Sorry, that the patch is rather lengthy. Most of this is due to the
> >> > >> >> structure_alloc_comps' signature change. The routine and its
> >> > >> >> wrappers are used rather often which needed the appropriate
> >> > >> >> changes.
> >> > >> >>
> >> > >> >> I know I left two or three TODOs in the patch to remind me of
> >> > >> >> things I have to investigate further. For the current state these
> >> > >> >> TODOs are no reason to hold back the patch. The third party
> >> > >> >> library opencoarrays implements the mpi-part of the caf-model and
> >> > >> >> will change in sync. It would of course be advantageous to just
> >> > >> >> have to say: With gcc-7 gfortran implements allocatable components
> >> > >> >> in derived coarrays

[v3 PATCH] Fix testsuite failures caused by the patch implementing LWG 2534.

2016-11-30 Thread Ville Voutilainen

2016-11-30  Ville Voutilainen  

Fix testsuite failures caused by the patch implementing LWG 2534.
* include/std/istream (__is_convertible_to_basic_istream):
Change the return types of __check, introduce stream_type.
(operator>>(_Istream&&, _Tp&&)):
Use __is_convertible_to_basic_istream::stream_type as the return type.
* include/std/ostream (__is_convertible_to_basic_ostream):
Change the return types of __check, introduce stream_type.
(operator<<(_Ostream&&, const _Tp&)):
Use __is_convertible_to_basic_ostream::stream_type as the return type.
diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index 4f0e940..81df402 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -913,11 +913,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_convertible_to_basic_istream
 {
   template
-  static true_type __check(basic_istream<_Ch, _Up>*);
+  static basic_istream<_Ch, _Up>& __check(basic_istream<_Ch, _Up>*);
 
-  static false_type __check(void*);
+  static void __check(void*);
 public:
-  using type = decltype(__check(declval<_Tp*>()));
+  using stream_type = decltype(__check(declval<_Tp*>()));
+  using type = __not_>;
   constexpr static bool value = type::value;
   };
 
@@ -949,7 +950,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __is_convertible_to_basic_istream<
typename remove_reference<_Istream>::type>,
  __is_extractable<_Istream&, _Tp&&>>::value,
-  _Istream&>::type
+  typename __is_convertible_to_basic_istream<
+typename
+remove_reference<_Istream>::type>::stream_type>::type
 operator>>(_Istream&& __is, _Tp&& __x)
 {
   __is >> std::forward<_Tp>(__x);
diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index a1fe892..64db7c7 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -617,11 +617,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_convertible_to_basic_ostream
   {
 template
-static true_type __check(basic_ostream<_Ch, _Up>*);
+static basic_ostream<_Ch, _Up>& __check(basic_ostream<_Ch, _Up>*);
 
-static false_type __check(void*);
+static void __check(void*);
   public:
-using type = decltype(__check(declval<_Tp*>()));
+using stream_type = decltype(__check(declval<_Tp*>()));
+using type = __not_>;
 constexpr static bool value = type::value;
   };
 
@@ -650,8 +651,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __is_convertible_to_basic_ostream<
typename remove_reference<_Ostream>::type>,
  __is_insertable<_Ostream&, const _Tp&>>::value,
-  _Ostream&>::type
- //basic_ostream<_CharT, _Traits>&
+  typename __is_convertible_to_basic_ostream<
+typename
+remove_reference<_Ostream>::type>::stream_type>::type
 operator<<(_Ostream&& __os, const _Tp& __x)
 {
   __os << __x;

[PATCH,rs6000] Correct mode of operand 2 in vector extract half-word and word instruction patterns

2016-11-30 Thread Kelvin Nilsen


This patch corrects an error in a patch committed on 2016-10-18 to add
built-in function support for Power9 string operations.  In that
original patch, the mode for operand 2 of the newly added vector
extract half-word and full-word instruction patterns was described as
V16QI, even though those instruction patterns were conceptually
operating on V8HI and V4SI operands respectively.

This patch changes the modes of the operands for these instruction
patterns to better represent the intended types.  This patch improves
readability and maintainability of code.  It does not affect
correctness of generated code, since the existing implementation
implicitly coerces the operand types to the declared type.

The patch has been bootstrapped and tested on powerpc64le-unknown-linux
without regressions.

Is this ok for the trunk?

gcc/ChangeLog:

2016-11-30  Kelvin Nilsen  

PR target/78577
* config/rs6000/vsx.md (vextuhlx): Revise mode of operand 2.
(vextuhrx): Likewise.
(vextuwlx): Likewise.
(vextuwrx): Likewise.

Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 242948)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -3648,7 +3648,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V8HI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUHLX))]
   "TARGET_P9_VECTOR"
   "vextuhlx %0,%1,%2"
@@ -3659,7 +3659,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V8HI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUHRX))]
   "TARGET_P9_VECTOR"
   "vextuhrx %0,%1,%2"
@@ -3670,7 +3670,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V4SI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUWLX))]
   "TARGET_P9_VECTOR"
   "vextuwlx %0,%1,%2"
@@ -3681,7 +3681,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI
 [(match_operand:SI 1 "register_operand" "r")
- (match_operand:V16QI 2 "altivec_register_operand" "v")]
+ (match_operand:V4SI 2 "altivec_register_operand" "v")]
 UNSPEC_VEXTUWRX))]
   "TARGET_P9_VECTOR"
   "vextuwrx %0,%1,%2"


-- 
Kelvin Nilsen, Ph.D.  kdnil...@linux.vnet.ibm.com
home office: 801-756-4821, cell: 520-991-6727
IBM Linux Technology Center - PPC Toolchain

Re: [PATCHv2 4/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-11-30 Thread Andre Vieira (lists)

On 23/11/16 11:52, Andre Vieira (lists) wrote:
> Hi,
> 
> After some extra testing I realized there was an issue with the way we
> were clearing registers when returning from a cmse_nonsecure_entry
> function for ARMv8-M Baseline.  This patch fixes that and changes the
> testcase to catch the issue.
> 
> The problem was I was always using LR to clear the registers, however,
> due to the way the Thumb-1 backend works, we can't guarantee LR will
> contain the address to which we will be returning at the time of
> clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
> not use r0 to return a value, we clear r0 with 0 before using it to
> clear everything else. As for LR, we move the value of the register used
> to return into it prior to returning.
> 
> This satisfies the requirements of not leaking secure information since
> all registers hold either:
> - values to return
> - 0
> - return address
> 
> No changes to ChangeLog.
> 
> Cheers,
> Andre
> 
Hi,

So I seem to have forgotten to address two of your comments earlier,
done in this version.

To reiterate:
After some extra testing I realized there was an issue with the way we
were clearing registers when returning from a cmse_nonsecure_entry
function for ARMv8-M Baseline.  This patch fixes that and changes the
testcase to catch the issue.

The problem was I was always using LR to clear the registers, however,
due to the way the Thumb-1 backend works, we can't guarantee LR will
contain the address to which we will be returning at the time of
clearing. Instead we use r0 to clear r1-r3 and IP. If the function does
not use r0 to return a value, we clear r0 with 0 before using it to
clear everything else. As for LR, we move the value of the register used
to return into it prior to returning.

This satisfies the requirements of not leaking secure information since
all registers hold either:
- values to return
- 0
- return address

*** gcc/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (output_return_instruction): Clear
 registers.
 (thumb2_expand_return): Likewise.
 (thumb1_expand_epilogue): Likewise.
 (thumb_exit): Likewise.
 (arm_expand_epilogue): Likewise.
 (cmse_nonsecure_entry_clear_before_return): New.
 (comp_not_to_clear_mask_str_un): New.
 (compute_not_to_clear_mask): New.
 * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
 * config/arm/thumb2.md (*thumb2_cmse_entry_return): Duplicate
 thumb2_return pattern for cmse_nonsecure_entry functions.

*** gcc/testsuite/ChangeLog ***
2016-11-xx  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
 * gcc.target/arm/cmse/struct-1.c: New.
 * gcc.target/arm/cmse/bitfield-1.c: New.
 * gcc.target/arm/cmse/bitfield-2.c: New.
 * gcc.target/arm/cmse/bitfield-3.c: New.
 * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
cleared.
 * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.

Cheers,
Andre
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db7e0c842fff1b0aee5059e3ea4813059caa8d03..6a9db85aa879e1c5547908dcc9f036ee37de489e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16297,6 +16297,279 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT address, int do_pushes)
   return;
 }

+/* This function computes the clear mask and PADDING_BITS_TO_CLEAR for structs
+   and unions in the context of ARMv8-M Security Extensions.  It is used as a
+   helper function for both 'cmse_nonsecure_call' and 'cmse_nonsecure_entry'
+   functions.  The PADDING_BITS_TO_CLEAR pointer can be the base to either one
+   or four masks, depending on whether it is being computed for a
+   'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call' argument
+   respectively.  The tree for the type of the argument or a field within an
+   argument is passed in ARG_TYPE, the current register this argument or field
+   starts in is kept in the pointer REGNO and updated accordingly, the bit this
+   argument or field starts at is passed in STARTING_BIT and the last used bit
+   is kept in LAST_USED_BIT which is also updated accordingly.  */
+
+static unsigned HOST_WIDE_INT
+comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
+  uint32_t * padding_bits_to_clear,
+  unsigned starting_bit, int * last_used_bit)
+
+{
+  unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
+
+

Re: [PATCH GCC]Simplify (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2)

2016-11-30 Thread Richard Biener

On Fri, Nov 18, 2016 at 5:53 PM, Bin Cheng  wrote:
> Hi,
> This is a rework of https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02007.html.
> Though review comments suggested it could be merged with last kind
> simplification of fold_cond_expr_with_comparison, it's not really
> applicable.  As a matter of fact, the suggestion stands for patch
> @https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02005.html.
> I had previous patch
> (https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01898.html)
> moving fold_cond_expr_with_comparison to match.pd pattern and incorporated
> that patch into it.  For this one, the rework is trivial, just renames
> several variable tags as suggested.  Bootstrap and test on x86_64 and
> AArch64, is it OK?

+ A) Operand x is a unsigned to signed type conversion and c1 is
+   integer zero.  In this case,
+ (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
+ (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)

for (signed type)x < 0 -> x > signed-type-max we probably do a reverse
"canonicalization" transform?  Yeah,

/* Non-equality compare simplifications from fold_binary  */
(for cmp (lt gt le ge)
...
 (if (wi::eq_p (@1, signed_max)
  && TYPE_UNSIGNED (arg1_type)
  /* We will flip the signedness of the comparison operator
 associated with the mode of @1, so the sign bit is
 specified by this mode.  Check that @1 is the signed
 max associated with this sign bit.  */
  && prec == GET_MODE_PRECISION (TYPE_MODE (arg1_type))
  /* signed_type does not work on pointer types.  */
  && INTEGRAL_TYPE_P (arg1_type))
  /* The following case also applies to X < signed_max+1
 and X >= signed_max+1 because previous transformations.  */
  (if (cmp == LE_EXPR || cmp == GT_EXPR)
   (with { tree st = signed_type_for (arg1_type); }
(if (cmp == LE_EXPR)
 (ge (convert:st @0) { build_zero_cst (st); })
 (lt (convert:st @0) { build_zero_cst (st); }))
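
The equivalence in case A can be checked with a small sketch. It fixes the
types to 32 bits and relies on the conversion from unsigned to signed wrapping
modulo 2^32, which is what GCC does but is implementation-defined before
C++20. negative_as_signed and above_signed_max are hypothetical names.

```cpp
#include <cstdint>
#include <limits>

// (signed type)x < 0
constexpr bool negative_as_signed(std::uint32_t x)
{ return static_cast<std::int32_t>(x) < 0; }

// x > MAX_VAL(signed type), compared in the unsigned type
constexpr bool above_signed_max(std::uint32_t x)
{
  return x > static_cast<std::uint32_t>(
               std::numeric_limits<std::int32_t>::max());
}

// The two predicates agree on both sides of the boundary.
static_assert(negative_as_signed(0x80000000u) && above_signed_max(0x80000000u),
              "first value above the signed maximum");
static_assert(!negative_as_signed(0x7fffffffu) && !above_signed_max(0x7fffffffu),
              "the signed maximum itself");
```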

+   if (cmp_code == GE_EXPR)
+ cmp_code = LE_EXPR;
+   c1 = wide_int_to_tree (op_type, wi::max_value (to_type));
+ }
...
+   if (op == PLUS_EXPR)
+ real_c1 = wide_int_to_tree (op_type,
+ wi::sub (c3, c2, sgn, ));
+   else
+ real_c1 = wide_int_to_tree (op_type,
+ wi::add (c3, c2, sgn, ));

can you avoid the tree building here and just continue using wide-ints please?
Simply do the wide_int_to_tree in the result patterns.
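
As a rough illustration of the PLUS_EXPR case, here is a sketch of the
rewrite with plain ints. It assumes the bound used by the min is c3 - c2, as
the quoted hunk computes, that it equals c1, and that no operation overflows;
the exact preconditions of the patch are not visible in this excerpt.
cond_form and minmax_form are hypothetical names.

```cpp
#include <algorithm>

// Original shape: x < c1 ? x + c2 : c3
inline int cond_form(int x, int c1, int c2, int c3)
{ return x < c1 ? x + c2 : c3; }

// Rewritten shape: min (x, c3 - c2) + c2
inline int minmax_form(int x, int c2, int c3)
{ return std::min(x, c3 - c2) + c2; }
```

With c1 = 3, c2 = 4 and c3 = 7 the two functions return the same value for
every x: below 3 both yield x + 4, and from 3 upwards both yield 7.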

Otherwise looks ok to me.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-11-17  Bin Cheng  
>
> * match.pd: Add new pattern:
> (cond (cmp (convert? x) c1) (op x c2) c3) -> (op (minmax x c1) c2).
>
> gcc/testsuite/ChangeLog
> 2016-11-17  Bin Cheng  
>
> * gcc.dg/fold-bopcond-1.c: New test.
> * gcc.dg/fold-bopcond-2.c: New test.

Re: Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-30 Thread Georg-Johann Lay


On 30.11.2016 07:27, Pitchumani Sivanupandi wrote:

On Tuesday 29 November 2016 10:06 PM, Denis Chertykov wrote:

2016-11-28 10:17 GMT+03:00 Pitchumani Sivanupandi:

On Saturday 26 November 2016 12:11 AM, Denis Chertykov wrote:

I'm sorry for delay.

I have a problem with the patch:
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-arch.h
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-devices.c
(Stripping trailing CRs from patch; use --binary to disable.)
patching file avr-mcus.def
Hunk #1 FAILED at 62.
1 out of 1 hunk FAILED -- saving rejects to file avr-mcus.def.rej
(Stripping trailing CRs from patch; use --binary to disable.)
patching file gen-avr-mmcu-specs.c
Hunk #1 succeeded at 215 (offset 5 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file specs.h
Hunk #1 succeeded at 58 (offset 1 line).
Hunk #2 succeeded at 66 (offset 1 line).


There are changes in avr-mcus.def after this patch is submitted.
Now, I have incorporated the changes and attached the resolved patch.

Regards,
Pitchumani

gcc/ChangeLog

2016-11-09  Pitchumani Sivanupandi 

 * config/avr/avr-arch.h (avr_mcu_t): Add flash_size member.
 * config/avr/avr-devices.c (avr_mcu_types): Add flash size info.
 * config/avr/avr-mcus.def: Likewise.
 * config/avr/gen-avr-mmcu-specs.c (print_mcu): Remove hard-coded prefix
 check to find wrap-around value, instead use MCU flash size. For 8k flash
 devices, update link_pmem_wrap spec string to add --pmem-wrap-around=8k.
 * config/avr/specs.h: Remove link_pmem_wrap from LINK_RELAX_SPEC and
 add to linker specs (LINK_SPEC) directly.

Committed.

It looks like only avr-mcus.def and ChangeLog are committed.
Without the other changes trunk build is broken.

Regards,
Pitchumani


Hi, I took the liberty of committing the missing files.

http://gcc.gnu.org/r243033

Johann

Re: [v3 PATCH] LWG 2766, LWG 2749

2016-11-30 Thread Jonathan Wakely


On 26/11/16 14:47 +0200, Ville Voutilainen wrote:

Updated patches attached, and tested with the full testsuite on Linux-PPC64.


Both patches are OK for trunk with the minor tweaks noted below.
Thanks.
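
For context, a sketch of why a deleted, constrained swap overload matters. It
uses a toy wrapper rather than the library types, and every name in it
(detect::is_swappable, demo::pinned, demo::box) is hypothetical. Without the
deleted overload, the generic std::swap would still accept a box whose
element type refuses to be swapped.

```cpp
#include <type_traits>
#include <utility>

namespace detect
{
  using std::swap;

  // Detects whether an unqualified swap (a, b) call is well-formed.
  template <typename T, typename = void>
  struct is_swappable : std::false_type { };

  template <typename T>
  struct is_swappable<T, decltype(void(swap(std::declval<T&>(),
                                            std::declval<T&>())))>
  : std::true_type { };
}

namespace demo
{
  // A type that opts out of swapping.
  struct pinned { };
  void swap(pinned&, pinned&) = delete;

  template <typename T>
  struct box { T value; };

  // Normal overload, only available when the element is swappable.
  template <typename T>
  typename std::enable_if<detect::is_swappable<T>::value>::type
  swap(box<T>& a, box<T>& b)
  { using std::swap; swap(a.value, b.value); }

  // Deleted overload for the other case.  It is more specialized than the
  // generic std::swap, so overload resolution picks it and the call fails.
  template <typename T>
  typename std::enable_if<!detect::is_swappable<T>::value>::type
  swap(box<T>&, box<T>&) = delete;
}

static_assert(detect::is_swappable<demo::box<int>>::value,
              "box<int> is swappable");
static_assert(!detect::is_swappable<demo::box<demo::pinned>>::value,
              "box<pinned> is rejected instead of falling back to std::swap");
```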



diff --git a/libstdc++-v3/include/bits/stl_pair.h b/libstdc++-v3/include/bits/stl_pair.h
index ef52538..981dbeb 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -478,6 +478,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(pair<_T1, _T2>& __x, pair<_T1, _T2>& __y)
noexcept(noexcept(__x.swap(__y)))
{ __x.swap(__y); }
+
+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if,
+  __is_swappable<_T2>>::value>::type
+swap(pair<_T1, _T2>&, pair<_T1, _T2>&) = delete;
+#endif
#endif // __cplusplus >= 201103L

  /**
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index f9ec60f..21b0bac 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -649,6 +649,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(unique_ptr<_Tp, _Dp>& __x,
 unique_ptr<_Tp, _Dp>& __y) noexcept
{ __x.swap(__y); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if::value>::type
+swap(unique_ptr<_Tp, _Dp>&,
+unique_ptr<_Tp, _Dp>&) = delete;
+#endif

  template
diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index 3ab0355..86100b5 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -287,6 +287,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
swap(array<_Tp, _Nm>& __one, array<_Tp, _Nm>& __two)
noexcept(noexcept(__one.swap(__two)))
{ __one.swap(__two); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if<
+  !_GLIBCXX_STD_C::__array_traits<_Tp, _Nm>::_Is_swappable::value>::type
+swap(array<_Tp, _Nm>&, array<_Tp, _Nm>&) = delete;
+#endif

  template
constexpr _Tp&




diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 63cacd4..8952750 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -1442,17 +1442,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
forward_as_tuple(_Elements&&... __args) noexcept
{ return tuple<_Elements&&...>(std::forward<_Elements>(__args)...); }

-  template
-struct __is_tuple_like_impl> : true_type
-{ };
-
-  // Internal type trait that allows us to sfinae-protect tuple_cat.
-  template
-struct __is_tuple_like
-: public __is_tuple_like_impl::type>::type>::type
-{ };
-
  template
struct __make_tuple_impl;

@@ -1596,6 +1585,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
swap(tuple<_Elements...>& __x, tuple<_Elements...>& __y)
noexcept(noexcept(__x.swap(__y)))
{ __x.swap(__y); }


Insert a blank line here please.


+#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
+  template
+inline
+typename enable_if...>::value>::type
+swap(tuple<_Elements...>&, tuple<_Elements...>&) = delete;
+#endif

  // A class (and instance) which can be used in 'tie' when an element
  // of a tuple is not required
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index e5f2bba..49f76cc 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2593,9 +2593,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  template 
struct __is_nothrow_swappable;

+  template
+class tuple;
+
+  template
+struct __is_tuple_like_impl : false_type
+{ };
+
+  template
+struct __is_tuple_like_impl> : true_type
+{ };
+
+  // Internal type trait that allows us to sfinae-protect tuple_cat.
+  template
+struct __is_tuple_like
+: public __is_tuple_like_impl::type>::type>::type


We can lose the std:: quals here.


+{ };
+
  template
inline
-typename enable_if<__and_,
+typename enable_if<__and_<__not_<__is_tuple_like<_Tp>>,
+ is_move_constructible<_Tp>,
  is_move_assignable<_Tp>>::value>::type
swap(_Tp&, _Tp&)
noexcept(__and_,
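
For readers of the archive, the effect of the deleted swap overloads in this patch can be illustrated outside libstdc++. The following is only a sketch under stated assumptions: `demo::box` and `demo::pinned` are hypothetical types, and the public C++17 trait `std::is_swappable` stands in for the internal `__is_swappable` used in the patch.

```cpp
#include <type_traits>
#include <utility>

namespace demo {

// Hypothetical element type that is movable but explicitly not swappable.
struct pinned {
  friend void swap(pinned&, pinned&) = delete;
};

// Hypothetical wrapper playing the role of pair/tuple/array in the patch.
template <typename T>
struct box {
  T value;
  void swap(box& other) {
    using std::swap;
    swap(value, other.value);
  }
};

// Regular overload: participates only when the element type is swappable.
template <typename T>
typename std::enable_if<std::is_swappable<T>::value>::type
swap(box<T>& a, box<T>& b) { a.swap(b); }

// Deleted overload: selected when the element type is not swappable, so that
// overload resolution does not quietly fall back to the generic std::swap.
template <typename T>
typename std::enable_if<!std::is_swappable<T>::value>::type
swap(box<T>&, box<T>&) = delete;

}  // namespace demo

static_assert(std::is_swappable<demo::box<int>>::value,
              "box<int> uses the regular overload");
static_assert(!std::is_swappable<demo::box<demo::pinned>>::value,
              "box<pinned> hits the deleted overload");
```

Without the deleted overload, the second assertion would be expected to fail in this sketch, because `box<pinned>` is still move constructible and move assignable and the generic `std::swap` would be viable.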

Re: [v3 PATCH] Implement LWG 2534, Constrain rvalue stream operators.

2016-11-30 Thread David Edelsohn

I believe that this broke g++.old-deja/g++.law/ctors10.C

invalid initialization of reference of type 'Klasse::Err&' from
expression of type 'std::basic_ostream::__ostream_type {aka
std::basic_ostream}

- David

Re: [v3 PATCH] Implement LWG 2534, Constrain rvalue stream operators.

2016-11-30 Thread Ville Voutilainen

On 30 November 2016 at 16:59, David Edelsohn  wrote:
> I believe that this broke g++.old-deja/g++.law/ctors10.C
>
> invalid initialization of reference of type 'Klasse::Err&' from
> expression of type 'std::basic_ostream::__ostream_type {aka
> std::basic_ostream}


I'll take a look.

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Janus Weil

Hi,

> on IRC:
> 15:28:22 dominiq:  vehre: add /* FALLTHROUGH */
>
> Done and committed as obvious as r243023.

thanks. However, I still see these two:


>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
>> > ‘_gfortran_caf_get_by_ref’:
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
>> > ‘src_size’ may be used uninitialized in this function
>> > [-Wmaybe-uninitialized]
>> >if (size == 0 || src_size == 0)
>> > ~^~~~
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
>> > ‘_gfortran_caf_send_by_ref’:
>> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
>> > ‘src_size’ may be used uninitialized in this function
>> > [-Wmaybe-uninitialized]
>> >if (size == 0 || src_size == 0)
>> > ~^~~~

Can you please fix them as well?
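
A generic way to address this kind of -Wmaybe-uninitialized report is sketched below. It is not the code from single.c: the enumerators and function names are hypothetical, and the sketch only shows the pattern of giving the variable a defined value on every path before it is tested.

```cpp
#include <cstddef>

// Hypothetical element kinds, for illustration only.
enum elem_kind { KIND_INT32, KIND_REAL64, KIND_UNKNOWN };

constexpr std::size_t element_size(elem_kind kind) {
  std::size_t src_size = 0;  // defined on every path, including KIND_UNKNOWN
  switch (kind) {
    case KIND_INT32:
      src_size = 4;
      break;
    case KIND_REAL64:
      src_size = 8;
      break;
    case KIND_UNKNOWN:
      break;
  }
  return src_size;
}

// Mirrors the shape of the flagged condition: size == 0 || src_size == 0.
constexpr bool nothing_to_copy(std::size_t size, elem_kind kind) {
  return size == 0 || element_size(kind) == 0;
}
```

Whether a default of 0 is the right value in the real functions depends on the surrounding logic, which is not shown in this thread.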

Thanks,
Janus




>> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
>> > > Hi Paul,
>> > >
>> > > thanks for the review. Committed with the changes requested and the one
>> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
>> > > -m32 as r243021.
>> > >
>> > > Thanks for the review and tests.
>> > >
>> > > Regards,
>> > > Andre
>> > >
>> > > On Wed, 30 Nov 2016 07:49:13 +0100
>> > > Paul Richard Thomas  wrote:
>> > >
>> > >> Dear Andre,
>> > >>
>> > >> This all looks OK to me. The only comment that I have that you might
>> > >> deal with before committing is that some of the Boolean expressions,
>> > >> eg:
>> > >> +  int caf_dereg_mode
>> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
>> > >> +  || c->attr.codimension)
>> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
>> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
>> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
>> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
>> > >>
>> > >> are getting to be sufficiently convoluted that a small, appropriately
>> > >> named, helper function might be clearer. Of course, this is true of
>> > >> many parts of gfortran but it is not too late to start making the code
>> > >> a bit clearer.
>> > >>
>> > >> You can commit to the present trunk as far as I am concerned. I know
>> > >> that the caf enthusiasts will test it to bits before release!
>> > >>
>> > >> Regards
>> > >>
>> > >> Paul
>> > >>
>> > >>
>> > >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
>> > >> > PING!
>> > >> >
>> > >> > I know it's a lengthy patch, but comments would be nice anyway.
>> > >> >
>> > >> > - Andre
>> > >> >
>> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
>> > >> > Andre Vehreschild  wrote:
>> > >> >
>> > >> >> Hi all,
>> > >> >>
>> > >> >> attached patch addresses the need of extending the API of the caf-libs
>> > >> >> to enable allocatable components asynchronous allocation. Allocatable
>> > >> >> components in derived type coarrays are different from regular
>> > >> >> coarrays or coarrayed components. The latter have to be allocated on
>> > >> >> all images or on none. Furthermore is the allocation a point of
>> > >> >> synchronisation.
>> > >> >>
>> > >> >> For allocatable components the F2008 allows to have some allocated on
>> > >> >> some images and on others not. Furthermore is the registration with
>> > >> >> the caf-lib, that an allocatable component is present in a derived
>> > >> >> type coarray no longer a synchronisation point. To implement these
>> > >> >> features two new types of coarray registration have been introduced.
>> > >> >> The first one just registering the component with the caf-lib and the
>> > >> >> latter doing the allocate. Furthermore has the caf-API been extended
>> > >> >> to provide a query function to learn about the allocation status of a
>> > >> >> component on a remote image.
>> > >> >>
>> > >> >> Sorry, that the patch is rather lengthy. Most of this is due to the
>> > >> >> structure_alloc_comps' signature change. The routine and its wrappers
>> > >> >> are used rather often which needed the appropriate changes.
>> > >> >>
>> > >> >> I know I left two or three TODOs in the patch to remind me of things I
>> > >> >> have to investigate further. For the current state these TODOs are no
>> > >> >> reason to hold back the patch. The third party library opencoarrays
>> > >> >> implements the mpi-part of the caf-model and will change in sync. It
>> > >> >> would of course be advantageous to just have to say: With gcc-7
>> > >> >> gfortran implements allocatable components in derived coarrays nearly
>> > >> >> completely.
>> > >> >>
>> > >> >> I know we are in stage 3. But the patch bootstraps and regtests ok on
>> > >> >> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
>> > >> >>
>> > >> >> Regards,
>> > >> >>   Andre
>> > >> >
>> > >> >
>> > >> > --
>> > >> > Andre Vehreschild * Email: vehre ad gmx dot de
>> > >>
>> > >>
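
The helper function suggested in Paul's review quoted above could look roughly like the sketch below. The identifier names are taken from the quoted expression, but the numeric values, the result enum type and the helper name `caf_dereg_mode_for` are assumptions made for illustration, and the `c->attr.codimension` test is passed in as a plain bool.

```cpp
// Placeholder flag values; only the names come from the quoted patch.
enum : unsigned {
  GFC_STRUCTURE_CAF_MODE_IN_COARRAY = 1u << 0,
  GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY = 1u << 1
};

// Placeholder result values; only the names come from the quoted patch.
enum caf_dereg_kind {
  GFC_CAF_COARRAY_NOCOARRAY,
  GFC_CAF_COARRAY_DEREGISTER,
  GFC_CAF_COARRAY_DEALLOCATE_ONLY
};

// Hypothetical helper with the same decision logic as the nested conditional.
constexpr caf_dereg_kind caf_dereg_mode_for(unsigned caf_mode,
                                            bool codimension) {
  // Neither inside a coarray nor a coarray itself: nothing to deregister.
  if ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) == 0 && !codimension)
    return GFC_CAF_COARRAY_NOCOARRAY;
  // Coarray case: either only deallocate or fully deregister.
  if ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0)
    return GFC_CAF_COARRAY_DEALLOCATE_ONLY;
  return GFC_CAF_COARRAY_DEREGISTER;
}
```

A call site would then reduce to a single line of the form `caf_dereg_mode_for (caf_mode, c->attr.codimension)`.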

[Patch Doc] Update documentation for __fp16 type

2016-11-30 Thread James Greenhalgh


Hi,

Documentation for __fp16 seems to have drifted out of line with
compiler behaviour over time.

This patch tries to fix that up, documenting AArch64 support and
removing comments on restrictions on using the __fp16 type for arguments
and return types.

OK?

Thanks,
James

---
2016-11-30  James Greenhalgh  

* doc/extend.texi (Half-Precision): Update to document current
compiler behaviour.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7d3d17a..cf16ec3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1012,11 +1012,12 @@ that handle conversions if/when long double is changed to be IEEE
 @cindex half-precision floating point
 @cindex @code{__fp16} data type
 
-On ARM targets, GCC supports half-precision (16-bit) floating point via
-the @code{__fp16} type.  You must enable this type explicitly
-with the @option{-mfp16-format} command-line option in order to use it.
+On ARM and AArch64 targets, GCC supports half-precision (16-bit) floating
+point via the @code{__fp16} type defined in the ARM C Language Extensions.
+On ARM systems, you must enable this type explicitly with the
+@option{-mfp16-format} command-line option in order to use it.
 
-ARM supports two incompatible representations for half-precision
+ARM targets support two incompatible representations for half-precision
 floating-point values.  You must choose one of the representations and
 use it consistently in your program.
 
@@ -1031,22 +1032,20 @@ format, but does not support infinities or NaNs.  Instead, the range
 of exponents is extended, so that this format can represent normalized
 values in the range of @math{2^{-14}} to 131008.
 
-The @code{__fp16} type is a storage format only.  For purposes
-of arithmetic and other operations, @code{__fp16} values in C or C++
-expressions are automatically promoted to @code{float}.  In addition,
-you cannot declare a function with a return value or parameters
-of type @code{__fp16}.
+The GCC port for AArch64 only supports the IEEE 754-2008 format, and does
+not require use of the @option{-mfp16-format} command-line option.
 
-Note that conversions from @code{double} to @code{__fp16}
-involve an intermediate conversion to @code{float}.  Because
-of rounding, this can sometimes produce a different result than a
-direct conversion.
+The @code{__fp16} type may only be used as an argument to intrinsics defined
+in @code{}, or as a storage format.  For purposes of
+arithmetic and other operations, @code{__fp16} values in C or C++
+expressions are automatically promoted to @code{float}.
 
-ARM provides hardware support for conversions between
+The ARM target provides hardware support for conversions between
 @code{__fp16} and @code{float} values
-as an extension to VFP and NEON (Advanced SIMD).  GCC generates
-code using these hardware instructions if you compile with
-options to select an FPU that provides them;
+as an extension to VFP and NEON (Advanced SIMD), and from ARMv8 provides
+hardware support for conversions between @code{__fp16} and @code{double}
+values.  GCC generates code using these hardware instructions if you
+compile with options to select an FPU that provides them;
 for example, @option{-mfpu=neon-fp16 -mfloat-abi=softfp},
 in addition to the @option{-mfp16-format} option to select
 a half-precision format.
@@ -1054,8 +1053,12 @@ a half-precision format.
 Language-level support for the @code{__fp16} data type is
 independent of whether GCC generates code using hardware floating-point
 instructions.  In cases where hardware support is not specified, GCC
-implements conversions between @code{__fp16} and @code{float} values
-as library calls.
+implements conversions between @code{__fp16} and other types as library
+calls.
+
+It is recommended that code which is intended to be portable use the
+@code{_Float16} type defined by ISO/IEC TS18661:3-2005
+(@xref{Floating Types}).
 
 @node Decimal Float
 @section Decimal Floating Types

Re: [PATCH] Partial solution to LWG 523

2016-11-30 Thread Jonathan Wakely


On 30/11/16 13:03 +, Jonathan Wakely wrote:

On 26/11/16 16:27 -0800, Tim Shen wrote:

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 953aa87..2fb70b7 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1000,7 +1000,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 element_type&
 operator*() const noexcept
 {
-   __glibcxx_assert(_M_ptr != nullptr);
+   __glibcxx_assert(_M_get() != nullptr);
return *_M_get();
 }


Oops, thanks, but let's fix this separately (I'll do it now) so the
rest of the patch only touches regex stuff.


I've fixed that with this patch, committed to trunk.

commit 18ee83fae4fef2fc720ef6aef0754377e6fe29e8
Author: Jonathan Wakely 
Date:   Wed Nov 30 13:11:50 2016 +

Fix condition in shared_ptr assertion

2016-11-30  Tim Shen  

	* include/bits/shared_ptr_base.h
	(__shared_ptr_access::operator*()): Fix assertion.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 953aa87..2fb70b7 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1000,7 +1000,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   element_type&
   operator*() const noexcept
   {
-	__glibcxx_assert(_M_ptr != nullptr);
+	__glibcxx_assert(_M_get() != nullptr);
 	return *_M_get();
   }
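
One plausible reading of this fix, sketched here with hypothetical names rather than the libstdc++ classes, is that the access helper is a separate base class with no pointer member of its own, so both the assertion and the dereference have to go through the same accessor. `handle` and `handle_access` are invented for illustration, and plain `assert` stands in for `__glibcxx_assert`.

```cpp
#include <cassert>

template <typename T> class handle;

// Base class that provides operator* but stores no pointer itself.
template <typename T>
class handle_access {
public:
  T& operator*() const noexcept {
    // The check uses the accessor, just like the return statement does.
    assert(get_ptr() != nullptr);
    return *get_ptr();
  }

private:
  // Reaches into the derived class, which actually owns the pointer.
  T* get_ptr() const noexcept {
    return static_cast<const handle<T>*>(this)->get();
  }
};

// Derived class that holds the pointer and exposes it through get().
template <typename T>
class handle : public handle_access<T> {
public:
  explicit handle(T* p) noexcept : ptr_(p) {}
  T* get() const noexcept { return ptr_; }

private:
  T* ptr_;
};
```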

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Michael Matz

Hi,

On Wed, 30 Nov 2016, Segher Boessenkool wrote:

> > I don't think mode-changing _extracts are valid in this context.  From the 
> > docu:
> > 
> >   `(sign_extract:M LOC SIZE POS)'
> >   ...
> >  The mode M is the same as the mode that would be used for LOC if
> >  it were a register.
> > 
> > Probably it could be made to work just fine, but I'm not sure it'd be 
> > worth much, as then the targets would need to care for mode-changes 
> > occurring not just through subregs as usual, but also through extracts.
> 
> The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
> submitted yesterday deals with this same issue, FWIW -- some ports
> apparently already do mode-changing extracts.

Yeah, saw that a bit later.  So, hmmm.  I'm not sure what to make of it, 
if the targets choose to use mode-changing extracts I guess that's fine, 
as they presumably will have written patterns that recognize them.  But I 
don't think we should willy-nilly generate such patterns as we can't know 
if the target deals with them or not.  We could of course always generate 
both variants: (subreg:M1 (extract:M2 (object:M2)) and
(extract:M1 (object:M2)) and see if either matches, but that seems a bit 
too much work.

Ciao,
Michael.

[PATCH,libstdc++] xfail operator new on AIX

2016-11-30 Thread David Edelsohn

AIX shared libraries do not allow overriding and interposition of
symbols by default, and C++ relies on that mechanism to replace operators
such as operator new (and operator delete).  Four libstdc++ testcases
rely on this behavior and fail on AIX.  Jonathan and I have decided to
XFAIL the testcases to reduce the noise in the testsuite output.
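
For context, the kind of replacement these testcases depend on looks roughly like the sketch below. It is not taken from the testsuite: the counter and the helper `new_calls_for_one_int` are hypothetical, and the sketch only shows a program-provided global operator new taking the place of the library's default.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Illustration only: counts calls that reach the replaced operator new.
static std::size_t g_new_calls = 0;

// Program-provided replacement for the global allocation function.
void* operator new(std::size_t size) {
  ++g_new_calls;
  if (void* p = std::malloc(size != 0 ? size : 1))
    return p;
  throw std::bad_alloc();
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times the replacement was entered for one new-expression.
std::size_t new_calls_for_one_int() {
  const std::size_t before = g_new_calls;
  int* volatile p = new int(42);  // volatile keeps the allocation observable
  const std::size_t after = g_new_calls;
  assert(p != nullptr);
  delete p;
  return after - before;
}
```

On a platform where the replacement is not picked up for allocations made inside the shared library, tests that count or intercept those allocations would be expected to fail in the way described above.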

With the recent libstdc++ testsuite changes, there now are only four
(4) failures in the libstdc++ testsuite on AIX -- all due to one real
bug in the C++ front-end for targets with stabs debugging.

Thanks, David

* testsuite/18_support/50594.cc: XFAIL on AIX.
* testsuite/ext/mt_allocator/check_new.cc: Same.
* testsuite/ext/pool_allocator/check_new.cc: Same.
* testsuite/27_io/ios_base/storage/11584.cc: Same.

Index: 18_support/50594.cc
===
--- 18_support/50594.cc (revision 243019)
+++ 18_support/50594.cc (working copy)
@@ -1,5 +1,6 @@
 // { dg-options "-fwhole-program" }
 // { dg-additional-options "-static-libstdc++" { target *-*-mingw* } }
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }

 // Copyright (C) 2011-2016 Free Software Foundation, Inc.
 //
Index: ext/mt_allocator/check_new.cc
===
--- ext/mt_allocator/check_new.cc   (revision 243019)
+++ ext/mt_allocator/check_new.cc   (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2001-11-25  Phil Edwards  
 //
 // Copyright (C) 2001-2016 Free Software Foundation, Inc.
Index: ext/pool_allocator/check_new.cc
===
--- ext/pool_allocator/check_new.cc (revision 243019)
+++ ext/pool_allocator/check_new.cc (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2001-11-25  Phil Edwards  
 //
 // Copyright (C) 2001-2016 Free Software Foundation, Inc.
Index: 27_io/ios_base/storage/11584.cc
===
--- 27_io/ios_base/storage/11584.cc (revision 243019)
+++ 27_io/ios_base/storage/11584.cc (working copy)
@@ -1,3 +1,5 @@
+// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+
 // 2004-01-25 jlqu...@gcc.gnu.org

 // Copyright (C) 2004-2016 Free Software Foundation, Inc.

[Patch doc] Document _Float16 availability on ARM/AArch64

2016-11-30 Thread James Greenhalgh


Hi,

As subject - update extend.texi to mention availability of _Float16 types
on ARM and AArch64.

OK?

Thanks,
James

---
2016-11-30  James Greenhalgh  

* doc/extend.texi (Floating Types): Document availability of
_Float16 on ARM/AArch64.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d873403..7d3d17a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -997,8 +997,10 @@ IEEE binary128 format.  The @code{_Float64x} type is supported on all
 systems where @code{__float128} is supported.  The @code{_Float32}
 type is supported on all systems supporting IEEE binary32; the
 @code{_Float64} and @code{Float32x} types are supported on all systems
-supporting IEEE binary64.  GCC does not currently support
-@code{_Float16} or @code{_Float128x} on any systems.
+supporting IEEE binary64.  The @code{_Float16} type is supported on AArch64
+systems by default, and on ARM systems when the IEEE format for 16-bit
+floating point types is selected with @option{-mfp16-format=ieee}.
+GCC does not currently support @code{_Float128x} on any systems.
 
 On the PowerPC, @code{__ibm128} provides access to the IBM extended
 double format, and it is intended to be used by the library functions

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild

Hi all,

on IRC:
15:28:22 dominiq:  vehre: add /* FALLTHROUGH */

Done and committed as obvious as r243023.
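
As a minimal sketch of what such an annotation looks like (the enumerators and the function are hypothetical, not the code in single.c), a marker comment placed directly before the next case label tells GCC that the fall through is intended:

```cpp
// Hypothetical reference kinds, for illustration only.
enum ref_kind { REF_SINGLE, REF_VECTOR, REF_NONE };

constexpr int classify(ref_kind kind, bool is_last) {
  int score = 0;
  switch (kind) {
    case REF_SINGLE:
      if (is_last)
        return 1;
      /* FALLTHROUGH */
    case REF_VECTOR:
      score = 2;
      break;
    case REF_NONE:
      break;
  }
  return score;
}
```

The comment is only read by the compiler's diagnostics, so it does not change the generated code.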

- Andre

On Wed, 30 Nov 2016 15:22:46 +0100
Andre Vehreschild  wrote:

> Janus,
> 
> those fallthroughs are fully intentional and each and every one is documented.
> If you can tell me a way to remove those false positive warnings, I am happy
> to do so, as long as it comes at no extra cost at runtime.
> 
> - Andre
> 
> On Wed, 30 Nov 2016 14:48:38 +0100
> Janus Weil  wrote:
> 
> > Hi Andre,
> > 
> > after your commit I see several warnings when compiling libgfortran
> > (see below). Could you please fix those (if possible)?
> > 
> > Thanks,
> > Janus
> > 
> > 
> > 
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_is_present’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> > this statement may fall through [-Wimplicit-fallthrough=]
> >  if (riter->next == NULL)
> > ^
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
> >case CAF_ARR_REF_VECTOR:
> >^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_get_by_ref’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> > ‘src_size’ may be used uninitialized in this function
> > [-Wmaybe-uninitialized]
> >if (size == 0 || src_size == 0)
> > ~^~~~
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> > ‘_gfortran_caf_send_by_ref’:
> > /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> > ‘src_size’ may be used uninitialized in this function
> > [-Wmaybe-uninitialized]
> >if (size == 0 || src_size == 0)
> > ~^~~~
> > 
> > 
> > 
> > 
> > 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :  
> > > Hi Paul,
> > >
> > > thanks for the review. Committed with the changes requested and the one
> > > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> > > -m32 as r243021.
> > >
> > > Thanks for the review and tests.
> > >
> > > Regards,
> > > Andre
> > >
> > > On Wed, 30 Nov 2016 07:49:13 +0100
> > > Paul Richard Thomas  wrote:
> > >
> > >> Dear Andre,
> > >>
> > >> This all looks OK to me. The only comment that I have that you might
> > >> deal with before committing is that some of the Boolean expressions,
> > >> eg:
> > >> +  int caf_dereg_mode
> > >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> > >> +  || c->attr.codimension)
> > >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> > >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> > >> +  : GFC_CAF_COARRAY_DEREGISTER)
> > >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> > >>
> > > >> are getting to be sufficiently convoluted that a small, appropriately
> > >> named, helper function might be clearer. Of course, this is true of
> > >> many parts of gfortran but it is not too late to start making the code
> > >> a bit clearer.
> > >>
> > >> You can commit to the present trunk as far as I am concerned. I know
> > >> that the caf enthusiasts will test it to bits before release!
> > >>
> > >> Regards
> > >>
> > >> Paul
> > >>
> > >>
> > >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
> > >> > PING!
> > >> >
> > >> > I know it's a lengthy patch, but comments would be nice anyway.
> > >> >
> > >> > - Andre
> > >> >
> > >> > On Tue, 22 Nov 2016 20:46:50 +0100
> > >> > Andre Vehreschild  wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> attached patch addresses the need of extending the API of the caf-libs
> > >> >> to enable allocatable components asynchronous allocation. Allocatable
> > >> >> components in derived type coarrays are different from regular
> > >> >> coarrays or coarrayed components. The latter have to be allocated on
> > >> >> all images or on none. Furthermore is the allocation a point of
> > >> >> synchronisation.
> > >> >>
> > >> >>

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild

Janus,

those fallthroughs are fully intentional and each and every one is documented.
If you can tell me a way to remove those false positive warnings, I am happy to
do so, as long as it comes at no extra cost at runtime.

- Andre

On Wed, 30 Nov 2016 14:48:38 +0100
Janus Weil  wrote:

> Hi Andre,
> 
> after your commit I see several warnings when compiling libgfortran
> (see below). Could you please fix those (if possible)?
> 
> Thanks,
> Janus
> 
> 
> 
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_is_present’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
> this statement may fall through [-Wimplicit-fallthrough=]
>  if (riter->next == NULL)
> ^
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
>case CAF_ARR_REF_VECTOR:
>^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_get_by_ref’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
> ‘src_size’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>if (size == 0 || src_size == 0)
> ~^~~~
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
> ‘_gfortran_caf_send_by_ref’:
> /home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
> ‘src_size’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
>if (size == 0 || src_size == 0)
> ~^~~~
> 
> 
> 
> 
> 2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
> > Hi Paul,
> >
> > thanks for the review. Committed with the changes requested and the one
> > reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> > -m32 as r243021.
> >
> > Thanks for the review and tests.
> >
> > Regards,
> > Andre
> >
> > On Wed, 30 Nov 2016 07:49:13 +0100
> > Paul Richard Thomas  wrote:
> >  
> >> Dear Andre,
> >>
> >> This all looks OK to me. The only comment that I have that you might
> >> deal with before committing is that some of the Boolean expressions,
> >> eg:
> >> +  int caf_dereg_mode
> >> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> >> +  || c->attr.codimension)
> >> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> >> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> >> +  : GFC_CAF_COARRAY_DEREGISTER)
> >> +  : GFC_CAF_COARRAY_NOCOARRAY;
> >>
> > >> are getting to be sufficiently convoluted that a small, appropriately
> >> named, helper function might be clearer. Of course, this is true of
> >> many parts of gfortran but it is not too late to start making the code
> >> a bit clearer.
> >>
> >> You can commit to the present trunk as far as I am concerned. I know
> >> that the caf enthusiasts will test it to bits before release!
> >>
> >> Regards
> >>
> >> Paul
> >>
> >>
> >> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:  
> >> > PING!
> >> >
> >> > I know it's a lengthy patch, but comments would be nice anyway.
> >> >
> >> > - Andre
> >> >
> >> > On Tue, 22 Nov 2016 20:46:50 +0100
> >> > Andre Vehreschild  wrote:
> >> >  
> >> >> Hi all,
> >> >>
> >> >> attached patch addresses the need of extending the API of the caf-libs
> >> >> to enable allocatable components asynchronous allocation. Allocatable
> >> >> components in derived type coarrays are different from regular coarrays
> >> >> or coarrayed components. The latter have to be allocated on all images
> >> >> or on none. Furthermore is the allocation a point of synchronisation.
> >> >>
> >> >> For allocatable components the F2008 allows to have some allocated on
> >> >> some images and on others not. Furthermore is the registration with the
> >> >> caf-lib, that an allocatable component is present in a derived type
> >> >> coarray no longer a synchronisation point. To implement these features
> >> >> two new types of coarray registration have been introduced. The first
> >> >> one just registering the component with the caf-lib and the latter
> >> >>

Re: [PATCH] Fix PR78306

2016-11-30 Thread Richard Biener

On Wed, 30 Nov 2016, Andrew Senkevich wrote:

> 2016-11-30 11:52 GMT+03:00 Richard Biener :
> > On Tue, 29 Nov 2016, Jeff Law wrote:
> >
> >> On 11/29/2016 12:47 AM, Richard Biener wrote:
> >> > > Balaji added this check explicitly. There should be tests in the testsuite
> >> > > (spawnee_inline, spawner_inline) which exercise that code.
> >> >
> >> > Yes he did, but no, nothing in the testsuite.
> >> I believe the tests are:
> >>
> >> c-c++-common/cilk-plus/CK/spawnee_inline.c
> >> c-c++-common/cilk-plus/CK/spawner_inline.c
> >>
> >> But as I mentioned, they don't check for proper behaviour
> >
> > Actually they do -- and both show what the issue might be, cilk+
> > uses setjmp but we already have code to disallow inlining of
> > functions calling setjmp (but we happily inline into functions
> > calling setjmp).  When mangling the testcases to try forcing
> > inlining I still (the patch was already applied) get
> >
> > /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:
> > In function ‘fib’:
> > /space/rguenther/src/gcc-git/gcc/testsuite/c-c++-common/cilk-plus/CK/spawnee_inline.c:9:50:
> > error: function ‘fib’ can never be copied because it receives a non-local
> > goto
> >
> > so the intent was probably to disallow inlining of functions calling
> > cilk_spawn, not to disable inlining into functions calling cilk_spawn.
> >
> > But as seen above this is already handled by generic code handling
> > setjmp.
> >
> >>
> >> >
> >> > There is _nowhere_ documented _why_ the checks were added.  Why is
> >> > inlining a transform that can do anything bad to a function using
> >> > cilk_spawn?
> >> I know, it's disappointing.  Even the tests mentioned above don't shed any
> >> real light on the issue.
> >
> > One issue is obvious (but already handled).  Why all inlining should
> > be disabled is indeed still a mystery.
> 
> I suppose inlining should be disabled for the function called right after
> cilk_spawn, because the spawn has to be applied to a function call.
> If there was no way to disable inlining of just that call, it looks like
> inlining was disabled for the whole function to fix a Cilk Plus Conformance
> Suite test failure.

I see.  Even with GCC 6.2 the conformance test suite has a lot of FAILs
(and compiler ICEs):

In 734 tests: 229 (60 compilation, 169 execution) tests do not conform 
specification, 505 conform

using the suite in version 1.2.1.  For current trunk I get

In 734 tests: 198 (43 compilation, 155 execution) tests do not conform 
specification, 536 conform

where gcc 6.2 vs. gcc 7 diff is

-  PASS cn_cilk_for_label_in.c
-  PASS cn_cilk_for_label_out.c
+  FAIL cn_cilk_for_label_in.c (internal compiler error)
+  FAIL cn_cilk_for_label_out.c (internal compiler error)
-  PASS cn_cilk_for_return.c
+  FAIL cn_cilk_for_return.c (internal compiler error)
-  PASS ep_cilk_for_compare1.c
+  FAIL ep_cilk_for_compare1.c (internal compiler error)
-  PASS ep_cilk_for_increment1.c
+  FAIL ep_cilk_for_increment1.c (internal compiler error)
-  PASS ep_cilk_for_nest1.c
-  PASS ep_cilk_for_pragma2.c
+  FAIL ep_cilk_for_nest1.c (internal compiler error)
+  FAIL ep_cilk_for_pragma2.c (internal compiler error)
-  PASS cn_cilk_for_init1.cpp
+  FAIL cn_cilk_for_init1.cpp (internal compiler error)
-  FAIL cn_omp_simd_lastprivate2-1.c (internal compiler error)
-  FAIL cn_omp_simd_lastprivate2-2.c (internal compiler error)
-  FAIL cn_omp_simd_lastprivate3.c
-  FAIL cn_omp_simd_lastprivate4-1.c (internal compiler error)
+  PASS cn_omp_simd_lastprivate2-1.c
+  PASS cn_omp_simd_lastprivate2-2.c
+  PASS cn_omp_simd_lastprivate3.c
+  PASS cn_omp_simd_lastprivate4-1.c
-  FAIL cn_omp_simd_linear2.c
-  FAIL cn_omp_simd_linear3-1.c (internal compiler error)
+  PASS cn_omp_simd_linear2.c
+  PASS cn_omp_simd_linear3-1.c
-  FAIL cn_omp_simd_linear6-2.c
+  PASS cn_omp_simd_linear6-2.c
-  FAIL cn_omp_simd_private3.c
-  FAIL cn_omp_simd_private4-1.c (internal compiler error)
+  PASS cn_omp_simd_private3.c
+  PASS cn_omp_simd_private4-1.c
-  FAIL cn_omp_simd_reduction1.c (internal compiler error)
-  FAIL cn_omp_simd_reduction2.c (internal compiler error)
-  FAIL cn_omp_simd_reduction3.c
-  FAIL cn_omp_simd_reduction4-1.c
-  FAIL cn_omp_simd_reduction4-2.c (internal compiler error)
-  FAIL cn_omp_simd_reduction5.c (internal compiler error)
+  PASS cn_omp_simd_reduction1.c
+  PASS cn_omp_simd_reduction2.c
+  PASS cn_omp_simd_reduction3.c
+  PASS cn_omp_simd_reduction4-1.c
+  PASS cn_omp_simd_reduction4-2.c
+  FAIL cn_omp_simd_reduction5.c
-  FAIL cn_simd_linear1.c
-  FAIL cn_simd_linear2.c
-  FAIL cn_simd_linear3.c
+  PASS cn_simd_linear1.c
+  PASS cn_simd_linear2.c
+  PASS cn_simd_linear3.c
-  FAIL cn_omp_simd_lastprivate1.cpp
-  FAIL cn_omp_simd_lastprivate2.cpp
+  PASS cn_omp_simd_lastprivate1.cpp
+  PASS cn_omp_simd_lastprivate2.cpp
-  FAIL cn_omp_simd_linear2-1.cpp
+  PASS cn_omp_simd_linear2-1.cpp
-  FAIL cn_omp_simd_private1.cpp
-  FAIL cn_omp_simd_private2.cpp
+  PASS cn_omp_simd_private1.cpp
+  PASS

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 02:43:12PM +0100, Michael Matz wrote:
> > Shouldn't this be simply
> > 
> >   ...
> >   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
> >   ...
> 
> I don't think mode-changing _extracts are valid in this context.  From the 
> docu:
> 
>   `(sign_extract:M LOC SIZE POS)'
>   ...
>  The mode M is the same as the mode that would be used for LOC if
>  it were a register.
> 
> Probably it could be made to work just fine, but I'm not sure it'd be 
> worth much, as then the targets would need to care for mode-changes 
> occurring not just through subregs as usual, but also through extracts.

The patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02987.html I
submitted yesterday deals with this same issue, FWIW -- some ports
apparently already do mode-changing extracts.


Segher

Re: [PATCH PR78574]Fix infinite recursion in find_deriving_biv_for_expr

2016-11-30 Thread Richard Biener

On Wed, Nov 30, 2016 at 2:54 PM, Bin Cheng  wrote:
> Hi,
> A loop header PHI defining an IV (biv) may not be identified as a biv because
> its increment statement is in an (irreducible) inner loop.  Function
> find_deriving_biv_for_expr doesn't take this into consideration and runs into
> infinite recursion.  The patch fixes this issue by skipping such loop header
> PHIs.  Bootstrapped and tested on x86_64 and AArch64.  Is it OK?

Ok.

Richard.

> BTW, we don't mark such IV as biv because of below code:
>
>   /* If the increment is in the subloop, ignore it.  */
>   incr_bb = gimple_bb (SSA_NAME_DEF_STMT (var));
>   if (incr_bb->loop_father != data->current_loop
>       || (incr_bb->flags & BB_IRREDUCIBLE_LOOP))
>     continue;
>
> On second thought, this check may be too strict.  Given a valid incr_iv
> returned by simple_iv, we know it behaves just like the usual increment IVs.
> In other words, though the increment statement is executed multiple times in
> the inner loop, it computes the same result for every iteration.  Anyway,
> this is stage1 work.
>
> Thanks,
> bin
>
> 2016-11-30  Bin Cheng  
>
> PR tree-optimization/78574
> * tree-ssa-loop-ivopts.c (find_deriving_biv_for_expr): Skip loop
> header PHI that doesn't define biv.
>
> gcc/testsuite/ChangeLog
> 2016-11-30  Bin Cheng  
>
> PR tree-optimization/78574
> * gcc.c-torture/compile/pr78574.c: New test.

Re: [PATCH][AArch64] Separate shrink wrapping hooks implementation

2016-11-30 Thread Kyrill Tkachov



On 29/11/16 20:29, Segher Boessenkool wrote:

Hi James, Kyrill,

On Tue, Nov 29, 2016 at 10:57:33AM +, James Greenhalgh wrote:

+static sbitmap
+aarch64_components_for_bb (basic_block bb)
+{
+  bitmap in = DF_LIVE_IN (bb);
+  bitmap gen = &DF_LIVE_BB_INFO (bb)->gen;
+  bitmap kill = &DF_LIVE_BB_INFO (bb)->kill;
+
+  sbitmap components = sbitmap_alloc (V31_REGNUM + 1);
+  bitmap_clear (components);
+
+  /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
+  for (unsigned regno = R0_REGNUM; regno <= V31_REGNUM; regno++)

The use of R0_REGNUM and V31_REGNUM scares me a little bit, as we're hardcoding
where the end of the register file is (does this, for example, fall apart
with the SVE work that was recently posted?). Something like a
LAST_HARDREG_NUM might work?

Components and registers aren't the same thing (you can have components
for things that aren't just a register save, e.g. the frame setup, stack
alignment, save of some non-GPR via a GPR, PIC register setup, etc.)
The loop here should really only cover the non-volatile registers, and
there should be some translation from register number to component number
(it of course is convenient to have a 1-1 translation for the GPRs and
floating point registers).  For rs6000 many things in the backend already
use non-symbolic numbers for the FPRs and GPRs, so that is easier there.
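
A minimal sketch of the register-to-component translation described above,
assuming a toy target where the sets are plain bitmasks.  The register range,
reg_to_component and components_for_bb are hypothetical names for
illustration only; this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical toy target: registers 19..30 are callee-saved
   (non-volatile).  This is an assumption, not the AArch64 numbering.  */
enum { FIRST_SAVED_REG = 19, LAST_SAVED_REG = 30 };

/* Map a register number to a component number, or -1 if the register
   has no component.  The mapping is 1-1 here purely for convenience.  */
static int
reg_to_component (unsigned regno)
{
  if (regno < FIRST_SAVED_REG || regno > LAST_SAVED_REG)
    return -1;
  return (int) regno;
}

/* A block needs a component if the matching register is in its IN, GEN
   or KILL set.  Bit N of each mask stands for register N.  */
static uint64_t
components_for_bb (uint64_t in, uint64_t gen, uint64_t kill)
{
  uint64_t components = 0;
  uint64_t used = in | gen | kill;

  /* Only walk the non-volatile registers, not the whole register file.  */
  for (unsigned regno = FIRST_SAVED_REG; regno <= LAST_SAVED_REG; regno++)
    {
      int comp = reg_to_component (regno);
      if (comp >= 0 && (used & (UINT64_C (1) << regno)))
        components |= UINT64_C (1) << comp;
    }
  return components;
}
```

Keeping the translation in one function means a component that is not a
plain register save could later get its own number without touching the loop.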


Anyway, here's the patch with James's comments implemented.
I've introduced LAST_SAVED_REGNUM which is used to delimit the registers
considered for shrink-wrapping.

aarch64_process_components is introduced and used to implement
the emit_prologue_components and emit_epilogue_components functions in a
single place.

Bootstrapped and tested on aarch64-none-linux-gnu.

Thanks,
Kyrill

2016-11-30  Kyrylo Tkachov  

* config/aarch64/aarch64.h (machine_function): Add
reg_is_wrapped_separately field.
* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Define new constant.
* config/aarch64/aarch64.c (emit_set_insn): Change return type to
rtx_insn *.
(aarch64_save_callee_saves): Don't save registers that are wrapped
separately.
(aarch64_restore_callee_saves): Don't restore registers that are
wrapped separately.
(offset_9bit_signed_unscaled_p, offset_12bit_unsigned_scaled_p,
aarch64_offset_7bit_signed_scaled_p): Move earlier in the file.
(aarch64_get_separate_components): New function.
(aarch64_get_next_set_bit): Likewise.
(aarch64_components_for_bb): Likewise.
(aarch64_disqualify_components): Likewise.
(aarch64_emit_prologue_components): Likewise.
(aarch64_emit_epilogue_components): Likewise.
(aarch64_set_handled_components): Likewise.
(aarch64_process_components): Likewise.
(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS,
TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB,
TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS,
TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Define.

+static void
+aarch64_disqualify_components (sbitmap, edge, sbitmap, bool)
+{
+}

Is there no default "do nothing" hook for this?

I can make the shrink-wrap code do nothing here if this hook isn't
defined, if you want?


I don't mind either way.
If you do it I'll then remove the empty implementation in aarch64.




Segher


commit 194816281ec6da2620bb34c9278ed7edf8bcf0da
Author: Kyrylo Tkachov 
Date:   Tue Oct 11 09:25:54 2016 +0100

[AArch64] Separate shrink wrapping hooks implementation

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 82bfe14..48e6e2c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1138,7 +1138,7 @@ aarch64_is_extend_from_extract (machine_mode mode, rtx mult_imm,
 
 /* Emit an insn that's a simple single-set.  Both the operands must be
known to be valid.  */
-inline static rtx
+inline static rtx_insn *
 emit_set_insn (rtx x, rtx y)
 {
   return emit_insn (gen_rtx_SET (x, y));
@@ -3135,6 +3135,9 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
 	  || regno == cfun->machine->frame.wb_candidate2))
 	continue;
 
+  if (cfun->machine->reg_is_wrapped_separately[regno])
+   continue;
+
   reg = gen_rtx_REG (mode, regno);
   offset = start_offset + cfun->machine->frame.reg_offset[regno];
   mem = gen_mem_ref (mode, plus_constant (Pmode, stack_pointer_rtx,
@@ -3143,6 +3146,7 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
   regno2 = aarch64_next_callee_save (regno + 1, limit);
 
   if (regno2 <= limit
+	  && !cfun->machine->reg_is_wrapped_separately[regno2]
 	  && ((cfun->machine->frame.reg_offset[regno] + UNITS_PER_WORD)
 	  == cfun->machine->frame.reg_offset[regno2]))
 
@@ -3191,6 +3195,9 @@ aarch64_restore_callee_saves (machine_mode mode,
regno <= limit;
regno = aarch64_next_callee_save (regno + 1,

[PATCH PR78574]Fix infinite recursion in find_deriving_biv_for_expr

2016-11-30 Thread Bin Cheng

Hi,
A loop header PHI defining an IV (biv) may not be identified as a biv because
its increment statement is in an (irreducible) inner loop.  Function
find_deriving_biv_for_expr doesn't take this into consideration and runs into
infinite recursion.  The patch fixes this issue by skipping such loop header
PHIs.  Bootstrapped and tested on x86_64 and AArch64.  Is it OK?

BTW, we don't mark such IV as biv because of below code:

  /* If the increment is in the subloop, ignore it.  */
  incr_bb = gimple_bb (SSA_NAME_DEF_STMT (var));
  if (incr_bb->loop_father != data->current_loop
      || (incr_bb->flags & BB_IRREDUCIBLE_LOOP))
    continue;

On second thought, this check may be too strict.  Given a valid incr_iv
returned by simple_iv, we know it behaves just like the usual increment IVs.
In other words, though the increment statement is executed multiple times in
the inner loop, it computes the same result for every iteration.  Anyway, this
is stage1 work.

Thanks,
bin

2016-11-30  Bin Cheng  

PR tree-optimization/78574
* tree-ssa-loop-ivopts.c (find_deriving_biv_for_expr): Skip loop
header PHI that doesn't define biv.

gcc/testsuite/ChangeLog
2016-11-30  Bin Cheng  

PR tree-optimization/78574
* gcc.c-torture/compile/pr78574.c: New test.

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr78574.c b/gcc/testsuite/gcc.c-torture/compile/pr78574.c
new file mode 100644
index 000..8c91d1e
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr78574.c
@@ -0,0 +1,27 @@
+/* PR tree-optimization/78574 */
+
+int a, d, f, g;
+int b[1];
+short h;
+int main() {
+  long j;
+  int k, i;
+  for (; j; j++) {
+    i = 0;
+    for (; i < 6; i++) {
+      int l = a, m = d || g;
+    L:
+      l ^ m | a;
+    }
+    b[j + 1] = 2;
+    ++k;
+    for (; g; g++) {
+      d ^= h;
+      if (f)
+        for (;;)
+          ;
+    }
+  }
+  if (k)
+    goto L;
+}
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 5c667a2..00b287a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -1853,6 +1853,11 @@ find_deriving_biv_for_expr (struct ivopts_data *data, tree expr)
 {
   ssa_op_iter iter;
   use_operand_p use_p;
+  basic_block phi_bb = gimple_bb (phi);
+
+  /* Skip loop header PHI that doesn't define biv.  */
+  if (phi_bb->loop_father == data->current_loop)
+    return NULL;
 
   if (virtual_operand_p (gimple_phi_result (phi)))
     return NULL;
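
To illustrate why the guard is needed, here is a minimal sketch on a toy
def-use graph.  struct def and find_deriving_biv are hypothetical stand-ins,
not GCC's data structures; the sketch only assumes that a loop header PHI
which is not a biv has an operand that eventually flows back into itself.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical toy SSA definition; none of these names are GCC's.  */
struct def
{
  bool is_phi;          /* Defined by a PHI node?  */
  bool in_loop_header;  /* PHI sits in the header of the current loop?  */
  bool is_biv;          /* Already recognised as a basic IV?  */
  struct def *ops[2];   /* Operands, NULL when absent.  */
};

static struct def *
find_deriving_biv (struct def *d)
{
  if (d == NULL)
    return NULL;
  if (d->is_biv)
    return d;

  /* The guard: a loop header PHI that is not a biv feeds back into
     itself, so walking its operands would never terminate.  */
  if (d->is_phi && d->in_loop_header)
    return NULL;

  for (size_t i = 0; i < 2; i++)
    {
      struct def *found = find_deriving_biv (d->ops[i]);
      if (found)
        return found;
    }
  return NULL;
}
```

Without the guard, a walk starting at the increment of such a PHI would
visit the PHI, then the increment again, and so on without end.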

Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 05:54:58AM -0700, Jeff Law wrote:
> Funny how you speculated there could be these issues hiding in the 
> weeds, then just a few days later, one crawls out.

Two (there is PR78607 as well).  Although that one seems related to the
combine one.  All the same reporter, it's not a big coincidence ;-)

Segher

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Janus Weil

Hi Andre,

after your commit I see several warnings when compiling libgfortran
(see below). Could you please fix those (if possible)?

Thanks,
Janus



/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_is_present’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2949:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2952:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2976:8: warning:
this statement may fall through [-Wimplicit-fallthrough=]
 if (riter->next == NULL)
^
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2979:3: note: here
   case CAF_ARR_REF_VECTOR:
   ^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_get_by_ref’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:1863:29: warning:
‘src_size’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   if (size == 0 || src_size == 0)
~^~~~
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c: In function
‘_gfortran_caf_send_by_ref’:
/home/jweil/gcc/gcc7/trunk/libgfortran/caf/single.c:2649:29: warning:
‘src_size’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
   if (size == 0 || src_size == 0)
~^~~~
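
For reference, a minimal sketch of the pattern behind the first warning and
one documented way to mark it as intended.  The enum and function below are
hypothetical and only loosely modelled on the diagnostics above; whether the
fall through in single.c is intentional is not something this sketch assumes.
GCC 7 accepts a comment such as /* FALLTHRU */ directly before the next case
label, or the statement __attribute__ ((fallthrough));.

```c
/* Hypothetical reference kinds, not the ones used in libgfortran.  */
enum ref_kind { REF_SINGLE, REF_VECTOR, REF_NONE };

static int
classify (enum ref_kind kind, int is_last)
{
  int score = 0;

  switch (kind)
    {
    case REF_SINGLE:
      if (is_last)
        return 1;
      score += 10;
      /* FALLTHRU */
    case REF_VECTOR:
      score += 2;
      break;
    case REF_NONE:
      break;
    }
  return score;
}
```

If the fall through is not intended, the fix is a break (or return) at the
end of the first case instead of the comment.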




2016-11-30 14:30 GMT+01:00 Andre Vehreschild :
> Hi Paul,
>
> thanks for the review. Committed with the changes requested and the one
> reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with
> -m32 as r243021.
>
> Thanks for the review and tests.
>
> Regards,
> Andre
>
> On Wed, 30 Nov 2016 07:49:13 +0100
> Paul Richard Thomas  wrote:
>
>> Dear Andre,
>>
>> This all looks OK to me. The only comment that I have that you might
>> deal with before committing is that some of the Boolean expressions,
>> eg:
>> +  int caf_dereg_mode
>> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
>> +  || c->attr.codimension)
>> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
>> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
>> +  : GFC_CAF_COARRAY_DEREGISTER)
>> +  : GFC_CAF_COARRAY_NOCOARRAY;
>>
>> are getting to be sufficiently convoluted that a small, appropriately
>> named, helper function might be clearer. Of course, this is true of
>> many parts of gfortran but it is not too late to start making the code
>> a bit clearer.
>>
>> You can commit to the present trunk as far as I am concerned. I know
>> that the caf enthusiasts will test it to bits before release!
>>
>> Regards
>>
>> Paul
>>
>>
>> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
>> > PING!
>> >
>> > I know it's a lengthy patch, but comments would be nice anyway.
>> >
>> > - Andre
>> >
>> > On Tue, 22 Nov 2016 20:46:50 +0100
>> > Andre Vehreschild  wrote:
>> >
>> >> Hi all,
>> >>
>> >> the attached patch addresses the need to extend the API of the caf-libs
>> >> to enable asynchronous allocation of allocatable components. Allocatable
>> >> components in derived type coarrays are different from regular coarrays or
>> >> coarrayed components. The latter have to be allocated on all images or on
>> >> none. Furthermore, their allocation is a point of synchronisation.
>> >>
>> >> For allocatable components, F2008 allows some to be allocated on some
>> >> images and not on others. Furthermore, the registration with the caf-lib
>> >> that an allocatable component is present in a derived type coarray is no
>> >> longer a synchronisation point. To implement these features, two new types
>> >> of coarray registration have been introduced: the first one just
>> >> registers the component with the caf-lib and the second one does the
>> >> allocation. In addition, the caf-API has been extended to provide a query
>> >> function to learn about the allocation status of a component on a remote
>> >> image.
>> >>
>> >> Sorry, that the patch is rather lengthy. Most of this is due to the
>> >> structure_alloc_comps' signature change. The routine and its wrappers are
>> >> used rather often which needed the appropriate changes.
>> >>
>> >> I know I left two or three TODOs in the patch to remind me of things I
>> >> have to investigate further. For the current state these TODOs are

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Michael Matz

Hi,

On Wed, 30 Nov 2016, Dominik Vogt wrote:

> On Wed, Nov 23, 2016 at 02:22:07PM +, Segher Boessenkool wrote:
> > r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
> > It all has the same root cause: that patch makes combine convert every
> > lowpart subreg of a logical shift right to a zero_extract.  This cannot
> > work at all if it is not a constant shift, and it has to be a bit more
> > careful exactly which bits it extracts.
> > 
> > Tested on powerpc64-linux {-m32,-m64} (where it fixes the regression
> > c-c++-common/torture/vector-compare-1.c fails at -O1, and where it also
> > has a bootstrap failure with some other patches).  Also tested that the
> > x86_64 compiler still generates the wanted code for the PR77881 testcase.
> 
> Is this a side effect of the patch series?

What is "this"?

>   Trying 7 -> 9:
>   ...
>   Failed to match this instruction:
>   (set (reg:SI 68 [ v_or ])
>   (ior:SI (subreg:SI (zero_extract:DI (reg:DI 2 %r2 [ v_x ])
>^^
>   (const_int 16 [0x10])
>   (const_int 0 [0])) 4)
> ^^^
>   (reg:SI 70 [ v_and1 ])))
> 
> Shouldn't this be simply
> 
>   ...
>   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
>   ...

I don't think mode-changing _extracts are valid in this context.  From the 
docu:

  `(sign_extract:M LOC SIZE POS)'
  ...
 The mode M is the same as the mode that would be used for LOC if
 it were a register.

Probably it could be made to work just fine, but I'm not sure it'd be 
worth much, as then the targets would need to care for mode-changes 
occurring not just through subregs as usual, but also through extracts.

Ciao,
Michael.

Re: [PATCH] ira: Don't substitute into TRAP_IF insns (PR78610)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 01:52:51PM +0100, Richard Biener wrote:
> On Wed, Nov 30, 2016 at 1:46 PM, Segher Boessenkool
>  wrote:
> > In the testcase, IRA propagates a constant into a TRAP_IF insn, which
> > then becomes an unconditional trap.  Unconditional traps are control
> > flow insns so doing this requires surgery on the cfg.
> 
> Huh, that's an odd choice ;)  I'd say TRAP_IF should be a control-flow insn
> as well, but well...

It doesn't really matter here, converting a conditional TRAP_IF to an
unconditional one requires changing the cfg in any case (and we cannot
do that here).

Making every TRAP_IF a control flow insn means they will end their BB;
that will make some things simpler, sure.  It will also limit some of
the RTL optimizations a bit.

Segher

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)

2016-11-30 Thread Segher Boessenkool

On Wed, Nov 30, 2016 at 02:12:35PM +0100, Dominik Vogt wrote:
> On Wed, Nov 23, 2016 at 02:22:07PM +, Segher Boessenkool wrote:
> > r242414, for PR77881, introduces some bugs (PR78390, PR78438, PR78477).
> > It all has the same root cause: that patch makes combine convert every
> > lowpart subreg of a logical shift right to a zero_extract.  This cannot
> > work at all if it is not a constant shift, and it has to be a bit more
> > careful exactly which bits it extracts.
> > 
> > Tested on powerpc64-linux {-m32,-m64} (where it fixes the regression
> > c-c++-common/torture/vector-compare-1.c fails at -O1, and where it also
> > has a bootstrap failure with some other patches).  Also tested that the
> > x86_64 compiler still generates the wanted code for the PR77881 testcase.
> 
> Is this a side effect of the patch series?

I do not know; I cannot tell just from this, and there is no source
snippet to try.  Maybe Michael can tell?

>   Trying 7 -> 9:
>   ...
>   Failed to match this instruction:
>   (set (reg:SI 68 [ v_or ])
>   (ior:SI (subreg:SI (zero_extract:DI (reg:DI 2 %r2 [ v_x ])
>^^
>   (const_int 16 [0x10])
>   (const_int 0 [0])) 4)
> ^^^
>   (reg:SI 70 [ v_and1 ])))
> 
> Shouldn't this be simply
> 
>   ...
>   (ior:SI (zero_extract:SI (reg:DI) (16) (0)))
>   ...

That seems nicer, sure.  OTOH that will never match on a target that
does not have zero_extract:SI from :DI.  *_extracts are nasty.


Segher
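
As a rough illustration of the "which bits" point in the quoted patch
description, here is a sketch of the arithmetic only, not of the combine
code.  It assumes bit 0 is the least significant bit (BITS_BIG_ENDIAN false),
a 64-bit inner mode and a 32-bit lowpart; all function names are
hypothetical.

```c
#include <stdint.h>

/* (zero_extract:DI x size pos), assuming 0 < size and size + pos <= 64.  */
static uint64_t
zero_extract_di (uint64_t x, unsigned size, unsigned pos)
{
  uint64_t mask = size == 64 ? ~UINT64_C (0) : (UINT64_C (1) << size) - 1;
  return (x >> pos) & mask;
}

/* Lowpart SI subreg of (lshiftrt:DI x n): shift, keep the low 32 bits.  */
static uint32_t
lowpart_si_of_lshiftrt (uint64_t x, unsigned n)
{
  return (uint32_t) (x >> n);
}

/* Number of bits an equivalent extract has to cover for a constant
   shift count n < 64: at most 32, and never past bit 63.  */
static unsigned
extract_size_for_shift (unsigned n)
{
  return 64 - n < 32 ? 64 - n : 32;
}
```

The size depends on the shift count, which is one way to see why a
non-constant shift cannot be rewritten like this.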

Re: PING! [PATCH, Fortran, accaf, v1] Add caf-API-calls to asynchronously handle allocatable components in derived type coarrays.

2016-11-30 Thread Andre Vehreschild

Hi Paul,

thanks for the review. Committed with the changes requested and the one
reported by Dominique on IRC for coarray_lib_alloc_4 when compiled with -m32 as
r243021. 

Thanks for the review and tests.

Regards,
Andre

On Wed, 30 Nov 2016 07:49:13 +0100
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> This all looks OK to me. The only comment that I have that you might
> deal with before committing is that some of the Boolean expressions,
> eg:
> +  int caf_dereg_mode
> +  = ((caf_mode & GFC_STRUCTURE_CAF_MODE_IN_COARRAY) != 0
> +  || c->attr.codimension)
> +  ? ((caf_mode & GFC_STRUCTURE_CAF_MODE_DEALLOC_ONLY) != 0
> +  ? GFC_CAF_COARRAY_DEALLOCATE_ONLY
> +  : GFC_CAF_COARRAY_DEREGISTER)
> +  : GFC_CAF_COARRAY_NOCOARRAY;
> 
> are getting to be sufficiently convoluted that a small, appropriately
> named, helper function might be clearer. Of course, this is true of
> many parts of gfortran but it is not too late to start making the code
> a bit clearer.
> 
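
A sketch of what such a helper might look like.  The enumerators below are
stand-ins with made-up values so that the example is self-contained; the real
GFC_STRUCTURE_CAF_MODE_* and GFC_CAF_COARRAY_* constants live in gfortran's
headers, and pick_dereg_mode is a hypothetical name.

```c
#include <stdbool.h>

/* Stand-in flags and results; values are assumptions for illustration.  */
enum { MODE_IN_COARRAY = 1 << 0, MODE_DEALLOC_ONLY = 1 << 1 };
enum dereg_mode { DEREG_NOCOARRAY, DEREG_DEREGISTER, DEREG_DEALLOCATE_ONLY };

/* Same decision as the nested conditional expression quoted above,
   written as early returns.  */
static enum dereg_mode
pick_dereg_mode (int caf_mode, bool codimension)
{
  if ((caf_mode & MODE_IN_COARRAY) == 0 && !codimension)
    return DEREG_NOCOARRAY;
  if ((caf_mode & MODE_DEALLOC_ONLY) != 0)
    return DEREG_DEALLOCATE_ONLY;
  return DEREG_DEREGISTER;
}
```
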
> You can commit to the present trunk as far as I am concerned. I know
> that the caf enthusiasts will test it to bits before release!
> 
> Regards
> 
> Paul
> 
> 
> On 28 November 2016 at 19:33, Andre Vehreschild  wrote:
> > PING!
> >
> > I know it's a lengthy patch, but comments would be nice anyway.
> >
> > - Andre
> >
> > On Tue, 22 Nov 2016 20:46:50 +0100
> > Andre Vehreschild  wrote:
> >  
> >> Hi all,
> >>
> >> the attached patch addresses the need to extend the API of the caf-libs
> >> to enable asynchronous allocation of allocatable components. Allocatable
> >> components in derived type coarrays are different from regular coarrays or
> >> coarrayed components. The latter have to be allocated on all images or on
> >> none. Furthermore, their allocation is a point of synchronisation.
> >>
> >> For allocatable components, F2008 allows some to be allocated on some
> >> images and not on others. Furthermore, the registration with the caf-lib
> >> that an allocatable component is present in a derived type coarray is no
> >> longer a synchronisation point. To implement these features, two new types
> >> of coarray registration have been introduced: the first one just
> >> registers the component with the caf-lib and the second one does the
> >> allocation. In addition, the caf-API has been extended to provide a query
> >> function to learn about the allocation status of a component on a remote
> >> image.
> >>
> >> Sorry, that the patch is rather lengthy. Most of this is due to the
> >> structure_alloc_comps' signature change. The routine and its wrappers are
> >> used rather often which needed the appropriate changes.
> >>
> >> I know I left two or three TODOs in the patch to remind me of things I
> >> have to investigate further. For the current state these TODOs are no
> >> reason to hold back the patch. The third party library opencoarrays
> >> implements the mpi-part of the caf-model and will change in sync. It would
> >> of course be advantageous to just have to say: With gcc-7 gfortran
> >> implements allocatable components in derived coarrays nearly completely.
> >>
> >> I know we are in stage 3. But the patch bootstraps and regtests ok on
> >> x86_64-linux/F23. So, is it ok for trunk or shall it go to 7.2?
> >>
> >> Regards,
> >>   Andre  
> >
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: libgfortran/caf/single.c
===
--- libgfortran/caf/single.c	(Revision 243020)
+++ libgfortran/caf/single.c	(Arbeitskopie)
@@ -144,11 +144,17 @@
   || type == CAF_REGTYPE_CRITICAL || type == CAF_REGTYPE_EVENT_STATIC
   || type == CAF_REGTYPE_EVENT_ALLOC)
 local = calloc (size, sizeof (bool));
+  else if (type == CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY)
+local = NULL;
   else
 local = malloc (size);
-  *token = malloc (sizeof (struct caf_single_token));
 
-  if (unlikely (local == NULL || *token == NULL))
+  if (type != CAF_REGTYPE_COARRAY_ALLOC_ALLOCATE_ONLY)
+*token = malloc (sizeof (struct caf_single_token));
+
+  if (unlikely (*token == NULL
+		|| (local == NULL
+		&& type != CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY)))
 {
   /* Freeing the memory conditionally seems pointless, but
 	 caf_internal_error () may return, when a stat is given and then the
@@ -163,7 +169,7 @@
 
   single_token = TOKEN (*token);
   single_token->memptr = local;
-  single_token->owning_memory = true;
+  single_token->owning_memory = type != CAF_REGTYPE_COARRAY_ALLOC_REGISTER_ONLY;
   single_token->desc = GFC_DESCRIPTOR_RANK (data) > 0 ? data : NULL;
 
 
@@ -184,7 +190,7 @@
 
 
 void
-_gfortran_caf_deregister (caf_token_t *token, int *stat,
+_gfortran_caf_deregister (caf_token_t *token, caf_deregister_t type, int *stat,
 			  char *errmsg __attribute__ ((unused)),
 			  int
