date:20170103

Re: [PATCH] Fix -fself-test ICE in non-english locale (PR bootstrap/77569)

2017-01-03 Thread Jakub Jelinek

On Tue, Jan 03, 2017 at 08:09:03PM -0500, David Malcolm wrote:
> > LC_ALL=en_US.UTF-8 and LC_ALL=de_DE.UTF-8).  Ok for trunk?
> 
> Thanks for looking at this; OK for trunk.

Thanks.

> I wonder if it makes sense to add a target to gcc/Makefile.in to run
> the selftests in a non-English locale, to catch these things earlier?
> Or to hardcode it to run in the C locale?

Not sure about that.  Testing in a non-English locale still wouldn't
trigger it unless the *.gmo file is installed somewhere the
non-installed compiler finds it; I don't remember whether that can be
overridden through env vars etc.  Testing only in the C locale is
possible, but then the testing coverage is smaller.

Jakub

Re: C/C++ PATCH to implement -Wpointer-compare warning (PR c++/64767)

2017-01-03 Thread Eric Gallager

On 10/2/16, Jason Merrill  wrote:
> OK, thanks.
>
> On Sat, Oct 1, 2016 at 10:16 AM, Marek Polacek  wrote:
>> On Fri, Sep 30, 2016 at 05:48:03PM -0400, Jason Merrill wrote:
>>> On Fri, Sep 30, 2016 at 12:43 PM, Marek Polacek 
>>> wrote:
>>> > On Fri, Sep 23, 2016 at 10:31:33AM -0400, Jason Merrill wrote:
>>> >> On Fri, Sep 23, 2016 at 9:15 AM, Marek Polacek 
>>> >> wrote:
>>> >> > On Wed, Sep 21, 2016 at 03:52:09PM -0400, Jason Merrill wrote:
>>> >> >> On Mon, Sep 19, 2016 at 2:49 PM, Jason Merrill 
>>> >> >> wrote:
>>> >> >> > I suppose that an INTEGER_CST of character type is necessarily a
>>> >> >> > character constant, so adding a check for !char_type_p ought to
>>> >> >> > do the
>>> >> >> > trick.
>>> >> >>
>>> >> >> Indeed it does.  I'm checking this in:
>>> >> >
>>> >> > Nice, thanks.  What about the original patch?  We still need to
>>> >> > warn
>>> >> > (or error for C++11) for pointer comparisons.
>>> >>
>>> >> If we still accept pointer comparisons in C++, that's another bug
>>> >> with
>>> >> treating \0 as a null pointer constant.  This seems to be because
>>> >> ocp_convert of \0 to int produces an INTEGER_CST indistinguishable
>>> >> from literal 0.
>>> >
>>> > I was trying to fix this in ocp_convert, by using NOP_EXPRs, but that
>>> > wasn't
>>> > successful.  But since we're interested in ==/!=, I think this can be
>>> > fixed
>>> > easily in cp_build_binary_op.  Actually, all that seems to be needed is
>>> > using
>>> > orig_op as the argument to null_ptr_cst_p, but that wouldn't give the
>>> > correct
>>> > diagnostics, so I did this.  By checking orig_op we can see if the
>>> > operands are
>>> > character literals or not, because orig_op is an operand before the
>>> > default
>>> > conversions.
>>>
>>> What is wrong about the diagnostic from just using orig_op?  "ISO C++
>>> forbids comparison between pointer and integer" seems fine to me, and
>>> will help the user to realize that they need to index off the pointer.
>>>
>>> I see that some of the calls to null_ptr_cst_p in cp_build_binary_op
>>> have already been changed to check orig_op*, but not all.  Let's
>>> update the remaining calls, that should do the trick without adding a
>>> new error.
>>
>> Here you go:
>>
>> Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?
>>
>> 2016-10-01  Marek Polacek  
>>
>> Core 903
>> * typeck.c (cp_build_binary_op): Pass original operands to
>> null_ptr_cst_p, not those after the default conversions.
>>
>> * g++.dg/cpp0x/nullptr37.C: New test.
>>
>> diff --git gcc/cp/typeck.c gcc/cp/typeck.c
>> index 617ca55..8b780be 100644
>> --- gcc/cp/typeck.c
>> +++ gcc/cp/typeck.c
>> @@ -4573,7 +4573,7 @@ cp_build_binary_op (location_t location,
>>   || code1 == COMPLEX_TYPE || code1 == ENUMERAL_TYPE))
>> short_compare = 1;
>>else if (((code0 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type0))
>> -   && null_ptr_cst_p (op1))
>> +   && null_ptr_cst_p (orig_op1))
>>/* Handle, eg, (void*)0 (c++/43906), and more.  */
>>|| (code0 == POINTER_TYPE
>>&& TYPE_PTR_P (type1) && integer_zerop (op1)))
>> @@ -4587,7 +4587,7 @@ cp_build_binary_op (location_t location,
>>   warn_for_null_address (location, op0, complain);
>> }
>>else if (((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1))
>> -   && null_ptr_cst_p (op0))
>> +   && null_ptr_cst_p (orig_op0))
>>/* Handle, eg, (void*)0 (c++/43906), and more.  */
>>|| (code1 == POINTER_TYPE
>>&& TYPE_PTR_P (type0) && integer_zerop (op0)))
>> @@ -4604,7 +4604,7 @@ cp_build_binary_op (location_t location,
>>|| (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P
>> (type1)))
>> result_type = composite_pointer_type (type0, type1, op0, op1,
>>   CPO_COMPARISON, complain);
>> -  else if (null_ptr_cst_p (op0) && null_ptr_cst_p (op1))
>> +  else if (null_ptr_cst_p (orig_op0) && null_ptr_cst_p (orig_op1))
>> /* One of the operands must be of nullptr_t type.  */
>>  result_type = TREE_TYPE (nullptr_node);
>>else if (code0 == POINTER_TYPE && code1 == INTEGER_TYPE)
>> @@ -4623,7 +4623,7 @@ cp_build_binary_op (location_t location,
>>else
>>  return error_mark_node;
>> }
>> -  else if (TYPE_PTRMEMFUNC_P (type0) && null_ptr_cst_p (op1))
>> +  else if (TYPE_PTRMEMFUNC_P (type0) && null_ptr_cst_p (orig_op1))
>> {
>>   if (TARGET_PTRMEMFUNC_VBIT_LOCATION
>>   == ptrmemfunc_vbit_in_delta)
>> @@ -4664,7 +4664,7 @@ cp_build_binary_op (location_t location,
>> }
>>   result_type = TREE_TYPE (op0);
>> }
>> -  else if (TYPE_PTRMEMFUNC_P (type1) &&
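
For readers following the thread, the mistake this warning is aimed at
is comparing the pointer itself with a zero-valued character constant
where the pointed-to character was meant.  A minimal C sketch follows;
the function name is made up for illustration and does not come from
the patch or its testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: returns 1 if S is the
   empty string and 0 otherwise.  S must not be NULL.  */
int
is_empty_string (const char *s)
{
  assert (s != NULL);
  /* The mistake under discussion would be written
       if (s == '\0')
     which compares the pointer itself with a zero-valued character
     constant, so it is really a null-pointer test in disguise.  The
     intended test dereferences the pointer first.  */
  return *s == '\0';
}
```

This matches Jason's remark that the diagnostic should help the user
realize that they need to index off the pointer.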

Re: Implement -Wduplicated-branches (PR c/64279) (v3)

2017-01-03 Thread Eric Gallager

On 11/3/16, Jakub Jelinek  wrote:
> On Thu, Nov 03, 2016 at 09:27:55AM -0400, Jason Merrill wrote:
>> On Thu, Nov 3, 2016 at 7:24 AM, Marek Polacek  wrote:
>> > On Tue, Nov 01, 2016 at 02:53:58PM +0100, Jakub Jelinek wrote:
>> >> On Tue, Nov 01, 2016 at 09:41:20AM -0400, Jason Merrill wrote:
>> >> > On Tue, Oct 25, 2016 at 9:59 AM, Marek Polacek 
>> >> > wrote:
>> >> > > On Mon, Oct 24, 2016 at 04:10:21PM +0200, Marek Polacek wrote:
>> >> > >> On Thu, Oct 20, 2016 at 12:28:36PM +0200, Marek Polacek wrote:
>> >> > >> > I found a problem with this patch--we can't call
>> >> > >> > do_warn_duplicated_branches in
>> >> > >> > build_conditional_expr, because that way the C++-specific codes
>> >> > >> > might leak into
>> >> > >> > the hasher.  Instead, I should use operand_equal_p, I think.
>> >> > >> > Let me rework
>> >> > >> > that part of the patch.
>> >> >
>> >> > Hmm, is there a reason not to use operand_equal_p for
>> >> > do_warn_duplicated_branches as well?  I'm concerned about hash
>> >> > collisions leading to false positives.
>> >>
>> >> If the hashing function is iterative_hash_expr / inchash::add_expr,
>> >> then
>> >> that is supposed to pair together with operand_equal_p, we even have
>> >> checking verification of that.
>>
>> Yes, but there could still be hash collisions; we can't guarantee that
>> everything that compares unequal also hashes unequal.
>
> Right, after h0 == h1 is missing && operand_equal_p (thenb, elseb, 0)
> or so (the exact last operand needs to be figured out).
> OEP_ONLY_CONST is certainly wrong, we want the same VAR_DECLs to mean the
> same thing.  0 is a tiny bit better, but still it will give up on e.g. pure
> and other calls.  OEP_PURE_SAME is tiny bit better than that, but still
> calls with the same arguments to the same function will not be considered
> equal, plus likely operand_equal_p doesn't handle STATEMENT_LIST etc.
> So maybe we need another OEP_* mode for this.
>
>   Jakub
>

Pinging this conversation for the new year. Any chances of
-Wduplicated-branches making it in in time for GCC 7?

Thanks,
Eric
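
The point made above, that equal hashes need to be confirmed by an
equality check before warning, can be sketched in plain C.  Everything
here is a toy: strings stand in for the two branches, a deliberately
weak hash stands in for inchash::add_expr, and strcmp stands in for
operand_equal_p.  None of these names come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Deliberately weak hash (sum of bytes) so that collisions are easy to
   demonstrate; it only stands in for a real expression hasher.  */
static unsigned int
toy_hash (const char *s)
{
  unsigned int h = 0;
  for (; *s; s++)
    h += (unsigned char) *s;
  return h;
}

/* Stand-in for a structural equality check on the two branches.  */
static bool
toy_equal_p (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}

/* Hash comparison alone: cheap, but it can report false positives.  */
static bool
same_by_hash_only (const char *thenb, const char *elseb)
{
  return toy_hash (thenb) == toy_hash (elseb);
}

/* Hash comparison used as a filter, confirmed by the equality check.  */
static bool
same_by_hash_and_equal (const char *thenb, const char *elseb)
{
  return toy_hash (thenb) == toy_hash (elseb) && toy_equal_p (thenb, elseb);
}
```

With this toy hash, "a = b;" and "b = a;" collide, so the hash-only
check would flag them as duplicated branches while the confirmed check
would not.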

Contents of PO file 'cpplib-7.1-b20170101.ru.po'

2017-01-03 Thread Translation Project Robot



cpplib-7.1-b20170101.ru.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Russian PO file for 'cpplib' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Russian team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/ru.po

(This file, 'cpplib-7.1-b20170101.ru.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] c++/78771 ICE with inheriting ctor

2017-01-03 Thread Jason Merrill


On 12/19/2016 08:09 AM, Nathan Sidwell wrote:

this patch fixes 78771, where an assert fires due to recursive
instantiation of an inheriting ctor.  Normally when a recursive
instantiation is needed, we've already constructed and registered the
declaration, so simply return it.  For ctors though we need to construct
the clones after we've instantiated the master pattern (later in
instantiate_template_1).  Hence any recursive instantiation of a cloned
fn will barf, as we do.

Now, with an inherited ctor we have to deduce its exception spec and
deletedness (deduce_inheriting_ctor).  That's fine, until one gets the
perverse testcase here.  In figuring out what Middle ctor is needed by
Middle(0), we end up trying to instantiate Derived::Derived (int) to see
if Middle::Middle (Derived) is a viable candidate.


Hmm, that seems like where the problem is.  We shouldn't try to 
instantiate the inheriting constructor until we've already chosen the 
base constructor; in the new model the inheriting constructor is just an 
implementation detail.


Jason

Patch Pings for fixes to bz33562 and bz61912.

2017-01-03 Thread Jeff Law



Pinging patches #1 and #2 from the 4 part series to improve DSE.  ISTM 
that #3 and #4 should wait for gcc-7.


Patches #1 and #2 included for reference.

--- Begin Message ---
This is the first of the 4 part patchkit to address deficiencies in our 
DSE implementation.


This patch addresses the P2 regression 33562 which has been a low 
priority regression since gcc-4.3.  To summarize, DSE no longer has the 
ability to detect an aggregate store as dead if subsequent stores are 
done in a piecemeal fashion.


I originally tackled this by changing how we lower complex objects. That 
was sufficient to address 33562, but was reasonably rejected.


This version attacks the problem by improving DSE to track stores to 
memory at a byte level.  That allows us to determine if a series of 
stores completely covers an earlier store (thus making the earlier store 
dead).


A useful side effect of this is we can detect when parts of a store are 
dead and potentially rewrite the store.  This patch implements that for 
complex object initializations.  While not strictly part of 33562, it's 
so closely related that I felt it belongs as part of this patch.


This originally limited the size of the tracked memory space to 64 
bytes.  I bumped the limit after working through the CONSTRUCTOR and 
mem* trimming patches.  The 256 byte limit is still fairly arbitrary and 
I wouldn't lose sleep if we throttled back to 64 or 128 bytes.
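
As an illustration only, the byte-level idea described above can be
sketched in standalone C.  This is not the code from the patch: the
type and function names below are hypothetical, a plain bool array
stands in for the sbitmap, and the only detail taken from this message
is the 256 byte limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED_BYTES 256   /* mirrors the limit mentioned above */

/* Hypothetical tri-state result, loosely modeled on the description.  */
enum store_status { STORE_LIVE, STORE_PARTIALLY_DEAD, STORE_DEAD };

struct live_bytes
{
  bool live[MAX_TRACKED_BYTES];   /* live[i]: byte i not yet overwritten */
  size_t size;                    /* size of the original store in bytes */
};

/* Start tracking a store of SIZE bytes; every byte is live at first.
   Returns false if the store is too large (or empty) to track.  */
static bool
live_bytes_init (struct live_bytes *lb, size_t size)
{
  if (size == 0 || size > MAX_TRACKED_BYTES)
    return false;
  lb->size = size;
  for (size_t i = 0; i < size; i++)
    lb->live[i] = true;
  return true;
}

/* A later store wrote LEN bytes at OFFSET within the original store;
   those bytes of the original store are no longer live.  */
static void
clear_bytes_written (struct live_bytes *lb, size_t offset, size_t len)
{
  for (size_t i = offset; i < lb->size && i - offset < len; i++)
    lb->live[i] = false;
}

/* Classify the original store from the bytes that remain live.  */
static enum store_status
classify_store (const struct live_bytes *lb)
{
  size_t n_live = 0;
  for (size_t i = 0; i < lb->size; i++)
    if (lb->live[i])
      n_live++;
  if (n_live == 0)
    return STORE_DEAD;
  return n_live == lb->size ? STORE_LIVE : STORE_PARTIALLY_DEAD;
}
```

For a 16 byte aggregate store followed by two 8 byte stores to its
halves, the sketch reports the original store as partially dead after
the first piecemeal store and dead after the second.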


Later patches in the kit will build upon this patch.  So if pieces look 
like skeleton code, that's because it is.


The changes since the V2 patch are:

1. Using sbitmaps rather than bitmaps.
2. Returning a tri-state from dse_classify_store (renamed from 
dse_possible_dead_store_p)

3. More efficient trim computation
4. Moving trimming code out of dse_classify_store
5. Refactoring code to delete dead calls/assignments
6. dse_optimize_stmt moves into the dse_dom_walker class

Not surprisingly, this patch has most of the changes based on prior 
feedback as it includes the raw infrastructure.


Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?

PR tree-optimization/33562
* params.def (PARAM_DSE_MAX_OBJECT_SIZE): New PARAM.
* sbitmap.h (bitmap_clear_range, bitmap_set_range): Prototype new
functions.
(bitmap_count_bits): Likewise.
* sbitmap.c (bitmap_clear_range, bitmap_set_range): New functions.
(bitmap_count_bits): Likewise.
* tree-ssa-dse.c: Include params.h.
(dse_store_status): New enum.
(initialize_ao_ref_for_dse): New, partially extracted from
dse_optimize_stmt.
(valid_ao_ref_for_dse, normalize_ref): New.
(setup_live_bytes_from_ref, compute_trims): Likewise.
(clear_bytes_written_by, trim_complex_store): Likewise.
(maybe_trim_partially_dead_store): Likewise.
(maybe_trim_complex_store): Likewise.
(dse_classify_store): Renamed from dse_possible_dead_store_p.
Track what bytes live from the original store.  Return tri-state
for dead, partially dead or live.
(dse_dom_walker): Add constructor, destructor and new private members.
(delete_dead_call, delete_dead_assignment): New extracted from
dse_optimize_stmt.
(dse_optimize_stmt): Make a member of dse_dom_walker.
Use initialize_ao_ref_for_dse.


* gcc.dg/tree-ssa/complex-4.c: No longer xfailed.
* gcc.dg/tree-ssa/complex-5.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-9.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-18.c: New test.
* gcc.dg/tree-ssa/ssa-dse-19.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-20.c: Likewise.
* gcc.dg/tree-ssa/ssa-dse-21.c: Likewise.

diff --git a/gcc/params.def b/gcc/params.def
index 50f75a7..f367c1d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -532,6 +532,11 @@ DEFPARAM(PARAM_AVG_LOOP_NITER,
 "Average number of iterations of a loop.",
 10, 1, 0)
 
+DEFPARAM(PARAM_DSE_MAX_OBJECT_SIZE,
+"dse-max-object-size",
+"Maximum size (in bytes) of objects tracked by dead store 
elimination.",
+256, 0, 0)
+
 DEFPARAM(PARAM_SCEV_MAX_EXPR_SIZE,
 "scev-max-expr-size",
 "Bound on size of expressions used in the scalar evolutions analyzer.",
diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c
index 10b4347..2b66a6c 100644
--- a/gcc/sbitmap.c
+++ b/gcc/sbitmap.c
@@ -202,6 +202,39 @@ bitmap_empty_p (const_sbitmap bmap)
   return true;
 }
 
+void
+bitmap_clear_range (sbitmap bmap, unsigned int start, unsigned int count)
+{
+  for (unsigned int i = start; i < start + count; i++)
+bitmap_clear_bit (bmap, i);
+}
+
+void
+bitmap_set_range (sbitmap bmap, unsigned int start, unsigned int count)
+{
+  for (unsigned int i = start; i < start + count; i++)
+bitmap_set_bit (bmap, i);
+}
+
+
+unsigned int
+bitmap_count_bits (const_sbitmap bmap)
+{
+  unsigned int count = 0;
+  for (unsigned int i = 0; i < bmap->size; i++)
+if (bmap->elms[i])
+

Re: [PATCH] c++/78765

2017-01-03 Thread Jason Merrill


On 12/16/2016 07:23 AM, Nathan Sidwell wrote:

when cxx_eval_constant_expression finds a nonconstant expression it
returns a TREE without TREE_CONSTANT set.
  else if (non_constant_p && TREE_CONSTANT (r))
  {
  /* This isn't actually constant, so unset TREE_CONSTANT.  */
  ...
 else // THIS CASE HAPPENS
r = build_nop (TREE_TYPE (r), r);
  TREE_CONSTANT (r) = false;
   }


Hmm, we shouldn't get here for an expression we're going to use as an 
lvalue.


Jason

Re: [PATCH] PR c++/66735 lambda capture by reference

2017-01-03 Thread Jason Merrill


On 01/03/2017 08:57 AM, Nathan Sidwell wrote:

   else if (!is_this && explicit_init_p)
 {
-  type = make_auto ();
-  type = do_auto_deduction (type, expr, type);
+  tree auto_node = make_auto ();
+
+  type = auto_node;
+  if (by_reference_p)
+   {
+ /* Add the reference now, so deduction doesn't lose
+outermost CV qualifiers of EXPR.  */
+ type = build_reference_type (type);
+ by_reference_p = false;
+   }
+  type = do_auto_deduction (type, expr, auto_node);
 }
   else
 type = non_reference (unlowered_expr_type (expr));
+
+  if (!is_this && by_reference_p)
+type = build_reference_type (type);


This looks like it will call build_reference_type twice in the explicit 
init case, producing a reference to reference.



if (DECLTYPE_FOR_LAMBDA_CAPTURE (t))
  type = lambda_capture_field_type (type,
-   DECLTYPE_FOR_INIT_CAPTURE (t));
+   DECLTYPE_FOR_INIT_CAPTURE (t),
+   /*by_reference_p=*/false);


Always passing false seems unlikely to be correct.

Jason

[PATCH][PR tree-optimization/78856] Invalidate cached iteration information when threading across multiple loop headers

2017-01-03 Thread Jeff Law



So as noted in the BZ comments the jump threading code has code that 
detects when a jump threading path wants to cross multiple loop headers 
and truncates the jump threading path in that case.


What we should have done instead is invalidate the cached loop information.

Additionally, this BZ shows that just looking at loop headers is not 
sufficient -- we might cross from a reducible to an irreducible region 
which is equivalent to crossing into another loop in that we need to 
invalidate the cached loop iteration information.


What's so damn funny here is that eventually we take nested loops and 
irreducible regions, thread various edges and end up with a nice natural 
loop and no irreducible regions in the end :-)  But the cached iteration 
information is still bogus.


Anyway, this patch corrects both issues.  It treats moving between a 
reducible and an irreducible region as crossing a loop header and it 
invalidates the cached iteration information rather than truncating the 
jump thread path.


Bootstrapped and regression tested on x86_64-linux-gnu.  That compiler 
was also used to build all the configurations in config-list.mk.


Installing on the trunk.  I could be convinced to install on the gcc-6 
branch as well since it's affected by the same problem.


Jeff

commit 93e3964a4664350446eefe786e3b73eb41d99036
Author: law 
Date:   Wed Jan 4 05:31:23 2017 +

PR tree-optimization/78856
* tree-ssa-threadupdate.c: Include tree-vectorizer.h.
(mark_threaded_blocks): Remove code to truncate thread paths that
cross multiple loop headers.  Instead invalidate the cached loop
iteration information and handle case of a thread path walking
into an irreducible region.

PR tree-optimization/78856
* gcc.c-torture/execute/pr78856.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244045 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3114e02..6b2888f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2017-01-03  Jeff Law  
+
+   PR tree-optimization/78856
+   * tree-ssa-threadupdate.c: Include tree-vectorizer.h.
+   (mark_threaded_blocks): Remove code to truncate thread paths that
+   cross multiple loop headers.  Instead invalidate the cached loop
+   iteration information and handle case of a thread path walking
+   into an irreducible region.
+
 2016-12-30  Michael Meissner  
 
PR target/78900
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cd2a065..cadfbc9 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2017-01-03  Jeff Law  
+
+   PR tree-optimization/78856
+   * gcc.c-torture/execute/pr78856.c: New test.
+
 2017-01-03  Michael Meissner  
 
PR target/78953
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c 
b/gcc/testsuite/gcc.c-torture/execute/pr78856.c
new file mode 100644
index 000..80f2317
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c
@@ -0,0 +1,25 @@
+extern void exit (int);
+
+int a, b, c, d, e, f[3]; 
+
+int main() 
+{
+  while (d)
+while (1)
+  ;
+  int g = 0, h, i = 0;
+  for (; g < 21; g += 9) 
+{
+  int j = 1;
+  for (h = 0; h < 3; h++)
+   f[h] = 1;
+  for (; j < 10; j++) {
+   d = i && (b ? 0 : c); 
+   i = 1;
+   if (g)
+ a = e;
+  }
+  }
+  exit (0);
+}
+
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index adbb6e0..2da93a8 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "dbgcnt.h"
 #include "tree-cfg.h"
+#include "tree-vectorizer.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
one or more in-edges to B to instead reach the destination of an
@@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks)
   /* Look for jump threading paths which cross multiple loop headers.
 
  The code to thread through loop headers will change the CFG in ways
- that break assumptions made by the loop optimization code.
-
- We don't want to blindly cancel the requests.  We can instead do better
- by trimming off the end of the jump thread path.  */
+ that invalidate the cached loop iteration information.  So we must
+ detect that case and wipe the cached information.  */
   EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
 {
   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
@@ -2102,26 +2101,16 @@ mark_threaded_blocks (bitmap threaded_blocks)
   i++)
{
  basic_block dest = (*path)[i]->e->dest;
+ basic_block src = (*path)[i]->e->src;
  crossed_headers += (dest

[doc,libstdc++] sourceforge.net defaults to https, adjust reference

2017-01-03 Thread Gerald Pfeifer

Applied.

Gerald

2017-01-03  Gerald Pfeifer  

* doc/xml/manual/documentation_hacking.xml: sourceforge.net now
defaults to https; adjust reference.

Index: doc/xml/manual/documentation_hacking.xml
===
--- doc/xml/manual/documentation_hacking.xml(revision 244022)
+++ doc/xml/manual/documentation_hacking.xml(working copy)
@@ -778,7 +778,7 @@
   
 
   
-   For epub output, the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://sourceforge.net/projects/docbook/files/epub3/">stylesheets</link> for EPUB3 are required. These stylesheets are still in development. To validate the created file, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://code.google.com/p/epubcheck/">epubcheck</link> is necessary.
+   For epub output, the <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://sourceforge.net/projects/docbook/files/epub3/">stylesheets</link> for EPUB3 are required. These stylesheets are still in development. To validate the created file, <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://code.google.com/p/epubcheck/">epubcheck</link> is necessary.

[doc,rfc] clean up preprocessor option documentation

2017-01-03 Thread Sandra Loosemore

This patch is another installment in my series to clean up the 
preprocessor documentation.  Most of the changes here are just 
copy-editing for style and markup and fixing/removing obvious bit-rot, 
but there's also one chunk that I've rewritten completely.


The rewrite involves combining the discussion of -I, -iquote, -isystem, 
and -idirafter to include a coherent explanation of how the preprocessor 
searches for header files.  I think that this belongs in the GCC manual 
and not just the preprocessor tutorial, and that just discussing the 
effect of each option in isolation is confusing and doesn't give users 
the "big picture".  The existing documentation may be confusing enough 
that I ended up getting something wrong in the rewrite, so I'm going to 
hold off on committing this for a day or two to give others a chance to 
review and comment first.


I'm aware that the tutorial material in cpp.texi also has similar 
issues, due to newer options not being fully integrated into the older 
discussion of the behavior of the preprocessor.  (E.g., it gives way too 
much emphasis to the deprecated -I- option.)  I'll tackle fixing that 
with a separate patch.


-Sandra
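
As a rough illustration of the lookup order that the rewritten
cppdiropts.texi text in this patch enumerates, here is a small C model
that builds the search chain from a list of options.  It is only a
sketch of the documented order, not cpplib code: the types and function
names are invented, a single string stands in for the standard system
directories, and the rule about a directory given with both -I and
-isystem is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the four directory options.  */
enum dir_kind { DIR_IQUOTE, DIR_I, DIR_ISYSTEM, DIR_IDIRAFTER };

struct dir_opt
{
  enum dir_kind kind;
  const char *dir;
};

/* Append every directory of KIND, in command-line order, to OUT.
   Returns the new number of entries; entries beyond CAP are dropped.  */
static size_t
append_kind (const struct dir_opt *opts, size_t n_opts, enum dir_kind kind,
             const char **out, size_t pos, size_t cap)
{
  for (size_t i = 0; i < n_opts; i++)
    if (opts[i].kind == kind && pos < cap)
      out[pos++] = opts[i].dir;
  return pos;
}

/* Build the search chain for #include "file" (QUOTE_FORM nonzero) or
   #include <file> (QUOTE_FORM zero).  STD_DIR stands in for the whole
   set of standard system directories.  */
static size_t
build_search_chain (const struct dir_opt *opts, size_t n_opts,
                    int quote_form, const char *current_dir,
                    const char *std_dir, const char **out, size_t cap)
{
  size_t pos = 0;
  if (quote_form)
    {
      /* Quote form only: directory of the current file, then -iquote.  */
      if (pos < cap)
        out[pos++] = current_dir;
      pos = append_kind (opts, n_opts, DIR_IQUOTE, out, pos, cap);
    }
  /* Both forms: -I, then -isystem, then standard, then -idirafter.  */
  pos = append_kind (opts, n_opts, DIR_I, out, pos, cap);
  pos = append_kind (opts, n_opts, DIR_ISYSTEM, out, pos, cap);
  if (pos < cap)
    out[pos++] = std_dir;
  pos = append_kind (opts, n_opts, DIR_IDIRAFTER, out, pos, cap);
  return pos;
}
```

In this model the position of an option on the command line only
matters relative to other options of the same kind, which is the "big
picture" the separate per-option descriptions did not convey.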

2017-01-03  Sandra Loosemore  

	gcc/
	* doc/cppdiropts.texi: Merge documentation of -I, -iquote,
	-isystem, and -idirafter.  Copy-edit.
	* doc/cppopts.texi: Copy-edit.  Remove contradiction about 
	default for -ftrack-macro-expansion.  Delete obsolete and 
	badly-formatted implementation details about -fdebug-cpp output.
	* doc/cppwarnopts.texi: Copy-edit.
Index: gcc/doc/cppdiropts.texi
===
--- gcc/doc/cppdiropts.texi	(revision 244009)
+++ gcc/doc/cppdiropts.texi	(working copy)
@@ -10,61 +10,85 @@
 @c formatted for inclusion in the CPP manual; otherwise the main GCC manual.
 
 @item -I @var{dir}
+@itemx -iquote @var{dir}
+@itemx -isystem @var{dir}
+@itemx -idirafter @var{dir}
 @opindex I
+@opindex iquote
+@opindex isystem
+@opindex idirafter
 Add the directory @var{dir} to the list of directories to be searched
-for header files.
+for header files during preprocessing.
 @ifset cppmanual
 @xref{Search Path}.
 @end ifset
-If you use more than
-one @option{-I} option, the directories are scanned in left-to-right
-order; the standard system directories come after.
+If @var{dir} begins with @samp{=}, then the @samp{=} is replaced
+by the sysroot prefix; see @option{--sysroot} and @option{-isysroot}.
 
-This can be used to override a system header
+Directories specified with @option{-iquote} apply only to the quote 
+form of the directive, @code{@w{#include "@var{file}"}}.
+Directories specified with @option{-I}, @option{-isystem}, 
+or @option{-idirafter} apply to lookup for both the
+@code{@w{#include "@var{file}"}} and
+@code{@w{#include <@var{file}>}} directives.
+
+You can specify any number or combination of these options on the 
+command line to search for header files in several directories.  
+The lookup order is as follows:
+
+@enumerate
+@item
+For the quote form of the include directive, the directory of the current
+file is searched first.
+
+@item
+For the quote form of the include directive, the directories specified
+by @option{-iquote} options are searched in left-to-right order,
+as they appear on the command line.
+
+@item
+Directories specified with @option{-I} options are scanned in
+left-to-right order.
+
+@item
+Directories specified with @option{-isystem} options are scanned in
+left-to-right order.
+
+@item
+Standard system directories are scanned.
+
+@item
+Directories specified with @option{-idirafter} options are scanned in
+left-to-right order.
+@end enumerate
+
+You can use @option{-I} to override a system header
 file, substituting your own version, since these directories are
-searched before the system header file directories.  However, you should
+searched before the standard system header file directories.  
+However, you should
 not use this option to add directories that contain vendor-supplied
-system header files (use @option{-isystem} for that).
+system header files; use @option{-isystem} for that.
+
+The @option{-isystem} and @option{-idirafter} options also mark the directory
+as a system directory, so that it gets the same special treatment that
+is applied to the standard system directories.
+@ifset cppmanual
+@xref{System Headers}.
+@end ifset
 
 If a standard system include directory, or a directory specified with
 @option{-isystem}, is also specified with @option{-I}, the @option{-I}
 option is ignored.  The directory is still searched but as a
 system directory at its normal position in the system include chain.
 This is to ensure that GCC's procedure to fix buggy system headers and
-the ordering for the @code{include_next} directive are not inadvertently changed.
+the ordering for the @code{#include_next} directive are not inadvertently
+changed.
 If you really need to change the search order for system

New Brazilian Portuguese PO file for 'cpplib' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Brazilian Portuguese team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/pt_BR.po

(This file, 'cpplib-7.1-b20170101.pt_BR.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-7.1-b20170101.pt_BR.po'

2017-01-03 Thread Translation Project Robot



cpplib-7.1-b20170101.pt_BR.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] Fix -fself-test ICE in non-english locale (PR bootstrap/77569)

2017-01-03 Thread David Malcolm

On Wed, 2017-01-04 at 00:01 +0100, Jakub Jelinek wrote:
> Hi!
> 
> The cb.error hook is called in the case we are looking for with
> _("conversion from %s to %s not supported by iconv")
> where _(msgid) is dgettext ("cpplib", msgid), so if performing
> -fself-test on iconv that doesn't support ebcdic in a locale that
> has translations for this string, gcc ICEs.
> 
> The following patch uses the same translation as libcpp to avoid
> that.
> 
> I've bootstrapped/regtested this on x86_64-linux and i686-linux, but
> there iconv doesn't fail, plus simulated in the debugger iconv error
> (both in LC_ALL=en_US.UTF-8 and LC_ALL=de_DE.UTF-8).  Ok for trunk?

Thanks for looking at this; OK for trunk.

I wonder if it makes sense to add a target to gcc/Makefile.in to run
the selftests in a non-English locale, to catch these things earlier?
Or to hardcode it to run in the C locale?

> 2017-01-03  Jakub Jelinek  
> 
>   PR bootstrap/77569
>   * input.c (ebcdic_execution_charset::on_error): Don't use strstr for
>   a substring of the message, but strcmp with the whole message.
>   Ifdef ENABLE_NLS, translate the message first using dgettext.
> 
> --- gcc/input.c.jj2017-01-01 12:45:37.0 +0100
> +++ gcc/input.c   2017-01-03 13:40:46.827595040 +0100
> @@ -2026,9 +2026,14 @@ class ebcdic_execution_charset : public
>  ATTRIBUTE_FPTR_PRINTF(5,0)
>{
>  gcc_assert (s_singleton);
> +/* Avoid exgettext from picking this up, it is translated in libcpp.  */
> +const char *msg = "conversion from %s to %s not supported by iconv";
> +#ifdef ENABLE_NLS
> +msg = dgettext ("cpplib", msg);
> +#endif
>  /* Detect and record errors emitted by libcpp/charset.c:init_iconv_desc
> when the local iconv build doesn't support the conversion.  */
> -if (strstr (msgid, "not supported by iconv"))
> +if (strcmp (msgid, msg) == 0)
>{
>   s_singleton->m_num_iconv_errors++;
>   return true;
> 
>   Jakub
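
To make the failure mode concrete, here is a self-contained C sketch of
why matching an English substring breaks once the message has been
translated, and why comparing against the translated whole message
works.  It does not call the real dgettext: toy_dgettext is a stand-in
for dgettext ("cpplib", msgid), and the "translated" string is made up
rather than taken from any actual .po file.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static const char MSG_EN[]
  = "conversion from %s to %s not supported by iconv";

/* Made-up stand-in for a catalog entry; not the real translation.  */
static const char MSG_XX[]
  = "Konvertierung von %s nach %s wird von iconv nicht unterstuetzt";

/* Stand-in for dgettext ("cpplib", msgid): returns the "translation"
   when a catalog is available, and MSGID itself otherwise.  */
static const char *
toy_dgettext (bool have_catalog, const char *msgid)
{
  if (have_catalog && strcmp (msgid, MSG_EN) == 0)
    return MSG_XX;
  return msgid;
}

/* Old approach: look for an English substring in the received message.  */
static bool
match_by_substring (const char *received)
{
  return strstr (received, "not supported by iconv") != NULL;
}

/* New approach: translate the expected message the same way the sender
   did, then compare the whole strings.  */
static bool
match_by_translation (bool have_catalog, const char *received)
{
  return strcmp (received, toy_dgettext (have_catalog, MSG_EN)) == 0;
}
```

When the sender's message went through the catalog, the substring test
misses it while the whole-message comparison still matches; without a
catalog both approaches agree.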

GCC patch committed: Fix -fdump-go-spec for enums

2017-01-03 Thread Ian Lance Taylor

This patch to GCC fixes the -fdump-go-spec output for enums.
Previously GCC always printed "int", which is often incorrect as the
Go type "int" is 64 bits on 64-bit hosts, while enums are typically
32 bits.  This was mostly harmless but caused the representation of
the ffi_cif type from libffi to be wrong, as the first field in that
type is an enum.  This caused the other fields to be misaligned from
the point of view of Go, which caused the garbage collector to not see
some pointers stored by the code in libgo/go/runtime/ffi.go.  These
pointers could then be collected early, causing a crash.  This patch
fixes the problem by simply treating enums as integers.  Bootstrapped
and ran Go and godump tests on x86_64-pc-linux-gnu.  Committed to
mainline.

Ian
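
The misalignment described above can be reproduced in miniature with
plain C.  The two structs below are hypothetical and only loosely
shaped like a struct with a leading enum; they are not the real ffi_cif
definition.  The second struct shows the layout a reader would assume
if the enum were described as a 64-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum standing in for the leading enum field.  */
enum toy_abi { TOY_ABI_FIRST, TOY_ABI_LAST };

/* Layout as the C compiler sees it: leading enum.  */
struct cif_as_c
{
  enum toy_abi abi;
  unsigned int nargs;
  void **arg_types;
};

/* Layout assumed when the enum is described as a 64-bit integer.  */
struct cif_as_misdescribed
{
  int64_t abi;
  unsigned int nargs;
  void **arg_types;
};

/* Number of bytes by which the assumed position of the pointer field
   differs from its real position.  A nonzero result means a reader
   using the misdescribed layout looks for the pointer in the wrong
   place.  */
static size_t
arg_types_skew (void)
{
  return offsetof (struct cif_as_misdescribed, arg_types)
         - offsetof (struct cif_as_c, arg_types);
}
```

A garbage collector that trusts the misdescribed layout would scan the
wrong offset for the pointer field, which is the kind of missed pointer
the message describes.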

gcc/ChangeLog:

2017-01-03  Ian Lance Taylor  

* godump.c (go_format_type): Treat ENUMERAL_TYPE like
INTEGER_TYPE.


gcc/testsuite/ChangeLog:

2017-01-03  Ian Lance Taylor  

* gcc.misc-tests/godump-1.c: Update for accurate representation of
enums.
Index: gcc/godump.c
===
--- gcc/godump.c(revision 244040)
+++ gcc/godump.c(working copy)
@@ -722,10 +722,6 @@ go_format_type (struct godump_container
 
   switch (TREE_CODE (type))
 {
-case ENUMERAL_TYPE:
-  obstack_grow (ob, "int", 3);
-  break;
-
 case TYPE_DECL:
   {
void **slot;
@@ -741,6 +737,7 @@ go_format_type (struct godump_container
   }
   break;
 
+case ENUMERAL_TYPE:
 case INTEGER_TYPE:
   {
const char *s;
Index: gcc/testsuite/gcc.misc-tests/godump-1.c
===
--- gcc/testsuite/gcc.misc-tests/godump-1.c (revision 244040)
+++ gcc/testsuite/gcc.misc-tests/godump-1.c (working copy)
@@ -373,7 +373,7 @@ enum { E11 };
 /* { dg-final { scan-file godump-1.out "(?n)^const _E11 = 0$" } } */
 
 enum { EV11 } e1_v1;
-/* { dg-final { scan-file godump-1.out "(?n)^var _e1_v1 int$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _e1_v1 u?int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _EV11 = 0$" } } */
 
 enum { E21, E22 };
@@ -381,7 +381,7 @@ enum { E21, E22 };
 /* { dg-final { scan-file godump-1.out "(?n)^const _E22 = 1$" } } */
 
 enum { EV21, EV22 } e2_v1;
-/* { dg-final { scan-file godump-1.out "(?n)^var _e2_v1 int$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _e2_v1 u?int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _EV21 = 0$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _EV22 = 1$" } } */
 
@@ -392,12 +392,12 @@ enum { EN1 = 3, EN2 = 77, EN3 = -1, EN4
 /* { dg-final { scan-file godump-1.out "(?n)^const _EN4 = 0$" } } */
 
 typedef enum { ET1, ET2 } et_t;
-/* { dg-final { scan-file godump-1.out "(?n)^type _et_t int$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^type _et_t u?int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _ET1 = 0$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _ET2 = 1$" } } */
 
 enum { ETV1, ETV2 } et_v1;
-/* { dg-final { scan-file godump-1.out "(?n)^var _et_v1 int$" } } */
+/* { dg-final { scan-file godump-1.out "(?n)^var _et_v1 u?int\[0-9\]*$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _ETV1 = 0$" } } */
 /* { dg-final { scan-file godump-1.out "(?n)^const _ETV2 = 1$" } } */

[PATCH] C++: fix fix-it hints for misspellings within explicit namespaces

2017-01-03 Thread David Malcolm

PR c++/77829 and PR c++/78656 identify an issue within the C++ frontend
where it issues nonsensical fix-it hints for misspelled name lookups
within an explicitly given namespace: it finds the closest name within
all namespaces, and uses the location of the namespace for the replacement,
rather than the name.

For example, for this error:

  #include 
  void* allocate(std::size_t n)
  {
return std::alocator().allocate(n);
  }

we currently emit an erroneous fix-it hint that would generate this
nonsensical patch:

   {
  -  return std::alocator().allocate(n);
  +  return allocate::alocator().allocate(n);
   }

whereas we ought to emit a fix-it hint that would generate this patch:

   {
  -  return std::alocator().allocate(n);
  +  return std::allocator().allocate(n);
   }

This patch fixes the suggestions, in two parts:

The incorrect name in the suggestion is fixed by introducing a
new function "suggest_alternative_in_explicit_scope"
for use by qualified_name_lookup_error when handling failures
in explicitly-given namespaces, looking for hint candidates within
just that namespace.  The function suggest_alternatives_for gains a
"suggest_misspellings" bool param, so that we can disable fuzzy name
lookup for the case where we've ruled out hint candidates in the
explicitly-given namespace.

This lets us suggest "allocator" (found in "std") rather than "allocate"
(found in the global ns).
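
The idea of restricting hint candidates to the explicitly given
namespace can be sketched as follows.  This is not the code from the
patch: edit_distance and closest_name_in_scope are hypothetical helpers,
the candidate lists are plain vectors of strings, and the distance
cutoff of 2 is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Classic Levenshtein edit distance, computed one row at a time.
// Illustrative only; this is not GCC's spellcheck implementation.
std::size_t
edit_distance (const std::string &a, const std::string &b)
{
  std::vector<std::size_t> row (b.size () + 1);
  for (std::size_t j = 0; j <= b.size (); ++j)
    row[j] = j;
  for (std::size_t i = 1; i <= a.size (); ++i)
    {
      std::size_t diag = row[0];	// previous row, column j - 1
      row[0] = i;
      for (std::size_t j = 1; j <= b.size (); ++j)
	{
	  std::size_t up = row[j];	// previous row, column j
	  std::size_t subst = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
	  row[j] = std::min ({ up + 1, row[j - 1] + 1, subst });
	  diag = up;
	}
    }
  return row[b.size ()];
}

// Hypothetical helper: return the closest name among the names declared
// in one scope, or "" if nothing is within MAX_DIST edits.
std::string
closest_name_in_scope (const std::string &typo,
		       const std::vector<std::string> &scope_names,
		       std::size_t max_dist = 2)
{
  assert (!typo.empty ());
  std::string best;
  std::size_t best_dist = max_dist + 1;
  for (const std::string &name : scope_names)
    {
      std::size_t d = edit_distance (typo, name);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = name;
	}
    }
  return best;
}
```

Given only the names of the explicit scope (say "allocator" and
"vector"), looking up "alocator" yields "allocator"; a global
"allocate" is never considered because it is not in the list.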

The patch fixes the location for the replacement by updating
local "unqualified_id" in cp_parser_id_expression from tree to
cp_expr to avoid implicitly dropping location information, and
passing this location to a new param of finish_id_expression.
There are multiple users of finish_id_expression, and it wasn't
possible to provide location information for the id for all of them
so the new location information is assumed to be optional there.

This fixes the underlined location, and ensures that the fix-it hint
replaces "alocator", rather than "std".

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/77829
PR c++/78656
* cp-tree.h (finish_id_expression): Add second location_t param.
(suggest_alternatives_for): Add bool param.
(suggest_alternative_in_explicit_scope): New decl.
* error.c (qualified_name_lookup_error): When SCOPE is a namespace
that isn't the global one, call new function
suggest_alternative_in_explicit_scope, only calling
suggest_alternatives_for if it fails, and disabling near match
searches for that case.  When SCOPE is the global namespace,
pass true for new param to suggest_alternatives_for to allow for
fuzzy name lookups.
* lex.c (unqualified_name_lookup_error): Pass true for new param
to suggest_alternatives_for.
* name-lookup.c (consider_binding_level): Add forward decl.
(suggest_alternatives_for): Add "suggest_misspellings" param,
using it to conditionalize the fuzzy name-lookup code.
(suggest_alternative_in_explicit_scope): New function.
* parser.c (cp_parser_primary_expression): Pass location of
id_expression to the new param of finish_id_expression.
(cp_parser_id_expression): Convert local "unqualified_id" from
tree to cp_expr to avoid implicitly dropping location information.
(cp_parser_lambda_introducer): Pass UNKNOWN_LOCATION to new param
to finish_id_expression.
(cp_parser_decltype_expr): Likewise.
* pt.c (tsubst_copy_and_build): Likewise.
* semantics.c (finish_id_expression): Document param "location".
Add param "id_location", using it for qualified_name_lookup_error
if it contains a known location.
(omp_reduction_lookup): Pass UNKNOWN_LOCATION to new param to
finish_id_expression.

gcc/testsuite/ChangeLog:
PR c++/77829
PR c++/78656
* g++.dg/spellcheck-pr77829.C: New test case.
* g++.dg/spellcheck-pr78656.C: New test case.
---
 gcc/cp/cp-tree.h  |   6 +-
 gcc/cp/error.c|   5 +-
 gcc/cp/lex.c  |   2 +-
 gcc/cp/name-lookup.c  |  55 --
 gcc/cp/parser.c   |  11 +-
 gcc/cp/pt.c   |   3 +-
 gcc/cp/semantics.c|  22 +++-
 gcc/testsuite/g++.dg/spellcheck-pr77829.C | 167 ++
 gcc/testsuite/g++.dg/spellcheck-pr78656.C |  39 +++
 9 files changed, 287 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-pr77829.C
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-pr78656.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f1a5835..ce71a20 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6462,7 +6462,8 @@ extern cp_expr finish_id_expression   (tree, 
tree, tree,
 bool, bool, bool

Contents of PO file 'cpplib-7.1-b20170101.fi.po'

2017-01-03 Thread Translation Project Robot



cpplib-7.1-b20170101.fi.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Finnish PO file for 'cpplib' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Finnish team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/fi.po

(This file, 'cpplib-7.1-b20170101.fi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[PATCH] move snprintf truncation warnings under own option ()

2017-01-03 Thread Martin Sebor


The -Wformat-length option warns about both overflow and truncation.
I had initially debated introducing two options, one for each of
the two kinds of problems, but decided to go with just one and
consider breaking it up based on feedback.

I feel that there has now been sufficient feedback (e.g., bugs 77708
and 78913) to justify breaking up the checkers and providing a new
option to control truncation independently.

The attached patch adds a new option, -Wformat-truncation=level that
accomplishes this.  At level 1 the new option only warns on certain
truncation, or on likely truncation in snprintf calls whose return
value is unused (using the return value suppresses the warning in
these cases).
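
As a rough illustration of why using the return value matters, here is
a small sketch of an snprintf call whose result is checked for
truncation.  The function format_id and its "id-%d" format are made up
for this example; only the standard snprintf contract (it returns the
length the complete output would have had) is relied upon.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format "id-<n>" into BUF, which holds SIZE bytes.  Return true if
// the complete text fit, false if snprintf had to truncate it.
bool
format_id (char *buf, std::size_t size, int n)
{
  assert (buf != nullptr && size > 0);
  // snprintf returns the number of characters the full output needs,
  // not counting the terminating NUL, even when it truncates.
  int needed = std::snprintf (buf, size, "id-%d", n);
  return needed >= 0 && static_cast<std::size_t> (needed) < size;
}
```

A caller that ignores the return value has no way to notice that a
4-byte buffer only received "id-"; a caller that checks it does.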

This change eliminates the -Wformat-length warnings from a build
of the Linux kernel, replacing their 43 instances with 32 of the
-Wformat-truncation warning.  With one exception, they're all for
snprintf calls whose return value is unused (and thus possible
sources of bugs).

If/when this patch is approved I'd like to rename -Wformat-length
to -Wformat-overflow to make the option's refined purpose clear,
and for consistency with the -Wstringop-overflow option.

Martin
PR tree-optimization/78913 - Probably misleading error reported by -Wformat-length
PR middle-end/77708 - -Wformat-length %s warns for snprintf

gcc/c-family/ChangeLog:

	PR tree-optimization/78913
	PR middle-end/77708
	* c.opt (-Wformat-truncation): New option.

gcc/testsuite/ChangeLog:

	PR tree-optimization/78913
	PR middle-end/77708
	* gcc.dg/tree-ssa/builtin-snprintf-warn-1.c: New test.
	* gcc.dg/tree-ssa/builtin-snprintf-warn-2.c: New test.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-6.c: XFAIL test cases failing
	due to bug 78969.

gcc/ChangeLog:

	PR tree-optimization/78913
	PR middle-end/77708
	* doc/invoke.texi (Warning Options): Document -Wformat-truncation.
	* gimple-ssa-sprintf.c (call_info::reval_used, call_info::warnopt):
	New member functions.
	(format_directive): Use them.
	(add_bytes): Same.
	(pass_sprintf_length::handle_gimple_call): Same.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 3c06aec..849634c 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -537,6 +537,11 @@ Wformat-signedness
 C ObjC C++ ObjC++ Var(warn_format_signedness) Warning
 Warn about sign differences with format functions.
 
+Wformat-truncation
+C ObjC C++ ObjC++ Warning Alias(Wformat-truncation=, 1, 0)
+Warn about calls to snprintf and similar functions that truncate output.
+Same as -Wformat-truncation=1.
+
 Wformat-y2k
 C ObjC C++ ObjC++ Var(warn_format_y2k) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=,warn_format >= 2, 0)
 Warn about strftime formats yielding 2-digit years.
@@ -554,6 +559,10 @@ C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_format_length) Warning
 Warn about function calls with format strings that write past the end
 of the destination region.
 
+Wformat-truncation=
+C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=, warn_format >= 1, 0)
+Warn about calls to snprintf and similar functions that truncate output.
+
 Wignored-qualifiers
 C C++ Var(warn_ignored_qualifiers) Warning EnabledBy(Wextra)
 Warn whenever type qualifiers are ignored.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a8f8efe..2ae265a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -276,7 +276,8 @@ Objective-C and Objective-C++ Dialects}.
 -Werror  -Werror=* -Wfatal-errors -Wfloat-equal  -Wformat  -Wformat=2 @gol
 -Wno-format-contains-nul -Wno-format-extra-args -Wformat-length=@var{n} @gol
 -Wformat-nonliteral @gol
--Wformat-security  -Wformat-signedness  -Wformat-y2k -Wframe-address @gol
+-Wformat-security  -Wformat-signedness -Wformat-truncation=@var{n} @gol
+-Wformat-y2k -Wframe-address @gol
 -Wframe-larger-than=@var{len} -Wno-free-nonheap-object -Wjump-misses-init @gol
 -Wignored-qualifiers  -Wignored-attributes  -Wincompatible-pointer-types @gol
 -Wimplicit  -Wimplicit-fallthrough  -Wimplicit-fallthrough=@var{n} @gol
@@ -3959,10 +3960,9 @@ Unix Specification says that such unused arguments are allowed.
 @opindex Wformat-length
 @opindex Wno-format-length
 Warn about calls to formatted input/output functions such as @code{sprintf}
-that might overflow the destination buffer, or about bounded functions such
-as @code{snprintf} that might result in output truncation.  When the exact
-number of bytes written by a format directive cannot be determined at
-compile-time it is estimated based on heuristics that depend on the
+and @code{vsprintf} that might overflow the destination buffer.  When the
+exact number of bytes written by a format directive cannot be determined
+at compile-time it is estimated based on heuristics that depend on the
 @var{level} argument and on optimization.  While enabling optimization
 will in most cases improve the accuracy of the warning, it may also
 result in false positives.
@@ -3974,15 +3974,14 @@ result in

[PATCH] Change DWARF5 .debug_loclists location description sizes from 2-byte length to uleb128 lengths

2017-01-03 Thread Jakub Jelinek

Hi!

http://dwarfstd.org/ShowIssue.php?issue=161102.1
got accepted today, so DWARF5 is going to use uleb128 sizes instead of
2-byte sizes in .debug_loclists section.
On a randomly chosen *.i file I had around, this results in shrinking
of .debug_loclists section size from 0xef7df to 0xddd65, so around 7.5%
saving, not too bad.
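
For readers unfamiliar with the encoding, here is a minimal sketch of
unsigned LEB128.  It is a standalone illustration, not the code GCC
uses to emit the section; encode_uleb128 and decode_uleb128 are names
chosen for this example.  Sizes below 128 take one byte instead of two,
and sizes of 64K or more remain representable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode VALUE as unsigned LEB128: 7 payload bits per byte, least
// significant group first, high bit set on every byte but the last.
std::vector<std::uint8_t>
encode_uleb128 (std::uint64_t value)
{
  std::vector<std::uint8_t> out;
  do
    {
      std::uint8_t byte = value & 0x7f;
      value >>= 7;
      if (value != 0)
	byte |= 0x80;
      out.push_back (byte);
    }
  while (value != 0);
  return out;
}

// Decode one unsigned LEB128 value from the start of BYTES.
std::uint64_t
decode_uleb128 (const std::vector<std::uint8_t> &bytes)
{
  std::uint64_t result = 0;
  unsigned shift = 0;
  for (std::uint8_t b : bytes)
    {
      assert (shift < 64);
      result |= static_cast<std::uint64_t> (b & 0x7f) << shift;
      shift += 7;
      if ((b & 0x80) == 0)
	break;
    }
  return result;
}
```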

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Jan/Mark, are you going to adjust the consumers accordingly?  Thanks.

2017-01-03  Jakub Jelinek  

* dwarf2out.c (output_loc_list): Don't throw away 64K+ location
descriptions for -gdwarf-5 and emit them as uleb128 instead of
2-byte data.

--- gcc/dwarf2out.c.jj  2017-01-03 19:41:45.0 +0100
+++ gcc/dwarf2out.c 2017-01-03 20:58:21.304628767 +0100
@@ -9590,7 +9590,7 @@ output_loc_list (dw_loc_list_ref list_he
 perhaps put it into DW_TAG_dwarf_procedure and refer to that
 in the expression, but >= 64KB expressions for a single value
 in a single range are unlikely very useful.  */
-  if (size > 0x)
+  if (dwarf_version < 5 && size > 0x)
continue;
   if (dwarf_version >= 5)
{
@@ -9642,8 +9642,6 @@ output_loc_list (dw_loc_list_ref list_he
  if (strcmp (curr2->begin, curr2->end) == 0
  && !curr2->force)
continue;
- if ((unsigned long) size_of_locs (curr2->expr) > 0x)
-   continue;
  break;
}
  if (curr2 == NULL || curr->section != curr2->section)
@@ -9744,8 +9742,13 @@ output_loc_list (dw_loc_list_ref list_he
}
 
   /* Output the block length for this list of location operations.  */
-  gcc_assert (size <= 0x);
-  dw2_asm_output_data (2, size, "%s", "Location expression size");
+  if (dwarf_version >= 5)
+   dw2_asm_output_data_uleb128 (size, "Location expression size");
+  else
+   {
+ gcc_assert (size <= 0x);
+ dw2_asm_output_data (2, size, "Location expression size");
+   }
 
   output_loc_sequence (curr->expr, -1);
 }

Jakub

[PATCH] Remove padding from DWARF5 headers

2017-01-03 Thread Jakub Jelinek

Hi!

http://dwarfstd.org/ShowIssue.php?issue=161031.2
got approved today, so DWARF5 is changing and the various DW_UT_* kinds
will no longer have the same size of the headers.  So,
DW_UT_compile/DW_UT_partial shrinks by 12/16 bytes (padding 1 and padding 2
is removed; 16 bytes for 64-bit DWARF), DW_UT_type remains the same,
DW_UT_skeleton/DW_UT_split_compile shrink by 4/8 bytes (padding 2 is
removed).  For DW_UT_* kinds consumers don't understand, the first 3 fields
(length, version and ut kind) are required to be present and the only
sensible action is to skip the whole unit (using length field).
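
The sizes quoted above can be checked with a small sketch that mirrors
the macros touched by the patch.  The helper names are invented for
this example; the constants assume the usual DWARF sizes (initial
length of 4 or 12 bytes, offsets of 4 or 8 bytes, 8-byte type
signature and DWO id).

```cpp
#include <cassert>

// Which DWARF version and offset width is being emitted.
struct dwarf_format
{
  int version;
  bool is_64bit;
};

constexpr int initial_length_size (dwarf_format f) { return f.is_64bit ? 12 : 4; }
constexpr int offset_size (dwarf_format f) { return f.is_64bit ? 8 : 4; }
constexpr int type_signature_size = 8;

// Compile unit header size before the patch: for DWARF 5 it included
// "padding 1" (8 bytes) and "padding 2" (one offset).
constexpr int
old_cu_header_size (dwarf_format f)
{
  return initial_length_size (f) + offset_size (f)
	 + (f.version >= 5 ? 4 + type_signature_size + offset_size (f) : 3);
}

// Compile unit header size after the patch: no padding.
constexpr int
cu_header_size (dwarf_format f)
{
  return initial_length_size (f) + offset_size (f) + (f.version >= 5 ? 4 : 3);
}

// Type unit header: compile unit header plus signature and type offset.
constexpr int
type_unit_header_size (dwarf_format f)
{
  return cu_header_size (f) + type_signature_size + offset_size (f);
}

// Skeleton unit header: compile unit header plus the 8-byte DWO id
// for DWARF 5.
constexpr int
skeleton_header_size (dwarf_format f)
{
  return cu_header_size (f) + (f.version >= 5 ? 8 : 0);
}

static_assert (cu_header_size ({4, false}) == 11, "DWARF 4, 32-bit");
```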

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Jan/Mark, are you going to adjust GDB/elfutils etc. correspondingly?

2017-01-03  Jakub Jelinek  

* dwarf2out.c (DWARF_COMPILE_UNIT_HEADER_SIZE): For DWARF5 decrease
by 12.
(DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE): Always
DWARF_COMPILE_UNIT_HEADER_SIZE plus 12.
(DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE): Define.
(calc_base_type_die_sizes): Use DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
for initial die_offset if dwarf_split_debug_info.
(output_comp_unit): Use DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE for
initial next_die_offset if dwo_id is non-NULL.  Don't emit padding
fields.
(output_skeleton_debug_sections): Formatting fix.  Use
DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE instead of
DWARF_COMPILE_UNIT_HEADER_SIZE.  Don't emit padding.

--- gcc/dwarf2out.c.jj  2017-01-03 16:04:17.0 +0100
+++ gcc/dwarf2out.c 2017-01-03 19:41:45.526194592 +0100
@@ -2996,14 +2996,16 @@ skeleton_chain_node;
 /* Fixed size portion of the DWARF compilation unit header.  */
 #define DWARF_COMPILE_UNIT_HEADER_SIZE \
   (DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE   \
-   + (dwarf_version >= 5   \
-  ? 4 + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE : 3))
+   + (dwarf_version >= 5 ? 4 : 3))
 
 /* Fixed size portion of the DWARF comdat type unit header.  */
 #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \
   (DWARF_COMPILE_UNIT_HEADER_SIZE  \
-   + (dwarf_version >= 5   \
-  ? 0 : DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE))
+   + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE)
+
+/* Fixed size portion of the DWARF skeleton compilation unit header.  */
+#define DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE \
+  (DWARF_COMPILE_UNIT_HEADER_SIZE + (dwarf_version >= 5 ? 8 : 0))
 
 /* Fixed size portion of public names info.  */
 #define DWARF_PUBNAMES_HEADER_SIZE (2 * DWARF_OFFSET_SIZE + 2)
@@ -9044,7 +9046,9 @@ calc_die_sizes (dw_die_ref die)
 static void
 calc_base_type_die_sizes (void)
 {
-  unsigned long die_offset = DWARF_COMPILE_UNIT_HEADER_SIZE;
+  unsigned long die_offset = (dwarf_split_debug_info
+ ? DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
+ : DWARF_COMPILE_UNIT_HEADER_SIZE);
   unsigned int i;
   dw_die_ref base_type;
 #if ENABLE_ASSERT_CHECKING
@@ -10302,7 +10306,9 @@ output_comp_unit (dw_die_ref die, int ou
   delete extern_map;
 
   /* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
-  next_die_offset = DWARF_COMPILE_UNIT_HEADER_SIZE;
+  next_die_offset = (dwo_id
+? DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
+: DWARF_COMPILE_UNIT_HEADER_SIZE);
   calc_die_sizes (die);
 
   oldsym = die->die_id.die_symbol;
@@ -10330,12 +10336,6 @@ output_comp_unit (dw_die_ref die, int ou
   if (dwo_id != NULL)
for (int i = 0; i < 8; i++)
  dw2_asm_output_data (1, dwo_id[i], i == 0 ? "DWO id" : NULL);
-  else
-   /* Hope all the padding will be removed for DWARF 5 final for
-  DW_AT_compile and DW_AT_partial.  */
-   dw2_asm_output_data (8, 0, "Padding 1");
-
-  dw2_asm_output_data (DWARF_OFFSET_SIZE, 0, "Padding 2");
 }
   output_die (die);
 
@@ -10430,10 +10430,11 @@ output_skeleton_debug_sections (dw_die_r
  header.  */
   if (DWARF_INITIAL_LENGTH_SIZE - DWARF_OFFSET_SIZE == 4)
 dw2_asm_output_data (4, 0x,
-  "Initial length escape value indicating 64-bit DWARF extension");
+"Initial length escape value indicating 64-bit "
+"DWARF extension");
 
   dw2_asm_output_data (DWARF_OFFSET_SIZE,
-   DWARF_COMPILE_UNIT_HEADER_SIZE
+  DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
- DWARF_INITIAL_LENGTH_SIZE
+ size_of_die (comp_unit),
   "Length of Compilation Unit Info");
@@ -10449,12 +10450,8 @@ output_skeleton_debug_sections (dw_die_r
   if (dwarf_version < 5)
 dw2_asm_output_data (1, DWARF2_ADDR_SIZE, "Pointer Size (in bytes)");
   else
-{
-  for (int i = 0; i < 8; i++)
-

[PATCH] Fix -fself-test ICE in non-english locale (PR bootstrap/77569)

2017-01-03 Thread Jakub Jelinek

Hi!

The cb.error hook is called in the case we are looking for with
_("conversion from %s to %s not supported by iconv")
where _(msgid) is dgettext ("cpplib", msgid), so if performing -fself-test
on iconv that doesn't support ebcdic in a locale that has translations
for this string, gcc ICEs.

The following patch uses the same translation as libcpp to avoid that.
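
Why the substring test breaks once the message is translated can be
shown with a small sketch.  Everything here is a stand-in:
fake_translate plays the role of dgettext ("cpplib", msgid), and the
"translated" string is an invented placeholder, not a real catalog
entry.

```cpp
#include <cassert>
#include <cstring>

// The untranslated msgid, as in the patch.
static const char *const iconv_msgid
  = "conversion from %s to %s not supported by iconv";

// Hypothetical stand-in for dgettext ("cpplib", msgid).  The returned
// placeholder is invented for this example.
const char *
fake_translate (const char *msgid, bool translated_locale)
{
  assert (msgid != nullptr);
  if (translated_locale && std::strcmp (msgid, iconv_msgid) == 0)
    return "<placeholder translation> %s -> %s";
  return msgid;
}

// Old check: look for an English substring in the reported message.
bool
old_check (const char *reported)
{
  return std::strstr (reported, "not supported by iconv") != nullptr;
}

// New check: translate the msgid the same way the reporter does and
// compare the whole string.
bool
new_check (const char *reported, bool translated_locale)
{
  return std::strcmp (reported,
		      fake_translate (iconv_msgid, translated_locale)) == 0;
}
```

In the untranslated case both checks match; in the translated case only
the whole-string comparison against the translated msgid does.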

I've bootstrapped/regtested this on x86_64-linux and i686-linux, but iconv
doesn't fail there, so I also simulated the iconv error in the debugger (both
in LC_ALL=en_US.UTF-8 and LC_ALL=de_DE.UTF-8).  Ok for trunk?

2017-01-03  Jakub Jelinek  

PR bootstrap/77569
* input.c (ebcdic_execution_charset::on_error): Don't use strstr for
a substring of the message, but strcmp with the whole message.  Ifdef
ENABLE_NLS, translate the message first using dgettext.

--- gcc/input.c.jj  2017-01-01 12:45:37.0 +0100
+++ gcc/input.c 2017-01-03 13:40:46.827595040 +0100
@@ -2026,9 +2026,14 @@ class ebcdic_execution_charset : public
 ATTRIBUTE_FPTR_PRINTF(5,0)
   {
 gcc_assert (s_singleton);
+/* Avoid exgettext from picking this up, it is translated in libcpp.  */
+const char *msg = "conversion from %s to %s not supported by iconv";
+#ifdef ENABLE_NLS
+msg = dgettext ("cpplib", msg);
+#endif
 /* Detect and record errors emitted by libcpp/charset.c:init_iconv_desc
when the local iconv build doesn't support the conversion.  */
-if (strstr (msgid, "not supported by iconv"))
+if (strcmp (msgid, msg) == 0)
   {
s_singleton->m_num_iconv_errors++;
return true;

Jakub

libgo patch committed: Eliminate __go_alloc and __go_free

2017-01-03 Thread Ian Lance Taylor

This patch to libgo eliminates the __go_alloc and __go_free functions,
in favor of fully typed memory allocation (except for one remaining
call in parfor.c, which will go away later).  Implementing this was
simplified by moving the allgs slice from C to Go, which involved
moving a few functions.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 244031)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-eac28020ee4b2532d4cd43f448fe612e84e0a108
+dfe446c5a54ca0febabb81b542cc4e634c6f5c30
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/mprof.go
===
--- libgo/go/runtime/mprof.go   (revision 243084)
+++ libgo/go/runtime/mprof.go   (working copy)
@@ -556,7 +556,7 @@ func GoroutineProfile(p []StackRecord) (
stopTheWorld("profile")
 
n = 1
-   for _, gp1 := range allgs() {
+   for _, gp1 := range allgs {
if isOK(gp1) {
n++
}
@@ -571,7 +571,7 @@ func GoroutineProfile(p []StackRecord) (
r = r[1:]
 
// Save other goroutines.
-   for _, gp1 := range allgs() {
+   for _, gp1 := range allgs {
if isOK(gp1) {
if len(r) == 0 {
// Should be impossible, but better to return a
Index: libgo/go/runtime/proc.go
===
--- libgo/go/runtime/proc.go(revision 243805)
+++ libgo/go/runtime/proc.go(working copy)
@@ -11,15 +11,18 @@ import (
 
 // Functions temporarily called by C code.
 //go:linkname newextram runtime.newextram
+//go:linkname checkdead runtime.checkdead
+//go:linkname schedtrace runtime.schedtrace
+//go:linkname allgadd runtime.allgadd
 
 // Functions temporarily in C that have not yet been ported.
 func allocm(*p, bool, *unsafe.Pointer, *uintptr) *m
 func malg(bool, bool, *unsafe.Pointer, *uintptr) *g
-func allgadd(*g)
 
 // C functions for ucontext management.
 func setGContext()
 func makeGContext(*g, unsafe.Pointer, uintptr)
+func getTraceback(me, gp *g)
 
 // main_init_done is a signal used by cgocallbackg that initialization
 // has been completed. It is made before _cgo_notify_runtime_init_done,
@@ -27,6 +30,39 @@ func makeGContext(*g, unsafe.Pointer, ui
 // it is closed, meaning cgocallbackg can reliably receive from it.
 var main_init_done chan bool
 
+var (
+   allgs[]*g
+   allglock mutex
+)
+
+func allgadd(gp *g) {
+   if readgstatus(gp) == _Gidle {
+   throw("allgadd: bad status Gidle")
+   }
+
+   lock()
+   allgs = append(allgs, gp)
+   allglen = uintptr(len(allgs))
+
+   // Grow GC rescan list if necessary.
+   if len(allgs) > cap(work.rescan.list) {
+   lock()
+   l := work.rescan.list
+   // Let append do the heavy lifting, but keep the
+   // length the same.
+   work.rescan.list = append(l[:cap(l)], 0)[:len(l)]
+   unlock()
+   }
+   unlock()
+}
+
+// All reads and writes of g's status go through readgstatus, casgstatus
+// castogscanstatus, casfrom_Gscanstatus.
+//go:nosplit
+func readgstatus(gp *g) uint32 {
+   return atomic.Load()
+}
+
 // If asked to move to or from a Gscanstatus this will throw. Use the castogscanstatus
 // and casfrom_Gscanstatus instead.
 // casgstatus will loop if the g->atomicstatus is in a Gscan status until the routine that
@@ -328,3 +364,170 @@ func lockextra(nilokay bool) *m {
 func unlockextra(mp *m) {
atomic.Storeuintptr(, uintptr(unsafe.Pointer(mp)))
 }
+
+// Check for deadlock situation.
+// The check is based on number of running M's, if 0 -> deadlock.
+func checkdead() {
+   // For -buildmode=c-shared or -buildmode=c-archive it's OK if
+   // there are no running goroutines. The calling program is
+   // assumed to be running.
+   if islibrary || isarchive {
+   return
+   }
+
+   // If we are dying because of a signal caught on an already idle thread,
+   // freezetheworld will cause all running threads to block.
+   // And runtime will essentially enter into deadlock state,
+   // except that there is a thread that will call exit soon.
+   if panicking > 0 {
+   return
+   }
+
+   // -1 for sysmon
+   run := sched.mcount - sched.nmidle - sched.nmidlelocked - 1
+   if run > 0 {
+   return
+   }
+   if run < 0 {
+   print("runtime: checkdead: nmidle=", sched.nmidle, " 
nmidlelocked=", sched.nmidlelocked, " mcount=", sched.mcount, "\n")
+   throw("checkdead:

[C++ PATCH] Avoid UB in cp_lexer_previous_token (PR c++/71182)

2017-01-03 Thread Jakub Jelinek

Hi!

cp_lexer_new_from_tokens creates lexer that has NULL lexer->buffer,
calling lexer->buffer->address () therefore is UB (diagnosed by
--with-boot-config=bootstrap-ubsan).

The following patch fixes this, or Markus offered
  gcc_assert (!lexer->buffer || tp != lexer->buffer->address ());
instead.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk (or do you prefer Markus' version)?
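
The difference between the two forms can be sketched with a
null-tolerant address helper.  This is an analogy built on std::vector,
not GCC's vec implementation; safe_address, token and
previous_unpurged are names made up for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical analogue of a null-tolerant "address of the first
// element" helper: calling a member function through a null pointer is
// undefined, so test the pointer before dereferencing it.
template <typename T>
T *
safe_address (std::vector<T> *v)
{
  return v ? v->data () : nullptr;
}

struct token
{
  bool purged_p;
  int id;
};

// Walk backwards from TP past purged tokens.  BUFFER may be null when
// the tokens do not live in a buffer owned by the caller.
const token *
previous_unpurged (const token *tp, std::vector<token> *buffer)
{
  while (tp->purged_p)
    {
      assert (tp != safe_address (buffer));
      --tp;
    }
  return tp;
}
```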

2017-01-03  Jakub Jelinek  

PR c++/71182
* parser.c (cp_lexer_previous_token): Use vec_safe_address in the
assertion, as lexer->buffer may be NULL.

* g++.dg/cpp0x/pr71182.C: New test.

--- gcc/cp/parser.c.jj  2017-01-03 09:58:13.0 +0100
+++ gcc/cp/parser.c 2017-01-03 11:39:56.940595981 +0100
@@ -766,7 +766,7 @@ cp_lexer_previous_token (cp_lexer *lexer
   /* Skip past purged tokens.  */
   while (tp->purged_p)
 {
-  gcc_assert (tp != lexer->buffer->address ());
+  gcc_assert (tp != vec_safe_address (lexer->buffer));
   tp--;
 }
 
--- gcc/testsuite/g++.dg/cpp0x/pr71182.C.jj 2017-01-03 11:44:19.400246340 +0100
+++ gcc/testsuite/g++.dg/cpp0x/pr71182.C 2017-01-03 11:44:04.795432815 +0100
@@ -0,0 +1,12 @@
+// PR c++/71182
+// { dg-do compile { target c++11 } }
+
+class A {
+  template  void As();
+};
+template  class B : A {
+  void f() {
+A *g ;
+g ? g->As() : nullptr;
+  }
+};

Jakub

[C++ PATCH] Reject invalid auto foo (), a = 5;

2017-01-03 Thread Jakub Jelinek

Hi!

C++14 and above say the following about the auto specifier in [dcl.spec.auto]/7:
"If the init-declarator-list contains more than one
init-declarator, they shall all form declarations of variables."

The following patch attempts to reject this.  Bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?

struct A
{
  auto foo(), bar();
};

auto A::foo() { return 1; }
auto A::bar() { return 2; }

isn't rejected though, is that invalid too?
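
The rule can be modelled as a small state machine over the declarators
of one declaration.  The sketch below is a simplified model of the
three-state variable used in the patch, with invented names; it counts
how many diagnostics would be issued for a given sequence of
declarators.

```cpp
#include <cassert>
#include <vector>

enum class declarator_kind { variable, function };

// What has been seen so far in the init-declarator-list.
enum class seen
{
  nothing,		// first declarator not processed yet
  function_first,	// the first declarator was a function
  functions_forbidden	// a variable came first, or an error was issued
};

// Count the diagnostics for one declaration with a placeholder type.
int
count_diagnostics (const std::vector<declarator_kind> &decls)
{
  seen state = seen::nothing;
  int errors = 0;
  for (declarator_kind d : decls)
    {
      bool is_fn = d == declarator_kind::function;
      if (state != seen::nothing
	  && (is_fn || state != seen::functions_forbidden))
	{
	  // A function mixed with other declarators.
	  ++errors;
	  state = seen::functions_forbidden;
	}
      else if (state == seen::nothing)
	state = is_fn ? seen::function_first : seen::functions_forbidden;
    }
  assert (errors <= static_cast<int> (decls.size ()));
  return errors;
}
```

A single function declarator, or any number of variables, produces no
diagnostic; mixing a function with anything else produces at least one.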

2017-01-03  Jakub Jelinek  

* parser.c (cp_parser_simple_declaration): Diagnose function
declaration among more than one init-declarators with auto
specifier.

* g++.dg/cpp1y/auto-fn34.C: New test.

--- gcc/cp/parser.c.jj  2017-01-03 08:12:27.0 +0100
+++ gcc/cp/parser.c 2017-01-03 09:58:13.336703629 +0100
@@ -12723,8 +12723,17 @@ cp_parser_simple_declaration (cp_parser*
   break;
 
   tree last_type;
+  bool auto_specifier_p;
+  /* NULL_TREE if both variable and function declaration are allowed,
+ error_mark_node if function declaration are not allowed and
+ a FUNCTION_DECL that should be diagnosed if it is followed by
+ variable declarations.  */
+  tree auto_function_declaration;
 
   last_type = NULL_TREE;
+  auto_specifier_p
+= decl_specifiers.type && type_uses_auto (decl_specifiers.type);
+  auto_function_declaration = NULL_TREE;
 
   /* Keep going until we hit the `;' at the end of the simple
  declaration.  */
@@ -12770,6 +12779,27 @@ cp_parser_simple_declaration (cp_parser*
   if (cp_parser_error_occurred (parser))
goto done;
 
+  if (auto_specifier_p && cxx_dialect >= cxx14)
+   {
+ /* If the init-declarator-list contains more than one
+init-declarator, they shall all form declarations of
+variables.  */
+ if (auto_function_declaration
+ && (TREE_CODE (decl) == FUNCTION_DECL
+ || auto_function_declaration != error_mark_node))
+   {
+ error_at (decl_specifiers.locations[ds_type_spec],
+   "non-variable %qD in declaration with more than one "
+   "declarator with placeholder type",
+   TREE_CODE (decl) == FUNCTION_DECL
+   ? decl : auto_function_declaration);
+ auto_function_declaration = error_mark_node;
+   }
+ else if (auto_function_declaration == NULL_TREE)
+   auto_function_declaration
+ = TREE_CODE (decl) == FUNCTION_DECL ? decl : error_mark_node;
+   }
+
   if (auto_result)
{
  if (last_type && last_type != error_mark_node
--- gcc/testsuite/g++.dg/cpp1y/auto-fn34.C.jj   2017-01-03 09:51:21.208086328 +0100
+++ gcc/testsuite/g++.dg/cpp1y/auto-fn34.C  2017-01-03 09:48:06.0 +0100
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++14 } }
+
+auto f1 ();
+auto a = 5, f2 (); // { dg-error "in declaration with more than one declarator" }
+auto f3 (), b = 6; // { dg-error "in declaration with more than one declarator" }
+auto f4 (), f5 (), f6 ();  // { dg-error "in declaration with more than one declarator" }
+auto f1 () { return 3; }
+auto f2 () { return 4; }
+auto f3 () { return 5; }
+auto f4 () { return 6; }
+auto f5 () { return 7; }
+auto f6 () { return 8; }

Jakub

Re: [PATCH], PR target/78953, Fix power9 insn does not meet its constraints

2017-01-03 Thread David Edelsohn

On Tue, Jan 3, 2017 at 4:43 PM, Michael Meissner
 wrote:
> In building Spec 2006 with -mcpu=power9 with -O3, two of the benchmarks (gamess
> and calculix) did not build due to an "insn does not match its constraints"
> error.
>
> (insn 2674 2673 2675 37 (parallel [
> (set (reg:SI 0 0 [985])
> (vec_select:SI (reg:V4SI 32 0 [orig:378 vect__50.42 ] [378])
> (parallel [
> (const_int 1 [0x1])
> ])))
> (clobber (reg:SI 31 31 [986]))
> ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1184 {vsx_extract_v4si_p9}
>  (expr_list:REG_UNUSED (reg:SI 31 31 [986])
> (nil)))
>
> This insn was formed by vsx_extract_v4si_store_p9 splitting the following insn
> after register allocation:
>
> (insn 376 374 378 32 (parallel [
> (set (mem:SI (plus:DI (reg:DI 7 7 [orig:394 ivtmp.316 ] [394])
> (const_int 112 [0x70])) [3 MEM[base: _399, offset: 
> 112B]+0 S4 A32])
> (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
> (parallel [
> (const_int 2 [0x2])
> ])))
> (clobber (reg:SI 9 9 [675]))
> (clobber (reg:SI 10 10 [676]))
> ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1191 
> {*vsx_extract_v4si_store_p9}
>  (nil))
>
> It split it to:
>
> (insn 968 381 969 32 (parallel [
> (set (reg:SI 44 12 [671])
> (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
> (parallel [
> (const_int 0 [0])
> ])))
> (clobber (scratch:SI))
> ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1185 {vsx_extract_v4si_p9}
>  (nil))
>
> Unfortunately, when it is splitting a word extract to be deposited into a GPR
> register, it needs to use a traditional Altivec register.
>
> The following patch fixes this:
>
> [gcc]
> 2017-01-03  Michael Meissner  
>
> PR target/78953
> * config/rs6000/vsx.md (vsx_extract__store_p9): If we are
> extracting SImode to a GPR register so that we can generate a
> store, limit the vector to be in a traditional Altivec register
> for the vextuwrx instruction.
>
> [gcc/testsuite]
> 2017-01-03  Michael Meissner  
>
> PR target/78953
> * gcc.target/powerpc/pr78953.c: New test.
>
> I did the usual bootstrap and make check with no regression on a little endian
> power8 system.  I also compiled the two Spec 2006 benchmarks that failed and
> they now build.  Is this ok for the trunk?  It does not need to be applied to
> GCC 6.x since the word extract optimization is new to GCC 7.

Okay.

Thanks, David

[PATCH], PR target/78953, Fix power9 insn does not meet its constraints

2017-01-03 Thread Michael Meissner

In building Spec 2006 with -mcpu=power9 with -O3, two of the benchmarks (gamess
and calculix) did not build due to an "insn does not match its constraints"
error.

(insn 2674 2673 2675 37 (parallel [
(set (reg:SI 0 0 [985])
(vec_select:SI (reg:V4SI 32 0 [orig:378 vect__50.42 ] [378])
(parallel [
(const_int 1 [0x1])
])))
(clobber (reg:SI 31 31 [986]))
]) "SPOOLES/MSMD/src/MSMD_init.c":113 1184 {vsx_extract_v4si_p9}
 (expr_list:REG_UNUSED (reg:SI 31 31 [986])
(nil)))

This insn was formed by vsx_extract_v4si_store_p9 splitting the following insn
after register allocation:

(insn 376 374 378 32 (parallel [
(set (mem:SI (plus:DI (reg:DI 7 7 [orig:394 ivtmp.316 ] [394])
(const_int 112 [0x70])) [3 MEM[base: _399, offset: 112B]+0 S4 A32])
(vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
(parallel [
(const_int 2 [0x2])
])))
(clobber (reg:SI 9 9 [675]))
(clobber (reg:SI 10 10 [676]))
]) "SPOOLES/MSMD/src/MSMD_init.c":113 1191 {*vsx_extract_v4si_store_p9}
 (nil))

It split it to:

(insn 968 381 969 32 (parallel [
(set (reg:SI 44 12 [671])
(vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
(parallel [
(const_int 0 [0])
])))
(clobber (scratch:SI))
]) "SPOOLES/MSMD/src/MSMD_init.c":113 1185 {vsx_extract_v4si_p9}
 (nil))

Unfortunately, when it is splitting a word extract to be deposited into a GPR
register, it needs to use a traditional Altivec register.

The following patch fixes this:

[gcc]
2017-01-03  Michael Meissner  

PR target/78953
* config/rs6000/vsx.md (vsx_extract__store_p9): If we are
extracting SImode to a GPR register so that we can generate a
store, limit the vector to be in a traditional Altivec register
for the vextuwrx instruction.

[gcc/testsuite]
2017-01-03  Michael Meissner  

PR target/78953
* gcc.target/powerpc/pr78953.c: New test.

I did the usual bootstrap and make check with no regression on a little endian
power8 system.  I also compiled the two Spec 2006 benchmarks that failed and
they now build.  Is this ok for the trunk?  It does not need to be applied to
GCC 6.x since the word extract optimization is new to GCC 7.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 243966)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -2628,7 +2628,7 @@ (define_insn_and_split "*vsx_extract__store_p9"
   [(set (match_operand: 0 "memory_operand" "=Z,m")
(vec_select:
-(match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" ",")
+(match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" ",v")
 (parallel [(match_operand:QI 2 "const_int_operand" "n,n")])))
(clobber (match_scratch: 3 "=,"))
(clobber (match_scratch:SI 4 "=X,"))]
Index: gcc/testsuite/gcc.target/powerpc/pr78953.c
===
--- gcc/testsuite/gcc.target/powerpc/pr78953.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr78953.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+#include 
+
+/* PR 78953: mem = vec_extract (V4SI, ) failed if the vector was in a
+   traditional FPR register.  */
+
+void
+foo (vector int *vp, int *ip)
+{
+  vector int v = *vp;
+  __asm__ (" # fpr %x0" : "+d" (v));
+  ip[4] = vec_extract (v, 0);
+}
+
+/* { dg-final { scan-assembler "xxextractuw\|vextuw\[lr\]x" } } */

[SPARC] Enable LRA by default

2017-01-03 Thread Eric Botcazou

I changed my mind and decided to give it a try for GCC 7, after bootstrapping 
and testing the 32-bit and 64-bit compilers, both in development and release 
modes, over the last weeks.  The few encountered issues were minor: missed 
optimization in a specific case (PR rtl-optimization/78664) and compilation 
time explosion on a pathological Go testcase at -O0 (I'll open another PR 
if/when I can come up with an equivalent C/C++ testcase).

Tested on SPARC/Solaris in various configurations, applied on the mainline.
I also updated htdocs/backends.html in the wwwdocs module.


2017-01-03  Eric Botcazou  

* doc/invoke.texi (SPARC options): Document -mlra as the default.
* config/sparc/sparc.c (sparc_option_override): Force LRA unless
-mlra/-mno-lra was passed to the compiler.

-- 
Eric Botcazou
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 244005)
+++ doc/invoke.texi	(working copy)
@@ -23271,8 +23271,8 @@ in 64-bit mode.
 @itemx -mno-lra
 @opindex mlra
 @opindex mno-lra
-Enable Local Register Allocation.  This is experimental for SPARC, so by
-default the compiler uses standard reload (i.e. @option{-mno-lra}).
+Enable Local Register Allocation.  This is the default for SPARC since GCC 7
+so @option{-mno-lra} needs to be passed to get old Reload.
 
 @item -mcpu=@var{cpu_type}
 @opindex mcpu
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 244005)
+++ config/sparc/sparc.c	(working copy)
@@ -1523,6 +1523,10 @@ sparc_option_override (void)
   if (TARGET_ARCH32)
 target_flags &= ~MASK_STACK_BIAS;
 
+  /* Use LRA instead of reload, unless otherwise instructed.  */
+  if (!(target_flags_explicit & MASK_LRA))
+target_flags |= MASK_LRA;
+
   /* Supply a default value for align_functions.  */
   if (align_functions == 0
   && (sparc_cpu == PROCESSOR_ULTRASPARC

Re: [PATCH/AARCH64] Handle ILP32 multi-arch

2017-01-03 Thread Andrew Pinski

Ping?

On Sat, Dec 10, 2016 at 1:24 PM, Andrew Pinski  wrote:
> On Thu, Nov 10, 2016 at 6:58 PM, Andrew Pinski  wrote:
>> On Tue, Oct 25, 2016 at 3:25 PM, Matthias Klose  wrote:
>>> On 07.10.2016 23:08, Andrew Pinski wrote:
>>>> Hi,
>>>>   This patch adds ilp32 multi-arch support.  This is needed to support
>>>> multi-arch on Debian like systems.
>>>>
>>>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>>>> Also tested with ilp32 with a newly built toolchain that supports
>>>> ILP32 with Ubuntu 1604 base.
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>> ChangeLog:
>>>> * config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle
>>>> multi-arch for ilp32.
>>>
>>> I can't approve that, but it looks like a reasonable change, but we should
>>> document the multiarch triplet at https://wiki.debian.org/Multiarch/Tuples
>>
>>
>> Ping?
>
> Ping?  This is the only outstanding GCC ILP32 related patch.
>
>>
>> Thanks,
>> Andrew
>>
>>>
>>> Matthias
>>>

Re: [PATCH, libgo] Avoid compiling runtime/aeshash.c with older assemblers [PR go/78789]

2017-01-03 Thread Ian Lance Taylor

On Tue, Jan 3, 2017 at 10:37 AM, Uros Bizjak  wrote:
>
> Attached patch detects support for AES instructions and avoids
> compiling runtime/aeshash.c with older assemblers (on e.g. CentOS
> 5.11). The result of configure is also communicated into go runtime,
> so the library doesn't try to call non-existent aeshashbody routine.
>
> Patch was tested on x86_64-linux-gnu, on Fedora 25 with AES capable
> CPU and on CentOS 5.11 with non-AES capable CPU.

Thanks.  Committed.

Ian

Re: [-fcompare-debug] find jump before debug insns in expand

2017-01-03 Thread Alexandre Oliva

On Jan  3, 2017, Richard Sandiford  wrote:

>>&& (last = get_last_insn ())
>> -  && JUMP_P (last))
>> +  && (JUMP_P (last)
>> +  || (DEBUG_INSN_P (last)
>> +  && JUMP_P (prev_nondebug_insn (last)

> Would it be worth adding a get_last_nondebug_insn in case other patterns
> like this crop up?

I didn't think so.  Most of the RTL passes use the BB-based interfaces
nowadays, so it seemed that cfgexpand would be pretty much the only
place where this could be used.  That was my reasoning anyway; I didn't
actually check that this was indeed the case.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Jakub Jelinek

On Tue, Jan 03, 2017 at 09:40:02PM +0200, Janne Blomqvist wrote:
> I do think this is fixable without waiting for potential improved
> hardware.  The vast majority of uses of boolean_type_node in the
> Fortran frontend is code like
> 
>   tmp = fold_build2_loc (input_location, GT_EXPR,
>  boolean_type_node, from_len,
>  build_zero_cst (TREE_TYPE (from_len)));
>   tmp = fold_build3_loc (input_location, COND_EXPR,
>  void_type_node, tmp, extcopy, stdcopy);
> 
> where the result of a boolean expression is returned as a
> boolean_type_node.  This is more like high-level language semantics,
> whereas on the asm level boolean expressions are of course evaluated
> with integer arithmetic (or are there targets where this is not
> true?).
> 
> So while the Fortran frontend shouldn't redefine the ABI specified
> boolean_type_node, for situations like the above we could instead use
> some fast_scalar_non_abi_bool_type_node (better name suggestions
> welcome?), if we had some mechanism for returning such information
> from the backend? Or is there some universal best choice here?
> 
> Or since boolean types are not actually present at the asm level, what
> is the best choice in cases like the above in order to generate code
> that can be lowered to as efficient code as possible? Should the
> result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
> of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
> whatever we come up with)?
> 
> 
> (Repost without html formatting, sorry for the spam)

Well, generally if it helps Fortran on z10, it will likely help C and C++
too, so you probably instead want some type promotion pass which will know
that on the particular target it is beneficial to promote booleans to
32 bits and know under what circumstances (the loads and stores of course
need to remain the same size for ABI reasons, but as soon as you have it in
SSA_NAME, it depends on what you use it for, sometimes it will be beneficial
to promote it, sometimes not).  Of course, that is something that is too late
for GCC 7.

Jakub

[gomp4] backport an acc directive matching cleanup for fortran

2017-01-03 Thread Cesar Philippidis

This patch contains a backport of the fortran OpenACC directive matching
changes I made to trunk here
. It's not a
clean backport because gomp4 has some support for the device_type clause
which is missing in trunk.

Next I plan to backport Jakub's OpenMP 4.5 fortran changes to gomp4.

Cesar
2017-01-03  Cesar Philippidis  

	gcc/fortran/
	* openmp.c (match_acc): New function.
	(gfc_match_oacc_parallel_loop): Simplify by calling match_acc.
	(gfc_match_oacc_parallel): Likewise.
	(gfc_match_oacc_kernels_loop): Likewise.
	(gfc_match_oacc_kernels): Likewise.
	(gfc_match_oacc_data): Likewise.
	(gfc_match_oacc_host_data): Likewise.
	(gfc_match_oacc_loop): Likewise.
	(gfc_match_oacc_enter_data): Likewise.
	(gfc_match_oacc_exit_data): Likewise.


diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 0a9d137..61940d7 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1565,112 +1565,72 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 #define OACC_UPDATE_CLAUSE_DEVICE_TYPE_MASK   \
(OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT | OMP_CLAUSE_DEVICE_TYPE)
 
-
-match
-gfc_match_oacc_parallel_loop (void)
+static match
+match_acc (gfc_exec_op op, uint64_t mask, uint64_t dmask)
 {
   gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_PARALLEL_LOOP_CLAUSES,
-			 OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK
-			 | OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK, false,
-			 false, true) != MATCH_YES)
+
+  if (gfc_match_omp_clauses (&c, mask, dmask, false, false, true) != MATCH_YES)
 return MATCH_ERROR;
 
-  new_st.op = EXEC_OACC_PARALLEL_LOOP;
+  new_st.op = op;
   new_st.ext.omp_clauses = c;
   return MATCH_YES;
 }
 
+match
+gfc_match_oacc_parallel_loop (void)
+{
+  return match_acc (EXEC_OACC_PARALLEL_LOOP, OACC_PARALLEL_LOOP_CLAUSES,
+		OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK
+		| OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK);
+}
+
 
 match
 gfc_match_oacc_parallel (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_PARALLEL_CLAUSES,
-			 OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK, false,
-			 false, true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_PARALLEL;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_PARALLEL, OACC_PARALLEL_CLAUSES,
+		OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK);
 }
 
 
 match
 gfc_match_oacc_kernels_loop (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_KERNELS_LOOP_CLAUSES,
-			 OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK
-			 | OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK, false,
-			 false, true) != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_KERNELS_LOOP;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_KERNELS_LOOP, OACC_KERNELS_LOOP_CLAUSES,
+		OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK
+		| OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK);
 }
 
 
 match
 gfc_match_oacc_kernels (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_KERNELS_CLAUSES,
-			 OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK, false,
-			 false, true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_KERNELS;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_KERNELS, OACC_KERNELS_CLAUSES,
+		OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK);
 }
 
 
 match
 gfc_match_oacc_data (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_DATA_CLAUSES, 0, false, false, true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_DATA;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_DATA, OACC_DATA_CLAUSES, 0);
 }
 
 
 match
 gfc_match_oacc_host_data (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_HOST_DATA_CLAUSES, 0, false, false, true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_HOST_DATA;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_HOST_DATA, OACC_HOST_DATA_CLAUSES, 0);
 }
 
 
 match
 gfc_match_oacc_loop (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_LOOP_CLAUSES,
-			 OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK, false, false,
-			 true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_LOOP;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_LOOP, OACC_LOOP_CLAUSES,
+		OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK);
 }
 
 
@@ -1791,28 +1751,14 @@ gfc_match_oacc_update (void)
 match
 gfc_match_oacc_enter_data (void)
 {
-  gfc_omp_clauses *c;
-  if (gfc_match_omp_clauses (&c, OACC_ENTER_DATA_CLAUSES, 0, false, false, true)
-  != MATCH_YES)
-return MATCH_ERROR;
-
-  new_st.op = EXEC_OACC_ENTER_DATA;
-  new_st.ext.omp_clauses = c;
-  return MATCH_YES;
+  return match_acc (EXEC_OACC_ENTER_DATA, OACC_ENTER_DATA_CLAUSES, 0);
 }
 
 
 match
 gfc_match_oacc_exit_data (void)
 {
-  gfc_omp_clauses *c;
-

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Janne Blomqvist

On Tue, Jan 3, 2017 at 6:50 PM, Andreas Krebbel
 wrote:
> The regression with 8 bit boolean types surfaced with the z10 machine. The
> ABI is much older. Since we cannot change it anymore we should rather talk
> to the hardware guys to add the instruction we need. So for now we probably
> have to live with the regression in the Fortran cases since as I understand
> it your change fixes an actual problem.

I do think this is fixable without waiting for potential improved
hardware.  The vast majority of uses of boolean_type_node in the
Fortran frontend is code like

  tmp = fold_build2_loc (input_location, GT_EXPR,
 boolean_type_node, from_len,
 build_zero_cst (TREE_TYPE (from_len)));
  tmp = fold_build3_loc (input_location, COND_EXPR,
 void_type_node, tmp, extcopy, stdcopy);

where the result of a boolean expression is returned as a
boolean_type_node.  This is more like high-level language semantics,
whereas on the asm level boolean expressions are of course evaluated
with integer arithmetic (or are there targets where this is not
true?).

So while the Fortran frontend shouldn't redefine the ABI specified
boolean_type_node, for situations like the above we could instead use
some fast_scalar_non_abi_bool_type_node (better name suggestions
welcome?), if we had some mechanism for returning such information
from the backend? Or is there some universal best choice here?

Or since boolean types are not actually present at the asm level, what
is the best choice in cases like the above in order to generate code
that can be lowered to as efficient code as possible? Should the
result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
whatever we come up with)?

(Repost without html formatting, sorry for the spam)

-- 
Janne Blomqvist

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread Janne Blomqvist

On Tue, Jan 3, 2017 at 9:21 PM, FX  wrote:
>> r244027 reverts r244011. Sorry for the breakage. It seems to affect
>> all i686 as well in addition to power, maybe all 32-bit hosts.
>
> The breakage is surprising, as the rejects-valid does not involve character
> length at all.
> Janne, any chance you might have accidentally committed some unrelated change
> along?

Yes, I'm surprised, and I don't understand it yet. I'm planning to
fire up a 32-bit VM and build there in order see what the issue is.

-- 
Janne Blomqvist

Re: [bootstrap-O3] add a default initializer to avoid a warning at -O3

2017-01-03 Thread Jason Merrill

Are there bugzillas for these false positive warnings?

On Tue, Jan 3, 2017 at 12:45 PM, Jeff Law  wrote:
> On 01/02/2017 10:28 PM, Alexandre Oliva wrote:
>>
>> Building with the bootstrap-O3 configuration option fails to compile
>> input.c due to an AFAICT false-positive warning about an uninitialized
>> use of a variable.
>>
>> This patch adds a default initializer to silence it.
>>
>> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>>
>> for  gcc/ChangeLog
>>
>> * input.c (assert_char_at_range): Default-initialize
>> actual_range.
>
> OK.
> jeff
>

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread FX

> r244027 reverts r244011. Sorry for the breakage. It seems to affect
> all i686 as well in addition to power, maybe all 32-bit hosts.

The breakage is surprising, as the rejects-valid does not involve character
length at all.
Janne, any chance you might have accidentally committed some unrelated change
along?

FX

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread FX

> The regression with 8 bit boolean types surfaced with the z10 machine. The
> ABI is much older. Since we cannot change it anymore we should rather talk
> to the hardware guys to add the instruction we need. So for now we probably
> have to live with the regression in the Fortran cases since as I understand
> it your change fixes an actual problem.

As far as I understand (and Janne will correct me if I am wrong), the patch does
not fix anything in particular. The idea was that, by transitioning all boolean
expressions from “int” to “bool” (in C terms), and thus from 32-bit to 8-bit on
“typical” targets, the optimizer might be able to emit more compact code. I am
not sure this was tested.

So: maybe it is a case of “Profile. Don't speculate.”

FX

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread FX

> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferable.

The patch committed doesn’t change the way Fortran LOGICAL types are emitted. 
The fact that the default LOGICAL kind is different (on most platforms) from 
C/C++ boolean is due to the standard, and not our own choice of ABI.

As Janne says, boolean_type_node in the Fortran front-end is only used for
intermediate values and temporaries; it is used every time we build a 
COND_EXPR. It is not part of the ABI.

FX

Re: [PATCH], PR target/78900, Fix PowerPC __float128 signbit

2017-01-03 Thread David Edelsohn

On Fri, Dec 30, 2016 at 3:54 PM, Michael Meissner
 wrote:
> The signbit-3.c test explicitly tests for the value coming from memory, a
> vector register, or a GPR.  Unfortunately, the code did not handle splitting up
> the registers when the value was in a GPR.
>
> These patches add the GPR support.  While I was editing the code, I also did
> some cleanup.
>
> I removed the Fsignbit mode attribute, since the only two modes used both use
> the same attribute.  This is a relic of the original code generation that also
> provided optimized signbit support for DFmode/SFmode.  The DFmode/SFmode
> support got dropped (GCC 6 was in stage 3, and we needed to get signbit working
> for __float128 -- it already worked for DFmode/SFmode, but the code generation
> could be improved).
>
> I also noticed that use of signbit tended to generate sign or zero extension.
> Since the function only returns 0/1, I added combiner insns to eliminate the
> extra zero/sign extend.
>
> I have tested this on both big endian and little endian power8 systems.  The
> bootstrap and make check had no regressions.  Is this ok to put into the trunk?
>
> The same error appears on GCC 6 as well.  Assuming the patch applies cleanly and
> fixes the problem, can I install it on the GCC 6 branch as well after a burn-in
> period?
>
> 2016-12-30  Michael Meissner  
>
> PR target/78900
> * config/rs6000/rs6000.c (rs6000_split_signbit): Change some
> assertions.  Add support for doing the signbit if the IEEE 128-bit
> floating point value is in a GPR.
> * config/rs6000/rs6000.md (Fsignbit): Delete.
> (signbit2_dm): Delete using  and just use "wa".
> Update the length attribute if the value is in a GPR.
> (signbit2_dm_ext): Add combiner pattern to eliminate
> the sign or zero extension instruction, since the value is always
> 0/1.
> (signbit2_dm2): Delete using .

This patch is okay for trunk and okay for GCC 6 branch after a week or
two of no problems.

Thanks, David

Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2017-01-03 Thread Jeff Law


On 01/03/2017 04:04 AM, James Cowgill wrote:
> On 01/01/17 22:27, Jeff Law wrote:
>> On 12/20/2016 07:38 AM, James Cowgill wrote:
>>> Hi,
>>>
>>> On 19/12/16 21:43, Jeff Law wrote:
>>>> On 12/19/2016 08:44 AM, James Cowgill wrote:
>>>>> 2016-12-16  James Cowgill  
>>>>>
>>>>> PR rtl-optimization/65618
>>>>> * emit-rtl.c (try_split): Update "after" when moving a
>>>>> NOTE_INSN_CALL_ARG_LOCATION.
>>>>>
>>>>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>>>>> index 7de17454037..6be124ac038 100644
>>>>> --- a/gcc/emit-rtl.c
>>>>> +++ b/gcc/emit-rtl.c
>>>>> @@ -3742,6 +3742,11 @@ try_split (rtx pat, rtx_insn *trial, int last)
>>>>> next = NEXT_INSN (next))
>>>>>  if (NOTE_KIND (next) == NOTE_INSN_CALL_ARG_LOCATION)
>>>>>    {
>>>>> +/* Advance after to the next instruction if it is about to
>>>>> +   be removed.  */
>>>>> +if (after == next)
>>>>> +  after = NEXT_INSN (after);
>>>>> +
>>>>>  remove_insn (next);
>>>>>  add_insn_after (next, insn, NULL);
>>>>>  break;
>>>>
>>>> So the thing I don't like when looking at this code is we set AFTER
>>>> immediately upon entry to try_split.  But we don't use it until near the
>>>> very end of try_split.  That's just asking for trouble.
>>>>
>>>> Can we reasonably initialize AFTER just before it's used?
>>>
>>> I wasn't sure but looking closer I think that would be fine. This patch
>>> also works and does what Richard Sandiford suggested in the PR.
>>>
>>> 2016-12-20  James Cowgill  
>>>
>>> PR rtl-optimization/65618
>>> * emit-rtl.c (try_split): Move initialization of "before" and
>>> "after" to just before the call to emit_insn_after_setloc.
>>
>> OK.
>
> Great. Can you commit this for me, since I don't have commit access?

Done.

If you're going to be contributing regularly we should probably start
the process of getting you commit access.

jeff

[PATCH, libgo] Avoid compiling runtime/aeshash.c with older assemblers [PR go/78789]

2017-01-03 Thread Uros Bizjak

Hello!

Attached patch detects support for AES instructions and avoids
compiling runtime/aeshash.c with older assemblers (on e.g. CentOS
5.11). The result of configure is also communicated into go runtime,
so the library doesn't try to call non-existent aeshashbody routine.

Patch was tested on x86_64-linux-gnu, on Fedora 25 with AES capable
CPU and on CentOS 5.11 with non-AES capable CPU.

Uros.
Index: config.h.in
===
--- config.h.in (revision 244024)
+++ config.h.in (working copy)
@@ -21,6 +21,9 @@
 /* Define if your assembler supports unwind section type. */
 #undef HAVE_AS_X86_64_UNWIND_SECTION_TYPE
 
+/* Define if your assembler supports AES instructions. */
+#undef HAVE_AS_X86_AES
+
 /* Define if your assembler supports PC relative relocs. */
 #undef HAVE_AS_X86_PCREL
 
Index: configure
===
--- configure   (revision 244024)
+++ configure   (working copy)
@@ -15490,6 +15490,32 @@
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler supports AES instructions" >&5
+$as_echo_n "checking assembler supports AES instructions... " >&6; }
+if test "${libgo_cv_as_x86_aes+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+
+libgo_cv_as_x86_aes=yes
+echo 'aesenc %xmm0, %xmm1' > conftest.s
+CFLAGS_hold=$CFLAGS
+if test "$libgo_cv_c_unused_arguments" = yes; then
+  CFLAGS="$CFLAGS -Qunused-arguments"
+fi
+if $CC $CFLAGS -c conftest.s 2>&1 | grep -i error > /dev/null; then
+libgo_cv_as_x86_aes=no
+fi
+CFLAGS=$CFLAGS_hold
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgo_cv_as_x86_aes" >&5
+$as_echo "$libgo_cv_as_x86_aes" >&6; }
+if test "x$libgo_cv_as_x86_aes" = xyes; then
+
+$as_echo "#define HAVE_AS_X86_AES 1" >>confdefs.h
+
+fi
+
 cat >confcache <<\_ACEOF
 # This file is a shell script that caches the results of configure
 # tests run on this system so they can be shared between configure
Index: configure.ac
===
--- configure.ac(revision 244024)
+++ configure.ac(working copy)
@@ -934,6 +934,24 @@
[Define if your assembler supports unwind section type.])
 fi
 
+AC_CACHE_CHECK([assembler supports AES instructions],
+libgo_cv_as_x86_aes, [
+libgo_cv_as_x86_aes=yes
+echo 'aesenc %xmm0, %xmm1' > conftest.s
+CFLAGS_hold=$CFLAGS
+if test "$libgo_cv_c_unused_arguments" = yes; then
+  CFLAGS="$CFLAGS -Qunused-arguments"
+fi
+if $CC $CFLAGS -c conftest.s 2>&1 | grep -i error > /dev/null; then
+libgo_cv_as_x86_aes=no
+fi
+CFLAGS=$CFLAGS_hold
+])
+if test "x$libgo_cv_as_x86_aes" = xyes; then
+  AC_DEFINE(HAVE_AS_X86_AES, 1,
+   [Define if your assembler supports AES instructions.])
+fi
+
 AC_CACHE_SAVE
 
 if test ${multilib} = yes; then
Index: go/runtime/alg.go
===
--- go/runtime/alg.go   (revision 244024)
+++ go/runtime/alg.go   (working copy)
@@ -233,6 +233,7 @@
// Install aes hash algorithm if we have the instructions we need
if (GOARCH == "386" || GOARCH == "amd64") &&
GOOS != "nacl" &&
+   support_aes &&
cpuid_ecx&(1<<25) != 0 && // aes (aesenc)
cpuid_ecx&(1<<9) != 0 && // sse3 (pshufb)
cpuid_ecx&(1<<19) != 0 { // sse4.1 (pinsr{d,q})
Index: go/runtime/runtime2.go
===
--- go/runtime/runtime2.go  (revision 244024)
+++ go/runtime/runtime2.go  (working copy)
@@ -771,7 +771,8 @@
 
// Information about what cpu features are available.
// Set on startup.
-   cpuid_ecx uint32
+   cpuid_ecx uint32
+   support_aes   bool
 
 // cpuid_edx uint32
 // cpuid_ebx7uint32
Index: go/runtime/stubs.go
===
--- go/runtime/stubs.go (revision 244024)
+++ go/runtime/stubs.go (working copy)
@@ -272,6 +272,12 @@
cpuid_ecx = v
 }
 
+// For gccgo, to communicate from the C code to the Go code.
+//go:linkname setSupportAES runtime.setSupportAES
+func setSupportAES(v bool) {
+   support_aes = v
+}
+
 // typedmemmove copies a typed value.
 // For gccgo for now.
 //go:nosplit
Index: runtime/aeshash.c
===
--- runtime/aeshash.c   (revision 244024)
+++ runtime/aeshash.c   (working copy)
@@ -12,7 +12,7 @@
 uintptr aeshashbody(void*, uintptr, uintptr, Slice)
__attribute__((no_split_stack));
 
-#if defined(__i386__) || defined(__x86_64__)
+#if (defined(__i386__) || defined(__x86_64__)) && defined(HAVE_AS_X86_AES)
 
 #include 
 #include 
@@ -573,7 +573,7 @@
 
 #endif // !defined(__x86_64__)
 
-#else // !defined(__i386__) && !defined(__x86_64__)
+#else // !defined(__i386__) && !defined(__x86_64__) || !defined(HAVE_AS_X86_AES)
 
 uintptr aeshashbody(void* p

Re: [bootstrap-O3,fortran] silence warning in simplify_transformation_to_array

2017-01-03 Thread Jeff Law


On 01/02/2017 10:29 PM, Alexandre Oliva wrote:
> simplify_transformation_to_array had the nested loop unrolled 7 times,
> which is reasonable given that it iterates over arrays of size
> GFC_MAX_DIMENSIONS == 7.
>
> The problem is that the last iteration increments the index, tests
> that it's less than result->rank, and then accesses the arrays with
> the incremented index.
>
> We did not optimize out that part in the 7th iteration, so VRP flagged
> the unreachable code as accessing arrays past the end.
>
> It couldn't possibly know that we'd never reach that part, since the
> test was on result->rank, and it's not obvious (for the compiler) that
> result->rank <= GFC_MAX_DIMENSIONS.
>
> Even an assert to that effect before the enclosing loop didn't avoid
> the warning turned to error, though; I suppose there might be some
> aliasing at play, because moving the assert into the loop does, but
> then, it's not as efficient as testing the index itself against the
> limit.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>
> for  gcc/fortran/ChangeLog
>
> * simplify.c (simplify_transformation_to_array): Assert the
> array access is in range.  Fix whitespace.

OK.
jeff

Contents of PO file 'cpplib-7.1-b20170101.sv.po'

2017-01-03 Thread Translation Project Robot



cpplib-7.1-b20170101.sv.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New Swedish PO file for 'cpplib' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/sv.po

(This file, 'cpplib-7.1-b20170101.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [bootstrap-O3] use unsigned type for regno in df-scan

2017-01-03 Thread Jeff Law


On 01/03/2017 11:11 AM, Jakub Jelinek wrote:
> On Tue, Jan 03, 2017 at 11:08:05AM -0700, Jeff Law wrote:
>> What if REGNO is 2147483648 (assume 32 bit host).  That will get us into the
>> else block in df_ref_record as it's >= FIRST_PSEUDO_REGISTER.
>>
>> In df_ref_create_structure, we use the same expression to compute REGNO, but
>> this time it's interpreted as a signed integer, so -2147483648. That gets us
>> into the path where we call TEST_HARD_REG_BIT and thus the oob array index.
>>
>> Right?
>>
>> The patch is OK.  It does highlight the desire to pick the right type and
>> consistently use it.
>
> Note I think we require --disable-werror for the not so common bootstrap
> configurations, only normal bootstrap and profiledbootstrap are supposed to
> be --enable-werror free.  At least that is my understanding of the general
> preference, there are too many bootstrap options and too many different sets
> of false positives.  Not talking about this particular patch, just a general
> comment.

Agreed.  I was on the fence WRT these patches for that exact reason.

Jeff

Re: [bootstrap-O3] use unsigned type for regno in df-scan

2017-01-03 Thread Jakub Jelinek

On Tue, Jan 03, 2017 at 11:08:05AM -0700, Jeff Law wrote:
> What if REGNO is 2147483648 (assume 32 bit host).  That will get us into the
> else block in df_ref_record as it's >= FIRST_PSEUDO_REGISTER.
> 
> In df_ref_create_structure, we use the same expression to compute REGNO, but
> this time it's interpreted as a signed integer, so -2147483648. That gets us
> into the path where we call TEST_HARD_REG_BIT and thus the oob array index.
> 
> Right?
> 
> The patch is OK.  It does highlight the desire to pick the right type and
> consistently use it.

Note I think we require --disable-werror for the not so common bootstrap
configurations, only normal bootstrap and profiledbootstrap are supposed to
be --enable-werror free.  At least that is my understanding of the general
preference, there are too many bootstrap options and too many different sets
of false positives.  Not talking about this particular patch, just a general
comment.

Jakub

Re: [bootstrap-O3] use unsigned type for regno in df-scan

2017-01-03 Thread Jeff Law


On 01/02/2017 10:29 PM, Alexandre Oliva wrote:
> This patch fixes a false-positive warning in df-scan that made
> bootstrap-O3 fail, and enables GCC to optimize out the code that leads
> to the warning.
>
> df_ref_create_structure was inlined into the else part of
> df_ref_record.  Due to the condition of the corresponding if, in the
> else part, VRP deduced unsigned regno >= FIRST_PSEUDO_REGISTER.
>
> In df_ref_create_structure, there's another regno variable,
> initialized with the same expression and value as the caller's.  GCC
> can tell as much, but this regno variable is signed.  It is used,
> shifted right, to index a hard regset bit array within a path that
> tests that this signed regno < FIRST_PSEUDO_REGISTER.
>
> GCC warned about the possible out-of-range indexing into the hard
> regset array.  It shouldn't, after all, the same regno can't possibly
> be both < FIRST_PSEUDO_REGISTER and >= FIRST_PSEUDO_REGISTER, can it?
>
> Well, the optimizers correctly decide it could, if it was a negative
> int that, when converted to unsigned, became larger than
> FIRST_PSEUDO_REGISTER.  But GCC doesn't know regno can't be negative,
> so the test could not be optimized out.  What's more, given the
> constraints, VRP correctly concluded the hard regset array would
> always be indexed by a value way outside the array index range.
>
> This patch changes the inlined regno to unsigned, like the caller's,
> so that we can now tell the conditions can't both hold, so we optimize
> out the path containing the would-be out-of-range array indexing.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>
> for  gcc/ChangeLog
>
> * df-scan.c (df_ref_create_structure): Make regno unsigned,
> to match the caller.

What if REGNO is 2147483648 (assume 32 bit host).  That will get us into
the else block in df_ref_record as it's >= FIRST_PSEUDO_REGISTER.

In df_ref_create_structure, we use the same expression to compute REGNO,
but this time it's interpreted as a signed integer, so -2147483648.
That gets us into the path where we call TEST_HARD_REG_BIT and thus the
oob array index.

Right?

The patch is OK.  It does highlight the desire to pick the right type
and consistently use it.


jeff
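
The signed/unsigned scenario discussed in this message can be sketched in
isolation.  This is only an illustration, not code from df-scan.c: the
constant kFirstPseudoRegister and all helper names are made up, and it
assumes a 32-bit int with two's-complement wraparound on conversion (which
GCC implements and C++20 guarantees).

```cpp
#include <cassert>

// Hypothetical stand-in for the target macro FIRST_PSEUDO_REGISTER.
constexpr unsigned int kFirstPseudoRegister = 64;

// Caller-side view (like df_ref_record): regno is unsigned, and the else
// branch is taken when it is not a hard register.
bool caller_sees_pseudo (unsigned int regno)
{
  return regno >= kFirstPseudoRegister;
}

// Callee-side view before the patch: the same bits held in a signed int.
bool callee_sees_hard_signed (int regno)
{
  return regno < static_cast<int> (kFirstPseudoRegister);
}

// Callee-side view after the patch: unsigned, matching the caller.
bool callee_sees_hard_unsigned (unsigned int regno)
{
  return regno < kFirstPseudoRegister;
}

// True when the "pseudo" test and the "hard" test both succeed for RAW.
bool both_hold_before_patch (unsigned int raw)
{
  return caller_sees_pseudo (raw)
         && callee_sees_hard_signed (static_cast<int> (raw));
}

bool both_hold_after_patch (unsigned int raw)
{
  return caller_sees_pseudo (raw) && callee_sees_hard_unsigned (raw);
}
```

With these definitions, both_hold_before_patch (2147483648u) is true, while
both_hold_after_patch cannot be true for any input, which is the property
that lets the out-of-range path be removed once the two types agree.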

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread Janne Blomqvist

On Tue, Jan 3, 2017 at 4:07 PM, David Edelsohn  wrote:
> This patch broke bootstrap.  I now am seeing numerous errors when
> building libgomp.
>
> Please fix or revert immediately.

r244027 reverts r244011. Sorry for the breakage. It seems to affect
all i686 in addition to power, maybe all 32-bit hosts.

-- 
Janne Blomqvist

Re: [PATCH 1/2] [ADA] Fix MIPS big-endian build

2017-01-03 Thread Eric Botcazou

> Thanks, can you commit it for me? I screwed up the patch in the email I
> sent a few minutes ago but the patch below should apply.

I installed both patches on the mainline.

-- 
Eric Botcazou

Re: [bootstrap-O3] add a default initializer to avoid a warning at -O3

2017-01-03 Thread Jeff Law


On 01/02/2017 10:28 PM, Alexandre Oliva wrote:
> Building with the bootstrap-O3 configuration option fails to compile
> input.c due to an AFAICT false-positive warning about an uninitialized
> use of a variable.
>
> This patch adds a default initializer to silence it.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>
> for  gcc/ChangeLog
>
> * input.c (assert_char_at_range): Default-initialize
> actual_range.

OK.
jeff
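
For readers unfamiliar with this kind of fix, here is a minimal sketch of
the pattern.  It is not the code from input.c: the struct, the lookup
helper and the function names are hypothetical, chosen only to show a
variable that is written solely on a success path and read afterwards.

```cpp
#include <cassert>

struct range { int start; int finish; };

// Fills *OUT only on success.  After inlining, this kind of contract can
// make an uninitialized-use warning fire in the caller even though the
// value is never read on the failure path.
bool lookup_range (int idx, range *out)
{
  if (idx < 0)
    return false;
  out->start = idx;
  out->finish = idx + 1;
  return true;
}

int width_at (int idx)
{
  // Default initializer: never read on the failure path, but it gives the
  // variable a defined value on every path.
  range actual = {0, 0};
  if (!lookup_range (idx, &actual))
    return -1;
  return actual.finish - actual.start;
}
```

The initializer costs one store and does not change behaviour, since the
failure path returns before the variable is read.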

Re: [bootstrap-O3,fortran] add a NULL initializer to avoid a warning at -O3

2017-01-03 Thread Jeff Law


On 01/02/2017 10:28 PM, Alexandre Oliva wrote:
> Building with the bootstrap-O3 configuration option fails to compile
> fortran/module.c due to an AFAICT false-positive warning about an
> uninitialized use of a variable.
>
> This patch adds a dummy initializer to silence it.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>
> for  gcc/fortran/ChangeLog
>
> * module.c (load_omp_udrs): Initialize name.

OK.
jeff

Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

2017-01-03 Thread Aldy Hernandez


On 12/20/2016 09:16 AM, Richard Biener wrote:
> On Fri, Dec 16, 2016 at 3:41 PM, Aldy Hernandez  wrote:
>> Hi folks.
>>
>> This is a follow-up on Jeff and Richi's interaction on the aforementioned PR
>> here:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01397.html
>>
>> I decided to explore the idea of analyzing may-undefness on-demand, which
>> actually looks rather cheap.
>>
>> BTW, I don't understand why we don't have auto_bitmap's, as we already have
>> auto_sbitmap's.  I've implemented the former based on auto_sbitmap's code we
>> already have.
>>
>> The attached patch fixes the bug without introducing any regressions.
>>
>> I also tested the patch by compiling 242 .ii files with -O3.  These were
>> gathered from a stage1 build with -save-temps.  There is a slight time
>> degradation of 4 seconds within 27 minutes of user time:
>>
>> tainted:26:52
>> orig:   26:48
>>
>> This was the average aggregate time of two runs compiling all 242 .ii files.
>> IMO, this looks reasonable.  It is after all, -O3.  Is it acceptable?
>
>> +  while (!worklist.is_empty ())
>> +{
>> +  name = worklist.pop ();
>> +  gcc_assert (TREE_CODE (name) == SSA_NAME);
>> +
>> +  if (ssa_undefined_value_p (name, true))
>> +   return true;
>> +
>> +  bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
>
> it should be already set as we use visited_ssa as "was it ever on the worklist",
> so maybe renaming it would be a good thing as well.

I don't understand what you would prefer here.

>> + if (TREE_CODE (name) == SSA_NAME)
>> +   {
>> + /* If an SSA has already been seen, this may be a loop.
>> +Fail conservatively.  */
>> + if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
>> +   return false;
>
> so to me "conservative" is returning true, not false.

OK

>> + bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
>> + worklist.safe_push (name);
>
> but for loops we can just continue and ignore this use.  And bitmap_set_bit
> returns whether it set a bit, thus
>
> if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name)))
>   worklist.safe_push (name);
>
> should work?


Fixed.
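
The idiom Richard suggests, pushing a name only when setting its visited
bit actually changed the bit, can be sketched outside GCC as follows.  This
is a hedged illustration, not the patch: std::vector stands in for GCC's
bitmap and vec types, nodes are plain indices rather than SSA names, and
set_bit and depends_on_undefined are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sets bit N and reports whether it was previously clear, mirroring the
// return convention described above for bitmap_set_bit.
static bool set_bit (std::vector<bool> &bits, std::size_t n)
{
  bool changed = !bits[n];
  bits[n] = true;
  return changed;
}

// USES[i] lists the nodes that node i is computed from; UNDEFINED[i] marks
// nodes without a definition.  Returns true if START transitively depends
// on an undefined node.  Cycles are simply not revisited.
bool depends_on_undefined (const std::vector<std::vector<std::size_t>> &uses,
                           const std::vector<bool> &undefined,
                           std::size_t start)
{
  std::vector<bool> on_worklist (uses.size (), false);
  std::vector<std::size_t> worklist;

  set_bit (on_worklist, start);
  worklist.push_back (start);

  while (!worklist.empty ())
    {
      std::size_t name = worklist.back ();
      worklist.pop_back ();

      if (undefined[name])
        return true;

      for (std::size_t use : uses[name])
        if (set_bit (on_worklist, use))  // push only on the first visit
          worklist.push_back (use);
    }
  return false;
}
```

Because the bit records "was ever on the worklist", a use that closes a
loop is skipped rather than treated as a failure, and the walk terminates.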



>> +  /* Check that any SSA names used to define NAME is also fully
>> +defined.  */
>> +  use_operand_p use_p;
>> +  ssa_op_iter iter;
>> +  FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE)
>> +   {
>> + name = USE_FROM_PTR (use_p);
>> + if (TREE_CODE (name) == SSA_NAME)
>
> always true.
>
>> +   {
>> + /* If an SSA has already been seen, this may be a loop.
>> +Fail conservatively.  */
>> + if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
>> +   return false;
>> + bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
>> + worklist.safe_push (name);
>
> See above.
>
> In principle the thing is sound but I'd like to be able to pass in a bitmap of
> known maybe-undefined/must-defined SSA names to have a cache for
> multiple invocations from within a pass (like this unswitching case).

Done, though bitmaps are now calculated as part of the instantiation.

> Also once you hit defs that are in a post-dominated region of the loop entry
> you can treat them as not undefined (as their use invokes undefined
> behavior).  This is also how you treat function parameters (well,
> ssa_undefined_value_p does), where the call site invokes undefined behavior
> when passing in undefined values.  So we need an extra parameter specifying
> the post-dominance region.

Done.

> You do not handle memory or calls conservatively which means the existing
> testcase only needs some obfuscation to become a problem again.  To fix
> that before /* Check that any SSA names used to define NAME is also fully
> defined.  */ bail out conservatively, like
>
>   if (! is_gimple_assign (def)
>       || gimple_assign_single_p (def))
>     return true;


As I asked previously, I understand the !is_gimple_assign, which will 
skip over GIMPLE_CALLs setting a value, but the 
"gimple_assign_single_p(def) == true"??


Won't this last one bail on things like e.3_7 = e, or x_77 = y_88? Don't 
we want to follow up the def chain precisely on these?


The attached implementation uses a cache, and a pre-computed 
post-dominance region.  A previous incantation of this patch (sans the 
post-dominance stuff) had performance within the noise of the previous 
implementation.


I am testing again, and will do some performance comparisons in a bit, 
but for now-- are we on the same page here?  Is this what you wanted?


Aldy

p.s. I could turn the post-dominance region into a bitmap for faster 
access if preferred.
commit 47d0c1b3144d4d56405d72c3ad55183d8632d0a7
Author: Aldy Hernandez 
Date:   Fri Dec 16 03:44:52 2016 -0500

PR tree-optimization/71691
* bitmap.h (class auto_bitmap): New.
* tree-ssa-defined-or-undefined.c: New file.
*

Re: [bootstrap-O1] add initializers to avoid warnings at -O1

2017-01-03 Thread Jeff Law


On 01/02/2017 10:28 PM, Alexandre Oliva wrote:
> Building with the bootstrap-O1 configuration option fails to compile a
> number of files due to AFAICT false-positive warnings about uses of
> uninitialized variables.
>
> This patch adds dummy initializers to silence them all.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  OK to install?
>
> for  gcc/ChangeLog
>
> * multiple_target.c (create_dispatcher_calls): Init e_next.
> * tree-ssa-loop-split.c (split_loop): Init border.
> * tree-vect-loop.c (vect_determine_vectorization_factor): Init
> scalar_type.

Most likely these are due to either not running VRP (and thus the jump
threading within VRP) or a throttled jump threading elsewhere.

OK.

jeff

Re: [bootstrap-O1] enlarge sprintf output buffer to avoid warning

2017-01-03 Thread Jeff Law


On 01/02/2017 10:28 PM, Alexandre Oliva wrote:
> In stage2 of bootstrap-O1, the code that warns if sprintf might
> overflow its output buffer cannot tell that an unsigned value narrowed
> to 16 bits will fit in 4 bytes with %4x.
>
> I couldn't find a better way to avoid the warning at -O1 than growing
> the buffer so that there's no doubt the output will fit.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
>
> for  gcc/c-family/ChangeLog
>
> * c-pretty-print.c (pp_c_tree_decl_identifier): Grow static
> buffer to avoid false-positive warning.

Presumably this is an artifact of not running VRP at -O1 and thus we
don't have a narrowed range from the masking operation.

This isn't performance critical code so we *could* avoid the statically
sized array.  But I doubt it's worth the effort.

OK for the trunk.

jeff
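
The reason the range matters can be shown with a small sketch.  It is not
the c-pretty-print.c code; the helper names are hypothetical, and the
8-digit result assumes a 32-bit unsigned int.  The point is that the 4 in
%4x is a minimum field width, so only the masking bounds the output length.

```cpp
#include <cassert>
#include <cstdio>

// Number of characters "%4x" produces for V masked to 16 bits, not
// counting the terminating NUL.  snprintf with a null buffer and size 0
// only computes the length.
int hex_width_masked (unsigned int v)
{
  return std::snprintf (nullptr, 0, "%4x", v & 0xffffu);
}

// The same format without the mask: the field width does not truncate.
int hex_width_unmasked (unsigned int v)
{
  return std::snprintf (nullptr, 0, "%4x", v);
}
```

With the mask the output is always four characters, so a buffer of five
bytes suffices; without range information the checker has to assume the
unmasked case, which is what growing the buffer accommodates.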

New French PO file for 'gcc' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-7.1-b20170101.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH, GCC/ARM 2/2, ping4] Allow combination of aprofile and rmprofile multilibs

2017-01-03 Thread Thomas Preudhomme


Ping?

Best regards,

Thomas

On 06/12/16 11:35, Thomas Preudhomme wrote:

Ping?

*** gcc/ChangeLog ***

2016-10-03  Thomas Preud'homme  

* config.gcc: Allow combinations of aprofile and rmprofile values for
--with-multilib-list.
* config/arm/t-multilib: New file.
* config/arm/t-aprofile: Remove initialization of MULTILIB_*
variables.  Remove setting of ISA and floating-point ABI in
MULTILIB_OPTIONS and MULTILIB_DIRNAMES.  Set architecture and FPU in
MULTI_ARCH_OPTS_A and MULTI_ARCH_DIRS_A rather than MULTILIB_OPTIONS
and MULTILIB_DIRNAMES respectively.  Add comment to introduce all
matches.  Add architecture matches for marvel-pj4 and generic-armv7-a
CPU options.
* config/arm/t-rmprofile: Likewise except for the matches changes.
* doc/install.texi (--with-multilib-list): Document the combination of
aprofile and rmprofile values and warn about pitfalls in doing that.

Best regards,

Thomas

On 17/11/16 20:43, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 08/11/16 13:36, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 02/11/16 10:05, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 24/10/16 09:07, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 13/10/16 16:35, Thomas Preudhomme wrote:

Hi ARM maintainers,

This patchset aims at adding multilib support for R and M profile ARM
architectures and allowing it to be built alongside multilib for A profile
ARM architectures. This specific patch is concerned with the latter. The
patch works by moving the bits shared by both aprofile and rmprofile multilib
build (variable initialization as well as ISA and float ABI to build multilib
for) to a new t-multilib file. Then, based on which profile was requested in
--with-multilib-list option, that file includes t-aprofile and/or t-rmprofile
where the architecture and FPU to build the multilib for are specified.

Unfortunately the duplication of CPU to A profile architectures could not be
avoided because substitutions due to MULTILIB_MATCHES are not transitive.
Therefore, mapping armv7-a to armv7 for rmprofile multilib build does not
have the expected effect. Two patches were written to allow this using 2
different approaches but I decided against it because this is not the right
solution IMO. See caveats below for what I believe is the correct approach.


*** combined build caveats ***

As the documentation in this patch warns, there are a few caveats to using a
combined multilib build due to the way the multilib framework works.

1) For instance, when using only rmprofile the combination of options -mthumb
-march=armv7 -mfpu=neon selects the thumb/-march=armv7 multilib but in a
combined multilib build the default multilib would be used. This is because
in the rmprofile build -mfpu=neon is not specified in MULTILIB_OPTIONS and
thus the option is ignored when considering MULTILIB_REQUIRED entries.

2) Another issue is the fact that aprofile and rmprofile multilib build have
some conflicting requirements in terms of how to map options for which no
multilib is built to another option. (i) A first example of this is the
difference of CPU to architecture mapping mentioned above: rmprofile multilib
build needs A profile CPUs and architectures to be mapped down to ARMv7 so
that one of the v7-ar multilib gets chosen in such a case but aprofile needs
A profile architectures to stand on their own because multilibs are built for
several architectures.

(ii) Another example of this is that in aprofile multilib build no multilib
is built with -mfpu=fpv5-d16 but some multilibs are built with
-mfpu=fpv4-d16. Therefore, aprofile defines a match rule to map fpv5-d16 onto
fpv4-d16. However, rmprofile multilib profile *does* build some multilibs
with -mfpu=fpv5-d16. This has the consequence that when building for -mthumb
-march=armv7e-m -mfpu=fpv5-d16 -mfloat-abi=hard the default multilib is
chosen because this is rewritten into -mthumb -march=armv7e-m -mfpu=fpv4-d16
-mfloat-abi=hard and there is no multilib for that.

Both of these issues could be handled by using MULTILIB_REUSE instead of
MULTILIB_MATCHES but this would require a large set of rules. I believe
instead the right approach is to create a new mechanism to inform GCC on how
options can be down mapped _when no multilib can be found_ which would
require a smaller set of rules and would make it explicit that the options
are not equivalent. A patch will be posted to this effect at a later time.

ChangeLog entry is as follows:


*** gcc/ChangeLog ***

2016-10-03  Thomas Preud'homme  

* config.gcc: Allow combinations of aprofile and rmprofile values for
--with-multilib-list.
* config/arm/t-multilib: New file.
* config/arm/t-aprofile: Remove initialization of MULTILIB_*
variables.  Remove setting of ISA and floating-point ABI in
MULTILIB_OPTIONS

Re: [PATCH, GCC/testsuite/ARM, ping] Fix empty_fiq_handler target selector

2017-01-03 Thread Thomas Preudhomme


Happy new year!

Ping?

Best regards,

Thomas

On 09/12/16 15:28, Thomas Preudhomme wrote:

Hi,

The current target selector for empty_fiq_handler.c testcase skips the test when
targeting Thumb mode on a device with ARM execution state. Because it checks
Thumb mode by looking for an -mthumb option it fails to work when GCC was
configured with --with-mode=thumb. It is also too restrictive because interrupt
handler can be compiled in Thumb-2. This patch checks the arm_thumb1 effective
target instead of the -mthumb flag to fix both issues.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2016-12-09  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler: Skip instead if targeting Thumb-1
on a non Thumb-only target.


Tested with GCC built for ARMv5T and ARMv7-A with --with-mode=thumb and
--with-mode=arm and for ARMv6S-M with --with-mode=thumb:

* test passes in all cases for ARMv5T and ARMv7-A with -marm
* test passes in all cases for ARMv6S-M and ARMv7-A with -mthumb
* test passes without option when defaulting to ARM for ARMv5T and ARMv7-A
* test passes without option when defaulting to Thumb for ARMv6S-M and ARMv7-A
* test is unsupported with -mthumb for ARMv5T
* test is unsupported without option when defaulting to Thumb for ARMv5T

Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
index 8313f2199122be153a737946e817a5e3bee60372..69bb0669dd416e1fcb015c278d62961d071fc42f 100644
--- a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
+++ b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
@@ -1,5 +1,4 @@
-/* { dg-do compile } */
-/* { dg-skip-if "" { ! arm_cortex_m } { "-mthumb" } } */
+/* { dg-do compile { target { {! arm_thumb1 } || arm_cortex_m } } } */
 
 /* Below code used to trigger an ICE due to missing constraints for
sp = fp + cst pattern.  */

Re: [PATCH, GCC/testsuite/ARM, ping] Skip optional_mthumb tests if GCC has a default mode

2017-01-03 Thread Thomas Preudhomme


Ping?

Best regards,

Thomas

On 12/12/16 17:52, Thomas Preudhomme wrote:

Hi,

The logic to make -mthumb optional for Thumb-only devices is only executed when
no -marm or -mthumb is given on the command-line. This includes configuring GCC
with --with-mode= because this causes the option to be passed before any others.
The optional_mthumb-* testcases are skipped when -marm or -mthumb is passed on
the command line but not when GCC was configured with --with-mode. Not only are
the tests meaningless in these configurations, they also spuriously FAIL if
--with-mode=arm was used since the tests are built for Thumb-only devices, as
reported by Christophe Lyon in [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02082.html

This patch adds logic to target-supports.exp to check how GCC was configured and
changes the optional_mthumb testcases to be skipped if there is a default mode
(--with-mode=). It also fixes a couple of typos in the selectors.
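
The configure-line check described above amounts to searching the
"Configured with:" line of the compiler's -v output for a pattern without
crossing a newline.  The patch does this in Tcl; the following is only an
illustrative C++ rendering using std::regex, with a hypothetical function
name and made-up sample output in the usage note.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the "Configured with:" line in GCC_V_OUTPUT contains
// PATTERN (a regular expression fragment).  "[^\n]*" keeps the match on
// that single line, so the same text elsewhere in the output is ignored.
bool configured_with (const std::string &gcc_v_output,
                      const std::string &pattern)
{
  const std::regex re ("Configured with: [^\\n]*" + pattern);
  return std::regex_search (gcc_v_output, re);
}
```

For example, output containing the line "Configured with: ../configure
--with-mode=thumb" matches the pattern "--with-mode=", whereas output in
which --with-mode= only appears on a later line does not.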

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2016-12-09  Thomas Preud'homme  

* lib/target-supports.exp (check_configured_with): New procedure.
(check_effective_target_default_mode): new effective target.
* gcc.target/arm/optional_thumb-1.c: Skip if GCC was configured with a
default mode.  Fix dg-skip-if target selector syntax.
* gcc.target/arm/optional_thumb-2.c: Likewise.
* gcc.target/arm/optional_thumb-3.c: Fix dg-skip-if target selector
syntax.


Is this ok for stage3?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
index 23df62887ba4aaa1d8717a34ecda9a40246f0552..99cb0c3f33b601fff4493feef72765f7590e18f6 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-do compile { target { ! default_mode } } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-march=armv6-m" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
index 4bd53a45eca97e62dd3b86d5a1a66c5ca21e7aad..280dfb3fec55570b6cfe934303c9bd3d50322b86 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-2.c
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-*} { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
+/* { dg-do compile { target { ! default_mode } } } */
+/* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
 /* { dg-options "-mcpu=cortex-m4" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
index f1fd5c8840b191e600c20a7817c611bb9bb645df..d9150e09e475dfbeb7b0b0c153c913b1ad6f0777 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
@@ -1,8 +1,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
-/* { dg-skip-if "-mthumb given" { *-*-*} { "-mthumb" } } */
+/* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */
 /* { dg-options "-marm" } */
-/* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-*} 0 } */
+/* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-* } 0 } */
 
 /* Check that -marm gives an error when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 0fc0bafa67d8d34ec74ce2d1d7a2323a375615cc..f7511ef3aebca72c798496fb95ce43fcbbc08ed1 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -252,6 +252,20 @@ proc check_runtime {prop args} {
 }]
 }
 
+# Return 1 if GCC was configured with $pattern.
+proc check_configured_with { pattern } {
+global tool
+
+set gcc_output [${tool}_target_compile "-v" "" "none" ""]
+if { [ regexp "Configured with: \[^\n\]*$pattern" $gcc_output ] } {
+verbose "Matched: $pattern" 2
+return 1
+}
+
+verbose "Failed to match: $pattern" 2
+return 0
+}
+
 ###
 # proc check_weak_available { }
 ###
@@ -3797,6 +3811,12 @@ proc add_options_for_arm_arch_v7ve { flags } {
 return "$flags -march=armv7ve"
 }
 
+# Return 1 if GCC was configured with --with-mode=
+proc check_effective_target_default_mode { } {
+
+

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Andreas Krebbel

On 01/03/2017 03:31 PM, Janne Blomqvist wrote:
> On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek  wrote:
>> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>>> This patch costs several thousand additional instructions in
>>> Spec2006 on s390x ("lines" = instructions):
>>>
>>>   410.bwaves: +28 lines (2 funcs bigger)
>>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>>   454.calculix:  +474 lines (23 funcs bigger)
>>>   465.tonto:+2182 lines (221 funcs bigger)
>>>   481.wrf:  +4988 lines (117 funcs bigger)
>>>   416.gamess:   +3723 lines (466 funcs bigger)
>>>
>>> s390x has a "compare with immediate and jump relative" instruction
> >>> for 32 bit, but for 8 bit quantities it needs separate compare
>>> and jump instructions, e.g.
>>>
>>>   cijne   %r1,0,... 
>>>
>>> ->
>>>
>>>   tmll%r1,1
>>>   jne ... 
>>>
>>> Instead of hard coding a specific type, should one ask the backend
>>> for the preferred type?
> 
> Hmm, that's sort of the opposite of what I had hoped for.. :-/
> 
> Is there some way to ask the backend what the preferred type is, then?
> 
> (The snide answer, why didn't the s390 ABI define
> bool/_Bool/boolean_type_node to be a 32 bit type if 8 bit types are
> problematic? But that's of course water under the bridge by now...)

The regression with 8 bit boolean types surfaced with the z10 machine. The ABI
is much older. Since we cannot change it anymore we should rather talk to the
hardware guys to add the instruction we need. So for now we probably have to
live with the regression in the Fortran cases since as I understand it your
change fixes an actual problem.

-Andreas-

> 
>> The gfc_init_types change is an ABI change, at least if the fortran FE
>> bool type is ever stored in memory and accessed by multiple TUs, or
>> passed as argument etc.
> 
> Based on the quick audit I did when I wrote the patch, the only time
> it's used except as a local temp variable, is for a couple of the
> co-array intrinsics, where the corresponding library implementation
> actually uses C _Bool (I suspect it has worked by accident if the args
> are passed in registers).
> 
>>  And the difference between the C/C++ _Bool/bool
>> and fortran FE bool has caused lots of issues in the past, so if it can be
> >> the same type, it is preferable.
>>
>> Jakub
> 
> 
>

PING Re: [PATCH] Add x86_64-specific selftests for RTL function reader (v2)

2017-01-03 Thread David Malcolm

Ping:
  https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01616.html

(the patch has been successfully bootstrapped on
x86_64-pc-linux-gnu, and also tested on i686-pc-linux-gnu).

On Mon, 2016-12-19 at 12:12 -0500, David Malcolm wrote:
> Note to i386 maintainers: this patch is part of the RTL frontend.
> It adds selftests for verifying that the RTL dump reader works as
> expected, with a mixture of real and hand-written "dumps" to
> exercise various aspects of the loader.   Many RTL dumps contain
> target-specific features (e.g. names of hard regs), and so these
> selftests need to be target-specific, and hence this patch puts
> them in i386.c.
> 
> Tested on i686-pc-linux-gnu and x86_64-pc-linux-gnu.
> 
> OK for trunk, assuming bootstrap?
> (this is dependent on patch 8a within the kit).
> 
> Changed in v2:
> - fixed selftest failures on i686:
>   * config/i386/i386.c
>   (selftest::ix86_test_loading_dump_fragment_1): Fix handling of
>   "frame" reg.
>   (selftest::ix86_test_loading_call_insn): Require TARGET_SSE.
> - updated to use "<3>" syntax for pseudos, rather than "$3"
> 
> Blurb from v1:
> This patch adds more selftests for class function_reader, where
> the dumps to be read contain x86_64-specific features.
> 
> In an earlier version of the patch kit, these were handled using
> preprocessor conditionals.
> This version instead runs them via a target hook for running
> target-specific selftests, thus putting them within i386.c.
> 
> gcc/ChangeLog:
>   * config/i386/i386.c
>   (selftest::ix86_test_loading_dump_fragment_1): New function.
>   (selftest::ix86_test_loading_call_insn): New function.
>   (selftest::ix86_test_loading_full_dump): New function.
>   (selftest::ix86_test_loading_unspec): New function.
>   (selftest::ix86_run_selftests): Call the new functions.
> 
> gcc/testsuite/ChangeLog:
>   * selftests/x86_64: New subdirectory.
>   * selftests/x86_64/call-insn.rtl: New file.
>   * selftests/x86_64/copy-hard-reg-into-frame.rtl: New file.
>   * selftests/x86_64/times-two.rtl: New file.
>   * selftests/x86_64/unspec.rtl: New file.
> 
> ---
>  gcc/config/i386/i386.c | 210 +
>  gcc/testsuite/selftests/x86_64/call-insn.rtl   |  17 ++
>  .../selftests/x86_64/copy-hard-reg-into-frame.rtl  |  15 ++
>  gcc/testsuite/selftests/x86_64/times-two.rtl   |  51 +
>  gcc/testsuite/selftests/x86_64/unspec.rtl  |  20 ++
>  5 files changed, 313 insertions(+)
>  create mode 100644 gcc/testsuite/selftests/x86_64/call-insn.rtl
>  create mode 100644 gcc/testsuite/selftests/x86_64/copy-hard-reg-into-frame.rtl
>  create mode 100644 gcc/testsuite/selftests/x86_64/times-two.rtl
>  create mode 100644 gcc/testsuite/selftests/x86_64/unspec.rtl
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 1cd1cd8..dc1a86f 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -51200,6 +51200,209 @@ ix86_test_dumping_memory_blockage ()
> "] UNSPEC_MEMORY_BLOCKAGE)))\n", pat, );
>  }
>  
> +/* Verify loading an RTL dump; specifically a dump of copying
> +   a param on x86_64 from a hard reg into the frame.
> +   This test is target-specific since the dump contains target-specific
> +   hard reg names.  */
> +
> +static void
> +ix86_test_loading_dump_fragment_1 ()
> +{
> +  rtl_dump_test t (SELFTEST_LOCATION,
> +                   locate_file ("x86_64/copy-hard-reg-into-frame.rtl"));
> +
> +  rtx_insn *insn = get_insn_by_uid (1);
> +
> +  /* The block structure and indentation here is purely for
> + readability; it mirrors the structure of the rtx.  */
> +  tree mem_expr;
> +  {
> +rtx pat = PATTERN (insn);
> +ASSERT_EQ (SET, GET_CODE (pat));
> +{
> +  rtx dest = SET_DEST (pat);
> +  ASSERT_EQ (MEM, GET_CODE (dest));
> +  /* Verify the "/c" was parsed.  */
> +  ASSERT_TRUE (RTX_FLAG (dest, call));
> +  ASSERT_EQ (SImode, GET_MODE (dest));
> +  {
> + rtx addr = XEXP (dest, 0);
> + ASSERT_EQ (PLUS, GET_CODE (addr));
> + ASSERT_EQ (DImode, GET_MODE (addr));
> + {
> +   rtx lhs = XEXP (addr, 0);
> +   /* Verify that the "frame" REG was consolidated.  */
> +   ASSERT_RTX_PTR_EQ (frame_pointer_rtx, lhs);
> + }
> + {
> +   rtx rhs = XEXP (addr, 1);
> +   ASSERT_EQ (CONST_INT, GET_CODE (rhs));
> +   ASSERT_EQ (-4, INTVAL (rhs));
> + }
> +  }
> +  /* Verify the "[1 i+0 S4 A32]" was parsed.  */
> +  ASSERT_EQ (1, MEM_ALIAS_SET (dest));
> +  /* "i" should have been handled by synthesizing a global int
> +  variable named "i".  */
> +  mem_expr = MEM_EXPR (dest);
> +  ASSERT_NE (mem_expr, NULL);
> +  ASSERT_EQ (VAR_DECL, TREE_CODE (mem_expr));
> +  ASSERT_EQ (integer_type_node, TREE_TYPE (mem_expr));
> +  ASSERT_EQ (IDENTIFIER_NODE, TREE_CODE (DECL_NAME (mem_expr)));
> +  ASSERT_STREQ ("i", IDENTIFIER_POINTER (DECL_NAME

Re: [patch,avr] PR78883: Implement CANNOT_CHANGE_MODE_CLASS.

2017-01-03 Thread Segher Boessenkool

On Tue, Jan 03, 2017 at 01:43:01PM +0000, Richard Sandiford wrote:
> An alternative would be to add a new macro to control this block in
> general_operand:
> 
> #ifdef INSN_SCHEDULING
>   /* On machines that have insn scheduling, we want all memory
>reference to be explicit, so outlaw paradoxical SUBREGs.
>However, we must allow them after reload so that they can
>get cleaned up by cleanup_subreg_operands.  */
>   if (!reload_completed && MEM_P (sub)
> && GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub)))
>   return 0;
> #endif
> 
> The default would still be INSN_SCHEDULING, but AVR could override it
> to 1 and reject the same subregs.

Or you can define INSN_SCHEDULING (by defining a trivial automaton for
your port: a define_automaton, a define_cpu_unit for that automaton, and
a define_insn_reservation for that unit).

It would be nice if we could separate "no subregs of memory" from "has
instruction scheduling".  Or ideally subregs of memory would just go away
completely.

> That would still be a hack, but at least it would be taking things in
> a good direction.  Having different rules depending on whether targets
> define a scheduler is just a horrible wart that no-one's ever had chance
> to fix.  If using the code above works well on AVR then that'd be a big
> step towards making the code unconditional.

Ugh, it sounds like I am volunteering.  Oh well.

> It'd definitely be worth checking how it affects code quality though.

It shouldn't be too bad, if ports that do have instruction scheduling can
live without it.

> (Although the same goes for the current patch, since C_C_M_C is a pretty
> big hammer.)

Yeah, and the current patch disallows much more than is needed here, even.

Segher
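
For readers following the thread, the general_operand check quoted above can be modelled as a small predicate: a SUBREG of a MEM is rejected before reload completes when the outer mode is wider than the inner one (a "paradoxical" subreg). The sketch below is only an illustration; struct mock_subreg and rejects_paradoxical_mem_subreg are hypothetical names, not GCC code, and the fields merely stand in for what the quoted macros inspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the operand being tested.  GCC itself works on
   rtx objects; these fields only model what the quoted check looks at.  */
struct mock_subreg
{
  size_t outer_size;   /* models GET_MODE_SIZE (mode) */
  size_t inner_size;   /* models GET_MODE_SIZE (GET_MODE (sub)) */
  bool inner_is_mem;   /* models MEM_P (sub) */
};

/* Return true if the operand would be rejected by the quoted check.  */
bool
rejects_paradoxical_mem_subreg (const struct mock_subreg *op,
                                bool reload_completed)
{
  return !reload_completed
         && op->inner_is_mem
         && op->outer_size > op->inner_size;
}
```

Making the check unconditional, as discussed above, would amount to dropping the INSN_SCHEDULING guard around this predicate.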

Re: [Patch][ARM,AArch64] more poly64 intrinsics and tests

2017-01-03 Thread Christophe Lyon

Ping?


On 14 December 2016 at 23:09, Christophe Lyon wrote:
> On 14 December 2016 at 17:55, James Greenhalgh wrote:
>> On Mon, Dec 12, 2016 at 05:03:31PM +0100, Christophe Lyon wrote:
>>> Hi,
>>>
>>> After the recent update from Tamar, I noticed a few discrepancies
>>> between ARM and AArch64 regarding a few poly64 intrinsics.
>>>
>>> This patch:
>>> - adds vtst_p64 and vtstq_p64 to AArch64's arm_neon.h
>>> - adds vgetq_lane_p64, vset_lane_p64 and vsetq_lane_p64 to ARM's arm_neon.h
>>> (vget_lane_p64 was already there)
>>> - adds the corresponding tests, and moves the vget_lane_p64 ones out
>>> of the #ifdef __aarch64__ zone.
>>>
>>> Cross-tested on arm* and aarch64* targets.
>>>
>>> OK?
>>
>> The AArch64 parts of this look fine to me, but I do have one question on
>> your inline assembly implementation for vtstq_p64:
>>
>>> +__extension__ extern __inline uint64x2_t
>>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>>> +vtstq_p64 (poly64x2_t a, poly64x2_t b)
>>> +{
>>> +  uint64x2_t result;
>>> +  __asm__ ("cmtst %0.2d, %1.2d, %2.2d"
>>> +   : "=w"(result)
>>> +   : "w"(a), "w"(b)
>>> +   : /* No clobbers */);
>>> +  return result;
>>> +}
>>> +
>>
>> Why can this not be written as many of the other vtstq intrinsics are; e.g.:
>>
>>__extension__ extern __inline uint64x2_t
>>   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>>   vtstq_p64 (poly64x2_t __a, poly64x2_t __b)
>>   {
>> return (uint64x2_t) ((((uint64x2_t) __a) & ((uint64x2_t) __b))
>>   != __AARCH64_INT64_C (0));
>>   }
>>
>
> I don't know, I just followed the pattern used for vtstq_p8 and vtstq_p16
> just above...
>
>
>> Thanks,
>> James
>>
>>> gcc/ChangeLog:
>>>
>>> 2016-12-12  Christophe Lyon  
>>>
>>>   * config/aarch64/arm_neon.h (vtst_p64): New.
>>>   (vtstq_p64): New.
>>>   * config/arm/arm_neon.h (vgetq_lane_p64): New.
>>>   (vset_lane_p64): New.
>>>   (vsetq_lane_p64): New.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-12-12  Christophe Lyon  
>>>
>>>   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
>>>   (vget_lane_expected, vset_lane_expected, vtst_expected_poly64x1):
>>>   New.
>>>   (vmov_n_expected0, vmov_n_expected1, vmov_n_expected2)
>>>   (expected_vld_st2_0, expected_vld_st2_1, expected_vld_st3_0)
>>>   (expected_vld_st3_1, expected_vld_st3_2, expected_vld_st4_0)
>>>   (expected_vld_st4_1, expected_vld_st4_2, expected_vld_st4_3)
>>>   (vtst_expected_poly64x2): Move to aarch64-only section.
>>>   (vget_lane_p64, vgetq_lane_p64, vset_lane_p64, vsetq_lane_p64)
>>>   (vtst_p64, vtstq_p64): New tests.
>>>
>>
>>
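
Both forms of vtstq_p64 discussed above are meant to compute the same per-lane result: a lane becomes all ones when the two inputs share at least one set bit, and zero otherwise. A minimal scalar model of that behaviour follows, using plain arrays instead of NEON types; model_tst_lane and model_tstq are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 64-bit lane of a "test bits" operation: all ones if
   the operands share any set bit, zero otherwise.  */
uint64_t
model_tst_lane (uint64_t a, uint64_t b)
{
  return (a & b) != 0 ? UINT64_MAX : 0;
}

/* Two-lane model with the same shape as vtstq_p64.  */
void
model_tstq (const uint64_t a[2], const uint64_t b[2], uint64_t result[2])
{
  for (int i = 0; i < 2; i++)
    result[i] = model_tst_lane (a[i], b[i]);
}
```

Writing the intrinsic with the vector comparison rather than inline assembly leaves the compiler free to fold or schedule the operation, which is presumably the motivation behind James's question.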

Re: [ARM] PR 78253 do not resolve weak ref locally

2017-01-03 Thread Christophe Lyon

Ping?

The patch is at https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00078.html


On 14 December 2016 at 16:29, Christophe Lyon wrote:
> Ping^2 ?
>
> As a reminder, this patch mimics what aarch64 does with respect to references
> to weak symbols, so that they are not resolved by the assembler in case a
> strong definition overrides the local one at link time.
>
> Christophe
>
>
> On 8 December 2016 at 09:17, Christophe Lyon wrote:
>> Ping?
>>
>> On 1 December 2016 at 15:27, Christophe Lyon wrote:
>>> Hi,
>>>
>>>
>>> On 10 November 2016 at 15:10, Christophe Lyon wrote:
>>>> On 10 November 2016 at 11:05, Richard Earnshaw wrote:
>>>>> On 09/11/16 21:29, Christophe Lyon wrote:
>>>>>> Hi,
>>>>>>
>>>>>> PR 78253 shows that the handling of weak references has changed for
>>>>>> ARM with gcc-5.
>>>>>>
>>>>>> When r220674 was committed, default_binds_local_p_2 gained a new
>>>>>> parameter (weak_dominate), which, when true, implies that a reference
>>>>>> to a weak symbol defined locally will be resolved locally, even though
>>>>>> it could be overridden by a strong definition in another object file.
>>>>>>
>>>>>> With r220674, default_binds_local_p forces weak_dominate=true,
>>>>>> effectively changing the previous behavior.
>>>>>>
>>>>>> The attached patch introduces default_binds_local_p_4 which is a copy
>>>>>> of default_binds_local_p_2, but using weak_dominate=false, and updates
>>>>>> the ARM target to call default_binds_local_p_4 instead of
>>>>>> default_binds_local_p_2.
>>>>>>
>>>>>> I ran cross-tests on various arm* configurations with no regression,
>>>>>> and checked that the test attached to the original bugzilla now works
>>>>>> as expected.
>>>>>>
>>>>>> I am not sure why weak_dominate defaults to true, and I couldn't
>>>>>> really understand why by reading the threads related to r220674 and
>>>>>> following updates to default_binds_local_p_* which all deal with other
>>>>>> corner cases and do not discuss the weak_dominate parameter.
>>>>>>
>>>>>> Or should this patch be made more generic?
>>>>>>
>>>>>
>>>>> I certainly don't think it should be ARM specific.
>>>> That was my feeling too.
>>>>
>>>>>
>>>>> The questions I have are:
>>>>>
>>>>> 1) What do other targets do today.  Are they the same, or different?
>>>>
>>>> arm, aarch64, s390 use default_binds_local_p_2 since PR 65780, and
>>>> default_binds_local_p before that. Both have weak_dominate=true
>>>> i386 has its own version, calling default_binds_local_p_3 with true
>>>> for weak_dominate
>>>>
>>>> But the behaviour of default_binds_local_p changed with r220674 as I said
>>>> above.
>>>> See https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=220674 and
>>>> notice how weak_dominate was introduced
>>>>
>>>> The original bug report is about a different case:
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32219
>>>>
>>>> The original patch submission is
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html
>>>> and the 1st version with weak_dominate is in:
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00469.html
>>>> but it's not clear to me why this was introduced
>>>>
>>>>> 2) If different why?
>>>> on aarch64, although binds_local_p returns true, the relocations used when
>>>> building the function pointer is still the same (still via the GOT).
>>>>
>>>> aarch64 has different logic than arm when accessing a symbol
>>>> (eg aarch64_classify_symbol)
>>>>
>>>>> 3) Is the current behaviour really what was intended by the patch?  ie.
>>>>> Was the old behaviour actually wrong?
>>>>>
>>>> That's what I was wondering.
>>>> Before r220674, calling a weak function directly or via a function
>>>> pointer had the same effect (in other words, the function pointer
>>>> points to the actual implementation: the strong one if any, the weak
>>>> one otherwise).
>>>>
>>>> After r220674, on arm the function pointer points to the weak
>>>> definition, which seems wrong to me, it should leave the actual
>>>> resolution to the linker.
>>>>
>>>>
>>>
>>> After looking at the aarch64 port, I think that references to weak symbols
>>> have to be handled carefully, to make sure they cannot be resolved
>>> by the assembler, since the weak symbol can be overridden by a strong
>>> definition at link-time.
>>>
>>> Here is a new patch which does that.
>>> Validated on arm* targets with no regression, and I checked that the
>>> original testcase now executes as expected.
>>>
>>> Christophe
>>>
>>>
>>>>> R.
>>>>>> Thanks,
>>>>>>
>>>>>> Christophe
>>>>>>
>>>>>
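
The situation described in this thread can be reproduced in miniature with a weak function that is reached both directly and through a function pointer. The sketch below is not the PR 78253 testcase; get_value, call_direct and call_via_pointer are hypothetical names, and the weak attribute is the GCC/Clang extension.

```c
#include <assert.h>

/* Weak default definition.  A strong definition of the same symbol in
   another object file would replace it at link time.  */
__attribute__ ((weak)) int
get_value (void)
{
  return 1;
}

/* Reach the function through a direct call.  */
int
call_direct (void)
{
  return get_value ();
}

/* Reach the function through a pointer.  The address taken here is the
   reference that must be left for the linker to resolve.  */
int
call_via_pointer (void)
{
  int (*fp) (void) = get_value;
  return fp ();
}
```

Built on its own, both paths reach the weak definition. If a second object file supplies a strong get_value, both paths are expected to reach that one instead; the complaint in the thread is that on ARM the pointer path could end up bound to the weak definition.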

Re: [PATCH] libstdc++: Allow using without lock free atomic int

2017-01-03 Thread Jonathan Wakely


On 19/12/16 17:52 +0000, Jonathan Wakely wrote:

On 16/12/16 17:52 +0000, Jonathan Wakely wrote:

On 09/11/16 23:26 +0200, Pauli wrote:

Compiling programs that use std::future fails for old ARM processors. The
problem is caused by a preprocessor check for a lock-free atomic int.

std::future can be made to work correctly without lock-free atomics with
minor changes to the exception_ptr implementation.

Without lock-free atomics there is the question of whether a deadlock can
happen. But atomic operations can't call outside code, which rules out ABBA
or recursive mutex acquisition deadlocks.
A deadlock could happen when throwing an exception, or accessing an atomic
for which is_lock_free() == false, from an asynchronous signal handler.
Throwing from a signal handler is undefined behavior. I don't know about
accessing atomics from an asynchronous signal handler, but that feels like
undefined behavior if is_lock_free() returns false.

Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64735

Differences from the current code when atomic builtins are available:
* Race detector annotations that are empty by default
* Check for __gthread_active_p
* Generated x86 code uses xadd instead of xsub
This makes the code a bit worse. But the problem also applies to any other
user of __exchange_and_add. The internal API implementation should be fixed
to generate better code for all cases. But that is a follow-up patch.


I'd prefer to do it so we don't change anything for the targets that
already work. Your follow-up patch missed the deadline for GCC 7 and
so will have to wait for GCC 8 now, and we don't want to pessimize
x86.

Also, I think your patch actually breaks some GNU/Linux targets,
because you removed the  header from
, which means that in libsupc++/guard.cc the macro
ATOMIC_INT_LOCK_FREE is no longer defined, and so _GLIBCXX_USE_FUTEX
doesn't get defined. Now arguably guard.cc should have been including
the header directly, but it still shows why such an invasive patch is
a bad idea at this stage of the GCC 7 process.

The attached patch attempts to make exception propagation work for all
targets, without changing anything if it already works.

Do you see any problems with this alternative approach?
Could you please test it for armv5te?

It passes all tests for x86_64-linux and ppc64le-linux.

For your follow-up patch, do you already have a copyright assignment
for contributions to GCC? We'll probably need that before it can be
accepted. We don't need one for this patch, because what remains of
your original patch is just the testsuite changes, which are
mechanical and not copyrightable.


We also need to adjust the linker script to avoid adding new exports
to old symbol versions, revised patch attached. I think it would be
better to make configure define a macro like
HAVE_EXCEPTION_PTR_SINCE_GCC6 and use that in the linker script
instead of testing __GCC_ATOMIC_INT_LOCK_FREE directly. I'll work on
that.


Here's what I plan to commit to trunk tomorrow.


commit fa070e976709218d9927e2ed880bd29f7f98106f
Author: Jonathan Wakely 
Date:   Fri Dec 16 15:22:21 2016 +0000

Support exception propagation without lock-free atomic int

2017-01-03  Pauli Nieminen  
	Jonathan Wakely  

	PR libstdc++/64735
	* acinclude.m4 (GLIBCXX_CHECK_EXCEPTION_PTR_SYMVER): Define.
	* config.h.in: Regenerate.
	* config/abi/pre/gnu.ver [HAVE_EXCEPTION_PTR_SINCE_GCC46]
	(GLIBCXX_3.4.15, GLIBCXX_3.4.21, CXXABI_1.3.3, CXXABI_1.3.5): Make
	exports for exception_ptr, nested_exception, and future conditional.
	[HAVE_EXCEPTION_PTR_SINCE_GCC46] (GLIBCXX_3.4.23, CXXABI_1.3.11): Add
	exports for exception_ptr, nested_exception, and future conditional.
	* configure: Regenerate.
	* configure.ac: Use GLIBCXX_CHECK_EXCEPTION_PTR_SYMVER.
	* include/std/future: Remove check for ATOMIC_INT_LOCK_FREE.
	* libsupc++/eh_atomics.h: New file for internal use only.
	(__eh_atomic_inc, __eh_atomic_dec): New.
	* libsupc++/eh_ptr.cc (exception_ptr::_M_addref)
	(exception_ptr::_M_release) (__gxx_dependent_exception_cleanup)
	(rethrow_exception): Use eh_atomics.h reference counting helpers.
	* libsupc++/eh_throw.cc (__gxx_exception_cleanup): Likewise.
	* libsupc++/eh_tm.cc (free_any_cxa_exception): Likewise.
	* libsupc++/exception: Remove check for ATOMIC_INT_LOCK_FREE.
	* libsupc++/exception_ptr.h: Likewise.
	* libsupc++/guard.cc: Include header for ATOMIC_INT_LOCK_FREE macro.
	* libsupc++/nested_exception.cc: Remove check for
	ATOMIC_INT_LOCK_FREE.
	* libsupc++/nested_exception.h: Likewise.
	* src/c++11/future.cc: Likewise.
	* testsuite/18_support/exception_ptr/*: Remove atomic builtins checks.
	* testsuite/18_support/nested_exception/*: Likewise.
	* testsuite/30_threads/async/*: Likewise.
	* testsuite/30_threads/future/*: Likewise.
	* testsuite/30_threads/headers/future/types_std_c++0x.cc: Likewise.
	* testsuite/30_threads/packaged_task/*: Likewise.
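
The ChangeLog above mentions reference-counting helpers for exception objects. As a rough illustration of what such helpers can look like, here is a minimal sketch built on the GCC/Clang __atomic builtins. It is not the contents of eh_atomics.h; refcount_inc and refcount_dec are hypothetical names, and a plain int stands in for whatever counter type the library uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-count helpers.  On targets without a lock-free
   instruction sequence for this width, the compiler may emit a call to an
   external routine instead of inline atomic instructions.  */
void
refcount_inc (int *count)
{
  __atomic_add_fetch (count, 1, __ATOMIC_ACQ_REL);
}

/* Returns true when the count drops to zero, meaning the caller is
   responsible for destroying the object.  */
bool
refcount_dec (int *count)
{
  return __atomic_sub_fetch (count, 1, __ATOMIC_ACQ_REL) == 0;
}
```

Keeping the counting behind two small helpers means the callers do not need to know whether the operation is lock-free on the target.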

Re: [PATCH v2] aarch64: Add split-stack initial support

2017-01-03 Thread Wilco Dijkstra

Adhemerval Zanella wrote:
  
Sorry for the late reply - but I think it's getting there. A few more comments:

+  /* If function uses stacked arguments save the old stack value so morestack
+ can return it.  */
+  reg11 = gen_rtx_REG (Pmode, R11_REGNUM);
+  if (cfun->machine->frame.saved_regs_size
+  || cfun->machine->frame.saved_varargs_size)
+emit_move_insn (reg11, stack_pointer_rtx);

This doesn't look right - we could have many arguments even without varargs or
saved regs.  This would need to check varargs as well as crtl->args.size (I
believe that is the size of the arguments on the stack). It's fine to omit this
optimization in the first version - we already emit 2-3 extra instructions for
the check anyway.


+void
+aarch64_split_stack_space_check (rtx size, rtx label)
+{
+  rtx mem, ssvalue, cc, cmp, jump, temp;
+  rtx requested = gen_reg_rtx (Pmode);
+  /* Offset from thread pointer to __private_ss.  */
+  int psso = 0x10;
+
+  /* Load __private_ss from TCB.  */
+  ssvalue = gen_rtx_REG (Pmode, R9_REGNUM);

ssvalue doesn't need to be a hardcoded register.

+  emit_insn (gen_aarch64_load_tp_hard (ssvalue));
+  mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, ssvalue, psso));
+  emit_move_insn (ssvalue, mem);
+
+  temp = gen_rtx_REG (Pmode, R10_REGNUM);
+
+  /* And compare it with frame pointer plus required stack.  */
+  size = force_reg (Pmode, size);
+  emit_move_insn (requested, gen_rtx_MINUS (Pmode, stack_pointer_rtx, size));
+
+  /* Jump to __morestack call if current __private_ss is not suffice.  */
+  cc = aarch64_gen_compare_reg (LT, temp, ssvalue);

This uses X10, but where is it set???

+  cmp = gen_rtx_fmt_ee (GEU, VOIDmode, cc, const0_rtx);
+  jump = emit_jump_insn (gen_condjump (cmp, cc, label));
+  JUMP_LABEL (jump) = label;
+}

So neither X10 nor X12 is set before potentially calling __morestack, so I
don't think it will work. Could this be causing the crash you mentioned?

Wilco
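
The check under review can be summarised independently of the RTL: compute the stack pointer minus the requested size, and take the slow path when that value falls below the limit stored for the thread. The sketch below models only that arithmetic, assuming a downward-growing stack; needs_more_stack and its parameters are hypothetical names, and it compares the computed "requested" value, which appears to be what the quoted code intends where it uses the unset temporary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the split-stack space check.  Returns true when
   the slow path (a call into the stack-growing routine) would be needed.  */
bool
needs_more_stack (uintptr_t sp, uintptr_t size, uintptr_t stack_limit)
{
  /* Guard against wrap-around when size exceeds sp.  */
  if (size > sp)
    return true;

  uintptr_t requested = sp - size;
  return requested < stack_limit;
}
```

A request that lands exactly on the limit is treated as fitting here; whether the real check should be strict or not is a detail for the patch author.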

Re: [PATCH 2/2] [ADA] Fix constants in s-linux-mips.ads

2017-01-03 Thread Arnaud Charlet

> This patch corrects various constants in s-linux-mips.ads. A large
> proportion (especially the signals) were simply wrong on MIPS. It also
> fixes the struct sigaction offsets, which are incorrect on 64-bit
> systems because sa_flags is an int (always 32 bits), not a pointer.
> 
> Thanks,
> James
> 
> gcc/ada/Changelog:
> 
> 2017-01-03  James Cowgill  
> 
>   * s-linux-mips.ads: Use correct signal and errno constants.
>   (sa_handler_pos, sa_mask_pos): Fix offsets for 64-bit MIPS.

Change is OK.

[PATCH 2/2] [ADA] Fix constants in s-linux-mips.ads

2017-01-03 Thread James Cowgill

Hi,

This patch corrects various constants in s-linux-mips.ads. A large
proportion (especially the signals) were simply wrong on MIPS. It also
fixes the struct sigaction offsets, which are incorrect on 64-bit
systems because sa_flags is an int (always 32 bits), not a pointer.

Thanks,
James

gcc/ada/Changelog:

2017-01-03  James Cowgill  

* s-linux-mips.ads: Use correct signal and errno constants.
(sa_handler_pos, sa_mask_pos): Fix offsets for 64-bit MIPS.

diff --git a/gcc/ada/s-linux-mips.ads b/gcc/ada/s-linux-mips.ads
index 17a3375ccce..f10f35caff9 100644
--- a/gcc/ada/s-linux-mips.ads
+++ b/gcc/ada/s-linux-mips.ads
@@ -26,7 +26,7 @@
 --  --
 --
 
---  This is the mipsel version of this package
+--  This is the mips version of this package
 
 --  This package encapsulates cpu specific differences between implementations
 --  of GNU/Linux, in order to share s-osinte-linux.ads.
@@ -43,6 +43,7 @@ package System.Linux is
-- Time --
--
 
+   subtype int         is Interfaces.C.int;
    subtype long        is Interfaces.C.long;
    subtype suseconds_t is Interfaces.C.long;
    subtype time_t      is Interfaces.C.long;
@@ -69,7 +70,7 @@ package System.Linux is
EINVAL: constant := 22;
ENOMEM: constant := 12;
EPERM : constant := 1;
-   ETIMEDOUT : constant := 110;
+   ETIMEDOUT : constant := 145;
 
-
-- Signals --
@@ -82,45 +83,52 @@ package System.Linux is
SIGTRAP: constant := 5; --  trace trap (not reset)
SIGIOT : constant := 6; --  IOT instruction
SIGABRT: constant := 6; --  used by abort, replace SIGIOT in the  future
+   SIGEMT : constant := 7; --  EMT
SIGFPE : constant := 8; --  floating point exception
SIGKILL: constant := 9; --  kill (cannot be caught or ignored)
-   SIGBUS : constant := 7; --  bus error
+   SIGBUS : constant := 10; --  bus error
SIGSEGV: constant := 11; --  segmentation violation
+   SIGSYS : constant := 12; --  bad system call
SIGPIPE: constant := 13; --  write on a pipe with no one to read it
SIGALRM: constant := 14; --  alarm clock
SIGTERM: constant := 15; --  software termination signal from kill
-   SIGUSR1: constant := 10; --  user defined signal 1
-   SIGUSR2: constant := 12; --  user defined signal 2
-   SIGCLD : constant := 17; --  alias for SIGCHLD
-   SIGCHLD: constant := 17; --  child status change
-   SIGPWR : constant := 30; --  power-fail restart
-   SIGWINCH   : constant := 28; --  window size change
-   SIGURG : constant := 23; --  urgent condition on IO channel
-   SIGPOLL: constant := 29; --  pollable event occurred
-   SIGIO  : constant := 29; --  I/O now possible (4.2 BSD)
-   SIGLOST: constant := 29; --  File lock lost
-   SIGSTOP: constant := 19; --  stop (cannot be caught or ignored)
-   SIGTSTP: constant := 20; --  user stop requested from tty
-   SIGCONT: constant := 18; --  stopped process has been continued
-   SIGTTIN: constant := 21; --  background tty read attempted
-   SIGTTOU: constant := 22; --  background tty write attempted
-   SIGVTALRM  : constant := 26; --  virtual timer expired
-   SIGPROF: constant := 27; --  profiling timer expired
-   SIGXCPU: constant := 24; --  CPU time limit exceeded
-   SIGXFSZ: constant := 25; --  filesize limit exceeded
-   SIGUNUSED  : constant := 31; --  unused signal (GNU/Linux)
-   SIGSTKFLT  : constant := 16; --  coprocessor stack fault (Linux)
+   SIGUSR1: constant := 16; --  user defined signal 1
+   SIGUSR2: constant := 17; --  user defined signal 2
+   SIGCLD : constant := 18; --  alias for SIGCHLD
+   SIGCHLD: constant := 18; --  child status change
+   SIGPWR : constant := 19; --  power-fail restart
+   SIGWINCH   : constant := 20; --  window size change
+   SIGURG : constant := 21; --  urgent condition on IO channel
+   SIGPOLL: constant := 22; --  pollable event occurred
+   SIGIO  : constant := 22; --  I/O now possible (4.2 BSD)
+   SIGSTOP: constant := 23; --  stop (cannot be caught or ignored)
+   SIGTSTP: constant := 24; --  user stop requested from tty
+   SIGCONT: constant := 25; --  stopped process has been continued
+   SIGTTIN: constant := 26; --  background tty read attempted
+   SIGTTOU: constant := 27; --  background tty write attempted
+   SIGVTALRM  : constant := 28; --  virtual timer expired
+   SIGPROF: constant := 29; --  profiling timer expired
+   SIGXCPU: constant := 30; --  CPU time limit exceeded
+   SIGXFSZ: constant := 31; --  filesize limit exceeded
+
SIGLTHRRES : constant := 32; --  GNU/LinuxThreads restart signal
SIGLTHRCAN : constant := 33; --  GNU/LinuxThreads cancel signal
SIGLTHRDBG : constant := 34;
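
The sigaction offset problem described in this message comes down to struct layout: when a 32-bit flags field precedes a pointer, the pointer's offset depends on pointer alignment rather than being a fixed multiple of the pointer size. The sketch below uses a mock structure to show how such offsets can be queried. It assumes the flags field comes first, as the description implies; struct mock_sigaction and the two helper functions are hypothetical and are not the kernel or libc definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: an int-sized flags field followed by a handler
   pointer and a signal mask.  Not the real struct sigaction.  */
struct mock_sigaction
{
  unsigned int flags;        /* int-sized regardless of pointer width */
  void (*handler) (int);     /* pointer-sized and pointer-aligned */
  unsigned long mask[2];
};

/* Offset of the handler field, including any padding after flags.  */
size_t
mock_handler_offset (void)
{
  return offsetof (struct mock_sigaction, handler);
}

/* Offset of the mask field.  */
size_t
mock_mask_offset (void)
{
  return offsetof (struct mock_sigaction, mask);
}
```

On a typical LP64 target the handler lands at offset 8 because of padding after the 4-byte flags field, while on a typical ILP32 target it lands at offset 4, which is why constants derived from only one of the two cases go wrong on the other.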

Re: [PATCH 1/2] [ADA] Fix MIPS big-endian build

2017-01-03 Thread James Cowgill

Hi,

On 03/01/17 14:47, Arnaud Charlet wrote:
>> This patch merges the mips and mipsel sections in
>> gcc-interface/Makefile.in favoring the existing variables in mipsel.
>> Over time, the mipsel target was tested much more than the mips target
>> and a number of fixes were applied which should have been applied to
>> both. Since the only real difference between mips and mipsel is the
>> endianness, it makes sense to merge them together and add an extra ifeq
>> for the one file which does differ with endianness.
>>
>> I don't have commit access.
>>
>> Thanks,
>> James
>>
>> gcc/ada/Changelog:
>>
>> 2017-01-03  James Cowgill  
>>
>>  * s-linux-mips.ads: Rename from s-linux-mipsel.ads.
>>  * gcc-interface/Makefile.in (MIPS/Linux): Merge mips and mipsel
>>  sections.
> 
> Changes look OK to me.

Thanks, can you commit it for me? I screwed up the patch in the email I
sent a few minutes ago but the patch below should apply.

James

gcc/ada/Changelog:

2017-01-03  James Cowgill  

* s-linux-mips.ads: Rename from s-linux-mipsel.ads.
* gcc-interface/Makefile.in (MIPS/Linux): Merge mips and mipsel
sections.

diff --git a/gcc/ada/gcc-interface/Makefile.in 
b/gcc/ada/gcc-interface/Makefile.in
index 98889c0f30f..b47a16c8b41 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -1813,36 +1813,12 @@ ifeq ($(strip $(filter-out cygwin% mingw32% 
pe,$(target_os))),)
 endif
 
 # Mips Linux
-ifeq ($(strip $(filter-out mips linux%,$(target_cpu) $(target_os))),)
+ifeq ($(strip $(filter-out mips% linux%,$(target_cpu) $(target_os))),)
   LIBGNAT_TARGET_PAIRS = \
   a-intnam.ads

Re: [PATCH 1/2] [ADA] Fix MIPS big-endian build

2017-01-03 Thread Arnaud Charlet

> This patch merges the mips and mipsel sections in
> gcc-interface/Makefile.in favoring the existing variables in mipsel.
> Over time, the mipsel target was tested much more than the mips target
> and a number of fixes were applied which should have been applied to
> both. Since the only real difference between mips and mipsel is the
> endianness, it makes sense to merge them together and add an extra ifeq
> for the one file which does differ with endianness.
> 
> I don't have commit access.
> 
> Thanks,
> James
> 
> gcc/ada/Changelog:
> 
> 2017-01-03  James Cowgill  
> 
>   * s-linux-mips.ads: Rename from s-linux-mipsel.ads.
>   * gcc-interface/Makefile.in (MIPS/Linux): Merge mips and mipsel
>   sections.

Changes look OK to me.

Arno

[PATCH 1/2] [ADA] Fix MIPS big-endian build

2017-01-03 Thread James Cowgill

Hi,

This patch merges the mips and mipsel sections in
gcc-interface/Makefile.in favoring the existing variables in mipsel.
Over time, the mipsel target was tested much more than the mips target
and a number of fixes were applied which should have been applied to
both. Since the only real difference between mips and mipsel is the
endianness, it makes sense to merge them together and add an extra ifeq
for the one file which does differ with endianness.

I don't have commit access.

Thanks,
James

gcc/ada/Changelog:

2017-01-03  James Cowgill  

* s-linux-mips.ads: Rename from s-linux-mipsel.ads.
* gcc-interface/Makefile.in (MIPS/Linux): Merge mips and mipsel
sections.

diff --git a/gcc/ada/gcc-interface/Makefile.in
b/gcc/ada/gcc-interface/Makefile.in
index 98889c0f30f..b47a16c8b41 100644
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -1813,36 +1813,12 @@ ifeq ($(strip $(filter-out cygwin% mingw32%
pe,$(target_os))),)
 endif
  # Mips Linux
-ifeq ($(strip $(filter-out mips linux%,$(target_cpu) $(target_os))),)
+ifeq ($(strip $(filter-out mips% linux%,$(target_cpu) $(target_os))),)
   LIBGNAT_TARGET_PAIRS = \
   a-intnam.ads

Re: [PATCH, wwwdocs] Add gcc-7/porting_to.html, Fortran changes

2017-01-03 Thread Gerald Pfeifer

On Tue, 3 Jan 2017, Janne Blomqvist wrote:
> the attached patch mentions some changes in the Fortran frontend for 
> the GCC 7 cycle, and also adds the so-far missing gcc-7/porting_to.html 
> file.

Thank you, Janne.

> Ok to commit?

Yes, modulo Joseph's comment and a missing  towards the end
of porting_to.html.

(I am not sure ... can sit inside a ... environment,
and so you may need to alternate the two instead of nesting them, though
if it validates fine, that's okay of course.)

Gerald

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Janne Blomqvist

On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek  wrote:
> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>> This patch costs several thousand additional instructions in
>> Spec2006 on s390x ("lines" = instructions):
>>
>>   410.bwaves:     +28 lines (2 funcs bigger)
>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>   454.calculix:  +474 lines (23 funcs bigger)
>>   465.tonto:    +2182 lines (221 funcs bigger)
>>   481.wrf:      +4988 lines (117 funcs bigger)
>>   416.gamess:   +3723 lines (466 funcs bigger)
>>
>> s390x has a "compare with immediate and jump relative" instruction
>> for 32-bit quantities, but for 8-bit quantities it needs separate
>> compare and jump instructions, e.g.
>>
>>   cijne   %r1,0,...
>>
>> ->
>>
>>   tmll    %r1,1
>>   jne     ...
>>
>> Instead of hard-coding a specific type, should one ask the backend
>> for the preferred type?

Hmm, that's sort of the opposite of what I had hoped for.. :-/

Is there some way to ask the backend what the preferred type is, then?

(The snide answer: why didn't the s390 ABI define
bool/_Bool/boolean_type_node to be a 32-bit type if 8-bit types are
problematic? But that's of course water under the bridge by now...)

> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.

Based on the quick audit I did when I wrote the patch, the only time
it's used except as a local temp variable, is for a couple of the
co-array intrinsics, where the corresponding library implementation
actually uses C _Bool (I suspect it has worked by accident if the args
are passed in registers).

>  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferable.
>
> Jakub

-- 
Janne Blomqvist

[wwwdocs] Trim GCJ references from "Current Development" in onlinedocs/

2017-01-03 Thread Gerald Pfeifer

Applied.

Gerald

Index: onlinedocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/onlinedocs/index.html,v
retrieving revision 1.161
diff -u -r1.161 index.html
--- onlinedocs/index.html   21 Dec 2016 10:57:36 -0000  1.161
+++ onlinedocs/index.html   3 Jan 2017 14:19:34 -0000
@@ -1098,12 +1098,6 @@
href="https://gcc.gnu.org/onlinedocs/gfortran.ps.gz;>PostScript 
or https://gcc.gnu.org/onlinedocs/gfortran-html.tar.gz;>an
HTML tarball)
-https://gcc.gnu.org/onlinedocs/gcj/;>GCJ Manual (https://gcc.gnu.org/onlinedocs/gcj.pdf;>also in
-   PDF or https://gcc.gnu.org/onlinedocs/gcj.ps.gz;>PostScript or https://gcc.gnu.org/onlinedocs/gcj-html.tar.gz;>an
-   HTML tarball)
 https://gcc.gnu.org/onlinedocs/cpp/;>CPP Manual (https://gcc.gnu.org/onlinedocs/cpp.pdf;>also in
PDF or

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Jakub Jelinek

On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
> This patch costs several thousand additional instructions in
> Spec2006 on s390x ("lines" = instructions):
> 
>   410.bwaves:     +28 lines (2 funcs bigger)
>   437.leslie3d:   +43 lines (5 funcs bigger)
>   434.zeusmp:   +2650 lines (15 funcs bigger)
>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>   454.calculix:  +474 lines (23 funcs bigger)
>   465.tonto:    +2182 lines (221 funcs bigger)
>   481.wrf:      +4988 lines (117 funcs bigger)
>   416.gamess:   +3723 lines (466 funcs bigger)
> 
> s390x has a "compare with immediate and jump relative" instruction
> for 32-bit quantities, but for 8-bit quantities it needs separate
> compare and jump instructions, e.g.
> 
>   cijne   %r1,0,...
> 
> ->
> 
>   tmll    %r1,1
>   jne     ...
> 
> Instead of hard-coding a specific type, should one ask the backend
> for the preferred type?

The gfc_init_types change is an ABI change, at least if the fortran FE
bool type is ever stored in memory and accessed by multiple TUs, or
passed as argument etc.  And the difference between the C/C++ _Bool/bool
and fortran FE bool has caused lots of issues in the past, so if it can be
the same type, it is preferable.

Jakub

Re: [PATCH] Use the middle-end boolean_type_node

2017-01-03 Thread Dominik Vogt

On Tue, Dec 13, 2016 at 10:59:09PM +0200, Janne Blomqvist wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of an argument mismatch that hasn't
> affected anything so far.
> 
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.

This patch costs several thousand additional instructions in
Spec2006 on s390x ("lines" = instructions):

  410.bwaves:     +28 lines (2 funcs bigger)
  437.leslie3d:   +43 lines (5 funcs bigger)
  434.zeusmp:   +2650 lines (15 funcs bigger)
  459.GemsFDTD:   +65 lines (7 funcs bigger)
  454.calculix:  +474 lines (23 funcs bigger)
  465.tonto:    +2182 lines (221 funcs bigger)
  481.wrf:      +4988 lines (117 funcs bigger)
  416.gamess:   +3723 lines (466 funcs bigger)

s390x has a "compare with immediate and jump relative" instruction
for 32-bit quantities, but for 8-bit quantities it needs separate
compare and jump instructions, e.g.

  cijne   %r1,0,...

->

  tmll    %r1,1
  jne     ...

Instead of hard-coding a specific type, should one ask the backend
for the preferred type?

> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
> 
> 2016-12-13  Janne Blomqvist  
> 
>   * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
>   wi::mask (n, UNSIGNED,
> TYPE_PRECISION (size_type_node)));
>  
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>/* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>gfc_charlen_int_kind = 4;
>gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> -- 
> 2.7.4


Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [-fcompare-debug] find jump before debug insns in expand

2017-01-03 Thread Richard Sandiford

Alexandre Oliva  writes:
> A debug insn after the final jump of a basic block may cause the
> expander to emit a dummy move where the non-debug compile won't
> because it finds the jump insn at the end of the insn stream.
>
> Fix the condition so that, instead of requiring the jump as the last
> insn, it also matches a jump followed by debug insns.
>
> This fixes the compilation of libgcc/libgcov-profiler.c with
> -fcompare-debug on i686-linux-gnu.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?
>
> for  gcc/ChangeLog
>
>   * cfgexpand.c (expand_gimple_basic_block): Disregard debug
>   insns after final jump in test to emit dummy move.
> ---
>  gcc/cfgexpand.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 97dc648..76bb614 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5767,7 +5767,9 @@ expand_gimple_basic_block (basic_block bb, bool 
> disable_tail_calls)
>if (single_succ_p (bb)
>&& (single_succ_edge (bb)->flags & EDGE_FALLTHRU)
>&& (last = get_last_insn ())
> -  && JUMP_P (last))
> +  && (JUMP_P (last)
> +   || (DEBUG_INSN_P (last)
> +   && JUMP_P (prev_nondebug_insn (last)

Would it be worth adding a get_last_nondebug_insn in case other patterns
like this crop up?
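
A rough sketch of what such a helper could look like, written against a toy model of the insn chain rather than GCC's real rtx_insn API. The Insn type, the InsnKind enum and both function names are hypothetical and exist only for illustration:

```cpp
#include <cassert>

// Toy stand-ins for GCC's insn chain; these types are hypothetical.
enum class InsnKind { Ordinary, Jump, Debug };

struct Insn {
  InsnKind kind;
  Insn *prev;  // previous insn in the stream, or nullptr at the start
};

// Walk backwards from LAST, skipping any trailing debug insns.
inline Insn *last_nondebug_insn(Insn *last) {
  while (last && last->kind == InsnKind::Debug)
    last = last->prev;
  return last;
}

// The condition from the patch, expressed through the helper: the
// stream ends in a jump, possibly followed by debug insns.
inline bool ends_in_jump(Insn *last) {
  Insn *insn = last_nondebug_insn(last);
  return insn && insn->kind == InsnKind::Jump;
}
```

With a helper of this shape the test in expand_gimple_basic_block would not need to special-case DEBUG_INSN_P at the call site.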

Thanks,
Richard

Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-03 Thread David Edelsohn

This patch broke bootstrap.  I now am seeing numerous errors when
building libgomp.

Please fix or revert immediately.

Thanks, David

omp_lib.f90:184:40:

 logical (4) :: omp_test_lock
1
Error: Symbol 'omp_test_lock' at (1) has already been host associated
omp_lib.f90:216:45:

 integer (4) :: omp_test_nest_lock
 1
Error: Symbol 'omp_test_nest_lock' at (1) has already been host associated
omp_lib.f90:329:61:

 integer (omp_proc_bind_kind) :: omp_get_proc_bind
 1
Error: Symbol 'omp_get_proc_bind' at (1) has already been host associated


and


/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: 'acc_device_nvidia' of module 'openacc_internal', imported at (1), is also the name of the current program unit
/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: Alternate return specifier in function 'acc_async_test_h' at (1) is not allowed
/nasfarm/edelsohn/src/src/libgomp/openacc.f90:466:6:

   use openacc_internal
  1
Error: Alternate return specifier in function 'acc_async_test_

[PATCH] PR c++/66735 lambda capture by reference

2017-01-03 Thread Nathan Sidwell

This patch fixes 66735, where we lose cv qualifiers on an explicit 
lambda capture by reference:


  [ = ] () {...}

The problem is the partitioning between lambda_capture_field_type and 
add_capture, where the latter is responsible for adding the 
referenceness.  That leaves auto deduction clueless that it should 
preserve cv qualifiers.  Fixed by moving the referenceness into LCFT and 
adjusting.
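
For illustration only, a small C++14 sketch of the behaviour the fix is meant to guarantee. It is not the testcase from the patch, and the function and variable names are made up. An init-capture by reference behaves like `auto &r = x;`, so the constness of the initializer should carry over to the capture:

```cpp
#include <cassert>

// Overloads used only to observe whether constness survived the capture.
inline int which(int &) { return 1; }
inline int which(const int &) { return 2; }

// C++14 init-capture by reference of a const object.  The capture is
// deduced like `auto &r = x;`, so `r` should be a `const int &`.
inline int capture_const_by_reference() {
  const int x = 5;
  auto l = [&r = x]() { return which(r); };
  return l();
}

// The reference capture must alias the original object, not a copy.
inline bool capture_aliases_original() {
  const int x = 5;
  const int *p = &x;
  auto l = [&r = x, p]() { return &r == p; };
  return l();
}
```

On a compiler that keeps the qualifiers, the first function selects the `const int &` overload.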


I also refactored the error checking in add_capture, as that seemed a 
little repetitious to me -- both paths check for type completeness.


ok?

nathan
--
Nathan Sidwell
2017-01-03  Nathan Sidwell  

	cp/
	PR c++/66735
	* lambda.c (lambda_capture_field_type): Add is_reference parm.
	(add_capture): Adjust lambda_capture_field_type call, refactor
	error checking.
	* pt.c (tsubst): Adjust lambda_capture_field_type call.
	* cp-tree.h (lambda_capture_field_type): Update prototype.

	testsuite/
	PR c++/66735
	* g++.dg/cpp1y/pr66735.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 244021)
+++ cp/cp-tree.h	(working copy)
@@ -6528,7 +6528,7 @@ extern tree finish_trait_expr			(enum cp
 extern tree build_lambda_expr   (void);
 extern tree build_lambda_object			(tree);
 extern tree begin_lambda_type   (tree);
-extern tree lambda_capture_field_type		(tree, bool);
+extern tree lambda_capture_field_type		(tree, bool, bool);
 extern tree lambda_return_type			(tree);
 extern tree lambda_proxy_type			(tree);
 extern tree lambda_function			(tree);
Index: cp/lambda.c
===
--- cp/lambda.c	(revision 244021)
+++ cp/lambda.c	(working copy)
@@ -211,14 +211,17 @@ lambda_function (tree lambda)
 }
 
 /* Returns the type to use for the FIELD_DECL corresponding to the
-   capture of EXPR.
-   The caller should add REFERENCE_TYPE for capture by reference.  */
+   capture of EXPR.  EXPLICIT_INIT_P indicates whether this is a
+   C++14 init capture, and BY_REFERENCE_P indicates whether we're
+   capturing by reference.  */
 
 tree
-lambda_capture_field_type (tree expr, bool explicit_init_p)
+lambda_capture_field_type (tree expr, bool explicit_init_p,
+			   bool by_reference_p)
 {
   tree type;
   bool is_this = is_this_parameter (tree_strip_nop_conversions (expr));
+
   if (!is_this && type_dependent_expression_p (expr))
 {
   type = cxx_make_type (DECLTYPE_TYPE);
@@ -229,11 +232,24 @@ lambda_capture_field_type (tree expr, bo
 }
   else if (!is_this && explicit_init_p)
 {
-  type = make_auto ();
-  type = do_auto_deduction (type, expr, type);
+  tree auto_node = make_auto ();
+  
+  type = auto_node;
+  if (by_reference_p)
+	{
+	  /* Add the reference now, so deduction doesn't lose
+	 outermost CV qualifiers of EXPR.  */
+	  type = build_reference_type (type);
+	  by_reference_p = false;
+	}
+  type = do_auto_deduction (type, expr, auto_node);
 }
   else
 type = non_reference (unlowered_expr_type (expr));
+
+  if (!is_this && by_reference_p)
+type = build_reference_type (type);
+
   return type;
 }
 
@@ -504,9 +520,11 @@ add_capture (tree lambda, tree id, tree
 }
   else
 {
-  type = lambda_capture_field_type (initializer, explicit_init_p);
+  type = lambda_capture_field_type (initializer, explicit_init_p,
+	by_reference_p);
   if (type == error_mark_node)
 	return error_mark_node;
+
   if (id == this_identifier && !by_reference_p)
 	{
 	  gcc_assert (POINTER_TYPE_P (type));
@@ -514,17 +532,19 @@ add_capture (tree lambda, tree id, tree
 	  initializer = cp_build_indirect_ref (initializer, RO_NULL,
 	   tf_warning_or_error);
 	}
-  if (id != this_identifier && by_reference_p)
+
+  if (dependent_type_p (type))
+	;
+  else if (id != this_identifier && by_reference_p)
 	{
-	  type = build_reference_type (type);
-	  if (!dependent_type_p (type) && !lvalue_p (initializer))
+	  if (!lvalue_p (initializer))
 	error ("cannot capture %qE by reference", initializer);
 	}
   else
 	{
 	  /* Capture by copy requires a complete type.  */
 	  type = complete_type (type);
-	  if (!dependent_type_p (type) && !COMPLETE_TYPE_P (type))
+	  if (!COMPLETE_TYPE_P (type))
 	{
 	  error ("capture by copy of incomplete type %qT", type);
 	  cxx_incomplete_type_inform (type);
Index: cp/pt.c
===
--- cp/pt.c	(revision 244021)
+++ cp/pt.c	(working copy)
@@ -13988,7 +13988,8 @@ tsubst (tree t, tree args, tsubst_flags_
 
 	if (DECLTYPE_FOR_LAMBDA_CAPTURE (t))
 	  type = lambda_capture_field_type (type,
-	DECLTYPE_FOR_INIT_CAPTURE (t));
+	DECLTYPE_FOR_INIT_CAPTURE (t),
+	/*by_reference_p=*/false);
 	else if (DECLTYPE_FOR_LAMBDA_PROXY (t))
 	  type = lambda_proxy_type (type);
 	else
Index: testsuite/g++.dg/cpp1y/pr66735.C

[doc] md.texi -- "trap" is no longer used by the Java front end

2017-01-03 Thread Gerald Pfeifer

Applied.

Gerald

2017-01-03  Gerald Pfeifer  

* doc/md.texi (Standard Names): Remove reference to Java frontend.

Index: doc/md.texi
===
--- doc/md.texi (revision 244022)
+++ doc/md.texi (working copy)
@@ -6534,8 +6534,7 @@
 @cindex @code{trap} instruction pattern
 @item @samp{trap}
 This pattern, if defined, signals an error, typically by causing some
-kind of signal to be raised.  Among other places, it is used by the Java
-front end to signal `invalid array index' exceptions.
+kind of signal to be raised.
 
 @cindex @code{ctrap@var{MM}4} instruction pattern
 @item @samp{ctrap@var{MM}4}

Re: [patch,avr] PR78883: Implement CANNOT_CHANGE_MODE_CLASS.

2017-01-03 Thread Richard Sandiford

Georg-Johann Lay  writes:
> On 02.01.2017 15:54, Dominik Vogt wrote:
>> On Mon, Jan 02, 2017 at 03:47:43PM +0100, Georg-Johann Lay wrote:
>>> This fixes PR78883 which is a problem in reload revealed by a
>>> change to combine.c.  The fix is as proposed by Segher: implement
>>> CANNOT_CHANGE_MODE_CLASS.
>>>
>>> Ok for trunk?
>>>
>>> Johann
>>>
>>>
>>> gcc/
>>> PR target/78883
>>> * config/avr/avr.h (CANNOT_CHANGE_MODE_CLASS): New define.
>>> * config/avr/avr-protos.h (avr_cannot_change_mode_class): New proto.
>>> * config/avr/avr.c (avr_cannot_change_mode_class): New function.
>>>
>>> gcc/testsuite/
>>> PR target/78883
>>> * gcc.c-torture/compile/pr78883.c: New test.
>>
>>> Index: config/avr/avr-protos.h
>>> ===
>>> --- config/avr/avr-protos.h (revision 244001)
>>> +++ config/avr/avr-protos.h (working copy)
>>> @@ -111,7 +111,7 @@ extern int _reg_unused_after (rtx_insn *
>>>  extern int avr_jump_mode (rtx x, rtx_insn *insn);
>>>  extern int test_hard_reg_class (enum reg_class rclass, rtx x);
>>>  extern int jump_over_one_insn_p (rtx_insn *insn, rtx dest);
>>> -
>>> +extern int avr_cannot_change_mode_class (machine_mode, machine_mode, enum 
>>> reg_class);
>>>  extern int avr_hard_regno_mode_ok (int regno, machine_mode mode);
>>>  extern void avr_final_prescan_insn (rtx_insn *insn, rtx *operand,
>>> int num_operands);
>>> Index: config/avr/avr.c
>>> ===
>>> --- config/avr/avr.c(revision 244001)
>>> +++ config/avr/avr.c(working copy)
>>> @@ -11833,6 +11833,21 @@ jump_over_one_insn_p (rtx_insn *insn, rt
>>>  }
>>>
>>>
>>> +/* Worker function for `CANNOT_CHANGE_MODE_CLASS'.  */
>>> +
>>> +int
>>> +avr_cannot_change_mode_class (machine_mode from, machine_mode to,
>>> +  enum reg_class /* rclass */)
>>> +{
>>> +  /* We cannot access a hard register in a wider mode, for example we
>>> + must not access (reg:QI 31) as (reg:HI 31).  HARD_REGNO_MODE_OK
>>> + would avoid such hard regs, but reload would generate it anyway
>>> + from paradoxical subregs of mem, cf. PR78883.  */
>>> +
>>> +  return GET_MODE_SIZE (to) > GET_MODE_SIZE (from);
>>
>> I understand how this fixes the ICE, but is it really necessary to
>> suppress conversions to a wider mode for lower numbered registers?
>
> If there is a better hook, I'll propose an according patch.
>
> My expectation was that HARD_REGNO_MODE_OK would be enough to keep
> reload from putting HI into odd registers (and in particular into R31).
> But this is obviously not the case...

It should be enough in principle.  It's just a case of whether you want
to fix reload, hack around it, or take the plunge and switch to LRA.

Having a (subreg (mem)) is probably key here.  If it had been
(subreg (reg:HI X)) for some pseudo X then init_subregs_of_mode should
have realised that 31 isn't a valid choice for X.

I think the reload fix would be to honour simplifiable_subregs when
reloading the (subreg (mem)).

> And internals are not very helpful here.  It only mentions modifying
> ordinary subregs of pseudos, but not paradoxical subreg of memory.
>
> What's also astonishing me is that this problem never popped up
> during the last > 15 years of avr back-end.

FWIW, the current init_subregs_of_mode/simplifiable_subregs behaviour
is fairly recent (2014) and CANNOT_CHANGE_MODE_CLASS had been used in
the past to avoid errors like this.  Using it that way was always a
hack though.

An alternative would be to add a new macro to control this block in
general_operand:

#ifdef INSN_SCHEDULING
  /* On machines that have insn scheduling, we want all memory
 reference to be explicit, so outlaw paradoxical SUBREGs.
 However, we must allow them after reload so that they can
 get cleaned up by cleanup_subreg_operands.  */
  if (!reload_completed && MEM_P (sub)
  && GET_MODE_SIZE (mode) > GET_MODE_SIZE (GET_MODE (sub)))
return 0;
#endif

The default would still be INSN_SCHEDULING, but AVR could override it
to 1 and reject the same subregs.

That would still be a hack, but at least it would be taking things in
a good direction.  Having different rules depending on whether targets
define a scheduler is just a horrible wart that no-one's ever had chance
to fix.  If using the code above works well on AVR then that'd be a big
step towards making the code unconditional.

It'd definitely be worth checking how it affects code quality though.
(Although the same goes for the current patch, since C_C_M_C is a pretty
big hammer.)

Thanks,
Richard

C++ PATCH for c++/77545 and c++/77284 (ICE with CLEANUP_STMT)

2017-01-03 Thread Marek Polacek

The problem here is that we've gotten to potential_constant_expression_1 with a
CLEANUP_STMT, but it doesn't know how to handle that so we ICE.  I thought it'd
be possible to look into CLEANUP_{BODY,EXPR} to determine whether the
CLEANUP_STMT can be potentially const, but cxx_eval_constant_expression can't
handle CLEANUP_STMTs so it couldn't evaluate it anyway.  So it seems that it's
safe to consider CLEANUP_STMTs non-constant.

This happens when initializing __for_range, where finish_eh_cleanup creates
a CLEANUP_STMT that would run ~A() in case of an exception.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-01-03  Marek Polacek  

PR c++/77545
PR c++/77284
* constexpr.c (potential_constant_expression_1): Handle CLEANUP_STMT.

* g++.dg/cpp0x/range-for32.C: New test.
* g++.dg/cpp0x/range-for33.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 1e83b0b..a3dec68 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -5661,6 +5661,7 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict,
   /* We can see these in statement-expressions.  */
   return true;
 
+case CLEANUP_STMT:
 case EMPTY_CLASS_EXPR:
   return false;
 
diff --git gcc/testsuite/g++.dg/cpp0x/range-for32.C 
gcc/testsuite/g++.dg/cpp0x/range-for32.C
index e69de29..375a707 100644
--- gcc/testsuite/g++.dg/cpp0x/range-for32.C
+++ gcc/testsuite/g++.dg/cpp0x/range-for32.C
@@ -0,0 +1,16 @@
+// PR c++/77545
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-pedantic" }
+
+template < typename T > struct A
+{
+  A ();
+  ~A ();
+  T t;
+};
+
+void f (A < int > a)
+{
+  for (auto x : (A[]) { a })
+;
+}
diff --git gcc/testsuite/g++.dg/cpp0x/range-for33.C 
gcc/testsuite/g++.dg/cpp0x/range-for33.C
index e69de29..206f36e 100644
--- gcc/testsuite/g++.dg/cpp0x/range-for33.C
+++ gcc/testsuite/g++.dg/cpp0x/range-for33.C
@@ -0,0 +1,14 @@
+// PR c++/77284
+// { dg-do compile { target c++11 } }
+
+#include 
+
+struct A
+{
+  ~A () {}
+};
+
+void foo (A & v)
+{
+  for (A a : { v }) {};
+}

Marek

[PATCH] Add deleted std::thread(const thread&&) constructor

2017-01-03 Thread Jonathan Wakely


LWG 2097 says that the templated std::thread(Callable&&, Args&&...)
constructor should not participate in overload resolution when
decay_t<Callable> is std::thread. Rather than removing it from the
overload set we just ensure that there are better matches, but we fail
to do so for const rvalues. This fixes it.
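
The overload resolution involved can be reproduced without std::thread. The sketch below uses two hypothetical types, Fixed and Unfixed, that only copy the constructor shape described above. It is an illustration of the mechanism, not the libstdc++ source:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with the same constructor shape as the fixed std::thread.
struct Fixed {
  Fixed() = default;
  template<typename Callable, typename... Args>
    explicit Fixed(Callable&&, Args&&...) { }
  Fixed(Fixed&) = delete;
  Fixed(const Fixed&) = delete;
  Fixed(const Fixed&&) = delete;   // the overload added by the patch
  Fixed(Fixed&&) noexcept { }
};

// The same type without the const rvalue overload.
struct Unfixed {
  Unfixed() = default;
  template<typename Callable, typename... Args>
    explicit Unfixed(Callable&&, Args&&...) { }
  Unfixed(Unfixed&) = delete;
  Unfixed(const Unfixed&) = delete;
  Unfixed(Unfixed&&) noexcept { }
};

// Each deleted non-template overload ties with the template on the
// conversion sequence and wins because non-templates are preferred.
static_assert(!std::is_constructible<Fixed, Fixed&>::value, "");
static_assert(!std::is_constructible<Fixed, const Fixed&>::value, "");
static_assert(!std::is_constructible<Fixed, const Fixed&&>::value, "");
static_assert(std::is_constructible<Fixed, Fixed&&>::value, "");

// Without the deleted overload the template is the best match for a
// const rvalue, so the type wrongly appears constructible from one.
static_assert(std::is_constructible<Unfixed, const Unfixed&&>::value, "");
```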

PR libstdc++/78956
* include/std/thread (thread(const thread&&)): Add deleted
constructor.
* testsuite/30_threads/thread/cons/lwg2097.cc: New test.

Tested powerpc64le-linux, committed to trunk.

commit a8b2bad2dda80054e3b8cc5a4d1c2c39226070d5
Author: Jonathan Wakely 
Date:   Tue Jan 3 12:09:59 2017 +

Add deleted std::thread(const thread&&) constructor

PR libstdc++/78956
* include/std/thread (thread(const thread&&)): Add deleted
constructor.
* testsuite/30_threads/thread/cons/lwg2097.cc: New test.

diff --git a/libstdc++-v3/include/std/thread b/libstdc++-v3/include/std/thread
index f725bdb..8e2cb68 100644
--- a/libstdc++-v3/include/std/thread
+++ b/libstdc++-v3/include/std/thread
@@ -108,6 +108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // 2097.  packaged_task constructors should be constrained
 thread(thread&) = delete;
 thread(const thread&) = delete;
+thread(const thread&&) = delete;
 
 thread(thread&& __t) noexcept
 { swap(__t); }
diff --git a/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc 
b/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc
new file mode 100644
index 000..165c649
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc
@@ -0,0 +1,29 @@
+// { dg-do compile { target c++11 } }
+// { dg-require-cstdint "" }
+// { dg-require-gthreads "" }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+#include 
+
+using std::thread;
+using std::is_constructible;
+
+static_assert( !is_constructible::value, "" );
+static_assert( !is_constructible::value, "" );
+static_assert( !is_constructible::value, "" );

Re: [PATCH, wwwdocs] Add gcc-7/porting_to.html, Fortran changes

2017-01-03 Thread Janne Blomqvist

On Tue, Jan 3, 2017 at 2:04 PM, Joseph Myers  wrote:
> On Tue, 3 Jan 2017, Janne Blomqvist wrote:
>
>> (I added  and   tags to the
>> porting_to.html file, which is apparently what is recommended these
>> days; I didn't add this to any existing file)
>
> DOCTYPE etc. are added automatically via style.mhtml.  They do not belong
> in any individual checked-in .html files.

Oh, right, so it seems. Well, scratch that part of the patch then, obviously.

-- 
Janne Blomqvist

New template for 'gcc' made available

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

http://translationproject.org/POT-files/gcc-7.1-b20170101.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

ftp://gcc.gnu.org/pub/gcc/snapshots/7-20170101/gcc-7-20170101.tar.bz2

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH, wwwdocs] Add gcc-7/porting_to.html, Fortran changes

2017-01-03 Thread Joseph Myers

On Tue, 3 Jan 2017, Janne Blomqvist wrote:

> (I added  and   tags to the
> porting_to.html file, which is apparently what is recommended these
> days; I didn't add this to any existing file)

DOCTYPE etc. are added automatically via style.mhtml.  They do not belong 
in any individual checked-in .html files.

-- 
Joseph S. Myers
jos...@codesourcery.com

New Ukrainian PO file for 'cpplib' (version 7.1-b20170101)

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Ukrainian team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/uk.po

(This file, 'cpplib-7.1-b20170101.uk.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-7.1-b20170101.uk.po'

2017-01-03 Thread Translation Project Robot



cpplib-7.1-b20170101.uk.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

New template for 'cpplib' made available

2017-01-03 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'cpplib' has been made available
to the language teams for translation.  It is archived as:

http://translationproject.org/POT-files/cpplib-7.1-b20170101.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

ftp://gcc.gnu.org/pub/gcc/snapshots/7-20170101/gcc-7-20170101.tar.bz2

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[PATCH] Fix typos in libstdc++ docs and update copyright years

2017-01-03 Thread Jonathan Wakely


* doc/xml/manual/spine.xml: Update copyright years.
* doc/xml/manual/build_hacking.xml: Fix spelling of libbuilddir.
* doc/xml/manual/test.xml: Likewise.
* doc/html/*: Regenerate.

Committed to trunk.

commit 6c912caa212670131a9d3ed61c89d2182c79b1c0
Author: Jonathan Wakely 
Date:   Tue Jan 3 11:15:36 2017 +

Fix typos in libstdc++ docs and update copyright years

* doc/xml/manual/spine.xml: Update copyright years.
* doc/xml/manual/build_hacking.xml: Fix spelling of libbuilddir.
* doc/xml/manual/test.xml: Likewise.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/build_hacking.xml 
b/libstdc++-v3/doc/xml/manual/build_hacking.xml
index 90489d1..f0cbd70 100644
--- a/libstdc++-v3/doc/xml/manual/build_hacking.xml
+++ b/libstdc++-v3/doc/xml/manual/build_hacking.xml
@@ -507,7 +507,7 @@ If it wasn't done for the last release, you might also need 
to regenerate
 the baseline_symbols.txt file that defines the set
 of expected symbols for old symbol versions. A new baseline file can be
 generated by running make new-abi-baseline in the
-libbuildir/testsuite
+libbuilddir/testsuite
 directory. Be sure to generate the baseline from a clean build using
 unmodified sources, or you will incorporate your local changes into the
 baseline file.
diff --git a/libstdc++-v3/doc/xml/manual/spine.xml 
b/libstdc++-v3/doc/xml/manual/spine.xml
index c67c49c..d28df73 100644
--- a/libstdc++-v3/doc/xml/manual/spine.xml
+++ b/libstdc++-v3/doc/xml/manual/spine.xml
@@ -24,6 +24,7 @@
 2014
 2015
 2016
+2017
 
   http://www.w3.org/1999/xlink; 
xlink:href="http://www.fsf.org;>FSF
 
diff --git a/libstdc++-v3/doc/xml/manual/test.xml 
b/libstdc++-v3/doc/xml/manual/test.xml
index a1781e5..d3b4c40 100644
--- a/libstdc++-v3/doc/xml/manual/test.xml
+++ b/libstdc++-v3/doc/xml/manual/test.xml
@@ -556,7 +556,7 @@ cat 27_io/objects/char/3_xin.in | a.out
 
   The tests will be compiled with a set of default compiler flags defined
   by the
-  
libbuildir/scripts/testsuite_flags
+  
libbuilddir/scripts/testsuite_flags
   file, as well as options specified in individual tests. You can run
   the tests with different options by adding them to the output of
   the --cxxflags option of that script, or by setting
@@ -585,7 +585,7 @@ cat 27_io/objects/char/3_xin.in | a.out
   To run the libstdc++ test suite under the
   debug mode, use
   make check-debug. Alternatively, edit
-  
libbuildir/scripts/testsuite_flags
+  
libbuilddir/scripts/testsuite_flags
   to add the compile-time flag -D_GLIBCXX_DEBUG to the
   result printed by the --cxxflags
   option. Additionally, add the

[PATCH] [msp430] Sync msp430_mcu_data with devices.csv

2017-01-03 Thread Joe Seymour

This patch syncs the generated msp430_mcu_data structure:
- With the latest version of devices.csv released by TI.
- Between msp430.c and driver-msp430.c. The former was updated more
  recently than the latter.
- With the copy of the same data structure in binutils, for which a
  similar patch has already been approved and committed:

https://sourceware.org/ml/binutils/2016-12/msg00347.html

My understanding is that the devices being removed were "invalid spins",
so can't be being used by anyone, and never will be. Web searches
related to these devices return no relevant results:
  msp430 FR5862: No results.
  msp430 FR5864: 1 (invalid) result.
  msp430 FR5892: No results.
  msp430 FR5894: No results.

This patch does not update the structure in t-msp430, which contains
only msp430 devices (as opposed to msp430x devices), none of which are
being added by this patch.
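
For readers unfamiliar with the table, a minimal sketch of how such data might be consulted. The field names and the find_mcu helper are assumptions made for illustration (the patch shows only the initializers), and the three entries are copied from the diff below:

```cpp
#include <cassert>
#include <cstring>

// Field names are assumptions; the patch shows only the initializers.
struct McuData {
  const char *name;
  unsigned revision;  // assumed meaning: ISA revision
  unsigned hwmpy;     // assumed meaning: hardware multiplier kind
};

// A few entries copied from the patch, not the full table.
static const McuData mcu_table[] = {
  { "msp430fr2110", 2, 0 },
  { "msp430fr2433", 2, 8 },
  { "msp430fr5994", 2, 8 },
};

// Hypothetical helper: linear search by device name, nullptr if unknown.
inline const McuData *find_mcu(const char *name) {
  for (const McuData &m : mcu_table)
    if (std::strcmp(m.name, name) == 0)
      return &m;
  return nullptr;
}
```

A removed device such as msp430fr5862 would simply no longer be found by a lookup of this kind.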

Built and tested (no regressions) as follows:
  Configured with: --target=msp430-elf --enable-languages=c
  Test variations:
msp430-sim/-mcpu=msp430
msp430-sim/-mcpu=msp430x
msp430-sim/-mcpu=msp430x/-mlarge/-mdata-region=either/-mcode-region=either
msp430-sim/-mhwmult=none
msp430-sim/-mhwmult=f5series

I don't have write access, so if this patch is acceptable I'd appreciate
it if someone would commit it for me.

Thanks,

2017-01-03  Joe Seymour  

* config/msp430/driver-msp430.c (msp430_mcu_data): Sync with data
from TI's devices.csv file as of September 2016.
* config/msp430/msp430.c (msp430_mcu_data): Likewise.
---
 gcc/config/msp430/driver-msp430.c |   17 +++--
 gcc/config/msp430/msp430.c|   11 +--
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/gcc/config/msp430/driver-msp430.c 
b/gcc/config/msp430/driver-msp430.c
index 69b7a73..b6b5676 100644
--- a/gcc/config/msp430/driver-msp430.c
+++ b/gcc/config/msp430/driver-msp430.c
@@ -27,8 +27,8 @@
 /* This is a copy of the same data structure found in gas/config/tc-msp430.c
Also another (sort-of) copy can be found in gcc/config/msp430/msp430.c
Keep these three structures in sync.
-   The data in this structure has been extracted from the devices.csv file
-   released by TI, updated as of 8 October 2015.  */
+   The data in this structure has been extracted from version 1.194 of the
+   devices.csv file released by TI in September 2016.  */
 
 struct msp430_mcu_data
 {
@@ -454,7 +454,15 @@ msp430_mcu_data [] =
   { "msp430fg6626",2,8 },
   { "msp430fr2032",2,0 },
   { "msp430fr2033",2,0 },
+  { "msp430fr2110",2,0 },
+  { "msp430fr2111",2,0 },
+  { "msp430fr2310",2,0 },
+  { "msp430fr2311",2,0 },
   { "msp430fr2433",2,8 },
+  { "msp430fr2532",2,8 },
+  { "msp430fr2533",2,8 },
+  { "msp430fr2632",2,8 },
+  { "msp430fr2633",2,8 },
   { "msp430fr2xx_4xxgeneric",2,8 },
   { "msp430fr4131",2,0 },
   { "msp430fr4132",2,0 },
@@ -507,6 +515,8 @@ msp430_mcu_data [] =
   { "msp430fr5957",2,8 },
   { "msp430fr5958",2,8 },
   { "msp430fr5959",2,8 },
+  { "msp430fr5962",2,8 },
+  { "msp430fr5964",2,8 },
   { "msp430fr5967",2,8 },
   { "msp430fr5968",2,8 },
   { "msp430fr5969",2,8 },
@@ -519,6 +529,9 @@ msp430_mcu_data [] =
   { "msp430fr5988",2,8 },
   { "msp430fr5989",2,8 },
   { "msp430fr59891",2,8 },
+  { "msp430fr5992",2,8 },
+  { "msp430fr5994",2,8 },
+  { "msp430fr59941",2,8 },
   { "msp430fr5xx_6xxgeneric",2,8 },
   { "msp430fr6820",2,8 },
   { "msp430fr6822",2,8 },
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index fb1978b..fe92370 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -93,8 +93,8 @@ msp430_init_machine_status (void)
 /* This is a copy of the same data structure found in gas/config/tc-msp430.c
Also another (sort-of) copy can be found in gcc/config/msp430/t-msp430
Keep these three structures in sync.
-   The data in this structure has been extracted from the devices.csv file
-   released by TI, updated as of March 2016.  */
+   The data in this structure has been extracted from version 1.194 of the
+   devices.csv file released by TI in September 2016.  */
 
 struct msp430_mcu_data
 {
@@ -520,6 +520,8 @@ msp430_mcu_data [] =
   { "msp430fg6626",2,8 },
   { "msp430fr2032",2,0 },
   { "msp430fr2033",2,0 },
+  { "msp430fr2110",2,0 },
+  { "msp430fr2111",2,0 },
   { "msp430fr2310",2,0 },
   { "msp430fr2311",2,0 },
   { "msp430fr2433",2,8 },
@@ -560,8 +562,6 @@ msp430_mcu_data [] =
   { "msp430fr5858",2,8 },
   { "msp430fr5859",2,8 },
   { "msp430fr5867",2,8 },
-  { "msp430fr5862",2,8 },
-  { "msp430fr5864",2,8 },
   { "msp430fr58671",2,8 },
   { "msp430fr5868",2,8 },
   { "msp430fr5869",2,8 },
@@ -572,8 +572,6 @@ msp430_mcu_data [] =
   { "msp430fr5888",2,8 },
   { "msp430fr5889",2,8 },
   { "msp430fr58891",2,8 },
-  { "msp430fr5892",2,8 },
-  { "msp430fr5894",2,8 },
   { "msp430fr5922",2,8 },
   { "msp430fr59221",2,8 },
   { "msp430fr5947",2,8 },
@@ -599,6 +597,7 @@ msp430_mcu_data [] =
   { "msp430fr59891",2,8 },
   {

Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2017-01-03 Thread James Cowgill

On 01/01/17 22:27, Jeff Law wrote:
> On 12/20/2016 07:38 AM, James Cowgill wrote:
>> Hi,
>>
>> On 19/12/16 21:43, Jeff Law wrote:
>>> On 12/19/2016 08:44 AM, James Cowgill wrote:
 2016-12-16  James Cowgill  

 PR rtl-optimization/65618
 * emit-rtl.c (try_split): Update "after" when moving a
 NOTE_INSN_CALL_ARG_LOCATION.

 diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
 index 7de17454037..6be124ac038 100644
 --- a/gcc/emit-rtl.c
 +++ b/gcc/emit-rtl.c
 @@ -3742,6 +3742,11 @@ try_split (rtx pat, rtx_insn *trial, int last)
 next = NEXT_INSN (next))
  if (NOTE_KIND (next) == NOTE_INSN_CALL_ARG_LOCATION)
{
 +/* Advance after to the next instruction if it is about to
 +   be removed.  */
 +if (after == next)
 +  after = NEXT_INSN (after);
 +
  remove_insn (next);
  add_insn_after (next, insn, NULL);
  break;

>>> So the thing I don't like when looking at this code is we set AFTER
>>> immediately upon entry to try_split.  But we don't use it until near the
>>> very end of try_split.  That's just asking for trouble.
>>>
>>> Can we reasonably initialize AFTER just before it's used?
>>
>> I wasn't sure but looking closer I think that would be fine. This patch
>> also works and does what Richard Sandiford suggested in the PR.
>>
>> 2016-12-20  James Cowgill  
>>
>> PR rtl-optimization/65618
>> * emit-rtl.c (try_split): Move initialization of "before" and
>> "after" to just before the call to emit_insn_after_setloc.
> OK.

Great. Can you commit this for me, since I don't have commit access?

Thanks,
James
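
For readers following the thread, the stale-cursor problem the first version of the patch addressed can be sketched as below. This is only an illustration on a plain doubly linked list; `struct node`, `unlink_node`, `insert_after` and `move_note` are hypothetical names, not GCC's insn chain API, and the version that was approved avoids the problem differently, by initializing "after" just before it is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an element of an insn chain.  */
struct node
{
  int id;
  struct node *prev, *next;
};

/* Detach N from whatever list it is on.  */
static void
unlink_node (struct node *n)
{
  if (n->prev)
    n->prev->next = n->next;
  if (n->next)
    n->next->prev = n->prev;
  n->prev = n->next = NULL;
}

/* Link the detached node N directly after POS.  */
static void
insert_after (struct node *n, struct node *pos)
{
  assert (pos != NULL);
  n->prev = pos;
  n->next = pos->next;
  if (pos->next)
    pos->next->prev = n;
  pos->next = n;
}

/* Move NOTE so that it follows POS.  *CURSOR is a position saved
   earlier; if it points at NOTE it is stepped past NOTE first, so it
   keeps referring to the original list after the move.  */
static void
move_note (struct node *note, struct node *pos, struct node **cursor)
{
  if (*cursor == note)
    *cursor = note->next;
  unlink_node (note);
  insert_after (note, pos);
}
```

With a chain trial -> note -> tail and a cursor saved as trial's successor, moving the note behind some other node leaves the cursor on tail instead of following the note to its new place.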

Re: [C++ PATCH] Implement LWG2296 helper intrinsic

2017-01-03 Thread Jonathan Wakely


On 01/01/17 15:53 +0100, Jakub Jelinek wrote:

On Sun, Jan 01, 2017 at 10:27:24AM -0400, Gerald Pfeifer wrote:

On Fri, 7 Oct 2016, Jakub Jelinek wrote:
> The following patch adds __builtin_addressof with the semantics it has in
> clang, i.e. it is a constexpr & operator alternative that never uses the
> overloaded & operator.

Nice!

Are you planning to document this in gcc-7/changes.html ?


We shouldn't document the builtin, but that std::addressof is usable in
constexpr contexts.  I'll defer documentation thereof to Jon, together with
other libstdc++ changes.


I've committed this to wwwdocs.


? htdocs/gcc-7/.changes.html.swp
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.32
diff -u -r1.32 changes.html
--- htdocs/gcc-7/changes.html	27 Nov 2016 12:54:13 -	1.32
+++ htdocs/gcc-7/changes.html	3 Jan 2017 10:54:01 -
@@ -287,13 +287,30 @@
   std::chrono::round, and std::chrono::abs;
 
 
-  std::clamp;
+  std::clamp, std::gcd, std::lcm,
+  3-dimensional std::hypot;
+
+std::shared_mutex;
+std::default_searcher,
+  std::boyer_moore_searcher and
+  std::boyer_moore_horspool_searcher;
+
+
+  Extraction and re-insertion of map and set nodes, try_emplace
+  members for maps, and functions for accessing containers
+  std::size, std::empty, and
+  std::data;
 
 
+  std::shared_ptr support for arrays,
   std::shared_ptr<T>::weak_type,
   std::enable_shared_from_this<T>::weak_from_this(),
   and std::owner_less<void>;
 
+std::as_const, std::not_fn,
+  std::has_unique_object_representations,
+  constexpr std::addressof.
+
   
   Thanks to Daniel Krügler, Tim Shen, Edward Smith-Rowland, and Ville Voutilainen for
   work on the C++17 support.

Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.

2017-01-03 Thread Tamar Christina

Hi Jeff,

I wasn't sure if you saw the updated patch attached to the previous email or if 
you just hadn't had the time to look at it yet.

Cheers,
Tamar


From: Jeff Law 
Sent: Monday, December 19, 2016 8:27:33 PM
To: Tamar Christina; Joseph Myers
Cc: GCC Patches; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

On 12/15/2016 03:14 AM, Tamar Christina wrote:
>
>> On a high level, presumably there's no real value in keeping the old
>> code to "fold" fpclassify.  By exposing those operations as integer
>> logicals for the fast path, if the FP value becomes a constant during
>> the optimization pipeline we'll see the reinterpreted values flowing
>> into the new integer logical tests and they'll simplify just like
>> anything else.  Right?
>
> Yes, if it becomes a constant it will be folded away, both in the integer and 
> the fp case.
Thanks for clarifying.


>
>> The old IBM format is still supported, though they are expected to be
>> moving towards a standard ieee 128 bit format.  So my only concern is
>> that we preserve correct behavior for those cases -- I don't really care
>> about optimizing them.  So I think you need to keep them.
>
> Yes, I re-added them. It's mostly a copy paste from what they were in the
> other functions. But I have no way of testing it.
Understood.

>>  > +  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
>>> +  return (format->is_binary_ieee_compatible
>>> +   && FLOAT_WORDS_BIG_ENDIAN == WORDS_BIG_ENDIAN
>>> +   /* We explicitly disable quad float support on 32 bit systems.  */
>>> +   && !(UNITS_PER_WORD == 4 && type_width == 128)
>>> +   && targetm.scalar_mode_supported_p (mode));
>>> +}
>> Presumably this is why you needed the target.h inclusion.
>>
>> Note that on some systems we even disable 64bit floating point support.
>> I suspect this check needs a little re-thinking as I don't think that
>> checking for a specific UNITS_PER_WORD is correct, nor is checking the
>> width of the type.  I'm not offhand sure what the test should be, just
>> that I think we need something better here.
>
> I think what I really wanted to test here is if there was an integer mode 
> available
> which has the exact width as the floating point one. So I have replaced this 
> with
> just a call to int_mode_for_mode. Which is probably more correct.
I'll need to think about it, but would inherently think that
int_mode_for_mode is better than an explicit check of UNITS_PER_WORD and
typewidth.


>
>>> +
>>> +/* Determines if the given number is a NaN value.
>>> +   This function is the last in the chain and only has to
>>> +   check if it's preconditions are true.  */
>>> +static tree
>>> +is_nan (gimple_seq *seq, tree arg, location_t loc)
>> So in the old code we checked UNGT_EXPR, in the new code's slow path you
>> check UNORDERED.  Was that change intentional?
>
> The old FP code used UNORDERED and the new one was using ORDERED and negating 
> the result.
> I've replaced it with UNORDERED, but both are correct.
OK.  Just wanted to make sure.

jeff
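
As an illustration of the integer fast path discussed above, here is a minimal sketch that classifies a value by looking only at its bits. It assumes `float` is IEEE 754 binary32 and that float and integer byte order agree (the condition the patch checks with FLOAT_WORDS_BIG_ENDIAN == WORDS_BIG_ENDIAN). The names `float_bits`, `classify_bits` and the `fp_class` enum are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum fp_class
{
  CLASS_ZERO,
  CLASS_SUBNORMAL,
  CLASS_NORMAL,
  CLASS_INFINITE,
  CLASS_NAN
};

/* Reinterpret the bits of F as an unsigned integer.  Assumes an
   IEEE 754 binary32 float of the same size as uint32_t.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  assert (sizeof f == sizeof u);
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Classify F using integer comparisons only.  With the sign bit
   masked off, the bit patterns are ordered as zero < subnormal
   < normal < infinity < NaN, so plain unsigned compares suffice.  */
static enum fp_class
classify_bits (float f)
{
  uint32_t abs_bits = float_bits (f) & 0x7fffffffu;

  if (abs_bits == 0)
    return CLASS_ZERO;
  if (abs_bits < 0x00800000u)	/* Exponent field is all zeros.  */
    return CLASS_SUBNORMAL;
  if (abs_bits < 0x7f800000u)	/* Exponent field is not all ones.  */
    return CLASS_NORMAL;
  if (abs_bits == 0x7f800000u)	/* All-ones exponent, zero mantissa.  */
    return CLASS_INFINITE;
  return CLASS_NAN;		/* All-ones exponent, nonzero mantissa.  */
}
```

Because the NaN test is the last in the chain, it needs no comparison of its own, which matches the "only has to check if its preconditions are true" remark in the quoted is_nan comment.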

[PATCH 2/4] S/390: Unroll mvc/xc loop for memset with small constant lengths.

2017-01-03 Thread Andreas Krebbel

When expanding a memset we emit a loop of MVC/XC instructions, each
dealing with a 256 byte block.  Older GCCs used to unroll this loop for
constant length operands.  GCC probably lost this ability when more of
the loop unrolling was moved to the tree level.

With this patch the unrolling is done manually when emitting the RTL
insns.
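
The size thresholds in the diff follow from the numbers in its comment: a clrmem loop of 32 bytes is worth about 5.3 xc, so up to 5 blocks of 256 bytes are unrolled, and a setmem loop of 36 bytes is worth about 3.6 (mvi/stc + mvc) pairs, so up to 3 blocks of 257 bytes are unrolled. A rough model of that decision and of the resulting instruction counts, with hypothetical helper names and a plain flag standing in for TARGET_MVCLE:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the constant-length condition in the patch.  LEN is
   assumed to be positive; the patch returns early for LEN <= 0.
   TARGET_MVCLE_P is a stand-in parameter for TARGET_MVCLE.  */
static bool
unroll_setmem_p (long len, bool val_is_zero, bool target_mvcle_p)
{
  return (((val_is_zero && len <= 256 * 5) || len <= 257 * 3)
	  && (!target_mvcle_p || len <= 256));
}

/* Number of xc (VAL_IS_ZERO) or mvc (otherwise) instructions the
   unrolled sequence uses for LEN bytes.  In the setmem case each
   block starts with a one-byte store, and an mvc is only needed when
   more than that one byte is left.  */
static int
count_block_insns (long len, bool val_is_zero)
{
  long step = val_is_zero ? 256 : 257;
  int n = 0;

  assert (len >= 0);
  for (long l = len; l > 0; l -= step)
    if (val_is_zero || l > 1)
      n++;
  return n;
}
```

For example, this model gives one mvc for a 42 byte setmem, three for 700 bytes, and two for both 512 and 514 bytes, since the second block of 514 bytes is exactly 257 bytes long.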

2017-01-03  Andreas Krebbel  

* gcc.target/s390/memset-1.c: New test.

gcc/ChangeLog:

2017-01-03  Andreas Krebbel  

* config/s390/s390.c (s390_expand_setmem): Unroll the loop for
small constant length operands.
---
 gcc/config/s390/s390.c   |  56 -
 gcc/testsuite/gcc.target/s390/memset-1.c | 134 +++
 2 files changed, 168 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/memset-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 257bce7..1266f45 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5348,34 +5348,46 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
 {
   const int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
 
-  if (GET_CODE (len) == CONST_INT && INTVAL (len) == 0)
+  if (GET_CODE (len) == CONST_INT && INTVAL (len) <= 0)
 return;
 
   gcc_assert (GET_CODE (val) == CONST_INT || GET_MODE (val) == QImode);
 
-  if (GET_CODE (len) == CONST_INT && INTVAL (len) > 0 && INTVAL (len) <= 257)
+  /* Expand setmem/clrmem for a constant length operand without a
+ loop if it will be shorter that way.
+ With a constant length and without pfd argument a
+ clrmem loop is 32 bytes -> 5.3 * xc
+ setmem loop is 36 bytes -> 3.6 * (mvi/stc + mvc) */
+  if (GET_CODE (len) == CONST_INT
+  && ((INTVAL (len) <= 256 * 5 && val == const0_rtx)
+ || INTVAL (len) <= 257 * 3)
+  && (!TARGET_MVCLE || INTVAL (len) <= 256))
 {
-  if (val == const0_rtx && INTVAL (len) <= 256)
-emit_insn (gen_clrmem_short (dst, GEN_INT (INTVAL (len) - 1)));
-  else
-   {
- /* Initialize memory by storing the first byte.  */
- emit_move_insn (adjust_address (dst, QImode, 0), val);
+  HOST_WIDE_INT o, l;
 
- if (INTVAL (len) > 1)
-   {
- /* Initiate 1 byte overlap move.
-The first byte of DST is propagated through DSTP1.
-Prepare a movmem for:  DST+1 = DST (length = LEN - 1).
-DST is set to size 1 so the rest of the memory location
-does not count as source operand.  */
- rtx dstp1 = adjust_address (dst, VOIDmode, 1);
- set_mem_size (dst, 1);
-
- emit_insn (gen_movmem_short (dstp1, dst,
-  GEN_INT (INTVAL (len) - 2)));
-   }
-   }
+  if (val == const0_rtx)
+   /* clrmem: emit 256 byte blockwise XCs.  */
+   for (l = INTVAL (len), o = 0; l > 0; l -= 256, o += 256)
+ {
+   rtx newdst = adjust_address (dst, BLKmode, o);
+   emit_insn (gen_clrmem_short (newdst,
+GEN_INT (l > 256 ? 255 : l - 1)));
+ }
+  else
+   /* setmem: emit 1(mvi) + 256(mvc) byte blockwise memsets by
+  setting first byte to val and using a 256 byte mvc with one
+  byte overlap to propagate the byte.  */
+   for (l = INTVAL (len), o = 0; l > 0; l -= 257, o += 257)
+ {
+   rtx newdst = adjust_address (dst, BLKmode, o);
+   emit_move_insn (adjust_address (dst, QImode, o), val);
+   if (l > 1)
+ {
+   rtx newdstp1 = adjust_address (dst, BLKmode, o + 1);
+   emit_insn (gen_movmem_short (newdstp1, newdst,
+GEN_INT (l > 257 ? 255 : l - 2)));
+ }
+ }
 }
 
   else if (TARGET_MVCLE)
diff --git a/gcc/testsuite/gcc.target/s390/memset-1.c 
b/gcc/testsuite/gcc.target/s390/memset-1.c
new file mode 100644
index 000..7b43b97c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/memset-1.c
@@ -0,0 +1,134 @@
+/* Make sure that short memset's with constant length are emitted
+   without loop statements.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch" } */
+
+/* 1 mvc */
+void
+*memset1(void *s, int c)
+{
+  return __builtin_memset (s, c, 42);
+}
+
+/* 3 mvc */
+void
+*memset2(void *s, int c)
+{
+  return __builtin_memset (s, c, 700);
+}
+
+/* nop */
+void
+*memset3(void *s, int c)
+{
+  return __builtin_memset (s, c, 0);
+}
+
+/* mvc */
+void
+*memset4(void *s, int c)
+{
+  return __builtin_memset (s, c, 256);
+}
+
+/* 2 mvc */
+void
+*memset5(void *s, int c)
+{
+  return __builtin_memset (s, c, 512);
+}
+
+/* still 2 mvc through the additional first byte  */
+void
+*memset6(void *s, int c)
+{
+  return __builtin_memset (s, c, 514);
+}
+
+/* 3 mvc */
+void
+*memset7(void *s, int c)
+{
+  return __builtin_memset (s, c, 515);
+}
+
+/*

[PATCH 4/4] S/390: Additional memset/memcpy runtime tests.

2017-01-03 Thread Andreas Krebbel

These were provided by Dominik to check more of the corner cases in our
memset/memcpy inline code.

gcc/testsuite/ChangeLog:

2017-01-03  Dominik Vogt  

* gcc.target/s390/memcpy-2.c: New test.
* gcc.target/s390/memset-2.c: New test.
---
 gcc/testsuite/gcc.target/s390/memcpy-2.c | 94 
 gcc/testsuite/gcc.target/s390/memset-2.c | 92 +++
 2 files changed, 186 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/memcpy-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/memset-2.c

diff --git a/gcc/testsuite/gcc.target/s390/memcpy-2.c 
b/gcc/testsuite/gcc.target/s390/memcpy-2.c
new file mode 100644
index 000..b9568ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/memcpy-2.c
@@ -0,0 +1,94 @@
+/* Funtional memmov test.  */
+
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#define MAX_LEN (8 * 1000)
+#define X 0x11
+
+char gsrc[MAX_LEN + 2];
+char gdst[MAX_LEN + 2];
+
+__attribute__ ((noinline))
+int
+compare_mem (int len)
+{
+  int i;
+
+  if (gdst[0] != 0x61)
+__builtin_abort();
+  for (i = 1; i <= len; i++)
+if (gsrc[i] != gdst[i])
+  __builtin_abort();
+  for (i = len + 1; i < MAX_LEN; i++)
+if (gdst[i] != 0x61 + i % 4)
+  __builtin_abort();
+}
+
+__attribute__ ((noinline))
+void
+init_mem (void)
+{
+  unsigned int *p1;
+  unsigned int *p2;
+  int i;
+
+  p1 = (unsigned int *)gsrc;
+  p2 = (unsigned int *)gdst;
+  for (i = 0; i < MAX_LEN / sizeof(unsigned int); i++)
+{
+  p1[i] = 0x71727374;
+  p2[i] = 0x61626364;
+}
+}
+
+#define MEMCPY_CHECK(DST, SRC, LEN)\
+  init_mem (); \
+  __builtin_memcpy ((DST) + 1, (SRC) + 1, (LEN));  \
+  compare_mem ((LEN));
+
+
+int main(void)
+{
+  int lens[] =
+{
+  255, 256, 257,
+  511, 512, 513,
+  767, 768, 769,
+  1023, 1024, 1025,
+  1279, 1280, 1281,
+  1535, 1536, 1537,
+  -999
+};
+  int t;
+
+  /* variable length */
+  for (t = 0; lens[t] != -999; t++)
+{
+  MEMCPY_CHECK (gdst, gsrc, lens[t]);
+}
+  /* constant length */
+  MEMCPY_CHECK (gdst, gsrc, 0);
+  MEMCPY_CHECK (gdst, gsrc, 1);
+  MEMCPY_CHECK (gdst, gsrc, 2);
+  MEMCPY_CHECK (gdst, gsrc, 3);
+  MEMCPY_CHECK (gdst, gsrc, 256);
+  MEMCPY_CHECK (gdst, gsrc, 257);
+  MEMCPY_CHECK (gdst, gsrc, 511);
+  MEMCPY_CHECK (gdst, gsrc, 512);
+  MEMCPY_CHECK (gdst, gsrc, 513);
+  MEMCPY_CHECK (gdst, gsrc, 767);
+  MEMCPY_CHECK (gdst, gsrc, 768);
+  MEMCPY_CHECK (gdst, gsrc, 769);
+  MEMCPY_CHECK (gdst, gsrc, 1023);
+  MEMCPY_CHECK (gdst, gsrc, 1024);
+  MEMCPY_CHECK (gdst, gsrc, 1025);
+  MEMCPY_CHECK (gdst, gsrc, 1279);
+  MEMCPY_CHECK (gdst, gsrc, 1280);
+  MEMCPY_CHECK (gdst, gsrc, 1281);
+  MEMCPY_CHECK (gdst, gsrc, 1535);
+  MEMCPY_CHECK (gdst, gsrc, 1536);
+  MEMCPY_CHECK (gdst, gsrc, 1537);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/s390/memset-2.c 
b/gcc/testsuite/gcc.target/s390/memset-2.c
new file mode 100644
index 000..e1af7fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/memset-2.c
@@ -0,0 +1,92 @@
+/* Funtional setmem test.  */
+
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#define MAX_LEN (8 * 1000)
+
+__attribute__ ((noinline))
+int
+check_mem (char *mem, int val, int len)
+{
+  int i;
+
+  if (mem[0] != 0x71)
+__builtin_abort();
+  for (i = 1; i <= len; i++)
+if (mem[i] != val)
+  __builtin_abort();
+  if (mem[len + 1] != 0x71 + (len + 1) % 4)
+__builtin_abort();
+}
+
+__attribute__ ((noinline))
+void
+init_mem (char *mem)
+{
+  unsigned int *p;
+  int i;
+
+  p = (unsigned int *)mem;
+  for (i = 0; i < MAX_LEN / sizeof(unsigned int); i++)
+p[i] = 0x71727374;
+}
+
+#define MEMSET_CHECK(VAL, SIZE)\
+  init_mem (mem1); \
+  __builtin_memset (mem1 + 1, 0, (SIZE));  \
+  check_mem (mem1, 0, SIZE);   \
+  init_mem (mem2); \
+  __builtin_memset (mem2 + 1, (VAL), (SIZE));  \
+  check_mem (mem2, VAL, SIZE);
+
+char mem1[MAX_LEN + 2];
+char mem2[MAX_LEN + 2];
+
+int main(int argc, char **argv)
+{
+  int lens[] =
+{
+  256, 257, 258, 259,
+  512, 513, 514, 515,
+  768, 769, 770, 771,
+  1024, 1025, 1026, 1027,
+  1280, 1281, 1282, 1283,
+  -999
+};
+  int t;
+
+  /* variable length */
+  for (t = 0; lens[t] != -999; t++)
+{
+  MEMSET_CHECK (argc + 0x10, lens[t]);
+}
+
+  /* constant length */
+  MEMSET_CHECK (argc + 0x10, 0);
+  MEMSET_CHECK (argc + 0x10, 1);
+  MEMSET_CHECK (argc + 0x10, 2);
+  MEMSET_CHECK (argc + 0x10, 3);
+  MEMSET_CHECK (argc + 0x10, 256);
+  MEMSET_CHECK (argc + 0x10, 257);
+  MEMSET_CHECK (argc + 0x10, 258);
+  MEMSET_CHECK (argc + 0x10, 259);
+  MEMSET_CHECK (argc + 0x10, 512);
+  MEMSET_CHECK (argc + 0x10, 513);
+  MEMSET_CHECK (argc + 0x10, 514);
+  MEMSET_CHECK (argc + 0x10, 515);
+  MEMSET_CHECK (argc + 0x10, 768);
+  MEMSET_CHECK

[PATCH 3/4] S/390: Unroll mvc loop for memcpy with small constant lengths.

2017-01-03 Thread Andreas Krebbel

See the memset unrolling patch.  The very same applies to memcpys with
constant lengths.
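
As a rough picture of the unrolled form: a constant length copy is split into blocks of at most 256 bytes, one MVC per block. The sketch below models that in portable C, with one memcpy call standing in for one MVC; `copy_in_mvc_blocks` is a hypothetical name, and the real expander emits RTL rather than calling anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy LEN bytes from SRC to DST in blocks of at most 256 bytes and
   return the number of blocks used.  Each memcpy call stands in for
   one MVC instruction in this model.  */
static int
copy_in_mvc_blocks (unsigned char *dst, const unsigned char *src,
		    size_t len)
{
  int blocks = 0;

  for (size_t o = 0; o < len; o += 256)
    {
      size_t chunk = len - o > 256 ? 256 : len - o;

      assert (chunk >= 1 && chunk <= 256);
      memcpy (dst + o, src + o, chunk);
      blocks++;
    }
  return blocks;
}
```

The block counts agree with the comments in memcpy-1.c for the unrolled cases: 700 bytes take 3 blocks, 256 take 1, 512 take 2, 768 take 3, and a length of 0 emits nothing. Lengths above 256 * 6 bytes still use the loop.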

2017-01-03  Andreas Krebbel  

* config/s390/s390.c (s390_expand_movmem): Unroll MVC loop for
small constant length operands.

gcc/testsuite/ChangeLog:

2017-01-03  Andreas Krebbel  

* gcc.target/s390/memcpy-1.c: New test.
---
 gcc/config/s390/s390.c   | 21 +++--
 gcc/testsuite/gcc.target/s390/memcpy-1.c | 53 
 2 files changed, 71 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/memcpy-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 1266f45..9bd98eb 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5246,10 +5246,25 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
   && (GET_CODE (len) != CONST_INT || INTVAL (len) > (1<<16)))
 return false;
 
-  if (GET_CODE (len) == CONST_INT && INTVAL (len) >= 0 && INTVAL (len) <= 256)
+  /* Expand memcpy for constant length operands without a loop if it
+ is shorter that way.
+
+ With a constant length argument a
+ memcpy loop (without pfd) is 36 bytes -> 6 * mvc  */
+  if (GET_CODE (len) == CONST_INT
+  && INTVAL (len) >= 0
+  && INTVAL (len) <= 256 * 6
+  && (!TARGET_MVCLE || INTVAL (len) <= 256))
 {
-  if (INTVAL (len) > 0)
-emit_insn (gen_movmem_short (dst, src, GEN_INT (INTVAL (len) - 1)));
+  HOST_WIDE_INT o, l;
+
+  for (l = INTVAL (len), o = 0; l > 0; l -= 256, o += 256)
+   {
+ rtx newdst = adjust_address (dst, BLKmode, o);
+ rtx newsrc = adjust_address (src, BLKmode, o);
+ emit_insn (gen_movmem_short (newdst, newsrc,
+  GEN_INT (l > 256 ? 255 : l - 1)));
+   }
 }
 
   else if (TARGET_MVCLE)
diff --git a/gcc/testsuite/gcc.target/s390/memcpy-1.c 
b/gcc/testsuite/gcc.target/s390/memcpy-1.c
new file mode 100644
index 000..58c1b49
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/memcpy-1.c
@@ -0,0 +1,53 @@
+/* Make sure that short memcpy's with constant length are emitted
+   without loop statements.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch" } */
+
+/* 3 MVCs */
+void
+*memcpy1(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 700);
+}
+
+/* NOP */
+void
+*memcpy2(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 0);
+}
+
+/* 1 MVC */
+void
+*memcpy3(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 256);
+}
+
+/* 2 MVCs */
+void
+*memcpy4(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 512);
+}
+
+/* 3 MVCs */
+void
+*memcpy5(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 768);
+}
+
+/* Loop with 2 MVCs */
+void
+*memcpy6(void *dest, const void *src)
+{
+  return __builtin_memcpy (dest, src, 1537);
+}
+
+/* memcpy6 uses a loop - check for the two load address instructions
+   used to increment src and dest.  */
+/* { dg-final { scan-assembler-times "la" 2 } } */
+
+/* { dg-final { scan-assembler-times "mvc" 11 } } */
-- 
2.9.1

[PATCH 1/4] S/390: memset: Avoid overlapping MVC operands between iterations.

2017-01-03 Thread Andreas Krebbel

A memset with a value != 0 is currently implemented using the mvc
instruction propagating the first byte through 256 byte blocks.  For
the first mvc the byte is written with a separate instruction, while
subsequent MVCs use the last byte of the previous 256 byte block.

Starting with z13 this causes a major performance degradation.  With
this patch we always set the first byte with an mvi or stc in order to
avoid the overlapping of the MVC operands between loop iterations.

On older machines this basically makes no measurable difference so the
patch enables the new behavior for all machine levels in order to make
sure that code built for older machine levels runs well when moved to
a z13.

Bootstrapped and regression tested on s390 and s390x using z900 and z13
as default -march level. No regressions.
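
The byte propagation idiom and the new per-block store can be modelled in plain C as below. This is a sketch only: `mvc_model` imitates the left-to-right, byte-at-a-time behaviour of MVC with overlapping operands, and `setmem_model` uses the 1 (mvi) + 256 (mvc) block layout described in the follow-up unrolling patch. Both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Model of MVC: copy LEN bytes (1..256) from SRC to DST strictly
   left to right, one byte at a time.  With DST == SRC + 1 every byte
   read was just written, so the byte at SRC[0] is propagated.  */
static void
mvc_model (unsigned char *dst, const unsigned char *src, size_t len)
{
  assert (len >= 1 && len <= 256);
  for (size_t i = 0; i < len; i++)
    dst[i] = src[i];
}

/* Fill N bytes at DST with VAL.  Every block of up to 257 bytes
   starts with its own one-byte store (the mvi/stc), so no mvc reads
   a byte written by the mvc of the previous block.  */
static void
setmem_model (unsigned char *dst, unsigned char val, size_t n)
{
  while (n > 0)
    {
      size_t block = n > 257 ? 257 : n;

      dst[0] = val;				/* mvi/stc */
      if (block > 1)
	mvc_model (dst + 1, dst, block - 1);	/* overlapping mvc */
      dst += block;
      n -= block;
    }
}
```

Filling 515 bytes this way uses three one-byte stores and two overlapping copies, and leaves the bytes before and after the target range untouched.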

gcc/ChangeLog:

2017-01-03  Andreas Krebbel  

* config/s390/s390.c (s390_expand_setmem): Avoid overlapping bytes
between loop iterations.
---
 gcc/config/s390/s390.c | 95 ++
 1 file changed, 64 insertions(+), 31 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2082cb5..257bce7 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5346,6 +5346,8 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
 void
 s390_expand_setmem (rtx dst, rtx len, rtx val)
 {
+  const int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
+
   if (GET_CODE (len) == CONST_INT && INTVAL (len) == 0)
 return;
 
@@ -5391,13 +5393,14 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
 {
   rtx dst_addr, count, blocks, temp, dstp1 = NULL_RTX;
   rtx_code_label *loop_start_label = gen_label_rtx ();
-  rtx_code_label *loop_end_label = gen_label_rtx ();
-  rtx_code_label *end_label = gen_label_rtx ();
+  rtx_code_label *onebyte_end_label = gen_label_rtx ();
+  rtx_code_label *zerobyte_end_label = gen_label_rtx ();
+  rtx_code_label *restbyte_end_label = gen_label_rtx ();
   machine_mode mode;
 
   mode = GET_MODE (len);
   if (mode == VOIDmode)
-mode = Pmode;
+   mode = Pmode;
 
   dst_addr = gen_reg_rtx (Pmode);
   count = gen_reg_rtx (mode);
@@ -5405,39 +5408,56 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
 
   convert_move (count, len, 1);
   emit_cmp_and_jump_insns (count, const0_rtx,
-  EQ, NULL_RTX, mode, 1, end_label);
+  EQ, NULL_RTX, mode, 1, zerobyte_end_label,
+  very_unlikely);
 
+  /* We need to make a copy of the target address since memset is
+supposed to return it unmodified.  We have to make it here
+already since the new reg is used at onebyte_end_label.  */
   emit_move_insn (dst_addr, force_operand (XEXP (dst, 0), NULL_RTX));
   dst = change_address (dst, VOIDmode, dst_addr);
 
-  if (val == const0_rtx)
-temp = expand_binop (mode, add_optab, count, constm1_rtx, count, 1,
-OPTAB_DIRECT);
-  else
+  if (val != const0_rtx)
{
- dstp1 = adjust_address (dst, VOIDmode, 1);
+ /* When using the overlapping mvc the original target
+address is only accessed as single byte entity (even by
+the mvc reading this value).  */
  set_mem_size (dst, 1);
-
- /* Initialize memory by storing the first byte.  */
- emit_move_insn (adjust_address (dst, QImode, 0), val);
-
- /* If count is 1 we are done.  */
- emit_cmp_and_jump_insns (count, const1_rtx,
-  EQ, NULL_RTX, mode, 1, end_label);
-
- temp = expand_binop (mode, add_optab, count, GEN_INT (-2), count, 1,
-  OPTAB_DIRECT);
-   }
+ dstp1 = adjust_address (dst, VOIDmode, 1);
+ emit_cmp_and_jump_insns (count,
+  const1_rtx, EQ, NULL_RTX, mode, 1,
+  onebyte_end_label, very_unlikely);
+   }
+
+  /* There is one unconditional (mvi+mvc)/xc after the loop
+dealing with the rest of the bytes, subtracting two (mvi+mvc)
+or one (xc) here leaves this number of bytes to be handled by
+it.  */
+  temp = expand_binop (mode, add_optab, count,
+  val == const0_rtx ? constm1_rtx : GEN_INT (-2),
+  count, 1, OPTAB_DIRECT);
   if (temp != count)
-emit_move_insn (count, temp);
+   emit_move_insn (count, temp);
 
   temp = expand_binop (mode, lshr_optab, count, GEN_INT (8), blocks, 1,
   OPTAB_DIRECT);
   if (temp != blocks)
-emit_move_insn (blocks, temp);
+   emit_move_insn (blocks, temp);
 
   emit_cmp_and_jump_insns (blocks, const0_rtx,
-  EQ, NULL_RTX, mode, 1, loop_end_label);
+  EQ, NULL_RTX, mode, 1,

[PATCH 0/4] S/390: memset/memcpy inline code improvements

2017-01-03 Thread Andreas Krebbel

Please see the individual patches for descriptions.

I'll commit these after leaving a few days for comments.

Andreas Krebbel (4):
  S/390: memset: Avoid overlapping MVC operands between iterations.
  S/390: Unroll mvc/xc loop for memset with small constant lengths.
  S/390: Unroll mvc loop for memcpy with small constant lengths.
  Additional memset/memcpy runtime tests.

 gcc/config/s390/s390.c   | 172 +--
 gcc/testsuite/gcc.target/s390/memcpy-1.c |  53 ++
 gcc/testsuite/gcc.target/s390/memcpy-2.c |  94 +
 gcc/testsuite/gcc.target/s390/memset-1.c | 134 
 gcc/testsuite/gcc.target/s390/memset-2.c |  92 +
 5 files changed, 489 insertions(+), 56 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/memcpy-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/memcpy-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/memset-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/memset-2.c

-- 
2.9.1
