date:20180608

Go patch committed: Remove stack_allocation_expression

2018-06-08 Thread Ian Lance Taylor

This patch to the Go frontend by Cherry Zhang removes
stack_allocation_expression from the backend interface.  Now that we
consistently use temporary variables for temporaries,
stack_allocation_expression is no longer used.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
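
For readers unfamiliar with the removed hook, here is a minimal C++ sketch of what it expressed. Per the deleted go-gcc.cc code, the backend built a call to alloca followed by a memset to zero. The sketch contrasts that with an ordinary zero-initialized temporary variable. ZEROED_STACK_ALLOC, sum_alloca_buffer and sum_temporary_buffer are hypothetical names used only for illustration, and __builtin_alloca assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical macro mirroring the removed expression:
// memset(alloca(size), 0, size).  It is a macro because alloca memory
// belongs to the calling frame and must not outlive it.
#define ZEROED_STACK_ALLOC(size) \
  std::memset(__builtin_alloca(size), 0, (size))

// Old style: anonymous stack block, zeroed by hand.
int sum_alloca_buffer() {
  const std::size_t n = 16;
  unsigned char* buf = static_cast<unsigned char*>(ZEROED_STACK_ALLOC(n));
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += buf[i];
  return sum;
}

// New style: a named, zero-initialized temporary variable.
int sum_temporary_buffer() {
  unsigned char buf[16] = {};
  int sum = 0;
  for (unsigned char b : buf) sum += b;
  return sum;
}
```

Both functions read only zeros; the second form gives the compiler a typed, named object to work with instead of an opaque alloca call.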

2018-06-08  Cherry Zhang  

* go-gcc.cc (class Gcc_backend): Remove
stack_allocation_expression method.
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 261203)
+++ gcc/go/go-gcc.cc(working copy)
@@ -352,9 +352,6 @@ class Gcc_backend : public Backend
   const std::vector<Bexpression*>& args,
   Bexpression* static_chain, Location);
 
-  Bexpression*
-  stack_allocation_expression(int64_t size, Location);
-
   // Statements.
 
   Bstatement*
@@ -1999,20 +1996,6 @@ Gcc_backend::call_expression(Bfunction*,
   return this->make_expression(ret);
 }
 
-// Return an expression that allocates SIZE bytes on the stack.
-
-Bexpression*
-Gcc_backend::stack_allocation_expression(int64_t size, Location location)
-{
-  tree alloca = builtin_decl_explicit(BUILT_IN_ALLOCA);
-  tree size_tree = build_int_cst(integer_type_node, size);
-  tree ret = build_call_expr_loc(location.gcc_location(), alloca, 1, size_tree);
-  tree memset = builtin_decl_explicit(BUILT_IN_MEMSET);
-  ret = build_call_expr_loc(location.gcc_location(), memset, 3,
-ret, integer_zero_node, size_tree);
-  return this->make_expression(ret);
-}
-
 // An expression as a statement.
 
 Bstatement*
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h (revision 261203)
+++ gcc/go/gofrontend/backend.h (working copy)
@@ -379,10 +379,6 @@ class Backend
   const std::vector<Bexpression*>& args,
  Bexpression* static_chain, Location) = 0;
 
-  // Return an expression that allocates SIZE bytes on the stack.
-  virtual Bexpression*
-  stack_allocation_expression(int64_t size, Location) = 0;
-
   // Statements.
 
   // Create an error statement.  This is used for cases which should

Re: [patch, fortran] Fix PR 85631, wrong size checking with allocatable arrays

2018-06-08 Thread Thomas Koenig


Hi Steve,


On Fri, Jun 08, 2018 at 09:06:55PM +0200, Thomas Koenig wrote:


the attached patch fixes a bug which was uncovered by the PR in
a matmul regression.

The problem is that bounds checking on the LHS with reallocation on
assignment makes no sense, and the original flag was not set for
the case in question.

I added both the original test and the reduced test to the single test
case.

OK for trunk?



OK.


Committed as r261348.

Thanks!

Thomas

RE: [PATCH][Aarch64] v2: Arithmetic overflow subv patterns [Patch 3/4]

2018-06-08 Thread Michael Collison

All requested changes made:

- label_ref added as operand 3
- more descriptive variable names used

Okay for trunk?

-Original Message-
From: James Greenhalgh  
Sent: Thursday, June 7, 2018 5:30 PM
To: Michael Collison 
Cc: GCC Patches ; nd 
Subject: Re: [PATCH][Aarch64] v2: Arithmetic overflow subv patterns [Patch 3/4]

On Wed, Jun 06, 2018 at 12:19:52PM -0500, Michael Collison wrote:
> This is a respin of an AArch64 patch that adds support for builtin arithmetic 
> overflow operations. This update separates the patch into multiple pieces and 
> addresses comments made by Richard Earnshaw here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00249.html
> 
> Original patch and motivation for patch here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01512.html
> 
> This patch contains new patterns for subv overflow patterns.

>  
> +(define_expand "subv4"
> +  [(match_operand:GPI 0 "register_operand")
> +   (match_operand:GPI 1 "aarch64_reg_or_zero")
> +   (match_operand:GPI 2 "aarch64_reg_or_zero")
> +   (match_operand 3 "")]
> +

As in the previous patch I'd prefer to have the predicate showing this needs a 
label, even if it is not used for validation.

Likewise on the variable names.

Otherwise, this is OK, but again I'd appreciate more eyes on the patterns.

Thanks,
James

> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
> 
> 2018-05-31  Michael Collison  
>   Richard Henderson 
> 
>   * config/aarch64/aarch64.md (subv4, usubv4): New patterns.
>   (subti): Handle op1 zero.
>   (subvti4, usub4ti4): New.
>   (*sub3_compare1_imm): New.
>   (sub3_carryinCV): New.
>   (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
>   (*sub3_carryinCV_z2, *sub3_carryinCV): New.
>
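
As background for this series, here is a minimal C++ sketch of the user-visible builtin that the subv/usubv expanders are meant to serve. It assumes GCC or Clang, which provide __builtin_sub_overflow; the wrapper names are hypothetical. The signed form reports a result that does not fit in the type, while the unsigned form reports a borrow.

```cpp
#include <cassert>
#include <cstdint>

// Signed subtraction: returns true when a - b does not fit in int64_t.
// *out always receives the wrapped result.
bool sub_overflows_signed(int64_t a, int64_t b, int64_t* out) {
  return __builtin_sub_overflow(a, b, out);
}

// Unsigned subtraction: "overflow" here means a borrow (a < b).
bool sub_overflows_unsigned(uint64_t a, uint64_t b, uint64_t* out) {
  return __builtin_sub_overflow(a, b, out);
}
```

In the expander, the "true" outcome corresponds to taking the branch to the label passed as operand 3.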

RE: [PATCH][Aarch64] v2: Arithmetic overflow addv patterns [Patch 2/4]

2018-06-08 Thread Michael Collison

All requested changes made:

- label_ref added as operand 3
- more meaningful names given to variables

Okay for trunk?
-Original Message-
From: James Greenhalgh  
Sent: Thursday, June 7, 2018 5:29 PM
To: Michael Collison 
Cc: GCC Patches ; nd 
Subject: Re: [PATCH][Aarch64] v2: Arithmetic overflow addv patterns [Patch 2/4]

On Wed, Jun 06, 2018 at 12:16:22PM -0500, Michael Collison wrote:
> This is a respin of an AArch64 patch that adds support for builtin arithmetic 
> overflow operations. This update separates the patch into multiple pieces and 
> addresses comments made by Richard Earnshaw here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00249.html
> 
> Original patch and motivation for patch here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01512.html
> 
> This patch contains new patterns for addv overflow patterns.
> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

> +(define_expand "addv4"
> +  [(match_operand:GPI 0 "register_operand")
> +   (match_operand:GPI 1 "register_operand")
> +   (match_operand:GPI 2 "register_operand")
> +   (match_operand 3 "")]
> +  ""

It won't be validated; but I'd prefer us to add the constraint on the label so 
this code is self-documenting. It would have saved me a trip to the manual to 
understand operand 3.

>  (define_expand "addti3"
>[(set (match_operand:TI 0 "register_operand" "")
>   (plus:TI (match_operand:TI 1 "register_operand" "")
> -  (match_operand:TI 2 "register_operand" "")))]
> +  (match_operand:TI 2 "aarch64_reg_or_imm" "")))]
>""
>  {
> -  rtx low = gen_reg_rtx (DImode);
> -  emit_insn (gen_adddi3_compareC (low, gen_lowpart (DImode, operands[1]),
> -   gen_lowpart (DImode, operands[2])));
> +  rtx l0,l1,l2,h0,h1,h2;

Let's give these slightly meaningful names please. dest_high, dest_low, 
op1_high, etc.

Other than these two comments, I think this is OK.

There are some subtleties in here though that I've probably missed, so I 
wouldn't say no to a second pair of eyes.

Thanks,
James


> 
> 
> 2018-05-31  Michael Collison  
>   Richard Henderson 
> 
>   * config/aarch64/aarch64.md: (addv4, uaddv4): New.
>   (addti3): Create simpler code if low part is already known to be 0.
>   (addvti4, uaddvti4): New.
>   (*add3_compareC_cconly_imm): New.
>   (*add3_compareC_cconly): New.
>   (*add3_compareC_imm): New.
>   (*add3_compareC): Rename from add3_compare1; do not
>   handle constants within this pattern..
>   (*add3_compareV_cconly_imm): New.
>   (*add3_compareV_cconly): New.
>   (*add3_compareV_imm): New.
>   (add3_compareV): New.
>   (add3_carryinC, add3_carryinV): New.
>   (*add3_carryinC_zero, *add3_carryinV_zero): New.
>   (*add3_carryinC, *add3_carryinV): New.
>   ((*add3_compareC_cconly_imm): Replace 'ne' operator
>   with 'comparison' operator.
>   (*add3_compareV_cconly_imm): Ditto.
>   (*add3_compareV_cconly): Ditto.
>   (*add3_compareV_imm): Ditto.
>   (add3_compareV): Ditto.
>   (add3_carryinC): Ditto.
>   (*add3_carryinC_zero): Ditto.
>   (*add3_carryinC): Ditto.
>   (add3_carryinV): Ditto.
>   (*add3_carryinV_zero): Ditto.
>   (*add3_carryinV): Ditto.




gnutools-6308-pt2.patch
Description: gnutools-6308-pt2.patch
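
To illustrate the addti3 change discussed above (split a TImode add into low and high halves, and emit simpler code when the low part of the second operand is known to be zero), here is a minimal C++ sketch. The U128 struct and add_u128 are hypothetical; this models the arithmetic only, not the RTL the patch emits. __builtin_add_overflow assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 { uint64_t low; uint64_t high; };

U128 add_u128(U128 a, U128 b) {
  U128 r;
  if (b.low == 0) {
    // Low half of the second operand is zero: no carry can be
    // produced, so only the high halves need adding.
    r.low = a.low;
    r.high = a.high + b.high;
    return r;
  }
  // Add the low halves and capture the carry-out...
  bool carry = __builtin_add_overflow(a.low, b.low, &r.low);
  // ...then add the high halves plus the carry (wrapping modulo 2^64).
  r.high = a.high + b.high + (carry ? 1u : 0u);
  return r;
}
```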

RE: [PATCH][Aarch64] v2: Arithmetic overflow common functions [Patch 1/4]

2018-06-08 Thread Michael Collison

Patch updated as requested:

- name changed from 'aarch64_add_128bit_scratch_regs' to 
'aarch64_addti_scratch_regs'
- name changed from 'aarch64_subv_128bit_scratch_regs' to
'aarch64_subvti_scratch_regs'

I did not find any helper function to replace 'aarch64_gen_unlikely_cbranch'.

Okay for trunk?


-Original Message-
From: James Greenhalgh  
Sent: Thursday, June 7, 2018 5:19 PM
To: Michael Collison 
Cc: GCC Patches ; nd 
Subject: Re: [PATCH][Aarch64] v2: Arithmetic overflow common functions [Patch 1/4]

On Wed, Jun 06, 2018 at 12:14:03PM -0500, Michael Collison wrote:
> This is a respin of an AArch64 patch that adds support for builtin arithmetic 
> overflow operations. This update separates the patch into multiple pieces and 
> addresses comments made by Richard Earnshaw here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00249.html
> 
> Original patch and motivation for patch here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01512.html
> 
> This patch primarily contains common functions in aarch64.c for 
> generating TImode scratch registers, and common rtl functions utilized by the 
> overflow patterns in aarch64.md. In addition a new mode representing overflow 
> CC_Vmode is introduced.
> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

Normally it is preferred that each patch in a series stands independent of the 
others. So if I apply just 1/4 I should get a working toolchain. You have some 
dependencies here between 1/4 and 3/4.

Rather than ask you to rework these patches, I think I'll instead ask you to 
squash them all to a single commit after we're done with review. That will save 
you some rebase work and maintain the property that trunk can be built at most 
revisions.


> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_subv_128bit_scratch_regs): Declare.

Why use 128bit in the function name rather than call it 
aarch64_subvti_scratch_regs ?


> @@ -16337,6 +16353,131 @@ aarch64_split_dimode_const_store (rtx dst, rtx src)
>return true;
>  }
>  
> +/* Generate RTL for a conditional branch with rtx comparison CODE in
> +   mode CC_MODE.  The destination of the unlikely conditional branch
> +   is LABEL_REF.  */
> +
> +void
> +aarch64_gen_unlikely_cbranch (enum rtx_code code, machine_mode cc_mode,
> +   rtx label_ref)
> +{
> +  rtx x;
> +  x = gen_rtx_fmt_ee (code, VOIDmode,
> +   gen_rtx_REG (cc_mode, CC_REGNUM),
> +   const0_rtx);
> +
> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
> + gen_rtx_LABEL_REF (VOIDmode, label_ref),
> + pc_rtx);
> +  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
> +}
> +

I'm a bit surprised this is AArch64 specific and there are no helper functions 
to get you here. Not that it should block the patch; but if we can reuse 
something I'd prefer we did.

> +void
> +aarch64_expand_subvti (rtx op0, rtx low_dest, rtx low_in1,
> +rtx low_in2, rtx high_dest, rtx high_in1,
> +rtx high_in2)
> +{
> +  if (low_in2 == const0_rtx)
> +{
> +  low_dest = low_in1;
> +  emit_insn (gen_subdi3_compare1 (high_dest, high_in1,
> +   force_reg (DImode, high_in2)));
> +}
> +  else
> +{
> +  if (CONST_INT_P (low_in2))
> + {
> +   low_in2 = force_reg (DImode, GEN_INT (-UINTVAL (low_in2)));
> +   high_in2 = force_reg (DImode, high_in2);
> +   emit_insn (gen_adddi3_compareC (low_dest, low_in1, low_in2));
> + }
> +  else
> + emit_insn (gen_subdi3_compare1 (low_dest, low_in1, low_in2));
> +  emit_insn (gen_subdi3_carryinCV (high_dest,
> +force_reg (DImode, high_in1),
> +high_in2));

This is where we'd break the build. gen_subdi3_carryinCV isn't defined until 
3/4.

The above points are minor.

This patch is OK with them cleaned up, once I've reviewed the other 3 parts to 
this series.

James

> 
> 2018-05-31  Michael Collison  
> Richard Henderson 
> 
> * config/aarch64/aarch64-modes.def (CC_V): New.
> * config/aarch64/aarch64-protos.h
> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_subv_128bit_scratch_regs): Declare.
> (aarch64_expand_subvti): Declare.
> (aarch64_gen_unlikely_cbranch): Declare
> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test for signed 
> overflow using CC_Vmode.
> (aarch64_get_condition_code_1): Handle CC_Vmode.
> (aarch64_gen_unlikely_cbranch): New function.
> (aarch64_add_128bit_scratch_regs): New function.
> (aarch64_subv_128bit_scratch_regs): New function.
> (aarch64_expand_subvti): New function.




gnutools-6308-pt1.patch
Description: gnutools-6308-pt1.patch
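
The shape of aarch64_expand_subvti quoted above (subtract the low halves, then subtract the high halves with the borrow and test for signed overflow) can be modelled in plain C++. This is a sketch of the arithmetic only, under the assumption of a GCC or Clang host with __int128; sub_s128_overflows is a hypothetical name, not a function from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: signed 128-bit subtraction on split halves.
// Returns true when the full 128-bit signed result overflows.
bool sub_s128_overflows(uint64_t low1, int64_t high1,
                        uint64_t low2, int64_t high2,
                        uint64_t* low_out, int64_t* high_out) {
  // Low halves: unsigned subtract; a borrow occurs when low1 < low2.
  // When low2 is zero there is never a borrow, which is the case the
  // patch special-cases.
  bool borrow = __builtin_sub_overflow(low1, low2, low_out);
  // High halves: compute high1 - high2 - borrow in a wider type, then
  // check whether the result still fits in 64 signed bits.
  __int128 wide = static_cast<__int128>(high1) - high2 - (borrow ? 1 : 0);
  *high_out = static_cast<int64_t>(wide);
  return wide != static_cast<__int128>(*high_out);
}
```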

[PATCH] PR fortran/78278 -- Issue error message for double initialization

2018-06-08 Thread Steve Kargl

The attached patch re-arranges code to permit gfortran to issue
an error message under non-gnu -std=* options when an entity
appears in a double initialization.  Prior to this patch,
the new testcase pr78278.f90 would compile without error.
Regression tested on x86_64-*-freebsd.  OK to commit?

2018-06-08  Steven G. Kargl  

PR fortran/78278
* data.c (gfc_assign_data_value): Re-arrange code to allow for
an error for double initialization of CHARACTER entities.

2018-06-08  Steven G. Kargl  

PR fortran/78278
* gfortran.dg/data_bounds_1.f90: Add -std=gnu option.
* gfortran.dg/data_char_1.f90: Ditto.
* gfortran.dg/pr78571.f90: Ditto.
* gfortran.dg/pr78278.f90: New test.

-- 
Steve
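
The check that the patch hoists can be summarised with a small C++ sketch. It is only a model: Loc, Expr and most_recent are hypothetical stand-ins for gfortran's locus and gfc_expr handling. It shows the two points visible in the diff below: the check is skipped when either location is missing, and otherwise the expression on the later source line is the one reported.

```cpp
#include <cassert>

// Hypothetical stand-ins for gfortran's location and expression records.
struct Loc { int line; };
struct Expr { const Loc* where; };  // where is null when no location is known

// Returns the expression to name in the diagnostic, or nullptr when
// there is no earlier initializer or a location is missing.
// rvalue is assumed to be non-null.
const Expr* most_recent(const Expr* init, const Expr* rvalue) {
  if (init == nullptr || init->where == nullptr || rvalue->where == nullptr)
    return nullptr;
  return init->where->line > rvalue->where->line ? init : rvalue;
}
```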
Index: gcc/fortran/data.c
===
--- gcc/fortran/data.c	(revision 261343)
+++ gcc/fortran/data.c	(working copy)
@@ -483,6 +483,21 @@ gfc_assign_data_value (gfc_expr *lvalue, gfc_expr *rva
   mpz_clear (offset);
   gcc_assert (repeat == NULL);
 
+  /* Overwriting an existing initializer is non-standard but usually only
+ provokes a warning from other compilers.  */
+  if (init != NULL && init->where.lb && rvalue->where.lb)
+{
+  /* Order in which the expressions arrive here depends on whether
+	 they are from data statements or F95 style declarations.
+	 Therefore, check which is the most recent.  */
+  expr = (LOCATION_LINE (init->where.lb->location)
+	  > LOCATION_LINE (rvalue->where.lb->location))
+	   ? init : rvalue;
+  if (gfc_notify_std (GFC_STD_GNU, "re-initialization of %qs at %L",
+			  symbol->name, &expr->where) == false)
+	return false;
+}
+
   if (ref || last_ts->type == BT_CHARACTER)
 {
   /* An initializer has to be constant.  */
@@ -501,22 +516,6 @@ gfc_assign_data_value (gfc_expr *lvalue, gfc_expr *rva
 		 "shall not appear in a DATA statement at %L", 
 		 symbol->name, &lvalue->where);
 	  return false;
-	}
-
-  /* Overwriting an existing initializer is non-standard but usually only
-	 provokes a warning from other compilers.  */
-  if (init != NULL)
-	{
-	  /* Order in which the expressions arrive here depends on whether
-	 they are from data statements or F95 style declarations.
-	 Therefore, check which is the most recent.  */
-	  expr = (LOCATION_LINE (init->where.lb->location)
-		  > LOCATION_LINE (rvalue->where.lb->location))
-	   ? init : rvalue;
-	  if (gfc_notify_std (GFC_STD_GNU,
-			  "re-initialization of %qs at %L",
-			  symbol->name, &expr->where) == false)
-	return false;
 	}
 
   expr = gfc_copy_expr (rvalue);
Index: gcc/testsuite/gfortran.dg/data_bounds_1.f90
===
--- gcc/testsuite/gfortran.dg/data_bounds_1.f90	(revision 261342)
+++ gcc/testsuite/gfortran.dg/data_bounds_1.f90	(working copy)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-options "-std=gnu" }
 ! Checks the fix for PR32315, in which the bounds checks below were not being done.
 !
 ! Contributed by Tobias Burnus 
Index: gcc/testsuite/gfortran.dg/data_char_1.f90
===
--- gcc/testsuite/gfortran.dg/data_char_1.f90	(revision 261342)
+++ gcc/testsuite/gfortran.dg/data_char_1.f90	(working copy)
@@ -1,4 +1,5 @@
 ! { dg-do run }
+! { dg-options "-std=gnu" }
 ! Test character variables in data statements
 ! Also substrings of character variables.
 ! PR14976 PR16228 
Index: gcc/testsuite/gfortran.dg/pr78278.f90
===
--- gcc/testsuite/gfortran.dg/pr78278.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr78278.f90	(working copy)
@@ -0,0 +1,14 @@
+! { dg-do compile }
+! { dg-options "-std=f95" }
+! PR fortran/78278
+program p
+   character, pointer :: x => null()
+   data x /null()/ ! { dg-error "GNU Extension: re-initialization" }
+   print *, associated(x)
+end
+
+subroutine foo
+   real :: x = 42
+   data x /0/  ! { dg-error "GNU Extension: re-initialization" }
+   print *, x
+end subroutine foo
Index: gcc/testsuite/gfortran.dg/pr78571.f90
===
--- gcc/testsuite/gfortran.dg/pr78571.f90	(revision 261343)
+++ gcc/testsuite/gfortran.dg/pr78571.f90	(working copy)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-options "-std=gnu" }
 ! PR fortran/78571
 program p
type t

Re: [PATCH 09/14] Remove cgraph_node::summary_uid and make cgraph_node::uid really unique.

2018-06-08 Thread Christophe Lyon

On 8 June 2018 at 22:05, Martin Liška  wrote:
> On 06/08/2018 09:58 PM, Christophe Lyon wrote:
>>
>> On 7 June 2018 at 14:09, Jan Hubicka  wrote:


 gcc/ChangeLog:

 2018-05-16  Martin Liska  

* cgraph.c (cgraph_node::remove): Do not recycle uid.
* cgraph.h (symbol_table::release_symbol): Do not pass uid.
(symbol_table::allocate_cgraph_symbol): Do not set uid.
* passes.c (uid_hash_t): Record removed_nodes by their uids.
(remove_cgraph_node_from_order): Use the removed_nodes set.
(do_per_function_toporder): Likewise.
* symbol-summary.h (symtab_insertion): Use cgraph_node::uid
instead of summary_uid.
(symtab_removal): Likewise.
(symtab_duplication): Likewise.

 gcc/lto/ChangeLog:

 2018-05-16  Martin Liska  

* lto-partition.c (lto_balanced_map): Use cgraph_node::uid
instead of summary_uid.
>>>
>>>
>>> I am still not convinced that completely moving from arrays made dense by
>>> uid recycling to hash tables is a performance-wise smart idea, but current
>>> uid is not working very well for this purpose - most summaries we have
>>> are only for definitions so we want something like definition uid.
>>>
>>> In general it seems bad that we allocate same memory for object with
>>> definition
>>> and external symbol. Something I planned to change but did not get to do
>>> that yet.
>>>
>>> So the patch is OK. With new abstraction we can always re-invent dense
>>> uids for
>>> this purpose later.
>>>
>>> Honza
>>
>>
>>
>> Hi!
>>
>> This patch broke the GCC build:
>
>
> Sorry for that. It was just short breakage, it's fixed in r261320.
>

OK, good to know. My build queue hasn't reached that stage yet :)

Thanks

> Martin
>
>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
>> In function ‘void remove_cgraph_node_from_order(cgraph_node*, void*)’:
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> warning: ‘>>’ operator will be treated as two right angle brackets in
>> C++0x
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> warning: suggest parentheses around ‘>>’ expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: ‘removed_nodes’ was not declared in this scope
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: ‘*’ cannot appear in a constant-expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> warning: ‘>>’ operator will be treated as two right angle brackets in
>> C++0x
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> warning: suggest parentheses around ‘>>’ expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: ‘*’ cannot appear in a constant-expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 3 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 1 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 2 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: an assignment cannot appear in a constant-expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 3 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 1 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
>> error: template argument 2 is invalid
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
>> In function ‘void do_per_function_toporder(void (*)(function*, void*),
>> void*)’:
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> warning: ‘>>’ operator will be treated as two right angle brackets in
>> C++0x
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> warning: suggest parentheses around ‘>>’ expression
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> error: ‘removed_nodes’ was not declared in this scope
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> error: template argument 3 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> error: template argument 1 is invalid
>>
>> /tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
>> error: template argument 2 is invalid
>> make[2]: *** [passes.o] Error 1
>>
>
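
The warnings in the log above ("'>>' operator will be treated as two right angle brackets in C++0x") suggest the host compiler was parsing passes.c as C++98, where ">>" inside nested template arguments is lexed as a right shift. The offending line and the actual fix in r261320 are not shown in this thread, so the following is only a generic C++ sketch of the portable spelling; Matrix and make_matrix are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Writing "> >" keeps nested template argument lists valid in C++98,
// where ">>" is always lexed as the right-shift operator.
typedef std::vector<std::vector<int> > Matrix;

Matrix make_matrix(unsigned rows, unsigned cols) {
  return Matrix(rows, std::vector<int>(cols, 0));
}
```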

Re: [PATCH] Avoid excessive function type casts with splay-trees part 2

2018-06-08 Thread Bernd Edlinger

On 06/08/18 16:28, David Malcolm wrote:
> On Fri, 2018-06-08 at 14:03 +, Bernd Edlinger wrote:
>> Hi!
>>
>>
>> This patch converts the splay-tree internals into a template, and
>> makes
>> the typed_splay_tree template really type-safe.  Previously
>> everything
>> would break apart if KEY_TYPE or VALUE_TYPE would not be pointer
>> types.
>> This limitation is now removed.
>>
>> I took the freedom to add a remove function which is only for
>> completeness and test coverage, but not (yet) used in a productive
>> way.
>>
>>
>> Bootstrapped and reg-tested on x86_64-linux-gnu.
>> Is it OK for trunk?
> 
> Was this testing done with "jit" enabled? (there's some usage of
> typed_splay_tree there, for jit's equivalent of switch statements)
> 
> Note that the jit frontend isn't covered by "all"  in --enable-
> languages; it has to be added manually, iirc since it requires --
> enable-host-shared.
> 

Yes, good point.  I repeated the test with --enable-host-shared
and all jit tests did pass.


Thanks
Bernd.


> Thanks
> Dave
>
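
To make the type-safety point concrete, here is a minimal C++ sketch of a typed wrapper whose key and value types need not be pointers. It is not the GCC typed_splay_tree: std::map stands in for the splay tree, and typed_tree, Range and compare_long are hypothetical names. The value type is deliberately wider than a pointer on LP64 hosts, which is the kind of type a wrapper that casts through pointer-sized slots could not hold.

```cpp
#include <cassert>
#include <map>

// Hypothetical typed ordered map; std::map stands in for the splay tree.
template <typename K, typename V>
class typed_tree {
 public:
  typedef int (*compare_fn)(K, K);
  explicit typed_tree(compare_fn cmp) : map_(Less(cmp)) {}
  void insert(K key, V value) { map_[key] = value; }
  // Returns a pointer to the stored value, or nullptr if absent.
  const V* lookup(K key) const {
    typename Map::const_iterator it = map_.find(key);
    return it == map_.end() ? nullptr : &it->second;
  }
  void remove(K key) { map_.erase(key); }

 private:
  // Adapts a three-way comparison function to a strict weak ordering.
  struct Less {
    compare_fn cmp;
    explicit Less(compare_fn c) : cmp(c) {}
    bool operator()(const K& a, const K& b) const { return cmp(a, b) < 0; }
  };
  typedef std::map<K, V, Less> Map;
  Map map_;
};

struct Range { long lo; long hi; };  // two longs: wider than one pointer on LP64
int compare_long(long a, long b) { return (a > b) - (a < b); }
```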

Re: [PATCH] Fix altivec-7 issues on Power 6

2018-06-08 Thread Carl Love

GCC maintainers:

Aargh!!  I attached an old copy of the patch to the original message. 
Guess it is time to do some house cleaning.  

  Carl Love
---

gcc/testsuite/ChangeLog:

2018-06-08  Carl Love  
* gcc.target/powerpc/altivec-7.c (main): Remove tests
vec_unpackh(vecubi[0]) and vec_unpackl(vecubi[0]) returning
long long bool.  Remove duplicate dg-final for xxlxor.  Update
dg-final instruction counts.
* gcc.target/powerpc/altivec-37.c (main): New file for
tests vec_unpackh and vec_unpackl returning long long bool and
long long int.
---
 gcc/testsuite/gcc.target/powerpc/altivec-37.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/altivec-7.c  |  9 +---
 2 files changed, 33 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/altivec-37.c

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-37.c b/gcc/testsuite/gcc.target/powerpc/altivec-37.c
new file mode 100644
index 000..362b6ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-37.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mpower8-vector -mvsx" } */
+
+#include <altivec.h>
+
+vector bool int *vecubi;
+vector bool long long *vecublli;
+vector signed int *vecsi;
+vector signed long long int *vecslli;
+
+int main ()
+{
+
+  /*  use of ‘long long’ in AltiVec types requires -mvsx */
+  /* __builtin_altivec_vupkhsw and __builtin_altivec_vupklsw
+ requires the -mpower8-vector option */
+  *vecublli++ = vec_unpackh(vecubi[0]);
+  *vecublli++ = vec_unpackl(vecubi[0]);
+  *vecslli++ = vec_unpackh(vecsi[0]);
+  *vecslli++ = vec_unpackl(vecsi[0]);
+  
+  return 0;
+}
+
+/* Expected results:
+ vec_unpackh  vupklsw
+ vec_unpackl  vupkhsw
+*/
+
+/* { dg-final { scan-assembler-times "vupklsw" 2 } } */
+/* { dg-final { scan-assembler-times "vupkhsw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-7.c b/gcc/testsuite/gcc.target/powerpc/altivec-7.c
index 6aad9a9..b61092c 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-7.c
@@ -18,7 +18,6 @@ vector unsigned int *vecuint;
 vector bool int *vecubi;
 vector bool char *vecubci;
 vector bool short int *vecubsi;
-vector bool long long int *vecublli;
 vector unsigned short *vecushort;
 vector bool int *vecbint;
 vector float *vecfloat;
@@ -50,13 +49,11 @@ int main ()
 
   *vecubi++ = vec_unpackh(vecubsi[0]);
   *vecuint++ = vec_unpackh(varpixel[0]);
-  *vecublli++ = vec_unpackh(vecubi[0]);
   *vecubsi++ = vec_unpackh(vecubci[0]);
   *vecshort++ = vec_unpackh(vecchar[0]);
 
   *vecubi++ = vec_unpackl(vecubsi[0]);
   *vecuint++ = vec_unpackl(varpixel[0]);
-  *vecublli++ = vec_unpackl(vecubi[0]);
   *vecubsi++ = vec_unpackl(vecubci[0]);
   *vecshort++ = vec_unpackl(vecchar[0]);
   
@@ -72,11 +69,9 @@ int main ()
  vec_lvewx  lvewx
  vec_unpackh  vupklsh
  vec_unpackh  vupklpx
- vec_unpackh  vupklsw
  vec_unpackh  vupklsb
  vec_unpackl  vupkhsh
  vec_unpackl  vupkhpx
- vec_unpackl  vupkhsw
  vec_unpackl  vupkhsb
  vec_andc   xxnor
 xxland
@@ -90,7 +85,7 @@ int main ()
 /* { dg-final { scan-assembler-times "vpkpx" 2 } } */
 /* { dg-final { scan-assembler-times "vmulesb" 1 } } */
 /* { dg-final { scan-assembler-times "vmulosb" 1 } } */
-/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 44 { target le } } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 42 { target le } } } */
 /* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 4 { target be } } } */
 /* { dg-final { scan-assembler-times "lvewx" 2 } } */
 /* { dg-final { scan-assembler-times "lvxl" 1 } } */
@@ -100,12 +95,10 @@ int main ()
 /* { dg-final { scan-assembler-times "xxland" 4 } } */
 /* { dg-final { scan-assembler-times "xxlxor" 5 } } */
 /* { dg-final { scan-assembler-times "xxlandc" 0 } } */
-/* { dg-final { scan-assembler-times "xxlxor" 5 } } */
 /* { dg-final { scan-assembler-times "lvx" 1 } } */
 /* { dg-final { scan-assembler-times "vmsumubm" 1 } } */
 /* { dg-final { scan-assembler-times "vupklpx" 1 } } */
 /* { dg-final { scan-assembler-times "vupklsx" 0 } } */
 /* { dg-final { scan-assembler-times "vupklsb" 2 } } */
 /* { dg-final { scan-assembler-times "vupkhpx" 1 } } */
-/* { dg-final { scan-assembler-times "vupkhsw" 1 } } */
 /* { dg-final { scan-assembler-times "vupkhsb" 2 } } */
-- 
2.7.4

Re: [PATCH 09/14] Remove cgraph_node::summary_uid and make cgraph_node::uid really unique.

2018-06-08 Thread Martin Liška


On 06/08/2018 09:58 PM, Christophe Lyon wrote:

On 7 June 2018 at 14:09, Jan Hubicka  wrote:


gcc/ChangeLog:

2018-05-16  Martin Liska  

   * cgraph.c (cgraph_node::remove): Do not recycle uid.
   * cgraph.h (symbol_table::release_symbol): Do not pass uid.
   (symbol_table::allocate_cgraph_symbol): Do not set uid.
   * passes.c (uid_hash_t): Record removed_nodes by their uids.
   (remove_cgraph_node_from_order): Use the removed_nodes set.
   (do_per_function_toporder): Likewise.
   * symbol-summary.h (symtab_insertion): Use cgraph_node::uid
   instead of summary_uid.
   (symtab_removal): Likewise.
   (symtab_duplication): Likewise.

gcc/lto/ChangeLog:

2018-05-16  Martin Liska  

   * lto-partition.c (lto_balanced_map): Use cgraph_node::uid
   instead of summary_uid.


I am still not convinced that completely moving from arrays made dense by
uid recycling to hash tables is a performance-wise smart idea, but current
uid is not working very well for this purpose - most summaries we have
are only for definitions so we want something like definition uid.

In general it seems bad that we allocate same memory for object with definition
and external symbol. Something I planned to change but did not get to do that 
yet.

So the patch is OK. With new abstraction we can always re-invent dense uids for
this purpose later.

Honza



Hi!

This patch broke the GCC build:


Sorry for that. It was just short breakage, it's fixed in r261320.

Martin


/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
In function ‘void remove_cgraph_node_from_order(cgraph_node*, void*)’:
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘removed_nodes’ was not declared in this scope
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘*’ cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘*’ cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 2 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: an assignment cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 2 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
In function ‘void do_per_function_toporder(void (*)(function*, void*),
void*)’:
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: ‘removed_nodes’ was not declared in this scope
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 2 is invalid
make[2]: *** [passes.o] Error 1
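
Honza's concern about dense arrays versus hash tables, quoted above, can be illustrated with a short C++ sketch. It is a toy model only: DenseSummaries and HashedSummaries are hypothetical and do not mirror GCC's symbol-summary.h. With a uid-indexed array, storage grows with the largest uid ever issued unless uids are recycled; with a hash table, storage tracks the number of live entries at the cost of hashing on each access.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Dense variant: summaries live in a vector indexed by uid.  It stays
// compact only if uids of removed nodes are recycled.
struct DenseSummaries {
  std::vector<std::string> slots;
  void set(unsigned uid, const std::string& s) {
    if (uid >= slots.size()) slots.resize(uid + 1);
    slots[uid] = s;
  }
};

// Hashed variant: uids are never reused, so a stale uid cannot alias a
// newer node; each access pays for a hash lookup instead.
struct HashedSummaries {
  std::unordered_map<unsigned, std::string> slots;
  void set(unsigned uid, const std::string& s) { slots[uid] = s; }
  bool has(unsigned uid) const { return slots.count(uid) != 0; }
};
```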

Re: [PATCH 09/14] Remove cgraph_node::summary_uid and make cgraph_node::uid really unique.

2018-06-08 Thread Christophe Lyon

On 7 June 2018 at 14:09, Jan Hubicka  wrote:
>>
>> gcc/ChangeLog:
>>
>> 2018-05-16  Martin Liska  
>>
>>   * cgraph.c (cgraph_node::remove): Do not recycle uid.
>>   * cgraph.h (symbol_table::release_symbol): Do not pass uid.
>>   (symbol_table::allocate_cgraph_symbol): Do not set uid.
>>   * passes.c (uid_hash_t): Record removed_nodes by their uids.
>>   (remove_cgraph_node_from_order): Use the removed_nodes set.
>>   (do_per_function_toporder): Likewise.
>>   * symbol-summary.h (symtab_insertion): Use cgraph_node::uid
>>   instead of summary_uid.
>>   (symtab_removal): Likewise.
>>   (symtab_duplication): Likewise.
>>
>> gcc/lto/ChangeLog:
>>
>> 2018-05-16  Martin Liska  
>>
>>   * lto-partition.c (lto_balanced_map): Use cgraph_node::uid
>>   instead of summary_uid.
>
> I am still not convinced that completely moving from arrays made dense by
> uid recycling to hash tables is a smart idea performance-wise, but the current
> uid is not working very well for this purpose - most summaries we have
> are only for definitions so we want something like a definition uid.
>
> In general it seems bad that we allocate the same memory for an object with a
> definition and for an external symbol. Something I planned to change but did
> not get to do yet.
>
> So the patch is OK. With the new abstraction we can always re-invent dense uids
> for this purpose later.
>
> Honza


Hi!

This patch broke the GCC build:
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
In function ‘void remove_cgraph_node_from_order(cgraph_node*, void*)’:
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘removed_nodes’ was not declared in this scope
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘*’ cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: ‘*’ cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 2 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: an assignment cannot appear in a constant-expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1646:
error: template argument 2 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:
In function ‘void do_per_function_toporder(void (*)(function*, void*),
void*)’:
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
warning: ‘>>’ operator will be treated as two right angle brackets in
C++0x
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
warning: suggest parentheses around ‘>>’ expression
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: ‘removed_nodes’ was not declared in this scope
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 3 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 1 is invalid
/tmp/9400570_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1664:
error: template argument 2 is invalid
make[2]: *** [passes.o] Error 1
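
For readers unfamiliar with this failure mode: the diagnostics above are
consistent with an older host compiler tokenizing ">>" at the end of a nested
template argument list as the right-shift operator, which is what C++98
requires.  A minimal sketch follows, assuming a hypothetical nested template
type (uid_map_t and record_uid are illustrative names, not the passes.c
declarations), written so that it compiles as both C++98 and C++11:

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a nested template type; not the passes.c code.
// The space in "> >" is required by C++98; C++11 also accepts ">>".
typedef std::map<int, std::vector<int> > uid_map_t;

// Record a uid under a key and return how many uids that key now holds.
inline std::size_t record_uid(uid_map_t &m, int key, int uid)
{
  m[key].push_back(uid);
  return m[key].size();
}
```

Writing the closing brackets as "> >" is the conventional way to keep such a
declaration buildable with a C++98 bootstrap compiler.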

[PATCH] Fix altivec-7 issues on Power 6

2018-06-08 Thread Carl Love

GCC Maintainers:

Test file gcc/testsuite/gcc.target/powerpc/altivec-7.c has issues when
compiling for Power 6.  Specifically, the problem is with the new tests
that were added for vec_unpackh and vec_unpackl that return a long long
bool: the long long type is not compatible on Power 6 with just the
"-maltivec" command line option.

This patch removes the tests for vec_unpackh and vec_unpackl that
return a long long bool from altivec-7.c and puts them in a new file
altivec-37.c using the "-mvsx" option.  Additionally, tests for the two
builtins returning long long int are added to altivec-37.c.

The patch was tested on:

powerpc64le-unknown-linux-gnu (Power 8 LE)   
powerpc64le-unknown-linux-gnu (Power 9 LE)
powerpc64-unknown-linux-gnu (Power 8 BE)

With no regressions.

Additionally, hand testing with the commands 

  make -k check-gcc RUNTESTFLAGS="-mcpu=power6 --target_board=unix'{-m64,-m32}' powerpc.exp=altivec-7.c"

  make -k check-gcc RUNTESTFLAGS="-mcpu=power6 --target_board=unix'{-m64,-m32}' powerpc.exp=altivec-37.c"

were run on all three configurations to ensure compiling for Power 6 works
everywhere.

Please let me know if the patch looks OK for GCC mainline. 

 Carl Love

-

gcc/testsuite/ChangeLog:

2018-06-08  Carl Love  
* gcc.target/powerpc/altivec-7.c (main): Remove tests
vec_unpackh(vecubi[0]) and vec_unpackl(vecubi[0]).  Remove
duplicate dg-final for xxlxor.  Update dg-final instruction
counts.
* gcc.target/powerpc/altivec-37.c (main): New file for
tests vec_unpackh(vecubi[0]) and vec_unpackl(vecubi[0]).
---
 gcc/testsuite/gcc.target/powerpc/altivec-37.c | 32 +++
 gcc/testsuite/gcc.target/powerpc/altivec-7.c  |  9 +---
 2 files changed, 33 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/altivec-37.c

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-37.c b/gcc/testsuite/gcc.target/powerpc/altivec-37.c
new file mode 100644
index 000..a77bcd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-37.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mpower8-vector -mvsx" } */
+
+#include 
+
+vector bool int *vecubi;
+vector bool long long *vecublli;
+vector signed int *vecsi;
+vector signed long long int *vecslli;
+
+int main ()
+{
+
+  /*  use of ‘long long’ in AltiVec types requires -mvsx */
+  /* __builtin_altivec_vupkhsw and __builtin_altivec_vupklsw
+ requires the -mpower8-vector option */
+  *vecublli++ = vec_unpackh(vecubi[0]);
+  *vecslli++ = vec_unpackl(vecsi[0]);
+  
+  return 0;
+}
+
+//MAKE SURE INSTRUCTIONS ARE CORRECT
+
+/* Expected results:
+ vec_unpackh        vupklsw
+ vec_unpackl        vupkhsw
+*/
+
+/* { dg-final { scan-assembler-times "vupklsw" 1 } } */
+/* { dg-final { scan-assembler-times "vupkhsw" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-7.c b/gcc/testsuite/gcc.target/powerpc/altivec-7.c
index 6aad9a9..b61092c 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-7.c
@@ -18,7 +18,6 @@ vector unsigned int *vecuint;
 vector bool int *vecubi;
 vector bool char *vecubci;
 vector bool short int *vecubsi;
-vector bool long long int *vecublli;
 vector unsigned short *vecushort;
 vector bool int *vecbint;
 vector float *vecfloat;
@@ -50,13 +49,11 @@ int main ()
 
   *vecubi++ = vec_unpackh(vecubsi[0]);
   *vecuint++ = vec_unpackh(varpixel[0]);
-  *vecublli++ = vec_unpackh(vecubi[0]);
   *vecubsi++ = vec_unpackh(vecubci[0]);
   *vecshort++ = vec_unpackh(vecchar[0]);
 
   *vecubi++ = vec_unpackl(vecubsi[0]);
   *vecuint++ = vec_unpackl(varpixel[0]);
-  *vecublli++ = vec_unpackl(vecubi[0]);
   *vecubsi++ = vec_unpackl(vecubci[0]);
   *vecshort++ = vec_unpackl(vecchar[0]);
   
@@ -72,11 +69,9 @@ int main ()
  vec_lvewx  lvewx
  vec_unpackh        vupklsh
  vec_unpackh        vupklpx
- vec_unpackh        vupklsw
  vec_unpackh        vupklsb
  vec_unpackl        vupkhsh
  vec_unpackl        vupkhpx
- vec_unpackl        vupkhsw
  vec_unpackl        vupkhsb
  vec_andc   xxnor
 xxland
@@ -90,7 +85,7 @@ int main ()
 /* { dg-final { scan-assembler-times "vpkpx" 2 } } */
 /* { dg-final { scan-assembler-times "vmulesb" 1 } } */
 /* { dg-final { scan-assembler-times "vmulosb" 1 } } */
-/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 44 { target le } } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 42 { target le } } } */
 /* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M} 4 { target be } } } */
 /* { dg-final { scan-assembler-times

Re: [patch, fortran] Fix PR 85631, wrong size checking with allocatable arrays

2018-06-08 Thread Steve Kargl

On Fri, Jun 08, 2018 at 09:06:55PM +0200, Thomas Koenig wrote:
> 
> the attached patch fixes a bug which was uncovered by the PR in
> a matmul regression.
> 
> The problem is that bounds checking on the LHS with reallocation on
> assignment makes no sense, and the original flag was not set for
> the case in question.
> 
> I added both the original test and the reduced test to the single test
> case.
> 
> OK for trunk?
> 

OK.

-- 
steve

Re: [PATCH] PR fortran/78571 -- Avoid ICE in invalid CHARACTER initialization

2018-06-08 Thread Steve Kargl

On Fri, Jun 08, 2018 at 09:01:11PM +0200, Thomas König wrote:
> > Regression tested on x86_64-*-freebsd.  OK to commit?
> 
> OK, and thanks!
> 

Thanks.  Committed to trunk.

-- 
Steve

Re: [PATCH] PR fortran/86059 -- NULL() cannot be in array constructor

2018-06-08 Thread Steve Kargl

On Fri, Jun 08, 2018 at 09:01:50PM +0200, Thomas Koenig wrote:
> > Regression tested on x86_64-*-freebsd.  OK to commit.
> 
> OK, and thanks!

Thanks.  Committed to trunk.

-- 
Steve

[patch, fortran] Fix PR 85631, wrong size checking with allocatable arrays

2018-06-08 Thread Thomas Koenig


Hello world,

the attached patch fixes a bug which was uncovered by the PR in
a matmul regression.

The problem is that bounds checking on the LHS with reallocation on
assignment makes no sense, and the original flag was not set for
the case in question.

I added both the original test and the reduced test to the single test
case.

OK for trunk?

Regards

Thomas

2018-06-08  Thomas Koenig  

PR fortran/85631
* trans.h (gfc_ss): Add field no_bounds_check.
* trans-array.c (gfc_conv_ss_startstride): If flag_realloc_lhs and
ss->no_bounds_check is set, do not use runtime checks.
* trans-expr.c (gfc_trans_assignment_1): Set lss->no_bounds_check
for reallocatable lhs.

2018-06-08  Thomas Koenig  

PR fortran/85631
* gfortran.dg/bounds_check_20.f90: New test.
Index: trans-array.c
===
--- trans-array.c	(revision 261245)
+++ trans-array.c	(working copy)
@@ -4304,7 +4304,7 @@ done:
 	}
 }
 
-  /* The rest is just runtime bound checking.  */
+  /* The rest is just runtime bounds checking.  */
   if (gfc_option.rtcheck & GFC_RTCHECK_BOUNDS)
 {
   stmtblock_t block;
@@ -4334,7 +4334,7 @@ done:
 	continue;
 
 	  /* Catch allocatable lhs in f2003.  */
-	  if (flag_realloc_lhs && ss->is_alloc_lhs)
+	  if (flag_realloc_lhs && ss->no_bounds_check)
 	continue;
 
 	  expr = ss_info->expr;
Index: trans-expr.c
===
--- trans-expr.c	(revision 261245)
+++ trans-expr.c	(working copy)
@@ -9982,12 +9982,15 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr
 
   /* Walk the lhs.  */
   lss = gfc_walk_expr (expr1);
-  if (gfc_is_reallocatable_lhs (expr1)
-  && !(expr2->expr_type == EXPR_FUNCTION
-	   && expr2->value.function.isym != NULL
-	   && !(expr2->value.function.isym->elemental
-		|| expr2->value.function.isym->conversion)))
-lss->is_alloc_lhs = 1;
+  if (gfc_is_reallocatable_lhs (expr1))
+{
+  lss->no_bounds_check = 1;
+  if (!(expr2->expr_type == EXPR_FUNCTION
+	&& expr2->value.function.isym != NULL
+	&& !(expr2->value.function.isym->elemental
+		 || expr2->value.function.isym->conversion)))
+	lss->is_alloc_lhs = 1;
+}
 
   rss = NULL;
 
Index: trans.h
===
--- trans.h	(revision 261245)
+++ trans.h	(working copy)
@@ -330,6 +330,7 @@ typedef struct gfc_ss
   struct gfc_loopinfo *loop;
 
   unsigned is_alloc_lhs:1;
+  unsigned no_bounds_check:1;
 }
 gfc_ss;
 #define gfc_get_ss() XCNEW (gfc_ss)
! { dg-do  run }
! { dg-additional-options "-fcheck=bounds -ffrontend-optimize" }
! PR 85631 - this used to cause a runtime error with bounds checking.
module x
contains
  subroutine sub(a, b)
real, dimension(:,:), intent(in) :: a
real, dimension(:,:), intent(out), allocatable :: b
b = transpose(a)
  end subroutine sub
end module x

program main
  use x
  implicit none
  real, dimension(2,2) :: a
  real, dimension(:,:), allocatable :: b
  data a /-2., 3., -5., 7./
  call sub(a, b)
  if (any (b /= reshape([-2., -5., 3., 7.], shape(b)))) stop 1
  b = matmul(transpose(b), a)
  if (any (b /= reshape([-11., 15., -25.,  34.], shape(b)))) stop 2
end program
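
The flag handling changed by the patch above can be modelled roughly as
follows.  This is only a sketch: LhsFlags, classify_lhs and skip_bounds_check
are hypothetical names, the two booleans stand in for the gfc_ss bit-fields,
and the "intrinsic that is neither elemental nor a conversion" condition on
the right-hand side is collapsed into a single parameter.

```cpp
// Simplified model of the two gfc_ss flags touched by the patch.
struct LhsFlags
{
  bool is_alloc_lhs;
  bool no_bounds_check;
};

// Models the new logic in gfc_trans_assignment_1: no_bounds_check is set
// for every reallocatable LHS, while is_alloc_lhs keeps its old, narrower
// condition (not set when the RHS is a non-elemental intrinsic call).
inline LhsFlags classify_lhs(bool reallocatable_lhs,
                             bool rhs_is_nonelemental_intrinsic)
{
  LhsFlags f = { false, false };
  if (reallocatable_lhs)
    {
      f.no_bounds_check = true;
      if (!rhs_is_nonelemental_intrinsic)
        f.is_alloc_lhs = true;
    }
  return f;
}

// Models the test in gfc_conv_ss_startstride after the patch: the runtime
// bounds check is skipped based on no_bounds_check, not is_alloc_lhs.
inline bool skip_bounds_check(bool flag_realloc_lhs, const LhsFlags &f)
{
  return flag_realloc_lhs && f.no_bounds_check;
}
```

In this model, an assignment such as b = transpose(a) with an allocatable b
gets no_bounds_check set even though is_alloc_lhs stays clear, which is the
case the old test on is_alloc_lhs missed.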

Re: [PATCH] PR fortran/86059 -- NULL() cannot be in array constructor

2018-06-08 Thread Thomas Koenig


Hi Steve,


> Regression tested on x86_64-*-freebsd.  OK to commit.


OK, and thanks!

Thomas

Re: [PATCH] PR fortran/78571 -- Avoid ICE in invalid CHARACTER initialization

2018-06-08 Thread Thomas König


Hi Steve,


> Regression tested on x86_64-*-freebsd.  OK to commit?


OK, and thanks!

Thomas

Enforce F2008:C1282 and F2018:C1588

2018-06-08 Thread Steve Kargl

The attached patch addresses part of an issue raised in
PR fortran/63514 by enforcing F2008:C1282 and F2018:C1588.
Regression tested on x86_64-*-freebsd.  OK to commit?

PS: the actual underlying point of PR fortran/63514 is bogus,
and I recommend that it be closed with WONTFIX.


2018-06-08  Steven G. Kargl  

PR fortran/63514
* symbol.c (gfc_add_volatile): Enforce F2008:C1282 and F2018:C1588.


2018-06-08  Steven G. Kargl  

PR fortran/63514
* gfortran.dg/pr63514.f90: New test.


-- 
Steve
Index: gcc/fortran/symbol.c
===
--- gcc/fortran/symbol.c	(revision 261285)
+++ gcc/fortran/symbol.c	(working copy)
@@ -1349,6 +1349,20 @@ gfc_add_volatile (symbol_attribute *attr, const char *
 			 where))
   return false;
 
+  /* F2008:  C1282 A designator of a variable with the VOLATILE attribute
+ shall not appear in a pure subprogram.
+
+ F2018: C1588 A local variable of a pure subprogram, or of a BLOCK
+ construct within a pure subprogram, shall not have the SAVE or
+ VOLATILE attribute.  */
+  if (gfc_pure (NULL))
+{
+  gfc_error ("VOLATILE attribute at %L cannot be specified in a "
+		 "PURE procedure", where);
+  return false;
+}
+
+
   attr->volatile_ = 1;
   attr->volatile_ns = gfc_current_ns;
   return check_conflict (attr, name, where);
Index: gcc/testsuite/gfortran.dg/pr63514.f90
===
--- gcc/testsuite/gfortran.dg/pr63514.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr63514.f90	(working copy)
@@ -0,0 +1,41 @@
+! { dg-do compile }
+! PR fortran/63514.f90
+program foo
+
+   implicit none
+
+   integer, volatile :: n
+
+   n = 0
+
+   call bar
+   call bah
+
+   contains
+
+   subroutine bar
+  integer k
+  integer, volatile :: m
+  block
+ integer, save :: i
+ integer, volatile :: j
+ i = 42
+ j = 2 * i
+ k = i + j + n
+  end block
+   end subroutine bar
+
+   pure subroutine bah
+  integer k
+  integer, volatile :: m ! { dg-error "cannot be specified in a PURE" }
+  block
+ integer, save :: i  ! { dg-error "cannot be specified in a PURE" }
+ integer, volatile :: j  ! { dg-error "cannot be specified in a PURE" }
+ i = 42  ! { dg-error "has no IMPLICIT type" }
+ j = 2 * i   ! { dg-error "has no IMPLICIT type" }
+ k = i + j + n
+  end block
+  m = k * m  ! { dg-error "has no IMPLICIT type" }
+   end subroutine bah
+
+end program foo

Re: [PATCH, rs6000] Add missing test cases, fix arguments to match specifications.

2018-06-08 Thread Segher Boessenkool

On Fri, Jun 08, 2018 at 10:09:23AM -0700, Carl Love wrote:
> > > @@ -100,7 +152,6 @@ extract_uchar_15 (vector unsigned char a)
> > >  /* { dg-final { scan-assembler "extsb " } } */
> > >  /* { dg-final { scan-assembler "extsh " } } */
> > >  /* { dg-final { scan-assembler "extsw " } } */
> > > -/* { dg-final { scan-assembler-not "m\[ft\]vsr" } } */
> > >  /* { dg-final { scan-assembler-not "stxvd2x "   } } */
> > >  /* { dg-final { scan-assembler-not "stxv "  } } */
> > >  /* { dg-final { scan-assembler-not "lwa "   } } */
> > 
> > Why delete this?  The changelog doesn't mention it either.
> > 
> > Otherwise okay for trunk.  Thanks!
> 
> I went back and looked at that.  It has been a while since I did the
> patch and I don't remember the details.  The above occurs in file
> gcc/testsuite/gcc.target/powerpc/p9-extract-1.c.  Reading the test
> file, there is a comment at the top that I probably didn't read before.
>
> /* Test that under ISA 3.0 (-mcpu=power9), the compiler optimizes conversion to
>    double after a vec_extract to use the VEXTRACTU{B,H} or XXEXTRACTUW
>    instructions (which leaves the result in a vector register), and not the
>    VEXTU{B,H,W}{L,R}X instructions (which needs a direct move to do the floating
>    point conversion).  */
>
> So, the dg-final { scan-assembler-not "m\[ft\]vsr" } directive is checking to
> make sure we are not using any direct moves.  The new vec_extract tests with
> the "long long int" and "long long bool" are generating the move
> instruction.  It looks like the existing GCC support for VEXTRACTU{B,H}
> or XXEXTRACTUW doesn't include support for extracting a double element.

Maybe split the test file in two, then?  Or just do scan-assembler-times
for mfvsr and for mtvsr?

> There is a new Power 9 instruction vextractd for extracting double as
> well as the new Power 9 instructions vextractub, vextractuw,
> vextractuh.  At first glance I didn't see an xxextractd or similar
> instruction.  Will need to look further.  So, that said, it looks like
> I really need to add the support to GCC to extract the double element. 
> Based on a quick look at the code, that is not trivial.  So, I have
> dropped the changes to file p9-extract-1.c from the patch.  The updated
> patch is given below.
> 
> Please let me know if this revised patch is OK for mainline.  The
> changes to p9-extract-1.c will be addressed in a future patch.  Thanks.

Looks fine to me.  Okay for trunk.  Thanks!


Segher

Re: [PATCH, rs6000] Fix PR85755: PowerPC Gcc's -mupdate produces inefficient code

2018-06-08 Thread Peter Bergner

On 6/8/18 12:12 PM, Segher Boessenkool wrote:
> On Fri, Jun 08, 2018 at 12:07:34PM -0500, Peter Bergner wrote:
>> On 6/8/18 11:21 AM, Segher Boessenkool wrote:
>>> On Fri, Jun 08, 2018 at 10:35:22AM -0500, Peter Bergner wrote:
>>>> +/* { dg-final { scan-assembler-times {\mstdu\M} 2 } } */
>>>> +/* { dg-final { scan-assembler-not {\mstfdu\M} } } */
>>>
>>> Does this need p8 at all?  Would it be better to just test without -mcpu=,
>>> on just whatever default cpu is thrown at it?  p8 is default for powerpc64le
>>> so it will get plenty coverage.
>>>
>>> You do need an lp64 target btw.
>>
>> I guess I was just following what was reported in the bugzilla, but you
>> are correct, we don't need -mcpu=power8.  How about the following?
> 
> Looks perfect, thanks!

Ok, I committed the patch with the updated test case to trunk.
I'll let that bake over the weekend before committing the backports
to the GCC 7 and 8 release branches.  Thanks!

Peter

Re: [PATCH, rs6000] Fix PR85755: PowerPC Gcc's -mupdate produces inefficient code

2018-06-08 Thread Segher Boessenkool

On Fri, Jun 08, 2018 at 12:07:34PM -0500, Peter Bergner wrote:
> On 6/8/18 11:21 AM, Segher Boessenkool wrote:
> > On Fri, Jun 08, 2018 at 10:35:22AM -0500, Peter Bergner wrote:
> >> +/* { dg-final { scan-assembler-times {\mstdu\M} 2 } } */
> >> +/* { dg-final { scan-assembler-not {\mstfdu\M} } } */
> > 
> > Does this need p8 at all?  Would it be better to just test without -mcpu=,
> > on just whatever default cpu is thrown at it?  p8 is default for powerpc64le
> > so it will get plenty coverage.
> > 
> > You do need an lp64 target btw.
> 
> I guess I was just following what was reported in the bugzilla, but you
> are correct, we don't need -mcpu=power8.  How about the following?

Looks perfect, thanks!


Segher

[PATCH][PR sanitizer/86090] Add checks for lstat and readlink to sanitizer configure.

2018-06-08 Thread Denis Khalikov

Hello,
this is a patch for PR sanitizer/86090
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86090
Thanks.
From: Denis Khalikov 
Date: Fri, 8 Jun 2018 19:53:01 +0300
Subject: [PATCH] PR sanitizer/86090

* configure.ac: Check for lstat and readlink.
* configure, config.h.in: Rebuild.
---
 libsanitizer/ChangeLog| 6 ++
 libsanitizer/config.h.in  | 6 ++
 libsanitizer/configure| 2 +-
 libsanitizer/configure.ac | 2 +-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/libsanitizer/ChangeLog b/libsanitizer/ChangeLog
index 43eb1de..4c669dd 100644
--- a/libsanitizer/ChangeLog
+++ b/libsanitizer/ChangeLog
@@ -1,3 +1,9 @@
+2018-06-08  Denis Khalikov  
+
+PR sanitizer/86090
+* configure.ac: Check for lstat and readlink.
+* configure, config.h.in: Rebuild.
+
 2018-05-31  Matthias Klose  
 
 	PR sanitizer/86012
diff --git a/libsanitizer/config.h.in b/libsanitizer/config.h.in
index 7195840..f716c24 100644
--- a/libsanitizer/config.h.in
+++ b/libsanitizer/config.h.in
@@ -43,6 +43,12 @@
 /* Define to 1 if you have the  header file. */
 #undef HAVE_MEMORY_H
 
+/* Define to 1 if you have the `lstat' function. */
+#undef HAVE_LSTAT
+
+/* Define to 1 if you have the `readlink' function. */
+#undef HAVE_READLINK
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_RPC_XDR_H
 
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 4695bc7..5836450 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -15509,7 +15509,7 @@ fi
 
 
 # Check for functions needed.
-for ac_func in clock_getres clock_gettime clock_settime
+for ac_func in clock_getres clock_gettime clock_settime lstat readlink
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac
index 0d11afd..34ba09f 100644
--- a/libsanitizer/configure.ac
+++ b/libsanitizer/configure.ac
@@ -93,7 +93,7 @@ AM_CONDITIONAL(TSAN_SUPPORTED, [test "x$TSAN_SUPPORTED" = "xyes"])
 AM_CONDITIONAL(LSAN_SUPPORTED, [test "x$LSAN_SUPPORTED" = "xyes"])
 
 # Check for functions needed.
-AC_CHECK_FUNCS(clock_getres clock_gettime clock_settime)
+AC_CHECK_FUNCS(clock_getres clock_gettime clock_settime lstat readlink)
 
 # Common libraries that we need to link against for all sanitizer libs.
 link_sanitizer_common='-lpthread -lm'
-- 
1.9.1

Re: [PATCH, rs6000] Add missing test cases, fix arguments to match specifications.

2018-06-08 Thread Carl Love

Segher:
> 
> > @@ -100,7 +152,6 @@ extract_uchar_15 (vector unsigned char a)
> >  /* { dg-final { scan-assembler "extsb " } } */
> >  /* { dg-final { scan-assembler "extsh " } } */
> >  /* { dg-final { scan-assembler "extsw " } } */
> > -/* { dg-final { scan-assembler-not "m\[ft\]vsr" } } */
> >  /* { dg-final { scan-assembler-not "stxvd2x "   } } */
> >  /* { dg-final { scan-assembler-not "stxv "  } } */
> >  /* { dg-final { scan-assembler-not "lwa "   } } */
> 
> Why delete this?  The changelog doesn't mention it either.
> 
> Otherwise okay for trunk.  Thanks!
> 

I went back and looked at that.  It has been a while since I did the
patch and I don't remember the details.  The above occurs in file
gcc/testsuite/gcc.target/powerpc/p9-extract-1.c.  Reading the test
file, there is a comment at the top that I probably didn't read before.

/* Test that under ISA 3.0 (-mcpu=power9), the compiler optimizes conversion to
   double after a vec_extract to use the VEXTRACTU{B,H} or XXEXTRACTUW
   instructions (which leaves the result in a vector register), and not the
   VEXTU{B,H,W}{L,R}X instructions (which needs a direct move to do the floating
   point conversion).  */

So, the dg-final { scan-assembler-not "m\[ft\]vsr" } directive is checking to
make sure we are not using any direct moves.  The new vec_extract tests with
the "long long int" and "long long bool" are generating the move
instruction.  It looks like the existing GCC support for VEXTRACTU{B,H}
or XXEXTRACTUW doesn't include support for extracting a double element.
There is a new Power 9 instruction vextractd for extracting double as
well as the new Power 9 instructions vextractub, vextractuw,
vextractuh.  At first glance I didn't see an xxextractd or similar
instruction.  Will need to look further.  So, that said, it looks like
I really need to add the support to GCC to extract the double element.
Based on a quick look at the code, that is not trivial.  So, I have
dropped the changes to file p9-extract-1.c from the patch.  The updated
patch is given below.

Please let me know if this revised patch is OK for mainline.  The
changes to p9-extract-1.c will be addressed in a future patch.  Thanks.

 Carl Love



gcc/testsuite/ChangeLog:

2018-06-08  Carl Love  

* gcc.target/powerpc/p8vector-builtin-3.c: Add vec_pack test. Update
vpkudum counts.
* gcc.target/powerpc/p9-extract-3.c: Make second argument of
vec_extract a signed int.
* gcc.target/powerpc/vec-cmp.c: Add vec_cmple, vec_cmpge tests. Update
vcmpgtsb, vcmpgtub, vcmpgtsh, vcmpgtuh, vcmpgtsw, vcmpgtuw,
vcmpgtsd, vcmpgtud counts.
* gcc.target/powerpc/vsx-extract-4.c: Make second argument of
vec_extract a signed int.
* gcc.target/powerpc/vsx-extract-5.c: Make second argument of
vec_extract a signed int.
* gcc.target/powerpc/vsx-vector-7.c (foo): Add tests for vec_sel and
vec_xor builtins.  Update xxsel, xxlxor counts.
---
 .../gcc.target/powerpc/p8vector-builtin-3.c   |   9 +-
 .../gcc.target/powerpc/p9-extract-3.c |  36 ++--
 gcc/testsuite/gcc.target/powerpc/vec-cmp.c| 159 +-
 .../gcc.target/powerpc/vsx-extract-4.c|  24 ++-
 .../gcc.target/powerpc/vsx-extract-5.c|  24 ++-
 .../gcc.target/powerpc/vsx-vector-7.c |  72 ++--
 6 files changed, 276 insertions(+), 48 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-3.c
index ff50a9aad..56ba6c722 100644
--- a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-3.c
@@ -33,7 +33,12 @@ vi_sign vi_pack_2 (vll_sign a, vll_sign b)
   return vec_pack (a, b);
 }
 
-vi_sign vi_pack_3 (vll_sign a, vll_sign b)
+vi_uns vi_pack_3 (vll_uns a, vll_uns b)
+{
+  return vec_pack (a, b);
+}
+
+vi_sign vi_pack_4 (vll_sign a, vll_sign b)
 {
   return vec_vpkudum (a, b);
 }
@@ -98,7 +103,7 @@ vll_sign vll_unpack_lo_3 (vi_sign a)
   return vec_vupklsw (a);
 }
 
-/* { dg-final { scan-assembler-times "vpkudum" 3 } } */
+/* { dg-final { scan-assembler-times "vpkudum" 4 } } */
 /* { dg-final { scan-assembler-times "vpkuwum" 3 } } */
 /* { dg-final { scan-assembler-times "vpkuhum" 3 } } */
 /* { dg-final { scan-assembler-times "vupklsw" 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-extract-3.c b/gcc/testsuite/gcc.target/powerpc/p9-extract-3.c
index 90b3eae83..68a0cda01 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-extract-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-extract-3.c
@@ -14,84 +14,96 @@
 double
 fpcvt_int_0 (vector int a)
 {
-  int b = vec_extract (a, 0);
+  int c = 0;
+  int b = vec_extract (a, c);
   return (double)b;
 }
 
 double
 fpcvt_int_3 (vector int a)
 {
-  int b = vec_extract (a, 3);
+  int c = 3;
+  int b =

Re: [PATCH, rs6000] Fix PR85755: PowerPC Gcc's -mupdate produces inefficient code

2018-06-08 Thread Peter Bergner

On 6/8/18 11:21 AM, Segher Boessenkool wrote:
> On Fri, Jun 08, 2018 at 10:35:22AM -0500, Peter Bergner wrote:
>> +/* { dg-final { scan-assembler-times {\mstdu\M} 2 } } */
>> +/* { dg-final { scan-assembler-not {\mstfdu\M} } } */
> 
> Does this need p8 at all?  Would it be better to just test without -mcpu=,
> on just whatever default cpu is thrown at it?  p8 is default for powerpc64le
> so it will get plenty coverage.
> 
> You do need an lp64 target btw.

I guess I was just following what was reported in the bugzilla, but you
are correct, we don't need -mcpu=power8.  How about the following?

Peter

Index: pr85755.c
===
--- pr85755.c   (nonexistent)
+++ pr85755.c   (working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O1" } */
+
+void
+preinc (long *q, long n)
+{
+  long i;
+  for (i = 0; i < n; i++)
+q[i] = i;
+}
+
+void
+predec (long *q, long n)
+{
+  long i;
+  for (i = n; i >= 0; i--)
+q[i] = i;
+}
+
+/* { dg-final { scan-assembler-times {\mstwu\M} 2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mstdu\M} 2 { target lp64 } } } */
+/* { dg-final { scan-assembler-not {\mstfdu\M} } } */

[PATCH] Define special members as defaulted

2018-06-08 Thread Jonathan Wakely


This adds defaulted definitions of copy constructors and copy assignment
operators for a few more types where relying on their implicit declaration
is deprecated (as discussed at the WG21 meeting this week).

* include/bits/ios_base.h (ios::Init::Init(const Init&))
(ios::Init::operator=): Define as defaulted.
* include/bits/stl_bvector.h (_Bit_reference(const _Bit_reference&)):
Likewise.
* include/bits/stream_iterator.h (istream_iterator::operator=)
(ostream_iterator::operator=): Likewise.
* include/bits/streambuf_iterator.h (istreambuf_iterator::operator=):
Likewise.
* include/std/bitset (bitset::reference::reference(const reference&)):
Likewise.
* include/std/complex (complex::complex(const complex&))
(complex::complex(const complex&))
(complex::complex(const complex&)): Likewise.

Tested powerpc64le-linux, committed to trunk.
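
As background, a minimal sketch of the situation the patch addresses, using
a hypothetical Handle class rather than any libstdc++ type.  Since C++11, a
class with a user-declared destructor or copy operation still gets the other
copy operations implicitly, but relying on that implicit declaration is
deprecated.  Explicitly defaulting them keeps the same behaviour.  The sketch
requires C++11 or later.

```cpp
// Hypothetical class, for illustration only; not taken from libstdc++.
struct Handle
{
  int value;

  explicit Handle(int v) : value(v) { }

  // A user-provided destructor makes relying on the implicitly declared
  // copy operations deprecated.
  ~Handle() { }

  // Explicitly defaulted copy operations: same behaviour as the implicit
  // ones, without depending on the deprecated implicit declaration.
  Handle(const Handle&) = default;
  Handle& operator=(const Handle&) = default;
};

// Exercises both defaulted members and returns x.value + y.value.
inline int copy_and_assign(int a, int b)
{
  Handle x(a);
  Handle y(x);   // defaulted copy constructor
  Handle z(b);
  y = z;         // defaulted copy assignment
  return x.value + y.value;
}
```

The #if __cplusplus >= 201103L guards in the patch serve the same purpose
while keeping the headers usable in C++98 mode, where "= default" is not
available.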


commit 3121248b11c1e5a98e9c6337a864471306796000
Author: Jonathan Wakely 
Date:   Fri Jun 8 17:02:22 2018 +0100

Define special members as defaulted

* include/bits/ios_base.h (ios::Init::Init(const Init&))
(ios::Init::operator=): Define as defaulted.
* include/bits/stl_bvector.h (_Bit_reference(const _Bit_reference&)):
Likewise.
* include/bits/stream_iterator.h (istream_iterator::operator=)
(ostream_iterator::operator=): Likewise.
* include/bits/streambuf_iterator.h (istreambuf_iterator::operator=):
Likewise.
* include/std/bitset (bitset::reference::reference(const reference&)):
Likewise.
* include/std/complex (complex::complex(const complex&))
(complex::complex(const complex&))
(complex::complex(const complex&)): Likewise.

diff --git a/libstdc++-v3/include/bits/ios_base.h b/libstdc++-v3/include/bits/ios_base.h
index c0c4e3b2abe..819afb96187 100644
--- a/libstdc++-v3/include/bits/ios_base.h
+++ b/libstdc++-v3/include/bits/ios_base.h
@@ -607,6 +607,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   Init();
   ~Init();
 
+#if __cplusplus >= 201103L
+  Init(const Init&) = default;
+  Init& operator=(const Init&) = default;
+#endif
+
 private:
   static _Atomic_word  _S_refcount;
   static bool  _S_synced_with_stdio;
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 24594044d7a..4527ce7832a 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -79,6 +79,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
 _Bit_reference() _GLIBCXX_NOEXCEPT : _M_p(0), _M_mask(0) { }
 
+#if __cplusplus >= 201103L
+_Bit_reference(const _Bit_reference&) = default;
+#endif
+
 operator bool() const _GLIBCXX_NOEXCEPT
 { return !!(*_M_p & _M_mask); }
 
diff --git a/libstdc++-v3/include/bits/stream_iterator.h b/libstdc++-v3/include/bits/stream_iterator.h
index 002310c07a2..7b682d2959e 100644
--- a/libstdc++-v3/include/bits/stream_iterator.h
+++ b/libstdc++-v3/include/bits/stream_iterator.h
@@ -74,6 +74,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_ok(__obj._M_ok)
   { }
 
+#if __cplusplus >= 201103L
+  istream_iterator& operator=(const istream_iterator&) = default;
+#endif
+
   const _Tp&
   operator*() const
   {
@@ -188,6 +192,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   ostream_iterator(const ostream_iterator& __obj)
   : _M_stream(__obj._M_stream), _M_string(__obj._M_string)  { }
 
+#if __cplusplus >= 201103L
+  ostream_iterator& operator=(const ostream_iterator&) = default;
+#endif
+
   /// Writes @a value to underlying ostream using operator<<.  If
   /// constructed with delimiter string, writes delimiter to ostream.
   ostream_iterator&
diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 292ef3a5335..8a3a382325a 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -121,6 +121,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator(streambuf_type* __s) _GLIBCXX_USE_NOEXCEPT
   : _M_sbuf(__s), _M_c(traits_type::eof()) { }
 
+#if __cplusplus >= 201103L
+  istreambuf_iterator&
+  operator=(const istreambuf_iterator&) noexcept = default;
+#endif
+
   ///  Return the current character pointed to by iterator.  This returns
   ///  streambuf.sgetc().  It cannot be assigned.  NB: The result of
   ///  operator*() on an end of stream is undefined.
diff --git a/libstdc++-v3/include/std/bitset b/libstdc++-v3/include/std/bitset
index e598ea312a7..25e44d1553d 100644
--- a/libstdc++-v3/include/std/bitset
+++ b/libstdc++-v3/include/std/bitset
@@ -816,6 +816,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  _M_bpos = _Base::_S_whichbit(__pos);
}
 
+#if __cplusplus >= 201103L
+   reference(const reference&) = default;
+#endif
+
~reference()
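
A minimal sketch of the pattern these hunks apply, assuming the motivation is
the C++11 rule that deprecates the implicitly-defined copy constructor when a
class declares its own copy assignment operator or destructor (the message
shown here does not state the reason).  bit_ref_sketch is a hypothetical
stand-in modelled on _Bit_reference, not the libstdc++ class:

```cpp
// Hypothetical stand-in for a proxy reference to a single bit.
struct bit_ref_sketch
{
  unsigned long* p;
  unsigned long mask;

  bit_ref_sketch(unsigned long* p_, unsigned long mask_)
  : p(p_), mask(mask_) { }

  // The class has a user-provided copy assignment operator below, so the
  // copy constructor is spelled out as defaulted instead of being left
  // implicit.
  bit_ref_sketch(const bit_ref_sketch&) = default;

  operator bool() const
  { return !!(*p & mask); }

  // Assigning a bool writes through to the referenced bit.
  bit_ref_sketch& operator=(bool x)
  {
    if (x)
      *p |= mask;
    else
      *p &= ~mask;
    return *this;
  }

  // Assigning another proxy copies the bit value, not the pointer.
  bit_ref_sketch& operator=(const bit_ref_sketch& x)
  { return *this = bool(x); }
};
```

Copy construction still copies the pointer and mask, while copy assignment
writes through to the referenced bit; only the way the copy constructor is
declared changes.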

Re: [PATCH, rs6000] Fix PR85755: PowerPC Gcc's -mupdate produces inefficient code

2018-06-08 Thread Segher Boessenkool

On Fri, Jun 08, 2018 at 10:35:22AM -0500, Peter Bergner wrote:
> The fix for PR83969 accidentally disallowed PRE_INC and PRE_DEC addresses
> from being matched for the Y constraint leading to poor code generation.
> The old PRE_INC and PRE_DEC addresses were originally accepted via the
> address_offset() call and test, but the fix for PR83969 added the test
> for rs6000_offsettable_memref_p() and that doesn't accept PRE_INC/PRE_DEC.
> 
> My earlier patch just tried moving the rs6000_offsettable_memref_p() call
> to after the address_offset() call, but I now remember why I had it placed
> before it.  The problem was that the address_offset() call and test was
> incorrectly accepting some non-offsetable addresses, so we need to test for
> rs6000_offsettable_memref_p() first.  However, rs6000_offsettable_memref_p()
> doesn't accept PRE_INC/PRE_DEC addresses, so the fix used here is to just
> test for them explicitly before the other tests, which fixes the reported
> bug and doesn't regress the older bugs.
> 
> Is this ok for trunk and after some trunk burn in, the GCC 8 and 7 release
> branches where the earlier fixes were backported to?  All bootstrap builds
> completed and the testsuite runs all showed no regressions.

Looks good, please commit.  Modulo some testcase things:

> --- gcc/testsuite/gcc.target/powerpc/pr85755.c(nonexistent)
> +++ gcc/testsuite/gcc.target/powerpc/pr85755.c(working copy)
> @@ -0,0 +1,24 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
> +/* { dg-options "-O1 -mcpu=power8" } */
> +
> +void
> +preinc (long *q, long n)
> +{
> +  long i;
> +  for (i = 0; i < n; i++)
> +q[i] = i;
> +}
> +
> +void
> +predec (long *q, long n)
> +{
> +  long i;
> +  for (i = n; i >= 0; i--)
> +q[i] = i;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mstdu\M} 2 } } */
> +/* { dg-final { scan-assembler-not {\mstfdu\M} } } */

Does this need p8 at all?  Would it be better to just test without -mcpu=,
on just whatever default cpu is thrown at it?  p8 is default for powerpc64le
so it will get plenty coverage.

You do need an lp64 target btw.

Thanks!


Segher

[PATCH] Protect rs6000_passes_ieee128 declaration

2018-06-08 Thread David Edelsohn

The new variable rs6000_passes_ieee128 is not referenced on non-ELF
paths. This patch protects the definition to avoid unused variable
warnings.
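
The idea can be sketched in isolation as follows.  This is an illustration
only: DEMO_TARGET_ELF, demo_passes_ieee128 and the two functions are
hypothetical names, not the rs6000.c code.

```cpp
// Hypothetical configuration macro standing in for TARGET_ELF.
#ifndef DEMO_TARGET_ELF
# define DEMO_TARGET_ELF 1
#endif

#if DEMO_TARGET_ELF
// Only defined when some code path actually reads it; otherwise the
// compiler would report an unused static variable.
static bool demo_passes_ieee128;
#endif

// Record that an IEEE 128-bit value was passed or returned.
void demo_note_ieee128_use()
{
#if DEMO_TARGET_ELF
  demo_passes_ieee128 = true;
#endif
}

// Report whether an alias for the old mangling would be wanted.
bool demo_needs_alias()
{
#if DEMO_TARGET_ELF
  return demo_passes_ieee128;
#else
  return false;
#endif
}
```

Every use of the variable sits under the same condition as its definition, so
a configuration that compiles out the uses also compiles out the variable.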

Thanks, David

* config/rs6000/rs6000.c (rs6000_passes_ieee128): Protect with #if TARGET_ELF.

Index: rs6000.c
===
--- rs6000.c(revision 261335)
+++ rs6000.c(working copy)
@@ -197,12 +197,14 @@
of this machine mode.  */
 scalar_int_mode rs6000_pmode;

+#if TARGET_ELF
 /* Note whether IEEE 128-bit floating point was passed or returned, either as
the __float128/_Float128 explicit type, or when long double is IEEE 128-bit
floating point.  We changed the default C++ mangling for these types and we
may want to generate a weak alias of the old mangling (U10__float128) to the
new mangling (u9__ieee128).  */
 static bool rs6000_passes_ieee128;
+#endif

 /* Generate the manged name (i.e. U10__float128) used in GCC 8.1, and not the
name used in current releases (i.e. u9__ieee128).  */

Re: [PATCH, rs6000] Fix PR85755: PowerPC Gcc's -mupdate produces inefficient code

2018-06-08 Thread Peter Bergner

On 6/7/18 8:16 PM, Peter Bergner wrote:
> On 6/7/18 5:12 PM, Peter Bergner wrote:
>> Is this ok for trunk and the release branches where the earlier fixes
>> were backported to, assuming no bootstrap errors and the testsuite runs
>> do not show any regressions?
> 
> Hold off for now.  I'm seeing a TImode issue I need to debug first.

The fix for PR83969 accidentally disallowed PRE_INC and PRE_DEC addresses
from being matched for the Y constraint leading to poor code generation.
The old PRE_INC and PRE_DEC addresses were originally accepted via the
address_offset() call and test, but the fix for PR83969 added the test
for rs6000_offsettable_memref_p() and that doesn't accept PRE_INC/PRE_DEC.

My earlier patch just tried moving the rs6000_offsettable_memref_p() call
to after the address_offset() call, but I now remember why I had it placed
before it.  The problem was that the address_offset() call and test was
incorrectly accepting some non-offsetable addresses, so we need to test for
rs6000_offsettable_memref_p() first.  However, rs6000_offsettable_memref_p()
doesn't accept PRE_INC/PRE_DEC addresses, so the fix used here is to just
test for them explicitly before the other tests, which fixes the reported
bug and doesn't regress the older bugs.
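
The ordering described above can be sketched as follows.  This is a
simplified model, not the rs6000.c code: addr_kind, mem_ref_sketch,
offsettable_p_sketch and mem_operand_gpr_sketch are hypothetical, and the
final offset test is reduced to a multiple-of-4 check purely for illustration.

```cpp
// Hypothetical address forms for the model.
enum addr_kind
{
  ADDR_REG,         // (reg)
  ADDR_REG_OFFSET,  // (plus (reg) (const_int))
  ADDR_INDEXED,     // (plus (reg) (reg)), not offsettable
  ADDR_PRE_INC,
  ADDR_PRE_DEC
};

struct mem_ref_sketch
{
  addr_kind kind;
  long offset;             // only meaningful for ADDR_REG_OFFSET
  bool base_is_plain_reg;  // base of a PRE_INC/PRE_DEC is a simple register
};

// Stand-in for the offsettable test: rejects indexed and pre-modify forms.
static bool offsettable_p_sketch(const mem_ref_sketch& m)
{
  return m.kind == ADDR_REG || m.kind == ADDR_REG_OFFSET;
}

bool mem_operand_gpr_sketch(const mem_ref_sketch& m, bool target_update)
{
  // 1. Accept PRE_INC/PRE_DEC explicitly, before any other test.
  if (target_update
      && (m.kind == ADDR_PRE_INC || m.kind == ADDR_PRE_DEC)
      && m.base_is_plain_reg)
    return true;

  // 2. Reject everything that is not offsettable.
  if (!offsettable_p_sketch(m))
    return false;

  // 3. Simplified offset check on what remains.
  return (m.offset & 3) == 0;
}
```

Because step 1 runs first, the stricter test in step 2 can stay in place for
the addresses it was added to reject.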

Is this ok for trunk and after some trunk burn in, the GCC 8 and 7 release
branches where the earlier fixes were backported to?  All bootstrap builds
completed and the testsuite runs all showed no regressions.

gcc/
PR target/85755
* config/rs6000/rs6000.c (mem_operand_gpr): Enable PRE_INC and PRE_DEC
addresses.

gcc/testsuite/
PR target/85755
* gcc.target/powerpc/pr85755.c: New test.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 261279)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7997,6 +7997,13 @@ mem_operand_gpr (rtx op, machine_mode mo
   int extra;
   rtx addr = XEXP (op, 0);

+  /* PR85755: Allow PRE_INC and PRE_DEC addresses.  */
+  if (TARGET_UPDATE
+  && (GET_CODE (addr) == PRE_INC || GET_CODE (addr) == PRE_DEC)
+  && mode_supports_pre_incdec_p (mode)
+  && legitimate_indirect_address_p (XEXP (addr, 0), false))
+return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
 return false;
Index: gcc/testsuite/gcc.target/powerpc/pr85755.c
===
--- gcc/testsuite/gcc.target/powerpc/pr85755.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr85755.c  (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-O1 -mcpu=power8" } */
+
+void
+preinc (long *q, long n)
+{
+  long i;
+  for (i = 0; i < n; i++)
+q[i] = i;
+}
+
+void
+predec (long *q, long n)
+{
+  long i;
+  for (i = n; i >= 0; i--)
+q[i] = i;
+}
+
+/* { dg-final { scan-assembler-times {\mstdu\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mstfdu\M} } } */

Re: [PATCH][wwwdocs][arm] Mention removal of deprecated architectures

2018-06-08 Thread Kyrill Tkachov

On 08/06/18 16:16, Gerald Pfeifer wrote:

On Fri, 8 Jun 2018, Kyrill  Tkachov wrote:
>> Now that Gerald has created a changes.html page for GCC 9 here's
>> an entry about the removal of older arm architectures.

Great!  Happy to see that the page template is being used already,
and thanks for starting to fill it.

> And I got a message from Gerald's validation bot that my commit failed
> validation :(
> http://validator.w3.org/check?uri=http://gcc.gnu.org/gcc-9/changes.html
:
> Gerald (or anyone else with insight into this), is this the right fix?

Yes, this looks quite right.  If it still fails for some other
reason after you commit this fix, I'll take care.

Thanks Gerald! The above validator looks happy now.
Kyrill

Gerald

Re: [PATCH][wwwdocs][arm] Mention removal of deprecated architectures

2018-06-08 Thread Gerald Pfeifer


On Fri, 8 Jun 2018, Kyrill  Tkachov wrote:
Now that Gerald has created a changes.html page for GCC 9 here's 
an entry about the removal of older arm architectures.


Great!  Happy to see that the page template is being used already,
and thanks for starting to fill it.


And I got a message from Gerald's validation bot that my commit failed
validation :(
http://validator.w3.org/check?uri=http://gcc.gnu.org/gcc-9/changes.html

:

Gerald (or anyone else with insight into this), is this the right fix?


Yes, this looks quite right.  If it still fails for some other
reason after you commit this fix, I'll take care.

Gerald

Break LTO cgraph dump into more pieces.

2018-06-08 Thread Jan Hubicka

Hi,
this patch splits LTO cgraph dump file into multiple files:
 1) cgraph which contains pretty much what cgraph dump has in normal compilation
 2) lto-link which contains symtab before symtab merging and decision of the
LTO linker
 3) lto-decl-merge which dumps declaration merging
 4) lto-partition which contains partitioning decisions.

The main motivation is to make it easier to find relevant data.  A bit ugly is
that the dumps are not quite chronological.  Technically lto-link happens first,
decl-merge next, then all optimization passes are done and lto-partition is
last.  I don't know how to arrange it and I think it is OK as it is.

Another slightly non-standard thing is that I have added them as ipa dumps.  It
may also make sense to declare them language dumps since LTO is technically a
front end, but that seems a bit misleading to me.  I think it makes more sense
if they appear with -fdump-ipa-all.

Bootstrapped/regtested x86_64-linux, will commit later if there are no
complaints.

Honza

* dumpfile.c (FIRST_ME_AUTO_NUMBERED_DUMP): Bump to 4.
* lto-lang.c (lto_link_dump_id, decl_merge_dump_id, partition_dump_id):
New global vars.
(lto_register_dumps): New hook.
(LANG_HOOKS_REGISTER_DUMPS): New.
* lto-partition.c: Dump into dump_file instead of symtab->dump_file.
* lto-symtab.c: Include lto.h; dump into dump_file instead of
symtab->dump_file.
(lto_symtab_merge_decls): Initialize dump file.
* lto.c (read_cgraph_and_symbols): Initialize dump file.
(do_whole_program_analysis): Likewise.
Index: dumpfile.c
===
--- dumpfile.c  (revision 261327)
+++ dumpfile.c  (working copy)
@@ -65,7 +65,7 @@
   DUMP_FILE_INFO (".gimple", "tree-gimple", DK_tree, 0),
   DUMP_FILE_INFO (".nested", "tree-nested", DK_tree, 0),
 #define FIRST_AUTO_NUMBERED_DUMP 1
-#define FIRST_ME_AUTO_NUMBERED_DUMP 3
+#define FIRST_ME_AUTO_NUMBERED_DUMP 4
 
   DUMP_FILE_INFO (NULL, "lang-all", DK_lang, 0),
   DUMP_FILE_INFO (NULL, "tree-all", DK_tree, 0),
Index: lto/lto-lang.c
===
--- lto/lto-lang.c  (revision 261327)
+++ lto/lto-lang.c  (working copy)
@@ -37,6 +37,9 @@
 #include "stringpool.h"
 #include "attribs.h"
 
+/* LTO specific dumps.  */
+int lto_link_dump_id, decl_merge_dump_id, partition_dump_id;
+
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
 static tree handle_const_attribute (tree *, tree, tree, int, bool *);
@@ -1375,6 +1378,23 @@
   return true;
 }
 
+/* Register LTO-specific dumps.  */
+
+void
+lto_register_dumps (gcc::dump_manager *dumps)
+{
+  lto_link_dump_id = dumps->dump_register
+(".lto-link", "ipa-lto-link", "ipa-lto-link",
+ DK_ipa, OPTGROUP_NONE, false);
+  decl_merge_dump_id = dumps->dump_register
+(".lto-decl-merge", "ipa-lto-decl-merge", "ipa-lto-decl-merge",
+ DK_ipa, OPTGROUP_NONE, false);
+  partition_dump_id = dumps->dump_register
+(".lto-partition", "ipa-lto-partition", "ipa-lto-partition",
+ DK_ipa, OPTGROUP_NONE, false);
+}
+
+
 /* Initialize tree structures required by the LTO front end.  */
 
 static void lto_init_ts (void)
@@ -1390,6 +1410,8 @@
 #define LANG_HOOKS_COMPLAIN_WRONG_LANG_P lto_complain_wrong_lang_p
 #undef LANG_HOOKS_INIT_OPTIONS_STRUCT
 #define LANG_HOOKS_INIT_OPTIONS_STRUCT lto_init_options_struct
+#undef LANG_HOOKS_REGISTER_DUMPS
+#define LANG_HOOKS_REGISTER_DUMPS lto_register_dumps
 #undef LANG_HOOKS_HANDLE_OPTION
 #define LANG_HOOKS_HANDLE_OPTION lto_handle_option
 #undef LANG_HOOKS_POST_OPTIONS
Index: lto/lto-partition.c
===
--- lto/lto-partition.c (revision 261327)
+++ lto/lto-partition.c (working copy)
@@ -160,8 +160,8 @@
   if (symbol_partitioned_p (node))
 {
   node->in_other_partition = 1;
-  if (symtab->dump_file)
-   fprintf (symtab->dump_file,
+  if (dump_file)
+   fprintf (dump_file,
 "Symbol node %s now used in multiple partitions\n",
 node->name ());
 }
@@ -541,13 +541,13 @@
   order.qsort (node_cmp);
   noreorder.qsort (node_cmp);
 
-  if (symtab->dump_file)
+  if (dump_file)
 {
   for (unsigned i = 0; i < order.length (); i++)
-   fprintf (symtab->dump_file, "Balanced map symbol order:%s:%u\n",
+   fprintf (dump_file, "Balanced map symbol order:%s:%u\n",
 order[i]->name (), order[i]->tp_first_run);
   for (unsigned i = 0; i < noreorder.length (); i++)
-   fprintf (symtab->dump_file, "Balanced map symbol no_reorder:%s:%u\n",
+   fprintf (dump_file, "Balanced map symbol no_reorder:%s:%u\n",
 noreorder[i]->name (), noreorder[i]->tp_first_run);
 }
 
@@ -569,8 +569,8 @@
 partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
   npartitions = 1;
   partition =

Re: [1/2] Add option to disable c++11 std::string small-size

2018-06-08 Thread Mikhail Kashkarov

Updated.


On 05/29/2018 09:53 AM, Mikhail Kashkarov wrote:
> Add option to disable c++11 std::string small-size optimization usage.
>
>       * include/bits/basic_string.h [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
>       (basic_string::_M_is_local, basic_string::_M_destroy)
>       (basic_string::basic_string, basic_string::basic_string(const _Alloc&))
>       (basic_string::basic_string(basic_string&&))
>       (basic_string::basic_string(basic_string&&, const _Alloc&))
>       (basic_string::operator=(const basic_string&))
>       (basic_string::operator=(basic_string&&))
>       (basic_string::clear()): Disable usage of _M_local_buf if
>        _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
>       * include/bits/basic_string.tcc [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
>       (basic_string::_M_construct, basic_string::reserve)
>       (basic_string::_M_replace): Disable usage of _M_local_buf if
>       _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
>       * testsuite/basic_string/allocator/char/copy_assign.cc: Support for
>       std::string without SSO.
>       * testsuite/basic_string/allocator/wchar_t/copy_assign.cc: Likewise.
>       * testsuite/21_strings/basic_string/init-list.cc: Likewise.
>       * testsuite/rand/assoc/rand_regression_test.hpp: Likewise.
>       * testsuite/rand/priority_queue/rand_regression_test.hpp: Likewise.
>
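
One way to observe whether a given std::string currently keeps its characters
inside the object itself (the small-string buffer) is to compare data()
against the object's own address range.  A minimal sketch follows;
uses_local_buffer is a hypothetical helper, not part of libstdc++, and the
result for short strings depends on the library and ABI in use.

```cpp
#include <functional>
#include <string>

// Returns true if the character data of S lives inside the string object
// itself rather than in a separate allocation.
bool uses_local_buffer(const std::string& s)
{
  const char* obj_begin = reinterpret_cast<const char*>(&s);
  const char* obj_end = obj_begin + sizeof(s);
  const char* p = s.data();

  // std::less and std::less_equal give a total order over pointers even
  // when they point into different objects.
  return std::less_equal<const char*>()(obj_begin, p)
	 && std::less<const char*>()(p, obj_end);
}
```

A string much longer than the object itself can never report true.  With the
option from this patch defined, the intent is that short strings do not use
the local buffer either.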

-- 
Best regards,
Kashkarov Mikhail
Samsung R&D Institute Russia

From 49c34919fba3000d751ac505b498eb6b21b0f4b3 Mon Sep 17 00:00:00 2001
From: Mikhail Kashkarov 
Date: Fri, 8 Jun 2018 12:22:33 +0300
Subject: [PATCH 1/2] Add option to disable c++11 std::basic_string SSO
 optimization.

	* include/bits/basic_string.h [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
	(basic_string::_M_is_local())
	(basic_string::_M_destroy(size_type __size))
	(basic_string::basic_string())
	(basic_string::basic_string(const _Alloc& __a))
	(basic_string::basic_string(basic_string&& __str))
	(basic_string::basic_string(basic_string&& __str, const _Alloc& __a))
	(basic_string::operator=(const basic_string& __str))
	(basic_string::operator=(basic_string&& __str))
	(basic_string::clear()): Disable usage of _M_local_buf if
	 _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
	* include/bits/basic_string.tcc [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
	(basic_string::_M_construct, basic_string::reserve)
	(basic_string::_M_replace): Disable usage of _M_local_buf if
	_GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
	* testsuite/basic_string/allocator/char/copy_assign.cc: Support for
	std::string without SSO.
	* testsuite/basic_string/allocator/wchar_t/copy_assign.cc: Likewise.
	* testsuite/21_strings/basic_string/init-list.cc: Likewise.
	* testsuite/rand/assoc/rand_regression_test.hpp: Likewise.
	* testsuite/rand/priority_queue/rand_regression_test.hpp: Likewise.
---
 libstdc++-v3/include/bits/basic_string.h   | 104 +++--
 libstdc++-v3/include/bits/basic_string.tcc |  46 -
 .../basic_string/allocator/char/copy_assign.cc |   4 +
 .../basic_string/allocator/wchar_t/copy_assign.cc  |   4 +
 .../testsuite/21_strings/basic_string/init-list.cc |   2 +
 .../regression/rand/assoc/rand_regression_test.hpp |   5 +
 .../rand/priority_queue/rand_regression_test.hpp   |   5 +
 7 files changed, 156 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 5bffa1c..9d971c0 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -208,7 +208,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   bool
   _M_is_local() const
-  { return _M_data() == _M_local_data(); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	return false;
+#else
+	return _M_data() == _M_local_data();
+#endif
+  }
 
   // Create & Destroy
   pointer
@@ -223,7 +229,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   void
   _M_destroy(size_type __size) throw()
-  { _Alloc_traits::deallocate(_M_get_allocator(), _M_data(), __size + 1); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	if (!_M_allocated_capacity)
+	  return;
+#endif
+_Alloc_traits::deallocate(_M_get_allocator(), _M_data(), __size + 1);
+  }
 
   // _M_construct_aux is used to implement the 21.3.1 para 15 which
   // requires special behaviour if _InIterator is an integral type
@@ -420,7 +432,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   basic_string()
   _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
   : _M_dataplus(_M_local_data())
-  { _M_set_length(0); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	_M_length(0);
+	_M_capacity(0);
+#else
+	_M_set_length(0);
+#endif
+  }
 
   /**
*  @brief  Construct an empty string using allocator @a a.
@@ -428,7 +447,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   explicit
   basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
   : _M_dataplus(_M_local_data(), __a)
-  { _M_set_length(0); }
+  {
+#if

Re: [2/2] Add AddressSanitizer annotations to std::string.

2018-06-08 Thread Mikhail Kashkarov

Hello,

I've updated patches for std::string sanitization and disabling CXX11
string SSO usage for correct sanitization.

 >>       _M_destroy(_M_allocated_capacity);
 >>+    else
 >>+      __annotate_delete();
 >
 >Do these calls definitely optimize away completely when not
 >sanitizing? Even for -O1, -Os and -Og?
 >
 >For std::vector annotation I used macros to add these annotations, so
 >there is no change to the generated code when annotations are
 >disabled. But it makes the code quite ugly.

I've checked the asm code for string-inst.o and it looks like these calls are
optimized away, but there are some slight changes after the patch for .
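
The macro approach mentioned for std::vector can be sketched as below: the
annotation expands to a call only when AddressSanitizer is enabled and to
nothing otherwise.  DEMO_ANNOTATE and demo_buffer are hypothetical names; the
sketch assumes GCC's __SANITIZE_ADDRESS__ macro and the
__sanitizer_annotate_contiguous_container interface declared in
<sanitizer/common_interface_defs.h>.

```cpp
#include <cstddef>

#if defined(__SANITIZE_ADDRESS__)
# include <sanitizer/common_interface_defs.h>
# define DEMO_ANNOTATE(beg, end, old_mid, new_mid) \
  __sanitizer_annotate_contiguous_container(beg, end, old_mid, new_mid)
#else
// Not sanitizing: the annotation disappears entirely.
# define DEMO_ANNOTATE(beg, end, old_mid, new_mid) ((void)0)
#endif

// Hypothetical fixed-capacity character buffer kept on the heap.
class demo_buffer
{
public:
  explicit demo_buffer(std::size_t cap)
  : data_(new char[cap]), size_(0), cap_(cap)
  {
    // Fresh storage is fully addressable; mark all of it as unused.
    DEMO_ANNOTATE(data_, data_ + cap_, data_ + cap_, data_);
  }

  ~demo_buffer()
  {
    // Make the whole block addressable again before releasing it.
    DEMO_ANNOTATE(data_, data_ + cap_, data_ + size_, data_ + cap_);
    delete[] data_;
  }

  demo_buffer(const demo_buffer&) = delete;
  demo_buffer& operator=(const demo_buffer&) = delete;

  // Appends C if there is room; returns false when the buffer is full.
  bool push_back(char c)
  {
    if (size_ == cap_)
      return false;
    // Grow the used region by one byte before writing to it.
    DEMO_ANNOTATE(data_, data_ + cap_, data_ + size_, data_ + size_ + 1);
    data_[size_++] = c;
    return true;
  }

  std::size_t size() const { return size_; }

private:
  char* data_;
  std::size_t size_;
  std::size_t cap_;
};
```

The cost is the one noted in the quoted review: every place that changes the
used range needs its own annotation line.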

 > Right, I was wondering specifically about the 
 > instantiations. I could be wrong but I don't think anything in
 >  creates, destroys or modifies a std::basic_string.

I was confused by reinterpret_cast's on strings in fstream.tcc, looks 
like this is not needed, you are right.

 >>   // calling 4.0.x+ _S_create.
 >>   __p->_M_set_sharable();
 >>+#if _GLIBCXX_SANITIZER_ANNOTATE_STRING
 >>+  __p->_M_length = 0;
 >>+#endif
 >
 > Why is this needed? I think a comment explaining it would help (like
 > the one above explaining why calling _M_set_sharable() is needed).

Checked current version without this change, looks like now it works, 
reverted.

Short summary:
The reason for changing strings layout under sanitization is to place string
char buffer on heap for correct aligning in 32-bit environments,
both pre-CXX11 and CXX11 strings ABI.

| Sanitize string | string type | ABI is changed? | 32-bit | 64-bit |
|-----------------+-------------+-----------------+--------+--------|
| FULL            | SSO-string  | yes             | +      | +      |
|                 | COW-string  | yes             | +      | +      |
|-----------------+-------------+-----------------+--------+--------|
| PARTIAL         | SSO-string  | no              | -+(*)  | +      |
|                 | COW-string  | no              | -      | +      |
(*) only long strings are sanitized for 32-bit
Some functions with the new define look a bit messy without changing internal
functions (operator=).  I'm also not sure whether the define that disables
string SSO usage should also affect other parts that rely on the basic_string
class size (checks with static_assert in exceptions/shim-facets).


Any thoughts?

On 05/29/2018 06:55 PM, Jonathan Wakely wrote:
> On 29/05/18 18:18 +0300, Kashkarov Mikhail wrote:
>> Jonathan Wakely  writes:
 --- a/libstdc++-v3/include/bits/fstream.tcc
 +++ b/libstdc++-v3/include/bits/fstream.tcc
 @@ -1081,6 +1081,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   // Inhibit implicit instantiations for required instantiations,
   // which are defined via explicit instantiations elsewhere.
 +#if !_GLIBCXX_SANITIZE_STRING
 #if _GLIBCXX_EXTERN_TEMPLATE
   extern template class basic_filebuf;
   extern template class basic_ifstream;
 @@ -1094,6 +1095,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   extern template class basic_fstream;
 #endif
 #endif
 +#endif // !_GLIBCXX_SANITIZE_STRING
>>>
>>> Why do we need to disable these explicit instantiation declarations?
>>> Are they affected by the std::string layout changes? Is that just
>>> because of the constructors taking std::string, or something else?
>>
>> Libstdc++ build is not sanitized, so macroses that requires
>> AddressSanitizer support will not applied and these templates will be
>> instantate without support for ASan annotations.
>
> Right, I was wondering specifically about the 
> instantiations. I could be wrong but I don't think anything in
>  creates, destroys or modifies a std::basic_string.
>
>
>
>
>

-- 
Best regards,
Kashkarov Mikhail
Samsung R&D Institute Russia

From 4b8de0240ac091cdd43b690276f09e94bfb0ef4d Mon Sep 17 00:00:00 2001
From: Mikhail Kashkarov 
Date: Fri, 8 Jun 2018 14:11:11 +0300
Subject: [PATCH 2/2] Add AddressSanitizer annotations to std::string.

	* include/bits/c++config: define
	(_GLIBCXX_SANITIZE_STRING_PARTIAL, _GLIBCXX_SANITIZE_STRING_FULL)
	(_GLIBCXX_SANTIZE_STRING, _GLIBCXX_SANITIZER_ANNOTATE_STRING)
	(_GLIBCXX_SANITIZER_DISABLE_LOCAL_STRING_ANNOTATION)
	(_GLIBCXX_SANITIZER_ALIGN_COW_STRING)
	* doc/xml/manual/using.xml: document GLIBCXX_SANITIZE_STRING_PARTIAL,
	_GLIBCXX_SANITIZE_STRING_FULL
	* include/bits/basic_string.h [_GLIBCXX_USE_DUAL_ABI]
	(_asan_traits<_CharT, _Alloc>, _asan_traits<_CharT, allocator<_CharT>)
	(_asan_traits::__annotate_delete, _asan_traits::__annotate_new)
	(_asan_traits::__annotate_grow): New traits for annotation.
	(basic_string::__RAII_IncreaseAnnotator::__RAII_Increaseannotator)
	(basic_string::__RAII_IncreaseAnnotator::done)
	(basic_string::__RAII_IncreaseAnnotator::~RAII_Increaseannotator):
	New annotation helpers in case of exceptions.
	(basic_string::__get_beg, basic_string::__get_mid)
	(basic_string::__get_end): New annotation helpers.
	(basic_string::_M_dispose, basic_string::_M_destroy)

Re: [1/2] Add option to disable c++11 std::string SSO

2018-06-08 Thread Mikhail Kashkarov

Updated.


On 05/29/2018 09:53 AM, Mikhail Kashkarov wrote:
> Add option to disable c++11 std::string small-string optimization usage.
>
>       * include/bits/basic_string.h [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
>       (basic_string::_M_is_local, basic_string::_M_destroy)
>       (basic_string::basic_string, basic_string::basic_string(const _Alloc&))
>       (basic_string::basic_string(basic_string&&))
>       (basic_string::basic_string(basic_string&&, const _Alloc&))
>       (basic_string::operator=(const basic_string&))
>       (basic_string::operator=(basic_string&&))
>       (basic_string::clear()): Disable usage of _M_local_buf if
>        _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
>       * include/bits/basic_string.tcc [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
>       (basic_string::_M_construct, basic_string::reserve)
>       (basic_string::_M_replace): Disable usage of _M_local_buf if
>       _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
>       * testsuite/basic_string/allocator/char/copy_assign.cc: Support for
>       std::string without SSO.
>       * testsuite/basic_string/allocator/wchar_t/copy_assign.cc: Likewise.
>       * testsuite/21_strings/basic_string/init-list.cc: Likewise.
>       * testsuite/rand/assoc/rand_regression_test.hpp: Likewise.
>       * testsuite/rand/priority_queue/rand_regression_test.hpp: Likewise.
>

-- 
Best regards,
Kashkarov Mikhail
Samsung R&D Institute Russia

From 49c34919fba3000d751ac505b498eb6b21b0f4b3 Mon Sep 17 00:00:00 2001
From: Mikhail Kashkarov 
Date: Fri, 8 Jun 2018 12:22:33 +0300
Subject: [PATCH 1/2] Add option to disable c++11 std::basic_string SSO
 optimization.

	* include/bits/basic_string.h [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
	(basic_string::_M_is_local())
	(basic_string::_M_destroy(size_type __size))
	(basic_string::basic_string())
	(basic_string::basic_string(const _Alloc& __a))
	(basic_string::basic_string(basic_string&& __str))
	(basic_string::basic_string(basic_string&& __str, const _Alloc& __a))
	(basic_string::operator=(const basic_string& __str))
	(basic_string::operator=(basic_string&& __str))
	(basic_string::clear()): Disable usage of _M_local_buf if
	 _GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
	* include/bits/basic_string.tcc [_GLIBCXX_DISABLE_STRING_SSO_USAGE]:
	(basic_string::_M_construct, basic_string::reserve)
	(basic_string::_M_replace): Disable usage of _M_local_buf if
	_GLIBCXX_DISABLE_STRING_SSO_USAGE is defined.
	* testsuite/basic_string/allocator/char/copy_assign.cc: Support for
	std::string without SSO.
	* testsuite/basic_string/allocator/wchar_t/copy_assign.cc: Likewise.
	* testsuite/21_strings/basic_string/init-list.cc: Likewise.
	* testsuite/rand/assoc/rand_regression_test.hpp: Likewise.
	* testsuite/rand/priority_queue/rand_regression_test.hpp: Likewise.
---
 libstdc++-v3/include/bits/basic_string.h   | 104 +++--
 libstdc++-v3/include/bits/basic_string.tcc |  46 -
 .../basic_string/allocator/char/copy_assign.cc |   4 +
 .../basic_string/allocator/wchar_t/copy_assign.cc  |   4 +
 .../testsuite/21_strings/basic_string/init-list.cc |   2 +
 .../regression/rand/assoc/rand_regression_test.hpp |   5 +
 .../rand/priority_queue/rand_regression_test.hpp   |   5 +
 7 files changed, 156 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 5bffa1c..9d971c0 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -208,7 +208,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   bool
   _M_is_local() const
-  { return _M_data() == _M_local_data(); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	return false;
+#else
+	return _M_data() == _M_local_data();
+#endif
+  }
 
   // Create & Destroy
   pointer
@@ -223,7 +229,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   void
   _M_destroy(size_type __size) throw()
-  { _Alloc_traits::deallocate(_M_get_allocator(), _M_data(), __size + 1); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	if (!_M_allocated_capacity)
+	  return;
+#endif
+_Alloc_traits::deallocate(_M_get_allocator(), _M_data(), __size + 1);
+  }
 
   // _M_construct_aux is used to implement the 21.3.1 para 15 which
   // requires special behaviour if _InIterator is an integral type
@@ -420,7 +432,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   basic_string()
   _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
   : _M_dataplus(_M_local_data())
-  { _M_set_length(0); }
+  {
+#if _GLIBCXX_DISABLE_STRING_SSO_USAGE
+	_M_length(0);
+	_M_capacity(0);
+#else
+	_M_set_length(0);
+#endif
+  }
 
   /**
*  @brief  Construct an empty string using allocator @a a.
@@ -428,7 +447,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   explicit
   basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
   : _M_dataplus(_M_local_data(), __a)
-  { _M_set_length(0); }
+  {
+#if

Re: [PATCH, rs6000] Cleanup vsx-vector-6 test files.

2018-06-08 Thread Segher Boessenkool

Hi!

On Thu, Jun 07, 2018 at 04:10:08PM -0700, Carl Love wrote:
> The test files gcc/testsuite/gcc.target/powerpc/vsx-vector-
> 6[be|le].[p7|p8|p9].c cover testing for LE and BE for the various
> processors.  These were setup before we had the le and be targets. 
> Given that we now have the be and le targets the test files can be
> combined into a single file with the be and le qualifiers attached to
> the various instruction count checks.  This reduces the number of files
> that need to be maintained.  
> 
> This patch removes the endian specific string in the test file name and
> combines the BE and LE versions for Power 8 into a single file.

Nice :-)

> 2018-06-07  Carl Love  
> 
>   * gcc.target/powerpc/vsx-vector-6-be.p7.c: Rename vsx-vector-6.p7.c.
>   * gcc.target/powerpc/vsx-vector-6-le.p9.c: Rename vsx-vector-6.p9.c.
>   * gcc.target/powerpc/vsx-vector-6-be.p8.c: Delete file.
>   * gcc.target/powerpc/vsx-vector-6-le.c: Rename vsx-vector-6.p8.c.
>   Add le and be qualifiers for instruction counts.
> ---
>  .../gcc.target/powerpc/vsx-vector-6-be.p7.c| 43 -
>  .../gcc.target/powerpc/vsx-vector-6-be.p8.c| 43 -
>  gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c | 47 --
>  .../gcc.target/powerpc/vsx-vector-6-le.p9.c| 37 ---
>  gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p7.c | 50 
>  gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p8.c | 55 ++
>  gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p9.c | 39 +++
>  7 files changed, 144 insertions(+), 170 deletions(-)

The changelog does not match the diffstat.  Ah, you do only mention the
new names in the description of the old names.  Please do something like:

* gcc.target/powerpc/vsx-vector-6-be.p7.c: Rename to...
* gcc.target/powerpc/vsx-vector-6.p7.c: ... this.
* gcc.target/powerpc/vsx-vector-6-le.p9.c: Merge with...
* gcc.target/powerpc/vsx-vector-6-be.p8.c: ... this and rename to ...
* gcc.target/powerpc/vsx-vector-6.p9.c: ... this.

Well I totally messed that up but you get the idea :-)

Okay with a better changelog.  Thanks!


Segher

Re: [PATCH] Avoid excessive function type casts with splay-trees part 2

2018-06-08 Thread David Malcolm

On Fri, 2018-06-08 at 14:03 +, Bernd Edlinger wrote:
> Hi!
> 
> 
> This patch converts the splay-tree internals into a template, and makes
> the typed_splay_tree template really type-safe.  Previously everything
> would break apart if KEY_TYPE or VALUE_TYPE would not be pointer types.
> This limitation is now removed.
> 
> I took the freedom to add a remove function which is only for
> completeness and test coverage, but not (yet) used in a productive way.
> 
> 
> Bootstrapped and reg-tested on x86_64-linux-gnu.
> Is it OK for trunk?

Was this testing done with "jit" enabled? (there's some usage of
typed_splay_tree there, for jit's equivalent of switch statements)

Note that the jit frontend isn't covered by "all" in --enable-languages;
it has to be added manually, iirc since it requires --enable-host-shared.

Thanks
Dave

[PATCH] Avoid excessive function type casts with splay-trees part 2

2018-06-08 Thread Bernd Edlinger

Hi!


This patch converts the splay-tree internals into a template, and makes
the typed_splay_tree template really type-safe.  Previously everything
would break if KEY_TYPE or VALUE_TYPE were not pointer types.
This limitation is now removed.
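
To illustrate what storing keys and values by their real types buys, here is
a deliberately simplified sketch.  It is a plain unbalanced binary search
tree, not a splay tree and not the typed_splay_tree from this patch;
typed_tree_sketch, wide_key and compare_wide are hypothetical names.

```cpp
// Nodes hold KEY_TYPE and VALUE_TYPE directly, so neither has to fit into
// a pointer-sized slot.
template <typename KEY_TYPE, typename VALUE_TYPE>
class typed_tree_sketch
{
public:
  typedef int (*compare_fn) (KEY_TYPE, KEY_TYPE);

  explicit typed_tree_sketch (compare_fn cmp)
  : m_root (nullptr), m_cmp (cmp) {}
  ~typed_tree_sketch () { destroy (m_root); }

  typed_tree_sketch (const typed_tree_sketch &) = delete;
  typed_tree_sketch &operator= (const typed_tree_sketch &) = delete;

  // Insert K with value V, overwriting the value if K is already present.
  void insert (KEY_TYPE k, VALUE_TYPE v)
  {
    node **p = &m_root;
    while (*p)
      {
	int c = m_cmp (k, (*p)->key);
	if (c == 0)
	  {
	    (*p)->value = v;
	    return;
	  }
	p = c < 0 ? &(*p)->left : &(*p)->right;
      }
    *p = new node (k, v);
  }

  // Return a pointer to the value stored for K, or nullptr if absent.
  const VALUE_TYPE *lookup (KEY_TYPE k) const
  {
    const node *n = m_root;
    while (n)
      {
	int c = m_cmp (k, n->key);
	if (c == 0)
	  return &n->value;
	n = c < 0 ? n->left : n->right;
      }
    return nullptr;
  }

private:
  struct node
  {
    node (KEY_TYPE k, VALUE_TYPE v)
    : key (k), value (v), left (nullptr), right (nullptr) {}
    KEY_TYPE key;
    VALUE_TYPE value;
    node *left;
    node *right;
  };

  static void destroy (node *n)
  {
    if (!n)
      return;
    destroy (n->left);
    destroy (n->right);
    delete n;
  }

  node *m_root;
  compare_fn m_cmp;
};

// A key type that is wider than a pointer on common targets.
struct wide_key { long hi; long lo; };

inline int compare_wide (wide_key a, wide_key b)
{
  if (a.hi != b.hi)
    return a.hi < b.hi ? -1 : 1;
  if (a.lo != b.lo)
    return a.lo < b.lo ? -1 : 1;
  return 0;
}
```

With this layout a two-word struct key and a double value work without any
casts, which a wrapper that funnels everything through pointer-sized slots
cannot offer.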

I took the liberty of adding a remove function, which is there only for
completeness and test coverage and is not (yet) used in production code.


Bootstrapped and reg-tested on x86_64-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2018-06-08  Bernd Edlinger  

* typed-splay-tree.h (typed_splay_tree::remove): New function.
(typed_splay_tree::closure,
typed_splay_tree::inner_foreach_fn, typed_splay_tree::m_inner): Deleted.
(typed_splay_tree::typed_splay_tree,
typed_splay_tree::operator =): Declared private.
(typed_splay_tree::splay_tree_key, typed_splay_tree::splay_tree_value,
typed_splay_tree::splay_tree_node_s, typed_splay_tree::KDEL,
typed_splay_tree::VDEL, typed_splay_tree::splay_tree_delete_helper,
typed_splay_tree::rotate_left, typed_splay_tree::rotate_right,
typed_splay_tree::splay_tree_splay,
typed_splay_tree::splay_tree_foreach_helper,
typed_splay_tree::splay_tree_insert,
typed_splay_tree::splay_tree_remove,
typed_splay_tree::splay_tree_lookup,
typed_splay_tree::splay_tree_predecessor,
typed_splay_tree::splay_tree_successor,
typed_splay_tree::splay_tree_min,
typed_splay_tree::splay_tree_max): Took over from splay-tree.c/.h.
(typed_splay_tree::root, typed_splay_tree::comp,
typed_splay_tree::delete_key,
typed_splay_tree::delete_value): New data members.
* typed-splay-tree.c (selftest::test_str_to_int): Add a test for
typed_splay_tree::remove.
Index: gcc/typed-splay-tree.c
===
--- gcc/typed-splay-tree.c	(revision 260952)
+++ gcc/typed-splay-tree.c	(working copy)
@@ -47,7 +47,10 @@ test_str_to_int ()
   t.insert ("a", 1);
   t.insert ("b", 2);
   t.insert ("c", 3);
+  t.insert ("d", 4);
 
+  t.remove ("d");
+
   ASSERT_EQ (1, t.lookup ("a"));
   ASSERT_EQ (2, t.lookup ("b"));
   ASSERT_EQ (3, t.lookup ("c"));
Index: gcc/typed-splay-tree.h
===
--- gcc/typed-splay-tree.h	(revision 260952)
+++ gcc/typed-splay-tree.h	(working copy)
@@ -20,8 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_TYPED_SPLAY_TREE_H
 #define GCC_TYPED_SPLAY_TREE_H
 
-#include "splay-tree.h"
-
 /* Typesafe wrapper around libiberty's splay-tree.h.  */
 template 
 class typed_splay_tree
@@ -44,27 +42,66 @@ class typed_splay_tree
   value_type predecessor (key_type k);
   value_type successor (key_type k);
   void insert (key_type k, value_type v);
+  void remove (key_type k);
   value_type max ();
   value_type min ();
   int foreach (foreach_fn, void *);
 
  private:
-  /* Helper type for typed_splay_tree::foreach.  */
-  struct closure
-  {
-closure (foreach_fn outer_cb, void *outer_user_data)
-: m_outer_cb (outer_cb), m_outer_user_data (outer_user_data) {}
+  /* Copy and assignment ops are not supported.  */
+  typed_splay_tree (const typed_splay_tree &);
+  typed_splay_tree & operator = (const typed_splay_tree &);
 
-foreach_fn m_outer_cb;
-void *m_outer_user_data;
+  typedef key_type splay_tree_key;
+  typedef value_type splay_tree_value;
+
+  /* The nodes in the splay tree.  */
+  struct splay_tree_node_s {
+/* The key.  */
+splay_tree_key key;
+
+/* The value.  */
+splay_tree_value value;
+
+/* The left and right children, respectively.  */
+splay_tree_node_s *left, *right;
+
+/* Used as temporary value for tree traversals.  */
+splay_tree_node_s *back;
   };
+  typedef splay_tree_node_s *splay_tree_node;
 
-  static int inner_foreach_fn (splay_tree_node node, void *user_data);
+  inline void KDEL (splay_tree_key);
+  inline void VDEL (splay_tree_value);
+  void splay_tree_delete_helper (splay_tree_node);
+  static inline void rotate_left (splay_tree_node *,
+  splay_tree_node, splay_tree_node);
+  static inline void rotate_right (splay_tree_node *,
+   splay_tree_node, splay_tree_node);
+  void splay_tree_splay (splay_tree_key);
+  static int splay_tree_foreach_helper (splay_tree_node,
+	foreach_fn, void*);
+  splay_tree_node splay_tree_insert (splay_tree_key, splay_tree_value);
+  void splay_tree_remove (splay_tree_key key);
+  splay_tree_node splay_tree_lookup (splay_tree_key key);
+  splay_tree_node splay_tree_predecessor (splay_tree_key);
+  splay_tree_node splay_tree_successor (splay_tree_key);
+  splay_tree_node splay_tree_max ();
+  splay_tree_node splay_tree_min ();
 
   static value_type node_to_value (splay_tree_node node);
 
- private:
-  ::splay_tree m_inner;
+  /* The root of the tree.  */
+  splay_tree_node root;
+
+  /* The comparison function.  */
+  compare_fn comp;
+
+  /* The

New Spanish PO file for 'gcc' (version 8.1.0)

2018-06-08 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-8.1.0.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[PATCH][OBVIOUS] Fix function signature in header file.

2018-06-08 Thread Martin Liška

Hi.

One obvious fix.  The function is unused, so no compilation error was spotted.

I'm going to install the patch.
Martin

gcc/ChangeLog:

2018-06-08  Martin Liska  

* tree-cfg.h (debug_function): Fix argument type to match
implementation.
---
 gcc/tree-cfg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h
index 73237a604be..9491bb45feb 100644
--- a/gcc/tree-cfg.h
+++ b/gcc/tree-cfg.h
@@ -81,7 +81,7 @@ extern void fold_loop_internal_call (gimple *, tree);
 extern basic_block move_sese_region_to_fn (struct function *, basic_block,
    basic_block, tree);
 extern void dump_function_to_file (tree, FILE *, dump_flags_t);
-extern void debug_function (tree, int) ;
+extern void debug_function (tree, dump_flags_t);
 extern void print_loops_bb (FILE *, basic_block, int, int);
 extern void print_loops (FILE *, int);
 extern void debug (struct loop );
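
As a rough illustration of why such a mismatch can go unnoticed: in C++ a
prototype whose parameter types differ from the definition simply declares a
second overload, and an overload that is never called needs no definition.
The sketch below assumes a 64-bit integer flags type; the names
dump_flags_sketch_t, describe, call_with_flags and check_overload_sketch are
invented for the example and are not GCC's.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a flags type that is not plain int.
typedef std::uint64_t dump_flags_sketch_t;

// "Header": a stale prototype taking int.  It is never defined, which is
// harmless as long as nobody calls it.
int describe (int node, int flags);

// "Implementation": the signature that actually exists.  To the compiler
// this is a different overload, not a conflicting redeclaration.
int describe (int node, dump_flags_sketch_t flags)
{
  return node + static_cast<int> (flags);
}

// A caller passing the real flags type is an exact match for the defined
// overload, so this compiles and links.
int call_with_flags (int node)
{
  dump_flags_sketch_t flags = 2;
  return describe (node, flags);
}

inline void check_overload_sketch ()
{
  assert (call_with_flags (1) == 3);
}
```

Only a call that resolves to the int overload would fail, and then at link
time rather than at compile time.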

Re: [PATCH][wwwdocs][arm] Mention removal of deprecated architectures

2018-06-08 Thread Kyrill Tkachov



On 08/06/18 14:30, Kyrill Tkachov wrote:

Hi all,

Now that Gerald has created a changes.html page for GCC 9 here's an entry about
the removal of older arm architectures.

Committing to CVS.



And I got a message from Gerald's validation bot that my commit failed
validation :(
http://validator.w3.org/check?uri=http://gcc.gnu.org/gcc-9/changes.html

"Line 106, Column 6: document type does not allow element "li" here;
missing one of "ul", "ol", "menu", "dir" start-tag"

From what I understand the solution is to add <ul> tags around the
bullet points, which is what this patch does.

Gerald (or anyone else with insight into this), is this the right fix?
Thanks,
Kyrill

Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.7
diff -U 3 -r1.7 changes.html
--- htdocs/gcc-9/changes.html	8 Jun 2018 13:32:13 -	1.7
+++ htdocs/gcc-9/changes.html	8 Jun 2018 13:34:56 -
@@ -74,6 +74,7 @@
 
 
 ARM
+
   
 Support for the deprecated Armv2 and Armv3 architectures and their
 variants has been removed.  Their corresponding -march
@@ -85,6 +86,7 @@
  (which have no known implementations) has been removed.
  Note that Armv5T, Armv5TE and Armv5TEJ architectures remain supported.
   
+

RE: Prefer open-coding vector integer division

2018-06-08 Thread Matthew Fortune

Richard Sandiford  writes:
> vect_recog_divmod_pattern currently bails out if the target has
> native support for integer division, but I think in practice
> it's always going to be better to open-code it anyway, just as
> we usually open-code scalar divisions by constants.
> 
> I think the only currently affected target is MIPS MSA, where for:
> 
>   void
>   foo (int *x)
>   {
> for (int i = 0; i < 100; ++i)
>   x[i] /= 2;
>   }
> 
> we previously preferred to use division for powers of 2:
> 
> .setnoreorder
> bnz.w   $w1,1f
> div_s.w $w0,$w0,$w1
> break   7
> .setreorder
> 1:
> 
> (or just the div_s.w for -mno-check-zero-division), but after the patch
> we open-code them using shifts:
> 
> clt_s.w $w1,$w0,$w2
> subv.w  $w0,$w0,$w1
> srai.w  $w0,$w0,1
> 
> I assume that's better.  Matthew, is that right?

Sorry for extreme tardiness. Yes, the alternate sequence has a max latency
of 6. Although I don't have the range of latencies to hand for the FDIV, as
far as I remember 6 cycles is better than the fastest FDIV case at least for
i6400/i6500.

Matthew
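
For reference, the three-instruction sequence quoted above corresponds to the
usual scalar trick for signed division by 2: add 1 to negative inputs, then
shift right arithmetically.  The sketch below is a scalar C++ rendering of
that idea, not the vectorizer's output; div2_open_coded and check_div2 are
made-up names, and it assumes that >> on a negative int is an arithmetic
shift (guaranteed from C++20, and what GCC does in earlier standards).

```cpp
#include <cassert>
#include <climits>

// Signed x / 2, rounding toward zero, without a division instruction.
int div2_open_coded (int x)
{
  int mask = -(x < 0);    // -1 (all ones) for negative x, else 0: clt_s.w
  int biased = x - mask;  // adds 1 to negative inputs only: subv.w
  return biased >> 1;     // arithmetic shift right by one: srai.w
}

inline void check_div2 ()
{
  for (int x = -1000; x <= 1000; ++x)
    assert (div2_open_coded (x) == x / 2);
  assert (div2_open_coded (INT_MIN) == INT_MIN / 2);
  assert (div2_open_coded (INT_MAX) == INT_MAX / 2);
}
```

The bias is what makes the result round toward zero, as C division does; a
bare shift would round toward negative infinity instead.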

[PATCH][wwwdocs][arm] Mention removal of deprecated architectures

2018-06-08 Thread Kyrill Tkachov


Hi all,

Now that Gerald has created a changes.html page for GCC 9 here's an entry about
the removal of older arm architectures.

Committing to CVS.

Thanks,
Kyrill
Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.6
diff -U 3 -r1.6 changes.html
--- htdocs/gcc-9/changes.html	2 Jun 2018 21:16:18 -	1.6
+++ htdocs/gcc-9/changes.html	8 Jun 2018 09:49:28 -
@@ -73,7 +73,18 @@
 
 
 
-
+ARM
+  
+Support for the deprecated Armv2 and Armv3 architectures and their
+variants has been removed.  Their corresponding -march
+values and the -mcpu options that used these architectures
+have been removed.
+  
+  
+ Support for the Armv5 and Armv5E architectures
+ (which have no known implementations) has been removed.
+ Note that Armv5T, Armv5TE and Armv5TEJ architectures remain supported.
+

Re: [PATCH] PPC: remove usage of cgraph_node::instrumentation_clone and cgraph_node::instrumented_version.

2018-06-08 Thread David Edelsohn

On Fri, Jun 8, 2018 at 9:24 AM Martin Liška  wrote:
>
> Hi.
>
> This is a follow-up to the MPX removal.  The code is dead for PPC and the
> condition was always false.
>
> I'll install it once a PPC maintainer approves it.
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2018-06-08  Martin Liska  
>
> * config/powerpcspe/powerpcspe.c (rs6000_xcoff_visibility):
> Remove usage of MPX-related (and removed) fields.
> * config/rs6000/rs6000.c (rs6000_xcoff_visibility): Likewise.
> ---
>  gcc/config/powerpcspe/powerpcspe.c | 7 ---
>  gcc/config/rs6000/rs6000.c | 7 ---
>  2 files changed, 14 deletions(-)

Okay.

Thanks, David

[PATCH] PPC: remove usage of cgraph_node::instrumentation_clone and cgraph_node::instrumented_version.

2018-06-08 Thread Martin Liška

Hi.

This is a follow-up to the MPX removal.  The code is dead for PPC and the
condition was always false.

I'll install it once a PPC maintainer approves it.
Thanks,
Martin

gcc/ChangeLog:

2018-06-08  Martin Liska  

* config/powerpcspe/powerpcspe.c (rs6000_xcoff_visibility):
Remove usage of MPX-related (and removed) fields.
* config/rs6000/rs6000.c (rs6000_xcoff_visibility): Likewise.
---
 gcc/config/powerpcspe/powerpcspe.c | 7 ---
 gcc/config/rs6000/rs6000.c | 7 ---
 2 files changed, 14 deletions(-)


diff --git a/gcc/config/powerpcspe/powerpcspe.c b/gcc/config/powerpcspe/powerpcspe.c
index b500cd3b668..f67505a3552 100644
--- a/gcc/config/powerpcspe/powerpcspe.c
+++ b/gcc/config/powerpcspe/powerpcspe.c
@@ -37119,13 +37119,6 @@ rs6000_xcoff_visibility (tree decl)
   };
 
   enum symbol_visibility vis = DECL_VISIBILITY (decl);
-
-  if (TREE_CODE (decl) == FUNCTION_DECL
-  && cgraph_node::get (decl)
-  && cgraph_node::get (decl)->instrumentation_clone
-  && cgraph_node::get (decl)->instrumented_version)
-vis = DECL_VISIBILITY (cgraph_node::get (decl)->instrumented_version->decl);
-
   return visibility_types[vis];
 }
 #endif
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 04186a07cd2..683cb6c6f2f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -33541,13 +33541,6 @@ rs6000_xcoff_visibility (tree decl)
   };
 
   enum symbol_visibility vis = DECL_VISIBILITY (decl);
-
-  if (TREE_CODE (decl) == FUNCTION_DECL
-  && cgraph_node::get (decl)
-  && cgraph_node::get (decl)->instrumentation_clone
-  && cgraph_node::get (decl)->instrumented_version)
-vis = DECL_VISIBILITY (cgraph_node::get (decl)->instrumented_version->decl);
-
   return visibility_types[vis];
 }
 #endif

[PATCH 1/4] Transform switch_conversion into a class.

2018-06-08 Thread marxin


gcc/ChangeLog:

2018-06-07  Martin Liska  

* tree-switch-conversion.c (MAX_CASE_BIT_TESTS): Remove.
(hoist_edge_and_branch_if_true): Likewise.
(expand_switch_using_bit_tests_p): Likewise.
(struct case_bit_test): Likewise.
(case_bit_test_cmp): Likewise.
(emit_case_bit_tests): Likewise.
(switch_conversion::switch_conversion): New class.
(struct switch_conv_info): Remove old struct.
(collect_switch_conv_info): Move to ...
(switch_conversion::collect): ... this.
(check_range): Likewise.
(switch_conversion::check_range): Likewise.
(check_all_empty_except_final): Likewise.
(switch_conversion::check_all_empty_except_final): Likewise.
(check_final_bb): Likewise.
(switch_conversion::check_final_bb): Likewise.
(create_temp_arrays): Likewise.
(switch_conversion::create_temp_arrays): Likewise.
(free_temp_arrays): Likewise.
(gather_default_values): Likewise.
(switch_conversion::gather_default_values): Likewise.
(build_constructors): Likewise.
(switch_conversion::build_constructors): Likewise.
(constructor_contains_same_values_p): Likewise.
(switch_conversion::contains_same_values_p): Likewise.
(array_value_type): Likewise.
(switch_conversion::array_value_type): Likewise.
(build_one_array): Likewise.
(switch_conversion::build_one_array): Likewise.
(build_arrays): Likewise.
(switch_conversion::build_arrays): Likewise.
(gen_def_assigns): Likewise.
(switch_conversion::gen_def_assigns): Likewise.
(prune_bbs): Likewise.
(switch_conversion::prune_bbs): Likewise.
(fix_phi_nodes): Likewise.
(switch_conversion::fix_phi_nodes): Likewise.
(gen_inbound_check): Likewise.
(switch_conversion::gen_inbound_check): Likewise.
(process_switch): Use the newly created class.
(switch_conversion::expand): New.
(switch_conversion::~switch_conversion): New.
* tree-switch-conversion.h: New file.
---
 gcc/tree-switch-conversion.c | 1125 +++---
 gcc/tree-switch-conversion.h |  283 +
 2 files changed, 511 insertions(+), 897 deletions(-)
 create mode 100644 gcc/tree-switch-conversion.h

diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index dc60b34f506..2f848fcb6aa 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -55,626 +55,74 @@ Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
type in the GIMPLE type system that is language-independent?  */
 #include "langhooks.h"
 
+#include "tree-switch-conversion.h"
 
-/* Maximum number of case bit tests.
-   FIXME: This should be derived from PARAM_CASE_VALUES_THRESHOLD and
-	  targetm.case_values_threshold(), or be its own param.  */
-#define MAX_CASE_BIT_TESTS  3
-
-/* Track whether or not we have altered the CFG and thus may need to
-   cleanup the CFG when complete.  */
-bool cfg_altered;
-
-/* Split the basic block at the statement pointed to by GSIP, and insert
-   a branch to the target basic block of E_TRUE conditional on tree
-   expression COND.
-
-   It is assumed that there is already an edge from the to-be-split
-   basic block to E_TRUE->dest block.  This edge is removed, and the
-   profile information on the edge is re-used for the new conditional
-   jump.
-   
-   The CFG is updated.  The dominator tree will not be valid after
-   this transformation, but the immediate dominators are updated if
-   UPDATE_DOMINATORS is true.
-   
-   Returns the newly created basic block.  */
+using namespace tree_switch_conversion;
 
-static basic_block
-hoist_edge_and_branch_if_true (gimple_stmt_iterator *gsip,
-			   tree cond, edge e_true,
-			   bool update_dominators)
-{
-  tree tmp;
-  gcond *cond_stmt;
-  edge e_false;
-  basic_block new_bb, split_bb = gsi_bb (*gsip);
-  bool dominated_e_true = false;
-
-  gcc_assert (e_true->src == split_bb);
-
-  if (update_dominators
-  && get_immediate_dominator (CDI_DOMINATORS, e_true->dest) == split_bb)
-dominated_e_true = true;
-
-  tmp = force_gimple_operand_gsi (gsip, cond, /*simple=*/true, NULL,
-  /*before=*/true, GSI_SAME_STMT);
-  cond_stmt = gimple_build_cond_from_tree (tmp, NULL_TREE, NULL_TREE);
-  gsi_insert_before (gsip, cond_stmt, GSI_SAME_STMT);
-
-  e_false = split_block (split_bb, cond_stmt);
-  new_bb = e_false->dest;
-  redirect_edge_pred (e_true, split_bb);
-
-  e_true->flags &= ~EDGE_FALLTHRU;
-  e_true->flags |= EDGE_TRUE_VALUE;
-
-  e_false->flags &= ~EDGE_FALLTHRU;
-  e_false->flags |= EDGE_FALSE_VALUE;
-  e_false->probability = e_true->probability.invert ();
-  new_bb->count = e_false->count ();
-
-  if (update_dominators)
-{
-  if (dominated_e_true)
-	set_immediate_dominator (CDI_DOMINATORS, e_true->dest, split_bb);
-  set_immediate_dominator

[PATCH 2/4] Switch other switch expansion methods into classes.

2018-06-08 Thread marxin


gcc/ChangeLog:

2018-06-07  Martin Liska  

* tree-switch-conversion.c (switch_conversion::collect):
Record m_uniq property.
(switch_conversion::expand): Bail out for special conditions.
(group_cluster::~group_cluster): New.
(group_cluster::group_cluster): Likewise.
(group_cluster::dump): Likewise.
(jump_table_cluster::emit): New.
(switch_decision_tree::fix_phi_operands_for_edges): New.
(struct case_node): Remove struct.
(jump_table_cluster::can_be_handled): New.
(case_values_threshold): Moved to header.
(reset_out_edges_aux): Likewise.
(jump_table_cluster::is_beneficial): New.
(bit_test_cluster::can_be_handled): Likewise.
(add_case_node): Remove.
(bit_test_cluster::is_beneficial): New.
(case_bit_test::cmp): New.
(bit_test_cluster::emit): New.
(expand_switch_as_decision_tree_p): Remove.
(bit_test_cluster::hoist_edge_and_branch_if_true): New.
(fix_phi_operands_for_edge): Likewise.
(switch_decision_tree::analyze_switch_statement): New.
(compute_cases_per_edge): Move ...
(switch_decision_tree::compute_cases_per_edge): ... here.
(try_switch_expansion): Likewise.
(switch_decision_tree::try_switch_expansion): Likewise.
(record_phi_operand_mapping): Likewise.
(switch_decision_tree::record_phi_operand_mapping): Likewise.
(emit_case_decision_tree): Likewise.
(switch_decision_tree::emit): Likewise.
(balance_case_nodes): Likewise.
(switch_decision_tree::balance_case_nodes): Likewise.
(dump_case_nodes): Likewise.
(switch_decision_tree::dump_case_nodes): Likewise.
(emit_jump): Likewise.
(switch_decision_tree::emit_jump): Likewise.
(emit_cmp_and_jump_insns): Likewise.
(switch_decision_tree::emit_cmp_and_jump_insns): Likewise.
(emit_case_nodes): Likewise.
(switch_decision_tree::emit_case_nodes): Likewise.
(conditional_probability): Remove.
* tree-switch-conversion.h (enum cluster_type): New.
(PRINT_CASE): New.
(struct cluster): Likewise.
(cluster::cluster): Likewise.
(struct simple_cluster): Likewise.
(simple_cluster::simple_cluster): Likewise.
(struct group_cluster): Likewise.
(struct jump_table_cluster): Likewise.
(struct bit_test_cluster): Likewise.
(struct min_cluster_item): Likewise.
(struct case_tree_node): Likewise.
(case_tree_node::case_tree_node): Likewise.
(jump_table_cluster::case_values_threshold): Likewise.
(struct case_bit_test): Likewise.
(struct switch_decision_tree): Likewise.
(struct switch_conversion): Likewise.
(switch_decision_tree::reset_out_edges_aux): Likewise.

gcc/testsuite/ChangeLog:

2018-06-07  Martin Liska  

* gcc.dg/tree-ssa/vrp104.c: Grep just for GIMPLE IL.
---
 gcc/testsuite/gcc.dg/tree-ssa/vrp104.c |2 +-
 gcc/tree-switch-conversion.c   | 1343 ++--
 gcc/tree-switch-conversion.h   |  545 ++
 3 files changed, 1325 insertions(+), 565 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
index 71fa3bfa2ca..1bef76f1a21 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
@@ -2,7 +2,7 @@
 /* { dg-options "-O2 -fdump-tree-switchlower" }  */
 /* We scan for 2 switches as the dump file reports a transformation,
IL really contains just a single.  */
-/* { dg-final { scan-tree-dump-times "switch" 2 "switchlower1" } }  */
+/* { dg-final { scan-tree-dump-times "switch \\(" 2 "switchlower1" } }  */
 
 void foo (void);
 void bar (void);
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 2f848fcb6aa..8f3dc8fd8a4 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -175,6 +175,11 @@ switch_conversion::collect (gswitch *swtch)
 	  && ! tree_int_cst_equal (CASE_LOW (elt), CASE_HIGH (elt)))
 	m_count++;
 }
+
+  /* Get the number of unique non-default targets out of the GIMPLE_SWITCH
+ block.  Assume a CFG cleanup would have already removed degenerate
+ switch statements, this allows us to just use EDGE_COUNT.  */
+  m_uniq = EDGE_COUNT (gimple_bb (swtch)->succs) - 1;
 }
 
 bool
@@ -861,6 +866,22 @@ switch_conversion::expand (gswitch *swtch)
   /* A switch on a constant should have been optimized in tree-cfg-cleanup.  */
   gcc_checking_assert (!TREE_CONSTANT (m_index_expr));
 
+  /* Prefer bit test if possible.  */
+  if (tree_fits_uhwi_p (m_range_size)
+  && bit_test_cluster::can_be_handled (tree_to_uhwi (m_range_size), m_uniq)
+  && bit_test_cluster::is_beneficial (m_count, m_uniq))
+{
+  m_reason = "expanding as bit test is preferable";
+  return;
+}
+
+  if (m_uniq <= 2)
+{
+  /* This will be

[PATCH 3/4] Enable clustering for switch statements.

2018-06-08 Thread marxin


gcc/ChangeLog:

2018-06-07  Martin Liska  

* tree-switch-conversion.c (jump_table_cluster::find_jump_tables):
New.
(bit_test_cluster::find_bit_tests): Likewise.
(switch_decision_tree::analyze_switch_statement): Find clusters.
* tree-switch-conversion.h (struct jump_table_cluster): Document
hierarchy.
---
 gcc/tree-switch-conversion.c | 170 +++
 gcc/tree-switch-conversion.h |  19 +++-
 2 files changed, 169 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 8f3dc8fd8a4..60181542bcc 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1004,6 +1004,64 @@ jump_table_cluster::emit (tree index_expr, tree,
   gsi_insert_after (, s, GSI_NEW_STMT);
 }
 
+vec
+jump_table_cluster::find_jump_tables (vec )
+{
+  unsigned l = clusters.length ();
+  auto_vec min;
+  min.reserve (l + 1);
+
+  min.quick_push (min_cluster_item (0, 0, 0));
+
+  for (unsigned i = 1; i <= l; i++)
+{
+  /* Set minimal # of clusters with i-th item to infinite.  */
+  min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX));
+
+  for (unsigned j = 0; j < i; j++)
+	{
+	  unsigned HOST_WIDE_INT s = min[j].m_non_jt_cases;
+	  if (i - j < case_values_threshold ())
+	s += i - j;
+
+	  /* Prefer clusters with smaller number of numbers covered.  */
+	  if ((min[j].m_count + 1 < min[i].m_count
+	   || (min[j].m_count + 1 == min[i].m_count
+		   && s < min[i].m_non_jt_cases))
+	  && can_be_handled (clusters, j, i - 1))
+	min[i] = min_cluster_item (min[j].m_count + 1, j, s);
+	}
+}
+
+  /* No result.  */
+  if (min[l].m_count == INT_MAX)
+return clusters.copy ();
+
+  vec output;
+  output.create (4);
+
+  /* Find and build the clusters.  */
+  for (int end = l;;)
+{
+  int start = min[end].m_start;
+
+  /* Do not allow clusters with small number of cases.  */
+  if (is_beneficial (clusters, start, end - 1))
+	output.safe_push (new jump_table_cluster (clusters, start, end - 1));
+  else
+	for (int i = end - 1; i >= start; i--)
+	  output.safe_push (clusters[i]);
+
+  end = start;
+
+  if (start <= 0)
+	break;
+}
+
+  output.reverse ();
+  return output;
+}
+
 bool
 jump_table_cluster::can_be_handled (const vec ,
 unsigned start, unsigned end)
@@ -1052,6 +1110,56 @@ jump_table_cluster::is_beneficial (const vec &,
   return end - start + 1 >= case_values_threshold ();
 }
 
+vec
+bit_test_cluster::find_bit_tests (vec )
+{
+  vec output;
+  output.create (4);
+
+  unsigned l = clusters.length ();
+  auto_vec min;
+  min.reserve (l + 1);
+
+  min.quick_push (min_cluster_item (0, 0, 0));
+
+  for (unsigned i = 1; i <= l; i++)
+{
+  /* Set minimal # of clusters with i-th item to infinite.  */
+  min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX));
+
+  for (unsigned j = 0; j < i; j++)
+	{
+	  if (min[j].m_count + 1 < min[i].m_count
+	  && can_be_handled (clusters, j, i - 1))
+	min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);
+	}
+}
+
+  /* No result.  */
+  if (min[l].m_count == INT_MAX)
+return clusters.copy ();
+
+  /* Find and build the clusters.  */
+  for (int end = l;;)
+{
+  int start = min[end].m_start;
+
+  if (is_beneficial (clusters, start, end - 1))
+	output.safe_push (new bit_test_cluster (clusters, start, end - 1));
+  else
+	for (int i = end - 1; i >=  start; i--)
+	  output.safe_push (clusters[i]);
+
+  end = start;
+
+  if (start <= 0)
+	break;
+}
+
+  output.reverse ();
+  return output;
+}
+
 bool
 bit_test_cluster::can_be_handled (unsigned HOST_WIDE_INT range,
   unsigned int uniq)
@@ -1343,33 +1451,57 @@ switch_decision_tree::analyze_switch_statement ()
 
   reset_out_edges_aux ();
 
-  vec output;
-  output.create (1);
-
-  /* Find whether the switch statement can be expanded with a method
- different from decision tree.  */
-  unsigned end = clusters.length () - 1;
-  if (jump_table_cluster::can_be_handled (clusters, 0, end)
-  && jump_table_cluster::is_beneficial (clusters, 0, end))
-output.safe_push (new jump_table_cluster (clusters, 0, end));
-  else if (bit_test_cluster::can_be_handled (clusters, 0, end)
-	   && bit_test_cluster::is_beneficial (clusters, 0, end))
-output.safe_push (new bit_test_cluster (clusters, 0, end));
-  else
-output = clusters;
+  /* Find jump table clusters.  */
+  vec output = jump_table_cluster::find_jump_tables (clusters);
+
+  /* Find bit test clusters.  */
+  vec output2;
+  auto_vec tmp;
+  output2.create (1);
+  tmp.create (1);
+
+  for (unsigned i = 0; i < output.length (); i++)
+{
+  cluster *c = output[i];
+  if (c->get_type () != SIMPLE_CASE)
+	{
+	  if (!tmp.is_empty ())
+	{
+	  vec n = bit_test_cluster::find_bit_tests (tmp);
+	  output2.safe_splice (n);
+	  n.release ();
+	  tmp.truncate (0);
+	}
+

[PATCH 0/4][v2] Implement smart multiple switch expansion algorithms

2018-06-08 Thread marxin

Hello.

This is v2 of the following patch series:
https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00347.html
I've made some adjustments:

- patch series is split:
  * 1/4 - factoring out of switch conversion algorithm
  * 2/4 - jump table and bit test use clustering data structures, but operate
  on all cases
  * 3/4 - clustering is enabled for both algorithms
  * 4/4 - param change which yields reasonable binary sizes

- jump table density threshold has been changed to align with current trunk
- most of David's comments were resolved
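
The clustering enabled in 3/4 can be pictured as a small dynamic program over
the sorted case values: for every prefix, find the smallest number of clusters
that covers it, where a cluster is accepted only if it is dense enough.  The
sketch below is a simplified model under stated assumptions, not the code from
the patch: it works on plain integers, uses "range <= max_ratio * count" as
the only acceptance test, and ignores the is_beneficial check and the
tie-breaking on non-jump-table cases.  The names dense_enough, min_clusters
and check_min_clusters are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical density test: the sorted cases [first, last] form a cluster
// if the covered range is at most max_ratio times the number of cases.
static bool dense_enough (const std::vector<long> &cases, std::size_t first,
			  std::size_t last, unsigned long max_ratio)
{
  assert (first <= last && last < cases.size ());
  unsigned long range
    = static_cast<unsigned long> (cases[last] - cases[first]) + 1;
  unsigned long count = last - first + 1;
  return range <= max_ratio * count;
}

// Minimal number of dense clusters needed to cover all (sorted) cases.
// best[i] is the answer for the first i cases.
std::size_t min_clusters (const std::vector<long> &cases,
			  unsigned long max_ratio)
{
  const std::size_t inf = std::numeric_limits<std::size_t>::max ();
  const std::size_t n = cases.size ();
  std::vector<std::size_t> best (n + 1, inf);
  best[0] = 0;

  for (std::size_t i = 1; i <= n; i++)
    for (std::size_t j = 0; j < i; j++)
      if (best[j] != inf
	  && best[j] + 1 < best[i]
	  && dense_enough (cases, j, i - 1, max_ratio))
	best[i] = best[j] + 1;

  return best[n];
}

inline void check_min_clusters ()
{
  // Two dense groups far apart: one table each.
  assert (min_clusters ({1, 2, 3, 4, 100, 101, 102, 103}, 8) == 2);
  // One dense group.
  assert (min_clusters ({1, 2, 3, 4}, 8) == 1);
  // Sparse values: every case ends up alone.
  assert (min_clusters ({1, 1000, 2000}, 8) == 3);
}
```

With a ratio of 8, the eight values in the first check span a range of 103,
which exceeds 8 * 8, so they cannot share a single cluster but split cleanly
into two.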


I decided to change how the jump table and bit test benefit is calculated:
it now uses the same calculation as current trunk.  I still believe the
number is too high; a factor of 10 basically means that the expected growth
would be 10x compared to a balanced tree expansion.  That's because a
cmp & jmp instruction pair takes roughly 8B, which is the size of one element
in a jump table.

Here are statistics for the parameter equal to 8.

1) GCC (-O2):

bloaty ./objdir2/gcc/cc1plus -- ./objdir3/gcc/cc1plus
 VM SIZE  FILE SIZE
 ++ GROWING++
  +0.1% +11.3Ki .text  +11.3Ki  +0.1%
  [ = ]   0 .debug_info+4.00Ki  +0.0%
  [ = ]   0 .debug_loc +3.69Ki  +0.0%
  +0.1% +1.95Ki .eh_frame  +1.95Ki  +0.1%

 -- SHRINKING  --
  -2.6%  -195Ki .rodata -195Ki  -2.6%
  [ = ]   0 .debug_line-8.29Ki  -0.0%
  [ = ]   0 [Unmapped] -2.11Ki -54.8%
  [ = ]   0 .debug_ranges  -2.08Ki  -0.0%
  [ = ]   0 .strtab   -691  -0.0%
  [ = ]   0 .symtab   -360  -0.0%
  [ = ]   0 .debug_abbrev -167  -0.0%
  -0.0% -80 .eh_frame_hdr  -80  -0.0%
  [ = ]   0 .debug_aranges -16  -0.0%

  -0.6%  -181Ki TOTAL   -187Ki  -0.1%

Thus cc1plus shrinks by 0.6%.

2) SPEC 2006 benchmarks (-O2): numbers are presented only if there's a change:

BENCHMARK: /tmp/before/astar_peak.amd64-m64-mine
BENCHMARK: /tmp/before/bwaves_peak.amd64-m64-mine
BENCHMARK: /tmp/before/bzip2_peak.amd64-m64-mine
BENCHMARK: /tmp/before/cactusADM_peak.amd64-m64-mine
  +0.0% +16 .text +16  +0.0%
  -0.4%-320 .rodata  -320  -0.4%
  -0.0%-312 TOTAL -4.17Ki  -0.1%
BENCHMARK: /tmp/before/calculix_peak.amd64-m64-mine
  +0.1%+184 .rodata  +184  +0.1%
  -0.0% -16 .text -16  -0.0%
  +0.0%+112 TOTAL +1.06Ki  +0.0%
BENCHMARK: /tmp/before/dealII_peak.amd64-m64-mine
  [ = ]   0 TOTAL   +16  +0.0%
BENCHMARK: /tmp/before/gamess_peak.amd64-m64-mine
  +0.0%+480 .rodata+480  +0.0%
  -0.0%-112 .text  -112  -0.0%
  +0.0%+352 TOTAL  -160  -0.0%
BENCHMARK: /tmp/before/gcc_peak.amd64-m64-mine
  -1.2% -9.78Ki .rodata   -9.78Ki  -1.2%
  -0.1% -1.44Ki .text -1.44Ki  -0.1%
  -0.3% -11.3Ki TOTAL+264  +0.0%
BENCHMARK: /tmp/before/GemsFDTD_peak.amd64-m64-mine
BENCHMARK: /tmp/before/gobmk_peak.amd64-m64-mine
  +0.0% +64 .rodata   +64  +0.0%
  -0.0% -16 .text -16  -0.0%
  +0.0% +64 TOTAL -48  -0.0%
BENCHMARK: /tmp/before/gromacs_peak.amd64-m64-mine
  +0.0% +80 .text   +80  +0.0%
  -0.6%-512 .rodata-512  -0.6%
  -0.0%-384 TOTAL  +432  +0.0%
BENCHMARK: /tmp/before/h264ref_peak.amd64-m64-mine
  +0.5%+384 .rodata+384  +0.5%
  -0.0% -96 .text   -96  -0.0%
  +0.0%+288 TOTAL   -16  -0.0%
BENCHMARK: /tmp/before/hmmer_peak.amd64-m64-mine
  +2.6%+704 .rodata  +704  +2.6%
  -0.1%-288 .text-288  -0.1%
  +0.1%+416 TOTAL +80  +0.0%
BENCHMARK: /tmp/before/lbm_peak.amd64-m64-mine
BENCHMARK: /tmp/before/leslie3d_peak.amd64-m64-mine
BENCHMARK: /tmp/before/libquantum_peak.amd64-m64-mine
   +26%+336 .rodata+336   +26%
  +0.6%+176 .text  +176  +0.6%
  +1.2%+512 TOTAL  -256  -0.1%
BENCHMARK: /tmp/before/mcf_peak.amd64-m64-mine
BENCHMARK: /tmp/before/milc_peak.amd64-m64-mine
BENCHMARK: /tmp/before/namd_peak.amd64-m64-mine
BENCHMARK: /tmp/before/omnetpp_peak.amd64-m64-mine
BENCHMARK: /tmp/before/perlbench_peak.amd64-m64-mine
  +4.5% +5.25Ki .rodata   +5.25Ki  +4.5%
  -0.2% -1.70Ki .text -1.70Ki  -0.2%
  +0.3% +3.50Ki TOTAL +2.41Ki  +0.0%
BENCHMARK: /tmp/before/povray_peak.amd64-m64-mine
   +20% +21.1Ki .rodata   +21.1Ki   +20%
  -0.8% -5.84Ki .text -5.84Ki  -0.8%
  +1.4% +15.4Ki TOTAL +19.6Ki  +0.3%
BENCHMARK: /tmp/before/sjeng_peak.amd64-m64-mine
  +0.7%+192 .rodata+192  +0.7%
  -0.1% -80 .text   -80  -0.1%
  +0.0%+128 TOTAL   -64  -0.0%
BENCHMARK: /tmp/before/soplex_peak.amd64-m64-mine
  +2.5%+464 .rodata  +464  +2.5%
  -0.0% -32 .text -32  -0.0%
  +0.1%+444 TOTAL-224  -0.0%
BENCHMARK: /tmp/before/specrand_peak.amd64-m64-mine
BENCHMARK:

[PATCH 4/4] Change default for jump_table expansion ratio to 8.

2018-06-08 Thread marxin


gcc/ChangeLog:

2018-06-07  Martin Liska  

* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
Change default ratio from 10 to 8.
---
 gcc/tree-switch-conversion.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 60181542bcc..5573f50d59f 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1075,17 +1075,11 @@ jump_table_cluster::can_be_handled (const vec ,
  make a sequence of conditional branches instead of a dispatch.
 
  The definition of "much bigger" depends on whether we are
- optimizing for size or for speed.  If the former, the maximum
- ratio range/count = 3, because this was found to be the optimal
- ratio for size on i686-pc-linux-gnu, see PR11823.  The ratio
- 10 is much older, and was probably selected after an extensive
- benchmarking investigation on numerous platforms.  Or maybe it
- just made sense to someone at some point in the history of GCC,
- who knows...  */
+ optimizing for size or for speed.  */
   if (!flag_jump_tables)
 return false;
 
-  unsigned HOST_WIDE_INT max_ratio = optimize_insn_for_size_p () ? 3 : 10;
+  unsigned HOST_WIDE_INT max_ratio = optimize_insn_for_size_p () ? 3 : 8;
 
   unsigned HOST_WIDE_INT range = get_range (clusters[start]->get_low (),
 	clusters[end]->get_high ());
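
For readers following the ratio change, the density test can be sketched as below.  This is a simplified illustration, not the GCC implementation: the function name and parameters are hypothetical, and the real code derives its count from the clusters rather than taking it as an argument.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the density test in
// jump_table_cluster::can_be_handled.  RANGE is the number of values
// spanned by the case labels, COUNT the number of labels present.
// Inputs are assumed small enough that max_ratio * count cannot overflow.
static bool
dense_enough_for_jump_table (uint64_t range, uint64_t count,
                             bool optimize_for_size)
{
  // Ratios from the patch: 3 when optimizing for size, 8 (formerly 10)
  // when optimizing for speed.
  const uint64_t max_ratio = optimize_for_size ? 3 : 8;
  if (range == 0 || count == 0)
    return false;
  // Compare range against max_ratio * count, which avoids a division.
  return range <= max_ratio * count;
}
```

With ten case labels, a range of 100 would have passed under the old speed ratio of 10 but is rejected under 8.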

Re: [AArch64][PATCH 2/2] PR target/83009: Relax strict address checking for store pair lanes

2018-06-08 Thread Kyrill Tkachov


Hi Andre,

On 07/06/18 18:02, Andre Simoes Dias Vieira wrote:

Hi,

See below a patch to address PR 83009.

Tested with aarch64-linux-gnu bootstrap and regtests for c, c++ and fortran.
Ran the adjusted testcase for -mabi=ilp32.

Is this OK for trunk?

Cheers,
Andre

PR target/83009: Relax strict address checking for store pair lanes

The operand constraint for the memory address of store/load pair lanes was
enforcing strictly hardware registers be allowed as memory addresses.  We want
to relax that such that these patterns can be used by combine, prior to reload.
During register allocation the register constraint will enforce the correct
register is chosen.

gcc
2018-06-07  Andre Vieira 

PR target/83009
* config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Make
address check not strict prior to reload.

gcc/testsuite
2018-06-07 Andre Vieira 

PR target/83009
* gcc/target/aarch64/store_v2vec_lanes.c: Add extra tests.


diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index f0917af8b4cec945ba4e38e4dc670200f8812983..30aa88838671bf343a883077c2b606a035c030dd 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -227,7 +227,7 @@
 (define_predicate "aarch64_mem_pair_lanes_operand"
   (and (match_code "mem")
(match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 0),
- true,
+ reload_completed,
  ADDR_QUERY_LDP_STP_N)")))
 


If you want to enforce strict checking during reload and later, then I think you
need to use reload_in_progress || reload_completed?
I guess that would be equivalent to !can_create_pseudo_p ().

Thanks,
Kyrill

[committed] v2: Convert dump and optgroup flags to enums

2018-06-08 Thread David Malcolm

On Tue, 2018-06-05 at 18:16 +0200, Richard Biener wrote:
> On June 5, 2018 4:49:21 PM GMT+02:00, David Malcolm  com> wrote:
> > On Tue, 2018-06-05 at 04:40 -0400, Trevor Saunders wrote:
> > > On Fri, Jun 01, 2018 at 12:00:09PM +0200, Richard Biener wrote:
> > > > On Tue, May 29, 2018 at 10:32 PM David Malcolm  > > > .com
> > > > > wrote:
> > > > > 
> > > > > > The dump machinery uses "int" in a few places, for two different
> > > > > > sets of bitmasks.
> > > > > > 
> > > > > > This patch makes things more self-documenting and type-safe by using
> > > > > > a new pair of enums: one for the dump_flags_t and another for the
> > > > > > optgroup_flags.
> > > > 
> > > > > Great!  This should also make them accessible in gdb w/o using -g3.
> > > > 
> > > > > > This requires adding some overloaded bit operations to the enums
> > > > > > in question, which, in this patch, is done for each enum.  If the
> > > > > > basic idea is OK, should I add a template for this?  (with some kind
> > > > > > of magic to express that bitmasking operations are only supported on
> > > > > > certain opt-in enums).
> > > 
> > > > You may want to look at gdb's enum-flags.h which I think already
> > > > implements what you're doing here.
> > 
> > Aha!  Thanks!
> > 
> > > Browsing the git web UI, that gdb header was introduced by Pedro Alves
> > > in:
> > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=8d297bbf604c8318ffc72d5a7b3db654409c5ed9
> > > and the current state is here:
> > > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/common/enum-flags.h;hb=HEAD
> > 
> > I'll try this out with GCC; hopefully it works with C++98.
> > 
> > > Presumably it would be good to share this header between GCC and GDB;
> > > CCing Pedro; Pedro: hi!  Does this sound sane?
> > > (for reference, the GCC patch we're discussing here is:
> > >  https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01685.html )
> > > 
> > > Presumably gcc's copy should live in gcc's top-level "include"
> > > subdirectory?
> > > 
> > > Would we need to change that "This file is part of GDB." comment, and
> > > the include guards' "COMMON_ENUM_FLAGS_H"?
> > 
> > > > > Does C++ allow > int enums?  I think we want some way of knowing
> > > > > when either enum exceeds int (dump_flags_t was already uint64_t
> > > > > but you now make it effectively int again).  That is, general
> > > > > wrapping for enum ops should choose an appropriate unsigned integer
> > > > > for the operation.  So yes, a common implementation looks useful to
> > > > > me.
> > > > 
> > > > I don't remember very well, but ISTR C++ will actually use an 8-byte
> > > > integer if the enum contains constants larger than 2^32.  Testing
> > > > sizeof enum x { A =0x4 }; gives the desired thing for me, but it
> > > > would still be good to check the standard.
> > 
> > > FWIW C++11 onwards has a std::underlying_type for enums:
> > >  http://en.cppreference.com/w/cpp/types/underlying_type
> > > (GCC is on C++98).  The gdb header seems to emulate this via the use of
> > > sizeof(T) to select an appropriate integer_for_size specialization and
> > > thus the appropriate struct enum_underlying_type specialization (if I'm
> > > reading it right).
> > 
> > > Trev
> > > 
> > > 
> > > > 
> > > > I think this patch is independently useful.
> > 
> > > Richard: by this, did you mean that the patch is OK for trunk as-is?
> 
> Yes. 
> 
> Richard. 
> 
> > (keeping a more general bitflags enum patch as followup work)  Note
> > to
> > self: still needs bootstrap-and-testing.

Thanks.

The patch needed some trivial fixes, which seemed sufficiently obvious
as to not need further approval:

* Use dump_flags_t rather than int for "dump_kind" param for
dump_basic_block, dump_generic_expr_loc's prototype (not just
the definition), and dump_dec.
* Removed "FIXME" comments
* Remove trailing comma from last item in enum dump_flag (for -Wpedantic).
* Fixed brig-to-generic.cc and graphite-poly.c (my earlier testing didn't
have brig or graphite enabled).
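
As an illustration of the per-enum overloaded bit operations discussed in the quoted thread, here is a minimal C++98-style sketch.  The enum and its enumerators are hypothetical and do not match GCC's dump_flag enum; only the shape of the operators is the point.

```cpp
#include <cassert>

// Hypothetical flag enum; names and values are illustrative only.
enum example_dump_flag
{
  EX_NONE    = 0,
  EX_DETAILS = 1 << 0,
  EX_STATS   = 1 << 1,
  EX_BLOCKS  = 1 << 2
};

// Without these overloads, EX_DETAILS | EX_STATS has type int and cannot
// be assigned back to an example_dump_flag without a cast.
static inline example_dump_flag
operator| (example_dump_flag lhs, example_dump_flag rhs)
{
  return (example_dump_flag) ((unsigned) lhs | (unsigned) rhs);
}

static inline example_dump_flag
operator& (example_dump_flag lhs, example_dump_flag rhs)
{
  return (example_dump_flag) ((unsigned) lhs & (unsigned) rhs);
}

static inline example_dump_flag &
operator|= (example_dump_flag &lhs, example_dump_flag rhs)
{
  lhs = lhs | rhs;
  return lhs;
}

// Accumulate two flags into a variable of the enum type.
static example_dump_flag
details_and_stats ()
{
  example_dump_flag f = EX_NONE;
  f |= EX_DETAILS;
  f |= EX_STATS;
  return f;
}
```

The casts go through unsigned here because all values fit in an int; an enum with wider constants would need a wider intermediate type.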

For reference here's what I've committed to trunk (r261325; I'm looking
at using gdb's enum-flags.h as follow-up):

gcc/brig/ChangeLog:
* brigfrontend/brig-to-generic.cc
(brig_to_generic::write_globals): Use TDF_NONE rather than 0.
(dump_function): Likewise.

gcc/c-family/ChangeLog:
* c-pretty-print.c (c_pretty_printer::statement): Use TDF_NONE
rather than 0.

gcc/ChangeLog:
* cfg.c (debug): Use TDF_NONE rather than 0.
* cfghooks.c (debug): Likewise.
* dumpfile.c (DUMP_FILE_INFO): Likewise; also for OPTGROUP.
(struct dump_option_value_info): Convert to...
(struct kv_pair): ...this template type.
(dump_options): Convert to kv_pair; use TDF_NONE
rather than 0.
(optinfo_verbosity_options): Likewise.
(optgroup_options): Convert to kv_pair; use

[PATCH] rs6000: Delete mention of -mabi={no-,}spe in the documentation

2018-06-08 Thread Segher Boessenkool

The option no longer exists.

Tested, committing.


Segher


2018-06-08  Segher Boessenkool  

* doc/invoke.texi (RS/6000 and PowerPC Options): Delete mention of
-mabi=spe and -mabi=no-spe.

---
 gcc/doc/invoke.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 46dcf18..0009038 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -24032,8 +24032,8 @@ SVR4 ABI)@.
 @item -mabi=@var{abi-type}
 @opindex mabi
 Extend the current ABI with a particular extension, or remove such extension.
-Valid values are @samp{altivec}, @samp{no-altivec}, @samp{spe},
-@samp{no-spe}, @samp{ibmlongdouble}, @samp{ieeelongdouble},
+Valid values are @samp{altivec}, @samp{no-altivec},
+@samp{ibmlongdouble}, @samp{ieeelongdouble},
 @samp{elfv1}, @samp{elfv2}@.
 
 @item -mabi=ibmlongdouble
-- 
1.8.3.1

[PATCH] rs6000: Delete unused min/max macros

2018-06-08 Thread Segher Boessenkool

The last use was deleted in 2017.  There are the generic MIN/MAX macros
to use already, and in this new world we should use std::min, std::max.

Tested, committing.


Segher


2018-06-08  Segher Boessenkool  

* config/rs6000/rs6000.c (min, max): Delete.

---
 gcc/config/rs6000/rs6000.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec60c14..7c5bcd1 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -103,9 +103,6 @@
 #endif
 #endif
 
-#define min(A,B)   ((A) < (B) ? (A) : (B))
-#define max(A,B)   ((A) > (B) ? (A) : (B))
-
 static pad_direction rs6000_function_arg_padding (machine_mode, const_tree);
 
 /* Structure used to define the rs6000 stack */
-- 
1.8.3.1
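
One reason to prefer std::min and std::max over function-like macros of this shape is that the macro can evaluate an argument twice.  A small sketch; the macro is renamed MACRO_MIN and the helper functions are hypothetical:

```cpp
#include <algorithm>
#include <cassert>

// Same shape as the removed macro, renamed to avoid clashing with std::min.
#define MACRO_MIN(A, B) ((A) < (B) ? (A) : (B))

static int calls = 0;

// Side-effecting argument: counts how many times it is evaluated.
static int
next_value ()
{
  return ++calls;
}

// The macro expands its first argument twice when it is the smaller one.
static int
macro_evaluations ()
{
  calls = 0;
  int r = MACRO_MIN (next_value (), 10);
  (void) r;
  return calls;
}

// std::min is a function, so each argument is evaluated exactly once.
static int
function_evaluations ()
{
  calls = 0;
  int r = std::min (next_value (), 10);
  (void) r;
  return calls;
}
```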

Re: [PATCH 05/14] Use summaries->get where possible. Small refactoring of multiple calls.

2018-06-08 Thread Martin Liška

On 06/07/2018 02:16 PM, Jan Hubicka wrote:
>>
>> gcc/ChangeLog:
>>
>> 2018-04-24  Martin Liska  
>>
>>  * ipa-fnsummary.c (dump_ipa_call_summary): Use ::get method.
>>  (analyze_function_body): Extract multiple calls of get_create.
>>  * ipa-inline-analysis.c (simple_edge_hints): Likewise.
>>  * ipa-inline.c (recursive_inlining): Use ::get method.
>>  * ipa-inline.h (estimate_edge_growth): Likewise.
>> ---
>>  gcc/ipa-fnsummary.c   | 14 +++---
>>  gcc/ipa-inline-analysis.c |  2 +-
>>  gcc/ipa-inline.c  |  8 
>>  gcc/ipa-inline.h  |  7 +++
>>  4 files changed, 15 insertions(+), 16 deletions(-)
>>
> 
>> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
>> index 8a6c5d0b5d8..e40b537bf61 100644
>> --- a/gcc/ipa-fnsummary.c
>> +++ b/gcc/ipa-fnsummary.c
>> @@ -850,7 +850,7 @@ dump_ipa_call_summary (FILE *f, int indent, struct cgraph_node *node,
>>}
>>if (!edge->inline_failed)
>>  {
>> -  ipa_fn_summary *s = ipa_fn_summaries->get_create (callee);
>> +  ipa_fn_summary *s = ipa_fn_summaries->get (callee);
>>fprintf (f, "%*sStack frame offset %i, callee self size %i,"
>> " callee size %i\n",
>> indent + 2, "",
>> @@ -2363,10 +2363,9 @@ analyze_function_body (struct cgraph_node *node, bool early)
>>  }
>>free (body);
>>  }
>> -  set_hint_predicate (&ipa_fn_summaries->get_create (node)->loop_iterations,
>> -  loop_iterations);
>> -  set_hint_predicate (&ipa_fn_summaries->get_create (node)->loop_stride,
>> -  loop_stride);
>> +  ipa_fn_summary *s = ipa_fn_summaries->get_create (node);
>> +  set_hint_predicate (&s->loop_iterations, loop_iterations);
>> +  set_hint_predicate (&s->loop_stride, loop_stride);
> 
> I think you already have pointer info initialized to ipa_fn_summaries->get
> (node), so just replace all uses of ipa_fn_summaries in this function by
> that.  It seems like a not very careful transition (done perhaps by me :)
> 
> We may consider modifying our getters to make them pure for gcc so it will
> optimize some of those issues for us.  That would require some code
> refactoring, as you have an internal getter with a bool parameter that also
> creates nodes (and thus is no longer pure).
>> diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
>> index c4f904730e6..2e30a6d15ba 100644
>> --- a/gcc/ipa-inline-analysis.c
>> +++ b/gcc/ipa-inline-analysis.c
>> @@ -126,7 +126,7 @@ simple_edge_hints (struct cgraph_edge *edge)
>>  ? edge->caller->global.inlined_to : edge->caller);
>>struct cgraph_node *callee = edge->callee->ultimate_alias_target ();
>>if (ipa_fn_summaries->get_create (to)->scc_no
>> -  && ipa_fn_summaries->get_create (to)->scc_no
>> +  && ipa_fn_summaries->get (to)->scc_no
>>   == ipa_fn_summaries->get_create (callee)->scc_no
> 
> Please move ipa_fn_summaries->get_create (to)->scc_no out of the
> conditional and store the result, so we don't need to call it multiple times.
> I think it would be a bug if the summaries were not ready here, so it should
> be ->get only.
>> diff --git a/gcc/ipa-inline.h b/gcc/ipa-inline.h
>> index e8ae206d7b7..06bd38e551e 100644
>> --- a/gcc/ipa-inline.h
>> +++ b/gcc/ipa-inline.h
>> @@ -81,10 +81,9 @@ estimate_edge_size (struct cgraph_edge *edge)
>>  static inline int
>>  estimate_edge_growth (struct cgraph_edge *edge)
>>  {
>> -  gcc_checking_assert (ipa_call_summaries->get_create (edge)->call_stmt_size
>> -   || !edge->callee->analyzed);
>> -  return (estimate_edge_size (edge)
>> -  - ipa_call_summaries->get_create (edge)->call_stmt_size);
>> +  ipa_call_summary *s = ipa_call_summaries->get_create (edge);
>> +  gcc_checking_assert (s->call_stmt_size || !edge->callee->analyzed);
>> +  return (estimate_edge_size (edge) - s->call_stmt_size);
> 
> Also, if get_create ever created a new summary here, it would not have the
> right sizes, so please turn it into get here.
> 
> OK with those changes (and if any of those trap, just add a FIXME and we can
> deal with it incrementally).
> 
> Honza
> 

OK, I'm doing that in the patch that I attach.

Martin
From f3a2951a3bbfb4091b7d4d141adb14a181eebc0a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 8 Jun 2018 12:50:18 +0200
Subject: [PATCH] Replace some ::get_create with ::get in IPA inline.

gcc/ChangeLog:

2018-06-08  Martin Liska  

	* ipa-inline-analysis.c (simple_edge_hints): Use ::get method.
	* ipa-inline.h (estimate_edge_growth): Likewise.
---
 gcc/ipa-inline-analysis.c | 8 
 gcc/ipa-inline.h  | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 9a7267395ea..c781d368a8a 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -96,10 +96,10 @@ simple_edge_hints (struct cgraph_edge *edge)
   struct cgraph_node *to =
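
The difference between the two accessors discussed in this thread can be sketched with a toy container.  This is not GCC's function_summary or call_summary template; the class, the int key, and the single field are hypothetical and only model the "look up" versus "look up or insert a default" behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical summary with one field, default-initialized to 0.
struct summary_sketch
{
  int call_stmt_size;
  summary_sketch () : call_stmt_size (0) {}
};

class summary_map_sketch
{
public:
  // Pure lookup: returns NULL when no summary exists and never inserts.
  summary_sketch *get (int key)
  {
    std::map<int, summary_sketch>::iterator it = m_map.find (key);
    return it == m_map.end () ? NULL : &it->second;
  }

  // Lookup that creates a default summary when none exists.
  summary_sketch *get_create (int key) { return &m_map[key]; }

  size_t size () const { return m_map.size (); }

private:
  std::map<int, summary_sketch> m_map;
};

static bool
get_leaves_map_untouched ()
{
  summary_map_sketch m;
  return m.get (42) == NULL && m.size () == 0;
}

static bool
get_create_inserts_default ()
{
  summary_map_sketch m;
  summary_sketch *s = m.get_create (42);
  return s != NULL && s->call_stmt_size == 0
	 && m.size () == 1 && m.get (42) == s;
}
```

A summary created on demand carries only default values, which is the concern raised above for estimate_edge_growth: a freshly created entry would not hold the right sizes.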

Re: [PATCH 07/14] Convert ipa-pure-const.c to symbol_summary.

2018-06-08 Thread Martin Liška

On 06/07/2018 02:57 PM, Jan Hubicka wrote:
>>
>> gcc/ChangeLog:
>>
>> 2018-04-24  Martin Liska  
>>
>>  * ipa-pure-const.c (struct funct_state_d): Do it class instead
>>  of struct.
>>  (class funct_state_summary_t): New function_summary class.
>>  (has_function_state): Remove.
>>  (get_function_state): Likewise.
>>  (set_function_state): Likewise.
>>  (add_new_function): Likewise.
>>  (funct_state_summary_t::insert): New function.
>>  (duplicate_node_data): Remove.
>>  (remove_node_data): Remove.
>>  (funct_state_summary_t::duplicate): New function.
>>  (register_hooks): Create new funct_state_summaries.
>>  (pure_const_generate_summary): Use it.
>>  (pure_const_write_summary): Likewise.
>>  (pure_const_read_summary): Likewise.
>>  (propagate_pure_const): Likewise.
>>  (propagate_nothrow): Likewise.
>>  (dump_malloc_lattice): Likewise.
>>  (propagate_malloc): Likewise.
>>  (execute): Do not register hooks, just remove summary
>>  instead.
>>  (pass_ipa_pure_const::pass_ipa_pure_const): Simplify
>>  constructor.
> 
> OK with changes below.
> In general, it would be cool to reorg this pass into a simple SCC propagation
> template, because it does the same things over and over again (sometimes on a
> slightly different graph, because it has a feature to skip uninteresting edges,
> e.g. for nothrow propagation).
>> @@ -1485,7 +1439,7 @@ propagate_pure_const (void)
>>int i;
>>struct ipa_ref *ref = NULL;
>>  
>> -  funct_state w_l = get_function_state (w);
>> +  funct_state w_l = funct_state_summaries->get_create (w);
>>if (dump_file && (dump_flags & TDF_DETAILS))
>>  fprintf (dump_file, "  Visiting %s state:%s looping %i\n",
>>   w->dump_name (),
>> @@ -1527,7 +1481,7 @@ propagate_pure_const (void)
>>  }
>>if (avail > AVAIL_INTERPOSABLE)
>>  {
>> -  funct_state y_l = get_function_state (y);
>> +  funct_state y_l = funct_state_summaries->get_create (y);
>>if (dump_file && (dump_flags & TDF_DETAILS))
>>  {
>>fprintf (dump_file,
> 
> The functions are organized in a way that goes cycle by cycle.  The first loop
> initializes everything in the cycle, and in this case you want get_create
> (where w_l is set).  y_l looks for calls, and because it works in SCC order
> all calls are either in the same cycle or already processed.  I do not see any
> guard checking that y_l is initialized, and because you now initialize it to
> the pessimistic state I think it will turn any SCC component into non-pure
> now (while it originally worked by initializing to 0, which is the optimistic
> state CONST).
> 
> So I would add a guard that y is in a different SCC component and make the
> code crash if get() returns NULL.
> Similarly for all other propagators.
> 
>> @@ -1642,7 +1596,7 @@ propagate_pure_const (void)
>>while (w && !can_free)
>>  {
>>struct cgraph_edge *e;
>> -  funct_state w_l = get_function_state (w);
>> +  funct_state w_l = funct_state_summaries->get_create (w);
>>  
>>if (w_l->can_free
>>|| w->get_availability () == AVAIL_INTERPOSABLE
>> @@ -1657,7 +1611,7 @@ propagate_pure_const (void)
>>e->caller);
>>  
>>if (avail > AVAIL_INTERPOSABLE)
>> -can_free = get_function_state (y)->can_free;
>> +can_free = funct_state_summaries->get_create (y)->can_free;
>>else
>>  can_free = true;
>>  }
> 
> Here everything should be computed already.
> 
> OK with those changes provided it does not affect generated code.
> 
> Honza
> 

Hi.

Ok, I'll do that with one additional patch that I'm sending.

Martin
From 42f122845969bf39ec606a0592d7a511b87c4657 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 8 Jun 2018 12:59:34 +0200
Subject: [PATCH] Make ipa-pure-const more strict about summary constraints.

gcc/ChangeLog:

2018-06-08  Martin Liska  

	* ipa-pure-const.c (propagate_pure_const): Use ::get at places
where we expect an existing summary.
---
 gcc/ipa-pure-const.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 8c415bc1fc0..d0f9cb8d7f7 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1477,7 +1477,7 @@ propagate_pure_const (void)
 		}
 	  if (avail > AVAIL_INTERPOSABLE)
 		{
-		  funct_state y_l = funct_state_summaries->get_create (y);
+		  funct_state y_l = funct_state_summaries->get (y);
 		  if (dump_file && (dump_flags & TDF_DETAILS))
 		{
 		  fprintf (dump_file,
@@ -1591,7 +1591,7 @@ propagate_pure_const (void)
   while (w && !can_free)
 	{
 	  struct cgraph_edge *e;
-	  funct_state w_l = funct_state_summaries->get_create (w);
+	  funct_state w_l = funct_state_summaries->get (w);
 
 	  if
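
The concern about optimistic versus pessimistic defaults inside a cycle can be illustrated with a toy model.  This is not the propagation code in ipa-pure-const.c: the three-state lattice, the two-function cycle, and all names are hypothetical simplifications.

```cpp
#include <cassert>

// Toy lattice: a larger value means a worse (less optimizable) state.
enum purity_sketch { P_CONST = 0, P_PURE = 1, P_NEITHER = 2 };

static purity_sketch
worse_of (purity_sketch a, purity_sketch b)
{
  return a > b ? a : b;
}

// Model a cycle a -> b -> a.  When a is visited, b has no computed state
// yet, so a sees UNVISITED_DEFAULT for its callee instead.
static purity_sketch
cycle_result (purity_sketch local_a, purity_sketch local_b,
	      purity_sketch unvisited_default)
{
  purity_sketch a = worse_of (local_a, unvisited_default);
  purity_sketch b = worse_of (local_b, a);
  return worse_of (a, b);
}
```

With an optimistic default (P_CONST) a cycle of locally const functions stays const; with a pessimistic default (P_NEITHER) the same cycle degrades, which is the effect the review warns about.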

[PATCH][OBVIOUS] Fix scan in ipa-icf-38.c.

2018-06-08 Thread Martin Liška

Hi.

One obvious patch: scan the proper dump file in the ltrans stage of an LTO test case.

Martin

gcc/testsuite/ChangeLog:

2018-06-08  Martin Liska  

* gcc.dg/ipa/ipa-icf-38.c: Scan optimized tree dump.
---
 gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
index 85531ab1cf3..788038a1c68 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-38.c
@@ -1,5 +1,5 @@
 /* { dg-do link } */
-/* { dg-options "-O2 -fdump-ipa-icf -flto -fdump-tree-fixup_cfg4" } */
+/* { dg-options "-O2 -fdump-ipa-icf -flto -fdump-tree-optimized" } */
 /* { dg-require-effective-target lto } */
 /* { dg-additional-sources "ipa-icf-38a.c" }*/
 
@@ -29,5 +29,5 @@ int main()
 
 /* { dg-final { scan-wpa-ipa-dump "Semantic equality hit:foo->bar" "icf"  } } */
 /* { dg-final { scan-wpa-ipa-dump "Equal symbols: 1" "icf"  } } */
-/* { dg-final { scan-ltrans-tree-dump "Function foo" "fixup_cfg4" } } */
-/* { dg-final { scan-ltrans-tree-dump-not "Function bar" "fixup_cfg4" } } */
+/* { dg-final { scan-ltrans-tree-dump "Function foo" "optimized" } } */
+/* { dg-final { scan-ltrans-tree-dump-not "Function bar" "optimized" } } */

Re: [Patch] Do not call the linker if we are creating precompiled header files

2018-06-08 Thread Joseph Myers

On Wed, 2 May 2018, Steve Ellcey wrote:

> I tracked this down to driver::maybe_run_linker where it sees the linker
> flags and increments num_linker_inputs, this causes the routine to call
> the linker.   This patch checks to see if we are creating precompiled
> header files and avoids calling the linker in that case.

Making it a global check like this, with an early exit from 
driver::maybe_run_linker, seems wrong to me.  I think the correct logical 
check is, for each output file, whether it is a precompiled header - if it 
is, then num_linker_inputs should not be incremented for that particular 
output file (but if there are other output files that are not precompiled 
headers, or if there are linker input files, num_linker_inputs should 
still be incremented for *those*).  Then the existing logic in the rest of 
driver::maybe_run_linker should run as usual (including the logic to warn 
about explicit linker input files when linking is not done).

-- 
Joseph S. Myers
jos...@codesourcery.com
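
The per-output-file check suggested above can be sketched as follows.  This is only an illustration of the counting rule, not the driver code: the struct, both function names, and the way files are represented are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one compiler output file.
struct output_file_sketch
{
  bool is_precompiled_header;
};

// Count linker inputs: explicit linker input files always count, and each
// output file counts unless it is a precompiled header.
static int
count_linker_inputs (const std::vector<output_file_sketch> &outputs,
		     int explicit_linker_inputs)
{
  int n = explicit_linker_inputs;
  for (size_t i = 0; i < outputs.size (); i++)
    if (!outputs[i].is_precompiled_header)
      n++;
  return n;
}

// Convenience wrapper for trying out combinations.
static int
count_for (int n_pch, int n_other, int n_explicit)
{
  std::vector<output_file_sketch> outputs;
  output_file_sketch pch = { true };
  output_file_sketch other = { false };
  outputs.insert (outputs.end (), n_pch, pch);
  outputs.insert (outputs.end (), n_other, other);
  return count_linker_inputs (outputs, n_explicit);
}
```

With only precompiled headers as outputs the count stays at zero, so the existing "nothing to link" logic would apply; any object output or explicit linker input still triggers the usual path.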

[PATCH] Use flags_from_decl_or_type in lto_symtab_merge_p (PR ipa/85248).

2018-06-08 Thread Martin Liška

Hi.

The second follow-up patch uses flags_from_decl_or_type in LTO merging
of declarations.  I hope it's a cleaner approach.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Martin
From 17d598f028c723cb11e8a9f786e3026c0cfca4aa Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 8 Jun 2018 10:14:47 +0200
Subject: [PATCH] Use flags_from_decl_or_type in lto_symtab_merge_p (PR
 ipa/85248).

gcc/lto/ChangeLog:

2018-06-08  Martin Liska  

PR ipa/85248
	* lto-symtab.c (lto_symtab_merge_p): Use
flags_from_decl_or_type.
---
 gcc/lto/lto-symtab.c | 50 +---
 1 file changed, 15 insertions(+), 35 deletions(-)

diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index b1df9bb77d1..2259358ea5f 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "lto-symtab.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "calls.h"
 
 /* Replace the cgraph node NODE with PREVAILING_NODE in the cgraph, merging
all edges and removing the old node.  */
@@ -547,7 +548,7 @@ lto_symtab_merge_p (tree prevailing, tree decl)
 {
   if (DECL_BUILT_IN (prevailing) != DECL_BUILT_IN (decl))
 	{
-  if (symtab->dump_file)
+	  if (symtab->dump_file)
 	fprintf (symtab->dump_file, "Not merging decls; "
 		 "DECL_BUILT_IN mismatch\n");
 	  return false;
@@ -561,44 +562,23 @@ lto_symtab_merge_p (tree prevailing, tree decl)
 		 "DECL_BUILT_IN_CLASS or CODE mismatch\n");
 	  return false;
 	}
-}
 
-  /* FIXME: after MPX is removed, use flags_from_decl_or_type
- function instead.  PR lto/85248.  */
-  if (DECL_ATTRIBUTES (prevailing) != DECL_ATTRIBUTES (decl))
-{
-  tree prev_attr = lookup_attribute ("error", DECL_ATTRIBUTES (prevailing));
-  tree attr = lookup_attribute ("error", DECL_ATTRIBUTES (decl));
-  if ((prev_attr == NULL) != (attr == NULL)
-	  || (prev_attr && !attribute_value_equal (prev_attr, attr)))
+  if (DECL_ATTRIBUTES (prevailing) != DECL_ATTRIBUTES (decl))
 	{
-  if (symtab->dump_file)
-	fprintf (symtab->dump_file, "Not merging decls; "
-		 "error attribute mismatch\n");
-	  return false;
-	}
-
-  prev_attr = lookup_attribute ("warning", DECL_ATTRIBUTES (prevailing));
-  attr = lookup_attribute ("warning", DECL_ATTRIBUTES (decl));
-  if ((prev_attr == NULL) != (attr == NULL)
-	  || (prev_attr && !attribute_value_equal (prev_attr, attr)))
-	{
-  if (symtab->dump_file)
-	fprintf (symtab->dump_file, "Not merging decls; "
-		 "warning attribute mismatch\n");
-	  return false;
-	}
-
-  prev_attr = lookup_attribute ("noreturn", DECL_ATTRIBUTES (prevailing));
-  attr = lookup_attribute ("noreturn", DECL_ATTRIBUTES (decl));
-  if ((prev_attr == NULL) != (attr == NULL))
-	{
-  if (symtab->dump_file)
-	fprintf (symtab->dump_file, "Not merging decls; "
-		 "noreturn attribute mismatch\n");
-	  return false;
+	  int prev_decl_attrs
+	= flags_from_decl_or_type (prevailing);
+	  int decl_attrs
+	= flags_from_decl_or_type (decl);
+	  if (prev_decl_attrs != decl_attrs)
+	{
+	  if (symtab->dump_file)
+		fprintf (symtab->dump_file, "Not merging decls; "
+			 "DECL_ATTRIBUTES mismatch\n");
+	  return false;
+	}
 	}
 }
+
   return true;
 }
 
-- 
2.17.0
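
The idea of comparing one summarised flag word instead of checking attributes one by one can be sketched as below.  The flag names, the struct, and both functions are hypothetical; this does not reproduce what flags_from_decl_or_type actually computes.

```cpp
#include <cassert>

// Hypothetical flag bits summarising properties of a declaration.
enum
{
  SK_NORETURN = 1 << 0,
  SK_PURE     = 1 << 1,
  SK_CONST    = 1 << 2
};

// Hypothetical declaration with three boolean properties.
struct decl_sketch
{
  bool noreturn_p;
  bool pure_p;
  bool const_p;
};

// Collapse the properties into a single bitmask.
static int
flags_of (const decl_sketch &d)
{
  int flags = 0;
  if (d.noreturn_p)
    flags |= SK_NORETURN;
  if (d.pure_p)
    flags |= SK_PURE;
  if (d.const_p)
    flags |= SK_CONST;
  return flags;
}

// Two declarations are mergeable here only if their flag words agree.
static bool
can_merge (const decl_sketch &prevailing, const decl_sketch &decl)
{
  return flags_of (prevailing) == flags_of (decl);
}

static decl_sketch
make_decl (bool noreturn_p, bool pure_p, bool const_p)
{
  decl_sketch d = { noreturn_p, pure_p, const_p };
  return d;
}
```

A single comparison replaces a sequence of per-attribute lookups, at the cost of only seeing whatever the summarising function chooses to encode.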

Re: [AArch64][PATCH 1/2] Fix addressing printing of LDP/STP

2018-06-08 Thread Kyrill Tkachov


Hi Andre,

On 07/06/18 18:01, Andre Simoes Dias Vieira wrote:

Hi,

The address printing for LDP/STP patterns that don't use parallel was not 
working properly when dealing with a post-index addressing mode. The post-index 
address printing uses the mode's size to determine the post-index immediate. To 
fix an earlier issue with range checking of these instructions this mode was 
being hard-coded to DFmode for the operand modifier 'y', which was added for 
this particular pattern.  This was done because the range of LDP/STP for two 
64-bit operands is the same as a single 64-bit load/store. Instead of 
hard-coding the mode early on we introduce a new address query type 
'ADDR_QUERY_LDP_STP_N' to be used for such cases. This will halve the mode used 
for computing the range check, but leave the original mode of the operand as 
is, making sure the post-index printing is correct.

Bootstrapped and tested on aarch64-none-linux-gnu.

Is this OK for trunk?


This looks ok to me, but you'll need approval from a maintainer (CC'ed them for 
you).

Thanks,
Kyrill



gcc
2018-06-07  Andre Vieira 

* config/aarch64/aarch64-protos.h (aarch64_addr_query_type): Add new
enum value 'ADDR_QUERY_LDP_STP_N'.
* config/aarch64/aarch64.c (aarch64_addr_query_type): Likewise.
(aarch64_print_address_internal): Add declaration.
(aarch64_print_ldpstp_address): Remove.
(aarch64_classify_address): Adapt mode for 'ADDR_QUERY_LDP_STP_N'.
(aarch64_print_operand): Change printing of 'y'.
* config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): Use
new enum value 'ADDR_QUERY_LDP_STP_N', don't hardcode mode and use
'true' rather than '1'.
* gcc/config/aarch64/constraints.md (Uml): Likewise.
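
The split described in the message above, using half the mode size for the range check while keeping the full size for the post-index immediate, can be sketched as follows.  The function names are hypothetical and the range assumed is the usual LDP/STP 7-bit signed immediate scaled by the element size; this is not the logic of aarch64_classify_address.

```cpp
#include <cassert>
#include <stdint.h>

// Range check for a load/store pair covering PAIR_BYTES in total.  The
// check is done with the size of one element (half the pair), assuming a
// 7-bit signed immediate scaled by that element size.
static bool
ldp_stp_offset_ok (int64_t offset, unsigned pair_bytes)
{
  int64_t elt = pair_bytes / 2;
  if (elt == 0 || offset % elt != 0)
    return false;
  int64_t scaled = offset / elt;
  return scaled >= -64 && scaled <= 63;
}

// The post-index immediate advances past the whole pair, so it uses the
// full size rather than the halved one.
static unsigned
post_index_immediate (unsigned pair_bytes)
{
  return pair_bytes;
}
```

For a 16-byte pair this accepts offsets from -512 to 504 in steps of 8, while the post-index step remains 16.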

[PATCH] Come up with Deprecated option flag.

2018-06-08 Thread Martin Liška

Hi.

The first follow-up to the MPX removal patch introduces a Deprecated option flag.
It prints a warning for options that have no effect:

$ ./xgcc -B. /tmp/main.c -Wchkp -static-libmpxwrappers
xgcc: warning: deprecated command line option ‘-static-libmpxwrappers’
cc1: warning: deprecated command line option ‘-Wchkp’

Is the string OK?

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Martin
From 0b1473e517373386e674c6736de5007960138d03 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 8 Jun 2018 10:52:23 +0200
Subject: [PATCH] Come up with Deprecated option flag.

gcc/ChangeLog:

2018-06-08  Martin Liska  

	* config/i386/i386.opt: Make MPX-related options as Deprecated.
	* opt-functions.awk: Handle Deprecated flag.
	* opts-common.c (decode_cmdline_option): Handle cl_deprecated
and report error.
	(read_cmdline_option): Report warning for a deprecated option.
	* opts.h (struct cl_option): Add new field cl_deprecated.
	(CL_ERR_DEPRECATED): New.

gcc/c-family/ChangeLog:

2018-06-08  Martin Liska  

	* c.opt: Make MPX-related options as Deprecated.

gcc/testsuite/ChangeLog:

2018-06-08  Martin Liska  

	* g++.dg/opt/mpx.C: New test.
	* gcc.target/i386/mpx.c: New test.
---
 gcc/c-family/c.opt  | 42 ++---
 gcc/config/i386/i386.opt|  2 +-
 gcc/opt-functions.awk   |  3 ++-
 gcc/opts-common.c   | 10 +++
 gcc/opts.h  |  3 +++
 gcc/testsuite/g++.dg/opt/mpx.C  |  5 
 gcc/testsuite/gcc.target/i386/mpx.c |  3 +++
 7 files changed, 45 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/mpx.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mpx.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 1d7eafff1f7..b4aefd8d5f6 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -409,7 +409,7 @@ C ObjC C++ ObjC++ Var(warn_char_subscripts) Warning LangEnabledBy(C ObjC C++ Obj
 Warn about subscripts whose type is \"char\".
 
 Wchkp
-C ObjC C++ ObjC++ Var(warn_chkp) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
+C ObjC C++ ObjC++ Var(warn_chkp) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 Wclobbered
@@ -1259,86 +1259,86 @@ C ObjC C++ ObjC++
 Where shorter, use canonicalized paths to systems headers.
 
 fcheck-pointer-bounds
-C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds)
+C ObjC C++ ObjC++ LTO Report Var(flag_check_pointer_bounds) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-check-incomplete-type
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_incomplete_type) Init(1)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_incomplete_type) Init(1) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-zero-input-bounds-for-main
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_zero_input_bounds_for_main) Init(0)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_zero_input_bounds_for_main) Init(0) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-first-field-has-own-bounds
-C ObjC C++ ObjC++ LTO RejectNegative Report Var(flag_chkp_first_field_has_own_bounds)
+C ObjC C++ ObjC++ LTO RejectNegative Report Var(flag_chkp_first_field_has_own_bounds) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-narrow-bounds
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_narrow_bounds) Init(1)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_narrow_bounds) Init(1) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-narrow-to-innermost-array
-C ObjC C++ ObjC++ LTO RejectNegative Report Var(flag_chkp_narrow_to_innermost_arrray)
+C ObjC C++ ObjC++ LTO RejectNegative Report Var(flag_chkp_narrow_to_innermost_arrray) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-flexible-struct-trailing-arrays
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_flexible_struct_trailing_arrays)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_flexible_struct_trailing_arrays) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-optimize
 C ObjC C++ ObjC++ LTO Report Var(flag_chkp_optimize) Init(-1)
 
 fchkp-use-fast-string-functions
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_fast_string_functions) Init(0)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_fast_string_functions) Init(0) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-use-nochk-string-functions
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_nochk_string_functions) Init(0)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_nochk_string_functions) Init(0) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-use-static-bounds
-C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_static_bounds) Init(1)
+C ObjC C++ ObjC++ LTO Report Var(flag_chkp_use_static_bounds) Init(1) Deprecated
 Deprecated in GCC 9.  This switch has no effect.
 
 fchkp-use-static-const-bounds
-C ObjC C++ ObjC++ LTO Report

Re: [ARM/FDPIC 13/21] [ARM] FDPIC: Support unwinding across thumb2 signal trampoline

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

Similar comments to patch 11/21

On 25/05/18 09:03, Christophe Lyon wrote:

2018-XX-XX  Christophe Lyon 
Mickaël Guêné 

libgcc/
* unwind-arm-common.inc (FDPIC_T2_LDR_R12_WITH_FUNCDESC)
(FDPIC_T2_LDR_R9_WITH_GOT, FDPIC_T2_LDR_PC_WITH_RESTORER): New.
(__gnu_personality_sigframe_fdpic): Support Thumb address.
(get_eit_entry): Support Thumb code.

Change-Id: I2bb8994e733e48a89c6f4e0682921186c086f8bc

diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc
index 80d1e88..7de4033 100644
--- a/libgcc/unwind-arm-common.inc
+++ b/libgcc/unwind-arm-common.inc
@@ -38,6 +38,9 @@
 #define FDPIC_LDR_R12_WITH_FUNCDESC 0xe59fc004
 #define FDPIC_LDR_R9_WITH_GOT   0xe59c9004
 #define FDPIC_LDR_PC_WITH_RESTORER  0xe59cf000
+#define FDPIC_T2_LDR_R12_WITH_FUNCDESC  0xc008f8df
+#define FDPIC_T2_LDR_R9_WITH_GOT   0x9004f8dc
+#define FDPIC_T2_LDR_PC_WITH_RESTORER   0xf000f8dc
 #define FDPIC_FUNCDESC_OFFSET   12
 /* Signal frame offsets.  */
 #define ARM_NEW_RT_SIGFRAME_UCONTEXT0x80
@@ -228,7 +231,7 @@ __gnu_personality_sigframe_fdpic (_Unwind_State state,
 _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, );
 _Unwind_VRS_Get (context, _UVRSC_CORE, R_PC, _UVRSD_UINT32, );

-funcdesc = *(unsigned int *)(pc + FDPIC_FUNCDESC_OFFSET);
+funcdesc = *(unsigned int *)((pc & ~1) + FDPIC_FUNCDESC_OFFSET);
 handler = *(unsigned int *)(funcdesc);
 first_handler_instruction = *(unsigned int *)(handler & ~1);

@@ -277,10 +280,13 @@ get_eit_entry (_Unwind_Control_Block *ucbp, _uw return_address)
   /* If we are unwinding a signal handler then perhaps we have
  reached a trampoline.  Try to detect jump to restorer
  sequence.  */
- _uw *pc = (_uw *)((return_address+2) & ~3);
- if (pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
- && pc[1] == FDPIC_LDR_R9_WITH_GOT
- && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+ _uw *pc = (_uw *)((return_address+2) & ~1);
+ if ((pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
+  && pc[1] == FDPIC_LDR_R9_WITH_GOT
+  && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+ || (pc[0] == FDPIC_T2_LDR_R12_WITH_FUNCDESC
+ && pc[1] == FDPIC_T2_LDR_R9_WITH_GOT
+ && pc[2] == FDPIC_T2_LDR_PC_WITH_RESTORER))
 {


This largely overwrites and extends code added in patch 11/21. Can't we just
merge the two patches into a final sane one?

Thanks,
Kyrill


   struct funcdesc_t *funcdesc = (struct funcdesc_t *)
 &__gnu_personality_sigframe_fdpic;
@@ -309,13 +315,16 @@ get_eit_entry (_Unwind_Control_Block *ucbp, _uw return_address)
   /* If we are unwinding a signal handler then perhaps we have
  reached a trampoline.  Try to detect jump to restorer
  sequence.  */
-  _uw *pc = (_uw *)((return_address+2) & ~3);
-  if (pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
- && pc[1] == FDPIC_LDR_R9_WITH_GOT
- && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+  _uw *pc = (_uw *)((return_address+2) & ~1);
+  if ((pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
+  && pc[1] == FDPIC_LDR_R9_WITH_GOT
+  && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+ || (pc[0] == FDPIC_T2_LDR_R12_WITH_FUNCDESC
+ && pc[1] == FDPIC_T2_LDR_R9_WITH_GOT
+ && pc[2] == FDPIC_T2_LDR_PC_WITH_RESTORER))
 {
- struct funcdesc_t *funcdesc = (struct funcdesc_t *)
-   &__gnu_personality_sigframe_fdpic;
+ struct funcdesc_t *funcdesc
+   = (struct funcdesc_t *) &__gnu_personality_sigframe_fdpic;

   UCB_PR_ADDR (ucbp) = funcdesc->ptr;
   UCB_PR_GOT (ucbp) = funcdesc->got;
@@ -335,13 +344,16 @@ get_eit_entry (_Unwind_Control_Block *ucbp, _uw return_address)
   /* If we are unwinding a signal handler then perhaps we have
  reached a trampoline.  Try to detect jump to restorer
  sequence.  */
-  _uw *pc = (_uw *)((return_address+2) & ~3);
-  if (pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
- && pc[1] == FDPIC_LDR_R9_WITH_GOT
- && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+  _uw *pc = (_uw *)((return_address+2) & ~1);
+  if ((pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
+  && pc[1] == FDPIC_LDR_R9_WITH_GOT
+  && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+ || (pc[0] == FDPIC_T2_LDR_R12_WITH_FUNCDESC
+ && pc[1] == FDPIC_T2_LDR_R9_WITH_GOT
+ && pc[2] == FDPIC_T2_LDR_PC_WITH_RESTORER))
 {
- struct funcdesc_t *funcdesc = (struct funcdesc_t *)
-   &__gnu_personality_sigframe_fdpic;
+ struct funcdesc_t *funcdesc
+   = (struct funcdesc_t *) &__gnu_personality_sigframe_fdpic;

   UCB_PR_ADDR (ucbp) = funcdesc->ptr;
   UCB_PR_GOT (ucbp) = funcdesc->got;
--
2.6.3
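
The check that this patch extends can be sketched in isolation as below. This
is only an illustration, not the libgcc code: the helper names
strip_thumb_bit and is_fdpic_restorer_trampoline are hypothetical, the six
opcode constants are copied from the patch, and the sketch assumes the three
instruction words can be read through a uint32_t pointer.

```c
#include <stdint.h>

/* Opcode words copied from the patch (ARM and Thumb-2 variants).  */
#define FDPIC_LDR_R12_WITH_FUNCDESC     0xe59fc004u
#define FDPIC_LDR_R9_WITH_GOT           0xe59c9004u
#define FDPIC_LDR_PC_WITH_RESTORER      0xe59cf000u
#define FDPIC_T2_LDR_R12_WITH_FUNCDESC  0xc008f8dfu
#define FDPIC_T2_LDR_R9_WITH_GOT        0x9004f8dcu
#define FDPIC_T2_LDR_PC_WITH_RESTORER   0xf000f8dcu

/* Hypothetical helper: bit 0 of a code address marks Thumb state and
   is not part of the instruction's location, so clear it before
   using the address to read memory.  */
static uintptr_t
strip_thumb_bit (uintptr_t addr)
{
  return addr & ~(uintptr_t) 1;
}

/* Hypothetical helper: return 1 if the three words at PC match either
   the ARM or the Thumb-2 jump-to-restorer sequence, 0 otherwise.  */
static int
is_fdpic_restorer_trampoline (const uint32_t *pc)
{
  int arm_match = (pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
                   && pc[1] == FDPIC_LDR_R9_WITH_GOT
                   && pc[2] == FDPIC_LDR_PC_WITH_RESTORER);
  int thumb2_match = (pc[0] == FDPIC_T2_LDR_R12_WITH_FUNCDESC
                      && pc[1] == FDPIC_T2_LDR_R9_WITH_GOT
                      && pc[2] == FDPIC_T2_LDR_PC_WITH_RESTORER);
  return arm_match || thumb2_match;
}
```

Keeping both sequences behind one predicate is one way the three copies of the
condition in get_eit_entry could share a single definition.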

[PATCH] Default special members of regex types and add noexcept

2018-06-08 Thread Jonathan Wakely


Nothing very exciting, just adding noexcept and defaulting some
members.

* include/bits/regex.h (sub_match): Add noexcept to default
constructor and length observer.
(match_results): Add noexcept to default constructor and observers
with no preconditions. Define destructor as defaulted.
(operator==, operator!=, swap): Add noexcept.
(regex_iterator): Add default member initializers and define default
constructor and destructor as defaulted. Add noexcept to equality
and dereference operators.

Tested powerpc64le-linux, committed to trunk.


commit 786d6b032ca5fb4dd0bbeaa3cf8024d04c6022ad
Author: redi 
Date:   Thu Jun 7 08:56:45 2018 +

Default special members of regex types and add noexcept

* include/bits/regex.h (sub_match): Add noexcept to default
constructor and length observer.
(match_results): Add noexcept to default constructor and observers
with no preconditions. Define destructor as defaulted.
(operator==, operator!=, swap): Add noexcept.
(regex_iterator): Add default member initializers and define default
constructor and destructor as defaulted. Add noexcept to equality
and dereference operators.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 12e830b2c68..674be9ac50c 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -868,18 +868,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 public:
   typedef typename __iter_traits::value_type   value_type;
   typedef typename __iter_traits::difference_type  difference_type;
-  typedef _BiIter iterator;
-  typedef std::basic_string string_type;
+  typedef _BiIter  iterator;
+  typedef basic_string string_type;
 
   bool matched;
 
-  constexpr sub_match() : matched() { }
+  constexpr sub_match() noexcept : matched() { }
 
   /**
* Gets the length of the matching sequence.
*/
   difference_type
-  length() const
+  length() const noexcept
   { return this->matched ? std::distance(this->first, this->second) : 0; }
 
   /**
@@ -1602,37 +1602,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @post size() returns 0 and str() returns an empty string.
*/
   explicit
-  match_results(const _Alloc& __a = _Alloc())
+  match_results(const _Alloc& __a = _Alloc()) noexcept
   : _Base_type(__a)
   { }
 
   /**
* @brief Copy constructs a %match_results.
*/
-  match_results(const match_results& __rhs) = default;
+  match_results(const match_results&) = default;
 
   /**
* @brief Move constructs a %match_results.
*/
-  match_results(match_results&& __rhs) noexcept = default;
+  match_results(match_results&&) noexcept = default;
 
   /**
* @brief Assigns rhs to *this.
*/
   match_results&
-  operator=(const match_results& __rhs) = default;
+  operator=(const match_results&) = default;
 
   /**
* @brief Move-assigns rhs to *this.
*/
   match_results&
-  operator=(match_results&& __rhs) = default;
+  operator=(match_results&&) = default;
 
   /**
* @brief Destroys a %match_results object.
*/
-  ~match_results()
-  { }
+  ~match_results() = default;
 
   //@}
 
@@ -1642,7 +1641,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @retval true   The object has a fully-established result state.
* @retval false  The object is not ready.
*/
-  bool ready() const { return !_Base_type::empty(); }
+  bool ready() const noexcept { return !_Base_type::empty(); }
 
   /**
* @name 28.10.2 Size
@@ -1659,11 +1658,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @returns the number of matches found.
*/
   size_type
-  size() const
+  size() const noexcept
   { return _Base_type::empty() ? 0 : _Base_type::size() - 3; }
 
   size_type
-  max_size() const
+  max_size() const noexcept
   { return _Base_type::max_size(); }
 
   /**
@@ -1672,7 +1671,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @retval false The %match_results object is not empty.
*/
   bool
-  empty() const
+  empty() const noexcept
   { return size() == 0; }
 
   //@}
@@ -1776,28 +1775,28 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @brief Gets an iterator to the start of the %sub_match collection.
*/
   const_iterator
-  begin() const
+  begin() const noexcept
   { return _Base_type::begin(); }
 
   /**
* @brief Gets an iterator to the start of the %sub_match collection.
*/
   const_iterator
-  cbegin() const
+  cbegin() const noexcept
   { return this->begin(); }
 
   /**
*

Re: [ARM/FDPIC 12/21] [ARM] FDPIC: Restore r9 after we call __aeabi_read_tp

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

Again, a patch description would be welcome :)

On 25/05/18 09:03, Christophe Lyon wrote:

2018-XX-XX  Christophe Lyon 
Mickaël Guêné 

gcc/
* config/arm/arm.c (arm_load_tp): Add FDPIC support.
* config/arm/arm.md (load_tp_soft_fdpic): New pattern.
(load_tp_soft): Disable in FDPIC mode.

Change-Id: I0a2e3466c9afb869ad8e844083ad178de014658e

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 20b8f66..80fe96f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8660,7 +8660,27 @@ arm_load_tp (rtx target)

   rtx tmp;

-  emit_insn (gen_load_tp_soft ());
+  if (TARGET_FDPIC)
+   {
+ rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (3));
+
+ emit_insn (gen_load_tp_soft_fdpic ());
+
+ /* Restore r9.  */
+ XVECEXP (par, 0, 0)
+   = gen_rtx_UNSPEC (VOIDmode,
+ gen_rtvec (2, gen_rtx_REG (Pmode, 9),
+ get_hard_reg_initial_val (Pmode, 9)),
+ UNSPEC_PIC_RESTORE);


At this point I think it's worth defining something like an FDPIC_REGNUM to 9
and using that rather than using the number directly. You'll want to do this
earlier in the series however, it just came to me while looking at this patch ;)


+ XVECEXP (par, 0, 1) = gen_rtx_USE (VOIDmode, gen_rtx_REG (Pmode, 9));
+ XVECEXP (par, 0, 2)
+   = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, 9));
+ emit_insn (par);
+   }
+  else
+   {
+ emit_insn (gen_load_tp_soft ());
+   }



Braces not needed.


   tmp = gen_rtx_REG (SImode, R0_REGNUM);
   emit_move_insn (target, tmp);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 78c236c..0bd0a6b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11482,12 +11482,24 @@
 )

 ;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
+(define_insn "load_tp_soft_fdpic"
+  [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
+   (clobber (reg:SI 9))
+   (clobber (reg:SI LR_REGNUM))
+   (clobber (reg:SI IP_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_SOFT_TP && TARGET_FDPIC"
+  "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
+  [(set_attr "conds" "clob")]
+)


This also needs the "branch" type like "load_tp_soft".

Thanks,
Kyrill


+
+;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
 (define_insn "load_tp_soft"
   [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
(clobber (reg:SI LR_REGNUM))
(clobber (reg:SI IP_REGNUM))
(clobber (reg:CC CC_REGNUM))]
-  "TARGET_SOFT_TP"
+  "TARGET_SOFT_TP && !TARGET_FDPIC"
   "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
   [(set_attr "conds" "clob")
(set_attr "type" "branch")]
--
2.6.3

Re: [ARM/FDPIC 11/21] [ARM] FDPIC: Add support to unwind FDPIC signal frame

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

I'll be honest, I'm not very familiar with this part of the compiler.
I'll let Ramana or Richard comment on the approach.
A description of what this patch does and how would be appreciated.

Some comments inline nevertheless :)

Thanks,
Kyrill
On 25/05/18 09:03, Christophe Lyon wrote:

2018-XX-XX  Christophe Lyon 
Mickaël Guêné 

libgcc/
* unwind-arm-common.inc (ARM_SET_R7_RT_SIGRETURN)
(THUMB2_SET_R7_RT_SIGRETURN, FDPIC_LDR_R12_WITH_FUNCDESC)
(FDPIC_LDR_R9_WITH_GOT, FDPIC_LDR_PC_WITH_RESTORER)
(FDPIC_FUNCDESC_OFFSET, ARM_NEW_RT_SIGFRAME_UCONTEXT)
(ARM_UCONTEXT_SIGCONTEXT, ARM_SIGCONTEXT_R0): New.
(__gnu_personality_sigframe_fdpic): New.
(get_eit_entry): Add FDPIC signal frame support.

Change-Id: I7f9527cc50665dd1a731b7badf71c319fb38bf57

diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc
index f5415c1..80d1e88 100644
--- a/libgcc/unwind-arm-common.inc
+++ b/libgcc/unwind-arm-common.inc
@@ -30,6 +30,21 @@
 #include 
 #endif

+#if __FDPIC__
+/* Load r7 with rt_sigreturn value.  */
+#define ARM_SET_R7_RT_SIGRETURN0xe3a070ad
+#define THUMB2_SET_R7_RT_SIGRETURN 0x07adf04f
+/* FDPIC jump to restorer sequence.  */
+#define FDPIC_LDR_R12_WITH_FUNCDESC0xe59fc004
+#define FDPIC_LDR_R9_WITH_GOT  0xe59c9004
+#define FDPIC_LDR_PC_WITH_RESTORER 0xe59cf000
+#define FDPIC_FUNCDESC_OFFSET  12
+/* Signal frame offsets.  */
+#define ARM_NEW_RT_SIGFRAME_UCONTEXT   0x80
+#define ARM_UCONTEXT_SIGCONTEXT0x14
+#define ARM_SIGCONTEXT_R0  0xc
+#endif


I think these are instruction opcodes? If so, please include their expected
disassembly in a comment next to them. That way we stand a chance of validating
whether they actually do what we want them to do.
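
As an illustration of the kind of validation meant here, the three ARM-mode
"jump to restorer" words can be decoded by hand. The sketch below is not part
of the patch: struct arm_ldr_imm and decode_arm_ldr_imm are hypothetical
names, and the decoder only understands the ARM (A32) LDR word load with an
immediate offset and no writeback. It ignores the condition field and does
not handle the Thumb-2 encodings.

```c
#include <stdint.h>

/* Fields of an ARM "ldr Rt, [Rn, #offset]" instruction.  */
struct arm_ldr_imm
{
  unsigned rt;   /* destination register number */
  unsigned rn;   /* base register number (15 is pc) */
  int offset;    /* signed byte offset */
};

/* Hypothetical helper: decode INSN if it is an ARM LDR (immediate)
   with offset addressing.  Returns 1 and fills OUT on success,
   0 if INSN has a different encoding.  */
static int
decode_arm_ldr_imm (uint32_t insn, struct arm_ldr_imm *out)
{
  /* Bits 27..25 = 010, P (24) = 1, B (22) = 0, W (21) = 0, L (20) = 1.  */
  if ((insn & 0x0f700000u) != 0x05100000u)
    return 0;

  out->rn = (insn >> 16) & 0xf;
  out->rt = (insn >> 12) & 0xf;
  out->offset = (int) (insn & 0xfff);
  if (!(insn & 0x00800000u))    /* U (23) clear means subtract.  */
    out->offset = -out->offset;
  return 1;
}
```

Decoded this way, 0xe59fc004 reads as ldr r12, [pc, #4], 0xe59c9004 as
ldr r9, [r12, #4] and 0xe59cf000 as ldr pc, [r12], while 0xe3a070ad is
rejected because it is a mov rather than a load.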


+
 /* We add a prototype for abort here to avoid creating a dependency on
target headers.  */
 extern void abort (void);
@@ -195,6 +210,46 @@ search_EIT_table (const __EIT_entry * table, int nrec, _uw return_address)
 }
 }

+#if __FDPIC__
+/* FIXME: partial support (VFP not restored) but should be sufficient
+   to allow unwinding.  */


Not a fan of these FIXMEs in patch submissions.
Is the patch incomplete?
Does the missing support not matter?
If VFP is not supported properly then we should be rejecting
building such configurations for the time being.


+static _Unwind_Reason_Code
+__gnu_personality_sigframe_fdpic (_Unwind_State state,
+   _Unwind_Control_Block *ucbp,
+   _Unwind_Context *context)
+{
+unsigned int sp;
+unsigned int pc;
+unsigned int funcdesc;
+unsigned int handler;
+unsigned int first_handler_instruction;
+int i;
+
+_Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+_Unwind_VRS_Get (context, _UVRSC_CORE, R_PC, _UVRSD_UINT32, &pc);
+
+funcdesc = *(unsigned int *)(pc + FDPIC_FUNCDESC_OFFSET);
+handler = *(unsigned int *)(funcdesc);
+first_handler_instruction = *(unsigned int *)(handler & ~1);
+
+/* Adjust SP to point to the start of registers according to
+   signal type.  */
+if (first_handler_instruction == ARM_SET_R7_RT_SIGRETURN
+   || first_handler_instruction == THUMB2_SET_R7_RT_SIGRETURN)
+   sp += ARM_NEW_RT_SIGFRAME_UCONTEXT
+ + ARM_UCONTEXT_SIGCONTEXT
+ + ARM_SIGCONTEXT_R0;
+else
+   sp += ARM_UCONTEXT_SIGCONTEXT
+ + ARM_SIGCONTEXT_R0;
+/* Restore regs saved on stack by the kernel.  */
+for (i = 0; i < 16; i++)
+   _Unwind_VRS_Set (context, _UVRSC_CORE, i, _UVRSD_UINT32, sp + 4 * i);
+
+return _URC_CONTINUE_UNWIND;
+}
+#endif
+
 /* Find the exception index table eintry for the given address.
Fill in the relevant fields of the UCB.
Returns _URC_FAILURE if an error occurred, _URC_OK on success.  */
@@ -218,6 +273,24 @@ get_eit_entry (_Unwind_Control_Block *ucbp, _uw return_address)
);
   if (!eitp)
 {
+#if __FDPIC__
+ /* If we are unwinding a signal handler then perhaps we have
+reached a trampoline.  Try to detect jump to restorer
+sequence.  */
+ _uw *pc = (_uw *)((return_address+2) & ~3);
+ if (pc[0] == FDPIC_LDR_R12_WITH_FUNCDESC
+ && pc[1] == FDPIC_LDR_R9_WITH_GOT
+ && pc[2] == FDPIC_LDR_PC_WITH_RESTORER)
+   {


As I said, I'll let Richard or Ramana comment on the approach but I don't see
any other code in this file doing such instruction matching...


+ struct funcdesc_t *funcdesc = (struct funcdesc_t *)
+   &__gnu_personality_sigframe_fdpic;
+
+ UCB_PR_ADDR (ucbp) = funcdesc->ptr;
+ UCB_PR_GOT (ucbp) = funcdesc->got;
+
+ return _URC_OK;
+   }
+#endif
   UCB_PR_ADDR (ucbp) = 0;
   return _URC_FAILURE;
 }
@@ -232,6 +305,24 @@ get_eit_entry (_Unwind_Control_Block *ucbp, _uw return_address)

   if (!eitp)

Re: [ARM/FDPIC 06/21] [ARM] FDPIC: Add support for c++ exceptions

2018-06-08 Thread Richard Earnshaw (lists)

On 08/06/18 11:15, Kyrill Tkachov wrote:
> Hi Christophe,
> 
> On 25/05/18 09:03, Christophe Lyon wrote:
>> When restoring a function address, we also have to restore the FDPIC
>> register value (r9).
>>
>> 2018-XX-XX  Christophe Lyon  
>>     Mickaël Guêné 
>>
>>     gcc/
>>     * ginclude/unwind-arm-common.h (unwinder_cache): Add reserved5
>>     field.
>>
>>     libgcc/
>>     * config/arm/linux-atomic.c (__ARM_ARCH__): Define.
>>     (__kernel_cmpxchg): Add FDPIC support.
>>     (__kernel_dmb): Likewise.
>>     (__fdpic_cmpxchg): New function.
>>     (__fdpic_dmb): New function.
>>     * config/arm/unwind-arm.h (gnu_Unwind_Find_got): New function.
>>     (_Unwind_decode_typeinfo_ptr): Add FDPIC support.
>>     * unwind-arm-common.inc (UCB_PR_GOT): New.
>>     (funcdesc_t): New struct.
>>     (get_eit_entry): Add FDPIC support.
>>     (unwind_phase2): Likewise.
>>     (unwind_phase2_forced): Likewise.
>>     (__gnu_Unwind_RaiseException): Likewise.
>>     (__gnu_Unwind_Resume): Likewise.
>>     (__gnu_Unwind_Backtrace): Likewise.
>>     * unwind-pe.h (read_encoded_value_with_base): Likewise.
>>
>>     libstdc++/
>>     * libsupc++/eh_personality.cc (get_ttype_entry): Add FDPIC
>>     support.
>>
>> Change-Id: Ic0841eb3d7bfb0b3f6d187cd52a660b8fd394d85
>>
>> diff --git a/gcc/ginclude/unwind-arm-common.h
>> b/gcc/ginclude/unwind-arm-common.h
>> index 8a1a919..150bd0f 100644
>> --- a/gcc/ginclude/unwind-arm-common.h
>> +++ b/gcc/ginclude/unwind-arm-common.h
>> @@ -91,7 +91,7 @@ extern "C" {
>>    _uw reserved2;  /* Personality routine address */
>>    _uw reserved3;  /* Saved callsite address */
>>    _uw reserved4;  /* Forced unwind stop arg */
>> - _uw reserved5;
>> + _uw reserved5;  /* Personality routine GOT value in FDPIC
>> mode.  */
>>  }
>>    unwinder_cache;
>>    /* Propagation barrier cache (valid after phase 1): */
>> diff --git a/libgcc/config/arm/linux-atomic.c
>> b/libgcc/config/arm/linux-atomic.c
>> index d334c58..a20ad94 100644
>> --- a/libgcc/config/arm/linux-atomic.c
>> +++ b/libgcc/config/arm/linux-atomic.c
>> @@ -23,13 +23,99 @@ a copy of the GCC Runtime Library Exception along
>> with this program;
>>  see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>>  . */
>>
>> +#if defined(__ARM_ARCH_2__)
>> +# define __ARM_ARCH__ 2
>> +#endif
>> +
>> +#if defined(__ARM_ARCH_3__)
>> +# define __ARM_ARCH__ 3
>> +#endif
>> +
>> +#if defined(__ARM_ARCH_3M__) || defined(__ARM_ARCH_4__) \
>> +  || defined(__ARM_ARCH_4T__)
>> +/* We use __ARM_ARCH__ set to 4 here, but in reality it's any
>> processor with
>> +   long multiply instructions.  That includes v3M.  */
>> +# define __ARM_ARCH__ 4
>> +#endif
>> +
> 
> Support for __ARM_ARCH_2__, __ARM_ARCH_3__, __ARM_ARCH_3M__ has been
> removed in GCC 9, so this code is dead.

Better still, use the ACLE pre-defines rather than the awkward GCC
versions which need updating each time a new architecture variant is added.

R.
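
A minimal sketch of that suggestion, assuming the ACLE macro __ARM_ARCH
(an integer architecture level) is what the code would test. The SKETCH_
and sketch_ names are hypothetical, and the fallback to 0 exists only so the
fragment also builds on non-Arm hosts.

```c
/* ACLE defines __ARM_ARCH as an integer architecture level on Arm
   targets.  Fall back to 0 elsewhere so this sketch builds anywhere.  */
#if defined (__ARM_ARCH)
# define SKETCH_ARM_ARCH __ARM_ARCH
#else
# define SKETCH_ARM_ARCH 0
#endif

/* Hypothetical helper: the architecture level seen at compile time.  */
static int
sketch_arm_arch (void)
{
  return SKETCH_ARM_ARCH;
}

/* Hypothetical helper: replaces a chain of __ARM_ARCH_xx__ tests such
   as the "__ARM_ARCH__ < 6" check in the quoted patch.  */
static int
sketch_arch_at_least (int level)
{
  return SKETCH_ARM_ARCH >= level;
}
```

With a single numeric macro, no list of per-variant defines has to be kept up
to date when a new architecture variant appears.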

> 
> I notice that in the removal I've missed out an occurrence of these in
> config/arm/lib1funcs.S.
> If you want to remove those occurrences as a separate patch that would
> be preapproved.
> 
> Thanks,
> Kyrill
> 
>> +#if defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5T__) \
>> +  || defined(__ARM_ARCH_5E__) || defined(__ARM_ARCH_5TE__) \
>> +  || defined(__ARM_ARCH_5TEJ__)
>> +# define __ARM_ARCH__ 5
>> +#endif
>> +
>> +#if defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
>> +  || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) \
>> +  || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) \
>> +  || defined(__ARM_ARCH_6M__)
>> +# define __ARM_ARCH__ 6
>> +#endif
>> +
>> +#if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
>> +  || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
>> +  || defined(__ARM_ARCH_7EM__)
>> +# define __ARM_ARCH__ 7
>> +#endif
>> +
>> +#ifndef __ARM_ARCH__
>> +#error Unable to determine architecture.
>> +#endif
>> +
>>  /* Kernel helper for compare-and-exchange.  */
>>  typedef int (__kernel_cmpxchg_t) (int oldval, int newval, int *ptr);
>> +#if __FDPIC__
>> +#define __kernel_cmpxchg __fdpic_cmpxchg
>> +#else
>>  #define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)
>> +#endif
>>
>>  /* Kernel helper for memory barrier.  */
>>  typedef void (__kernel_dmb_t) (void);
>> +#if __FDPIC__
>> +#define __kernel_dmb __fdpic_dmb
>> +#else
>>  #define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
>> +#endif
>> +
>> +#if __FDPIC__
>> +static int __fdpic_cmpxchg (int oldval, int newval, int *ptr)
>> +{
>> +#if __ARM_ARCH__ < 6
>> +  #error architecture support not yet implemented
>> +  /* Use swap instruction (but is it always safe ? (interrupt?))  */
>> +#else
>> +  int result;
>> +
>> +  asm volatile ("1: ldrex r3, [%[ptr]]\n\t"
>> +   "subs  r3, r3, %[oldval]\n\t"
>> +

Re: [ARM/FDPIC 10/21] [ARM] FDPIC: Implement legitimize_tls_address_fdpic

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

On 25/05/18 09:03, Christophe Lyon wrote:

Support additional relocations: TLS_GD32_FDPIC, TLS_LDM32_FDPIC, and
TLS_IE32_FDPIC.

We do not support the GNU2 TLS dialect.

2018-XX-XX  Christophe Lyon  
Mickaël Guêné 

gcc/
* config/arm/arm.c (tls_reloc): Add TLS_GD32_FDPIC,
TLS_LDM32_FDPIC and TLS_IE32_FDPIC.
(arm_call_tls_get_addr_fdpic): New.
(legitimize_tls_address_fdpic): New.
(legitimize_tls_address_not_fdpic): New.
(legitimize_tls_address): Add FDPIC support.
(arm_emit_tls_decoration): Likewise.

Change-Id: I4ea5034ff654540c4658d0a79fb92f70550cdf4a

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 2434602..20b8f66 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2373,9 +2373,12 @@ char arm_arch_name[] = "__ARM_ARCH_PROFILE__";

 enum tls_reloc {
   TLS_GD32,
+  TLS_GD32_FDPIC,
   TLS_LDM32,
+  TLS_LDM32_FDPIC,
   TLS_LDO32,
   TLS_IE32,
+  TLS_IE32_FDPIC,
   TLS_LE32,
   TLS_DESCSEQ  /* GNU scheme */
 };
@@ -8681,6 +8684,30 @@ load_tls_operand (rtx x, rtx reg)
 }

 static rtx_insn *
+arm_call_tls_get_addr_fdpic (rtx x, rtx reg, rtx *valuep, int reloc)
+{


Please add a function comment


+  rtx sum;
+
+  gcc_assert (reloc != TLS_DESCSEQ);
+  start_sequence ();
+
+  sum = gen_rtx_UNSPEC (Pmode,
+   gen_rtvec (2, x, GEN_INT (reloc)),
+   UNSPEC_TLS);
+  reg = load_tls_operand (sum, reg);
+  emit_insn (gen_addsi3 (reg, reg, gen_rtx_REG (Pmode, 9)));
+
+  *valuep = emit_library_call_value (get_tls_get_addr (), NULL_RTX,
+LCT_PURE, /* LCT_CONST?  */
+Pmode, reg, Pmode);


I prefer to avoid comments with such question marks. Is there some ambiguity
on which should be used?


+
+  rtx_insn *insns = get_insns ();
+  end_sequence ();
+
+  return insns;
+}
+
+static rtx_insn *
 arm_call_tls_get_addr (rtx x, rtx reg, rtx *valuep, int reloc)
 {
   rtx label, labelno, sum;
@@ -8736,8 +8763,84 @@ arm_tls_descseq_addr (rtx x, rtx reg)
   return reg;
 }

-rtx
-legitimize_tls_address (rtx x, rtx reg)
+static rtx
+legitimize_tls_address_fdpic (rtx x, rtx reg)
+{


Please add a function comment, even if it's just a simple one.


+rtx dest, ret, eqv, addend, sum, tp;
+rtx_insn *insns;
+unsigned int model = SYMBOL_REF_TLS_MODEL (x);
+
+switch (model)
+  {
+  case TLS_MODEL_GLOBAL_DYNAMIC:
+   if (TARGET_GNU2_TLS)
+ {
+   abort ();
+ }


Use gcc_unreachable ().


+   else
+ {
+   /* Original scheme.  */
+   insns = arm_call_tls_get_addr_fdpic (x, reg, &ret, TLS_GD32_FDPIC);
+   dest = gen_reg_rtx (Pmode);
+   emit_libcall_block (insns, dest, ret, x);
+ }
+   return dest;
+   break;
+
+  case TLS_MODEL_LOCAL_DYNAMIC:
+   if (TARGET_GNU2_TLS)
+ {
+   abort ();
+ }


Likewise.


+   else
+ {
+   insns = arm_call_tls_get_addr_fdpic (x, reg, &ret, TLS_LDM32_FDPIC);
+   /* Attach a unique REG_EQUIV, to allow the RTL optimizers to
+  share the LDM result with other LD model accesses.  */
+   eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const1_rtx),
+ UNSPEC_TLS);
+   dest = gen_reg_rtx (Pmode);
+   emit_libcall_block (insns, dest, ret, eqv);
+
+   /* Load the addend.  */
+   addend = gen_rtx_UNSPEC (Pmode,
+gen_rtvec (2, x, GEN_INT (TLS_LDO32)),
+UNSPEC_TLS);
+   addend = force_reg (SImode, gen_rtx_CONST (SImode, addend));


A nit, I'm guessing, but I think this SImode should be Pmode.


+   dest = gen_rtx_PLUS (Pmode, dest, addend);
+ }
+   return dest;
+   break;
+
+  case TLS_MODEL_INITIAL_EXEC:
+   sum = gen_rtx_UNSPEC (Pmode,
+ gen_rtvec (2, x, GEN_INT (TLS_IE32_FDPIC)),
+ UNSPEC_TLS);
+   reg = load_tls_operand (sum, reg);
+   /* FIXME: optimize below? */


Not a fan of such FIXMEs. Let's either optimise it now or leave the comment out.


+   emit_insn (gen_addsi3 (reg, reg, gen_rtx_REG (Pmode, 9)));
+   emit_insn (gen_movsi (reg, gen_rtx_MEM (Pmode, reg)));


You can use the shorter emit_move_insn (). I think there are other places in
the series where you can do that as well.


+   tp = arm_load_tp (NULL_RTX);
+
+   return gen_rtx_PLUS (Pmode, tp, reg);
+   break;
+
+  case TLS_MODEL_LOCAL_EXEC:
+   tp = arm_load_tp (NULL_RTX);
+   reg = gen_rtx_UNSPEC (Pmode,
+ gen_rtvec (2, x, GEN_INT (TLS_LE32)),
+ UNSPEC_TLS);
+   reg = force_reg (SImode, gen_rtx_CONST (SImode, reg));
+
+   return gen_rtx_PLUS (Pmode, tp, reg);
+
+  default:
+   abort ();
+  }
+}
+
+static rtx

Re: [ARM/FDPIC 09/21] [ARM] FDPIC: Add support for taking address of nested function

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

Could you please provide a description of what this patch does and how it
achieves that?

On 25/05/18 09:03, Christophe Lyon wrote:

2018-XX-XX  Christophe Lyon 
Mickaël Guêné 

gcc/
* config/arm/arm.c (arm_asm_trampoline_template): Add FDPIC
support.
(arm_trampoline_init): Likewise.
(arm_trampoline_init): Likewise.
* config/arm/arm.h (TRAMPOLINE_SIZE): Likewise.

Change-Id: I4b5127261a9aefa0f0318f110574ec07a856aeb1

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 025485d..2434602 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3967,7 +3967,27 @@ arm_asm_trampoline_template (FILE *f)
 {
   fprintf (f, "\t.syntax unified\n");

-  if (TARGET_ARM)
+  if (TARGET_FDPIC)
+{
+  /* The first two words are a function descriptor to jump into
+the trampoline code just below.  */
+  if (TARGET_ARM) fprintf (f, "\t.arm\n");
+  else if (TARGET_THUMB2) fprintf (f, "\t.thumb\n");
+  else fprintf (f, "\t.code\t16\n");
+


Please format this in GNU style.


+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  /* Trampoline code which sets the static chain register but also
+PIC register before jumping into real code.  */
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  STATIC_CHAIN_REGNUM, PC_REGNUM, TARGET_THUMB2?8:4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PIC_OFFSET_TABLE_REGNUM, PC_REGNUM, TARGET_THUMB2?8:4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PC_REGNUM, PC_REGNUM, TARGET_THUMB2?8:4);


Likewise.
Also, this will use offset 4 for TARGET_THUMB1. Given that you handle
TARGET_THUMB1 in the if statement above, do you expect this code to be
reachable for TARGET_THUMB1?

Thanks,
Kyrill


+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+}
+  else if (TARGET_ARM)
 {
   fprintf (f, "\t.arm\n");
   asm_fprintf (f, "\tldr\t%r, [%r, #0]\n", STATIC_CHAIN_REGNUM, PC_REGNUM);
@@ -4008,12 +4028,37 @@ arm_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   emit_block_move (m_tramp, assemble_trampoline_template (),
GEN_INT (TRAMPOLINE_SIZE), BLOCK_OP_NORMAL);

-  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 8 : 12);
-  emit_move_insn (mem, chain_value);
+  if (TARGET_FDPIC)
+{
+  rtx funcdesc = XEXP (DECL_RTL (fndecl), 0);
+  rtx fnaddr = gen_rtx_MEM (Pmode, funcdesc);
+  rtx gotaddr = gen_rtx_MEM (Pmode, plus_constant (Pmode, funcdesc, 4));
+  rtx trampolineCodeStart
+   = plus_constant (Pmode, XEXP (m_tramp, 0), TARGET_THUMB2 ? 9 : 8);
+
+  /* Write initial funcdesc which will jump into trampoline.  */
+  mem = adjust_address (m_tramp, SImode, 0);
+  emit_move_insn (mem, trampolineCodeStart);
+  mem = adjust_address (m_tramp, SImode, 4);
+  emit_move_insn (mem, gen_rtx_REG (Pmode, PIC_OFFSET_TABLE_REGNUM));
+  /* Setup static chain.  */
+  mem = adjust_address (m_tramp, SImode, 20);
+  emit_move_insn (mem, chain_value);
+  /* GOT + real function entry point.  */
+  mem = adjust_address (m_tramp, SImode, 24);
+  emit_move_insn (mem, gotaddr);
+  mem = adjust_address (m_tramp, SImode, 28);
+  emit_move_insn (mem, fnaddr);
+}
+  else
+{
+  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 8 : 12);
+  emit_move_insn (mem, chain_value);

-  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 12 : 16);
-  fnaddr = XEXP (DECL_RTL (fndecl), 0);
-  emit_move_insn (mem, fnaddr);
+  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 12 : 16);
+  fnaddr = XEXP (DECL_RTL (fndecl), 0);
+  emit_move_insn (mem, fnaddr);
+}

   a_tramp = XEXP (m_tramp, 0);
   emit_library_call (gen_rtx_SYMBOL_REF (Pmode, "__clear_cache"),
@@ -4027,7 +4072,9 @@ arm_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 static rtx
 arm_trampoline_adjust_address (rtx addr)
 {
-  if (TARGET_THUMB)
+  /* For FDPIC don't fix trampoline address since it's a function
+ descriptor and not a function address.  */
+  if (TARGET_THUMB && !TARGET_FDPIC)
 addr = expand_simple_binop (Pmode, IOR, addr, const1_rtx,
 NULL, 0, OPTAB_LIB_WIDEN);
   return addr;
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index e8ef439..db17ef2 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1578,7 +1578,7 @@ typedef struct
 #define INIT_EXPANDERS  arm_init_expanders ()

 /* Length in units of the trampoline for entering a nested function.  */
-#define TRAMPOLINE_SIZE  (TARGET_32BIT ? 16 : 20)
+#define TRAMPOLINE_SIZE  (TARGET_FDPIC ? 32 : (TARGET_32BIT ? 16 : 20))

 /* Alignment required for a trampoline in bits.  */
 #define TRAMPOLINE_ALIGNMENT  32
--
2.6.3
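
The offsets used in arm_trampoline_init (20, 24 and 28) can be cross-checked
against the PC-relative loads in the template with a little arithmetic. The
sketch below is an illustration only: the enum and function names are
hypothetical, and it assumes that the trampoline is 4-byte aligned, that the
code starts at offset 8 after the two-word function descriptor, that each ldr
occupies 4 bytes (including the Thumb-2 ones), and that reading PC yields the
instruction address plus 8 in ARM state and plus 4 in Thumb state.

```c
enum tramp_mode { TRAMP_ARM, TRAMP_THUMB2 };

/* Hypothetical helper: byte offset, from the start of the trampoline,
   of the word fetched by the INDEX-th (0, 1 or 2) ldr of the FDPIC
   trampoline code.  */
static unsigned
fdpic_tramp_load_offset (enum tramp_mode mode, unsigned index)
{
  /* The code follows the two-word descriptor; each ldr is 4 bytes.  */
  unsigned insn_offset = 8 + 4 * index;
  /* How far ahead of the instruction the PC value reads.  */
  unsigned pc_bias = (mode == TRAMP_ARM) ? 8 : 4;
  /* Immediate from the template: TARGET_THUMB2 ? 8 : 4.  */
  unsigned imm = (mode == TRAMP_ARM) ? 4 : 8;

  return insn_offset + pc_bias + imm;
}
```

Under these assumptions both modes load the static chain from offset 20, the
GOT value from offset 24 and the function address from offset 28, which
agrees with the stores in arm_trampoline_init and with a TRAMPOLINE_SIZE of
32.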

Re: [ARM/FDPIC 04/21] [ARM] FDPIC: Add support for FDPIC for arm architecture

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

On 25/05/18 09:03, Christophe Lyon wrote:

The FDPIC register is hard-coded to r9, as defined in the ABI.

We have to disable tailcall optimizations if we don't know if the
target function is in the same module. If not, we have to set r9 to
the value associated with the target module.

When generating a symbol address, we have to take into account whether
it is a pointer to data or to a function, because different
relocations are needed.

2018-XX-XX  Christophe Lyon  
Mickaël Guêné 

* config/arm/arm-c.c (__FDPIC__): Define new pre-processor macro
in FDPIC mode.
* config/arm/arm-protos.h (arm_load_function_descriptor): Declare
new function.
* config/arm/arm.c (arm_option_override): Define pic register to
r9.
(arm_function_ok_for_sibcall) Disable sibcall optimization if we
have no decl or go through PLT.
(arm_load_pic_register): Handle TARGET_FDPIC.
(arm_is_segment_info_known): New function.
(arm_pic_static_addr): Add support for FDPIC.
(arm_load_function_descriptor): New function.
(arm_assemble_integer): Add support for FDPIC.
* config/arm/arm.h (PIC_OFFSET_TABLE_REG_CALL_CLOBBERED): Define.
* config/arm/arm.md (call): Add support for FDPIC.
(call_value): Likewise.
(*restore_pic_register_after_call): New pattern.
(untyped_call): Disable if FDPIC.
(untyped_return): Likewise.
* config/arm/unspecs.md (UNSPEC_PIC_RESTORE): New.

Change-Id: Icee8484772f97ac6f3a9574df4aa4f25a8196786

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 4471f79..90733cc 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -202,6 +202,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   builtin_define ("__ARM_EABI__");
 }

+  def_or_undef_macro (pfile, "__FDPIC__", TARGET_FDPIC);
+
   def_or_undef_macro (pfile, "__ARM_ARCH_EXT_IDIV__", TARGET_IDIV);
   def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 8537262..edebeb7 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -134,6 +134,7 @@ extern int arm_max_const_double_inline_cost (void);
 extern int arm_const_double_inline_cost (rtx);
 extern bool arm_const_double_by_parts (rtx);
 extern bool arm_const_double_by_immediates (rtx);
+extern rtx arm_load_function_descriptor (rtx funcdesc);
 extern void arm_emit_call_insn (rtx, rtx, bool);
 bool detect_cmse_nonsecure_call (tree);
 extern const char *output_call (rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4a5da7e..56670e3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3475,6 +3475,12 @@ arm_option_override (void)
   if (flag_pic && TARGET_VXWORKS_RTP)
 arm_pic_register = 9;

+  /* If in FDPIC mode then force arm_pic_register to be r9.  */
+  if (TARGET_FDPIC)
+{
+  arm_pic_register = 9;
+}
+


Leave out the braces.
Also, I believe you'll want to add option checking for TARGET_FDPIC.
Your cover letter says that this series supports Armv7. So you should add
checks in arm_option_override to error out on unsupported configurations
appropriately (pre-Armv7? TARGET_THUMB1? float-abi configurations?)

Thanks,
Kyrill


   if (arm_pic_register_string != NULL)
 {
   int pic_register = decode_reg_name (arm_pic_register_string);
@@ -7256,6 +7262,21 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
   if (cfun->machine->sibcall_blocked)
 return false;

+  if (TARGET_FDPIC)
+{
+  /* In FDPIC, never tailcall something for which we have no decl:
+the target function could be in a different module, requiring
+a different r9 value.  */
+  if (decl == NULL)
+   return false;
+
+  /* Don't tailcall if we go through the PLT since r9 is then
+corrupted and we don't restore it for static function
+call.  */
+  if (!targetm.binds_local_p (decl))
+   return false;
+}
+
   /* Never tailcall something if we are generating code for Thumb-1.  */
   if (TARGET_THUMB1)
 return false;
@@ -7634,7 +7655,9 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED)
 {
   rtx l1, labelno, pic_tmp, pic_rtx, pic_reg;

-  if (crtl->uses_pic_offset_table == 0 || TARGET_SINGLE_PIC_BASE)
+  if (crtl->uses_pic_offset_table == 0
+  || TARGET_SINGLE_PIC_BASE
+  || TARGET_FDPIC)
 return;

   gcc_assert (flag_pic);
@@ -7702,28 +7725,167 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED)
   emit_use (pic_reg);
 }

+/* Try to know if the object will go in text or data segment.  */
+static bool
+arm_is_segment_info_known (rtx orig, bool *is_readonly)
+{
+  bool res = false;
+
+  *is_readonly = false;
+
+  if (GET_CODE (orig) == LABEL_REF)
+{
+  res = true;
+  *is_readonly = true;
+}
+  else if (GET_CODE (orig) == SYMBOL_REF)
+{
+  if

[PATCH] i386; Add indirect_return function attribute

2018-06-08 Thread H.J. Lu

On x86, swapcontext may return via indirect branch when shadow stack
is enabled.  To support code instrumentation of control-flow transfers
with -fcf-protection, add indirect_return function attribute to inform
compiler that a function may return via indirect branch.

Note: Unlike setjmp, swapcontext only returns once.  Marking it as
returning twice would unnecessarily disable compiler optimizations.

OK for trunk?

H.J.

gcc/

PR target/85620
* config/i386/i386.c (rest_of_insert_endbranch): Also generate
ENDBRANCH for non-tail call which may return via indirect branch.
* doc/extend.texi: Document indirect_return attribute.

gcc/testsuite/

PR target/85620
* gcc.target/i386/pr85620-1.c: New test.
* gcc.target/i386/pr85620-2.c: Likewise.
---
 gcc/config/i386/i386.c                    | 23 ++-
 gcc/doc/extend.texi                       |  6 ++
 gcc/testsuite/gcc.target/i386/pr85620-1.c | 15 +++
 gcc/testsuite/gcc.target/i386/pr85620-2.c | 13 +
 4 files changed, 56 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85620-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b95f0612562..3fb79178138 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2625,7 +2625,26 @@ rest_of_insert_endbranch (void)
{
  if (CALL_P (insn))
{
- if (find_reg_note (insn, REG_SETJMP, NULL) == NULL)
+ bool need_endbr;
+ need_endbr = find_reg_note (insn, REG_SETJMP, NULL) != NULL;
+ if (!need_endbr && !SIBLING_CALL_P (insn))
+   {
+ rtx call = get_call_rtx_from (insn);
+ rtx fnaddr = XEXP (call, 0);
+
+ /* Also generate ENDBRANCH for non-tail call which
+may return via indirect branch.  */
+ if (MEM_P (fnaddr)
+ && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF)
+   {
+ tree fndecl = SYMBOL_REF_DECL (XEXP (fnaddr, 0));
+ if (fndecl
+ && lookup_attribute ("indirect_return",
+  DECL_ATTRIBUTES (fndecl)))
+   need_endbr = true;
+   }
+   }
+ if (!need_endbr)
continue;
  /* Generate ENDBRANCH after CALL, which can return more than
 twice, setjmp-like functions.  */
@@ -46769,6 +46788,8 @@ static const struct attribute_spec ix86_attribute_table[] =
 ix86_handle_fndecl_attribute, NULL },
   { "function_return", 1, 1, true, false, false, false,
 ix86_handle_fndecl_attribute, NULL },
+  { "indirect_return", 0, 0, true, false, false, false,
+ix86_handle_fndecl_attribute, NULL },
 
   /* End element.  */
   { NULL, 0, 0, false, false, false, false, NULL, NULL }
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3e6c98a554a..ddd50b0da3e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5878,6 +5878,12 @@ foo (void)
 @}
 @end smallexample
 
+@item indirect_return
+@cindex @code{indirect_return} function attribute, x86
+
+The @code{indirect_return} attribute on a function is used to inform
+the compiler that the function may return via indirect branch.
+
 @end table
 
 On the x86, the inliner does not inline a
diff --git a/gcc/testsuite/gcc.target/i386/pr85620-1.c b/gcc/testsuite/gcc.target/i386/pr85620-1.c
new file mode 100644
index 000..32efb08e59e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr85620-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */
+/* { dg-final { scan-assembler-times {\mendbr} 2 } } */
+
+struct ucontext;
+
+extern int bar (struct ucontext *) __attribute__((__indirect_return__));
+
+extern int res;
+
+void
+foo (struct ucontext *oucp)
+{
+  res = bar (oucp);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr85620-2.c b/gcc/testsuite/gcc.target/i386/pr85620-2.c
new file mode 100644
index 000..b2e680fa1fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr85620-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcf-protection" } */
+/* { dg-final { scan-assembler-times {\mendbr} 1 } } */
+
+struct ucontext;
+
+extern int bar (struct ucontext *) __attribute__((__indirect_return__));
+
+int
+foo (struct ucontext *oucp)
+{
+  return bar (oucp);
+}
-- 
2.17.1
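
For readers who want to see how the proposed attribute would look in user
code, here is a minimal sketch.  It assumes a compiler that may or may not
know indirect_return, so the attribute is applied through a guard macro.
MAY_RETURN_INDIRECTLY, struct ctx_sketch, switch_ctx_sketch and
resume_sketch are all hypothetical names used only for illustration; a real
context-switch routine would normally live in assembly and only be declared
in C.

```c
#include <assert.h>

/* Apply the attribute only where the compiler reports support for it;
   elsewhere the macro expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(indirect_return)
#  define MAY_RETURN_INDIRECTLY __attribute__((indirect_return))
# endif
#endif
#ifndef MAY_RETURN_INDIRECTLY
# define MAY_RETURN_INDIRECTLY
#endif

/* Hypothetical context type, standing in for struct ucontext.  */
struct ctx_sketch { int id; };

/* Hypothetical swapcontext-like routine.  This C body is only a
   placeholder that copies the context id and returns normally.  */
MAY_RETURN_INDIRECTLY int
switch_ctx_sketch (struct ctx_sketch *from, const struct ctx_sketch *to)
{
  from->id = to->id;
  return 0;
}

/* A non-tail call to the attributed function: per the patch description,
   this is the kind of call site that would get an ENDBRANCH after it
   when compiling with -fcf-protection.  */
int
resume_sketch (struct ctx_sketch *cur, const struct ctx_sketch *next)
{
  int rc = switch_ctx_sketch (cur, next);
  assert (rc == 0);
  return cur->id;
}
```

The guard keeps the sketch buildable on compilers and targets that do not
implement the attribute.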

Re: [ARM/FDPIC 08/21] [ARM] FDPIC: Fix corner case of weak symbol

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

On 25/05/18 09:03, Christophe Lyon wrote:

When checking the address of a weak symbol in an executable, it is
mandatory to use the GOTFUNCDESC scheme so that the address==NULL
semantic is valid if the symbol is not present in the final link.
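
The address==NULL idiom this paragraph refers to can be sketched in plain C
as follows.  This is only an illustration of the source-level pattern,
assuming an ELF target with a GNU-style toolchain where an undefined weak
symbol resolves to a null address; optional_hook_sketch and
call_hook_or_default are hypothetical names, and the hook is assumed not to
be defined anywhere in the link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional function.  Declared weak, so the link succeeds
   even when no definition is present.  */
extern int optional_hook_sketch (int) __attribute__ ((weak));

/* Call the hook when it was linked in, otherwise fall back to -1.
   The comparison against NULL is the check that must keep working
   for weak symbols.  */
int
call_hook_or_default (int x)
{
  if (optional_hook_sketch != NULL)
    return optional_hook_sketch (x);
  return -1;
}
```

With no definition of the hook in the final link, the function is expected
to take the fallback path.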

2018-XX-XX  Christophe Lyon  
Mickaël Guêné 

gcc/
* config/arm/arm.c (arm_local_funcdesc_p): New function.
(legitimize_pic_address): Handle weak symbols in FDPIC mode.
(arm_assemble_integer): Likewise.

Change-Id: I3fa0b63bc0f672903f405aa72cc46052de1c0feb

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index bbf8884..025485d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3771,6 +3771,46 @@ arm_options_perform_arch_sanity_checks (void)
 }
 }

+/* Test whether a local function descriptor is canonical, i.e.,
+   whether we can use GOTOFFFUNCDESC to compute the address of the
+   function.  */
+static bool
+arm_local_funcdesc_p (rtx fnx)
+{
+  tree fn;
+  enum symbol_visibility vis;
+  bool ret;
+
+  if (!TARGET_FDPIC)
+return TRUE;
+
+  if (! SYMBOL_REF_LOCAL_P (fnx))
+return FALSE;
+
+  fn = SYMBOL_REF_DECL (fnx);
+
+  if (! fn)
+return FALSE;
+
+  /* Local function declared as weak must use GOTFUNCDESC so their
+ FUNCDESC is NULL if they are not linked in.  */
+  if (DECL_WEAK(fn))
+return FALSE;
+
+  vis = DECL_VISIBILITY (fn);
+
+  if (vis == VISIBILITY_PROTECTED)
+/* Private function descriptors for protected functions are not
+   canonical.  Temporarily change the visibility to global.  */
+vis = VISIBILITY_DEFAULT;
+
+  ret = default_binds_local_p_1 (fn, flag_pic);
+
+  DECL_VISIBILITY (fn) = vis;
+


I'm a bit suspicious of the above few lines. Does this function end up
changing the visibility of 'fn'? If so, this function is not a pure
predicate function and should be documented that it modifies some part
of its argument. Or did you mean to change the visibility before the
call to default_binds_local_p_1 and restore it afterwards?

Thanks,
Kyrill
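
To make the second reading concrete, here is a minimal sketch of the
"override, query, restore" pattern on toy types.  Nothing here is GCC
internals: struct decl_sketch, binds_local_sketch and
local_funcdesc_sketch_p are hypothetical stand-ins, and the binding rule is
deliberately simplistic.

```c
#include <assert.h>
#include <stdbool.h>

enum vis_sketch { VIS_DEFAULT, VIS_PROTECTED, VIS_HIDDEN };

/* Toy declaration record.  */
struct decl_sketch
{
  enum vis_sketch vis;
  bool is_weak;
};

/* Toy binding query: anything without default visibility binds locally.  */
static bool
binds_local_sketch (const struct decl_sketch *d)
{
  return d->vis != VIS_DEFAULT;
}

/* Predicate that treats protected declarations as default-visible for
   the duration of the query, then restores the saved visibility so the
   caller's object is left unchanged.  */
bool
local_funcdesc_sketch_p (struct decl_sketch *d)
{
  if (d->is_weak)
    return false;

  enum vis_sketch saved = d->vis;
  if (saved == VIS_PROTECTED)
    d->vis = VIS_DEFAULT;	/* temporary override */

  bool ret = binds_local_sketch (d);

  d->vis = saved;		/* restore before returning */
  return ret;
}
```

Written this way the function has no lasting effect on its argument.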


+  return ret;
+}
+
 static void
 arm_add_gc_roots (void)
 {
@@ -7488,7 +7528,9 @@ legitimize_pic_address (rtx orig, machine_mode mode, rtx reg)
|| (GET_CODE (orig) == SYMBOL_REF
&& SYMBOL_REF_LOCAL_P (orig)
&& (SYMBOL_REF_DECL (orig)
-  ? !DECL_WEAK (SYMBOL_REF_DECL (orig)) : 1)))
+  ? !DECL_WEAK (SYMBOL_REF_DECL (orig)) : 1)
+  && (!SYMBOL_REF_FUNCTION_P(orig)
+  || arm_local_funcdesc_p (orig
   && NEED_GOT_RELOC
   && arm_pic_data_is_text_relative)
 insn = arm_pic_static_addr (orig, reg);
@@ -23072,7 +23114,9 @@ arm_assemble_integer (rtx x, unsigned int size, int aligned_p)
   || (GET_CODE (x) == SYMBOL_REF
   && (!SYMBOL_REF_LOCAL_P (x)
   || (SYMBOL_REF_DECL (x)
- ? DECL_WEAK (SYMBOL_REF_DECL (x)) : 0
+ ? DECL_WEAK (SYMBOL_REF_DECL (x)) : 0)
+ || (SYMBOL_REF_FUNCTION_P (x)
+ && !arm_local_funcdesc_p (x)
 {
   if (TARGET_FDPIC && SYMBOL_REF_FUNCTION_P (x))
 fputs ("(GOTFUNCDESC)", asm_out_file);
--
2.6.3

Re: [ARM/FDPIC 06/21] [ARM] FDPIC: Add support for c++ exceptions

2018-06-08 Thread Kyrill Tkachov


Hi Christophe,

On 25/05/18 09:03, Christophe Lyon wrote:

When restoring a function address, we also have to restore the FDPIC
register value (r9).
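
As a rough illustration of why an address alone is not enough, the sketch
below models a function descriptor as a pair of entry point and GOT value.
It is not the layout used by the patch: struct funcdesc_sketch, fake_r9 and
call_via_descriptor are hypothetical, and a plain global variable stands in
for the r9 register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor: where to jump, and which GOT value the
   callee expects to find in the FDPIC register.  */
struct funcdesc_sketch
{
  int (*entry) (int);
  uintptr_t got;
};

/* Stand-in for r9.  */
static uintptr_t fake_r9;

/* Example callee whose result depends on the "register" contents.  */
static int
callee_sketch (int x)
{
  return x + (int) fake_r9;
}

/* Install the callee's GOT value, call through the descriptor, then
   put the caller's value back.  */
int
call_via_descriptor (const struct funcdesc_sketch *fd, int arg)
{
  uintptr_t saved = fake_r9;
  fake_r9 = fd->got;
  int ret = fd->entry (arg);
  fake_r9 = saved;
  return ret;
}

/* Build a descriptor for the example callee with the given GOT value.  */
struct funcdesc_sketch
make_descriptor_sketch (uintptr_t got)
{
  struct funcdesc_sketch fd = { callee_sketch, got };
  return fd;
}
```

Code that caches a routine's address for later use, as an unwinder does
with a personality routine, would need to keep the second word as well.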

2018-XX-XX  Christophe Lyon  
Mickaël Guêné 

gcc/
* ginclude/unwind-arm-common.h (unwinder_cache): Add reserved5
field.

libgcc/
* config/arm/linux-atomic.c (__ARM_ARCH__): Define.
(__kernel_cmpxchg): Add FDPIC support.
(__kernel_dmb): Likewise.
(__fdpic_cmpxchg): New function.
(__fdpic_dmb): New function.
* config/arm/unwind-arm.h (gnu_Unwind_Find_got): New function.
(_Unwind_decode_typeinfo_ptr): Add FDPIC support.
* unwind-arm-common.inc (UCB_PR_GOT): New.
(funcdesc_t): New struct.
(get_eit_entry): Add FDPIC support.
(unwind_phase2): Likewise.
(unwind_phase2_forced): Likewise.
(__gnu_Unwind_RaiseException): Likewise.
(__gnu_Unwind_Resume): Likewise.
(__gnu_Unwind_Backtrace): Likewise.
* unwind-pe.h (read_encoded_value_with_base): Likewise.

libstdc++/
* libsupc++/eh_personality.cc (get_ttype_entry): Add FDPIC
support.

Change-Id: Ic0841eb3d7bfb0b3f6d187cd52a660b8fd394d85

diff --git a/gcc/ginclude/unwind-arm-common.h b/gcc/ginclude/unwind-arm-common.h
index 8a1a919..150bd0f 100644
--- a/gcc/ginclude/unwind-arm-common.h
+++ b/gcc/ginclude/unwind-arm-common.h
@@ -91,7 +91,7 @@ extern "C" {
   _uw reserved2;  /* Personality routine address */
   _uw reserved3;  /* Saved callsite address */
   _uw reserved4;  /* Forced unwind stop arg */
- _uw reserved5;
+ _uw reserved5;  /* Personality routine GOT value in FDPIC mode.  */
 }
   unwinder_cache;
   /* Propagation barrier cache (valid after phase 1): */
diff --git a/libgcc/config/arm/linux-atomic.c b/libgcc/config/arm/linux-atomic.c
index d334c58..a20ad94 100644
--- a/libgcc/config/arm/linux-atomic.c
+++ b/libgcc/config/arm/linux-atomic.c
@@ -23,13 +23,99 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 . */

+#if defined(__ARM_ARCH_2__)
+# define __ARM_ARCH__ 2
+#endif
+
+#if defined(__ARM_ARCH_3__)
+# define __ARM_ARCH__ 3
+#endif
+
+#if defined(__ARM_ARCH_3M__) || defined(__ARM_ARCH_4__) \
+  || defined(__ARM_ARCH_4T__)
+/* We use __ARM_ARCH__ set to 4 here, but in reality it's any processor with
+   long multiply instructions.  That includes v3M.  */
+# define __ARM_ARCH__ 4
+#endif
+


Support for __ARM_ARCH_2__, __ARM_ARCH_3__, __ARM_ARCH_3M__ has been
removed in GCC 9 so this code is dead.

I notice that in the removal I've missed out an occurrence of these in
config/arm/lib1funcs.S.
If you want to remove those occurrences as a separate patch that would
be preapproved.

Thanks,
Kyrill


+#if defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5T__) \
+  || defined(__ARM_ARCH_5E__) || defined(__ARM_ARCH_5TE__) \
+  || defined(__ARM_ARCH_5TEJ__)
+# define __ARM_ARCH__ 5
+#endif
+
+#if defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
+  || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) \
+  || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) \
+  || defined(__ARM_ARCH_6M__)
+# define __ARM_ARCH__ 6
+#endif
+
+#if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
+  || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
+  || defined(__ARM_ARCH_7EM__)
+# define __ARM_ARCH__ 7
+#endif
+
+#ifndef __ARM_ARCH__
+#error Unable to determine architecture.
+#endif
+
 /* Kernel helper for compare-and-exchange.  */
 typedef int (__kernel_cmpxchg_t) (int oldval, int newval, int *ptr);
+#if __FDPIC__
+#define __kernel_cmpxchg __fdpic_cmpxchg
+#else
 #define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)
+#endif

 /* Kernel helper for memory barrier.  */
 typedef void (__kernel_dmb_t) (void);
+#if __FDPIC__
+#define __kernel_dmb __fdpic_dmb
+#else
 #define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
+#endif
+
+#if __FDPIC__
+static int __fdpic_cmpxchg (int oldval, int newval, int *ptr)
+{
+#if __ARM_ARCH__ < 6
+  #error architecture support not yet implemented
+  /* Use swap instruction (but is it always safe ? (interrupt?))  */
+#else
+  int result;
+
+  asm volatile ("1: ldrex r3, [%[ptr]]\n\t"
+   "subs  r3, r3, %[oldval]\n\t"
+   "itt eq\n\t"
+   "strexeq r3, %[newval], [%[ptr]]\n\t"
+   "teqeq r3, #1\n\t"
+   "it eq\n\t"
+   "beq 1b\n\t"
+   "rsbs  %[result], r3, #0\n\t"
+   : [result] "=r" (result)
+   : [oldval] "r" (oldval) , [newval] "r" (newval), [ptr] "r" (ptr)
+   : "r3");
+return result;
+#endif
+}
+
+static void __fdpic_dmb ()
+{
+#if __ARM_ARCH__ < 6
+  /* No op? Perhaps flush write buffer ?  */
+  return ;
+#else
+ #if __ARM_ARCH__ >= 7
+  asm volatile ("dmb\n\t");
+ #elif __ARM_ARCH__
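
For orientation, the calling convention that the helpers above follow (the
compare-and-exchange returns zero when the store happened and non-zero
otherwise) can be sketched portably with the GCC/Clang __atomic builtins.
This is only a sketch under that assumption and not a replacement for the
ldrex/strex sequence in the patch; cmpxchg_sketch and barrier_sketch are
hypothetical names.

```c
#include <assert.h>

/* Store NEWVAL to *PTR only if *PTR currently equals OLDVAL.
   Returns 0 when the store happened and 1 when it did not.  */
int
cmpxchg_sketch (int oldval, int newval, int *ptr)
{
  int expected = oldval;
  int swapped = __atomic_compare_exchange_n (ptr, &expected, newval,
					     0 /* strong */,
					     __ATOMIC_SEQ_CST,
					     __ATOMIC_SEQ_CST);
  return swapped ? 0 : 1;
}

/* Full memory barrier, playing the role of the dmb helper.  */
void
barrier_sketch (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}
```

A caller typically loops, rereading *ptr, until the helper reports
success.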

Re: [PATCH 03/14] Rename get methods in symbol-summary.h to get_create.

2018-06-08 Thread Martin Liška

On 06/08/2018 11:50 AM, Martin Jambor wrote:
> Hi,
> 
> On Thu, Jun 07 2018, Jan Hubicka wrote:
>>>
>>> gcc/ChangeLog:
>>>
>>> 2018-04-24  Martin Liska  
>>>
>>> * config/i386/i386.c (ix86_can_inline_p): Use get_create instead
>>> of get.
>>> * hsa-common.c (hsa_summary_t::link_functions): Likewise.
>>> (hsa_register_kernel): Likewise.
>>> * hsa-common.h (hsa_gpu_implementation_p): Likewise.
>>> * hsa-gen.c (hsa_get_host_function): Likewise.
>>> (get_brig_function_name): Likewise.
>>> (generate_hsa): Likewise.
>>> (pass_gen_hsail::execute): Likewise.
>>> * ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
>>> (devirtualization_time_bonus): Likewise.
>>> (ipcp_propagate_stage): Likewise.
>>> * ipa-fnsummary.c (redirect_to_unreachable): Likewise.
>>> (edge_set_predicate): Likewise.
>>> (evaluate_conditions_for_known_args): Likewise.
>>> (evaluate_properties_for_edge): Likewise.
>>> (ipa_fn_summary::reset): Likewise.
>>> (ipa_fn_summary_t::duplicate): Likewise.
>>> (dump_ipa_call_summary): Likewise.
>>> (ipa_dump_fn_summary): Likewise.
>>> (analyze_function_body): Likewise.
>>> (compute_fn_summary): Likewise.
>>> (estimate_edge_devirt_benefit): Likewise.
>>> (estimate_edge_size_and_time): Likewise.
>>> (estimate_calls_size_and_time): Likewise.
>>> (estimate_node_size_and_time): Likewise.
>>> (inline_update_callee_summaries): Likewise.
>>> (remap_edge_change_prob): Likewise.
>>> (remap_edge_summaries): Likewise.
>>> (ipa_merge_fn_summary_after_inlining): Likewise.
>>> (ipa_update_overall_fn_summary): Likewise.
>>> (read_ipa_call_summary): Likewise.
>>> (inline_read_section): Likewise.
>>> (write_ipa_call_summary): Likewise.
>>> (ipa_fn_summary_write): Likewise.
>>> (ipa_free_fn_summary): Likewise.
>>> * ipa-hsa.c (process_hsa_functions): Likewise.
>>> (ipa_hsa_write_summary): Likewise.
>>> (ipa_hsa_read_section): Likewise.
>>> * ipa-icf.c (sem_function::merge): Likewise.
>>> * ipa-inline-analysis.c (simple_edge_hints): Likewise.
>>> (do_estimate_edge_time): Likewise.
>>> (estimate_size_after_inlining): Likewise.
>>> (estimate_growth): Likewise.
>>> (growth_likely_positive): Likewise.
>>> * ipa-inline-transform.c (clone_inlined_nodes): Likewise.
>>> (inline_call): Likewise.
>>> * ipa-inline.c (caller_growth_limits): Likewise.
>>> (can_inline_edge_p): Likewise.
>>> (can_inline_edge_by_limits_p): Likewise.
>>> (compute_uninlined_call_time): Likewise.
>>> (compute_inlined_call_time): Likewise.
>>> (want_inline_small_function_p): Likewise.
>>> (edge_badness): Likewise.
>>> (update_caller_keys): Likewise.
>>> (update_callee_keys): Likewise.
>>> (recursive_inlining): Likewise.
>>> (inline_small_functions): Likewise.
>>> (inline_to_all_callers_1): Likewise.
>>> (dump_overall_stats): Likewise.
>>> (early_inline_small_functions): Likewise.
>>> (early_inliner): Likewise.
>>> * ipa-inline.h (estimate_edge_growth): Likewise.
>>> * ipa-profile.c (ipa_propagate_frequency_1): Likewise.
>>> * ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
>>> * ipa-prop.h (IPA_NODE_REF): Likewise.
>>> (IPA_EDGE_REF): Likewise.
>>> * ipa-pure-const.c (malloc_candidate_p): Likewise.
>>> (propagate_malloc): Likewise.
>>> * ipa-split.c (execute_split_functions): Likewise.
>>> * symbol-summary.h: Rename get to get_create.
>>> (get): Likewise.
>>> (get_create): Likewise.
>>> * tree-sra.c (ipa_sra_preliminary_function_checks): Likewise.
> 
> ...
> 
>>>ipa_fn_summaries->release ();
>>>ipa_fn_summaries = NULL;
>>>ipa_call_summaries->release ();
>>> diff --git a/gcc/ipa-hsa.c b/gcc/ipa-hsa.c
>>> index 1df273c7f28..90d193fe517 100644
>>> --- a/gcc/ipa-hsa.c
>>> +++ b/gcc/ipa-hsa.c
>>
>> Probably Martin Jambor can comment on ipa-hsa.
> 
> That's how it works today, so this patch does not change anything.  It
> should be easy to create much fewer summaries - this is a leftover from
> early stages of HSA BE development.  I will put it on my TODO list.

Yes, I've just discussed that with Honza on IRC. He's fine with the renaming,
followed by a step-by-step conversion to ::get in places where no insertion
is expected.
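
The distinction between the two accessors can be sketched with a toy table
in C.  This is not the symbol-summary.h interface: struct summary_sketch,
struct summary_table_sketch, table_get and table_get_create are
hypothetical, and a fixed-size array stands in for the real container.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES_SKETCH 16

/* Toy per-node summary.  */
struct summary_sketch { int size; };

/* Toy table indexed by node id; empty slots are NULL.  */
struct summary_table_sketch
{
  struct summary_sketch *slots[MAX_NODES_SKETCH];
};

/* Pure lookup: returns NULL when no summary exists and never
   allocates.  */
struct summary_sketch *
table_get (const struct summary_table_sketch *t, unsigned id)
{
  return id < MAX_NODES_SKETCH ? t->slots[id] : NULL;
}

/* Lookup that inserts a zero-initialized summary on a miss.  Returns
   NULL only for an out-of-range id or a failed allocation.  */
struct summary_sketch *
table_get_create (struct summary_table_sketch *t, unsigned id)
{
  if (id >= MAX_NODES_SKETCH)
    return NULL;
  if (t->slots[id] == NULL)
    t->slots[id] = calloc (1, sizeof (struct summary_sketch));
  return t->slots[id];
}
```

Call sites that only read, and can cope with a missing entry, are the
candidates for the non-inserting accessor.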

Martin

> 
> Martin
>

Re: [PATCH 03/14] Rename get methods in symbol-summary.h to get_create.

2018-06-08 Thread Martin Jambor

Hi,

On Thu, Jun 07 2018, Jan Hubicka wrote:
>> 
>> gcc/ChangeLog:
>> 
>> 2018-04-24  Martin Liska  
>> 
>>  * config/i386/i386.c (ix86_can_inline_p): Use get_create instead
>>  of get.
>>  * hsa-common.c (hsa_summary_t::link_functions): Likewise.
>>  (hsa_register_kernel): Likewise.
>>  * hsa-common.h (hsa_gpu_implementation_p): Likewise.
>>  * hsa-gen.c (hsa_get_host_function): Likewise.
>>  (get_brig_function_name): Likewise.
>>  (generate_hsa): Likewise.
>>  (pass_gen_hsail::execute): Likewise.
>>  * ipa-cp.c (ipcp_cloning_candidate_p): Likewise.
>>  (devirtualization_time_bonus): Likewise.
>>  (ipcp_propagate_stage): Likewise.
>>  * ipa-fnsummary.c (redirect_to_unreachable): Likewise.
>>  (edge_set_predicate): Likewise.
>>  (evaluate_conditions_for_known_args): Likewise.
>>  (evaluate_properties_for_edge): Likewise.
>>  (ipa_fn_summary::reset): Likewise.
>>  (ipa_fn_summary_t::duplicate): Likewise.
>>  (dump_ipa_call_summary): Likewise.
>>  (ipa_dump_fn_summary): Likewise.
>>  (analyze_function_body): Likewise.
>>  (compute_fn_summary): Likewise.
>>  (estimate_edge_devirt_benefit): Likewise.
>>  (estimate_edge_size_and_time): Likewise.
>>  (estimate_calls_size_and_time): Likewise.
>>  (estimate_node_size_and_time): Likewise.
>>  (inline_update_callee_summaries): Likewise.
>>  (remap_edge_change_prob): Likewise.
>>  (remap_edge_summaries): Likewise.
>>  (ipa_merge_fn_summary_after_inlining): Likewise.
>>  (ipa_update_overall_fn_summary): Likewise.
>>  (read_ipa_call_summary): Likewise.
>>  (inline_read_section): Likewise.
>>  (write_ipa_call_summary): Likewise.
>>  (ipa_fn_summary_write): Likewise.
>>  (ipa_free_fn_summary): Likewise.
>>  * ipa-hsa.c (process_hsa_functions): Likewise.
>>  (ipa_hsa_write_summary): Likewise.
>>  (ipa_hsa_read_section): Likewise.
>>  * ipa-icf.c (sem_function::merge): Likewise.
>>  * ipa-inline-analysis.c (simple_edge_hints): Likewise.
>>  (do_estimate_edge_time): Likewise.
>>  (estimate_size_after_inlining): Likewise.
>>  (estimate_growth): Likewise.
>>  (growth_likely_positive): Likewise.
>>  * ipa-inline-transform.c (clone_inlined_nodes): Likewise.
>>  (inline_call): Likewise.
>>  * ipa-inline.c (caller_growth_limits): Likewise.
>>  (can_inline_edge_p): Likewise.
>>  (can_inline_edge_by_limits_p): Likewise.
>>  (compute_uninlined_call_time): Likewise.
>>  (compute_inlined_call_time): Likewise.
>>  (want_inline_small_function_p): Likewise.
>>  (edge_badness): Likewise.
>>  (update_caller_keys): Likewise.
>>  (update_callee_keys): Likewise.
>>  (recursive_inlining): Likewise.
>>  (inline_small_functions): Likewise.
>>  (inline_to_all_callers_1): Likewise.
>>  (dump_overall_stats): Likewise.
>>  (early_inline_small_functions): Likewise.
>>  (early_inliner): Likewise.
>>  * ipa-inline.h (estimate_edge_growth): Likewise.
>>  * ipa-profile.c (ipa_propagate_frequency_1): Likewise.
>>  * ipa-prop.c (ipa_make_edge_direct_to_target): Likewise.
>>  * ipa-prop.h (IPA_NODE_REF): Likewise.
>>  (IPA_EDGE_REF): Likewise.
>>  * ipa-pure-const.c (malloc_candidate_p): Likewise.
>>  (propagate_malloc): Likewise.
>>  * ipa-split.c (execute_split_functions): Likewise.
>>  * symbol-summary.h: Rename get to get_create.
>>  (get): Likewise.
>>  (get_create): Likewise.
>>  * tree-sra.c (ipa_sra_preliminary_function_checks): Likewise.

...

>>ipa_fn_summaries->release ();
>>ipa_fn_summaries = NULL;
>>ipa_call_summaries->release ();
>> diff --git a/gcc/ipa-hsa.c b/gcc/ipa-hsa.c
>> index 1df273c7f28..90d193fe517 100644
>> --- a/gcc/ipa-hsa.c
>> +++ b/gcc/ipa-hsa.c
>
> Probably Martin Jambor can comment on ipa-hsa.

That's how it works today, so this patch does not change anything.  It
should be easy to create much fewer summaries - this is a leftover from
early stages of HSA BE development.  I will put it on my TODO list.

Martin

Re: [PATCH 11/14] Port IPA CP to edge_clone_summaries.

2018-06-08 Thread Martin Jambor

On Thu, Jun 07 2018, Jan Hubicka wrote:
>> 
>> gcc/ChangeLog:
>> 
>> 2018-04-24  Martin Liska  
>> 
>>  * ipa-cp.c (class edge_clone_summary): New summary.
>>  (grow_edge_clone_vectors): Remove.
>>  (ipcp_edge_duplication_hook): Remove.
>>  (class edge_clone_summary_t): New call_summary class.
>>  (ipcp_edge_removal_hook): Remove.
>>  (edge_clone_summary_t::duplicate): New function.
>>  (get_next_cgraph_edge_clone): Use edge_clone_summaries.
>>  (create_specialized_node): Likewise.
>>  (ipcp_driver): Initialize edge_clone_summaries and do not
>>  register hooks.
>
> I will be happy to leave this one to Martin Jambor as well. Looks fine to me :)
>

As I wrote yesterday, I'm fine with all IPA-CP/ipa-prop bits in the
series (assuming it did not substantially change since the last time I
saw it).

Martin
