Re: [PATCH, ARM] MI-thunk fix for TARGET_THUMB1_ONLY

2014-06-20 Thread Chung-Lin Tang
On 2014/6/18 上午 06:26, Ramana Radhakrishnan wrote:
 On Sun, Jun 8, 2014 at 12:27 PM, Chung-Lin Tang clt...@codesourcery.com 
 wrote:
  Hi Richard, Ramana,
 
  Attached is a small fix for resolving a g++.old-deja/g++.jason/thunk2.C
  regression we found under a TARGET_THUMB1_ONLY multilib (-mthumb
  -march=armv6-m to be exact). Basically under those conditions, the thunk
  is in Thumb mode, so the subtraction should be 4 rather than 8.
 Yep, this is OK with a minor change to the comment to make it more explicit.
 
 +  /* Output .word .LTHUNKn-[37]-.LTHUNKPCn.  */
 s/37/3,7/
 
 
 Ok with that change and if no regressions.
 
 OK for release branches unless the RM's object in 24 hours.

Re-tested on a recent trunk, verified g++.jason/thunk2.C resolved with
patch and no regressions. Committed on trunk and backported to 4.8, 4.9
branches.

Thanks,
Chung-Lin



Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2014-06-20 Thread Chung-Lin Tang
Ping.

On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:
 Hi Richard,
 As we talked about earlier, here's a patch to add a compiler option to
 work around Cortex-A9 MPCore errata 761319:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
 
 What the option does basically, is to scan for volatile loads during
 reorg, and add a dmb barrier after it. It also strives to make dmb
 conditionally executed under TARGET_THUMB2, which means a new Thumb-2
 specific *memory_barrier_t2 pattern in sync.md, with adjusted
 conds/predicable attributes and %? in output strings.
 
 Patch originally written by Julian, with additions by Meador, and
 finally a few trivial adjustments by me. Again, we've been carrying this
 fix for a release or two. Okay for trunk?
 
 Thanks,
 Chung-Lin
 
 2014-06-09  Julian Brown  jul...@codesourcery.com
 Meador Inge  mead...@codesourcery.com
 Chung-Lin Tang  clt...@codesourcery.com
 
   * config/arm/arm.c (arm_option_override): Emit warning if
   -mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
   (any_volatile_loads_p): New.
   (arm_cortex_a9_errata_reorg): New.
   (arm_reorg): Call arm_cortex_a9_errata_reorg.
   * config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add
   option.
   * config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
   (*memory_barrier_t2): New, allow conditional execution on
   Thumb-2.
   * doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
   documentation.
 
 testsuite/
 * lib/target-supports.exp (check_effective_target_arm_dmb): New.
 * gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
 



Move DECL_INIT_PRIORITY/FINI_PRIORITY to symbol table

2014-06-20 Thread Jan Hubicka
Hi,
this patch moves init and fini priorities to symbol table instead of trees.
They are already in on-side hashtables, but the hashtables are now maintaned
by symbol table.  This is needed for correctness with LTO.

Currently tree merging may load declaration with priority and then ggc_free
it creating a stale entry in the hashtable.  This is usually not problem, 
because
ctor declarations are usually static, but it is not safe.

I would really like to have template for such a sparse annotations to symbols
(writting our old school pch friendly hashtable is not a fun) but I am not sure
I can get it done in GGC/PCH safe way. Is user marking working for PCH?

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* cgraph.h (struct symtab_node): Add field in_init_priority_hash
(set_init_priority, get_init_priority, set_fini_priority,
get_fini_priority): New methods.
* tree.c (init_priority_for_decl): Remove.
(init_ttree): Do not initialize init priority.
(decl_init_priority_lookup, decl_fini_priority_lookup): Rewrite.
(decl_priority_info): Remove.
(decl_init_priority_insert): Rewrite.
(decl_fini_priority_insert): Rewrite.
* tree.h (tree_priority_map_eq, tree_priority_map_hash,
tree_priority_map_marked_p): Remove.
* lto-cgraph.c (lto_output_node, input_node): Stream init priorities.
* lto-streamer-out.c (hash_tree): Do not hash priorities.
* tree-streamer-out.c (pack_ts_decl_with_vis_value_fields): Do
not output priorities.
(pack_ts_function_decl_value_fields): Likewise.
* tree-streamer-in.c (unpack_ts_decl_with_vis_value_fields): Do
not input priorities.
(unpack_ts_function_decl_value_fields): Likewise.
* symtab.c (symbol_priority_map): Declare.
(init_priority_hash): Declare.
(symtab_unregister_node): Unregister from priority hash, too.
(symtab_node::get_init_priority, cgraph_node::get_fini_priority):
New methods.
(symbol_priority_map_eq, symbol_priority_map_hash): New functions.
(symbol_priority_info): New function.
(symtab_node::set_init_priority, cgraph_node::set_fini_priority):
New methods.
* tree-core.h (tree_priority_map): Remove.

* lto.c (compare_tree_sccs_1): Do not compare priorities.
Index: cgraph.h
===
--- cgraph.h(revision 211831)
+++ cgraph.h(working copy)
@@ -130,6 +130,8 @@ public:
 
   /* Set when symbol has address taken. */
   unsigned address_taken : 1;
+  /* Set when init priority is set.  */
+  unsigned in_init_priority_hash : 1;
 
 
   /* Ordering of all symtab entries.  */
@@ -163,6 +165,7 @@ public:
   return x_comdat_group;
 }
 
+  /* Return comdat group as identifier_node.  */
   tree get_comdat_group_id ()
 {
   if (x_comdat_group  TREE_CODE (x_comdat_group) != IDENTIFIER_NODE)
@@ -208,6 +211,9 @@ public:
   /* Set section for symbol and its aliases.  */
   void set_section (const char *section);
   void set_section_for_node (const char *section);
+
+  void set_init_priority (priority_type priority);
+  priority_type get_init_priority ();
 };
 
 enum availability
@@ -497,6 +503,9 @@ public:
   /* True if this decl calls a COMDAT-local function.  This is set up in
  compute_inline_parameters and inline_call.  */
   unsigned calls_comdat_local : 1;
+
+  void set_fini_priority (priority_type priority);
+  priority_type get_fini_priority ();
 };
 
 
Index: tree.c
===
--- tree.c  (revision 211831)
+++ tree.c  (working copy)
@@ -219,10 +219,6 @@ static GTY ((if_marked (tree_decl_map_m
 static GTY ((if_marked (tree_vec_map_marked_p), param_is (struct 
tree_vec_map)))
  htab_t debug_args_for_decl;
 
-static GTY ((if_marked (tree_priority_map_marked_p),
-param_is (struct tree_priority_map)))
-  htab_t init_priority_for_decl;
-
 static void set_type_quals (tree, int);
 static int type_hash_eq (const void *, const void *);
 static hashval_t type_hash_hash (const void *);
@@ -573,8 +569,6 @@ init_ttree (void)
 
   value_expr_for_decl = htab_create_ggc (512, tree_decl_map_hash,
 tree_decl_map_eq, 0);
-  init_priority_for_decl = htab_create_ggc (512, tree_priority_map_hash,
-   tree_priority_map_eq, 0);
 
   int_cst_hash_table = htab_create_ggc (1024, int_cst_hash_hash,
int_cst_hash_eq, NULL);
@@ -6492,13 +6486,12 @@ tree_decl_map_hash (const void *item)
 priority_type
 decl_init_priority_lookup (tree decl)
 {
-  struct tree_priority_map *h;
-  struct tree_map_base in;
+  symtab_node *snode = symtab_get_node (decl);
 
-  gcc_assert (VAR_OR_FUNCTION_DECL_P (decl));
-  in.from = decl;
-  h = (struct tree_priority_map *) htab_find 

Delay RTL initialization until it is really needed

2014-06-20 Thread Jan Hubicka
Hi,
IRA initialization shows high in profiles even when building lto objects.  This 
patch simply
delays RTL backend initialization until we really decide to output a function.  
In some cases
this avoids the initialization completely (like in the case of LTO but also 
user target attributes)
and there is some hope for better cache locality.

Basic idea is to have two flags saying whether lang and target dependent bits
needs initialization and check it when starting function codegen.

Bootstrapped/regtested x86_64-linux, testing also at AIX. Ok if it passes?

Honza

* toplev.c (backend_init_target): Move init_emit_regs and init_regs 
to...
(backend_init) ... here; skip ira_init_once and backend_init_target.
(target_reinit) ... and here; clear 
this_target_rtl-lang_dependent_initialized.
(lang_dependent_init_target): Clear 
this_target_rtl-lang_dependent_initialized;
break out rtl initialization to ...
(initialize_rtl): ... here; call also backend_init_target and 
ira_init_once.
* toplev.h (initialize_rtl): New function.
* function.c: Include toplev.h
(init_function_start): Call initialize_rtl.
* rtl.h (target_rtl): Add target_specific_initialized,
lang_dependent_initialized.
Index: toplev.c
===
--- toplev.c(revision 211837)
+++ toplev.c(working copy)
@@ -1583,14 +1583,6 @@ backend_init_target (void)
   /* Initialize alignment variables.  */
   init_alignments ();
 
-  /* This reinitializes hard_frame_pointer, and calls init_reg_modes_target()
- to initialize reg_raw_mode[].  */
-  init_emit_regs ();
-
-  /* This invokes target hooks to set fixed_reg[] etc, which is
- mode-dependent.  */
-  init_regs ();
-
   /* This depends on stack_pointer_rtx.  */
   init_fake_stack_mems ();
 
@@ -1632,9 +1624,13 @@ backend_init (void)
   init_varasm_once ();
   save_register_info ();
 
-  /* Initialize the target-specific back end pieces.  */
-  ira_init_once ();
-  backend_init_target ();
+  /* Middle end needs this initialization for default mem attributes
+ used by early calls to make_decl_rtl.  */
+  init_emit_regs ();
+
+  /* Middle end needs this initialization for mode tables used to assign
+ modes to vector variables.  */
+  init_regs ();
 }
 
 /* Initialize excess precision settings.  */
@@ -1686,6 +1682,31 @@ lang_dependent_init_target (void)
  front end is initialized.  It also depends on the HAVE_xxx macros
  generated from the target machine description.  */
   init_optabs ();
+  this_target_rtl-lang_dependent_initialized = false;
+}
+
+/* Perform initializations that are lang-dependent or target-dependent.
+   but matters only for late optimizations and RTL generation.  */
+
+void
+initialize_rtl (void)
+{
+  static int initialized_once;
+
+  /* Initialization done just once per compilation, but delayed
+ till code generation.  */
+  if (!initialized_once)
+ira_init_once ();
+  initialized_once = true;
+
+  /* Target specific RTL backend initialization.  */
+  if (!this_target_rtl-target_specific_initialized)
+backend_init_target ();
+  this_target_rtl-target_specific_initialized = true;
+
+  if (this_target_rtl-lang_dependent_initialized)
+return;
+  this_target_rtl-lang_dependent_initialized = true;
 
   /* The following initialization functions need to generate rtl, so
  provide a dummy function context for them.  */
@@ -1784,8 +1805,15 @@ target_reinit (void)
   regno_reg_rtx = NULL;
 }
 
-  /* Reinitialize RTL backend.  */
-  backend_init_target ();
+  this_target_rtl-lang_dependent_initialized = false;
+
+  /* This initializes hard_frame_pointer, and calls init_reg_modes_target()
+ to initialize reg_raw_mode[].  */
+  init_emit_regs ();
+
+  /* This invokes target hooks to set fixed_reg[] etc, which is
+ mode-dependent.  */
+  init_regs ();
 
   /* Reinitialize lang-dependent parts.  */
   lang_dependent_init_target ();
Index: toplev.h
===
--- toplev.h(revision 211837)
+++ toplev.h(working copy)
@@ -77,4 +77,6 @@ extern bool set_src_pwd  (const c
 extern HOST_WIDE_INT get_random_seed (bool);
 extern const char *set_random_seed (const char *);
 
+extern void initialize_rtl (void);
+
 #endif /* ! GCC_TOPLEV_H */
Index: function.c
===
--- function.c  (revision 211837)
+++ function.c  (working copy)
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
 #include params.h
 #include bb-reorder.h
 #include shrink-wrap.h
+#include toplev.h
 
 /* So we can assign to cfun in this file.  */
 #undef cfun
@@ -4630,6 +4631,10 @@ init_function_start (tree subr)
 set_cfun (DECL_STRUCT_FUNCTION (subr));
   else
 allocate_struct_function (subr, false);
+
+  /* Initialize backend, if needed.  */
+  initialize_rtl ();
+
   

Re: [PATCH][AArch64] Fix some saturating math NEON intrinsics types

2014-06-20 Thread Marcus Shawcroft
On 16 June 2014 15:26, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 I noticed that a few saturating math intrinsics in arm_neon.h for aarch64
 have the wrong types, i.e. not what's mandated by the ACLE spec.

 This patch fixes that by adjusting the types of the builtin functions that
 those intrinsics map to (and in the process cleaning up the VCON iterator)
 and adding tests for the affected intrinsics.

 I realise it's quite big, but the changes are mostly uniform.

 Bootstrapped and tested aarch64-none-linux-gnu.

 Ok for trunk?

OK, can you prepare a 4.9 backport?
Cheers
/Marcus


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Marek Polacek
On Thu, Jun 19, 2014 at 07:19:31PM +0200, Jakub Jelinek wrote:
  + case IFN_UBSAN_BOUNDS:
  +   ubsan_expand_bounds_btn (gsi);
  +   break;
default:
 
 Why *_btn instead of *_ifn ?

Remnant from when I was using __builtin.ubsan instead of the internal
call.  Fixed.
 
  +static tree
  +ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, void *data)
  +{
  +  struct pointer_set_t *pset = (struct pointer_set_t *) data;
  +
  +  if (TREE_CODE (*tp) == BIND_EXPR)
  +{
 
 I think it would be worth adding here a comment why do you handle BIND_EXPR
 here, that it doesn't walk the vars, but only their initializers etc. and
 thus in order to prevent walking DECL_INITIAL of TREE_STATIC decls
 we have to duplicate this part of walk_tree.

Done.

  +  for (tree decl = BIND_EXPR_VARS (*tp); decl; decl = DECL_CHAIN 
  (decl))
  +   {
  + if (TREE_STATIC (decl))
  +   {
  + *walk_subtrees = 0;
  + continue;
  +   }
  + walk_tree (DECL_INITIAL (decl), ubsan_walk_array_refs_r, NULL, pset);
  + walk_tree (DECL_SIZE (decl), ubsan_walk_array_refs_r, NULL, pset);
  + walk_tree (DECL_SIZE_UNIT (decl), ubsan_walk_array_refs_r, NULL, 
  pset);
 
 Shouldn't that use pset, pset); or data, pset); ?
 Also, too long lines (at least the last one, first one likely too).

Oops, fixed.
 
  +  tree bound = TYPE_MAX_VALUE (domain);
  +  if (ignore_off_by_one)
  +bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
  +build_int_cst (TREE_TYPE (bound), 1));
  +
  +  /* Detect flexible array members and suchlike.  */
  +  tree base = get_base_address (array);
  +  if (base  TREE_CODE (base) == INDIRECT_REF)
 
 I'd check also == MEM_REF here, while the FEs often use INDIRECT_REFs,
 there are already spots where it creates MEM_REFs.

Fixed.

  +void
  +ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
  +{
  +  if (!ubsan_array_ref_instrumented_p (*expr_p)
  +   current_function_decl != 0
 
 Please use != NULL_TREE.

Ok.
 
  +   !lookup_attribute (no_sanitize_undefined,
  +   DECL_ATTRIBUTES (current_function_decl)))
  +{
  +  tree t = copy_node (*expr_p);
  +  tree op0 = TREE_OPERAND (t, 0);
  +  tree op1 = TREE_OPERAND (t, 1);
 
 Please don't call copy_node until you know you want to instrument it.
 I.e.
   tree op0 = TREE_OPERAND (*expr_p, 0);
   tree op1 = TREE_OPERAND (*expr_p, 1);
  +  tree e = ubsan_instrument_bounds (EXPR_LOCATION (t), op0, op1,
 
 s/t/*expr_p/ above.
 
  +   ignore_off_by_one);
  +  if (e != NULL_TREE)
  +   {
 
 and only here add:
 tree t = copy_node (*expr_p);
  + TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
  +   e, op1);
  + *expr_p = t;
  +   }
  +}
  +}

Fixed as well.

  --- gcc/tree-pretty-print.c
  +++ gcc/tree-pretty-print.c
  @@ -3218,6 +3218,13 @@ print_call_name (pretty_printer *buffer, tree node, 
  int flags)
   {
 tree op0 = node;
   
  +  if (node == NULL_TREE)
  +{
  +  /* TODO Print builtin name.  */
  +  pp_string (buffer, internal function call);
 
 Use internal_fn_name function?

Uh, not sure how I missed that.  I print the internal function before calling
print_call_name, since that gets NULL node - and we can't determine
CALL_EXPR_IFN.

I added some docs as well, as promised.

2014-06-20  Marek Polacek  pola...@redhat.com

* asan.c (pass_sanopt::execute): Handle IFN_UBSAN_BOUNDS.
* flag-types.h (enum sanitize_code): Add SANITIZE_BOUNDS and or it
into SANITIZE_UNDEFINED.
* doc/invoke.texi: Describe -fsanitize=bounds.
* gimplify.c (gimplify_call_expr): Add gimplification of internal
functions created in the FEs.
* internal-fn.c: Move internal-fn.h after tree.h.
(expand_UBSAN_BOUNDS): New function.
* internal-fn.def (UBSAN_BOUNDS): New internal function.
* internal-fn.h: Don't define internal functions here.
* opts.c (common_handle_option): Add -fsanitize=bounds.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS,
BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS_ABORT): Add.
* tree-core.h: Define internal functions here.
(struct tree_base): Add ifn field.
* tree-pretty-print.c: Include internal-fn.h.
(dump_generic_node): Handle functions without CALL_EXPR_FN.
* tree.c (get_callee_fndecl): Likewise.
(build_call_expr_internal_loc): New function.
* tree.def (CALL_EXPR): Update description.
* tree.h (CALL_EXPR_IFN): Define.
(build_call_expr_internal_loc): Declare.
* ubsan.c (get_ubsan_type_info_for_type): Return 0 for non-arithmetic
types.
(ubsan_type_descriptor): Change bool parameter to enum
ubsan_print_style.  Adjust the code.  Add handling of
UBSAN_PRINT_ARRAY.

Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 10:43:04AM +0200, Marek Polacek wrote:
 +
 +/* Internal function code.  */
 +ENUM_BITFIELD(internal_fn) ifn : 5;

Any reason for the  : 5 here?  I mean, the union also contains
unsigned int, so it doesn't hurt if you use full 32 bits for it there,
and it should be faster and you won't run into problems when we'll have
more than 32 internal functions.

Otherwise the patch looks good to me, but please wait for comments
from Joseph and/or Jason.

Jakub


[linaro/gcc-4_9-branch] AArch64 costs model backports

2014-06-20 Thread Yvan Roux
Hi all,

we have backported a set of AArch64 costs model related revisions in
the linaro/gcc-4_9-branch at r211843.  The backported revisions are:

210493 : [AArch64 costs 1/18] Refactor aarch64_address_costs.
210494 : [AArch64 costs 2/18] Add cost tables for Cortex-A57
210495 : [AArch64 costs 3/18] Wrap aarch64_rtx_costs to dump verbose output
210496 : [AArch64 costs 4/18] Better estimate cost of building a constant
210497 : [AArch64 costs 5/18] Factor out common MULT cases
210498 : [AArch64 costs 6/18] Set default costs and handle vector modes.
210499 : [AArch64 costs 7/18] Improve SET cost.
210500 : [AArch64 costs 8/18] Cost memory accesses using address costs
210501 : [AArch64 costs 9/18] Better cost logical operations
210502 : [AArch64 costs 10/18] Improve costs for sign/zero extend operations
210503 : [AArch64 costs 11/18] Improve costs for rotate and shift operations.
210504 : [AArch64 costs 12/18] Improve costs for sign/zero extracts
210505 : [AArch64 costs 13/18] Improve costs for div/mod
210506 : [AArch64 costs 14/18] Cost comparisons, flag setting
operators and IF_THEN_ELSE
210507 : [AArch64 costs 15/18] Cost more Floating point RTX.
210508 : [AArch64 costs 16/18] Cost TRUNCATE
210509 : [AArch64 costs 17/18] Cost for SYMBOL_REF, HIGH and LO_SUM
210510 : [AArch64 costs 18/18] Dump a message if we are unable to cost an insn.
210512 : [AArch64 costs] Fixup to costing of FNMUL
211205 : aarch64_if_then_else_costs refactor
211206 : aarch64_if_then_else_costs

Thanks,
Yvan


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Marek Polacek
On Fri, Jun 20, 2014 at 10:57:47AM +0200, Jakub Jelinek wrote:
 On Fri, Jun 20, 2014 at 10:43:04AM +0200, Marek Polacek wrote:
  +
  +/* Internal function code.  */
  +ENUM_BITFIELD(internal_fn) ifn : 5;
 
 Any reason for the  : 5 here?  I mean, the union also contains
 unsigned int, so it doesn't hurt if you use full 32 bits for it there,
 and it should be faster and you won't run into problems when we'll have
 more than 32 internal functions.

The sole reason was that all other ENUM_BITFIELDs have it - on the
other hand, they're not in a union and here the bit-field is
pointless.  I'll drop it.
 
 Otherwise the patch looks good to me, but please wait for comments
 from Joseph and/or Jason.

Thanks for all your help!

The following is the same patch with only  : 5 dropped.

2014-06-20  Marek Polacek  pola...@redhat.com

* asan.c (pass_sanopt::execute): Handle IFN_UBSAN_BOUNDS.
* flag-types.h (enum sanitize_code): Add SANITIZE_BOUNDS and or it
into SANITIZE_UNDEFINED.
* doc/invoke.texi: Describe -fsanitize=bounds.
* gimplify.c (gimplify_call_expr): Add gimplification of internal
functions created in the FEs.
* internal-fn.c: Move internal-fn.h after tree.h.
(expand_UBSAN_BOUNDS): New function.
* internal-fn.def (UBSAN_BOUNDS): New internal function.
* internal-fn.h: Don't define internal functions here.
* opts.c (common_handle_option): Add -fsanitize=bounds.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS,
BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS_ABORT): Add.
* tree-core.h: Define internal functions here.
(struct tree_base): Add ifn field.
* tree-pretty-print.c: Include internal-fn.h.
(dump_generic_node): Handle functions without CALL_EXPR_FN.
* tree.c (get_callee_fndecl): Likewise.
(build_call_expr_internal_loc): New function.
* tree.def (CALL_EXPR): Update description.
* tree.h (CALL_EXPR_IFN): Define.
(build_call_expr_internal_loc): Declare.
* ubsan.c (get_ubsan_type_info_for_type): Return 0 for non-arithmetic
types.
(ubsan_type_descriptor): Change bool parameter to enum
ubsan_print_style.  Adjust the code.  Add handling of
UBSAN_PRINT_ARRAY.
(ubsan_expand_bounds_ifn): New function.
(ubsan_expand_null_ifn): Adjust ubsan_type_descriptor call.
(ubsan_build_overflow_builtin): Likewise.
(instrument_bool_enum_load): Likewise.
(ubsan_instrument_float_cast): Likewise.
* ubsan.h (enum ubsan_print_style): New enum.
(ubsan_expand_bounds_ifn): Declare.
(ubsan_type_descriptor): Adjust declaration.  Use a default parameter.
c-family/
* c-gimplify.c: Include c-ubsan.h and pointer-set.h.
(ubsan_walk_array_refs_r): New function.
(c_genericize): Instrument array bounds.
* c-ubsan.c: Include internal-fn.h.
(ubsan_instrument_division): Mark instrumented arrays as having
side effects.  Adjust ubsan_type_descriptor call.
(ubsan_instrument_shift): Likewise.
(ubsan_instrument_vla): Adjust ubsan_type_descriptor call.
(ubsan_instrument_bounds): New function.
(ubsan_array_ref_instrumented_p): New function.
(ubsan_maybe_instrument_array_ref): New function.
* c-ubsan.h (ubsan_instrument_bounds): Declare.
(ubsan_array_ref_instrumented_p): Declare.
(ubsan_maybe_instrument_array_ref): Declare.
testsuite/
* c-c++-common/ubsan/bounds-1.c: New test.
* c-c++-common/ubsan/bounds-2.c: New test.
* c-c++-common/ubsan/bounds-3.c: New test.
* c-c++-common/ubsan/bounds-4.c: New test.
* c-c++-common/ubsan/bounds-5.c: New test.
* c-c++-common/ubsan/bounds-6.c: New test.

diff --git gcc/asan.c gcc/asan.c
index 281a795..b7c76cf 100644
--- gcc/asan.c
+++ gcc/asan.c
@@ -2761,6 +2761,9 @@ pass_sanopt::execute (function *fun)
  case IFN_UBSAN_NULL:
ubsan_expand_null_ifn (gsi);
break;
+ case IFN_UBSAN_BOUNDS:
+   ubsan_expand_bounds_ifn (gsi);
+   break;
  default:
break;
  }
@@ -2771,6 +2774,10 @@ pass_sanopt::execute (function *fun)
  print_gimple_stmt (dump_file, stmt, 0, dump_flags);
  fprintf (dump_file, \n);
}
+
+ /* ubsan_expand_bounds_ifn might move us to the end of the BB.  */
+ if (gsi_end_p (gsi))
+   break;
}
 }
   return 0;
diff --git gcc/c-family/c-gimplify.c gcc/c-family/c-gimplify.c
index 737be4d..c797d99 100644
--- gcc/c-family/c-gimplify.c
+++ gcc/c-family/c-gimplify.c
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.  If not see
 #include c-pretty-print.h
 #include cgraph.h
 #include cilk.h
+#include c-ubsan.h
+#include pointer-set.h
 
 /*  The gimplification pass converts the language-dependent 

Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 11:34:26AM +0200, Marek Polacek wrote:
 On Fri, Jun 20, 2014 at 10:57:47AM +0200, Jakub Jelinek wrote:
  On Fri, Jun 20, 2014 at 10:43:04AM +0200, Marek Polacek wrote:
   +
   +/* Internal function code.  */
   +ENUM_BITFIELD(internal_fn) ifn : 5;
  
  Any reason for the  : 5 here?  I mean, the union also contains
  unsigned int, so it doesn't hurt if you use full 32 bits for it there,
  and it should be faster and you won't run into problems when we'll have
  more than 32 internal functions.
 
 The sole reason was that all other ENUM_BITFIELDs have it - on the
 other hand, they're not in a union and here the bit-field is
 pointless.  I'll drop it.

Well, no point to use ENUM_BITFIELD either, just use
enum internal_fn ifn;
?

Jakub


Re: [RFC] optimize x - y cmp 0 with undefined overflow

2014-06-20 Thread Eric Botcazou
[I'm at last back to this...]

  With [1, -x + INF] as the resulting range?  But it can be bogus if x is
  itself equal to +INF (unlike the input range [x + 1, +INF] which is
  always correct)

 Hmm, indeed.
 
  so this doesn't look valid to me.  I don't see how we can get away
  without a +INF(OVF) here, but I can compute it in
  extract_range_from_binary_expr_1 if you prefer and try only [op0,op0]
  and [op1,op1].
 
 Yeah, I'd prefer that.

To recap, the range of y is [x + 1, +INF] and we're trying to evaluate the 
range of y - x, in particular we want to prove that y - x  0.

We compute the range of y - x as [1, -x + INF] by combining [x + 1, +INF] with 
[x, x] and we want to massage it because compare_values will rightly choke.

If overflow is undefined, we can simply change it to [1, +INF (OVF)] and be 
done with that.  But if overflow is defined, we need to prove that -x + INF
cannot wrap around (which is true if the type is unsigned) and the simplest 
way to do that in the general case is to recursively invoke the machinery
of extract_range_from_binary_expr_1 on range_of(-x) + INF and analyze the 
result.  This looks much more complicated implementation-wise (and would very 
likely buy us nothing in practice) than my scheme, where we just approximate 
range_of(-x) by [-INF,+INF] and don't need to implement the recursion at all.

However I can change extract_range_from_binary_expr so that only one range 
among [-INF,x], [x,+INF] and [x,x] is tried instead of the 3 ranges in a row.
I initially didn't want to do that because this breaks the separation between 
extract_range_from_binary_expr_1 and extract_range_from_binary_expr but, given 
their names, this is very likely acceptable.  What do you think?

-- 
Eric Botcazou


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Marek Polacek
On Fri, Jun 20, 2014 at 11:39:23AM +0200, Jakub Jelinek wrote:
 On Fri, Jun 20, 2014 at 11:34:26AM +0200, Marek Polacek wrote:
  On Fri, Jun 20, 2014 at 10:57:47AM +0200, Jakub Jelinek wrote:
   On Fri, Jun 20, 2014 at 10:43:04AM +0200, Marek Polacek wrote:
+
+/* Internal function code.  */
+ENUM_BITFIELD(internal_fn) ifn : 5;
   
   Any reason for the  : 5 here?  I mean, the union also contains
   unsigned int, so it doesn't hurt if you use full 32 bits for it there,
   and it should be faster and you won't run into problems when we'll have
   more than 32 internal functions.
  
  The sole reason was that all other ENUM_BITFIELDs have it - on the
  other hand, they're not in a union and here the bit-field is
  pointless.  I'll drop it.
 
 Well, no point to use ENUM_BITFIELD either, just use
 enum internal_fn ifn;

Works as well.

2014-06-20  Marek Polacek  pola...@redhat.com

* asan.c (pass_sanopt::execute): Handle IFN_UBSAN_BOUNDS.
* flag-types.h (enum sanitize_code): Add SANITIZE_BOUNDS and or it
into SANITIZE_UNDEFINED.
* doc/invoke.texi: Describe -fsanitize=bounds.
* gimplify.c (gimplify_call_expr): Add gimplification of internal
functions created in the FEs.
* internal-fn.c: Move internal-fn.h after tree.h.
(expand_UBSAN_BOUNDS): New function.
* internal-fn.def (UBSAN_BOUNDS): New internal function.
* internal-fn.h: Don't define internal functions here.
* opts.c (common_handle_option): Add -fsanitize=bounds.
* sanitizer.def (BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS,
BUILT_IN_UBSAN_HANDLE_OUT_OF_BOUNDS_ABORT): Add.
* tree-core.h: Define internal functions here.
(struct tree_base): Add ifn field.
* tree-pretty-print.c: Include internal-fn.h.
(dump_generic_node): Handle functions without CALL_EXPR_FN.
* tree.c (get_callee_fndecl): Likewise.
(build_call_expr_internal_loc): New function.
* tree.def (CALL_EXPR): Update description.
* tree.h (CALL_EXPR_IFN): Define.
(build_call_expr_internal_loc): Declare.
* ubsan.c (get_ubsan_type_info_for_type): Return 0 for non-arithmetic
types.
(ubsan_type_descriptor): Change bool parameter to enum
ubsan_print_style.  Adjust the code.  Add handling of
UBSAN_PRINT_ARRAY.
(ubsan_expand_bounds_ifn): New function.
(ubsan_expand_null_ifn): Adjust ubsan_type_descriptor call.
(ubsan_build_overflow_builtin): Likewise.
(instrument_bool_enum_load): Likewise.
(ubsan_instrument_float_cast): Likewise.
* ubsan.h (enum ubsan_print_style): New enum.
(ubsan_expand_bounds_ifn): Declare.
(ubsan_type_descriptor): Adjust declaration.  Use a default parameter.
c-family/
* c-gimplify.c: Include c-ubsan.h and pointer-set.h.
(ubsan_walk_array_refs_r): New function.
(c_genericize): Instrument array bounds.
* c-ubsan.c: Include internal-fn.h.
(ubsan_instrument_division): Mark instrumented arrays as having
side effects.  Adjust ubsan_type_descriptor call.
(ubsan_instrument_shift): Likewise.
(ubsan_instrument_vla): Adjust ubsan_type_descriptor call.
(ubsan_instrument_bounds): New function.
(ubsan_array_ref_instrumented_p): New function.
(ubsan_maybe_instrument_array_ref): New function.
* c-ubsan.h (ubsan_instrument_bounds): Declare.
(ubsan_array_ref_instrumented_p): Declare.
(ubsan_maybe_instrument_array_ref): Declare.
testsuite/
* c-c++-common/ubsan/bounds-1.c: New test.
* c-c++-common/ubsan/bounds-2.c: New test.
* c-c++-common/ubsan/bounds-3.c: New test.
* c-c++-common/ubsan/bounds-4.c: New test.
* c-c++-common/ubsan/bounds-5.c: New test.
* c-c++-common/ubsan/bounds-6.c: New test.

diff --git gcc/asan.c gcc/asan.c
index 281a795..b7c76cf 100644
--- gcc/asan.c
+++ gcc/asan.c
@@ -2761,6 +2761,9 @@ pass_sanopt::execute (function *fun)
  case IFN_UBSAN_NULL:
ubsan_expand_null_ifn (gsi);
break;
+ case IFN_UBSAN_BOUNDS:
+   ubsan_expand_bounds_ifn (gsi);
+   break;
  default:
break;
  }
@@ -2771,6 +2774,10 @@ pass_sanopt::execute (function *fun)
  print_gimple_stmt (dump_file, stmt, 0, dump_flags);
  fprintf (dump_file, \n);
}
+
+ /* ubsan_expand_bounds_ifn might move us to the end of the BB.  */
+ if (gsi_end_p (gsi))
+   break;
}
 }
   return 0;
diff --git gcc/c-family/c-gimplify.c gcc/c-family/c-gimplify.c
index 737be4d..c797d99 100644
--- gcc/c-family/c-gimplify.c
+++ gcc/c-family/c-gimplify.c
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.  If not see
 #include c-pretty-print.h
 #include cgraph.h
 #include cilk.h
+#include c-ubsan.h
+#include 

[PATCH] Fix ICE in Asan

2014-06-20 Thread Maxim Ostapenko

Hi,

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61530.

Tested on x86_64-unknown-linux-gnu, no regressions.

Ok to commit?

-Maxim
gcc/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	  Max Ostapenko  m.ostape...@partner.samsung.com

	* asan.c (build_check_stmt): Add condition.

gcc/testsuite/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	  Max Ostapenko  m.ostape...@partner.samsung.com

	* c-c++-common/asan/pr61530.c: New test.


diff --git a/gcc/asan.c b/gcc/asan.c
index 281a795..4d87dad 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1654,6 +1654,7 @@ build_check_stmt (location_t location, tree base, tree len,
   if (size_in_bytes  1)
 {
   if ((size_in_bytes  (size_in_bytes - 1)) != 0
+	  || !is_scalar_access
 	  || size_in_bytes  16)
 	size_in_bytes = -1;
   else if (align  align  size_in_bytes * BITS_PER_UNIT)
diff --git a/gcc/testsuite/c-c++-common/asan/pr61530.c b/gcc/testsuite/c-c++-common/asan/pr61530.c
new file mode 100644
index 000..e306a71
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/pr61530.c
@@ -0,0 +1,17 @@
+/* { dg-do run } */
+/* { dg-shouldfail asan } */
+
+__attribute__((noinline,noclone)) void
+foo (char *a, char *b) {
+  a[0] = b[0] = 0;
+  __builtin_memcpy(a, b, 4);
+}
+
+int
+main () {
+  char a, b;
+  foo (a, b);
+  return 0;
+}
+
+/* { dg-output ERROR: AddressSanitizer: stack-buffer-overflow } */


Re: [PATCH] Fix ICE in Asan

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 02:04:25PM +0400, Maxim Ostapenko wrote:
 
 This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61530.
 
 Tested on x86_64-unknown-linux-gnu, no regressions.
 
 Ok to commit?
 
 -Maxim

 gcc/ChangeLog:
 
 2014-06-20  Yury Gribov  y.gri...@samsung.com
 Max Ostapenko  m.ostape...@partner.samsung.com
 
   * asan.c (build_check_stmt): Add condition.

Please align your name below Yury's (i.e. tab + 4 spaces indent).
Also, please add
PR sanitizer/61530
to both changelog entries.  Ok with those changes.
 
 gcc/testsuite/ChangeLog:
 
 2014-06-20  Yury Gribov  y.gri...@samsung.com
 Max Ostapenko  m.ostape...@partner.samsung.com
 
   * c-c++-common/asan/pr61530.c: New test.

Jakub


Re: [PATCH] Fix ICE in Asan

2014-06-20 Thread Maxim Ostapenko


On 06/20/2014 02:07 PM, Jakub Jelinek wrote:

On Fri, Jun 20, 2014 at 02:04:25PM +0400, Maxim Ostapenko wrote:

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61530.

Tested on x86_64-unknown-linux-gnu, no regressions.

Ok to commit?

-Maxim
gcc/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
  Max Ostapenko  m.ostape...@partner.samsung.com

* asan.c (build_check_stmt): Add condition.

Please align your name below Yury's (i.e. tab + 4 spaces indent).
Also, please add
PR sanitizer/61530
to both changelog entries.  Ok with those changes.

gcc/testsuite/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
  Max Ostapenko  m.ostape...@partner.samsung.com

* c-c++-common/asan/pr61530.c: New test.

Jakub


Thanks, done in r211846.

-Maxim


RE: [PATCH] Fix PR61375: cancel bswap optimization when value doesn't fit in a HOST_WIDE_INT

2014-06-20 Thread Thomas Preud'homme
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Tuesday, June 10, 2014 5:05 PM
 
 Backports are welcome - please post a patch.
 

Sorry for the delay. Here you are:

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61375.c 
b/gcc/testsuite/gcc.c-torture/execute/pr61375.c
new file mode 100644
index 000..d3b54a8
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61375.c
@@ -0,0 +1,35 @@
+#ifdef __UINT64_TYPE__
+typedef __UINT64_TYPE__ uint64_t;
+#else
+typedef unsigned long long uint64_t;
+#endif
+
+#ifndef __SIZEOF_INT128__
+#define __int128 long long
+#endif
+
+/* Some version of bswap optimization would ICE when analyzing a mask constant
+   too big for an HOST_WIDE_INT (PR61375).  */
+
+__attribute__ ((noinline, noclone)) uint64_t
+uint128_central_bitsi_ior (unsigned __int128 in1, uint64_t in2)
+{
+  __int128 mask = (__int128)0x  56;
+  return ((in1  mask)  56) | in2;
+}
+
+int
+main (int argc)
+{
+  __int128 in = 1;
+#ifdef __SIZEOF_INT128__
+  in = 64;
+#endif
+  if (sizeof (uint64_t) * __CHAR_BIT__ != 64)
+return 0;
+  if (sizeof (unsigned __int128) * __CHAR_BIT__ != 128)
+return 0;
+  if (uint128_central_bitsi_ior (in, 2) != 0x102)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 9ff857c..9d64205 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1741,6 +1741,8 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int 
limit)
  if (n-size % BITS_PER_UNIT != 0)
return NULL_TREE;
  n-size /= BITS_PER_UNIT;
+ if (n-size  (int)sizeof (unsigned HOST_WIDEST_INT))
+   return NULL_TREE;
  n-n = (sizeof (HOST_WIDEST_INT)  8 ? 0 :
  (unsigned HOST_WIDEST_INT)0x08070605  32 | 0x04030201);
 
@@ -1781,6 +1783,8 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int 
limit)
type_size = TYPE_PRECISION (gimple_expr_type (stmt));
if (type_size % BITS_PER_UNIT != 0)
  return NULL_TREE;
+   if (type_size  (int)HOST_BITS_PER_WIDEST_INT)
+ return NULL_TREE;
 
if (type_size / BITS_PER_UNIT  (int)(sizeof (HOST_WIDEST_INT)))
  {

Ok for GCC 4.8 and GCC 4.9 branches?

Best regards,

Thomas




[PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Maxim Ostapenko
This patch fixes invalid sanitization of trailing byte in 
__builtin_strlen ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61547).


Tested on x86_64-unknown-linux-gnu, no regressions.

Ok to commit?

-Maxim
gcc/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	Max Ostapenko  m.ostape...@partner.samsung.com

	PR sanitizer/61547
	* asan.c (instrument_strlen_call): Fixed instrumentation of
	trailing byte.

gcc/testsuite/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	Max Ostapenko  m.ostape...@partner.samsung.com

	PR sanitizer/61547
	* c-c++-common/asan/strlen-overflow-1.c: New test.


diff --git a/gcc/asan.c b/gcc/asan.c
index 281a795..71c063b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2036,19 +2036,19 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
 
   build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), NULL_TREE, 1, iter,
 		/*non_zero_len_p*/true, /*before_p=*/true,
-		/*is_store=*/false, /*is_scalar_access*/false, /*align*/0);
+		/*is_store=*/false, /*is_scalar_access*/true, /*align*/0);
 
-  gimple stmt =
-gimple_build_assign_with_ops (PLUS_EXPR,
-  make_ssa_name (TREE_TYPE (len), NULL),
-  len,
-  build_int_cst (TREE_TYPE (len), 1));
-  gimple_set_location (stmt, loc);
-  gsi_insert_after (iter, stmt, GSI_NEW_STMT);
+  gimple g =
+gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+  make_ssa_name (cptr_type, NULL),
+  gimple_assign_lhs (str_arg_ssa),
+  len);
+  gimple_set_location (g, loc);
+  gsi_insert_after (iter, g, GSI_NEW_STMT);
 
-  build_check_stmt (loc, gimple_assign_lhs (stmt), len, 1, iter,
+  build_check_stmt (loc, gimple_assign_lhs (g), NULL_TREE, 1, iter,
 		/*non_zero_len_p*/true, /*before_p=*/false,
-		/*is_store=*/false, /*is_scalar_access*/false, /*align*/0);
+		/*is_store=*/false, /*is_scalar_access*/true, /*align*/0);
 
   return true;
 }
diff --git a/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c b/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c
new file mode 100644
index 000..4c4585e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-skip-if  { *-*-* } { -flto } {  } } */
+/* { dg-shouldfail asan } */
+
+#include sanitizer/asan_interface.h
+
+char a[2] = 0;
+
+#ifdef __cplusplus
+extern C
+#endif
+__SIZE_TYPE__ strlen (const char *p) {
+  /* Simulate error */
+  if (p == a)
+return 1;
+
+  __SIZE_TYPE__ n = 0;
+  for (; *p; ++n, ++p);
+  return n;
+}
+
+int main () {
+  char *p = a[0];
+  asm ( : +r(p));
+  __asan_poison_memory_region ((char *)a[1], 1);
+  return __builtin_strlen (a);
+}
+
+/* { dg-output READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r) } */
+/* { dg-output #0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*strlen-overflow-1.c:26|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*0x\[0-9a-f\]+ is located 1 bytes inside of global variable } */


[PATCH 2/N] allow storing values directly in hash tables

2014-06-20 Thread tsaunders
From: Trevor Saunders tsaund...@mozilla.com

Hi,

this patch allows you to define the type the hash table stores as elements
instead of the type elements point at by having your hash descriptor define the
type store_values_directly.  It turns out trying to implement both cases with
the same code is really confusing, so I ended up providing one partial
specialization for each case.  Its a lot of coppying, but I'm hoping the next
patch will get rid of many direct users of hash_table, and the rest can all get
converted to tell the hash table the type entries should have at which point
the dupplication can be removed.

bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?

Trev

gcc/

* hash-table.h: Add a template arg to choose between storing values
and storing pointers to values, and then provide partial
specializations for both.
* tree-browser.c (tree_upper_hasher): Provide the type the hash table
should store, not the type values should point to.
* tree-into-ssa.c (var_info_hasher): Likewise.
* tree-ssa-dom.c (expr_elt_hasher): Likewise.
* tree-complex.c: Adjust.
* tree-hasher.h (int_tree_hasher): store int_tree_map in the hash
table instead of int_tree_map *.
* tree-parloops.c: Adjust.
* tree-ssa-reassoc.c (ocount_hasher): Don't lie to hash_map about what
type is being stored.
* tree-vectorizer.c: Adjust.

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 41cc19e..22af12f 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -272,19 +272,18 @@ typed_noop_remove Type::remove (Type *p 
ATTRIBUTE_UNUSED)
 template typename Type
 struct pointer_hash : typed_noop_remove Type
 {
-  typedef Type value_type;
-  typedef Type compare_type;
+  typedef Type *value_type;
+  typedef Type *compare_type;
+  typedef int store_values_directly;
 
-  static inline hashval_t
-  hash (const value_type *);
+  static inline hashval_t hash (const value_type );
 
-  static inline int
-  equal (const value_type *existing, const compare_type *candidate);
+  static inline bool equal (const value_type existing, const compare_type 
candidate);
 };
 
 template typename Type
 inline hashval_t
-pointer_hash Type::hash (const value_type *candidate)
+pointer_hash Type::hash (const value_type candidate)
 {
   /* This is a really poor hash function, but it is what the current code uses,
  so I am reusing it to avoid an additional axis in testing.  */
@@ -292,9 +291,9 @@ pointer_hash Type::hash (const value_type *candidate)
 }
 
 template typename Type
-inline int
-pointer_hash Type::equal (const value_type *existing,
-  const compare_type *candidate)
+inline bool
+pointer_hash Type::equal (const value_type existing,
+  const compare_type candidate)
 {
   return existing == candidate;
 }
@@ -319,10 +318,147 @@ extern unsigned int hash_table_higher_prime_index 
(unsigned long n);
 extern hashval_t hash_table_mod1 (hashval_t hash, unsigned int index);
 extern hashval_t hash_table_mod2 (hashval_t hash, unsigned int index);
 
+/* The below is some template meta programming to decide if we should use the
+   hash table partial specialization that directly stores value_type instead of
+   pointers to value_type.  If the Descriptor type defines the type
+   Descriptor::store_values_directly then values are stored directly otherwise
+   pointers to them are stored.  */
+templatetypename T struct notype { typedef void type; };
+
+templatetypename T, typename = void
+struct storage_tester
+{
+  static const bool value = false;
+};
+
+templatetypename T
+struct storage_testerT, typename notypetypename
+T::store_values_directly::type
+{
+  static const bool value = true;
+};
+
+ templatetypename Traits
+ struct has_is_deleted
+{
+  templatetypename U, bool (*)(U ) struct helper {};
+  templatetypename U static char test (helperU, U::is_deleted *);
+  templatetypename U static int test (...);
+  static const bool value = sizeof (testTraits (0)) == sizeof (char);
+};
+
+templatetypename Type, typename Traits, bool = has_is_deletedTraits::value
+struct is_deleted_helper
+{
+  static inline bool
+  call (Type v)
+  {
+return Traits::is_deleted (v);
+  }
+};
+
+templatetypename Type, typename Traits
+struct is_deleted_helperType *, Traits, false
+{
+  static inline bool
+  call (Type *v)
+  {
+return v == HTAB_DELETED_ENTRY;
+  }
+};
+
+ templatetypename Traits
+ struct has_is_empty
+{
+  templatetypename U, bool (*)(U ) struct helper {};
+  templatetypename U static char test (helperU, U::is_empty *);
+  templatetypename U static int test (...);
+  static const bool value = sizeof (testTraits (0)) == sizeof (char);
+};
+
+templatetypename Type, typename Traits, bool = has_is_deletedTraits::value
+struct is_empty_helper
+{
+  static inline bool
+  call (Type v)
+  {
+return Traits::is_empty (v);
+  }
+};
+

[PATCH 3/3] add hash_map class

2014-06-20 Thread tsaunders
From: Trevor Saunders tsaund...@mozilla.com

Hi,

This patch adds a hash_map class so we can consolidate the boiler plate around
using hash_table as a map, it also allows us to get rid of pointer_map which I
do in this patch by converting its users to hash_map.

bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?

Trev

gcc/

* alloc-pool.c (alloc_pool_hash): Use hash_map instead of hash_table.
* dominance.c (iterate_fix_dominators): Use hash_map instead of
pointer_map.
* hash-map.h: New file.
* ipa-comdats.c: Use hash_map instead of pointer_map.
* lto-section-out.c: Adjust.
* lto-streamer.h: Replace pointer_map with hash_map.
* symtab.c (verify_symtab): Likewise.
* tree-ssa-strlen.c (decl_to_stridxlist_htab): Likewise.
* tree-ssa-uncprop.c (val_ssa_equiv): Likewise.
* tree-streamer.h: Likewise.
* tree-streamer.c: Adjust.
* pointer-set.h: Remove pointer_map.

lto/

* lto.c (canonical_type_hash_cache): Use hash_map instead of
pointer_map.

diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c
index 49209ee..0d31835 100644
--- a/gcc/alloc-pool.c
+++ b/gcc/alloc-pool.c
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 #include system.h
 #include alloc-pool.h
 #include hash-table.h
+#include hash-map.h
 
 #define align_eight(x) (((x+7)  3)  3)
 
@@ -69,7 +70,6 @@ static ALLOC_POOL_ID_TYPE last_id;
size for that pool.  */
 struct alloc_pool_descriptor
 {
-  const char *name;
   /* Number of pools allocated.  */
   unsigned long created;
   /* Gross allocated storage.  */
@@ -82,48 +82,17 @@ struct alloc_pool_descriptor
   int elt_size;
 };
 
-/* Hashtable helpers.  */
-struct alloc_pool_hasher : typed_noop_remove alloc_pool_descriptor
-{
-  typedef alloc_pool_descriptor value_type;
-  typedef char compare_type;
-  static inline hashval_t hash (const alloc_pool_descriptor *);
-  static inline bool equal (const value_type *, const compare_type *);
-};
-
-inline hashval_t
-alloc_pool_hasher::hash (const value_type *d)
-{
-  return htab_hash_pointer (d-name);
-}
-
-inline bool
-alloc_pool_hasher::equal (const value_type *d,
-  const compare_type *p2)
-{
-  return d-name == p2;
-}
-
 /* Hashtable mapping alloc_pool names to descriptors.  */
-static hash_tablealloc_pool_hasher *alloc_pool_hash;
+static hash_mapconst char *, alloc_pool_descriptor *alloc_pool_hash;
 
 /* For given name, return descriptor, create new if needed.  */
 static struct alloc_pool_descriptor *
 allocate_pool_descriptor (const char *name)
 {
-  struct alloc_pool_descriptor **slot;
-
   if (!alloc_pool_hash)
-alloc_pool_hash = new hash_tablealloc_pool_hasher (10);
-
-  slot = alloc_pool_hash-find_slot_with_hash (name,
-  htab_hash_pointer (name),
-  INSERT);
-  if (*slot)
-return *slot;
-  *slot = XCNEW (struct alloc_pool_descriptor);
-  (*slot)-name = name;
-  return *slot;
+alloc_pool_hash = new hash_mapconst char *, alloc_pool_descriptor (10);
+
+  return alloc_pool_hash-get_or_insert (name);
 }
 
 /* Create a pool of things of size SIZE, with NUM in each block we
@@ -375,23 +344,22 @@ struct output_info
   unsigned long total_allocated;
 };
 
-/* Called via hash_table.traverse.  Output alloc_pool descriptor pointed out by
+/* Called via hash_map.traverse.  Output alloc_pool descriptor pointed out by
SLOT and update statistics.  */
-int
-print_alloc_pool_statistics (alloc_pool_descriptor **slot,
+bool
+print_alloc_pool_statistics (const char *const name,
+const alloc_pool_descriptor d,
 struct output_info *i)
 {
-  struct alloc_pool_descriptor *d = *slot;
-
-  if (d-allocated)
+  if (d.allocated)
 {
   fprintf (stderr,
   %-22s %6d %10lu %10lu(%10lu) %10lu(%10lu) %10lu(%10lu)\n,
-  d-name, d-elt_size, d-created, d-allocated,
-  d-allocated / d-elt_size, d-peak, d-peak / d-elt_size,
-  d-current, d-current / d-elt_size);
-  i-total_allocated += d-allocated;
-  i-total_created += d-created;
+  name, d.elt_size, d.created, d.allocated,
+  d.allocated / d.elt_size, d.peak, d.peak / d.elt_size,
+  d.current, d.current / d.elt_size);
+  i-total_allocated += d.allocated;
+  i-total_created += d.created;
 }
   return 1;
 }
diff --git a/gcc/dominance.c b/gcc/dominance.c
index 7adec4f..be0a439 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -43,6 +43,7 @@
 #include diagnostic-core.h
 #include et-forest.h
 #include timevar.h
+#include hash-map.h
 #include pointer-set.h
 #include graphds.h
 #include bitmap.h
@@ -1258,7 +1259,6 @@ iterate_fix_dominators (enum cdi_direction dir, 
vecbasic_block bbs,
   size_t dom_i;
   edge e;
   edge_iterator ei;
-  pointer_mapint *map;
   int *parent, *son, 

Re: Move DECL_INIT_PRIORITY/FINI_PRIORITY to symbol table

2014-06-20 Thread Trevor Saunders
On Fri, Jun 20, 2014 at 08:41:22AM +0200, Jan Hubicka wrote:
 Hi,
 this patch moves init and fini priorities to symbol table instead of trees.
 They are already in on-side hashtables, but the hashtables are now maintaned
 by symbol table.  This is needed for correctness with LTO.
 
 Currently tree merging may load declaration with priority and then ggc_free
 it creating a stale entry in the hashtable.  This is usually not problem, 
 because
 ctor declarations are usually static, but it is not safe.
 
 I would really like to have template for such a sparse annotations to symbols
 (writting our old school pch friendly hashtable is not a fun) but I am not 
 sure
 I can get it done in GGC/PCH safe way. Is user marking working for PCH?

hm, so thinking about this a little more I wonder if you can just use
hash_table, and add a dtor to symtab_node that removes its entry from
the hash table.  ggc should invoke your dtor as a finalizer for the node
then.

Trev

 
 Bootstrapped/regtested x86_64-linux, will commit it shortly.
 
 Honza
 
   * cgraph.h (struct symtab_node): Add field in_init_priority_hash
   (set_init_priority, get_init_priority, set_fini_priority,
   get_fini_priority): New methods.
   * tree.c (init_priority_for_decl): Remove.
   (init_ttree): Do not initialize init priority.
   (decl_init_priority_lookup, decl_fini_priority_lookup): Rewrite.
   (decl_priority_info): Remove.
   (decl_init_priority_insert): Rewrite.
   (decl_fini_priority_insert): Rewrite.
   * tree.h (tree_priority_map_eq, tree_priority_map_hash,
   tree_priority_map_marked_p): Remove.
   * lto-cgraph.c (lto_output_node, input_node): Stream init priorities.
   * lto-streamer-out.c (hash_tree): Do not hash priorities.
   * tree-streamer-out.c (pack_ts_decl_with_vis_value_fields): Do
   not output priorities.
   (pack_ts_function_decl_value_fields): Likewise.
   * tree-streamer-in.c (unpack_ts_decl_with_vis_value_fields): Do
   not input priorities.
   (unpack_ts_function_decl_value_fields): Likewise.
   * symtab.c (symbol_priority_map): Declare.
   (init_priority_hash): Declare.
   (symtab_unregister_node): Unregister from priority hash, too.
   (symtab_node::get_init_priority, cgraph_node::get_fini_priority):
   New methods.
   (symbol_priority_map_eq, symbol_priority_map_hash): New functions.
   (symbol_priority_info): New function.
   (symtab_node::set_init_priority, cgraph_node::set_fini_priority):
   New methods.
   * tree-core.h (tree_priority_map): Remove.
 
   * lto.c (compare_tree_sccs_1): Do not compare priorities.
 Index: cgraph.h
 ===
 --- cgraph.h  (revision 211831)
 +++ cgraph.h  (working copy)
 @@ -130,6 +130,8 @@ public:
  
/* Set when symbol has address taken. */
unsigned address_taken : 1;
 +  /* Set when init priority is set.  */
 +  unsigned in_init_priority_hash : 1;
  
  
/* Ordering of all symtab entries.  */
 @@ -163,6 +165,7 @@ public:
return x_comdat_group;
  }
  
 +  /* Return comdat group as identifier_node.  */
tree get_comdat_group_id ()
  {
if (x_comdat_group  TREE_CODE (x_comdat_group) != IDENTIFIER_NODE)
 @@ -208,6 +211,9 @@ public:
/* Set section for symbol and its aliases.  */
void set_section (const char *section);
void set_section_for_node (const char *section);
 +
 +  void set_init_priority (priority_type priority);
 +  priority_type get_init_priority ();
  };
  
  enum availability
 @@ -497,6 +503,9 @@ public:
/* True if this decl calls a COMDAT-local function.  This is set up in
   compute_inline_parameters and inline_call.  */
unsigned calls_comdat_local : 1;
 +
 +  void set_fini_priority (priority_type priority);
 +  priority_type get_fini_priority ();
  };
  
  
 Index: tree.c
 ===
 --- tree.c(revision 211831)
 +++ tree.c(working copy)
 @@ -219,10 +219,6 @@ static GTY ((if_marked (tree_decl_map_m
  static GTY ((if_marked (tree_vec_map_marked_p), param_is (struct 
 tree_vec_map)))
   htab_t debug_args_for_decl;
  
 -static GTY ((if_marked (tree_priority_map_marked_p),
 -  param_is (struct tree_priority_map)))
 -  htab_t init_priority_for_decl;
 -
  static void set_type_quals (tree, int);
  static int type_hash_eq (const void *, const void *);
  static hashval_t type_hash_hash (const void *);
 @@ -573,8 +569,6 @@ init_ttree (void)
  
value_expr_for_decl = htab_create_ggc (512, tree_decl_map_hash,
tree_decl_map_eq, 0);
 -  init_priority_for_decl = htab_create_ggc (512, tree_priority_map_hash,
 - tree_priority_map_eq, 0);
  
int_cst_hash_table = htab_create_ggc (1024, int_cst_hash_hash,
   int_cst_hash_eq, NULL);
 @@ -6492,13 +6486,12 @@ 

Re: [PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 02:49:12PM +0400, Maxim Ostapenko wrote:
 This patch fixes invalid sanitization of trailing byte in __builtin_strlen (
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61547).
 
 Tested on x86_64-unknown-linux-gnu, no regressions.
 +
 +#include sanitizer/asan_interface.h
 +
 +char a[2] = 0;
 +
 +#ifdef __cplusplus
 +extern C
 +#endif
 +__SIZE_TYPE__ strlen (const char *p) {
 +  /* Simulate error */
 +  if (p == a)
 +return 1;

Why this?  Can't you instead just use __attribute__((no_sanitize_address, 
noinline))
on it instead?

Otherwise looks good to me

Jakub


Re: [PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Yuri Gribov
On Fri, Jun 20, 2014 at 2:57 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Jun 20, 2014 at 02:49:12PM +0400, Maxim Ostapenko wrote:
 This patch fixes invalid sanitization of trailing byte in __builtin_strlen (
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61547).

 Tested on x86_64-unknown-linux-gnu, no regressions.

What about bootstrap though?

 +__SIZE_TYPE__ strlen (const char *p) {
 +  /* Simulate error */
 +  if (p == a)
 +return 1;

 Why this?  Can't you instead just use __attribute__((no_sanitize_address, 
 noinline))
 on it instead?

Yeah, good point.

-Y


Re: [PATCH] Change default for --param allow-...-data-races to off

2014-06-20 Thread Martin Jambor
Hi,

On Thu, Jun 19, 2014 at 06:18:47PM +0200, Bernd Edlinger wrote:
 Hi,
 
 from a recent discussion on g...@gcc.gnu.org I have learned that the default 
 of
 --param allow-store-data-races is still 1, and it is causing problems.
 Therefore I would like to suggest to change the default of this option to 0.

I was about to propose a similar patch but I intended to leave the
parameter set to one when -Ofast is specified so that benchmarks are
not hurt by this and as a nice pointer for people exploring our
options to really squeeze out 100% performance (which would of course
mean documenting it too).

Thanks,

Martin

 
 Boot-strapped and regression tested on x86_64-linux-gnu.
 Ok for trunk?
 
 
 Thanks
 Bernd.
 

 gcc/ChangeLog:
 2014-06-19  Bernd Edlinger  bernd.edlin...@hotmail.de
 
   Set default for --param allow-...-data-races to off.
   * params.def (PARAM_ALLOW_LOAD_DATA_RACES,
   PARAM_ALLOW_STORE_DATA_RACES, PARAM_ALLOW_PACKED_LOAD_DATA_RACES,
   PARAM_ALLOW_PACKED_STORE_DATA_RACES): Set default to off.
 
 testsuite/ChangeLog:
 2014-06-19  Bernd Edlinger  bernd.edlin...@hotmail.de
 
   Adjust to new default for --param allow-...-data-races.
   * c-c++-common/cxxbitfields-3.c: Adjust.
   * c-c++-common/cxxbitfields-6.c: Adjust.
   * c-c++-common/simulate-thread/bitfields-1.c: Adjust.
   * c-c++-common/simulate-thread/bitfields-2.c: Adjust.
   * c-c++-common/simulate-thread/bitfields-3.c: Adjust.
   * c-c++-common/simulate-thread/bitfields-4.c: Adjust.
   * g++.dg/simulate-thread/bitfields.C: Adjust.
   * g++.dg/simulate-thread/bitfields-2.C: Adjust.
   * gcc.dg/lto/pr52097_0.c: Adjust.
   * gcc.dg/simulate-thread/speculative-store.c: Adjust.
   * gcc.dg/simulate-thread/speculative-store-2.c: Adjust.
   * gcc.dg/simulate-thread/speculative-store-3.c: Adjust.
   * gcc.dg/simulate-thread/speculative-store-4.c: Adjust.
   * gcc.dg/simulate-thread/strict-align-global.c: Adjust.
   * gcc.dg/simulate-thread/subfields.c: Adjust.
   * gcc.dg/tree-ssa/20050314-1.c: Adjust.
 




Re: [PATCH] dwarf2out.c: Pass DWARF type modifiers around as flags argument.

2014-06-20 Thread Mark Wielaard
While adding some new type modifiers I did find a typo in my original
patch, so I decided to add some testcases to make sure no regressions
were introduced. This is the same patch as the original, but with the
typo in modified_type_die fixed that could accidentally add an extra
layer of type modifiers and a new test based on the guality.exp
testsuite, but tweaked to inspect the type of a variable instead of its
value.
From b3140b1ee59560b33d08e2583c20be5a615e588b Mon Sep 17 00:00:00 2001
From: Mark Wielaard m...@redhat.com
Date: Wed, 18 Jun 2014 22:41:38 +0200
Subject: [PATCH] dwarf2out.c: Pass type modifiers as flags arguments. Add guality type test.

modified_type_die and add_type_attribute take two separate arguments
for whether the type should be const and/or volatile. To help add
more type modifiers pass the requested modifiers as one flag value
to these functions. And introduce helper functions dw_mod_type_flags
and dw_mod_decl_flags to easily extract the modifiers from type and
declaration trees.

Add a new type:var variant to the guality.exp testsuite to check that
gdb gets the correct type for a variable and use it to make sure the
change doesn't cause any regressions.

DWARFv3 added restrict_type [PR debug/59051] and DWARFv5 has proposals
for atomic_type and aligned_type. Which will hopefully be easier to
implement based on this change.

gcc/ChangeLog

	* dwarf2out.h (enum dw_mod_flag): New enum.
	* dwarf2out.c (dw_mod_decl_flags): New function.
	(dw_mod_type_flags): Likewise.
	(modified_type_die): Take one modifiers flag argument instead of
	one for const and one for volatile.
	(add_type_attribute): Likewise.
	(generic_parameter_die): Call add_type_attribute with one modifier
	argument.
	(base_type_for_mode): Likewise.
	(add_bounds_info): Likewise.
	(add_subscript_info): Likewise.
	(gen_array_type_die): Likewise.
	(gen_descr_array_type_die): Likewise.
	(gen_entry_point_die): Likewise.
	(gen_enumeration_type_die): Likewise.
	(gen_formal_parameter_die): Likewise.
	(gen_subprogram_die): Likewise.
	(gen_variable_die): Likewise.
	(gen_const_die): Likewise.
	(gen_field_die): Likewise.
	(gen_pointer_type_die): Likewise.
	(gen_reference_type_die): Likewise.
	(gen_ptr_to_mbr_type_die): Likewise.
	(gen_inheritance_die): Likewise.
	(gen_subroutine_type_die): Likewise.
	(gen_typedef_die): Likewise.
	(force_type_die): Likewise.

gcc/testsuite/ChangeLog

	* lib/gcc-gdb-test.exp (gdb-test): Handle type:var for gdb ptype
	matching.
	* gcc/testsuite/gcc.dg/guality/const-volatile.c: New test.
---
 gcc/ChangeLog |   30 ++
 gcc/dwarf2out.c   |  133 +---
 gcc/dwarf2out.h   |8 ++
 gcc/testsuite/ChangeLog   |6 +
 gcc/testsuite/gcc.dg/guality/const-volatile.c |   83 +++
 gcc/testsuite/lib/gcc-gdb-test.exp|   45 -
 6 files changed, 241 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/guality/const-volatile.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2d0a07c..4dfd9a5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,33 @@
+2014-06-20  Mark Wielaard  m...@redhat.com
+
+	* dwarf2out.h (enum dw_mod_flag): New enum.
+	* dwarf2out.c (dw_mod_decl_flags): New function.
+	(dw_mod_type_flags): Likewise.
+	(modified_type_die): Take one modifiers flag argument instead of
+	one for const and one for volatile.
+	(add_type_attribute): Likewise.
+	(generic_parameter_die): Call add_type_attribute with one modifier
+	argument.
+	(base_type_for_mode): Likewise.
+	(add_bounds_info): Likewise.
+	(add_subscript_info): Likewise.
+	(gen_array_type_die): Likewise.
+	(gen_descr_array_type_die): Likewise.
+	(gen_entry_point_die): Likewise.
+	(gen_enumeration_type_die): Likewise.
+	(gen_formal_parameter_die): Likewise.
+	(gen_subprogram_die): Likewise.
+	(gen_variable_die): Likewise.
+	(gen_const_die): Likewise.
+	(gen_field_die): Likewise.
+	(gen_pointer_type_die): Likewise.
+	(gen_reference_type_die): Likewise.
+	(gen_ptr_to_mbr_type_die): Likewise.
+	(gen_inheritance_die): Likewise.
+	(gen_subroutine_type_die): Likewise.
+	(gen_typedef_die): Likewise.
+	(force_type_die): Likewise.
+
 2014-06-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
 	* config/arm/arm_neon.h (vadd_f32): Change #ifdef to __FAST_MATH.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 933ec62..3d3508d 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3140,7 +3140,9 @@ static void output_file_names (void);
 static dw_die_ref base_type_die (tree);
 static int is_base_type (tree);
 static dw_die_ref subrange_type_die (tree, tree, tree, dw_die_ref);
-static dw_die_ref modified_type_die (tree, int, int, dw_die_ref);
+static int dw_mod_decl_flags (const_tree);
+static int dw_mod_type_flags (const_tree);
+static dw_die_ref modified_type_die (tree, int, dw_die_ref);
 static dw_die_ref generic_parameter_die (tree, tree, bool, dw_die_ref);
 static dw_die_ref 

Re: [PATCH AArch64 1/2] PR/60825 Make float64x1_t in arm_neon.h a proper vector type

2014-06-20 Thread Marcus Shawcroft
On 19 June 2014 13:27, Alan Lawrence alan.lawre...@arm.com wrote:
 This updates the .md files to generate V1DFmode patterns instead of DFmode
 for create and reinterpret, and the corresponding __builtins.

 The various other float64x1_t intrinsics can then be rewritten, generally
 I've tried to use gcc vector extensions rather than unnecessary/custom
 builtins where possible, and have started adding some range checking using
 __builtin_aarch64_im_lane_boundsi.

 Finally, rewrite the cases in arm_neon.h and various tests, that relied on
 float64[x1]_t being assignment-compatible, including arm_neon.h vfma
 functions which had the wrong (but previously equivalent) type signature;
 and add some new ABI tests.

OK /Marcus


Re: [PATCH AArch64 2/2] PR/60825 Make {int,uint}64x1_t in arm_neon.h a proper vector type

2014-06-20 Thread Marcus Shawcroft
On 19 June 2014 13:30, Alan Lawrence alan.lawre...@arm.com wrote:
 Similarly, this makes int64x1_t a proper vector type, updating arm_neon.h
 with many explicit vector construction/destruction operations (also
 including some range checking using __builtin_aarch64_im_lane_boundsi).

 Change the vabs_s64 intrinsic from using __builtin_llabs to
 __builtin_aarch64_absdi, the latter is consistent with other intrinsics and
 should have different behaviour (aarch64_abs on the minimum negative value
 should be defined to return said minimum negative value rather than
 undefined). This __builtin was previously being generated as a binary
 operator (but this was not noticed as it was unused), so I've tweaked the
 qualifiers to force unary ops to unary.

Ok /Marcus


Re: [PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Maxim Ostapenko

 What about bootstrap though?

Bootstrap in progress.

 +__SIZE_TYPE__ strlen (const char *p) {
 +  /* Simulate error */
 +  if (p == a)
 +return 1;

 Why this?  Can't you instead just use 
__attribute__((no_sanitize_address, noinline))

 on it instead?

Done.

Ok to commit if bootstrap will succeed?

-Maxim
gcc/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	Max Ostapenko  m.ostape...@partner.samsung.com

	PR sanitizer/61547
	* asan.c (instrument_strlen_call): Fixed instrumentation of
	trailing byte.

gcc/testsuite/ChangeLog:

2014-06-20  Yury Gribov  y.gri...@samsung.com
	Max Ostapenko  m.ostape...@partner.samsung.com

	PR sanitizer/61547
	* c-c++-common/asan/strlen-overflow-1.c: New test.


diff --git a/gcc/asan.c b/gcc/asan.c
index 281a795..71c063b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2036,19 +2036,19 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
 
   build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), NULL_TREE, 1, iter,
 		/*non_zero_len_p*/true, /*before_p=*/true,
-		/*is_store=*/false, /*is_scalar_access*/false, /*align*/0);
+		/*is_store=*/false, /*is_scalar_access*/true, /*align*/0);
 
-  gimple stmt =
-gimple_build_assign_with_ops (PLUS_EXPR,
-  make_ssa_name (TREE_TYPE (len), NULL),
-  len,
-  build_int_cst (TREE_TYPE (len), 1));
-  gimple_set_location (stmt, loc);
-  gsi_insert_after (iter, stmt, GSI_NEW_STMT);
+  gimple g =
+gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+  make_ssa_name (cptr_type, NULL),
+  gimple_assign_lhs (str_arg_ssa),
+  len);
+  gimple_set_location (g, loc);
+  gsi_insert_after (iter, g, GSI_NEW_STMT);
 
-  build_check_stmt (loc, gimple_assign_lhs (stmt), len, 1, iter,
+  build_check_stmt (loc, gimple_assign_lhs (g), NULL_TREE, 1, iter,
 		/*non_zero_len_p*/true, /*before_p=*/false,
-		/*is_store=*/false, /*is_scalar_access*/false, /*align*/0);
+		/*is_store=*/false, /*is_scalar_access*/true, /*align*/0);
 
   return true;
 }
diff --git a/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c b/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c
new file mode 100644
index 000..4c4585e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-skip-if  { *-*-* } { -flto } {  } } */
+/* { dg-shouldfail asan } */
+
+#include sanitizer/asan_interface.h
+
+char a[2] = 0;
+
+#ifdef __cplusplus
+extern C
+#endif
+
+__attribute__((no_sanitize_address, noinline)) __SIZE_TYPE__
+strlen (const char *p) {
+
+  __SIZE_TYPE__ n = 0;
+  for (; *p; ++n, ++p);
+  return n;
+}
+
+int main () {
+  char *p = a[0];
+  asm ( : +r(p));
+  __asan_poison_memory_region ((char *)a[1], 1);
+  return __builtin_strlen (a);
+}
+
+/* { dg-output READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r) } */
+/* { dg-output #0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*strlen-overflow-1.c:24|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*0x\[0-9a-f\]+ is located 1 bytes inside of global variable } */


Re: [PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 04:55:43PM +0400, Maxim Ostapenko wrote:
  What about bootstrap though?
 
 Bootstrap in progress.
 
  +__SIZE_TYPE__ strlen (const char *p) {
  +  /* Simulate error */
  +  if (p == a)
  +return 1;
 
  Why this?  Can't you instead just use __attribute__((no_sanitize_address,
 noinline))
  on it instead?
 
 Done.
 
 Ok to commit if bootstrap will succeed?

Ok, thanks.

Jakub


Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)

2014-06-20 Thread Jason Merrill

On 06/19/2014 01:29 PM, Marc Glisse wrote:

That looks weird to me. The const version is a better match than the
const, so we should pick that one in overload resolution, but if we
remove the const version, the other one seems valid to me


Hmm, you're right, I was confused.

Jason





Re: [PATCH] Fix for invalid sanitization of trailing byte in __builtin_strlen

2014-06-20 Thread Maxim Ostapenko


On 06/20/2014 04:59 PM, Jakub Jelinek wrote:

On Fri, Jun 20, 2014 at 04:55:43PM +0400, Maxim Ostapenko wrote:

What about bootstrap though?

Bootstrap in progress.


+__SIZE_TYPE__ strlen (const char *p) {
+  /* Simulate error */
+  if (p == a)
+return 1;

Why this?  Can't you instead just use __attribute__((no_sanitize_address,

noinline))

on it instead?

Done.

Ok to commit if bootstrap will succeed?

Ok, thanks.

Jakub


Thanks, done in r211849.

-Maxim


Re: [PATCH, Testsuite, AArch64] Make aapcs64.exp Tests Big-Endian Friendly

2014-06-20 Thread Marcus Shawcroft
On 19 June 2014 14:32, Yufeng Zhang yufeng.zh...@arm.com wrote:
 Hi,

 This patch updates a number of aapcs64 tests to make them big-endian
 friendly.  Changes are mainly:

 * checking the W regs instead of X regs for integral arguments less than 8
 bytes
 * correcting the corresponding stack location checks in big-endian mode

 With this patch, make check-gcc RUNTESTFLAGS=aapcs64.exp gives a clean
 result on aarch64_be-none-elf.

 OK for trunk?

OK thanks.
/Marcus


Re: [RFC] Add a .gitattributes file for use with git-merge-changelog

2014-06-20 Thread Tom Tromey
 Samuel == Samuel Bronson naes...@gmail.com writes:

Samuel [Am I really supposed to CC this to gcc@ like binutils/MAINTAINERS
Samuel says I should?]

I think just for files that are intended to be put in both trees and
shared.

Samuel Individual users will still have to:
Samuel  1. Install git-merge-changelog
Samuel  2. Set up the merge driver in their git config

What happens if they do not?

Tom


Re: [PATCH][AArch64] Fix some saturating math NEON intrinsics types

2014-06-20 Thread Kyrill Tkachov


On 20/06/14 09:41, Marcus Shawcroft wrote:

On 16 June 2014 15:26, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

I noticed that a few saturating math intrinsics in arm_neon.h for aarch64
have the wrong types, i.e. not what's mandated by the ACLE spec.

This patch fixes that by adjusting the types of the builtin functions that
those intrinsics map to (and in the process cleaning up the VCON iterator)
and adding tests for the affected intrinsics.

I realise it's quite big, but the changes are mostly uniform.

Bootstrapped and tested aarch64-none-linux-gnu.

Ok for trunk?

OK, can you prepare a 4.9 backport?


Sure, but it depends on 
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00779.html.

Is it ok to backport that one as well?

It passes regtest on aarch64-none-elf and aarch64_be-none-elf.

Kyrill


Cheers
/Marcus






[PATCH][AArch64] Implement vfma_f64, vmla_f64, vfms_f64, vmls_f64 intrinsics

2014-06-20 Thread Kyrill Tkachov

Hi all,

Now that Alan fixed the float64x1_t machinery, this patch implements 
some low-hanging intrinsics

in arm_neon.h.

Tested aarch64-none-elf and bootstrapped on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2014-06-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/arm_neon.h (vfma_f64): New intrinsic.
(vmla_f64): Likewise.
(vfms_f64): Likewise.
(vmls_f64): Likewise.

2014-06-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/simd/vfma_f64.c: New test.
* gcc.target/aarch64/simd/vmla_f64.c: Likewise.
* gcc.target/aarch64/simd/vfms_f64.c: Likewise.
* gcc.target/aarch64/simd/vmls_f64.c: Likewise.
commit ffb5a3efe38e50c0d410b5517e030aa37cad88b7
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Jun 18 14:25:17 2014 +0100

[AArch64] Implement vfma_f64, vmla_f64, vfms_f64, vmls_f64 in arm_neon.h

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index fd520f5..2809b3e 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -16701,6 +16701,14 @@ vextq_u64 (uint64x2_t __a, uint64x2_t __b, __const int __c)
 #endif
 }
 
+/* vfma  */
+
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vfma_f64 (float64x1_t __a, float64x1_t __b, float64x1_t __c)
+{
+  return (float64x1_t) {__builtin_fma (__b[0], __c[0], __a[0])};
+}
+
 /* vfma_lane  */
 
 __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
@@ -16804,6 +16812,14 @@ vfmaq_laneq_f64 (float64x2_t __a, float64x2_t __b,
 __a);
 }
 
+/* vfms  */
+
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vfms_f64 (float64x1_t __a, float64x1_t __b, float64x1_t __c)
+{
+  return (float64x1_t) {__builtin_fma (-__b[0], __c[0], __a[0])};
+}
+
 /* vfms_lane  */
 
 __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
@@ -18432,6 +18448,12 @@ vmla_f32 (float32x2_t a, float32x2_t b, float32x2_t c)
   return a + b * c;
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vmla_f64 (float64x1_t __a, float64x1_t __b, float64x1_t __c)
+{
+  return __a + __b * __c;
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vmlaq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)
 {
@@ -18600,6 +18622,12 @@ vmls_f32 (float32x2_t a, float32x2_t b, float32x2_t c)
   return a - b * c;
 }
 
+__extension__ static __inline float64x1_t __attribute__ ((__always_inline__))
+vmls_f64 (float64x1_t __a, float64x1_t __b, float64x1_t __c)
+{
+  return __a - __b * __c;
+}
+
 __extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
 vmlsq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)
 {
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vfma_f64.c b/gcc/testsuite/gcc.target/aarch64/simd/vfma_f64.c
new file mode 100644
index 000..d6bcf1c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vfma_f64.c
@@ -0,0 +1,41 @@
+/* Test the vfma_f64 AArch64 SIMD intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-options -save-temps -O3 } */
+
+#include arm_neon.h
+#include stdio.h
+
+#define EPS 1.0e-15
+
+
+extern void abort (void);
+
+int
+main (void)
+{
+  float64x1_t arg1;
+  float64x1_t arg2;
+  float64x1_t arg3;
+
+  float64_t expected;
+  float64_t actual;
+
+  arg1 = vcreate_f64 (0x3fe3955382d35b0eULL);
+  arg2 = vcreate_f64 (0x3fa88480812d6670ULL);
+  arg3 = vcreate_f64 (0x3fd5791ae2a92572ULL);
+
+  expected = 0.6280448184360076;
+  actual = vget_lane_f64 (vfma_f64 (arg1, arg2, arg3), 0);
+
+  if (__builtin_fabs (expected - actual)  EPS)
+{
+  fprintf (stderr, Expected: %lf, got %lf\n, expected, actual);
+  abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times fmadd\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n 1 } } */
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vfms_f64.c b/gcc/testsuite/gcc.target/aarch64/simd/vfms_f64.c
new file mode 100644
index 000..3f34758
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vfms_f64.c
@@ -0,0 +1,41 @@
+/* Test the vfms_f64 AArch64 SIMD intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-options -save-temps -O3 } */
+
+#include arm_neon.h
+#include stdio.h
+
+#define EPS 1.0e-15
+
+
+extern void abort (void);
+
+int
+main (void)
+{
+  float64x1_t arg1;
+  float64x1_t arg2;
+  float64x1_t arg3;
+
+  float64_t expected;
+  float64_t actual;
+
+  arg1 = vcreate_f64 (0x3fe730af8db9e6f7ULL);
+  arg2 = vcreate_f64 (0x3fe6b78680fa29ceULL);
+  arg3 = vcreate_f64 (0x3feea3cbf921fbe0ULL);
+
+  expected = 4.4964705746355915e-2;
+  actual = vget_lane_f64 (vfms_f64 (arg1, arg2, arg3), 0);
+
+  if (__builtin_fabs (expected - actual)  EPS)
+{
+  fprintf (stderr, Expected: %lf, got %lf\n, expected, actual);
+  abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times fmsub\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, 

[linaro/gcc-4_8-branch] Merge from gcc-4_8-branch

2014-06-20 Thread Yvan Roux
Hi,

we have merged the gcc-4_8-branch into linaro/gcc-4_8-branch up to
revision 210799 as r211850.

Thanks
Yvan


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Joseph S. Myers
I don't have any comments on this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 5/5] add libcc1

2014-06-20 Thread Tom Tromey
 Trevor == Trevor Saunders tsaund...@mozilla.com writes:

Trevor I'm curious, what is the reason you choose not to write this in C++11 or
Trevor later?  Its distributed with gcc, so the only case where you aren't
Trevor building with the in tree compiler and libraries is when your cross
Trevor compiling gcc, and it doesn't seem particularly important to support
Trevor building the plugin or library in that configuration.  So istm you could
Trevor go all the way and assume you are being built with trunk gcc and
Trevor libraries.

The plugin has to be ABI compatible with GCC itself, and my
understanding was that C++11 and however GCC is built are not
necessarily compatible.

Switching to C++11 would be an improvement -- variadic templates would
simplify the RPC code (with a complicated caveat).  So if it is possible
I am interested.

Trevor I'm going to use this as an excuse to bring up something I've wanted to
Trevor discuss for a while.

Trevor So can we add C++ stuff to libiberty and allow building
Trevor libiberty without it for binutils / gdb, or can we do something
Trevor else to avoid this kind of stuff?

One way would be to just make a new top-level directory for a new
library.

Trevor This question also arises in the case of templating splay_tree, and I
Trevor imagine if gdb switches to C++ some day they'll want to reuse vec.h.

While I would like that to happen, I think the odds are very long now.

 +connection (int fd)
 +  : m_fd (fd),
 +m_aux_fd (-1),
 +m_callbacks ()

Trevor  Personally I'd leave that to the compiler to write, but I guess there's
Trevor  something to be said for being explicit.

I can't recall if I did this in response to a warning or if it was just
because I wanted to be explicit.

I'm inclined to leave it, but I suppose only out of inertia.

 +  void print (const char *buf)

Trevor explicitly mark it as virtual?

Good idea, done.

 +// This is a wrapper function that is called by the RPC system and
 +// that then forwards the call to the library user.  Note that the
 +// return value is not used; the type cannot be 'void' due to
 +// limitations in our simple RPC.
 +gcc_address

Trevor looks like this one probably is used?

Thanks, fixed.

 +  char **argv = new (std::nothrow) char *[self-args.size () + 1];

Trevor What's the point of making this no throw? you don't null check it so
Trevor you'll crash anyway afaict.

Thanks.  I changed it to do a NULL check.  It's nothrow because nothing
in libcc1 or gdb is prepared for a C++ exception.  While I like
exceptions (gdb uses its own longjmp-based exception system
extensively), my understanding is that they aren't currently used in
gcc.

 +cc1_plugin::status
 +cc1_plugin::unmarshall (connection *conn, char **result)
 +{
 +  unsigned long long len;
 +
 +  if (!conn-require ('s'))
 +return FAIL;
 +  if (!conn-get (len, sizeof (len)))
 +return FAIL;
 +
 +  if (len == -1ULL)
 +{
 +  *result = NULL;
 +  return OK;
 +}
 +
 +  char *str = new (std::nothrow) char[len + 1];

Trevor It'd be really nice if the type of the out arg forced the caller to deal
Trevor with deleting the string like unique_ptrchar, it would be even nicer
Trevor if you could stick a random buffer in a std::string, but I guess you
Trevor can't :(

Yeah, it's all quite simplistic.  I suppose it could be upgraded, there
just didn't seem to be a need.

Trevor Also where does this array get deleted?

The unmarshalling methods are generally called via argument_wrappers.
An example is in connection.cc:

// Use an argument_wrapper here to simplify management
// of the string's lifetime.
argument_wrapperchar * method_name;

if (!method_name.unmarshall (this))
  return FAIL;

Then in rpc.hh:

  // Specialization for string types.
  template
  class argument_wrapperconst char *
  {
  public:
argument_wrapper () : m_object (NULL) { }
~argument_wrapper ()
{
  delete[] m_object;
}

Tom


Re: [PATCH] Implement -fsanitize=bounds and internal calls in FEs

2014-06-20 Thread Marek Polacek
On Thu, Jun 19, 2014 at 07:47:54PM +0200, Jakub Jelinek wrote:
 On Thu, Jun 19, 2014 at 04:56:53PM +0200, Marek Polacek wrote:
  +   /* Don't instrument this FMA-like array in non-strict
 
 Also, please don't use FMA to mean flexible member array, it is
 flexible array member, but more importantly, FMA is used for fused
 multiply-add, so IMHO it is better to spell it without acronym.

Just for the record, I've fixed this up in some earlier patch.

Marek


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Kai Tietz
I tested variant to use additional the split pass for memory combining
for indirect jumps.  By this we don't need to add a second peephole2
pass.  Other advantage of this version is that even for -O1 we do
combining.

ChangeLog

2014-06-20  Kai Tietz  kti...@redhat.com

PR target/39284
* config/i386/i386.md (peehole2): To combine
indirect jump with memory.
(split2): Likewise.

2014-06-20  Kai Tietz  kti...@redhat.com

* gcc.target/i386/indjmp-1.c: New test.

Tested for i686-pc-cygwin, and x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: testsuite/gcc.target/i386/indjmp-1.c
===
--- testsuite/gcc.target/i386/indjmp-1.c(Revision 0)
+++ testsuite/gcc.target/i386/indjmp-1.c(Arbeitskopie)
@@ -0,0 +1,23 @@
+/* { dg-do compile  { target ia32 } } */
+/* { dg-options -O2 } */
+
+#define ADVANCE_AND_DISPATCH() goto *addresses[*pc++]
+
+void
+Interpret(const unsigned char *pc)
+{
+static const void *const addresses[] = {
+  l0, l1, l2
+};
+
+l0:
+ADVANCE_AND_DISPATCH();
+
+l1:
+ADVANCE_AND_DISPATCH();
+
+l2:
+return;
+}
+
+/* { dg-final { scan-assembler-not jmp\[ \t\]*.%eax } } */
Index: config/i386/i386.md
===
--- config/i386/i386.md(Revision 211850)
+++ config/i386/i386.md(Arbeitskopie)
@@ -11466,6 +11466,24 @@
 (match_dup 4)))
   (unspec [(const_int 0)] UNSPEC_PEEPSIB)])])

+;; Combining simple memory jump instruction
+
+(define_peephole2
+  [(set (match_operand:W 0 register_operand)
+(match_operand:W 1 memory_operand))
+   (set (pc) (match_dup 0))]
+  !TARGET_X32  peep2_reg_dead_p (2, operands[0])
+  [(set (pc) (match_dup 1))])
+
+;; For avoiding a second pass for peephole, we use here split pass
+
+(define_split
+  [(set (match_operand:W 0 register_operand)
+(match_operand:W 1 memory_operand))
+   (set (pc) (match_dup 0))]
+  !TARGET_X32  peep2_reg_dead_p (2, operands[0])
+  [(set (pc) (match_dup 1))])
+
 ;; Call subroutine, returning value in operand 0

 (define_expand call_value


[PATCH] Remove bogus include path with in-tree cloog

2014-06-20 Thread Bernd Edlinger
Hi,

I have noticed there is a minor flaw with the include path when cloog is 
installed in-tree.

That is, the cloog-include directory is added twice, first with absolute path, 
and then
again with relative path, but with one ../ to little, so this is useless when 
compiling
sources in the gcc directory.

For example, if I call ../gcc-4.10-20140608/configure, the following is added 
to each
invocation of xg++:
-I/absolute_path/gcc-4.10-20140608/cloog/include 
-I../gcc-4.10-20140608/cloog/include
The attached patch removes the bogus relative include path for in-tree 
cloog/include.

Boot-strapped and regression-tested on x86_64-linux-gnu.
OK for trunk?


Thanks
Bernd.
  2014-06-20  Bernd Edlinger  bernd.edlin...@hotmail.de

Fix include path for in-tree cloog.
* config/cloog.m4 (CLOOG_INIT_FLAGS): Remove bogus include path.
* configure: Regenerate.


patch-cloog.diff
Description: Binary data


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Richard Henderson
On 06/20/2014 08:56 AM, Kai Tietz wrote:
 +(define_split
 +  [(set (match_operand:W 0 register_operand)
 +(match_operand:W 1 memory_operand))
 +   (set (pc) (match_dup 0))]
 +  !TARGET_X32  peep2_reg_dead_p (2, operands[0])
 +  [(set (pc) (match_dup 1))])
 +

Huh?  You can't use peep2 data structures in split passes.


r~


[PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Marek Polacek
When implementing -fsanitize=bounds I noticed a whole slew of
errors about accessing u.fld[] field in rtx_def.  Turned out this
is indeed a bug, the array should have a size of 8; u.hwint[] array
had similar issue.  Thus fixed, plus I added some verification code
to genpreds.c (can't do it in gengenrtl.c as that doesn't include
rtl.h) so this won't happen again.

Verified that the bootstrap crashes by bootstrapping with changed
RTX_FLD_WIDTH/RTX_HWINT_WIDTH, otherwise the bootstrapp passes.

Ok for trunk?

2014-06-20  Marek Polacek  pola...@redhat.com

* genpreds.c (verify_rtx_codes): New function.
(main): Call it.
* rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
(struct rtx_def): Use them.

diff --git gcc/genpreds.c gcc/genpreds.c
index b14a4ac..3826757 100644
--- gcc/genpreds.c
+++ gcc/genpreds.c
@@ -1471,6 +1471,40 @@ parse_option (const char *opt)
 return 0;
 }
 
+/* Verify RTX codes.  We can't call fatal_error here, so call
+   gcc_unreachable after error to really abort.  */
+
+static void
+verify_rtx_codes (void)
+{
+  unsigned int i, j;
+
+  for (i = 0; i  NUM_RTX_CODE; i++)
+if (strchr (GET_RTX_FORMAT (i), 'w') == NULL)
+  {
+   if (strlen (GET_RTX_FORMAT (i))  RTX_FLD_WIDTH)
+ {
+   error (insufficient size of RTX_FLD_WIDTH);
+   gcc_unreachable ();
+ }
+  }
+else
+  {
+   const size_t len = strlen (GET_RTX_FORMAT (i));
+   for (j = 0; j  len; j++)
+ if (GET_RTX_FORMAT (i)[j] != 'w')
+   {
+ error (rtx format does not contain only hwint entries);
+ gcc_unreachable ();
+   }
+   if (len  RTX_HWINT_WIDTH)
+ {
+   error (insufficient size of RTL_MAX_HWINT_WIDTH);
+   gcc_unreachable ();
+ }
+  }
+}
+
 /* Master control.  */
 int
 main (int argc, char **argv)
@@ -1518,5 +1552,7 @@ main (int argc, char **argv)
   if (have_error || ferror (stdout) || fflush (stdout) || fclose (stdout))
 return FATAL_EXIT_CODE;
 
+  verify_rtx_codes ();
+
   return SUCCESS_EXIT_CODE;
 }
diff --git gcc/rtl.h gcc/rtl.h
index 6ec91a8..3f2e774 100644
--- gcc/rtl.h
+++ gcc/rtl.h
@@ -264,6 +264,12 @@ struct GTY((variable_size)) hwivec_def {
 #define CWI_PUT_NUM_ELEM(RTX, NUM) \
   (RTL_FLAG_CHECK1(CWI_PUT_NUM_ELEM, (RTX), CONST_WIDE_INT)-u2.num_elem = 
(NUM))
 
+/* The maximum number of entries in the FLD array in rtx.  */
+#define RTX_FLD_WIDTH 8
+
+/* The maximum number of entries in the HWINT array in rtx.  */
+#define RTX_HWINT_WIDTH (MAX (REAL_WIDTH, 3))
+
 /* RTL expression (rtx).  */
 
 struct GTY((chain_next (RTX_NEXT (%h)),
@@ -378,8 +384,8 @@ struct GTY((chain_next (RTX_NEXT (%h)),
  The number of operands and their types are controlled
  by the `code' field, according to rtl.def.  */
   union u {
-rtunion fld[1];
-HOST_WIDE_INT hwint[1];
+rtunion fld[RTX_FLD_WIDTH];
+HOST_WIDE_INT hwint[RTX_HWINT_WIDTH];
 struct block_symbol block_sym;
 struct real_value rv;
 struct fixed_value fv;

Marek


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Kai Tietz
2014-06-20 19:23 GMT+02:00 Richard Henderson r...@redhat.com:
 On 06/20/2014 08:56 AM, Kai Tietz wrote:
 +(define_split
 +  [(set (match_operand:W 0 register_operand)
 +(match_operand:W 1 memory_operand))
 +   (set (pc) (match_dup 0))]
 +  !TARGET_X32  peep2_reg_dead_p (2, operands[0])
 +  [(set (pc) (match_dup 1))])
 +

 Huh?  You can't use peep2 data structures in split passes.


 r~

Duh, you are right ... that shouldn't work, nevertheless it
bootstrapped fine.  Well, so we will need second peephole2 pass.  I
will come with patch for that soon.

Kai


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Kai Tietz
So revert to use a second peephole2 pass before final split before sched2 pass.

Ok for apply
ChangeLog

2014-06-20  Kai Tietz  kti...@redhat.com

PR target/39284
* passes.def (peephole2): Add second peephole2 pass before
split before sched2 pass.
* config/i386/i386.md (peehole2): To combine
indirect jump with memory.
(split2): Likewise.

2014-06-20  Kai Tietz  kti...@redhat.com

* gcc.target/i386/indjmp-1.c: New test.

Tested for i686-pc-cygwin, and x86_64-unknown-linux-gnu running.  Ok
for apply when boostraps are passing?

Regards,
Kai

Index: testsuite/gcc.target/i386/indjmp-1.c
===
--- testsuite/gcc.target/i386/indjmp-1.c(Revision 0)
+++ testsuite/gcc.target/i386/indjmp-1.c(Arbeitskopie)
@@ -0,0 +1,23 @@
+/* { dg-do compile  { target ia32 } } */
+/* { dg-options -O2 } */
+
+#define ADVANCE_AND_DISPATCH() goto *addresses[*pc++]
+
+void
+Interpret(const unsigned char *pc)
+{
+static const void *const addresses[] = {
+  l0, l1, l2
+};
+
+l0:
+ADVANCE_AND_DISPATCH();
+
+l1:
+ADVANCE_AND_DISPATCH();
+
+l2:
+return;
+}
+
+/* { dg-final { scan-assembler-not jmp\[ \t\]*.%eax } } */
Index: passes.def
===
--- passes.def(Revision 211850)
+++ passes.def(Arbeitskopie)
@@ -393,6 +393,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_reorder_blocks);
   NEXT_PASS (pass_branch_target_load_optimize2);
   NEXT_PASS (pass_leaf_regs);
+  NEXT_PASS (pass_peephole2);
   NEXT_PASS (pass_split_before_sched2);
   NEXT_PASS (pass_sched2);
   NEXT_PASS (pass_stack_regs);
Index: config/i386/i386.md
===
--- config/i386/i386.md(Revision 211850)
+++ config/i386/i386.md(Arbeitskopie)
@@ -11466,6 +11466,15 @@
 (match_dup 4)))
   (unspec [(const_int 0)] UNSPEC_PEEPSIB)])])

+;; Combining simple memory jump instruction
+
+(define_peephole2
+  [(set (match_operand:W 0 register_operand)
+(match_operand:W 1 memory_operand))
+   (set (pc) (match_dup 0))]
+  !TARGET_X32  peep2_reg_dead_p (2, operands[0])
+  [(set (pc) (match_dup 1))])
+
 ;; Call subroutine, returning value in operand 0

 (define_expand call_value


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Richard Henderson
On 06/20/2014 10:52 AM, Kai Tietz wrote:
 2014-06-20  Kai Tietz  kti...@redhat.com
 
 PR target/39284
 * passes.def (peephole2): Add second peephole2 pass before
 split before sched2 pass.
 * config/i386/i386.md (peehole2): To combine
 indirect jump with memory.
 (split2): Likewise.

Why are we adding a second pass instead of just moving the one?


r~


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Kai Tietz
2014-06-20 19:55 GMT+02:00 Richard Henderson r...@redhat.com:
 On 06/20/2014 10:52 AM, Kai Tietz wrote:
 2014-06-20  Kai Tietz  kti...@redhat.com

 PR target/39284
 * passes.def (peephole2): Add second peephole2 pass before
 split before sched2 pass.
 * config/i386/i386.md (peehole2): To combine
 indirect jump with memory.
 (split2): Likewise.

 Why are we adding a second pass instead of just moving the one?


 r~

I told that in a prior mail in that thread to Jeff. IIRC there are
some conversion of impossible pushes then done too late, additional
some patterns about split  dieing register too.  Means we produce
weaker code.

Kai


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Jeff Law

On 06/20/14 12:07, Kai Tietz wrote:

2014-06-20 19:55 GMT+02:00 Richard Henderson r...@redhat.com:

On 06/20/2014 10:52 AM, Kai Tietz wrote:

2014-06-20  Kai Tietz  kti...@redhat.com

 PR target/39284
 * passes.def (peephole2): Add second peephole2 pass before
 split before sched2 pass.
 * config/i386/i386.md (peehole2): To combine
 indirect jump with memory.
 (split2): Likewise.


Why are we adding a second pass instead of just moving the one?


r~


I told that in a prior mail in that thread to Jeff. IIRC there are
some conversion of impossible pushes then done too late, additional
some patterns about split  dieing register too.  Means we produce
weaker code.
So let's dig into this deeper.  Examples  explanations would help.  I 
know it feels like a bit of a runaround, but avoiding adding the pass 
would be good.


jeff


Re: C++ PATCH for c++/59296 (rvalue object and lvalue ref-qualifier)

2014-06-20 Thread Jason Merrill

On 06/20/2014 03:11 PM, Jason Merrill wrote:

On 06/19/2014 01:29 PM, Marc Glisse wrote:

That looks weird to me. The const version is a better match than the
const, so we should pick that one in overload resolution, but if we
remove the const version, the other one seems valid to me


Hmm, you're right, I was confused.


Here's a patch that fixes this properly.

Tested x86_64-pc-linux-gnu, applying to trunk.


commit cc1e903c60c452ad7b618f6a9ff25ae85741424e
Author: Jason Merrill ja...@redhat.com
Date:   Fri Jun 20 14:51:21 2014 +0200

	PR c++/59296
	* call.c (add_function_candidate): Avoid special 'this' handling
	if we have a ref-qualifier.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index da91433..4847c3a 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -2025,9 +2025,9 @@ add_function_candidate (struct z_candidate **candidates,
 		 object parameter has reference type.  */
 		  bool rv = FUNCTION_RVALUE_QUALIFIED (TREE_TYPE (fn));
 		  parmtype = cp_build_reference_type (parmtype, rv);
-		  /* Don't bind an rvalue to a const lvalue ref-qualifier.  */
-		  if (!rv)
-		lflags |= LOOKUP_NO_RVAL_BIND|LOOKUP_NO_TEMP_BIND;
+		  /* The special handling of 'this' conversions in compare_ics
+		 does not apply if there is a ref-qualifier.  */
+		  is_this = false;
 		}
 	  else
 		{
@@ -8597,10 +8597,11 @@ compare_ics (conversion *ics1, conversion *ics2)
   /* [over.ics.rank]
 
  --S1 and S2 are reference bindings (_dcl.init.ref_) and neither refers
- to an implicit object parameter, and either S1 binds an lvalue reference
- to an lvalue and S2 binds an rvalue reference or S1 binds an rvalue
- reference to an rvalue and S2 binds an lvalue reference
- (C++0x draft standard, 13.3.3.2)
+ to an implicit object parameter of a non-static member function
+ declared without a ref-qualifier, and either S1 binds an lvalue
+ reference to an lvalue and S2 binds an rvalue reference or S1 binds an
+ rvalue reference to an rvalue and S2 binds an lvalue reference (C++0x
+ draft standard, 13.3.3.2)
 
  --S1 and S2 are reference bindings (_dcl.init.ref_), and the
  types to which the references refer are the same type except for


C++ PATCH for c++/61556 (constexpr member function)

2014-06-20 Thread Jason Merrill
Now that we're calling build_this in build_over_call, it needs to happen 
on the template path as well.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit daf445b34181c222baa792e7310fc4af2d26ec3c
Author: Jason Merrill ja...@redhat.com
Date:   Thu Jun 19 14:29:51 2014 +0200

	PR c++/61556
	* call.c (build_over_call): Call build_this in template path.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e147abd..da91433 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6896,7 +6896,7 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 
 	  ++nargs;
 	  alcarray = XALLOCAVEC (tree, nargs);
-	  alcarray[0] = first_arg;
+	  alcarray[0] = build_this (first_arg);
 	  FOR_EACH_VEC_SAFE_ELT (args, ix, arg)
 	alcarray[ix + 1] = arg;
 	  argarray = alcarray;
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-template7.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-template7.C
new file mode 100644
index 000..e835dbf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-template7.C
@@ -0,0 +1,32 @@
+// PR c++/61556
+// { dg-do compile { target c++11 } }
+
+class ValueType {
+public:
+constexpr operator int() const {return m_ID;};
+constexpr ValueType(const int v)
+: m_ID(v) {}
+private:
+int m_ID;
+};
+
+class ValueTypeEnum {
+public:
+static constexpr ValueType doubleval = ValueType(1);
+};
+
+template int format
+class ValueTypeInfo {
+};
+
+template typename Format
+class FillFunctor {
+public:
+FillFunctor() {
+ValueTypeInfoValueTypeEnum::doubleval v;
+}
+};
+
+int main() {
+ValueTypeInfoValueTypeEnum::doubleval v;
+}


Re: [PATCH, AARCH64] Enable fuse-caller-save for AARCH64

2014-06-20 Thread Tom de Vries

On 19-06-14 20:41, Richard Henderson wrote:

On 06/19/2014 11:25 AM, Tom de Vries wrote:

On 19-06-14 05:53, Richard Henderson wrote:

On 06/01/2014 03:00 AM, Tom de Vries wrote:

+aarch64_emit_call_insn (rtx pat)
+{
+  rtx insn = emit_call_insn (pat);
+
+  rtx *fusage = CALL_INSN_FUNCTION_USAGE (insn);
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM));
+  clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM));

Actually, I'd like to know more about how this is supposed to work.

Why are you only marking the two registers that would be used by a PLT entry,
but not those clobbered by the ld.so trampoline, or indeed the unknown function
that would be called from the PLT.

Oh, I see, looking at the code we do actually follow the cgraph and make sure
it is a direct call with a known destination.  So, in fact, it's only the
registers that could be clobbered by ld branch islands (so these two are still
correct for aarch64).

This means the documentation is actually wrong when it mentions PLTs at all.


Yes, if we go from the point of view that the
TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS hooks sole purpose is to enable
the fuse-caller-save optimization.

How about this updated definition ? OK for trunk if re-testing on arm succeeds ?


I did like the doc including mention of stubs, because they're easy to
forget.  How about

Set to true if each call that binds to a local definition explicitly clobbers
or sets all non-fixed registers modified by performing the call.  That is, by
the call pattern itself, or by code that might be inserted by the linker
(e.g. stubs, veneers, branch islands), but not including those modifiable by
the callee.  The affected registers may be mentioned explicitly in the
call pattern, or included as clobbers in CALL_INSN_FUNCTION_USAGE.
The default version of this hook is set to false.  The purpose of this hook
is to enable the fuse-caller-save optimization.




Looks good to me.  Bootstrapped and committed as attached.

Thanks,
- Tom

2014-06-20  Tom de Vries  t...@codesourcery.com

	* target.def (call_fusage_contains_non_callee_clobbers): Update
	definition.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c272630..45281ae 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4884,14 +4884,14 @@ Whether this target supports splitting the stack when the options described in @
 @cindex miscellaneous register hooks
 
 @deftypevr {Target Hook} bool TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
-set to true if all the calls in the current function contain clobbers in
-CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call
-rather than by the callee, and are not already set or clobbered in the call
-pattern.  Examples of such registers are registers used in PLTs and stubs,
-and temporary registers used in the call instruction but not present in the
-rtl pattern.  Another way to formulate it is the registers not present in the
-rtl pattern that are clobbered by the call assuming the callee does not
-clobber any register.  The default version of this hook is set to false.
+Set to true if each call that binds to a local definition explicitly
+clobbers or sets all non-fixed registers modified by performing the call.
+That is, by the call pattern itself, or by code that might be inserted by the
+linker (e.g. stubs, veneers, branch islands), but not including those
+modifiable by the callee.  The affected registers may be mentioned explicitly
+in the call pattern, or included as clobbers in CALL_INSN_FUNCTION_USAGE.
+The default version of this hook is set to false.  The purpose of this hook
+is to enable the fuse-caller-save optimization.
 @end deftypevr
 
 @node Varargs
diff --git a/gcc/target.def b/gcc/target.def
index e455211..ee250e6 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5128,18 +5128,18 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM.,
  hook_void_bitmap)
 
 /* Targets should define this target hook to mark that non-callee clobbers are
-   present in CALL_INSN_FUNCTION_USAGE for all the calls in the current
-   function.  */
+   present in CALL_INSN_FUNCTION_USAGE for all the calls that bind to a local
+   definition.  */
 DEFHOOKPOD
 (call_fusage_contains_non_callee_clobbers,
- set to true if all the calls in the current function contain clobbers in\n\
-CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call\n\
-rather than by the callee, and are not already set or clobbered in the call\n\
-pattern.  Examples of such registers are registers used in PLTs and stubs,\n\
-and temporary registers used in the call instruction but not present in the\n\
-rtl pattern.  Another way to formulate it is the registers not present in the\n\
-rtl pattern that are clobbered by the call assuming the callee does not\n\
-clobber any register.  The default version of this hook is set to false.,
+ Set to true if each call that binds to a local definition explicitly\n\

Re: [PATCH] Power/GCC: Remove trailing NOP from byte-swap code

2014-06-20 Thread David Edelsohn
014-06-20  Maciej W. Rozycki  ma...@codesourcery.com

gcc/
* config/rs6000/rs6000.md: Append `DONE' to preparation
statements of `bswap' pattern splitters.

Okay.

Thanks, David


Re: [PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 07:36:41PM +0200, Marek Polacek wrote:
 2014-06-20  Marek Polacek  pola...@redhat.com
 
   * genpreds.c (verify_rtx_codes): New function.
   (main): Call it.
   * rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
   (struct rtx_def): Use them.

Note, e.g. Coverity also complains about this stuff loudly too,
apparently not just in the problematic case where rtx_def is used
in a middle of structure, but also when used in flexible array
like spot.  Most RTLs are allocated through rtx_alloc and the size
is determined from RTX_HDR_SIZE (i.e. offsetof) and/or RTX_CODE_SIZE,
so your rtl.h change IMHO shouldn't affect anything but make the
expmed.c init_expmed_rtl structure somewhat longer.

 --- gcc/genpreds.c
 +++ gcc/genpreds.c
 @@ -1471,6 +1471,40 @@ parse_option (const char *opt)
  return 0;
  }
  
 +/* Verify RTX codes.  We can't call fatal_error here, so call
 +   gcc_unreachable after error to really abort.  */
 +
 +static void
 +verify_rtx_codes (void)
 +{
 +  unsigned int i, j;
 +
 +  for (i = 0; i  NUM_RTX_CODE; i++)
 +if (strchr (GET_RTX_FORMAT (i), 'w') == NULL)
 +  {
 + if (strlen (GET_RTX_FORMAT (i))  RTX_FLD_WIDTH)
 +   {
 + error (insufficient size of RTX_FLD_WIDTH);
 + gcc_unreachable ();

I think it would be nice to be more verbose, i.e. say
which rtx has longer format string than RTX_FLD_WIDTH, and perhaps also
the size and RTX_FLD_WIDTH value.  Also, can't you use internal_error
instead of error + gcc_unreachable ?

So perhaps
internal_error (%s format %s longer than RTX_FLD_WIDTH %d\n,
GET_RTX_NAME (i), GET_RTX_FORMAT (i),
(int) RTX_FLD_WIDTH);
?

 +   }
 +  }
 +else
 +  {
 + const size_t len = strlen (GET_RTX_FORMAT (i));
 + for (j = 0; j  len; j++)
 +   if (GET_RTX_FORMAT (i)[j] != 'w')
 + {
 +   error (rtx format does not contain only hwint entries);
 +   gcc_unreachable ();
 + }
 + if (len  RTX_HWINT_WIDTH)
 +   {
 + error (insufficient size of RTL_MAX_HWINT_WIDTH);
 + gcc_unreachable ();
 +   }
 +  }

And similarly here.

Otherwise, LGTM, but as I've been discussing this with Marek,
I'd prefer somebody else to review it.

Jakub


Re: [PATCH] Power/GCC: Remove trailing NOP from byte-swap code

2014-06-20 Thread Maciej W. Rozycki
On Fri, 20 Jun 2014, David Edelsohn wrote:

 gcc/
 * config/rs6000/rs6000.md: Append `DONE' to preparation
 statements of `bswap' pattern splitters.
 
 Okay.

 Committed, thanks for your review.

  Maciej


Re: [PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Jeff Law

On 06/20/14 13:01, Jakub Jelinek wrote:

On Fri, Jun 20, 2014 at 07:36:41PM +0200, Marek Polacek wrote:

2014-06-20  Marek Polacek  pola...@redhat.com

* genpreds.c (verify_rtx_codes): New function.
(main): Call it.
* rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
(struct rtx_def): Use them.


Note, e.g. Coverity also complains about this stuff loudly too,
Yes it does.  IIRC, these warnings from Coverity were the cause of GCC 
reaching the #1 or #2 position across Red Hat's packages in terms of 
Coverity warnings.  Not a good position to be in.




like spot.  Most RTLs are allocated through rtx_alloc and the size
is determined from RTX_HDR_SIZE (i.e. offsetof) and/or RTX_CODE_SIZE,
so your rtl.h change IMHO shouldn't affect anything but make the
expmed.c init_expmed_rtl structure somewhat longer.
Right.  This comment was actually very helpful in that I wasn't aware of 
precisely which cases Marek was trying to address.


Presumably the [1] sizing is what prevents any compile-time checking of 
this?


No strong opinion on the internal_error vs error+unreachable.




--- gcc/genpreds.c
+++ gcc/genpreds.c
@@ -1471,6 +1471,40 @@ parse_option (const char *opt)
  return 0;
  }

+/* Verify RTX codes.  We can't call fatal_error here, so call
+   gcc_unreachable after error to really abort.  */
+
+static void
+verify_rtx_codes (void)
+{
+  unsigned int i, j;
+
+  for (i = 0; i  NUM_RTX_CODE; i++)
+if (strchr (GET_RTX_FORMAT (i), 'w') == NULL)
+  {
+   if (strlen (GET_RTX_FORMAT (i))  RTX_FLD_WIDTH)
+ {
+   error (insufficient size of RTX_FLD_WIDTH);
+   gcc_unreachable ();


I think it would be nice to be more verbose, i.e. say
which rtx has longer format string than RTX_FLD_WIDTH, and perhaps also
the size and RTX_FLD_WIDTH value.  Also, can't you use internal_error
instead of error + gcc_unreachable ?

Agreed, definitely indicate which RTX code is problematical.

Jeff


[PATCH] Fix 61565 -- cmpelim vs non-call exceptions

2014-06-20 Thread Richard Henderson
There aren't too many users of the cmpelim pass, and previously they were all
small embedded targets without an FPU.

I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
that target is not so limited as the block comment for the pass describes.
Honestly, whatever is being deleted here ought to have been found earlier,
either via combine or cse.  We ought to find out why any changes are made
during this pass for aarch64.

That said, this PR does demonstrate a bug in the handling of fp comparisons in
the presence of -fnon-call-exceptions, so I go ahead and fix that regardless of
what we do with the aarch64 port longer term.

Bootstrap still in progress, but the original testcase is resolved.


r~
* compare-elim.c (struct comparison): Add eh_note.
(find_comparison_dom_walker::before_dom_children): Don't eliminate
a redundant comparison in a different EH region.  Purge EH edges if
necessary.


diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
index 2fbb75b..4ecdd48 100644
--- a/gcc/compare-elim.c
+++ b/gcc/compare-elim.c
@@ -100,6 +100,9 @@ struct comparison
  constants.  */
   rtx in_a, in_b;
 
+  /* The REG_EH_REGION of the comparison.  */
+  rtx eh_note;
+
   /* Information about how this comparison is used.  */
   struct comparison_use uses[MAX_CMP_USE];
 
@@ -262,6 +265,7 @@ find_comparison_dom_walker::before_dom_children 
(basic_block bb)
   struct comparison *last_cmp;
   rtx insn, next, last_clobber;
   bool last_cmp_valid;
+  bool need_purge = false;
   bitmap killed;
 
   killed = BITMAP_ALLOC (NULL);
@@ -303,44 +307,60 @@ find_comparison_dom_walker::before_dom_children 
(basic_block bb)
   if (src)
{
  enum machine_mode src_mode = GET_MODE (src);
+ rtx eh_note = NULL;
 
- /* Eliminate a compare that's redundant with the previous.  */
- if (last_cmp_valid
-  rtx_equal_p (last_cmp-in_a, XEXP (src, 0))
-  rtx_equal_p (last_cmp-in_b, XEXP (src, 1)))
-   {
- rtx flags, x;
- enum machine_mode new_mode
-   = targetm.cc_modes_compatible (last_cmp-orig_mode, src_mode);
+ if (flag_non_call_exceptions)
+   eh_note = find_reg_note (insn, REG_EH_REGION, NULL);
 
- /* New mode is incompatible with the previous compare mode.  */
- if (new_mode == VOIDmode)
-   continue;
+ if (!last_cmp_valid)
+   goto dont_delete;
 
- if (new_mode != last_cmp-orig_mode)
-   {
- flags = gen_rtx_REG (src_mode, targetm.flags_regnum);
+ /* Take care that it's in the same EH region.  */
+ if (flag_non_call_exceptions
+  !rtx_equal_p (eh_note, last_cmp-eh_note))
+   goto dont_delete;
 
- /* Generate new comparison for substitution.  */
- x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1));
- x = gen_rtx_SET (VOIDmode, flags, x);
+ /* Make sure the compare is redundant with the previous.  */
+ if (!rtx_equal_p (last_cmp-in_a, XEXP (src, 0))
+ || !rtx_equal_p (last_cmp-in_b, XEXP (src, 1)))
+   goto dont_delete;
 
- if (!validate_change (last_cmp-insn,
-   PATTERN (last_cmp-insn), x, false))
-   continue;
+ /* New mode must be compatible with the previous compare mode.  */
+ {
+   enum machine_mode new_mode
+ = targetm.cc_modes_compatible (last_cmp-orig_mode, src_mode);
+   if (new_mode == VOIDmode)
+ goto dont_delete;
 
- last_cmp-orig_mode = new_mode;
-   }
+   if (new_mode != last_cmp-orig_mode)
+ {
+   rtx x, flags = gen_rtx_REG (src_mode, targetm.flags_regnum);
 
- delete_insn (insn);
- continue;
-   }
+   /* Generate new comparison for substitution.  */
+   x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1));
+   x = gen_rtx_SET (VOIDmode, flags, x);
 
+   if (!validate_change (last_cmp-insn,
+ PATTERN (last_cmp-insn), x, false))
+ goto dont_delete;
+
+   last_cmp-orig_mode = new_mode;
+ }
+ }
+
+ /* All tests and substitutions succeeded!  */
+ if (eh_note)
+   need_purge = true;
+ delete_insn (insn);
+ continue;
+
+   dont_delete:
  last_cmp = XCNEW (struct comparison);
  last_cmp-insn = insn;
  last_cmp-prev_clobber = last_clobber;
  last_cmp-in_a = XEXP (src, 0);
  last_cmp-in_b = XEXP (src, 1);
+ last_cmp-eh_note = eh_note;
  last_cmp-orig_mode = src_mode;
  all_compares.safe_push (last_cmp);
 
@@ -404,6 +424,11 @@ find_comparison_dom_walker::before_dom_children 

testsuite allocators patch

2014-06-20 Thread François Dumont

Hi

I would like to finally propose this patch before the one on 
_Rb_tree, as a separate one.


I have adopted the same evolution on the tracker_allocator with 
even a perfect forwarding constructor to allow its usage on top of the 
uneq_allocator  which take a personality parameter. Doing so I realized 
that move_assign_neg.cc tests were not accurate enough as they needed a 
non move propagating allocator and the uneq_allocator were not 
explicitly non propagating.


2014-06-21  François Dumont  fdum...@gcc.gnu.org

* testsuite/util/testsuite_allocator.h
(tracker_allocator_counter::allocate): Remove new invocation, only
collect information.
(tracker_allocator_counter::deallocate): Remove delete invocation, only
collect information.
(check_inconsistent_alloc_value_type): New.
(tracker_allocator): Transform as a facade for any allocator type.
(uneq_allocator): Likewise.
(propagating_allocator): Likewise.
* testsuite/23_containers/forward_list/debug/move_assign_neg.cc: Use an
explicitly non propagating allocator.
* testsuite/23_containers/map/debug/move_assign_neg.cc: Likewise.
* testsuite/23_containers/multimap/debug/move_assign_neg.cc: likewise.
* testsuite/23_containers/multiset/debug/move_assign_neg.cc: Likewise.
* testsuite/23_containers/set/debug/move_assign_neg.cc: Likewise.
* testsuite/23_containers/unordered_map/debug/move_assign_neg.cc:
Likewise.
* testsuite/23_containers/unordered_multimap/debug/move_assign_neg.cc:
Likewise.
* testsuite/23_containers/unordered_multiset/debug/move_assign_neg.cc:
Likewise.
* testsuite/23_containers/unordered_set/debug/move_assign_neg.cc:
Likewise.
* testsuite/23_containers/vector/debug/move_assign_neg.cc: Likewise.

Tested under Linux x86_64.

Ok to commit ?

François



[i386] logical shift right in shrd

2014-06-20 Thread Marc Glisse

Hello,

as reported in PR 61503, there seems to be a typo in the shrd pattern. I 
think it is quite unlikely to cause any problem, because the pattern is 1 
instruction too long for combine to recognize it (by the way, if someone 
has suggestions for PR 55583...). But it is still better to fix it.


Bootstrap+testsuite on x86_64-linux-gnu.

2014-06-21  Marc Glisse  marc.gli...@inria.fr

PR target/61503
* config/i386/i386.md (x86_64_shrd, x86_shrd): Replace ashiftrt
with lshiftrt.

--
Marc GlisseIndex: gcc/config/i386/i386.md
===
--- gcc/config/i386/i386.md (revision 211856)
+++ gcc/config/i386/i386.md (working copy)
@@ -9601,37 +9601,37 @@
 (match_operand:DWI 1 register_operand)
 (match_operand:QI 2 nonmemory_operand)))
  (clobber (reg:CC FLAGS_REG))])
(match_dup 3)]
   TARGET_CMOVE
   [(const_int 0)]
   ix86_split_shift_insn (operands, operands[3], DWImode); DONE;)
 
 (define_insn x86_64_shrd
   [(set (match_operand:DI 0 nonimmediate_operand +r*m)
-(ior:DI (ashiftrt:DI (match_dup 0)
+(ior:DI (lshiftrt:DI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Jc))
(ashift:DI (match_operand:DI 1 register_operand r)
  (minus:QI (const_int 64) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   TARGET_64BIT
   shrd{q}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode DI)
(set_attr athlon_decode vector)
(set_attr amdfam10_decode vector)
(set_attr bdver1_decode vector)])
 
 (define_insn x86_shrd
   [(set (match_operand:SI 0 nonimmediate_operand +r*m)
-(ior:SI (ashiftrt:SI (match_dup 0)
+(ior:SI (lshiftrt:SI (match_dup 0)
  (match_operand:QI 2 nonmemory_operand Ic))
(ashift:SI (match_operand:SI 1 register_operand r)
  (minus:QI (const_int 32) (match_dup 2)
(clobber (reg:CC FLAGS_REG))]
   
   shrd{l}\t{%s2%1, %0|%0, %1, %2}
   [(set_attr type ishift)
(set_attr prefix_0f 1)
(set_attr mode SI)
(set_attr pent_pair np)


Re: [PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Marek Polacek
On Fri, Jun 20, 2014 at 09:01:14PM +0200, Jakub Jelinek wrote:
 On Fri, Jun 20, 2014 at 07:36:41PM +0200, Marek Polacek wrote:
  2014-06-20  Marek Polacek  pola...@redhat.com
  
  * genpreds.c (verify_rtx_codes): New function.
  (main): Call it.
  * rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
  (struct rtx_def): Use them.
 
 Note, e.g. Coverity also complains about this stuff loudly too,
 apparently not just in the problematic case where rtx_def is used
 in a middle of structure, but also when used in flexible array
 like spot.  Most RTLs are allocated through rtx_alloc and the size
 is determined from RTX_HDR_SIZE (i.e. offsetof) and/or RTX_CODE_SIZE,
 so your rtl.h change IMHO shouldn't affect anything but make the
 expmed.c init_expmed_rtl structure somewhat longer.


 
  --- gcc/genpreds.c
  +++ gcc/genpreds.c
  @@ -1471,6 +1471,40 @@ parse_option (const char *opt)
   return 0;
   }
   
  +/* Verify RTX codes.  We can't call fatal_error here, so call
  +   gcc_unreachable after error to really abort.  */
  +
  +static void
  +verify_rtx_codes (void)
  +{
  +  unsigned int i, j;
  +
  +  for (i = 0; i  NUM_RTX_CODE; i++)
  +if (strchr (GET_RTX_FORMAT (i), 'w') == NULL)
  +  {
  +   if (strlen (GET_RTX_FORMAT (i))  RTX_FLD_WIDTH)
  + {
  +   error (insufficient size of RTX_FLD_WIDTH);
  +   gcc_unreachable ();
 
 I think it would be nice to be more verbose, i.e. say
 which rtx has longer format string than RTX_FLD_WIDTH, and perhaps also
 the size and RTX_FLD_WIDTH value.  Also, can't you use internal_error
 instead of error + gcc_unreachable ?
 
 So perhaps
   internal_error (%s format %s longer than RTX_FLD_WIDTH %d\n,
   GET_RTX_NAME (i), GET_RTX_FORMAT (i),
   (int) RTX_FLD_WIDTH);
 ?
 
Ok, that's much better.  I actually can use internal_error - we have 
that function in both diagnostic.c and errors.c, while fatal_error is
only in diagnostic.c.

  + }
  +  }
  +else
  +  {
  +   const size_t len = strlen (GET_RTX_FORMAT (i));
  +   for (j = 0; j  len; j++)
  + if (GET_RTX_FORMAT (i)[j] != 'w')
  +   {
  + error (rtx format does not contain only hwint entries);
  + gcc_unreachable ();
  +   }
  +   if (len  RTX_HWINT_WIDTH)
  + {
  +   error (insufficient size of RTL_MAX_HWINT_WIDTH);
  +   gcc_unreachable ();
  + }
  +  }
 
 And similarly here.

Fixed.
 
 Otherwise, LGTM, but as I've been discussing this with Marek,
 I'd prefer somebody else to review it.

Sure.  So can anybody look at this, please?

2014-06-20  Marek Polacek  pola...@redhat.com

* genpreds.c (verify_rtx_codes): New function.
(main): Call it.
* rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
(struct rtx_def): Use them.

diff --git gcc/genpreds.c gcc/genpreds.c
index b14a4ac..7e62124 100644
--- gcc/genpreds.c
+++ gcc/genpreds.c
@@ -1471,6 +1471,36 @@ parse_option (const char *opt)
 return 0;
 }
 
+/* Verify RTX codes.  */
+
+static void
+verify_rtx_codes (void)
+{
+  unsigned int i, j;
+
+  for (i = 0; i  NUM_RTX_CODE; i++)
+if (strchr (GET_RTX_FORMAT (i), 'w') == NULL)
+  {
+   if (strlen (GET_RTX_FORMAT (i))  RTX_FLD_WIDTH)
+ internal_error (%s format %s longer than RTX_FLD_WIDTH %d\n,
+ GET_RTX_NAME (i), GET_RTX_FORMAT (i),
+ (int) RTX_FLD_WIDTH);
+  }
+else
+  {
+   const size_t len = strlen (GET_RTX_FORMAT (i));
+   for (j = 0; j  len; j++)
+ if (GET_RTX_FORMAT (i)[j] != 'w')
+   internal_error (%s format %s should contain only w, but 
+   has %c\n, GET_RTX_NAME (i), GET_RTX_FORMAT (i),
+   GET_RTX_FORMAT (i)[j]);
+   if (len  RTX_HWINT_WIDTH)
+ internal_error (%s format %s longer than RTX_HWINT_WIDTH %d\n,
+ GET_RTX_NAME (i), GET_RTX_FORMAT (i),
+ (int) RTX_HWINT_WIDTH);
+  }
+}
+
 /* Master control.  */
 int
 main (int argc, char **argv)
@@ -1518,5 +1548,7 @@ main (int argc, char **argv)
   if (have_error || ferror (stdout) || fflush (stdout) || fclose (stdout))
 return FATAL_EXIT_CODE;
 
+  verify_rtx_codes ();
+
   return SUCCESS_EXIT_CODE;
 }
diff --git gcc/rtl.h gcc/rtl.h
index 6ec91a8..3f2e774 100644
--- gcc/rtl.h
+++ gcc/rtl.h
@@ -264,6 +264,12 @@ struct GTY((variable_size)) hwivec_def {
 #define CWI_PUT_NUM_ELEM(RTX, NUM) \
   (RTL_FLAG_CHECK1(CWI_PUT_NUM_ELEM, (RTX), CONST_WIDE_INT)-u2.num_elem = 
(NUM))
 
+/* The maximum number of entries in the FLD array in rtx.  */
+#define RTX_FLD_WIDTH 8
+
+/* The maximum number of entries in the HWINT array in rtx.  */
+#define RTX_HWINT_WIDTH (MAX (REAL_WIDTH, 3))
+
 /* RTL expression (rtx).  */
 
 struct GTY((chain_next (RTX_NEXT (%h)),
@@ -378,8 +384,8 @@ struct GTY((chain_next (RTX_NEXT (%h)),
  The number of operands and 

Re: [PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 01:55:41PM -0600, Jeff Law wrote:
 like spot.  Most RTLs are allocated through rtx_alloc and the size
 is determined from RTX_HDR_SIZE (i.e. offsetof) and/or RTX_CODE_SIZE,
 so your rtl.h change IMHO shouldn't affect anything but make the
 expmed.c init_expmed_rtl structure somewhat longer.
 Right.  This comment was actually very helpful in that I wasn't aware of
 precisely which cases Marek was trying to address.
 
 Presumably the [1] sizing is what prevents any compile-time checking of
 this?

First version of Marek's patch did that (never instrumented [], [0] and
[1] arrays, no matter where they appeared, and instrumented everything
else).
Latest patch only never instruments [] (which, by definition can only
appear at the end of structure), other arrays (no matter what size)
aren't instrumented if they aren't followed by any fields, or
if the base of the handled components is not INDIRECT_REF/MEM_REF
(so, typically is a decl).
u.fld[1] array is the last field, so we don't warn for that, but when
rtx_def appears in another structure (in expmed.c) or if e.g. even
some code had a rtx_def typed variable and accessed say u.fld[1] in there,
it would be instrumented.

Whether we should have a strict array bounds mode where we would instrument
even arrays at the end of structures (with the exception of []) is something
to be discussed.

Jakub


Re: [Committed] [PATCH] PR61123 : Fix the ABI mis-matching error caused by LTO

2014-06-20 Thread Jakub Jelinek
On Fri, Jun 20, 2014 at 01:14:52PM +0800, Hale Wang wrote:
 2014-06-20 Hale Wang hale.w...@arm.com
 
 * gcc.target/arm/lto/: New folder to verify the LTO option.
 * gcc.target/arm/lto/pr61123-enum-size_0.c: New test case.
 * gcc.target/arm/lto/pr61123-enum-size_1.c: Likewise.
 * gcc.target/arm/lto/lto.exp: New exp file used to test LTO option.
 * lib/lto.exp (object-readelf): New procedure.

This FAILs on non-arm targets.

I've committed following fix as obvious.

On the other side, if you don't plan to add too many arm LTO tests,
supposedly putting it into gcc.dg/lto and just using arm*-*-* target
selector might be better.

2014-06-20  Jakub Jelinek  ja...@redhat.com

* gcc.target/arm/lto/lto.exp: Exit immediately if not arm*-*-* target.

--- gcc/testsuite/gcc.target/arm/lto/lto.exp.jj 2014-06-20 08:02:50.0 
+0200
+++ gcc/testsuite/gcc.target/arm/lto/lto.exp2014-06-20 23:19:33.850043692 
+0200
@@ -16,6 +16,10 @@
 #
 # Contributed by Diego Novillo dnovi...@google.com
 
+# Exit immediately if this isn't an ARM target.
+if ![istarget arm*-*-*] then {
+  return
+}
 
 # Test link-time optimization across multiple files.
 #


Jakub


Re: [PATCH] Fix arrays in rtx.u + add minor rtx verification

2014-06-20 Thread Marek Polacek
On Fri, Jun 20, 2014 at 11:10:18PM +0200, Jakub Jelinek wrote:
 On Fri, Jun 20, 2014 at 01:55:41PM -0600, Jeff Law wrote:
  like spot.  Most RTLs are allocated through rtx_alloc and the size
  is determined from RTX_HDR_SIZE (i.e. offsetof) and/or RTX_CODE_SIZE,
  so your rtl.h change IMHO shouldn't affect anything but make the
  expmed.c init_expmed_rtl structure somewhat longer.
  Right.  This comment was actually very helpful in that I wasn't aware of
  precisely which cases Marek was trying to address.
  
  Presumably the [1] sizing is what prevents any compile-time checking of
  this?
 
 First version of Marek's patch did that (never instrumented [], [0] and
 [1] arrays, no matter where they appeared, and instrumented everything
 else).
 Latest patch only never instruments [] (which, by definition can only
 appear at the end of structure), other arrays (no matter what size)
 aren't instrumented if they aren't followed by any fields, or
 if the base of the handled components is not INDIRECT_REF/MEM_REF
 (so, typically is a decl).
 u.fld[1] array is the last field, so we don't warn for that, but when
 rtx_def appears in another structure (in expmed.c) or if e.g. even
 some code had a rtx_def typed variable and accessed say u.fld[1] in there,
 it would be instrumented.

Yeah - init_expmed in expmed.c has
XEXP (all.plus, 1) = all.reg;
which is expanded to
(((all.plus)-u.fld[1]).rt_rtx) = all.reg;
but since the expression (A)-B is the same as A.B (if A is a valid
pointer expression), the above was turned into
all.plus.u.fld[1].rt_rtx) = all.reg;
and that doesn't contain any INDIRECT_REF/MEM_REFs - it's being
instrumented.  With this patch I don't see any -fsanitize=bounds
errors when doing bootstrap-ubsan.
 
 Whether we should have a strict array bounds mode where we would instrument
 even arrays at the end of structures (with the exception of []) is something
 to be discussed.

This should be basically just about adding a new option - e.g.
-fsanitize=bounds-strict.

Marek


[PATCH] Fix up -march=native handling under KVM (PR target/61570)

2014-06-20 Thread Jakub Jelinek
Hi!

As mentioned in the PR, some? KVM versions disable some CPU flags,
supposedly so that it can be migrated to any other x86-64 hw.
Thus, it announces only sse2 and lm, but already not sse3, ssse3
nor 3dnow, and (unfortunately) identifies itself as GenuineIntel
family 6 model 13.  There is no 64-bit CPU that actually is the lowest
common denominator of x86-64 CPUs (AMDs had 3dNOW etc., Intel first
x86-64 CPUs had already SSSE3).

This patch just makes sure we can use -march=native -m64 on such hosts,
without that gcc complains that the selected CPU doesn't support 64-bit
mode, because -march=native gives -march=pentium-m.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.9?

2014-06-20  Jakub Jelinek  ja...@redhat.com

PR target/61570
* config/i386/driver-i386.c (host_detect_local_cpu): For unknown
model family 6 CPU with has_longmode never use a CPU without
64-bit support.

--- gcc/config/i386/driver-i386.c.jj2014-05-14 14:45:54.0 +0200
+++ gcc/config/i386/driver-i386.c   2014-06-20 18:59:57.805006358 +0200
@@ -745,6 +745,11 @@ const char *host_detect_local_cpu (int a
/* Assume Core 2.  */
cpu = core2;
}
+ else if (has_longmode)
+   /* Perhaps some emulator?  Assume x86-64, otherwise gcc
+  -march=native would be unusable for 64-bit compilations,
+  as all the CPUs below are 32-bit only.  */
+   cpu = x86-64;
  else if (has_sse3)
/* It is Core Duo.  */
cpu = pentium-m;

Jakub


Re: [patch i386]: Combine memory and indirect jump

2014-06-20 Thread Kai Tietz
2014-06-20 20:14 GMT+02:00 Jeff Law l...@redhat.com:
 On 06/20/14 12:07, Kai Tietz wrote:

 2014-06-20 19:55 GMT+02:00 Richard Henderson r...@redhat.com:

 On 06/20/2014 10:52 AM, Kai Tietz wrote:

 2014-06-20  Kai Tietz  kti...@redhat.com

  PR target/39284
  * passes.def (peephole2): Add second peephole2 pass before
  split before sched2 pass.
  * config/i386/i386.md (peehole2): To combine
  indirect jump with memory.
  (split2): Likewise.


 Why are we adding a second pass instead of just moving the one?


 r~


 I told that in a prior mail in that thread to Jeff. IIRC there are
 some conversion of impossible pushes then done too late, additional
 some patterns about split  dieing register too.  Means we produce
 weaker code.

 So let's dig into this deeper.  Examples  explanations would help.  I know
 it feels like a bit of a runaround, but avoiding adding the pass would be
 good.

 jeff

I dug into it a bit.  And couldn't find any significant difference for
x64 target for existing testcases.
I am still a bit concerned - I can't reproduce it for x86/x86_64
targets - that we might cause regressions for targets by moving
peephole2 pass too close before the sched2 pass.  Therefore I searched
for the closest place to the prior place of the peephole2 pass, which
solves still the indirect jump optimization on memory.
By testing for x86/x64 the pass needs to be run directly after the
reorder blocks pass.

So I suggest following change of passes.def:

Index: passes.def
===
--- passes.def  (Revision 211850)
+++ passes.def  (Arbeitskopie)
@@ -384,7 +384,6 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_rtl_dse2);
  NEXT_PASS (pass_stack_adjustments);
  NEXT_PASS (pass_jump2);
- NEXT_PASS (pass_peephole2);
  NEXT_PASS (pass_if_after_reload);
  NEXT_PASS (pass_regrename);
  NEXT_PASS (pass_cprop_hardreg);
@@ -391,6 +390,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_fast_rtl_dce);
  NEXT_PASS (pass_duplicate_computed_gotos);
  NEXT_PASS (pass_reorder_blocks);
+ NEXT_PASS (pass_peephole2);
  NEXT_PASS (pass_branch_target_load_optimize2);
  NEXT_PASS (pass_leaf_regs);
  NEXT_PASS (pass_split_before_sched2);

Kai


Re: [PATCH, PR61219]: Fix sNaN handling in ARM float to double conversion

2014-06-20 Thread Aurelien Jarno
On Tue, Jun 17, 2014 at 11:29:01PM +0100, Ramana Radhakrishnan wrote:
 On Sun, May 18, 2014 at 10:23 PM, Aurelien Jarno aurel...@aurel32.net wrote:
  On ARM soft-float, the float to double conversion doesn't convert a sNaN
  to qNaN as the IEEE Std 754 standard mandates:
 
  Under default exception handling, any operation signaling an invalid
  operation exception and for which a floating-point result is to be
  delivered shall deliver a quiet NaN.
 
  Given the soft float ARM code ignores exceptions and always provides a
  result, a float to double conversion of a signaling NaN should return a
  quiet NaN. Fix this in extendsfdf2.
 
 
  2014-05-18  Aurelien Jarno  aurel...@aurel32.net
 
  PR target/61219
  * config/arm/ieee754-df.S (extendsfdf2): Convert sNaN to qNaN.
 
 Ok if no regressions along with a testcase to catch this case please
 and fixing the PR number

I have added such a testcase in the new version below. I didn't add it
in the target specific subdirectory, as other architectures might be
affected. Actually aarch64 is also affected, though for different
reasons. I have tested that the testcase correctly catch the issue, and
that the whole patch doesn't cause any regression.

Please find the new patch below. If it is fine, I would appreciate if
someone can commit the patch, as I don't have SVN write access (though I
have done the copyright assignment stuff).

 Sorry about the slow review.

No problem, I am also very often short on time.

Aurelien



gcc/testsuite/ChangeLog

2014-06-18  Aurelien Jarno  aurel...@aurel32.net

PR target/59833
* gcc.dg/pr59833.c: New testcase.

Index: gcc/testsuite/gcc.dg/pr59833.c
===
--- gcc/testsuite/gcc.dg/pr59833.c  (revision 0)
+++ gcc/testsuite/gcc.dg/pr59833.c  (working copy)
@@ -0,0 +1,15 @@
+/* PR target/59833 */
+/* { dg-do run } */
+/* { dg-options -lm } */
+
+extern int __issignaling (double);
+
+int
+main ()
+{
+  float sNaN = __builtin_nansf ();
+  double x = (double) sNaN;
+  if (__issignaling (x))
+  __builtin_abort ();
+  return 0;
+}


libgcc/ChangeLog
2014-06-18  Aurelien Jarno  aurel...@aurel32.net

PR target/59833
* config/arm/ieee754-df.S (extendsfdf2): Convert sNaN to qNaN.

Index: libgcc/config/arm/ieee754-df.S
===
--- libgcc/config/arm/ieee754-df.S  (revision 211428)
+++ libgcc/config/arm/ieee754-df.S  (working copy)
@@ -473,11 +473,15 @@
eorne   xh, xh, #0x3800 @ fixup exponent otherwise.
RETc(ne)@ and return it.
 
-   teq r2, #0  @ if actually 0
-   do_it   ne, e
-   teqne   r3, #0xff00 @ or INF or NAN
+   bicsr2, r2, #0xff00 @ isolate mantissa
+   do_it   eq  @ if 0, that is ZERO or INF
RETc(eq)@ we are done already.
 
+   teq r3, #0xff00 @ check for NAN
+   do_it   eq, t
+   orreq   xh, xh, #0x0008 @ change to quiet NAN
+   RETc(eq)@ and return it.
+
@ value was denormalized.  We can normalize it now.
do_push {r4, r5, lr}
mov r4, #0x380  @ setup corresponding exponent


-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net


Re: [PATCH] Fix up -march=native handling under KVM (PR target/61570)

2014-06-20 Thread H.J. Lu
On Fri, Jun 20, 2014 at 2:42 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 As mentioned in the PR, some? KVM versions disable some CPU flags,
 supposedly so that it can be migrated to any other x86-64 hw.
 Thus, it announces only sse2 and lm, but already not sse3, ssse3
 nor 3dnow, and (unfortunately) identifies itself as GenuineIntel
 family 6 model 13.  There is no 64-bit CPU that actually is the lowest
 common denominator of x86-64 CPUs (AMDs had 3dNOW etc., Intel first
 x86-64 CPUs had already SSSE3).

 This patch just makes sure we can use -march=native -m64 on such hosts,
 without that gcc complains that the selected CPU doesn't support 64-bit
 mode, because -march=native gives -march=pentium-m.

 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk/4.9?

 2014-06-20  Jakub Jelinek  ja...@redhat.com

 PR target/61570
 * config/i386/driver-i386.c (host_detect_local_cpu): For unknown
 model family 6 CPU with has_longmode never use a CPU without
 64-bit support.

 --- gcc/config/i386/driver-i386.c.jj2014-05-14 14:45:54.0 +0200
 +++ gcc/config/i386/driver-i386.c   2014-06-20 18:59:57.805006358 +0200
 @@ -745,6 +745,11 @@ const char *host_detect_local_cpu (int a
 /* Assume Core 2.  */
 cpu = core2;
 }
 + else if (has_longmode)
 +   /* Perhaps some emulator?  Assume x86-64, otherwise gcc
 +  -march=native would be unusable for 64-bit compilations,
 +  as all the CPUs below are 32-bit only.  */
 +   cpu = x86-64;
   else if (has_sse3)
 /* It is Core Duo.  */
 cpu = pentium-m;

 Jakub

host_detect_local_cpu guesses the cpu based on the real processors.
It doesn't work with emulators due to some conflicts.  This isn't the
only only place which has the same issue.   I prefer something like
this.

-- 
H.J.
---
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 3e8a995..2aada71 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -415,6 +415,7 @@ const char *host_detect_local_cpu (int argc, const
char **argv)
   bool arch;

   unsigned int l2sizekb = 0;
+  unsigned int arch_64bit = 1;

   if (argc  1)
 return NULL;
@@ -656,11 +657,14 @@ const char *host_detect_local_cpu (int argc,
const char **argv)
 {
 case PROCESSOR_I386:
   /* Default.  */
+  arch_64bit = 0;
   break;
 case PROCESSOR_I486:
+  arch_64bit = 0;
   cpu = i486;
   break;
 case PROCESSOR_PENTIUM:
+  arch_64bit = 0;
   if (arch  has_mmx)
  cpu = pentium-mmx;
   else
@@ -745,21 +749,25 @@ const char *host_detect_local_cpu (int argc,
const char **argv)
 /* Assume Core 2.  */
 cpu = core2;
  }
-  else if (has_sse3)
- /* It is Core Duo.  */
- cpu = pentium-m;
-  else if (has_sse2)
- /* It is Pentium M.  */
- cpu = pentium-m;
-  else if (has_sse)
- /* It is Pentium III.  */
- cpu = pentium3;
-  else if (has_mmx)
- /* It is Pentium II.  */
- cpu = pentium2;
   else
- /* Default to Pentium Pro.  */
- cpu = pentiumpro;
+ {
+  arch_64bit = 0;
+  if (has_sse3)
+/* It is Core Duo.  */
+cpu = pentium-m;
+  else if (has_sse2)
+/* It is Pentium M.  */
+cpu = pentium-m;
+  else if (has_sse)
+/* It is Pentium III.  */
+cpu = pentium3;
+  else if (has_mmx)
+/* It is Pentium II.  */
+cpu = pentium2;
+  else
+/* Default to Pentium Pro.  */
+cpu = pentiumpro;
+ }
 }
   else
 /* For -mtune, we default to -mtune=generic.  */
@@ -773,21 +781,30 @@ const char *host_detect_local_cpu (int argc,
const char **argv)
   if (has_longmode)
 cpu = nocona;
   else
-cpu = prescott;
+{
+  cpu = prescott;
+  arch_64bit = 0;
+}
  }
   else
- cpu = pentium4;
+ {
+  cpu = pentium4;
+  arch_64bit = 0;
+ }
   break;
 case PROCESSOR_GEODE:
+  arch_64bit = 0;
   cpu = geode;
   break;
 case PROCESSOR_K6:
+  arch_64bit = 0;
   if (arch  has_3dnow)
  cpu = k6-3;
   else
  cpu = k6;
   break;
 case PROCESSOR_ATHLON:
+  arch_64bit = 0;
   if (arch  has_sse)
  cpu = athlon-4;
   else
@@ -896,6 +913,9 @@ const char *host_detect_local_cpu (int argc, const
char **argv)
   const char *xsavec = has_xsavec ?  -mxsavec :  -mno-xsavec;
   const char *xsaves = has_xsaves ?  -mxsaves :  -mno-xsaves;

+  if (arch_64bit == 0  has_sse2  has_longmode)
+ cpu = x86-64;
+
   options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3,
  sse4a, cx16, sahf, movbe, aes, sha, pclmul,
  popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2,


Re: [PATCH, PR61219]: Fix sNaN handling in ARM float to double conversion

2014-06-20 Thread Joseph S. Myers
On Sat, 21 Jun 2014, Aurelien Jarno wrote:

 Index: gcc/testsuite/gcc.dg/pr59833.c
 ===
 --- gcc/testsuite/gcc.dg/pr59833.c(revision 0)
 +++ gcc/testsuite/gcc.dg/pr59833.c(working copy)
 @@ -0,0 +1,15 @@
 +/* PR target/59833 */
 +/* { dg-do run } */
 +/* { dg-options -lm } */
 +
 +extern int __issignaling (double);

__issignaling is a recent glibc addition, not a standard C library 
function; you can't assume it's available in a test without using a 
suitable effective-target that tests for whether it's available.

I believe linking with -lm is the default in the testsuite so shouldn't 
need specifying explicitly.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Emit DW_tag_restrict_type for restrict-qualified pointers.

2014-06-20 Thread Mark Wielaard
The following patch is based on and extends the guality test from my
earlier Pass type modifiers as flags arguments. Add guality type test.
patch.

gcc/ChangeLog

PR debug/59051
* dwarf2out.h (enum dw_mod_flag): Add dw_mod_restrict.
* dwarf2out.c (dw_mod_decl_flags): Handle TYPE_RESTRICT.
(dw_mod_type_flags): Likewise.
(dw_mods_to_quals): New function.
(dw_mod_qualified_type): Likewise.
(modified_type_die): Handle dw_mod_restrict.

gcc/testsuite/ChangeLog

PR debug/59051
* gcc.dg/guality/restrict.c: New test.
---
 gcc/ChangeLog   |   10 
 gcc/dwarf2out.c |   76 +-
 gcc/dwarf2out.h |1 +
 gcc/testsuite/ChangeLog |4 ++
 gcc/testsuite/gcc.dg/guality/restrict.c |   48 +++
 5 files changed, 126 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/guality/restrict.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4dfd9a5..06a8767 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,15 @@
 2014-06-20  Mark Wielaard  m...@redhat.com
 
+   PR debug/59051
+   * dwarf2out.h (enum dw_mod_flag): Add dw_mod_restrict.
+   * dwarf2out.c (dw_mod_decl_flags): Handle TYPE_RESTRICT.
+   (dw_mod_type_flags): Likewise.
+   (dw_mods_to_quals): New function.
+   (dw_mod_qualified_type): Likewise.
+   (modified_type_die): Handle dw_mod_restrict.
+
+2014-06-20  Mark Wielaard  m...@redhat.com
+
* dwarf2out.h (enum dw_mod_flag): New enum.
* dwarf2out.c (dw_mod_decl_flags): New function.
(dw_mod_type_flags): Likewise.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 3d3508d..b99d1b9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -10504,14 +10504,52 @@ static int
 dw_mod_decl_flags (const_tree decl)
 {
   return ((TREE_READONLY (decl) ? dw_mod_const : dw_mod_none)
- | (TREE_THIS_VOLATILE (decl) ? dw_mod_volatile : dw_mod_none));
+ | (TREE_THIS_VOLATILE (decl) ? dw_mod_volatile : dw_mod_none)
+ | ((POINTER_TYPE_P (TREE_TYPE (decl))
+  TYPE_RESTRICT (TREE_TYPE (decl)))
+? dw_mod_restrict : dw_mod_none));
 }
 
 static int
 dw_mod_type_flags (const_tree type)
 {
   return ((TYPE_READONLY (type) ? dw_mod_const : dw_mod_none)
- | (TYPE_VOLATILE (type) ? dw_mod_volatile : dw_mod_none));
+ | (TYPE_VOLATILE (type) ? dw_mod_volatile : dw_mod_none)
+ | ((POINTER_TYPE_P (type)  TYPE_RESTRICT (type))
+? dw_mod_restrict : dw_mod_none));
+}
+
+static int
+dw_mods_to_quals (int mods)
+{
+  return (((mods  dw_mod_const) ? TYPE_QUAL_CONST : 0)
+ | ((mods  dw_mod_volatile) ? TYPE_QUAL_VOLATILE : 0)
+ | ((mods  dw_mod_restrict) ? TYPE_QUAL_RESTRICT : 0));
+}
+
+/* Returns true if there is a qualified type with at least one
+   modifier given in mods.  Returns false if mods == dw_mod_none or
+   there is no qualified type with at least one of the given mods.  */
+
+static bool
+dw_mod_qualified_type (tree type, int mods)
+{
+  if (mods == dw_mod_none)
+return false;
+
+  if (get_qualified_type (type, dw_mods_to_quals (mods)) != NULL_TREE)
+return true;
+
+  if (mods  dw_mod_const)
+return dw_mod_qualified_type (type, mods  ~dw_mod_const);
+
+  if (mods  dw_mod_volatile)
+return dw_mod_qualified_type (type, mods  ~dw_mod_volatile);
+
+  if (mods  dw_mod_restrict)
+return dw_mod_qualified_type (type, mods  ~dw_mod_restrict);
+
+  gcc_unreachable ();
 }
 
 /* Given a pointer to an arbitrary ..._TYPE tree node, return a debugging
@@ -10531,13 +10569,15 @@ modified_type_die (tree type, int mods, dw_die_ref 
context_die)
   if (code == ERROR_MARK)
 return NULL;
 
+  /* Don't emit DW_TAG_restrict_type for DWARFv2, since it is a type
+ tag modifier (and not an attribute) old consumers won't be able
+ to handle it.  */
+  if (dwarf_version  3)
+mods = ~dw_mod_restrict;
+
   /* See if we already have the appropriately qualified variant of
  this type.  */
-  qualified_type
-= get_qualified_type (type, (((mods  dw_mod_const)
- ? TYPE_QUAL_CONST : 0)
-| ((mods  dw_mod_volatile)
-   ? TYPE_QUAL_VOLATILE : 0)));
+  qualified_type = get_qualified_type (type, dw_mods_to_quals (mods));
 
   if (qualified_type == sizetype
TYPE_NAME (qualified_type)
@@ -10577,8 +10617,10 @@ modified_type_die (tree type, int mods, dw_die_ref 
context_die)
}
   else if ((mods  dw_mod_const)  TYPE_READONLY (dtype)
   || (mods  dw_mod_volatile)  TYPE_VOLATILE (dtype)
+  || (mods  dw_mod_restrict)  TYPE_RESTRICT (dtype)
   || ((mods  dw_mod_const) = TYPE_READONLY (dtype)
(mods  dw_mod_volatile) = TYPE_VOLATILE (dtype)
+   (mods  dw_mod_restrict) = TYPE_RESTRICT (dtype)
   

Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-06-20 Thread Sriraman Tallam
Patch Updated.

Sri

On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Optimize access to globals with -fpie, x86_64 only:

 Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the 
 module
 using the GOT.  This is two instructions, one to get the address of the 
 global
 from the GOT and the other to get the value.  If it turns out that the 
 global
 gets defined in the executable at link-time, it still needs to go through 
 the
 GOT as it is too late then to generate a direct access.

 Examples:

 foo.cc
 --
 int a_glob;
 int main () {
   return a_glob; // defined in this file
 }

 With -O2 -fpie -pie, the generated code directly accesses the global via
 PC-relative insn:

 5e0   main:
mov0x165a(%rip),%eax# 1c40 a_glob

 foo.cc
 --

 extern int a_glob;
 int main () {
   return a_glob; // defined in this file
 }

 With -O2 -fpie -pie, the generated code accesses global via GOT using two
 memory loads:

 6f0  main:
mov0x1609(%rip),%rax   # 1d00 _DYNAMIC+0x230
mov(%rax),%eax

 This is true even if in the latter case the global was defined in the
 executable through a different file.

 Some experiments on google benchmarks shows that the extra memory loads 
 affects
 performance by 1% to 5%.


 Solution - Copy Relocations:

 When the linker supports copy relocations, GCC can always assume that the
 global will be defined in the executable.  For globals that are truly extern
 (come from shared objects), the linker will create copy relocations and have
 them defined in the executable. Result is that no global access needs to go
 through the GOT and hence improves performance.

 This patch to the gold linker :
 https://sourceware.org/ml/binutils/2014-05/msg00092.html
 submitted recently allows gold to generate copy relocations for -pie mode 
 when
 necessary.

 I have added option -mld-pie-copyrelocs which when combined with -fpie 
 would do
 this.  Note that the BFD linker does not support pie copyrelocs yet and this
 option cannot be used there.

 Please review.


 ChangeLog:

 * config/i386/i36.opt (mld-pie-copyrelocs): New option.
 * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
  address is still legitimate in the presence of copy relocations
  and -fpie.
 * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
 * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.



 Patch attached.
 Thanks
 Sri
Optimize access to globals with -fpie, x86_64 only:

Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
using the GOT.  This is two instructions, one to get the address of the global
from the GOT and the other to get the value.  If it turns out that the global
gets defined in the executable at link-time, it still needs to go through the
GOT as it is too late then to generate a direct access. 

Examples:

foo.cc
--
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via
PC-relative insn:

5e0   main:
   mov    0x165a(%rip),%eax    # 1c40 a_glob

foo.cc
--

extern int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two
memory loads:

6f0  main:
   mov    0x1609(%rip),%rax   # 1d00 _DYNAMIC+0x230
   mov    (%rax),%eax

This is true even if in the latter case the global was defined in the
executable through a different file.

Some experiments on google benchmarks shows that the extra memory loads affects
performance by 1% to 5%. 


Solution - Copy Relocations:

When the linker supports copy relocations, GCC can always assume that the
global will be defined in the executable.  For globals that are truly extern
(come from shared objects), the linker will create copy relocations and have
them defined in the executable. Result is that no global access needs to go
through the GOT and hence improves performance.

This patch to the gold linker :
https://sourceware.org/ml/binutils/2014-05/msg00092.html
submitted recently allows gold to generate copy relocations for -pie mode when
necessary.

I have added option -mcopyrelocs which when combined with -fpie would do
this.  Note that the BFD linker does not support pie copyrelocs yet and this
option cannot be used there.

Please review.


ChangeLog:

* config/i386/i36.opt (mcopyrelocs): New option.
* config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
  address is still legitimate in the presence of copy relocations
  and -fpie.
* doc/invoke.texi (mcopyrelocs): Document.
* testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
* testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.


Index: 

Re: Another AIX Bootstrap failure

2014-06-20 Thread Jan Hubicka
Hello,
after some lengthly investigation it turned out that aliases on AIX doesn't
behave in the way we expect.  In particular creating a static alias of a global
symbol has no effect. This is somewhat special behaviour of AIX's .set
pseudo-op I think I can get this fixed by simply emitting alternative symbols
into every definition instead of relying on semantic of assembler's .set.

This patch disables aliases for !SUPPORTS_ONE_ONLY targets (I hope this
to be turned back soon after we fix AIX output macros) and adds testcase
for weird behavoiur of aliases so we can possibly catch other targets
that do not behave as expected.

Bootstrapped/regtested x86_64-linux.

* gcc.dg/localalias.c: New testcase.
* gcc.dg/localalias-2.c: New testcase.
* gcc.dg/globalalias.c: New testcase.
* gcc.dg/globalalias-2.c: New testcase.
* ipa-visibility.c (function_and_variable_visibility): Disable
temporarily local aliases for some targets.

Index: testsuite/gcc.dg/localalias.c
===
--- testsuite/gcc.dg/localalias.c   (revision 0)
+++ testsuite/gcc.dg/localalias.c   (revision 0)
@@ -0,0 +1,42 @@
+/* This test checks that local aliases behave sanely.  This is necessary for 
code correctness
+   of aliases introduced by ipa-visibility pass.
+
+   If this test fails either aliases needs to be disabled on given target on 
aliases with
+   proper semantic needs to be implemented.  This is problem with e.g. AIX 
.set pseudo-op
+   that implementes alias syntactically (by substituting in assembler) rather 
as alternative
+   symbol defined on a target's location.  */
+
+/* { dg-do run }
+   { dg-options -Wstrict-aliasing=2 -fstrict-aliasing } 
+   { dg-require-alias  }
+   { dg-xfail-if  { powerpc-ibm-aix* } { * } {  } } 
+   { dg-additional-sources localalias-2.c } */
+extern void abort (void);
+extern int test2count;
+int testcount;
+__attribute__ ((weak,noinline))
+void test(void)
+{
+  testcount++;
+}
+__attribute ((alias(test)))
+static void test2(void);
+
+void main()
+{
+  test2();
+  /* This call must bind locally.  */
+  if (!testcount)
+abort ();
+  test();
+  /* Depending on linker choice, this one may bind locally
+ or to the other unit.  */
+  if (!testcount  !test2count)
+abort();
+  tt();
+
+  if ((testcount != 1 || test2count != 3)
+   (testcount != 3 || test2count != 1))
+abort ();
+  reutrn 0;
+}
Index: testsuite/gcc.dg/globalalias.c
===
--- testsuite/gcc.dg/globalalias.c  (revision 0)
+++ testsuite/gcc.dg/globalalias.c  (revision 0)
@@ -0,0 +1,42 @@
+/* This test checks that local aliases behave sanely.  This is necessary for 
code correctness
+   of aliases introduced by ipa-visibility pass.
+
+   This test expose weird behaviour of AIX's .set pseudo-op where the global 
symbol is created,
+   but all uses of the alias are syntactically replaced by uses of the target. 
 This means that
+   both counters are increased to 2.  */
+
+/* { dg-do run }
+   { dg-options -O2 } 
+   { dg-require-alias  }
+   { dg-xfail-if  { powerpc-ibm-aix* } { * } {  } } 
+   { dg-additional-sources globalalias-2.c } */
+extern int test2count;
+extern void abort (void);
+int testcount;
+static
+void test(void)
+{
+  testcount++;
+}
+__attribute__ ((weak,noinline))
+__attribute ((alias(test)))
+void test2(void);
+
+void main()
+{
+  test();
+  /* This call must bind locally.  */
+  if (!testcount)
+abort ();
+  test2();
+  /* Depending on linker choice, this one may bind locally
+ or to the other unit.  */
+  if (!testcount  !test2count)
+abort();
+  tt();
+
+  if ((testcount != 1 || test2count != 3)
+   (testcount != 3 || test2count != 1))
+abort ();
+  return 0;
+}
Index: testsuite/gcc.dg/globalalias-2.c
===
--- testsuite/gcc.dg/globalalias-2.c(revision 0)
+++ testsuite/gcc.dg/globalalias-2.c(revision 0)
@@ -0,0 +1,20 @@
+int test2count;
+extern void abort (void);
+static
+void test(void)
+{
+  test2count++;
+}
+__attribute__ ((weak,noinline))
+__attribute ((alias(test)))
+void test2(void);
+
+void tt() 
+{
+  int prev = test2count;
+  /* This call must bind locally.  */
+  test();
+  if (test2count == prev)
+abort();
+  test2();
+ }
Index: testsuite/gcc.dg/localalias-2.c
===
--- testsuite/gcc.dg/localalias-2.c (revision 0)
+++ testsuite/gcc.dg/localalias-2.c (revision 0)
@@ -0,0 +1,19 @@
+extern void abort (void);
+int test2count;
+__attribute__ ((weak,noinline))
+void test(void)
+{
+  test2count++;
+}
+__attribute ((alias(test)))
+static void test2(void);
+
+void tt() 
+{
+  int prev = test2count;
+  /* This call must bind locally.  */
+  test2();
+  if (test2count == prev)
+abort();
+  test();
+ }
Index: ipa-visibility.c