Re: [PATCH 01/13] recog: Increased max number of alternatives - v2

2015-05-22 Thread Andreas Krebbel
On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
 On 05/18/2015 04:19 PM, Richard Biener wrote:
  Please use uint64_t instead.
 
 Done. Ok with that change?

I've applied the following patch.

Bye,

-Andreas-

gcc/
* recog.h: Increase MAX_RECOG_ALTERNATIVES.
Change type of alternative_mask to uint64_t.
---
 gcc/recog.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 463c748..3a09304 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
bit giving an invalid value that can be used to mean uninitialized.  */
-#define MAX_RECOG_ALTERNATIVES 30
-typedef unsigned int alternative_mask;
+#define MAX_RECOG_ALTERNATIVES 35
+typedef uint64_t alternative_mask;
 
 /* A mask of all alternatives.  */
 #define ALL_ALTERNATIVES ((alternative_mask) -1)
-- 
1.7.9.5



Re: Calculate TYPE_CANONICAL only for types that can be accessed in memory

2015-05-22 Thread Richard Biener
On Thu, 21 May 2015, Jan Hubicka wrote:

 Hi,
 this is the next part of the series.  It disables canonical type calculation
 for incomplete types, with the exception of arrays, based on the claim that we
 do not have a good notion of canonical types for those.
 
 I can bootstrap this with additional checks in alias.c that canonical types
 are always present with LTO, but I need a fix to ICF, which compares alias
 sets of types it does not need to and trips over incomplete types otherwise.
 I will push out these fixes separately and incrementally add the fix.  The
 purpose of those checks is to avoid alias.c degenerating to the structural
 equality path for no good reason.
 
 I tried the alternative of disabling it for ARRAY_TYPEs too and avoiding
 recursion into those for fields.  This does not fly because we can have
 ARRAY_REFs of incomplete types:
  array_ref 0x769c2968
    type pointer_type 0x76942150
      type integer_type 0x76cd50a8 char readonly string-flag QI
        size integer_cst 0x76ad7ca8 constant 8
        unit size integer_cst 0x76ad7cc0 constant 1
        align 8 symtab -158253232 alias set 0 canonical type 0x76adb498
        precision 8 min integer_cst 0x76cd2090 -128 max integer_cst 0x76cd2078 127
        pointer_to_this pointer_type 0x76cd5150
      readonly unsigned DI
      size integer_cst 0x76ad7bb8 constant 64
      unit size integer_cst 0x76ad7bd0 constant 8
      align 64 symtab 0 alias set -1 canonical type 0x76af37e0
      pointer_to_this pointer_type 0x769479d8
    readonly
    arg 0 mem_ref 0x769d1028
      type array_type 0x7694b348 type pointer_type 0x76942150
        BLK
        align 64 symtab 0 alias set -1 structural equality
        pointer_to_this pointer_type 0x7694b3f0
      arg 0 addr_expr 0x769ce380 type pointer_type 0x7694b3f0
        constant arg 0 var_decl 0x76941480 reg_note_name
      arg 1 integer_cst 0x769b6678 constant 0
    arg 1 ssa_name 0x769c5630
      type integer_type 0x76adb690 int asm_written public SI
        size integer_cst 0x76ad7df8 constant 32
        unit size integer_cst 0x76ad7e10 constant 4
        align 32 symtab -158421968 alias set 3 canonical type 0x76adb690
        precision 32 min integer_cst 0x76ad7db0 -2147483648 max integer_cst 0x76ad7dc8 2147483647
        pointer_to_this pointer_type 0x76af37e0 reference_to_this reference_type 0x76942b28
      visited
      def_stmt _103 = (int) _101;
      version 103
      ptr-info 0x769f04a0
  ../../gcc/print-rtl.c:173:4
 
 and we compute alias set for it via:
 #0  internal_error (gmsgid=0x1b86c8f "in %s, at %s:%d") at ../../gcc/diagnostic.c:1271
 #1  0x015e2416 in fancy_abort (file=0x167ea2a "../../gcc/alias.c", line=823, function=0x167f7d6 <get_alias_set(tree_node*)::__FUNCTION__> "get_alias_set") at ../../gcc/diagnostic.c:1341
 #2  0x007109b9 in get_alias_set (t=0x7694b2a0) at ../../gcc/alias.c:823
 #3  0x0070fecf in component_uses_parent_alias_set_from (t=0x769c2968) at ../../gcc/alias.c:607
 #4  0x00710497 in reference_alias_ptr_type_1 (t=0x7fffe068) at ../../gcc/alias.c:719
 #5  0x007107e8 in get_alias_set (t=0x769c2968) at ../../gcc/alias.c:799
 #6  0x00ebca97 in vn_reference_lookup (op=0x769c2968, vuse=0x769ca798, kind=VN_WALKREWRITE, vnresult=0x0) at ../../gcc/tree-ssa-sccvn.c:2217
 #7  0x00ebea99 in visit_reference_op_load (lhs=0x769c5678, op=0x769c2968, stmt=0x769cf730) at ../../gcc/tree-ssa-sccvn.c:3030
 #8  0x00ec05ec in visit_use (use=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:3685
 #9  0x00ec1047 in process_scc (scc=...) at ../../gcc/tree-ssa-sccvn.c:3927
 #10 0x00ec1679 in extract_and_process_scc_for_name (name=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:4013
 #11 0x00ec1848 in DFS (name=0x769c5678) at ../../gcc/tree-ssa-sccvn.c:4065
 #12 0x00ec26d1 in cond_dom_walker::before_dom_children (this=0x7fffe5a0, bb=0x769b9888) at ../../gcc/tree-ssa-sccvn.c:4345
 #13 0x014c05c0 in dom_walker::walk (this=0x7fffe5a0, bb=0x769b9888) at ../../gcc/domwalk.c:188
 #14 0x00ec2b0e in run_scc_vn (default_vn_walk_kind_=VN_WALKREWRITE) at ../../gcc/tree-ssa-sccvn.c:4436
 #15 0x00e98d59 in (anonymous namespace)::pass_fre::execute (this=0x1f621b0, fun=0x7698db28) at ../../gcc/tree-ssa-pre.c:4972
 #16 0x00bb6c8f in execute_one_pass (pass=0x1f621b0) at ../../gcc/passes.c:2317
 #17 0x00bb6ede in execute_pass_list_1 (pass=0x1f621b0) at ../../gcc/passes.c:2370
 #18 0x00bb6f0f in execute_pass_list_1 (pass=0x1f61d90) at ../../gcc/passes.c:2371
 #19 0x00bb6f51 in execute_pass_list (fn=0x7698db28, pass=0x1f61cd0) at ../../gcc/passes.c:2381
 #20 0x007bb3f6 in 

Re: [C++ Patch] PR 61683

2015-05-22 Thread Paolo Carlini

Hi,

On 04/30/2015 01:56 AM, Paolo Carlini wrote:

Hi,

this seems pretty straightforward given the grammar. Tested x86_64-linux.
... again, given the grammar, I think this is even obvious: if nobody
screams, I'm going to commit the patch in a day or so (but I'm naming the
testcase decltype-mem-initializer1.C instead, which seems more correct to me).


Thanks!
Paolo.


Re: Cleanup and improve canonical type construction in LTO

2015-05-22 Thread Richard Biener
On Thu, 21 May 2015, Jan Hubicka wrote:

  On Wed, 20 May 2015, Jan Hubicka wrote:
   Code quality does not seem to be affected too much, which I suppose is
   partly thanks to that tree-ssa-alias.c pointer hack.  My main point was
   to clean up the hack of comparing only the TYPE_CODE of pointer_type and
   to make more sense of the whole machinery.
  
  Well, TYPE_CODE was supposed to mimic what you do explicitly now.  I'm
  curious when the recursion determines incompatibility but the TYPE_CODE
  is equal.  What cases are those?
  
  How do you handle void * vs. any other pointer type?  For cross-language
  LTO I think we need to treat those compatible.  At least void * is
  equal to struct Foo * (with incomplete Foo).
 
 Well, if you forced this rule for canonical type calculation, then by
 transitivity we could make no distinction between pointers at all.

That's kind of my point - and it was the reason to add that
get_alias_set () hack globbing all pointer-types.  I think it was
inspired by actual code in the wild (mostly C code though...) that
expects to be able to store all pointer types to *(void *)x, thus
void * being a valid abstraction for all pointer types.  At the
point of introducing the hack I decided to not allow that by
assigning alias-set zero to void * pointers (well, all pointers to
incomplete types) but to glob all pointer types instead.

 I think the only way to enforce such non-transitive rules is via alias sets.
 Just like alias set 0 aliases with everything without really degenerating every
 canonical type to a single type.

But alias-sets also obey transitivity rules, no?  Or you mean introducing
another alias-set zero globbing all pointers and using that for
pointers to incomplete types?  (What about void **?  The same abstraction
argument applies to that if I want to store an X **.)

In the end I like the simplicity of the current approach very much
(likewise using element alias-sets for vector/array type alias-sets).

   Bootstrapped/regtested x86_64-linux with LTO and also tested on Firefox.
   ppc64 build in progress.  OK?
  
  Not looking at the patch - please split it up into small pieces so we
  can bisect in case of problems.  This is a tricky area, esp. considering
  cross-language LTO.  We might be not conservative enough with
  canonical types (given the get_alias_set pointer handling).
 
 Yep, i looked into the pointer handling hack. 
 
 Thanks for looking into this.  I know it is a tricky area.

Indeed.

Richard.


Re: [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins.

2015-05-22 Thread Matthew Wahab

[Added PR number and updated patches]

On Aarch64, the __sync builtins are implemented using the __atomic operations
and barriers. This makes the __sync builtins inconsistent with their
documentation which requires stronger barriers than those for the __atomic
builtins.

The difference between __sync and __atomic builtins is that the restrictions
imposed by a __sync operation's barrier apply to all memory references while the
restrictions of an __atomic operation's barrier only need to apply to a
subset. This affects Aarch64 in particular because, although its implementation
of __atomic builtins is correct, the barriers generated are too weak for the
__sync builtins.

The affected __sync builtins are the __sync_fetch_and_op (and
__sync_op_and_fetch) functions, __sync_compare_and_swap and
__sync_lock_test_and_set. This and a following patch modify the code generated
for these functions to weaken initial load-acquires to a simple load and to add
a final fence to prevent code-hoisting. The last patch will add tests for the
code generated by the Aarch64 backend for the __sync builtins.

- Full barriers:  __sync_fetch_and_op, __sync_op_and_fetch
  __sync_*_compare_and_swap

  [load-acquire; code; store-release]
  becomes
  [load; code; store-release; fence].

- Acquire barriers:  __sync_lock_test_and_set

  [load-acquire; code; store]
  becomes
  [load; code; store; fence]

The code generated for release barriers and for the __atomic builtins is
unchanged.

This patch changes the code generated for __sync_fetch_and_op and
__sync_op_and_fetch builtins.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

gcc/
2015-05-22  Matthew Wahab  matthew.wa...@arm.com

PR target/65697
* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
(aarch64_split_atomic_op): Check for __sync memory models, emit
appropriate initial and final barriers.



From 5a2d546359f78cd3f304a62617f0fc385664374e Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Fri, 15 May 2015 09:26:28 +0100
Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I3342a572d672163ffc703e4e51603744680334fc
---
 gcc/config/aarch64/aarch64.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 8c25d75..182dbad 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9407,6 +9407,22 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
+/* Emit a post-operation barrier.  */
+
+static void
+aarch64_emit_post_barrier (enum memmodel model)
+{
+  const enum memmodel base_model = memmodel_base (model);
+
+  if (is_mm_sync (model)
+      && (base_model == MEMMODEL_ACQUIRE
+          || base_model == MEMMODEL_ACQ_REL
+          || base_model == MEMMODEL_SEQ_CST))
+    {
+      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
+    }
+}
+
 /* Split a compare and swap pattern.  */
 
 void
@@ -9469,12 +9485,20 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 {
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
+  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
+  const bool is_sync = is_mm_sync (model);
+  rtx load_model_rtx = model_rtx;
   rtx_code_label *label;
   rtx x;
 
   label = gen_label_rtx ();
   emit_label (label);
 
+  /* A __sync operation will emit a final fence to stop code hoisting, so the
+     load can be relaxed.  */
+  if (is_sync)
+    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+
   if (new_out)
     new_out = gen_lowpart (wmode, new_out);
   if (old_out)
@@ -9483,7 +9507,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
     old_out = new_out;
   value = simplify_gen_subreg (wmode, value, mode, 0);
 
-  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
+  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);
 
   switch (code)
 {
@@ -9519,6 +9543,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
 			gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+
+  /* Emit any fence needed for a __sync operation.  */
+  if (is_sync)
+    aarch64_emit_post_barrier (model);
 }
 
 static void
-- 
1.9.1



RE: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite failures

2015-05-22 Thread Kumar, Venkataramanan
Hi Kyrill, 

Sorry for the slight delay in responding.

 -Original Message-
 From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
 Sent: Tuesday, May 19, 2015 9:13 PM
 To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
 Cc: Ramana Radhakrishnan; seg...@kernel.crashing.org; Marcus Shawcroft
 Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite
 failures
 
 Hi Venkat,
 
 On 19/05/15 16:37, Kumar, Venkataramanan wrote:
  Hi Maintainers,
 
  Please find the attached patch, which fixes the add/extend gcc test suite
  failures on the AArch64 target.
  Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049
 
  These tests started to fail after we prevented the combiner from converting
  a shift RTX to a mult RTX when the RTX is not inside a memory operation
  (r222874).  Now I have added new add/extend patterns, based on shift
  operations, to fix these cases.
 
  Testing status with the patch.
 
  (1) GCC bootstrap on AArch64 successful.
  (2)  SPEC2006 INT runs did not show any degradation.
 
 Does that mean there was no performance regression? Or no codegen
 difference?

Yes, there was no performance regression.


 What I'd expect from this patch is that the codegen would be the same as
 before the combine patch (r222874).  A performance difference can sometimes
 be hard to measure even at worse code quality.
 Can you please confirm that on SPEC2006 INT the adds and shifts are now
 back to being combined into a single instruction?

I used -dp --save-temps to dump pattern names in the assembly files.
For comparison, I used the revision before the combiner patch (r222873) and
the patch applied on top of trunk (slightly older, r223194).

I quickly compared the counts generated for the patterns below in the
SPEC INT binaries:

*adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode>
*subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
*add_uxt<mode>_multp2 vs *add_uxt<mode>_shift2
*add_uxtsi_multp2_uxtw vs *add_uxtsi_shift2_uxtw
*sub_uxt<mode>_multp2 vs *sub_uxt<mode>_shift2
*sub_uxtsi_multp2_uxtw vs *sub_uxtsi_shift2_uxtw
*adds_mul_imm_<mode> vs *adds_shift_imm_<mode>
*subs_mul_imm_<mode> vs *subs_shift_imm_<mode>

The patterns are found in a few benchmarks.  The generated counts match in
the binaries compiled with the two revisions.

I also looked at the generated assembly, and the adds and shifts are
combined properly using the new patterns.  Please let me know if this is OK.

 
 Thanks,
 Kyrill
 
  (3) gcc regression testing passed.
 
  (-Snip-)
  # Comparing 3 common sum files
  ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998
 /tmp/gxx-sum2.24998
  Tests that now work, but didn't before:
 
 gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
 gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
 gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2
 gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
 gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
 gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
 gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
 gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+, x[0-9]+, sxtw 2
 
  # No differences found in 3 common sum files
  (-Snip-)
 
  The patterns are fixing the regressing tests, so I have not added any new
 tests.
  Regarding  removal of the old patterns based on mults,  I am planning to
 do it as a separate work.
 
  Is this OK for trunk ?
 
  gcc/ChangeLog
 
  2015-05-19  Venkataramanan Kumar  venkataramanan.ku...@amd.com
 
  * config/aarch64/aarch64.md
  (*adds_shift_imm_<mode>): New pattern.
  (*subs_shift_imm_<mode>): Likewise.
  (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*add_uxt<mode>_shift2): Likewise.
  (*add_uxtsi_shift2_uxtw): Likewise.
  (*sub_uxt<mode>_shift2): Likewise.
  (*sub_uxtsi_shift2_uxtw): Likewise.
 
 
  Regards,
  Venkat.
 
 
 
 

Regards,
Venkat.


Re: [PATCH] Fix PR65701(?), fix typo in vect_enhance_data_refs_alignment

2015-05-22 Thread Richard Biener
On Fri, 10 Apr 2015, Richard Biener wrote:

 On Fri, 10 Apr 2015, Richard Biener wrote:
 
  
  The following patch fixes a typo (I think) which is present since the
  original introduction of the code in vect_enhance_data_refs_alignment.
  
    if (do_peeling
        && all_misalignments_unknown
        && vect_supportable_dr_alignment (dr0, false))
      {
        /* Check if the target requires to prefer stores over loads, i.e., if
           misaligned stores are more expensive than misaligned loads (taking
           drs with same alignment into account).  */
        if (first_store && DR_IS_READ (dr0))
          {
          ...
          }

        /* In case there are only loads with different unknown misalignments,
           use peeling only if it may help to align other accesses in the
           loop.  */
        if (!first_store
            && !STMT_VINFO_SAME_ALIGN_REFS (
                   vinfo_for_stmt (DR_STMT (dr0))).length ()
            && vect_supportable_dr_alignment (dr0, false)
               != dr_unaligned_supported)
          do_peeling = false;
      }
  
  the last vect_supportable_dr_alignment check doesn't make much sense.
  It's not mentioned in the comment and I see no reason for treating
  dr_unaligned_supported any different here compared to
  dr_explicit_realign_[optimized].
 
 Ok, as always a second after you hit send you think of the reason
 of this test.  I suppose the idea was that aligning a single load
 might improve load bandwith (same reason we eventually prefer
 aligning stores).  dr_explicit_realign_[optimized] perform
 aligned loads thus it makes sense to special-case them.
 I suppose the cost model in that case should be more precise
 (and note that for bdver2 aligned and unaligned loads have the
 same cost).
 
 So sth like
 
    /* In case there are only loads with different unknown misalignments, use
       peeling only if it may help to align other accesses in the loop or
       if it may help improving load bandwith when we'd end up using
       unaligned loads.  */
    tree dr0_vt = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr0)));
    if (!first_store
        && !STMT_VINFO_SAME_ALIGN_REFS (
               vinfo_for_stmt (DR_STMT (dr0))).length ()
        && (vect_supportable_dr_alignment (dr0, false)
            != dr_unaligned_supported
            || (builtin_vectorization_cost (vector_load, dr0_vt, 0)
                == builtin_vectorization_cost (unaligned_load, dr0_vt, -1))))
      do_peeling = false;
 
 Looks like all Intel CPUs still have larger unaligned load cost
 compared to aligned load cost (same is true for bdver2 in principle -
 the cost isn't about the instruction encoding but the actual
 data alignment).  So it might be artificially fixing 187.facerec with
 bdver2 (we don't have a good idea whether we're going to be
 store- or load-bandwidth limited in the vectorizer).

Still the following is what I have bootstrapped and tested
on x86_64-unknown-linux-gnu and applied to trunk.

Richard.

2015-05-22  Richard Biener  rguent...@suse.de

PR tree-optimization/65701
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Move peeling cost models into one place.  Peel for alignment
for single loads only if an aligned load is cheaper than
an unaligned load.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 223349)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -1541,16 +1541,6 @@ vect_enhance_data_refs_alignment (loop_v
   || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
 do_peeling = false;
 
-  /* If we don't know how many times the peeling loop will run
-     assume it will run VF-1 times and disable peeling if the remaining
-     iters are less than the vectorization factor.  */
-  if (do_peeling
-      && all_misalignments_unknown
-      && LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
-      && (LOOP_VINFO_INT_NITERS (loop_vinfo)
-          < 2 * (unsigned) LOOP_VINFO_VECT_FACTOR (loop_vinfo) - 1))
-    do_peeling = false;
-
   if (do_peeling
       && all_misalignments_unknown
       && vect_supportable_dr_alignment (dr0, false))
@@ -1619,12 +1609,17 @@ vect_enhance_data_refs_alignment (loop_v
 }
 
   /* In case there are only loads with different unknown misalignments, use
-     peeling only if it may help to align other accesses in the loop.  */
+     peeling only if it may help to align other accesses in the loop or
+     if it may help improving load bandwith when we'd end up using
+     unaligned loads.  */
+  tree dr0_vt = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr0)));
   if (!first_store
       && !STMT_VINFO_SAME_ALIGN_REFS (
              vinfo_for_stmt (DR_STMT (dr0))).length ()
-      && vect_supportable_dr_alignment (dr0, false)
-         != dr_unaligned_supported)
+      && (vect_supportable_dr_alignment 

Re: [patch 1/10] debug-early merge: Ada front-end

2015-05-22 Thread Eric Botcazou
 My apologies for the delay on Ada.  I have reworked the patch to leave
 the first pass on the TYPE_DECLs which are definitely needed.  I also
 optimized things a bit, since we don't need to save all the globals any
 more.

Thanks, this looks fine modulo a couple of nits, see below.

 There is one regression which I'd like you and Jason's input before
 proceeding:
 
 +FAIL: gnat.dg/specs/debug1.ads scan-assembler-times DW_AT_artificial 17
 
 The problem is that the Ada front-end twiddles the DECL_ARTIFICIAL flag
 *after* it has called debug_hooks->early_global_decl().  The function
 gnat_to_gnu_entity() calls create_var_decl_1 -> rest_of_decl_compilation,
 but then much later twiddles DECL_ARTIFICIAL:
 
if (!Comes_From_Source (gnat_entity))
   DECL_ARTIFICIAL (gnu_decl) = 1;
 
 Twiddling DECL_ARTIFICIAL after we create early dwarf, means the DIE
 does not get the DW_AT_artificial.  Would it be possible for you guys
 (ahem, Ada folk) to set DECL_ARTIFICIAL before calling
 rest_of_decl_compilation?

Sure, just add a ??? comment before the above lines for now.


@@ -535,8 +535,7 @@ extern tree gnat_type_for_size (unsigned precision, int unsignedp);
an unsigned type; otherwise a signed type is returned.  */
 extern tree gnat_type_for_mode (machine_mode mode, int unsignedp);
 
-/* Emit debug info for all global variable declarations.  */
-extern void gnat_write_global_declarations (void);
+extern void note_types_used_by_globals (void);

The comment needs to be adjusted instead of removed, and should preferably be
the same as the one (adjusted too) above the function in utils.c.

-- 
Eric Botcazou


Re: [PATCH 3/3][Aarch64][PR target/65697] Add tests for __sync_builtins.

2015-05-22 Thread Matthew Wahab

[Added PR number and updated patches]

This patch adds tests for the code generated by the Aarch64 backend for the
__sync builtins.

Tested aarch64-none-linux-gnu with check-gcc.

Ok for trunk?
Matthew

gcc/testsuite/
2015-05-21  Matthew Wahab  matthew.wa...@arm.com

PR target/65697
* gcc.target/aarch64/sync-comp-swap.c: New.
* gcc.target/aarch64/sync-comp-swap.x: New.
* gcc.target/aarch64/sync-op-acquire.c: New.
* gcc.target/aarch64/sync-op-acquire.x: New.
* gcc.target/aarch64/sync-op-full.c: New.
* gcc.target/aarch64/sync-op-full.x: New.
* gcc.target/aarch64/sync-op-release.c: New.
* gcc.target/aarch64/sync-op-release.x: New.


From a3e8df9afce1098c1c616d66a309ce5bc5b95593 Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Fri, 15 May 2015 09:31:42 +0100
Subject: [PATCH 3/3] [Aarch64] Add tests for __sync_builtins.

Change-Id: I9f7cde85613dfe2cb6df55cbc732e683092f14d8
---
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c  |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x  | 13 
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x |  7 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.c|  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.x| 73 ++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.c |  6 ++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.x |  7 +++
 8 files changed, 130 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.x

diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
new file mode 100644
index 000..126b997
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf" } */
+
+#include "sync-comp-swap.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
new file mode 100644
index 000..eda52e40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
@@ -0,0 +1,13 @@
+int v = 0;
+
+int
+sync_bool_compare_swap (int a, int b)
+{
+  return __sync_bool_compare_and_swap (&v, a, b);
+}
+
+int
+sync_val_compare_swap (int a, int b)
+{
+  return __sync_val_compare_and_swap (&v, a, b);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
new file mode 100644
index 000..2639f9f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-acquire.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "stxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
new file mode 100644
index 000..4c4548c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
@@ -0,0 +1,7 @@
+int v;
+
+int
+sync_lock_test_and_set (int a)
+{
+  return __sync_lock_test_and_set (&v, a);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
new file mode 100644
index 000..10fc8fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-full.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 12 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.x b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
new file mode 100644
index 000..c24223d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
@@ -0,0 +1,73 @@
+int v = 0;
+
+int
+sync_fetch_and_add (int a)
+{
+  return __sync_fetch_and_add (&v, a);
+}
+
+int
+sync_fetch_and_sub (int a)
+{
+  return __sync_fetch_and_sub (&v, a);
+}
+
+int

Re: C/C++ PATCH to allow deprecating enum values (PR c/47043)

2015-05-22 Thread Marek Polacek
On Thu, May 21, 2015 at 02:00:26PM -0400, Jason Merrill wrote:
 On 05/07/2015 12:22 PM, Marek Polacek wrote:
 -  mark_used (decl);
 +  mark_used (decl, 0);
 
 This should use tf_none rather than 0.

Fixed.

 +  build_enumerator (DECL_NAME (decl), value, newtag,
+                    DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));
 
 This is assuming that enumerators can't have dependent attributes.  I guess
 that's currently true, but please add a comment about it.

Done.
 
 OK with those changes.

Thanks, applied.  Here's the final version.

2015-05-22  Marek Polacek  pola...@redhat.com
Edward Smith-Rowland  3dw...@verizon.net

PR c/47043
* c-common.c (handle_deprecated_attribute): Allow CONST_DECL.

* c-parser.c (c_parser_enum_specifier): Parse and apply enumerator
attributes.

* cp-tree.h (build_enumerator): Update declaration.
* decl.c (build_enumerator): Add attributes parameter.  Call
cplus_decl_attributes.
* init.c (constant_value_1): Pass tf_none to mark_used.
* parser.c (cp_parser_enumerator_definition): Parse attributes and
pass them down to build_enumerator.
* pt.c (tsubst_enum): Pass decl attributes to build_enumerator.
* semantics.c (finish_id_expression): Don't warn_deprecated_use here.

* doc/extend.texi (Enumerator Attributes): New section.
Document syntax of enumerator attributes.

* c-c++-common/attributes-enum-1.c: New test.
* c-c++-common/attributes-enum-2.c: New test.
* g++.dg/cpp0x/attributes-enum-1.C: New test.
* g++.dg/cpp1y/attributes-enum-1.C: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index a2b3793..36c984c 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -9088,6 +9088,7 @@ handle_deprecated_attribute (tree *node, tree name,
  || TREE_CODE (decl) == PARM_DECL
  || VAR_OR_FUNCTION_DECL_P (decl)
  || TREE_CODE (decl) == FIELD_DECL
+ || TREE_CODE (decl) == CONST_DECL
  || objc_method_decl (TREE_CODE (decl)))
TREE_DEPRECATED (decl) = 1;
   else
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index f496733..965b4b9 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -2516,6 +2516,13 @@ c_parser_declspecs (c_parser *parser, struct c_declspecs *specs,
enumerator:
  enumeration-constant
  enumeration-constant = constant-expression
+
+   GNU Extensions:
+
+   enumerator:
+ enumeration-constant attributes[opt]
+ enumeration-constant attributes[opt] = constant-expression
+
 */
 
 static struct c_typespec
@@ -2575,6 +2582,8 @@ c_parser_enum_specifier (c_parser *parser)
  c_parser_set_source_position_from_token (token);
	  decl_loc = value_loc = token->location;
  c_parser_consume_token (parser);
+ /* Parse any specified attributes.  */
+ tree enum_attrs = c_parser_attributes (parser);
  if (c_parser_next_token_is (parser, CPP_EQ))
{
  c_parser_consume_token (parser);
@@ -2584,7 +2593,9 @@ c_parser_enum_specifier (c_parser *parser)
  else
enum_value = NULL_TREE;
  enum_decl = build_enumerator (decl_loc, value_loc,
-   the_enum, enum_id, enum_value);
+   the_enum, enum_id, enum_value);
+ if (enum_attrs)
+	   decl_attributes (&TREE_PURPOSE (enum_decl), enum_attrs, 0);
  TREE_CHAIN (enum_decl) = values;
  values = enum_decl;
  seen_comma = false;
diff --git gcc/cp/cp-tree.h gcc/cp/cp-tree.h
index 4136d98..91619e2 100644
--- gcc/cp/cp-tree.h
+++ gcc/cp/cp-tree.h
@@ -5400,7 +5400,7 @@ extern bool xref_basetypes (tree, tree);
 extern tree start_enum (tree, tree, tree, bool, bool *);
 extern void finish_enum_value_list (tree);
 extern void finish_enum (tree);
-extern void build_enumerator (tree, tree, tree, location_t);
+extern void build_enumerator (tree, tree, tree, tree, location_t);
 extern tree lookup_enumerator (tree, tree);
 extern bool start_preparsed_function (tree, tree, int);
 extern bool start_function (cp_decl_specifier_seq *,
diff --git gcc/cp/decl.c gcc/cp/decl.c
index e4d3c1d..5396994 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -13057,11 +13057,12 @@ finish_enum (tree enumtype)
 
 /* Build and install a CONST_DECL for an enumeration constant of the
enumeration type ENUMTYPE whose NAME and VALUE (if any) are provided.
-   LOC is the location of NAME.
+   Apply ATTRIBUTES if available.  LOC is the location of NAME.
Assignment of sequential values by default is handled here.  */
 
 void
-build_enumerator (tree name, tree value, tree enumtype, location_t loc)
+build_enumerator (tree name, tree value, tree enumtype, 

Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.

2015-05-22 Thread Andreas Krebbel
On Tue, May 19, 2015 at 07:48:29AM -0700, Richard Henderson wrote:
  Ok to apply with that change?
 
 Yes, thanks.

I've applied the following.

Bye,

-Andreas-

gcc/
* optabs.c (expand_vec_perm): Don't re-use SEL as target operand.
---
 gcc/optabs.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/optabs.c b/gcc/optabs.c
index bd03fc1..bc19029 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6796,11 +6796,11 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
   machine_mode selmode = GET_MODE (sel);
   if (u == 2)
sel = expand_simple_binop (selmode, PLUS, sel, sel,
-  sel, 0, OPTAB_DIRECT);
+  NULL, 0, OPTAB_DIRECT);
   else
sel = expand_simple_binop (selmode, ASHIFT, sel,
   GEN_INT (exact_log2 (u)),
-  sel, 0, OPTAB_DIRECT);
+  NULL, 0, OPTAB_DIRECT);
   gcc_assert (sel != NULL);
 
   /* Broadcast the low byte each element into each of its bytes.  */
-- 
1.7.9.5



Re: Check canonical types in verify_type

2015-05-22 Thread Richard Biener
On Thu, 21 May 2015, Jan Hubicka wrote:

   Hmm, I see, interesting hack.  For the first part of the comment, I see that
   qualifiers need to be ignored, but I do not see why we put
   short * and int * pointers in the same class.
  
  For the reason that people are very lazy.  For example GCC has code
  punning void ** and void * and void * to Type * (and vice versa).  People
  just don't expect pointers to be in different alias sets (and there is
  little gained with doing that).
 
 I think this observation may be out of date.  Removing that code (punting all
 pointers to ptr_type_node) increases the number of TBAA disambiguations for
 Firefox by 20%.  It also drops the number of queries, so in reality the
 number is higher.
 
 So it seems it may be worth tracking this in a more sensible way.  This was
 tested on LTO, where most pointer alias sets are the same anyway, so the
 increase should be bigger in a non-LTO build.  I will gather some stats.
 
 The issue is that modern C++ code packs everything into instances, puts
 pointers to instances everywhere and, as we inline everything into one glob,
 we could track down a lot of data if we did not get lost in pointers.
 
 We may want to have a flag disabling that, and we may want to throw away some
 precision without throwing away all of it.
 
 I tried to just throw away the qualifier from pointers (i.e. make a pointer
 to a qualified type be globbed to the pointer to the unqualified type).  This
 seems to work, in a way: GCC bootstraps with one bug (in ipa-icf) and passes
 the testsuite.

Well, see the big comment before the alias.c pointer type handling.
This is what the C family frontends did before in its get_alias_set
langhook:

  t1 = build_type_no_quals (t);
  if (t1 != t)
return get_alias_set (t1);

where build_type_no_quals strips qualifiers recursively from t and
pointed-to-types.

   The complete/incomplete type fun with LTO can be solved for C++ but indeed
   I see why a pointer to an incomplete type needs to be considered compatible
   with every other pointer to structure.  Can't this be dealt with by adding
   correct edges into the aliasing dag?
  
  I don't think so, without the dag degenerating.  We've had this mess
  before (complete vs. incomplete structs and so on).  It didn't work
  very well.
 
 Can you point me to the code/issues, please?

It was the old type merging code.  I can spot it in the 4.5 code,
gimple.c:gimple_types_compatible_p where in the POINTER_TYPE
case it does

  && compare_type_names_p (TYPE_MAIN_VARIANT (TREE_TYPE (t1)),
                           TYPE_MAIN_VARIANT (TREE_TYPE (t2)), true))
  {
    /* Replace the pointed-to incomplete type with the
       complete one.
       ???  This simple name-based merging causes at least some
       of the ICEs in canonicalizing FIELD_DECLs during stmt
       read.  For example in GCC we have two different struct deps
       and we mismatch the use in struct cpp_reader in sched-int.h
       vs. mkdeps.c.  Of course the whole exercise is for TBAA
       with structs which contain pointers to incomplete types
       in one unit and to complete ones in another.  So we
       probably should merge these types only with more context.  */
    if (COMPLETE_TYPE_P (TREE_TYPE (t2)))
      TREE_TYPE (t1) = TREE_TYPE (t2);
    else
      TREE_TYPE (t2) = TREE_TYPE (t1);

where that of course wrecks the tree graph quite a bit (and it is
dependent on the order of TUs).  Luckily we now have SCC based
streaming and merging, not trying to complete types to get more
merging done.

  
  I don't remember if I added a testcase, so no.  It's just very clear to me
  that the look of pointer members may be very different (consider 
  reference vs. pointer or pointer to array vs pointer to element).
 
 Yep, depending on how much cross-language data-structure sharing we want to
 permit.  We could definitely unpeel those differences just like I did with
 qualifiers.  However this trick is kind of weird because it does not
 propagate up to aggregates built from these.
 
 In a way some of this should probably better be handled at canonical type
 calculation time if we really want to permit mixing up aggregates with those
 differences.
 
 For example I think it would make sense to look through ARRAY_TYPEs when
 determining a pointer's canonical type (so pointer to array and pointer to
 element are the same).  We may also ignore the TREE_CODE match for
 POINTER_TYPE_P if we really do have languages that mix those.

Another case (I've seen this in fortran vs. C) is:

 int i;

vs.

 common/i/a(1)

thus inter-operability of int[1] and int.  A similar issue (with alias
sets used for accesses) is

struct X { int trailing[1]; };

and code doing

 struct X *p;
 int (*a)[16] = p->trailing;

and

 (*a)[i] vs. p->trailing[i]

(or in the middle-end with frontends that can emit array assignments).


[PATCH] Combine related fail of gcc.target/powerpc/ti_math1.c

2015-05-22 Thread Alan Modra
This patch fixes
FAIL: gcc.target/powerpc/ti_math1.c scan-assembler-times adde 1
a failure caused by combine simplifying this i2src

(plus:DI (plus:DI (reg:DI 165 [ val+8 ])
(reg:DI 169 [+8 ]))
(reg:DI 76 ca))

to this

(plus:DI (plus:DI (reg:DI 76 ca)
(reg:DI 165 [ val+8 ]))
(reg:DI 169 [+8 ]))

which no longer matches rs6000.md adddi3_carry_in_internal.  See
https://gcc.gnu.org/ml/gcc/2015-05/msg00206.html for related
discussion.  Bootstrapped and regression tested powerpc64le-linux,
powerpc64-linux and x86_64-linux.  OK to apply mainline?

* rtlanal.c (commutative_operand_precedence): Correct comments.
* simplify-rtx.c (simplify_plus_minus_op_data_cmp): Delete forward
declaration.  Return an int.  Distinguish REG,REG return from
others.
(struct simplify_plus_minus_op_data): Make local to function.
(simplify_plus_minus): Rename canonicalized to not_canonical.
Don't set not_canonical if merely sorting registers.  Avoid
packing ops if nothing changes.  White space fixes.

Some notes: Renaming canonicalized to not_canonical better reflects
its usage.  At the time the var is set, the expression hasn't been
canonicalized.
diff -w shown to exclude the whitespace fixes.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 223463)
+++ gcc/rtlanal.c   (working copy)
@@ -3140,10 +3140,9 @@ regno_use_in (unsigned int regno, rtx x)
 }
 
 /* Return a value indicating whether OP, an operand of a commutative
-   operation, is preferred as the first or second operand.  The higher
-   the value, the stronger the preference for being the first operand.
-   We use negative values to indicate a preference for the first operand
-   and positive values for the second operand.  */
+   operation, is preferred as the first or second operand.  The more
+   positive the value, the stronger the preference for being the first
+   operand.  */
 
 int
 commutative_operand_precedence (rtx op)
@@ -3150,7 +3149,7 @@ commutative_operand_precedence (rtx op)
 {
   enum rtx_code code = GET_CODE (op);
 
-  /* Constants always come the second operand.  Prefer nice constants.  */
+  /* Constants always become the second operand.  Prefer nice constants.  */
   if (code == CONST_INT)
 return -8;
   if (code == CONST_WIDE_INT)
Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 223463)
+++ gcc/simplify-rtx.c  (working copy)
@@ -71,7 +71,6 @@ along with GCC; see the file COPYING3.  If not see
 
 static rtx neg_const_int (machine_mode, const_rtx);
 static bool plus_minus_operand_p (const_rtx);
-static bool simplify_plus_minus_op_data_cmp (rtx, rtx);
 static rtx simplify_plus_minus (enum rtx_code, machine_mode, rtx, rtx);
 static rtx simplify_immed_subreg (machine_mode, rtx, machine_mode,
  unsigned int);
@@ -4064,20 +4063,10 @@ simplify_const_binary_operation (enum rtx_code cod
 
 
 
-/* Simplify a PLUS or MINUS, at least one of whose operands may be another
-   PLUS or MINUS.
+/* Return a positive integer if X should sort after Y.  The value
+   returned is 1 if and only if X and Y are both regs.  */
 
-   Rather than test for specific case, we do this by a brute-force method
-   and do all possible simplifications until no more changes occur.  Then
-   we rebuild the operation.  */
-
-struct simplify_plus_minus_op_data
-{
-  rtx op;
-  short neg;
-};
-
-static bool
+static int
 simplify_plus_minus_op_data_cmp (rtx x, rtx y)
 {
   int result;
@@ -4085,23 +4074,36 @@ simplify_plus_minus_op_data_cmp (rtx x, rtx y)
   result = (commutative_operand_precedence (y)
- commutative_operand_precedence (x));
   if (result)
-return result > 0;
+return result + result;
 
   /* Group together equal REGs to do more simplification.  */
   if (REG_P (x) && REG_P (y))
 return REGNO (x) > REGNO (y);
-  else
-return false;
+
+  return 0;
 }
 
+/* Simplify and canonicalize a PLUS or MINUS, at least one of whose
+   operands may be another PLUS or MINUS.
+
+   Rather than test for specific case, we do this by a brute-force method
+   and do all possible simplifications until no more changes occur.  Then
+   we rebuild the operation.
+
+   May return NULL_RTX when no changes were made.  */
+
 static rtx
 simplify_plus_minus (enum rtx_code code, machine_mode mode, rtx op0,
 rtx op1)
 {
-  struct simplify_plus_minus_op_data ops[16];
+  struct simplify_plus_minus_op_data
+  {
+rtx op;
+short neg;
+  } ops[16];
   rtx result, tem;
   int n_ops = 2;
-  int changed, n_constants, canonicalized = 0;
+  int changed, n_constants, not_canonical = 0;
   int i, j;
 
   memset (ops, 0, sizeof ops);
@@ -4139,7 +4141,18 @@ simplify_plus_minus (enum rtx_code code, machine_m
 
  ops[i].op = XEXP (this_op, 0);
  changed = 1;
- 

[PATCH][RFA] PR rtl-optimization/66237

2015-05-22 Thread Mikhail Maltsev
This patch fixes a bug introduced by refactoring. A cast from rtx_insn
to rtx_jump_insn in fix_crossing_conditional_branches was placed before
the check, and that caused ICE if the instruction is actually a call,
rather than a jump.

Bootstrapped/regtested on x86_64 linux and tested the regressed case
(PR34999) on aarch64 crosscompiler using the provided profiling data.

OK for trunk?

-- 
Regards,
Mikhail Maltsev
gcc/ChangeLog:

2015-05-22  Mikhail Maltsev  malts...@gmail.com

PR rtl-optimization/66237
* bb-reorder.c (fix_crossing_conditional_branches): Fix wrong
location of an as_a cast.

diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 20e15fc..cbf6343 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -2044,7 +2044,7 @@ fix_crossing_conditional_branches (void)
 
   if (crossing_edge)
{
- rtx_jump_insn *old_jump = as_a <rtx_jump_insn *> (BB_END (cur_bb));
+ rtx_insn *old_jump = BB_END (cur_bb);
 
  /* Check to make sure the jump instruction is a
 conditional jump.  */
@@ -2067,6 +2067,9 @@ fix_crossing_conditional_branches (void)
 
  if (set_src  (GET_CODE (set_src) == IF_THEN_ELSE))
{
+ rtx_jump_insn *old_jump_insn =
+   as_a <rtx_jump_insn *> (old_jump);
+
  if (GET_CODE (XEXP (set_src, 1)) == PC)
old_label = XEXP (set_src, 2);
  else if (GET_CODE (XEXP (set_src, 2)) == PC)
@@ -2095,7 +2098,7 @@ fix_crossing_conditional_branches (void)
  emit_label (new_label);
 
  gcc_assert (GET_CODE (old_label) == LABEL_REF);
- old_jump_target = old_jump->jump_target ();
+ old_jump_target = old_jump_insn->jump_target ();
  new_jump = as_a <rtx_jump_insn *>
(emit_jump_insn (gen_jump (old_jump_target)));
  new_jump->set_jump_target (old_jump_target);
@@ -2114,7 +2117,7 @@ fix_crossing_conditional_branches (void)
 
  /* Make old jump branch to new bb.  */
 
- redirect_jump (old_jump, new_label, 0);
+ redirect_jump (old_jump_insn, new_label, 0);
 
  /* Remove crossing_edge as predecessor of 'dest'.  */
 


Re: [PATCH 2/3][AArch64][PR target/65697] Strengthen barriers for sync-compare-swap builtins.

2015-05-22 Thread Matthew Wahab

[Added PR number and updated patches]

This patch changes the code generated for __sync_<type>_compare_and_swap to

  ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg

This removes the acquire-barrier from the load and ends the operation with a
fence to prevent memory references appearing after the __sync operation from
being moved ahead of the store-release.

This also strengthens the acquire barrier generated for __sync_lock_test_and_set
(which, like compare-and-swap, is implemented as a form of atomic exchange):

  ldaxr; stxr; cbnz
becomes
  ldxr; stxr; cbnz; dmb ish


Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

2015-05-22  Matthew Wahab  matthew.wa...@arm.com

PR target/65697
* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
for __sync memory models, emit appropriate initial and final
barriers.

From 1e5cda95944e7176b8934296b1bb1ec4c9fb1362 Mon Sep 17 00:00:00 2001
From: Matthew Wahab matthew.wa...@arm.com
Date: Fri, 15 May 2015 09:31:06 +0100
Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
 builtins.

Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
---
 gcc/config/aarch64/aarch64.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 182dbad..5b9feee 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9433,14 +9433,19 @@ aarch64_split_compare_and_swap (rtx operands[])
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
+  enum memmodel model;
+  rtx model_rtx;
+  rtx load_model_rtx;
 
   rval = operands[0];
   mem = operands[1];
   oldval = operands[2];
   newval = operands[3];
   is_weak = (operands[4] != const0_rtx);
+  model_rtx = operands[5];
   scratch = operands[7];
   mode = GET_MODE (mem);
+  model = memmodel_from_int (INTVAL (model_rtx));
 
   label1 = NULL;
   if (!is_weak)
@@ -9450,7 +9455,13 @@ aarch64_split_compare_and_swap (rtx operands[])
 }
   label2 = gen_label_rtx ();
 
-  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
+  /* A __sync operation will end with a fence so the load can be relaxed.  */
+  if (is_mm_sync (model))
+load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+  else
+load_model_rtx = model_rtx;
+
+  aarch64_emit_load_exclusive (mode, rval, mem, load_model_rtx);
 
   cond = aarch64_gen_compare_reg (NE, rval, oldval);
   x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
@@ -9458,7 +9469,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 			gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
+  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
   if (!is_weak)
 {
@@ -9475,6 +9486,10 @@ aarch64_split_compare_and_swap (rtx operands[])
 }
 
   emit_label (label2);
+
+  /* A __sync operation may need a final fence.  */
+  if (is_mm_sync (model))
+aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
-- 
1.9.1



Re: [patch 10/10] debug-early merge: compiler proper

2015-05-22 Thread Eric Botcazou
 FWIW, Ada is filled with these temporaries and/or types that should
 really be ignored, and are currently causing grief.

It's a little hard to believe that types created in a front-end should be 
marked ignored.  Either they are used by some objects and thus can be needed 
in the debug info, or they aren't and will be discarded by -feliminate-unused-
debug-types.  And those not present in the source should be marked artificial.

-- 
Eric Botcazou


[Ada] Correct some anomalies in the handling of Atomic

2015-05-22 Thread Arnaud Charlet
This update corrects two problems in the handling of Atomic.
First we do not need Atomic_Synchronization for an object
renaming declaration. Second, when we do have a renaming of
an atomic object, the renaming object should be marked as atomic.

Compiling this test:

 1. package Renamed_Atomic is
 2.
 3.   I : Integer;
 4.   pragma Atomic (I);
 5.
 6.   function Get_I return Integer;
 7.   procedure Set_I (Val : Integer);
 8.
 9.   J : Integer renames I;
10.
11.   function Get_J return Integer;
12.   procedure Set_J (Val : Integer);
13.
14. end Renamed_Atomic;

We do not want Is_Atomic set on the identifier reference
in line 9, but we do want Is_Atomic set on the entity for J.

So if we use -gnatdt to generate a tree file, it should have
two Is_Atomic references, one for the entity I and one for the
entity J.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-21  Robert Dewar  de...@adacore.com

* exp_util.adb (Activate_Atomic_Synchronization): Do not set
Atomic_Sync_Required for an object renaming declaration.
* sem_ch8.adb (Analyze_Object_Renaming): Copy Is_Atomic and
Is_Independent to renaming object.

Index: exp_util.adb
===
--- exp_util.adb(revision 223476)
+++ exp_util.adb(working copy)
@@ -204,6 +204,13 @@
  when others => null;
   end case;
 
+  --  Nothing to do for the identifier in an object renaming declaration,
+  --  the renaming itself does not need atomic synchronization.
+
+  if Nkind (Parent (N)) = N_Object_Renaming_Declaration then
+ return;
+  end if;
+
   --  Go ahead and set the flag
 
   Set_Atomic_Sync_Required (N);
Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 223476)
+++ sem_ch8.adb (working copy)
@@ -1344,6 +1344,13 @@
 
   Set_Is_Volatile (Id, Is_Volatile_Object (Nam));
 
+  --  Also copy settings of Is_Atomic and Is_Independent
+
+  if Is_Entity_Name (Nam) then
+ Set_Is_Atomic  (Id, Is_Atomic  (Entity (Nam)));
+ Set_Is_Independent (Id, Is_Independent (Entity (Nam)));
+  end if;
+
   --  Treat as volatile if we just set the Volatile flag
 
   if Is_Volatile (Id)


Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-22 Thread Pedro Alves
On 05/21/2015 11:02 PM, Sriraman Tallam wrote:
 On Thu, May 21, 2015 at 2:51 PM, Pedro Alves pal...@redhat.com wrote:
 On 05/21/2015 10:12 PM, Sriraman Tallam wrote:

 My original proposal, for x86_64 only, was to add
 -fno-plt=function-name. This lets the user decide for which
 functions PLT must be avoided.  Let the compiler always generate an
 indirect call using call *func@GOTPCREL(%rip).  We could do this for
 non-PIC code too.  No need for linker fixups since this relies on the
 user to know that func is from a shared object.

 Having to pass function names on the command line seems like an odd
 interface.  E.g, you'll need to pass the mangled name for
 C++ functions.  Any reason this isn't a function attribute?
 
 It is not clear to me where I would stick the attribute.  Example
 usage in foo.cc:
 
#include <string.h>
 
 int main() {
   int n = memcmp();
 }
 
 I want memcmp to not go through PLT, do you propose explicitly
 re-declaring it in foo.cc with the attribute?

I guess you'd do:

#include <string.h>

__attribute__((no_plt)) typeof (memcmp) memcmp;

int main() {
  int n = memcmp();
}

or even:

#include <string.h>

int main() {
  if (hotpath) {
__attribute__((no_plt)) typeof (memcmp) memcmp;
for (..) {
  int n = memcmp();
}
  } else {
  int n = memcmp();
  }
}

or globally:

$ cat no-plt/string.h:
#include_next <string.h>
__attribute__((no_plt)) typeof (memcmp) memcmp;

$ gcc -I no-plt/ ...

Thanks,
Pedro Alves



Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite failures

2015-05-22 Thread Kyrill Tkachov

Hi Venkat,

On 22/05/15 09:50, Kumar, Venkataramanan wrote:

Hi Kyrill,

Sorry for little delay in responding.


-Original Message-
From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
Sent: Tuesday, May 19, 2015 9:13 PM
To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
Cc: Ramana Radhakrishnan; seg...@kernel.crashing.org; Marcus Shawcroft
Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite
failures

Hi Venkat,

On 19/05/15 16:37, Kumar, Venkataramanan wrote:

Hi Maintainers,

Please find the attached patch, that fixes add/extend gcc test suite failures

in Aarch64 target.

Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049

These tests started to fail after we prevented combiner from converting

shift RTX to mult RTX, when the RTX is not inside a memory operation
(r222874) .

Now I have added new add/extend patterns which are based on shift

operations,  to fix these cases.

Testing status with the patch.

(1) GCC bootstrap on AArch64 successful.
(2)  SPEC2006 INT runs did not show any degradation.

Does that mean there was no performance regression? Or no codegen
difference?

Yes there was no performance regression.



What I'd expect from this patch is that the codegen would be the same as
before the combine patch
(r222874). A performance difference can sometimes be hard to measure
even at worse code quality.
Can you please confirm that on SPEC2006 INT the adds and shifts are now
back to being combined
into a single instruction?

I used -dp --save-temps to dump pattern names in the assembly files.
I used the revision before the combiner patch (r222873) and the patch on top
of trunk (a little older, r223194) for comparison.

I quickly compared the counts generated for the patterns below in the SPEC INT
binaries.

*adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode>
*subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
*add_uxt<mode>_multp2    vs  *add_uxt<mode>_shift2
*add_uxtsi_multp2_uxtw   vs  *add_uxtsi_shift2_uxtw
*sub_uxt<mode>_multp2    vs  *sub_uxt<mode>_shift2
*sub_uxtsi_multp2_uxtw   vs  *sub_uxtsi_shift2_uxtw
*adds_mul_imm_<mode>     vs  *adds_shift_imm_<mode>
*subs_mul_imm_<mode>     vs  *subs_shift_imm_<mode>
The patterns are found in a few benchmarks.  The generated counts match in the
binaries compiled with the two revisions.

I also looked at the generated assembly, and the adds and shifts are combined
properly using the new patterns.  Please let me know if this is OK.


Thanks for investigating this.  I've run your patch on aarch64.exp and I see
the add+shift/extend failures that we were seeing go away (I'm sure you saw
that as well ;).
Up to the maintainers to review the patch.

Thanks,
Kyrill




Thanks,
Kyrill


(3) gcc regression testing passed.

(-Snip-)
# Comparing 3 common sum files
## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998

/tmp/gxx-sum2.24998

Tests that now work, but didn't before:

gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-

9]+, lsl 3

gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+,

lsl 3

gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+,

x[0-9]+, sxtw 2

gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-

9]+, lsl 3

gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+,

lsl 3

gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+,

x[0-9]+, sxtw 2

# No differences found in 3 common sum files
(-Snip-)

The patterns are fixing the regressing tests, so I have not added any new

tests.

Regarding  removal of the old patterns based on mults,  I am planning to

do it as a separate work.

Is this OK for trunk ?

gcc/ChangeLog

2015-05-19  Venkataramanan Kumar  venkataramanan.ku...@amd.com

  * config/aarch64/aarch64.md
  (*adds_shift_imm_<mode>): New pattern.
  (*subs_shift_imm_<mode>): Likewise.
  (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*add_uxt<mode>_shift2): Likewise.
  (*add_uxtsi_shift2_uxtw): Likewise.
  (*sub_uxt<mode>_shift2): Likewise.
  (*sub_uxtsi_shift2_uxtw): Likewise.


Regards,
Venkat.





Regards,
Venkat,




Re: [5/9] Create sensible dummy registers

2015-05-22 Thread Jeff Law

On 05/22/2015 09:39 AM, Richard Sandiford wrote:

Eric Botcazou ebotca...@adacore.com writes:

Some pieces of code create a temporary REG or MEM and only fill it
in later when they're testing the cost of a particular rtx.  This patch
makes sure that even the dummy REG or MEM is valid, rather than force
the gen_* code to handle garbage values.


gcc/
* caller-save.c (init_caller_save): Use word_mode and
FIRST_PSEUDO_REGISTER when creating temporary rtxes.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* postreload.c (reload_cse_regs_1): Likewise.


Isn't LAST_VIRTUAL_REGISTER + 1 the canonical regno to be used in this case?


Ah, yeah.  Here's patch to fix all instances of that (I hope).

Bootstrapped  regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* caller-save.c (init_caller_save): Base temporary register numbers
on LAST_VIRTUAL_REGISTER + 1 rather than FIRST_PSEUDO_REGISTER.
* cfgloopanal.c (init_set_costs): Likewise.
* dojump.c (prefer_and_bit_test): Likewise.
* expr.c (init_expr_target): Likewise.
* ira.c (setup_prohibited_mode_move_regs): Likewise.
* lower-subreg.c (init_lower_subreg): Likewise.
* postreload.c (reload_cse_regs_1): Likewise.

OK.
jeff



Re: [PATCH][RFA] PR rtl-optimization/66237

2015-05-22 Thread Jeff Law

On 05/22/2015 02:27 AM, Mikhail Maltsev wrote:

This patch fixes a bug introduced by refactoring. A cast from rtx_insn
to rtx_jump_insn in fix_crossing_conditional_branches was placed before
the check, and that caused ICE if the instruction is actually a call,
rather than a jump.

Bootstrapped/regtested on x86_64 linux and tested the regressed case
(PR34999) on aarch64 crosscompiler using the provided profiling data.

OK for trunk?

-- Regards, Mikhail Maltsev


pr66237.clog


gcc/ChangeLog:

2015-05-22  Mikhail Maltsevmalts...@gmail.com

PR rtl-optimization/66237
* bb-reorder.c (fix_crossing_conditional_branches): Fix wrong
location of an as_a cast.

OK.
jeff


Final patch to cleanup hppa port's handling of shadd/scaled indexed addressing modes

2015-05-22 Thread Jeff Law


This is the final patch to the PA backend to clean up its handling of
shadd insns and scaled indexed addressing modes.


First, it removes the old non-canonical shadd insns.

Second, it removes some non-canonical peephole patterns.  No idea what I
was thinking when I wrote them.  Given that they're non-canonical and I don't
know how to trigger them, it's best to just make them disappear.


Finally, the remaining peepholes which rewrite indexed stores of FP 
registers in integer modes are updated to generate canonical RTL for the 
shift-add insns they generate.


Like the other recent changes, these don't affect the code generation 
for the 300+ files I'm testing with.   I've also verified that hppa.exp 
still runs successfully.


Installed on the trunk.

Jeff
* pa.md (non-canonical shift-add insns): Remove.
(peepholes with non-canonical RTL sources): Remove.
(peepholes for indexed stores of FP regs in integer modes): Match and
generate canonical RTL.


diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 6cc7a3c..2686f38 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -2270,8 +2270,8 @@
 ; computes the address to be deleted if the register it sets is dead.
 (define_peephole2
   [(set (match_operand:SI 0 register_operand )
-   (plus:SI (mult:SI (match_operand:SI 1 register_operand )
- (const_int 4))
+   (plus:SI (ashift:SI (match_operand:SI 1 register_operand )
+   (const_int 2))
 (match_operand:SI 2 register_operand )))
(set (mem:SI (match_dup 0))
 (match_operand:SI 3 register_operand ))]
@@ -2281,31 +2281,14 @@
 FP_REGNO_P (REGNO (operands[3]))
   [(set (mem:SI (plus:SI (mult:SI (match_dup 1) (const_int 4)) (match_dup 2)))
(match_dup 3))
-   (set (match_dup 0) (plus:SI (mult:SI (match_dup 1) (const_int 4))
-  (match_dup 2)))]
-  )
-
-(define_peephole2
-  [(set (match_operand:SI 0 register_operand )
-   (plus:SI (match_operand:SI 2 register_operand )
-(mult:SI (match_operand:SI 1 register_operand )
- (const_int 4
-   (set (mem:SI (match_dup 0))
-(match_operand:SI 3 register_operand ))]
-  !TARGET_SOFT_FLOAT
-!TARGET_DISABLE_INDEXING
-REG_OK_FOR_BASE_P (operands[2])
-FP_REGNO_P (REGNO (operands[3]))
-  [(set (mem:SI (plus:SI (mult:SI (match_dup 1) (const_int 4)) (match_dup 2)))
-   (match_dup 3))
-   (set (match_dup 0) (plus:SI (mult:SI (match_dup 1) (const_int 4))
+   (set (match_dup 0) (plus:SI (ashift:SI (match_dup 1) (const_int 2))
   (match_dup 2)))]
   )
 
 (define_peephole2
   [(set (match_operand:DI 0 register_operand )
-   (plus:DI (mult:DI (match_operand:DI 1 register_operand )
- (const_int 4))
+   (plus:DI (ashift:DI (match_operand:DI 1 register_operand )
+   (const_int 2))
 (match_operand:DI 2 register_operand )))
(set (mem:SI (match_dup 0))
 (match_operand:SI 3 register_operand ))]
@@ -2316,25 +2299,7 @@
 FP_REGNO_P (REGNO (operands[3]))
   [(set (mem:SI (plus:DI (mult:DI (match_dup 1) (const_int 4)) (match_dup 2)))
(match_dup 3))
-   (set (match_dup 0) (plus:DI (mult:DI (match_dup 1) (const_int 4))
-  (match_dup 2)))]
-  )
-
-(define_peephole2
-  [(set (match_operand:DI 0 register_operand )
-   (plus:DI (match_operand:DI 2 register_operand )
-(mult:DI (match_operand:DI 1 register_operand )
- (const_int 4
-   (set (mem:SI (match_dup 0))
-(match_operand:SI 3 register_operand ))]
-  !TARGET_SOFT_FLOAT
-!TARGET_DISABLE_INDEXING
-TARGET_64BIT
-REG_OK_FOR_BASE_P (operands[2])
-FP_REGNO_P (REGNO (operands[3]))
-  [(set (mem:SI (plus:DI (mult:DI (match_dup 1) (const_int 4)) (match_dup 2)))
-   (match_dup 3))
-   (set (match_dup 0) (plus:DI (mult:DI (match_dup 1) (const_int 4))
+   (set (match_dup 0) (plus:DI (ashift:DI (match_dup 1) (const_int 2))
   (match_dup 2)))]
   )
 
@@ -3896,8 +3861,8 @@
 
 (define_peephole2
   [(set (match_operand:SI 0 register_operand )
-   (plus:SI (mult:SI (match_operand:SI 1 register_operand )
- (const_int 8))
+   (plus:SI (ashift:SI (match_operand:SI 1 register_operand )
+   (const_int 3))
 (match_operand:SI 2 register_operand )))
(set (mem:DF (match_dup 0))
 (match_operand:DF 3 register_operand ))]
@@ -3907,15 +3872,15 @@
 FP_REGNO_P (REGNO (operands[3]))
   [(set (mem:DF (plus:SI (mult:SI (match_dup 1) (const_int 8)) (match_dup 2)))
(match_dup 3))
-   (set (match_dup 0) (plus:SI (mult:SI (match_dup 1) (const_int 8))
+   (set (match_dup 0) (plus:SI (ashift:SI (match_dup 1) (const_int 3))
   (match_dup 2)))]
   )
 
 (define_peephole2
   [(set 

Re: [PATCH] Fix PR66168: ICE due to incorrect invariant register info

2015-05-22 Thread Jeff Law

On 05/20/2015 08:04 PM, Thomas Preud'homme wrote:

From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme

From: Steven Bosscher [mailto:stevenb@gmail.com]
Sent: Tuesday, May 19, 2015 7:21 PM

Not OK.
This will break in move_invariants() when it looks at REGNO (inv->reg).


Indeed. I'm even surprised all tests passed. Ok I will just prevent moving
in such a case. I'm running the tests now and will get back to you
tomorrow.


Patch is now tested via bootstrap + testsuite run on x86_64-linux-gnu and
building arm-none-eabi cross-compiler + testsuite run. Both testsuite run
show no regression.

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 76a009f..4ce3576 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1626,7 +1626,7 @@ move_invariant_reg (struct loop *loop, unsigned invno)
if (REG_P (reg))
regno = REGNO (reg);

-  if (!can_move_invariant_reg (loop, inv, reg))
+  if (!can_move_invariant_reg (loop, inv, dest))

Won't this run into the same problem if DEST is a SUBREG?

Jeff





Re: [patch] testsuite enable PIE tests on FreeBSD

2015-05-22 Thread Jeff Law

On 05/21/2015 02:01 PM, Andreas Tobler wrote:

On 21.05.15 20:14, Andreas Tobler wrote:

On 20.05.15 22:30, Jeff Law wrote:

On 05/20/2015 11:04 AM, Andreas Tobler wrote:

Hi,

the attached patch enables some PIE tests on FreeBSD.

Ok for trunk?

Thanks,
Andreas

2015-05-20  Andreas Tobler  andre...@gcc.gnu.org

   * gcc.target/i386/pr32219-1.c: Enable test on FreeBSD.
   * gcc.target/i386/pr32219-2.c: Likewise.
   * gcc.target/i386/pr32219-3.c: Likewise.
   * gcc.target/i386/pr32219-4.c: Likewise.
   * gcc.target/i386/pr32219-5.c: Likewise.
   * gcc.target/i386/pr32219-6.c: Likewise.
   * gcc.target/i386/pr32219-7.c: Likewise.
   * gcc.target/i386/pr32219-8.c: Likewise.
   * gcc.target/i386/pr39013-1.c: Likewise.
   * gcc.target/i386/pr39013-2.c: Likewise.
   * gcc.target/i386/pr64317.c: Likewise.

Wouldn't it be better to remove the target selector and instead add:

/* { dg-require-effective-target pie } */

In each of those tests?

While the net effect is the same today, it means there's only one place
to change if another x86 target gains PIE support in the future.

Pre-approved using that style.


Thanks!

Tested on amd64-freebsd and CentOS.

Andreas


This is what I committed:

2015-05-21  Andreas Tobler  andre...@gcc.gnu.org

* gcc.target/i386/pr32219-1.c: Use 'dg-require-effective-target pie'
instead of listing several targets on its own.



* gcc.target/i386/pr64317.c: Likewise.


Yes, I know. The comment and the content for this test case do not match.

Is this ok:

Index: gcc.target/i386/pr64317.c
===
--- gcc.target/i386/pr64317.c(revision 223498)
+++ gcc.target/i386/pr64317.c(working copy)
@@ -1,4 +1,5 @@
-/* { dg-do compile { target { { *-*-freebsd* *-*-linux* } && ia32 } } } */
+/* { dg-do compile { target ia32 } } */
+/* { dg-require-effective-target pie } */
  /* { dg-options -O2 -fpie } */
  /* { dg-final { scan-assembler addl\[
\\t\]+\[$\]_GLOBAL_OFFSET_TABLE_, %ebx } } */
  /* { dg-final { scan-assembler movl\[ \\t\]+c@GOTOFF\[(\]%ebx\[)\] }
} */


or do you prefer instead of

/* { dg-do compile { target ia32 } } */

this one:

/* { dg-do compile } */
/* { dg-require-effective-target ia32 } */


Btw, all three variants run on i386.
It looks like the dg-require-effective-target is typically on a line by 
itself.  So let's stick with the last version.


jeff



Re: [gofrontend-dev] Re: GO tools for gccgo cross

2015-05-22 Thread Ian Lance Taylor
On Fri, May 22, 2015 at 2:01 PM, Andrew Chambers
andrewchambe...@gmail.com wrote:

 For example, I've tested on an x86, built cross compilers for ppc64 and
 ppc64le and then can invoke the native go tool (built to run on x86) to
 compile with either target compiler by changing my GOARCH value.  I don't
 want to have a separate cross go tool to use only for ppc64 and another to
 use only for ppc64le.


 If I have mips-linux-gcc , mips-linux-gccgo, there is no reason  to not have
 a mips-linux-go which automatically uses the correct gccgo and is installed
 into the same prefix.
 It is in line with how gcc works currently, Also, for a canadian cross
 compile (--build differs to --host), building a new go tool is necessary
 anyway.

With the gc toolchain, it's very convenient to be able to say
GOARCH=arm go build program to get the ARM version of a program.  I
don't think we should move away from that in order to follow GCC
conventions.  The Go conventions are better and easier to use.

Ian


Re: [PATCH/libiberty] fix build of gdb/binutils with clang.

2015-05-22 Thread Ian Lance Taylor
On Wed, May 20, 2015 at 3:58 PM, Yunlian Jiang yunl...@google.com wrote:
 GCC bootstraps with this patch.

Committed as follows.

Ian


include/:

2015-05-22  Yunlian Jiang  yunl...@google.com

* libiberty.h (asprintf): Don't declare if HAVE_DECL_ASPRINTF is
not defined.

libiberty/:

2015-05-22  Yunlian Jiang  yunl...@google.com

* configure.ac: Add AC_GNU_SOURCE.
* Makefile.in (COMPILE.c): Add -D_GNU_SOURCE.
* configure, config.in: Rebuild.
* floatformat.c (_GNU_SOURCE): Don't define if already defined.
Index: include/libiberty.h
===
--- include/libiberty.h (revision 223578)
+++ include/libiberty.h (working copy)
@@ -621,7 +621,7 @@ extern int pexecute (const char *, char
 
 extern int pwait (int, int *, int);
 
-#if !HAVE_DECL_ASPRINTF
+#if defined(HAVE_DECL_ASPRINTF) && !HAVE_DECL_ASPRINTF
 /* Like sprintf but provides a pointer to malloc'd storage, which must
be freed by the caller.  */
 
Index: libiberty/Makefile.in
===
--- libiberty/Makefile.in   (revision 223578)
+++ libiberty/Makefile.in   (working copy)
@@ -113,7 +113,8 @@ installcheck: installcheck-subdir
 
 INCDIR=$(srcdir)/$(MULTISRCTOP)../include
 
-COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) $(HDEFINES) 
@ac_libiberty_warn_cflags@
+COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) \
+   $(HDEFINES) @ac_libiberty_warn_cflags@ -D_GNU_SOURCE
 
 # Just to make sure we don't use a built-in rule with VPATH
 .c.$(objext):
Index: libiberty/configure.ac
===
--- libiberty/configure.ac  (revision 223578)
+++ libiberty/configure.ac  (working copy)
@@ -155,6 +155,7 @@ AC_MSG_NOTICE([target_header_dir = $targ
 
 GCC_NO_EXECUTABLES
 AC_PROG_CC
+AC_GNU_SOURCE
 AC_SYS_LARGEFILE
 AC_PROG_CPP_WERROR
 
Index: libiberty/floatformat.c
===
--- libiberty/floatformat.c (revision 223578)
+++ libiberty/floatformat.c (working copy)
@@ -19,7 +19,9 @@ along with this program; if not, write t
 Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, 
USA.  */
 
 /* This is needed to pick up the NAN macro on some systems.  */
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 
 #ifdef HAVE_CONFIG_H
 #include "config.h"


Re: match.pd: (x | y) & ~x -> y & ~x

2015-05-22 Thread Marc Glisse

On Mon, 18 May 2015, Richard Biener wrote:


On Fri, May 15, 2015 at 7:22 PM, Marc Glisse marc.gli...@inria.fr wrote:


we already have the more complicated: x & ~(x & y) -> x & ~y (which I am
reindenting by the way) and the simpler: (~x | y) & x -> x & y, so I am
proposing this one for completeness. Regtested on ppc64le-redhat-linux.


Ok (doesn't seem to be in fold-const.c).

Btw, there are quite some (simple) ones only in fold-const.c which are 
eligible for moving to match.pd (thus remove them from fold-const.c and 
implement in match.pd).  Mostly canonicalization ones though.


Haven't you already done a lot of those in the branch though?

--
Marc Glisse


Re: Reuse predicate code analysis for constraints

2015-05-22 Thread Jeff Law

On 05/22/2015 09:42 AM, Richard Sandiford wrote:

This patch adjusts the fix for PR target/65689 along the lines suggested
in https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01559.html.  The idea
is to reuse the existing gensupport.c routine to work out the codes
accepted by constraints.

I'd originally done this with an eye to using compute_test_codes for
the problem that Andreas found on s390.  I don't think it's going to
be useful for that after all, but it seems worth having for its on sake.

Bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* gensupport.h (compute_test_codes): Declare.
* gensupport.c (compute_predicate_codes): Rename to...
(compute_test_codes): ...this.  Generalize error message.
(process_define_predicate): Update accordingly.
* genpreds.c (compute_maybe_allows): Delete.
(add_constraint): Use compute_test_codes to determine whether
something can accept a SUBREG, REG or MEM.

OK.
jeff



Fix pa.md splitters for recent combine changes

2015-05-22 Thread Jeff Law



These should have gone in with the first patch in the series. 
Thankfully the splitters aren't terribly important anymore and thus 
having them goof'd up for a couple days hasn't been a problem.


In fact, much like the hppa_legitimize_address code to handle 
shift-add/scaled addressing modes, these splitters don't make a bit of 
difference in my 300+ testfiles -- again, these were probably in place 
for f2c translated code to avoid losing because the PA has integer 
indexed loads, but not integer indexed stores.


Verified hppa.exp still passes and that the 300+ testfiles get the same 
code before/after this change.


Installed on the trunk.

Jeff

* config/pa/pa.md (integer_indexed_store splitters): Use
mem_shadd_operand.  Use ASHIFT rather than MULT in the resulting
insns -- adjusting the constant 2nd operand accordingly.

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index aaec27d..6cc7a3c 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -2819,42 +2819,54 @@
 ;; a 2 insn store with some creative RTL rewriting.
 (define_split
   [(set (mem:SI (plus:SI (mult:SI (match_operand:SI 0 register_operand )
-  (match_operand:SI 1 shadd_operand ))
+  (match_operand:SI 1 mem_shadd_operand ))
   (plus:SI (match_operand:SI 2 register_operand )
(match_operand:SI 3 const_int_operand 
(match_operand:SI 4 register_operand ))
(clobber (match_operand:SI 5 register_operand ))]
   
-  [(set (match_dup 5) (plus:SI (mult:SI (match_dup 0) (match_dup 1))
+  [(set (match_dup 5) (plus:SI (ashift:SI (match_dup 0) (match_dup 1))
   (match_dup 2)))
(set (mem:SI (plus:SI (match_dup 5) (match_dup 3))) (match_dup 4))]
-  )
+  
+{
+  operands[1] = GEN_INT (exact_log2 (INTVAL (operands[1])));
+
+})
 
 (define_split
   [(set (mem:HI (plus:SI (mult:SI (match_operand:SI 0 register_operand )
-  (match_operand:SI 1 shadd_operand ))
+  (match_operand:SI 1 mem_shadd_operand ))
   (plus:SI (match_operand:SI 2 register_operand )
(match_operand:SI 3 const_int_operand 
(match_operand:HI 4 register_operand ))
(clobber (match_operand:SI 5 register_operand ))]
   
-  [(set (match_dup 5) (plus:SI (mult:SI (match_dup 0) (match_dup 1))
+  [(set (match_dup 5) (plus:SI (ashift:SI (match_dup 0) (match_dup 1))
   (match_dup 2)))
(set (mem:HI (plus:SI (match_dup 5) (match_dup 3))) (match_dup 4))]
-  )
+  
+{
+  operands[1] = GEN_INT (exact_log2 (INTVAL (operands[1])));
+
+})
 
 (define_split
   [(set (mem:QI (plus:SI (mult:SI (match_operand:SI 0 register_operand )
-  (match_operand:SI 1 shadd_operand ))
+  (match_operand:SI 1 mem_shadd_operand ))
   (plus:SI (match_operand:SI 2 register_operand )
(match_operand:SI 3 const_int_operand 
(match_operand:QI 4 register_operand ))
(clobber (match_operand:SI 5 register_operand ))]
   
-  [(set (match_dup 5) (plus:SI (mult:SI (match_dup 0) (match_dup 1))
+  [(set (match_dup 5) (plus:SI (ashift:SI (match_dup 0) (match_dup 1))
   (match_dup 2)))
(set (mem:QI (plus:SI (match_dup 5) (match_dup 3))) (match_dup 4))]
-  )
+  
+{
+  operands[1] = GEN_INT (exact_log2 (INTVAL (operands[1])));
+
+})
 
 (define_expand movhi
   [(set (match_operand:HI 0 general_operand )


[PATCH], Add IEEE 128-bit floating point to PowerPC, patch #1

2015-05-22 Thread Michael Meissner
This patch is the first in a series of patches that will eventually add support
for IEEE 128-bit floating point support to the PowerPC GCC compiler.  At the
current time, we do not plan to change the default for long double.  I added a
new type keyword (__float128) to get access to IEEE 128-bit floating point, and
another (__ibm128) to get access to IBM extended double type.

Until all of the GCC and LIBGCC patches have been committed, you will not be
able to use IEEE 128-bit floating point, and -mfloat128-software will not be
turned on by default.

This patch adds the new modes (KFmode and IFmode) and the switches
(-mfloat128-{none,software}).

TFmode in the PowerPC compiler represents either IEEE 128-bit floating point or
the IBM extended double (double-double) format.  For most PowerPC users, the
default is to use IBM extended double for long double.  Because TFmode can be
either floating point format, I added two new modes:

KFmode  -- IEEE 128-bit floating point
IFmode  -- IBM extended double floating point

If the default for TFmode is IBM extended double, the port will eventually use
KFmode for IEEE 128-bit floating point.  Likewise if the default for TFmode is
IEEE 128-bit floating point, the port will use TFmode for IEEE 128-bit floating
point, and IFmode for IBM extended double.

I have bootstrapped these patches on a power7 and compared them to the unpatched
compiler.  There were no changes when running make check.  Are these patches ok
to install in the trunk?

2015-05-22  Michael Meissner  meiss...@linux.vnet.ibm.com

* config/rs6000/rs6000-modes.def (IFmode): Define IFmode to
provide access to the IBM extended double floating point mode if
long double is IEEE 128-bit floating point.
(KFmode): Define KFmode to provide access to IEEE 128-bit floating
point if long double is the IBM extended double type.

* config/rs6000/rs6000.opt (-mfloat128-none): New switches to
enable adding IEEE 128-bit floating point support.
(-mfloat128-software): Likewise.
(-mfloat128-sw): Likewise.

* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Do not allow
128-bit floating point types to occupy any register if
-mlong-double-64.  Do not allow use of IFmode/KFmode unless
-mfloat128-software is enabled.
(rs6000_debug_reg_global): Add IEEE 128-bit floating point debug
support.
(rs6000_option_override_internal): Add -mfloat128-* support.
(rs6000_init_builtins): Setup __ibm128 and __float128 type modes.

* config/rs6000/rs6000.h (rs6000_builtin_type_index): Add ibm128
and float128 type nodes.
(ieee128_float_type_node): Likewise.
(ibm128_float_type_node): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-modes.def
===
--- gcc/config/rs6000/rs6000-modes.def  (revision 223458)
+++ gcc/config/rs6000/rs6000-modes.def  (working copy)
@@ -18,6 +18,13 @@
along with GCC; see the file COPYING3.  If not see
http://www.gnu.org/licenses/.  */
 
+/* IBM 128-bit floating point.  IFmode and KFmode use the fractional float
+   support in order to declare 3 128-bit floating point types.  */
+FRACTIONAL_FLOAT_MODE (IF, 106, 16, ibm_extended_format);
+
+/* Explicit IEEE 128-bit floating point.  */
+FRACTIONAL_FLOAT_MODE (KF, 113, 16, ieee_quad_format);
+
 /* 128-bit floating point.  ABI_V4 uses IEEE quad, AIX/Darwin
adjust this in rs6000_option_override_internal.  */
 FLOAT_MODE (TF, 16, ieee_quad_format);
Index: gcc/config/rs6000/rs6000.opt
===
--- gcc/config/rs6000/rs6000.opt(revision 223458)
+++ gcc/config/rs6000/rs6000.opt(working copy)
@@ -600,3 +600,19 @@ Allow float/double variables in upper re
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
+
+mfloat128-
+Target RejectNegative Joined Enum(float128_type_t) Var(TARGET_FLOAT128) 
Init(FLOAT128_UNSET) Save
+-mfloat128-{software,none} - Specify how IEEE 128-bit floating point is used.
+
+Enum
+Name(float128_type_t) Type(enum float128_type_t)
+
+EnumValue
+Enum(float128_type_t) String(none) Value(FLOAT128_NONE)
+
+EnumValue
+Enum(float128_type_t) String(software) Value(FLOAT128_SW)
+
+EnumValue
+Enum(float128_type_t) String(sw) Value(FLOAT128_SW)
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 223458)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1817,6 +1817,16 @@ rs6000_hard_regno_mode_ok (int regno, ma
  && IN_RANGE (last_regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
  && ((regno & 1) == 

Re: C/C++ PATCH to allow deprecating enum values (PR c/47043)

2015-05-22 Thread Mikhail Maltsev
On 22.05.2015 12:10, Marek Polacek wrote:
 Thanks, applied.  Here's the final version.

By the way, we have a feature test macro, __cpp_attributes=200809 which
can be used to determine, whether C++11 attribute syntax is supported by
the compiler.

I propose to add something similar for this extension (like
__cpp_gnu_enum_attributes=... for C++ and __GCC_HAVE_ENUM_ATTRIBUTES for
C). Thoughts?

-- 
Regards,
Mikhail Maltsev


Re: [RFA] Fix combine to canonicalize (mult X pow2)) more often

2015-05-22 Thread Jeff Law

On 05/22/2015 09:45 AM, Segher Boessenkool wrote:

On Thu, May 21, 2015 at 09:24:37AM -0600, Jeff Law wrote:

When combine needs to split a complex insn, it will canonicalize a
simple (mult X (const_int Y)) where Y is a power of 2 into the expected
(ashift X (const_int Y')) if the (mult ...) is selected as a split point.

However if the split point is (plus (mult (X (const_int Y)) Z) combine
fails to canonicalize.



OK for the trunk?


Okay.  Something went wrong with your changelog though.  Rebasing
changelogs is a recipe for disaster.
Yup. I still haven't found a workflow that makes sense with ChangeLogs. 
Ultimately I end up having to check them before each commit/push, so 
assume it'll end up in the right place :-)




The following is just copy-paste, but anyway...


+ /* Similarly for (plus (mult FOO (const_int pow2))).  */
+ if (split_code == PLUS
+     && GET_CODE (XEXP (*split, 0)) == MULT
+     && CONST_INT_P (XEXP (XEXP (*split, 0), 1))
+     && INTVAL (XEXP (XEXP (*split, 0), 1)) > 0
+     && (i = exact_log2 (UINTVAL (XEXP (XEXP (*split, 0), 1)))) >= 0)


INTVAL (..) > 0 here disallows 1<<63 while exact_log2 allows it
(with a 64-bit HWI).  Most places don't check the > 0 and it doesn't
seem necessary in the places that do.

This won't ever trigger for a SImode 1<<31 either.

None of this matters when we're just looking at scaled indexing, of
course.  But exact_log2 is much harder to use correctly than to use
it incorrectly :-(
That test was just copied from the equivalent hunk of code that tests 
for (mult FOO (const_int pow2)) above. It's certainly ok for the INTVAL 
to reject things that might be allowed through the exact_log2 test.  But 
as you say, this isn't going to matter.


Jeff


RE: [PATCH] Print Pass Names

2015-05-22 Thread Aditya K



 Subject: Re: [PATCH] Print Pass Names
 From: richard.guent...@gmail.com
 Date: Fri, 22 May 2015 21:32:24 +0200
 To: hiradi...@msn.com; gcc-patches@gcc.gnu.org

 On May 22, 2015 6:32:38 PM GMT+02:00, Aditya K hiradi...@msn.com wrote:
Currently, when we print the passes it does not print its name. This
becomes confusing when we want to print all the passes at once (e.g.,
-fdump-tree-all-all=stderr &> pass.dump).
This patch adds functionality to print the pass name. It passes
bootstrap (with default configurations).

Hope this is useful.

 Can't you just use current_pass->name?

You are right. I have updated the patch.
Thanks
-Aditya

gcc/ChangeLog:
 
2015-05-22  Aditya Kumar  hiradi...@msn.com
 
* statistics.c (statistics_fini_pass): Print pass name.
 
 
---
 gcc/statistics.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 
diff --git a/gcc/statistics.c b/gcc/statistics.c
index 8cbe88d..50b41d1 100644
--- a/gcc/statistics.c
+++ b/gcc/statistics.c
@@ -201,7 +201,7 @@ statistics_fini_pass (void)
 && dump_flags & TDF_STATS)
 {
   fprintf (dump_file, "\n");
-  fprintf (dump_file, "Pass statistics:\n");
+  fprintf (dump_file, "Pass statistics of \"%s\": ", current_pass->name);
   fprintf (dump_file, "\n");
   curr_statistics_hash ()
 ->traverse_noresize <void *, statistics_fini_pass_1> (NULL);
-- 
2.1.0.243.g30d45f7




 Richard.

Thanks,
-Aditya


gcc/ChangeLog:

2015-05-22 Aditya Kumar hiradi...@msn.com

 * passes.c (execute_todo): Added a parameter to pass the pass name.
 (execute_one_pass): Likewise.
 * statistics.c (statistics_fini_pass): Likewise.
 * statistics.h: Likewise.

---
 gcc/passes.c | 9 +
 gcc/statistics.c | 4 ++--
 gcc/statistics.h | 2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 04ff042..7d41bd8 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1984,7 +1984,7 @@ execute_function_todo (function *fn, void *data)

 /* Perform all TODO actions. */
 static void
-execute_todo (unsigned int flags)
+execute_todo (unsigned int flags, const char *pass_name=NULL)
 {
 #if defined ENABLE_CHECKING
 if (cfun
@@ -1997,7 +1997,7 @@ execute_todo (unsigned int flags)
 /* Inform the pass whether it is the first time it is run. */
 first_pass_instance = (flags  TODO_mark_first_instance) != 0;

- statistics_fini_pass ();
+ statistics_fini_pass (pass_name);

 if (flags)
 do_per_function (execute_function_todo, (void *)(size_t) flags);
@@ -2302,7 +2302,7 @@ execute_one_pass (opt_pass *pass)
 pass_init_dump_file (pass);

 /* Run pre-pass verification. */
- execute_todo (pass->todo_flags_start);
+ execute_todo (pass->todo_flags_start, pass->name);

 #ifdef ENABLE_CHECKING
 do_per_function (verify_curr_properties,
@@ -2327,7 +2327,8 @@ execute_one_pass (opt_pass *pass)
 check_profile_consistency (pass->static_pass_number, 0, true);

 /* Run post-pass cleanup and verification. */
- execute_todo (todo_after | pass->todo_flags_finish |
TODO_verify_il);
+ execute_todo (todo_after | pass->todo_flags_finish | TODO_verify_il,
+ pass->name);
 if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
 check_profile_consistency (pass->static_pass_number, 1, true);

diff --git a/gcc/statistics.c b/gcc/statistics.c
index 8cbe88d..54f81d5 100644
--- a/gcc/statistics.c
+++ b/gcc/statistics.c
@@ -192,7 +192,7 @@ statistics_fini_pass_3 (statistics_counter_t
**slot,
 /* Dump the current statistics incrementally. */

 void
-statistics_fini_pass (void)
+statistics_fini_pass (const char *pass_name)
 {
 if (current_pass->static_pass_number == -1)
 return;
@@ -201,7 +201,7 @@ statistics_fini_pass (void)
 && dump_flags & TDF_STATS)
 {
 fprintf (dump_file, "\n");
- fprintf (dump_file, "Pass statistics:\n");
+ fprintf (dump_file, "Pass statistics of \"%s\":\n", pass_name);
 fprintf (dump_file, "\n");
 curr_statistics_hash ()
 ->traverse_noresize <void *, statistics_fini_pass_1> (NULL);
diff --git a/gcc/statistics.h b/gcc/statistics.h
index 0b871ec..4348b7a 100644
--- a/gcc/statistics.h
+++ b/gcc/statistics.h
@@ -64,7 +64,7 @@ struct function;
 extern void statistics_early_init (void);
 extern void statistics_init (void);
 extern void statistics_fini (void);
-extern void statistics_fini_pass (void);
+extern void statistics_fini_pass (const char *pass_name = NULL);
extern void statistics_counter_event (struct function *, const char *,
int);
extern void statistics_histogram_event (struct function *, const char
*, int);



  

Re: [PATCH] Print Pass Names

2015-05-22 Thread Jeff Law

On 05/22/2015 02:38 PM, Aditya K wrote:





Subject: Re: [PATCH] Print Pass Names
From: richard.guent...@gmail.com
Date: Fri, 22 May 2015 21:32:24 +0200
To: hiradi...@msn.com; gcc-patches@gcc.gnu.org

On May 22, 2015 6:32:38 PM GMT+02:00, Aditya K hiradi...@msn.com wrote:

Currently, when we print the passes it does not print its name. This
becomes confusing when we want to print all the passes at once (e.g.,
-fdump-tree-all-all=stderr &> pass.dump).
This patch adds functionality to print the pass name. It passes
bootstrap (with default configurations).

Hope this is useful.


Can't you just use current_pass->name?


You are right. I have updated the patch.
Thanks
-Aditya

gcc/ChangeLog:

2015-05-22  Aditya Kumar  hiradi...@msn.com

 * statistics.c (statistics_fini_pass): Print pass name.

OK.

jeff



Re: [Patch libstdc++] Rewrite cpu/generic/atomic_word.h

2015-05-22 Thread David Edelsohn
I bootstrapped this on powerpc-ibm-aix7.1.0.0 and my colleagues
bootstrapped this on powerpc64-linux and powerpc64le-linux.

It works and produces reasonable instruction sequences.

We can iterate on the syntax, but the core concept seems to work correctly.

Thanks, David


Re: [Patch]: libbacktrace - add support of PE/COFF

2015-05-22 Thread Ian Lance Taylor
On Thu, May 21, 2015 at 5:41 AM, Tristan Gingold ging...@adacore.com wrote:

 2015-05-21  Tristan Gingold  ging...@adacore.com

 * pecoff.c: New file.
 * Makefile.am (FORMAT_FILES): Add pecoff.c and dependencies.
 * Makefile.in: Regenerate.
 * filetype.awk: Detect pecoff.
 * configure.ac: Define BACKTRACE_SUPPORTS_DATA on elf platforms.
 Add pecoff.
 * btest.c (test5): Test enabled only if BACKTRACE_SUPPORTS_DATA is
 true.
 * backtrace-supported.h.in (BACKTRACE_SUPPORTS_DATA): Define.
 * configure: Regenerate.
 * pecoff.c: New file.


Thanks for working on this.


 diff --git a/libbacktrace/backtrace-supported.h.in 
 b/libbacktrace/backtrace-supported.h.in
 index 5115ce1..4574635 100644
 --- a/libbacktrace/backtrace-supported.h.in
 +++ b/libbacktrace/backtrace-supported.h.in
 @@ -59,3 +59,8 @@ POSSIBILITY OF SUCH DAMAGE.  */
 as 0.  */

  #define BACKTRACE_SUPPORTS_THREADS @BACKTRACE_SUPPORTS_THREADS@
 +
 +/* BACKTRACE_SUPPORTS_DATA will be #defined'd as 1 if the backtrace library
 +   also handles data symbols, 0 if not.  */
 +
 +#define BACKTRACE_SUPPORTS_DATA @BACKTRACE_SUPPORTS_DATA@

End users are expected to read and understand this file, so I think
this comment is too obscure.  I suggest:

BACKTRACE_SUPPORTS_DATA will be #define'd as 1 if backtrace_syminfo
will work for variables.  It will always work for functions.


I would have thought you could distinguish relevant symbols using the
storage class and type.  But perhaps not.



 diff --git a/libbacktrace/filetype.awk b/libbacktrace/filetype.awk
 index 0a656f7..37099ad 100644
 --- a/libbacktrace/filetype.awk
 +++ b/libbacktrace/filetype.awk
 @@ -1,3 +1,4 @@
  # An awk script to determine the type of a file.
  /\177ELF\001/ { if (NR == 1) { print elf32; exit } }
  /\177ELF\002/ { if (NR == 1) { print elf64; exit } }
 +/\114\001/{ if (NR == 1) { print pecoff; exit } }

That works for 32-bit, but I think not for 64-bit.  For 64-bit I would
expect \144\206.


 +#include <windows.h>

Where is windows.h going to come from when building a
cross-compiler?  I think this needs to be removed.  I see that you
define the structs yourself, as you should, so why do you need
windows.h?



 +/* Read a potentially unaligned 2 byte word at P, using native endianness.  
 */

Is there really ever a case where a 2 byte word might be misaligned?


 +/* Return true iff SYM is a defined symbol for a function.  Data symbols
 +   are discarded because they aren't easily identified.  */
 +
 +static int
 +coff_is_symbol (const b_coff_internal_symbol *isym)
 +{
 +  return isym->type == 0x20 && isym->sec > 0;
 +}

Is this really right?  This seems to test for DT_FCN set, but won't a
function returning, say, int, have type 0x24 = (DT_FCN << N_TBSHFT) | T_INT?

Also, the name isn't right--this is coff_is_function_symbol.


 +  if (coff_expand_symbol (isym, asym, sects_num, strtab, strtab_size) < 0)
 +   {
 + error_callback (data, "invalid coff symbol", 0);
 + return 0;
 +   }

That's not a very useful error message--can you be more specific?


 +  /* Allocate memory for symbols are strings.  */

Comment looks wrong--omit are?


Ian


[PATCH, AARCH64] make stdarg functions work with +nofp

2015-05-22 Thread Jim Wilson
The compiler currently ICEs when compiling a stdarg function with
+nofp, as reported in PR 66258.

The aarch64.md file disables FP instructions using TARGET_FLOAT, which
supports both -mgeneral-regs-only and +nofp.  But there is code in
aarch64.c that checks TARGET_GENERAL_REGS_ONLY.  This results in FP
instructions when using +nofp.  The aarch64.c code needs to use
TARGET_FLOAT instead like the md file already does.

I can't meaningfully test this with a bootstrap, since the patch has
no effect unless I bootstrap with +nofp, and that will fail as gcc
contains floating point code.

The testsuite already has multiple stdarg tests, so there is no need
for another one.

I tested this by verifying I get the same results for some simple
testcasess with and without the patch, with and without using
-mgeneral-regs-only and -mcpu=cortex-a53+nofp.
2015-05-22  Jim Wilson  jim.wil...@linaro.org

	PR target/66258
	* config/aarch64/aarch64.c (aarch64_function_value_regno_p): Change
	!TARGET_GENERAL_REGS_ONLY to TARGET_FLOAT.
	(aarch64_secondary_reload): Likewise.
	(aarch64_expand_builtin_va_start): Change TARGET_GENERAL_REGS_ONLY
	to !TARGET_FLOAT.
	(aarch64_gimplify_va_arg_expr, aarch64_setup_incoming_varargs):
	Likewise.

Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c	(revision 223590)
+++ config/aarch64/aarch64.c	(working copy)
@@ -1666,7 +1666,7 @@ aarch64_function_value_regno_p (const un
   /* Up to four fp/simd registers can return a function value, e.g. a
  homogeneous floating-point aggregate having four members.  */
   if (regno >= V0_REGNUM && regno < V0_REGNUM + HA_MAX_NUM_FLDS)
-return !TARGET_GENERAL_REGS_ONLY;
+return TARGET_FLOAT;
 
   return false;
 }
@@ -4783,7 +4783,7 @@ aarch64_secondary_reload (bool in_p ATTR
   /* A TFmode or TImode memory access should be handled via an FP_REGS
  because AArch64 has richer addressing modes for LDR/STR instructions
  than LDP/STP instructions.  */
-  if (!TARGET_GENERAL_REGS_ONLY && rclass == GENERAL_REGS
+  if (TARGET_FLOAT && rclass == GENERAL_REGS
      && GET_MODE_SIZE (mode) == 16 && MEM_P (x))
 return FP_REGS;
 
@@ -7571,7 +7571,7 @@ aarch64_expand_builtin_va_start (tree va
   vr_save_area_size
 = (NUM_FP_ARG_REGS - cum->aapcs_nvrn) * UNITS_PER_VREG;
 
-  if (TARGET_GENERAL_REGS_ONLY)
+  if (!TARGET_FLOAT)
 {
   if (cum->aapcs_nvrn > 0)
 	sorry ("%qs and floating point or vector arguments",
@@ -7681,7 +7681,7 @@ aarch64_gimplify_va_arg_expr (tree valis
 	   is_ha))
 {
   /* TYPE passed in fp/simd registers.  */
-  if (TARGET_GENERAL_REGS_ONLY)
+  if (!TARGET_FLOAT)
 	sorry ("%qs and floating point or vector arguments",
 	   "-mgeneral-regs-only");
 
@@ -7918,7 +7918,7 @@ aarch64_setup_incoming_varargs (cumulati
   gr_saved = NUM_ARG_REGS - local_cum.aapcs_ncrn;
   vr_saved = NUM_FP_ARG_REGS - local_cum.aapcs_nvrn;
 
-  if (TARGET_GENERAL_REGS_ONLY)
+  if (!TARGET_FLOAT)
 {
   if (local_cum.aapcs_nvrn > 0)
 	sorry ("%qs and floating point or vector arguments",


Re: Fix two more memory leaks in threader

2015-05-22 Thread Jeff Law

On 05/20/2015 10:41 AM, Jakub Jelinek wrote:

On Wed, May 20, 2015 at 10:36:25AM -0600, Jeff Law wrote:


These fix the remaining leaks in the threader that I'm aware of.  We failed
to properly clean-up when we had to cancel certain jump threading
opportunities.  So thankfully this wasn't a big leak.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed on
the trunk.

Jeff



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fe4dfc4..27435c6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2015-05-20  Jeff Law  l...@redhat.com
+
+   * tree-ssa-threadupdate.c (mark_threaded_blocks): Properly
+   dispose of the jump thread path when the jump threading
+   opportunity is cancelled.
+
  2015-05-20  Manuel López-Ibáñez  m...@gcc.gnu.org

* diagnostic.c (diagnostic_print_caret_line): Fix off-by-one error
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index c5b78a4..4bccad0 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2159,9 +2159,16 @@ mark_threaded_blocks (bitmap threaded_blocks)
  {
  /* Attach the path to the starting edge if none is yet recorded.  */
	    if ((*path)[0]->e->aux == NULL)
-	      (*path)[0]->e->aux = path;
-	    else if (dump_file && (dump_flags & TDF_DETAILS))
-	      dump_jump_thread_path (dump_file, *path, false);
+	      {
+		(*path)[0]->e->aux = path;
+	      }


Why the braces around single stmt if body?

To match what was happening in the else clause.  I always find

if (fubar)
  something
else
  {
more stuff
even more stuff
  }

more painful to parse because of the mis-matched indentation than

if (fubar)
  {
something
  }
else
  {
more stuff
even more stuff
  }

It's purely a style thing and if you'd prefer not to have the extra 
braces, I wouldn't lose any sleep over removing them.



Also, the indentation seems to be weird.
Looks like spaces vs tabs issue.  I'll do a pass over 
tree-ssa-threadedge.c and fix them all up at once.


jeff



Re: [gofrontend-dev] Re: GO tools for gccgo cross

2015-05-22 Thread Ian Lance Taylor
On Fri, May 22, 2015 at 3:11 PM, Andrew Chambers
andrewchambe...@gmail.com wrote:

 I'm not suggesting breaking go conventions, I just think that if no
 GOARCH is specified the default should match --target.

Sounds good to me.

 Perhaps we could check the symlink name for the target triple if no GOARCH
 is set.

We should probably build it into the tool, perhaps in zdefaultcc.go
created in gotools/Makefile.am.

Ian


Re: PR fortran/44054 Convert all gfc_error_1 calls to gfc_error

2015-05-22 Thread Manuel López-Ibáñez
PING: https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01511.html

This only needs approval from Fortran maintainers.

On 17 May 2015 at 20:22, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 Hi,

 This patch finishes the conversion of Fortran diagnostics to use the
 common diagnostics by removing all gfc_error*_1 variants.

 I noticed that whether some buffered gfc_error_1() end up printed may
 depend on whether a gfc_error_now is given or not, and not only on
 whether there is any output buffered. Thus, I reintroduced a new
 error_buffer of type gfc_buffer_error.

 The rest is mostly mechanic.

 I did not make an attempt in this patch to remove all code that has
 become obsolete now:

 gfc_get_terminal_width (already implemented in diagnostics.c)
 error_char (already empty, but used by other obsolete functions)
 error_string (obsolete, just use %s)
 error_uinteger (obsolete, just use %lu)
 error_integer (obsolete, just use %ld)
 gfc_widechar_display_length, gfc_wide_display_length,
 print_wide_char_into_buffer, gfc_print_wide_char (I'm not sure how
 this functionality differs from what the common diagnostics already
 do, perhaps some of it should be moved to the common code)
 show_locus (obsolete, except "Included at" handling should be moved to
 the common diagnostics, no testcase is testing this).
 show_loci (obsolete, except "During initialization" handling should be
 moved to the common diagnostics, no testcase is testing this)
 error_print, error_printf (obsolete)

 Bootstrapped and regression tested on x86_64-linux-gnu.

 OK?

 gcc/fortran/ChangeLog:

 2015-05-17  Manuel López-Ibáñez  m...@gcc.gnu.org

 PR fortran/44054
 * gfortran.h (struct gfc_error_buf): Rename as
 gfc_error_buffer. Move closer to push, pop and free
 methods. Reimplement using an output_buffer.
 * error.c (errors, warnings, warning_buffer, cur_error_buffer):
 Delete everywhere in this file.
 (error_char): Delete all contents.
 (gfc_increment_error_count): Delete.
 (gfc_error_now): Update comment. Set error_buffer.flag.
 (gfc_warning_check): Do not handle warning_buffer.
 (gfc_error_1): Delete.
 (gfc_error_now_1): Delete.
 (gfc_error_check): Simplify.
 (gfc_move_error_buffer_from_to): Renamed from
 gfc_move_output_buffer_from_to.
 (gfc_push_error): Handle only gfc_error_buffer.
 (gfc_pop_error): Likewise.
 (gfc_free_error): Likewise.
 (gfc_get_errors): Remove warnings and errors.
 (gfc_diagnostics_init): Use static error_buffer.
 (gfc_error_1,gfc_error_now_1): Delete declarations.
 * symbol.c, decl.c, trans-common.c, data.c, expr.c, expr.c,
 frontend-passes.c, resolve.c, match.c, parse.c: Replace
 gfc_error_1 with gfc_error and gfc_error_now_1 with gfc_error_now
 everywhere.
 * f95-lang.c (gfc_be_parse_file): Do not update errorcount and
 warningcount here.
 * primary.c (match_complex_constant): Replace gfc_error_buf and
 output_buffer with gfc_error_buffer.


Re: [PATCH GCC]Improve how we handle overflow for type conversion in scev/ivopts, part I

2015-05-22 Thread Richard Biener
On Wed, May 20, 2015 at 11:41 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 As we know, GCC is too conservative when checking overflow behavior in SCEV
 and loop related optimizers.  Result is some variable can't be recognized as
 scalar evolution and thus optimizations are missed.  To be specific,
 optimizers like ivopts and vectorizer are affected.
 This issue is more severe on 64 bit platforms, for example, PR62173 is
 failed on aarch64; scev-3.c and scev-4.c were marked as XFAIL on lp64
 platforms.

 As the first part to improve overflow checking in GCC, this patch does below
 improvements:
   1) Ideally, chrec_convert should be responsible to convert scev like
 (type){base, step} to scev like {(type)base, (type)step} when the result
 scev doesn't overflow; chrec_convert_aggressive should do the conversion if
 the result scev could overflow/wrap.  Unfortunately, current implementation
 may use chrec_convert_aggressive to return a scev that won't overflow.  This
 is because of a) the static parameter fold_conversions for
 instantiate_scev_convert can only tracks whether chrec_convert_aggressive
 may be called, rather than if it does some overflow conversion or not;  b)
 the implementation of instantiate_scev_convert sometimes shortcuts the call
 to chrec_convert and misses conversion opportunities.  This patch improves
 this.
   2) iv-no_overflow computed in simple_iv is too conservative.  With 1)
 fixed, iv-no_overflow should reflects whether chrec_convert_aggressive does
 return an overflow scev.  This patch improves this.
   3) chrec_convert should be able to prove the resulting scev won't overflow
 with loop niter information.  This patch doesn't finish this, but it
 factored a new interface out of scev_probably_wraps_p for future
 improvement.  And that will be the part II patch.

 With the improvements in SCEV, this patch also improves optimizer(IVOPT)
 that uses scev information like below:
   For array reference in the form of arr[IV], GCC tries to derive new
 address iv {arr+iv.base, iv.step*elem_size} from IV.  If IV overflows wrt a
 type that is narrower than the address space, this derivation is not true
 because arr[IV] isn't a scev.  The root cause why scev-*.c fail now is that
 the overflow information of IV is too conservative.  IVOPT has to be
 conservative to reject arr[IV] as a scev.  With more accurate overflow
 information, IVOPT can be improved too.  So this patch fixes the mentioned
 long standing issues.
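For readers less familiar with the failure mode, here is a minimal sketch (a hypothetical example, not taken from the testsuite) of the kind of loop this affects on LP64 targets:

```c
#include <assert.h>

/* On an LP64 target the address of a[i] is a + 4 * (long) i.  IVOPTs can
   only rewrite that as the affine scev {a, +, 4} if the narrow 32-bit IV
   'i' provably does not wrap in the wider address type -- which is
   exactly the overflow information the patch makes less conservative.  */
static int a[16] = { 0, 1, 2, 3, 4, 5, 6, 7,
                     8, 9, 10, 11, 12, 13, 14, 15 };

long
sum (int n)
{
  long s = 0;
  for (int i = 0; i < n; i++)  /* i is narrower than the address space */
    s += a[i];                 /* a[i] is *(a + (long) i), an address IV
                                  candidate once no-wrap is proven */
  return s;
}
```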

 Bootstrap and test on x86_64, x86 and aarch64.
 BTW, test gcc.target/i386/pr49781-1.c failed on x86_64, but I can confirmed
 it's not this patch's fault.

 So what's your opinion on this?.

I may be mixing things up but does

+chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions)
 {
...
+  if (evolution_function_is_affine_p (chrec))
+{
+  tree base, step;
+  struct loop *loop;
+
+  loop = get_chrec_loop (chrec);
+  base = CHREC_LEFT (chrec);
+  step = CHREC_RIGHT (chrec);
+  if (convert_affine_scev (loop, type, &base, &step, NULL, true))
+   return build_polynomial_chrec (loop->num, base, step);

^^^ not forget to set *fold_conversions to true?  Or we need to use
convert_affine_scev (..., false)?

+}

(bah, and the diff somehow messes up -p context :/  which is why I like
context diffs more)

Other from the above the patch looks good to me.

Thanks,
Richard.

 Thanks,
 bin

 2015-05-20  Bin Cheng  bin.ch...@arm.com

 PR tree-optimization/62173
 * tree-ssa-loop-ivopts.c (struct iv): New field.  Reorder fields.
 (alloc_iv, set_iv): New parameter.
 (determine_biv_step): Delete.
 (find_bivs): Inline original determine_biv_step.  Pass new
 argument to set_iv.
 (idx_find_step): Use no_overflow information for conversion.
 * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let
 resolve_mixers handle folded_casts.
 (instantiate_scev_name): Change bool parameter to bool pointer.
 (instantiate_scev_poly, instantiate_scev_binary): Ditto.
 (instantiate_array_ref, instantiate_scev_not): Ditto.
 (instantiate_scev_3, instantiate_scev_2): Ditto.
 (instantiate_scev_1, instantiate_scev_r): Ditto.
 (instantiate_scev_convert, ): Change parameter.  Pass argument
 to chrec_convert_aggressive.
 (instantiate_scev): Change argument.
 (resolve_mixers): New parameter and set it.
 (scev_const_prop): New argument.
 * tree-scalar-evolution.h (resolve_mixers): New parameter.
 * tree-chrec.c (convert_affine_scev): Call chrec_convert instead
 of chrec_convert_1.
 (chrec_convert): New parameter.  Move definition below.
 (chrec_convert_aggressive): New parameter and set it.  Call
 convert_affine_scev.
 * tree-chrec.h (chrec_convert): New parameter.
 (chrec_convert_aggressive): Ditto.
 * tree-ssa-loop-niter.c (loop_exits_before_overflow): 

Re: ODR merging and implicit typedefs

2015-05-22 Thread Eric Botcazou
 I will take a look if I can improve type_in_anonymous_namespace_p somehow.  So
 Ada produces TYPE_DECL with DECL_ABSTRACT that do have TYPE_STUB_DECL with
 TREE_PUBLIC NULL I suppose.

Do you mean DECL_ARTIFICIAL instead of DECL_ABSTRACT?  If so, presumably, yes, 
why wouldn't it do that?  That seems the natural description of an artificial 
private type with file scope.

And I don't really understand why DECL_ARTIFICIAL is allowed to make such a 
difference in semantics here.  That looks dangerous to me.


Note that the problematic assertions:

  gcc_assert (!type_with_linkage_p (t1) || !type_in_anonymous_namespace_p 
(t1));
  gcc_assert (!type_with_linkage_p (t2) || !type_in_anonymous_namespace_p 
(t2));

make the following code unreachable:

  if ((type_with_linkage_p (t1) && type_in_anonymous_namespace_p (t1))
      || (type_with_linkage_p (t2) && type_in_anonymous_namespace_p (t2)))
{
  /* We can not trip this when comparing ODR types, only when trying to
 match different ODR derivations from different declarations.
 So WARN should be always false.  */
  gcc_assert (!warn);
  return false;
}

-- 
Eric Botcazou


Re: Don't dump low gimple functions in gimple dump

2015-05-22 Thread Richard Biener
On Thu, May 21, 2015 at 5:36 PM, Thomas Schwinge
tho...@codesourcery.com wrote:
 Hi!

 It's just been a year.  ;-P

 In early March, I (hopefully correctly) adapted Tom's patch to apply to
 then-current GCC trunk sources; posting this here.  Is the general
 approach OK?

 On Tue, 20 May 2014 10:16:45 +0200, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Honza,

 Consider this program:
 ...
 int
 main(void)
 {
 #pragma omp parallel
   {
 extern void foo(void);
 foo ();
   }
   return 0;
 }
 ...

 When compiling this program with -fopenmp, the ompexp pass splits off a new
 function called main._omp_fn.0 containing the call to foo.  The new function 
 is
 then dumped into the gimple dump by analyze_function.

 There are two problems with this:
 - the new function is in low gimple, and is dumped next to high gimple
   functions
 - since it's already low, the new function is not lowered, and 'goes missing'
   in the dumps following the gimple dump, until it reappears again after the
   last lowering dump.
   [ http://gcc.gnu.org/ml/gcc/2014-03/msg00312.html ]

 This patch fixes the problems by ensuring that analyze_function only dumps 
 the
 new function to the gimple dump after gimplification (specifically, by moving
 the dump_function call into gimplify_function_tree.  That makes the call to
 dump_function in finalize_size_functions superfluous).

 That also requires us to add a call to dump_function in finalize_task_copyfn,
 where we split off a new high gimple function.

 And in expand_omp_taskreg and expand_omp_target, where we split off a new low
 gimple function, we now dump the new function into the current (ompexp) dump
 file, which is the last lowering dump.

 Finally, we dump an information statement at the start of
 cgraph_add_new_function to give a better idea when and what kind of function 
 is
 created.

 Bootstrapped and reg-tested on x86_64.

 OK for trunk ?

 Thanks,
 - Tom

 commit b925b393c3d975a9281789d97aff8a91a8b53be0
 Author: Thomas Schwinge tho...@codesourcery.com
 Date:   Sun Mar 1 15:05:15 2015 +0100

 Don't dump low gimple functions in gimple dump

 id:537b0f6d.7060...@mentor.com or id:53734dc5.90...@mentor.com

 2014-05-19  Tom de Vries  t...@codesourcery.com

 * cgraphunit.c (cgraph_add_new_function): Dump message on new 
 function.
 (analyze_function): Don't dump function to gimple dump file.
 * gimplify.c: Add tree-dump.h include.
 (gimplify_function_tree): Dump function to gimple dump file.
 * omp-low.c: Add tree-dump.h include.
 (finalize_task_copyfn): Dump new function to gimple dump file.
 (expand_omp_taskreg, expand_omp_target): Dump new function to dump 
 file.
 * stor-layout.c (finalize_size_functions): Don't dump function to 
 gimple
 dump file.

 * gcc.dg/gomp/dump-task.c: New test.
 ---
  gcc/cgraphunit.c  | 15 ++-
  gcc/gimplify.c|  3 +++
  gcc/omp-low.c |  6 ++
  gcc/stor-layout.c |  1 -
  gcc/testsuite/gcc.dg/gomp/dump-task.c | 33 +
  5 files changed, 56 insertions(+), 2 deletions(-)

 diff --git gcc/cgraphunit.c gcc/cgraphunit.c
 index 8280fc4..0860c86 100644
 --- gcc/cgraphunit.c
 +++ gcc/cgraphunit.c
 @@ -501,6 +501,20 @@ cgraph_node::add_new_function (tree fndecl, bool lowered)
  {
gcc::pass_manager *passes = g-get_passes ();
cgraph_node *node;
 +
 +  if (dump_file)
 +{
 +  const char *function_type = ((gimple_has_body_p (fndecl))
 +  ? (lowered
 + ? "low gimple"
 + : "high gimple")
 +  : "to-be-gimplified");
 +  fprintf (dump_file,
 +  "Added new %s function %s to callgraph\n",
 +  function_type,
 +  fndecl_name (fndecl));
 +}
 +
switch (symtab-state)
  {
case PARSING:
 @@ -629,7 +643,6 @@ cgraph_node::analyze (void)
  body.  */
if (!gimple_has_body_p (decl))
 gimplify_function_tree (decl);
 -  dump_function (TDI_generic, decl);

/* Lower the function.  */
if (!lowered)
 diff --git gcc/gimplify.c gcc/gimplify.c
 index 9214648..d6c500d 100644
 --- gcc/gimplify.c
 +++ gcc/gimplify.c
 @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "gimple-low.h"
  #include "cilk.h"
  #include "gomp-constants.h"
 +#include "tree-dump.h"

  #include "langhooks-def.h" /* FIXME: for lhd_set_decl_assembler_name */
  #include "tree-pass.h" /* FIXME: only for PROP_gimple_any */
 @@ -9435,6 +9436,8 @@ gimplify_function_tree (tree fndecl)
cfun-curr_properties = PROP_gimple_any;

pop_cfun ();
 +
 +  dump_function (TDI_generic, fndecl);

Err - that dumps gimple now.  I think the dump you removed above was
simply bogus
and should have used TDI_gimple?  Or is TDI_generic just 

[committed] Fix warnings in extend.texi

2015-05-22 Thread Marek Polacek
This is to fix two occurrences of the warning "`.' or `,' must follow
@xref, not )".

Applying to trunk.

2015-05-22  Marek Polacek  pola...@redhat.com

* doc/extend.texi: Use @pxref instead of @xref.

diff --git gcc/doc/extend.texi gcc/doc/extend.texi
index 5539199..6c51bc4 100644
--- gcc/doc/extend.texi
+++ gcc/doc/extend.texi
@@ -6304,7 +6304,7 @@ compilers to match the native Microsoft compiler.
 GCC allows attributes to be set on C labels.  @xref{Attribute Syntax}, for 
 details of the exact syntax for using attributes.  Other attributes are 
 available for functions (@pxref{Function Attributes}), variables 
-(@pxref{Variable Attributes}), enumerators (@xref{Enumerator Attributes}),
+(@pxref{Variable Attributes}), enumerators (@pxref{Enumerator Attributes}),
 and for types (@pxref{Type Attributes}).
 
 This example uses the @code{cold} label attribute to indicate the 
@@ -6358,7 +6358,7 @@ with computed goto or @code{asm goto}.
 GCC allows attributes to be set on enumerators.  @xref{Attribute Syntax}, for
 details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
-(@pxref{Variable Attributes}), labels (@xref{Label Attributes}),
+(@pxref{Variable Attributes}), labels (@pxref{Label Attributes}),
 and for types (@pxref{Type Attributes}).
 
 This example uses the @code{deprecated} enumerator attribute to indicate the

Marek


Ping: [Patch, fortran, PR44672, v6] [F08] ALLOCATE with SOURCE and no array-spec

2015-05-22 Thread Andre Vehreschild
Hi,

the patch (65548) this one depends on is in trunk now. 

Still bootstraps ok and regtests with the issue in
gfortran.dg/alloc_comp_constructor_1.f90 (which is addressed by the patch for
pr58586 already) on x86_64-linux-gnu/f21.

Ok for trunk?

- Andre

On Tue, 19 May 2015 12:26:02 +0200
Andre Vehreschild ve...@gmx.de wrote:

 Hi all,
 
 update based on latest 65548 (v5) patch and current trunk. Description and
 issue addressed unchanged (see cite below).
 
 Bootstrapped and regtested on x86_64-linux-gnu/f21.
 
 Any volunteers to review? The initial version dates back to March 30. 2015.
 Not a single comment so far!
 
 - Andre
 
 
 
 On Thu, 30 Apr 2015 16:17:42 +0200
 Andre Vehreschild ve...@gmx.de wrote:
 
  Hi all,
  
   and also for this bug, I'd like to present an updated patch. It was brought
   to my attention that the previous patch did not fix statements like:
  
  allocate(m, source=[(I, I=1, n)])
  
  where n is a variable and
  
  type p
class(*), allocatable :: m(:,:)
  end type
  real mat(2,3)
  type(P) :: o
  allocate(o%m, source=mat)
  
  The new version of the patch fixes those issue now also and furthermore
  addresses some issues (most probably not all) where the rank of the
  source=-variable and the rank of the array to allocate differ. For example,
   when one does:
  
  real v(:)
  allocate(v, source= arr(1,2:3))
  
  where arr has a rank of 2 and only the source=-expression a rank of one,
   which is then compatible with v. Nevertheless this needed addressing
   when setting up the descriptor of v and during the data copy.
  
  Bootstrap ok on x86_64-linux-gnu/f21.
  Regtests with one regression in gfortran.dg/alloc_comp_constructor_1.f90,
  which is addressed in the patch for pr58586, whose final version is in
  preparation.
  
  Ok for trunk in combination with 58586 once both are reviewed?
  
  Regards,
  Andre
  
  
  On Wed, 29 Apr 2015 17:23:58 +0200
  Andre Vehreschild ve...@gmx.de wrote:
  
   Hi all,
   
   this is the fourth version of the patch, adapting to the current state of
    trunk. This patch is based on my patch for 65548 version 2 and needs that
   patch applied beforehand to apply cleanly. The patch for 65548 is
   available from:
   
   https://gcc.gnu.org/ml/fortran/2015-04/msg00121.html
   
   Scope:
   
   Allow allocate of arrays w/o having to give an array-spec as specified in
   F2008:C633. An example is:
   
   integer, dimension(:) :: arr
   allocate(arr, source = [1,2,3])
   
   Solution:
   
    While resolving an allocate, the objects to allocate are analyzed for
    whether they carry an array-spec; if not, the array-spec of the
    source=-expression is transferred. Unfortunately some source=-expressions
    are not easy to handle and have to be assigned to a temporary variable
    first. Only with the temporary variable is gfc_trans_allocate() able to
    compute the array descriptor correctly and allocate with correct array
    bounds.
   
   Side notes:
   
    This patch creates a regression in alloc_comp_constructor_1.f90 where two
    free()'s go missing. This will be fixed by the patch for pr58586 and
   therefore not repeated here.
   
   Bootstraps and regtests ok on x86_64-linux-gnu/f21.
   
   Ok for trunk?
   
   Regards,
 Andre
   
  
  
 
 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


[Ada] Constants without variable input are not hidden state

2015-05-22 Thread Arnaud Charlet
This patch implements the following rule with respect to constants:

   SPARK RM 7.1.1(2) - The hidden state of a package P consists of:
   * any variables, or constants with variable inputs, declared immediately in
   the private part or body of P.

Constants without variable input are not considered part of the hidden state of
a package. These constants do not require indicator Part_Of when declared in
the private part of a package.


-- Source --


--  types.ads

package Types is
   type CA is array (1 .. 3) of Integer;
   type UA is array (Integer range <>) of Integer;

   type VR (Discr : Boolean) is record
  Comp_1 : Integer;

  case Discr is
  when True =>
Comp_2 : Integer;
  when others =>
null;
  end case;
   end record;

   subtype CR is VR (True);

   function Get_CA return CA;
   function Get_CR return CR;
   function Get_In return Integer;
   function Get_UA return UA;
   function Get_VR return VR;
end Types;

--  types.adb

package body Types is
   function Get_CA return CA is
   Result : constant CA := (others => 0);
   begin
  return Result;
   end Get_CA;

   function Get_CR return CR is
   Result : constant CR := (Discr => True, Comp_1 => 0, Comp_2 => 0);
   begin
  return Result;
   end Get_CR;

   function Get_In return Integer is
   begin
  return 0;
   end Get_In;

   function Get_UA return UA is
   begin
  return (0, 0, 0);
   end Get_UA;

   function Get_VR return VR is
   begin
   return (Discr => False, Comp_1 => 0);
   end Get_VR;
end Types;

--  legal_hidden_state.ads

with Types; use Types;
pragma Elaborate_All (Types);

package Legal_Hidden_State
  with SPARK_Mode,
   Abstract_State => State
is
   procedure Force_Body;

private
   --  Constrained array

   C1 : constant CA := Get_CA with Part_Of => State;
   C2 : constant CA := (others => Get_In) with Part_Of => State;
   C3 : constant CA := (1, 2, Get_In) with Part_Of => State;

   --  Constrained record

   C4 : constant CR := Get_CR with Part_Of => State;
   C5 : constant CR :=
  (Discr => True, Comp_1 => 1, Comp_2 => Get_In) with Part_Of => State;

   --  Scalar

   C6 : constant Integer := Get_In with Part_Of => State;

   --  Unconstrained array

   C7 : constant UA := Get_UA with Part_Of => State;
   C8 : constant UA (1 .. 3) := (others => Get_In) with Part_Of => State;
   C9 : constant UA := (1, 2, Get_In) with Part_Of => State;

   --  Variant record

   C10 : constant VR := Get_VR with Part_Of => State;
   C11 : constant VR :=
    (Discr => False, Comp_1 => Get_In) with Part_Of => State;
   C12 : constant VR (False) :=
    (Discr => False, Comp_1 => Get_In) with Part_Of => State;
end Legal_Hidden_State;

--  legal_hidden_state.adb

package body Legal_Hidden_State
  with SPARK_Mode,
   Refined_State =>
 (State =>
   (C1,  C2,  C3,  C4,  C5,  C6,  C7,  C8,  C9,  C10, C11, C12,
C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, C24))
is
   --  Constrained array

   C13 : constant CA := Get_CA;
   C14 : constant CA := (others => Get_In);
   C15 : constant CA := (1, 2, Get_In);

   --  Constrained record

   C16 : constant CR := Get_CR;
   C17 : constant CR := (Discr => True, Comp_1 => 17, Comp_2 => Get_In);

   --  Scalar

   C18 : constant Integer := Get_In;

   --  Unconstrained array

   C19 : constant UA := Get_UA;
   C20 : constant UA (1 .. 3) := (others => Get_In);
   C21 : constant UA := (1, 2, Get_In);

   --  Variant record

   C22 : constant VR := Get_VR;
   C23 : constant VR := (Discr => False, Comp_1 => Get_In);
   C24 : constant VR (False) := (Discr => False, Comp_1 => Get_In);

   procedure Force_Body is begin null; end Force_Body;
end Legal_Hidden_State;

-
-- Compilation --
-

$ gcc -c legal_hidden_state.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Hristian Kirtchev  kirtc...@adacore.com

* sem_prag.adb (Analyze_Pragma): Constants without variable
input do not require indicator Part_Of.
(Check_Missing_Part_Of): Constants without variable input do not
require indicator Part_Of.
(Collect_Visible_States): Constants without variable input are
not part of the hidden state of a package.
* sem_util.ads, sem_util.adb (Has_Variable_Input): New routine.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 223484)
+++ sem_prag.adb(working copy)
@@ -2710,7 +2710,7 @@
  Legal   : out Boolean);
   --  Subsidiary to the analysis of pragmas Abstract_State and Part_Of.
   --  Perform full analysis of indicator Part_Of. Item_Id is the entity of
-  --  an abstract state, variable or package instantiation. State is the
+  --  an abstract state, object or package instantiation. State is the
   --  encapsulating state. Indic is the Part_Of indicator. Flag Legal is
   --  set when the indicator 

Re: Calculate TYPE_CANONICAL only for types that can be accessed in memory

2015-05-22 Thread Jan Hubicka
 Now we have it spelled out 4 times ... makes sense to create a new
 macro for it?  (though I cannot think of a good name...
 UNACCESSIBLE_TYPE_P ()?)

Yep, actually I already made that version of patch yesterday but then got
hooked by beers.  This is better version (also with more sensible comments).
I will commit it in the afternoon if you have no further comments.
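As a plain-C illustration of the distinction the new predicate draws (my example, not part of the patch): an object of a genuinely incomplete type cannot be accessed at all, but an element of an incomplete array type can, so such array types still need an alias set.

```c
/* 'int []' is an incomplete type at the point of the extern
   declaration, yet its element type is complete, so an element access
   through it is valid -- the case type_with_alias_set_p keeps.  A
   forward-declared struct, by contrast, allows no access whatsoever.  */
extern int a[];			/* incomplete array type */

int
get (int i)
{
  return a[i];			/* OK: reads an int through int[] */
}

int a[4] = { 1, 2, 3, 4 };	/* type completed later in the TU */
```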


* lto.c (hash_canonical_type): Be sure we hash only types that
need alias set.
(gimple_register_canonical_type_1): Do not produce canonical
types for types that do not need alias sets.
* tree.c (gimple_canonical_types_compatible_p): Sanity check that
we do not try to compute canonical type for type that does not need
alias set.
(verify_type): Drop FIXME for METHOD_TYPE, update FIXME for
FUNCTION_TYPE.
* tree.h (type_with_alias_set_p): New.
Index: lto/lto.c
===
--- lto/lto.c   (revision 223508)
+++ lto/lto.c   (working copy)
@@ -309,6 +309,12 @@ hash_canonical_type (tree type)
 {
   inchash::hash hstate;
 
+  /* We compute alias sets only for types that need them.
+ Be sure we do not recurse to something else, as we can not hash incomplete
+ types in a way that they would have the same hash value as compatible
+ complete types.  */
+  gcc_checking_assert (type_with_alias_set_p (type));
+
   /* Combine a few common features of types so that types are grouped into
  smaller sets; when searching for existing matching types to merge,
  only existing types having the same features as the new type will be
@@ -493,7 +495,7 @@ gimple_register_canonical_type_1 (tree t
 static void
 gimple_register_canonical_type (tree t)
 {
-  if (TYPE_CANONICAL (t))
+  if (TYPE_CANONICAL (t) || !type_with_alias_set_p (t))
 return;
 
   gimple_register_canonical_type_1 (t, hash_canonical_type (t));
Index: tree.c
===
--- tree.c  (revision 223508)
+++ tree.c  (working copy)
@@ -12720,6 +12720,23 @@ gimple_canonical_types_compatible_p (con
   if (t1 == NULL_TREE || t2 == NULL_TREE)
 return false;
 
+  /* We consider complete types always compatible with incomplete type.
+ This does not make sense for canonical type calculation and thus we
+ need to ensure that we are never called on it.
+
+ FIXME: For more correctness the function probably should have three modes
+   1) mode assuming that types are complete matching their structure
+   2) mode allowing incomplete types but producing equivalence classes
+  and thus ignoring all info from complete types
+   3) mode allowing incomplete types to match complete but checking
+  compatibility between complete types.
+
+ 1 and 2 can be used for canonical type calculation. 3 is the real
+ definition of type compatibility that can be used i.e. for warnings during
+ declaration merging.  */
+
+  gcc_assert (!trust_type_canonical
+ || (type_with_alias_set_p (t1) && type_with_alias_set_p (t2)));
   /* If the types have been previously registered and found equal
  they still are.  */
   if (TYPE_CANONICAL (t1) && TYPE_CANONICAL (t2)
@@ -12939,10 +12953,9 @@ verify_type (const_tree t)
   /* Method and function types can not be used to address memory and thus
  TYPE_CANONICAL really matters only for determining useless conversions.
 
- FIXME: C++ FE does not agree with gimple_canonical_types_compatible_p
- here.  gimple_canonical_types_compatible_p calls comp_type_attributes
- while for C++ FE the attributes does not make difference.  */
-  else if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE)
+ FIXME: C++ FE produce declarations of builtin functions that are not
+ compatible with main variants.  */
+  else if (TREE_CODE (t) == FUNCTION_TYPE)
 ;
   else if (t != ct
   /* FIXME: gimple_canonical_types_compatible_p can not compare types
Index: tree.h
===
--- tree.h  (revision 223508)
+++ tree.h  (working copy)
@@ -5090,6 +5090,26 @@ int_bit_position (const_tree field)
  + wi::to_offset (DECL_FIELD_BIT_OFFSET (field))).to_shwi ();
 }
 
+/* Return true if it makes sense to consider alias set for a type T.  */
+
+inline bool
+type_with_alias_set_p (const_tree t)
+{
+  /* Function and method types are never accessed as memory locations.  */
+  if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE)
+return false;
+
+  if (COMPLETE_TYPE_P (t))
+return true;
+
+  /* Incomplete types can not be accessed in general, except for arrays
+ whose element type is complete: an element can be fetched even though
+ the array bounds are unknown.  */
+  if (TREE_CODE (t) == ARRAY_TYPE && COMPLETE_TYPE_P (TREE_TYPE (t)))
+return true;
+
+  return false;
+}
+
 extern void gt_ggc_mx (tree );
 extern void gt_pch_nx (tree );
 extern void gt_pch_nx 

[RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Ramana Radhakrishnan

All,


	This patch removes the special casing for targets with relaxed memory 
ordering and handles guard accesses with equivalent atomic load acquire 
operations. In this process we change the algorithm to load the guard 
variable with an atomic load that has ACQUIRE semantics. I'm not 
terribly familiar with the C++ front-end so I'm not sure I've used the 
appropriate interfaces for doing something like this.


This then means that on targets which have weak memory models, the fast 
path is inlined and can directly use a load-acquire instruction where 
available (and yay! one more hook gone).
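In plain C, the fast path this arranges for looks roughly as follows. This is a sketch with stand-in guard helpers; the real ABI entry points are __cxa_guard_acquire/__cxa_guard_release, which additionally serialise concurrent initialisers, and the exact guard layout is target-defined.

```c
#include <stdlib.h>

static long long guard;		/* guard variable; first byte is the
				   "initialised" flag */
static int *p;

/* Hypothetical stand-ins for the __cxa_guard_* runtime routines.  */
static int
guard_acquire (long long *g)
{
  return *(char *) g == 0;	/* claim initialisation if not yet done */
}

static void
guard_release (long long *g)
{
  /* Release store; pairs with the acquire load on the fast path.  */
  __atomic_store_n ((char *) g, 1, __ATOMIC_RELEASE);
}

int *
f (void)
{
  /* Fast path: acquire load of the guard, instead of consulting
     TARGET_RELAXED_ORDERING and always calling into the runtime.  */
  if (!(__atomic_load_n ((char *) &guard, __ATOMIC_ACQUIRE) & 1))
    {
      if (guard_acquire (&guard))
	{
	  p = malloc (sizeof (int));	/* the guarded initialisation */
	  *p = 42;
	  guard_release (&guard);
	}
    }
  return p;
}
```

On AArch64 the acquire load on the fast path maps directly onto the ldar instruction shown in the diff below.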


Currently bootstrapping and regression testing on AArch64 and ARM (prior 
to the commit that caused PR66241). If this goes in then I'm happy to 
withdraw part of the patches to trunk for AArch64 / ARM that defines 
TARGET_RELAXED_ORDERING and only propose those hunks to the branches.


I'd also request the other target maintainers CC'd to help by testing 
this on their platforms as I do not have access to all of them.


To help folks see the difference, this is the difference in output for a 
compiler for AArch64 built with TARGET_RELAXED_ORDERING set to true and 
this patch for the testcase below.



int* f(void) {
  static int* p = new int;
  return p;
}

-   adrpx19, .LANCHOR0
-   add x20, x19, :lo12:.LANCHOR0
-   mov x0, x20
-   bl  __cxa_guard_acquire
-   cbnzw0, .L2
-   ldr x0, [x20, 8]
+   adrpx20, .LANCHOR0
+   add x19, x20, :lo12:.LANCHOR0
+   ldarx0, [x19]
+   tbz x0, 0, .L11
+.L9:
+   ldr x0, [x19, 8]



regards
Ramana

2015-05-22  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

PR c++/66192
* config/alpha/alpha.c (TARGET_RELAXED_ORDERING): Delete.
* config/ia64/ia64.c (TARGET_RELAXED_ORDERING): Likewise.
* config/rs6000/rs6000.c (TARGET_RELAXED_ORDERING): Likewise.
* config/sparc/linux.h (SPARC_RELAXED_ORDERING): Likewise.
* config/sparc/linux64.h (SPARC_RELAXED_ORDERING): Likewise.
* config/sparc/sparc.c (TARGET_RELAXED_ORDERING): Likewise.
* config/sparc/sparc.h (SPARC_RELAXED_ORDERING): Likewise.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_RELAXED_ORDERING): Delete.
* target.def (TARGET_RELAXED_ORDERING): Delete.

gcc/cp/ChangeLog:

2015-05-22  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

PR c++/66192
* decl.c (expand_static_init): Remove special casing for
targets with weak memory model.
* decl2.c (build_atomic_load): New function.
 (get_guard_cond): Adjust for atomic_load.


[PATCH] PR other/66250: Can't adjust complex nor decimal floating point modes

2015-05-22 Thread H.J. Lu
machmode.def has

/* Allow the target to specify additional modes of various kinds.  */

/* Complex modes.  */
COMPLEX_MODES (INT);
COMPLEX_MODES (FLOAT);

/* Decimal floating point modes.  */
DECIMAL_FLOAT_MODE (SD, 4, decimal_single_format);
DECIMAL_FLOAT_MODE (DD, 8, decimal_double_format);
DECIMAL_FLOAT_MODE (TD, 16, decimal_quad_format);

We can't adjust any complex or DFP modes in i386-modes.def since they
aren't available yet.  But we need to include i386-modes.def before

COMPLEX_MODES (INT);
COMPLEX_MODES (FLOAT);

to get extra modes.  This patch adds EXTRA_ADJUSTMENTS_FILE containing
adjustments to machine modes and includes it after all modes have been
created.

Tested on Linux/x86-64.  The generated mode files are equivalent before
and after the change. I enclosed the diffs here.  OK for master?

Thanks.

H.J.
---
PR other/66250
* Makefile.in (extra_adjustments_file): New.
(build/genmodes.o): Depend on $(extra_adjustments_file).
* config.gcc (extra_adjustments): New.  Set to
${cpu_type}/${cpu_type}-adjustments.def if available.
* configure.ac (extra_adjustments): AC_SUBST and
AC_DEFINE_UNQUOTED if set.
* genmodes.c (HAVE_EXTRA_ADJUSTMENTS): New.
(EXTRA_ADJUSTMENTS_FILE): Likewise.
(emit_autogen_header): Likewise.
(emit_insn_modes_h): Call emit_autogen_header.
(emit_insn_modes_c_header): Likewise.
(emit_min_insn_modes_c_header): Likewise.
* machmode.def: Include EXTRA_ADJUSTMENTS_FILE if
HAVE_EXTRA_ADJUSTMENTS is 1.
* config.in: Regenerated.
* configure: Likewise.
* config/i386/i386-modes.def (ADJUST_FLOAT_FORMAT,
ADJUST_BYTESIZE, ADJUST_ALIGNMENT): Moved to ...
* config/i386/i386-adjustments.def: Here.  New file.
---
 gcc/Makefile.in  |  3 ++-
 gcc/config.gcc   |  9 +
 gcc/config.in|  7 +++
 gcc/config/i386/i386-adjustments.def | 28 
 gcc/config/i386/i386-modes.def   | 12 +---
 gcc/configure| 16 ++--
 gcc/configure.ac |  9 +
 gcc/genmodes.c   | 34 ++
 gcc/machmode.def |  5 +
 9 files changed, 97 insertions(+), 26 deletions(-)
 create mode 100644 gcc/config/i386/i386-adjustments.def

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 512e7c8..87e1cb1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -544,6 +544,7 @@ lang_tree_files=@lang_tree_files@
 target_cpu_default=@target_cpu_default@
 OBJC_BOEHM_GC=@objc_boehm_gc@
 extra_modes_file=@extra_modes_file@
+extra_adjustments_file=@extra_adjustments_file@
 extra_opt_files=@extra_opt_files@
 host_hook_obj=@out_host_hook_obj@
 
@@ -2540,7 +2541,7 @@ CFLAGS-errors.o += -DHOST_GENERATOR_FILE
 build/genmddeps.o: genmddeps.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h\
   errors.h $(READ_MD_H)
 build/genmodes.o : genmodes.c $(BCONFIG_H) $(SYSTEM_H) errors.h \
-  $(HASHTAB_H) machmode.def $(extra_modes_file)
+  $(HASHTAB_H) machmode.def $(extra_modes_file) $(extra_adjustments_file)
 build/genopinit.o : genopinit.c $(RTL_BASE_H) $(BCONFIG_H) $(SYSTEM_H) \
   coretypes.h $(GTM_H) errors.h gensupport.h optabs.def
 build/genoutput.o : genoutput.c $(RTL_BASE_H) $(BCONFIG_H) $(SYSTEM_H) \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1fcc290..85572d6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -103,6 +103,10 @@
 #   machine modes, if necessary and different from
 #   $cpu_type/$cpu_type-modes.def.
 #
+#  extra_adjustments  The name of the file containing adjustments to
+#   machine modes, if necessary and different from
+#   $cpu_type/$cpu_type-adjustments.def.
+#
 #  extra_objs  List of extra objects that should be linked into
 #  the compiler proper (cc1, cc1obj, cc1plus)
 #  depending on target.
@@ -494,6 +498,11 @@ if test -f ${srcdir}/config/${cpu_type}/${cpu_type}-modes.def
 then
extra_modes=${cpu_type}/${cpu_type}-modes.def
 fi
+extra_adjustments=
+if test -f ${srcdir}/config/${cpu_type}/${cpu_type}-adjustments.def
+then
+   extra_adjustments=${cpu_type}/${cpu_type}-adjustments.def
+fi
 if test -f ${srcdir}/config/${cpu_type}/${cpu_type}.opt
 then
extra_options=${extra_options} ${cpu_type}/${cpu_type}.opt
diff --git a/gcc/config/i386/i386-adjustments.def b/gcc/config/i386/i386-adjustments.def
new file mode 100644
index 000..690201e
--- /dev/null
+++ b/gcc/config/i386/i386-adjustments.def
@@ -0,0 +1,28 @@
+/* Definitions of target adjustments for GCC for x86.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as 

[Ada] Removal of SPARK RM 6.9 (11)

2015-05-22 Thread Arnaud Charlet
This patch removes the (incorrect) implementation of the following rule:

   SPARK RM 6.9 (11) - A non-ghost library unit package or generic package
   specification shall not require a completion solely because of ghost
   declarations. [In other words, if a library unit package or generic package
   specification requires a body, then it must still require a body if all of
   the ghost declarations therein were to be removed.]

The patch also updates references to SPARK RM 6.9 in various compiler files.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Hristian Kirtchev  kirtc...@adacore.com

* ghost.adb (Check_Ghost_Completion): Update references to SPARK
RM 6.9 rules.
(Check_Ghost_Policy): Update references to SPARK RM 6.9 rules.
* sem_ch3.adb (Analyze_Object_Declaration): Update references
to SPARK RM 6.9 rules.
(Check_Completion): Ghost entities do not require a special form of
completion.
* sem_ch6.adb (Analyze_Generic_Subprogram_Body): Update references
to SPARK RM 6.9 rules.
(Analyze_Subprogram_Body_Helper): Update references to SPARK RM 6.9
rules.
* sem_ch7.adb (Analyze_Package_Body_Helper): Update references
to SPARK RM 6.9 rules.
(Requires_Completion_In_Body): Ghost entities do not require a special
form of completion.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 223484)
+++ sem_ch3.adb (working copy)
@@ -4055,7 +4055,7 @@
 
   --  The Ghost policy in effect at the point of declaration
   --  and at the point of completion must match
-  --  (SPARK RM 6.9(15)).
+  --  (SPARK RM 6.9(14)).
 
   if Present (Prev_Entity)
 and then Is_Ghost_Entity (Prev_Entity)
@@ -4237,7 +4237,7 @@
  Set_Is_Ghost_Entity (Id);
 
  --  The Ghost policy in effect at the point of declaration and at the
- --  point of completion must match (SPARK RM 6.9(16)).
+ --  point of completion must match (SPARK RM 6.9(14)).
 
  if Present (Prev_Entity) and then Is_Ghost_Entity (Prev_Entity) then
 Check_Ghost_Completion (Prev_Entity, Id);
@@ -10937,12 +10937,6 @@
  if Is_Intrinsic_Subprogram (E) then
 null;
 
- --  A Ghost entity declared in a non-Ghost package does not force the
- --  need for a body (SPARK RM 6.9(11)).
-
- elsif not Is_Ghost_Entity (Pack_Id) and then Is_Ghost_Entity (E) then
-null;
-
  --  The following situation requires special handling: a child unit
  --  that appears in the context clause of the body of its parent:
 
@@ -19964,7 +19958,7 @@
  Set_Is_Ghost_Entity (Full_T);
 
  --  The Ghost policy in effect at the point of declaration and at the
- --  point of completion must match (SPARK RM 6.9(15)).
+ --  point of completion must match (SPARK RM 6.9(14)).
 
  Check_Ghost_Completion (Priv_T, Full_T);
 
Index: sem_ch7.adb
===
--- sem_ch7.adb (revision 223476)
+++ sem_ch7.adb (working copy)
@@ -750,7 +750,7 @@
  Set_Is_Ghost_Entity (Body_Id);
 
  --  The Ghost policy in effect at the point of declaration and at the
- --  point of completion must match (SPARK RM 6.9(15)).
+ --  point of completion must match (SPARK RM 6.9(14)).
 
  Check_Ghost_Completion (Spec_Id, Body_Id);
   end if;
@@ -2527,12 +2527,6 @@
   then
  return False;
 
-  --  A Ghost entity declared in a non-Ghost package does not force the
-  --  need for a body (SPARK RM 6.9(11)).
-
-  elsif not Is_Ghost_Entity (Pack_Id) and then Is_Ghost_Entity (Id) then
- return False;
-
   --  Otherwise test to see if entity requires a completion. Note that
   --  subprogram entities whose declaration does not come from source are
   --  ignored here on the basis that we assume the expander will provide an
Index: ghost.adb
===
--- ghost.adb   (revision 223476)
+++ ghost.adb   (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---Copyright (C) 2014-2015, Free Software Foundation, Inc.   --
+--  Copyright (C) 2014-2015, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -106,7 +106,7 @@
 
begin
   --  The Ghost 

Re: [patch 10/10] debug-early merge: compiler proper

2015-05-22 Thread Richard Biener
On Wed, May 20, 2015 at 5:50 PM, Aldy Hernandez al...@redhat.com wrote:
 On 05/18/2015 06:56 AM, Richard Biener wrote:

 BTW, thanks for the review.

 On Fri, May 8, 2015 at 2:40 AM, Aldy Hernandez al...@redhat.com wrote:

 As seen on TV.


 +/* FIRST_TIME is set to TRUE for the first time we are called for a
 +   translation unit from finalize_compilation_unit() or false
 +   otherwise.  */
 +
   static void
 -analyze_functions (void)
 +analyze_functions (bool first_time)
   {
 ...
 +  if (first_time)
 +{
 +  symtab_node *snode;
 +  FOR_EACH_SYMBOL (snode)
 + check_global_declaration (snode->decl);
 +}
 +

 I think it is better to split analyze_functions (why does it have its own
 copy
 of unreachable node removal?) into analysis and unused-symbol removal
 and have the
 check_global_declaration call be in finalize_compilation_unit directly.
 Honza?


 Leaving things as is, as per Honza's comment ??


 @@ -1113,6 +1131,19 @@ analyze_functions (void)
  {
   if (symtab->dump_file)
  fprintf (symtab->dump_file, " %s", node->name ());
 +
 + /* See if the debugger can use anything before the DECL
 +passes away.  Perhaps it can notice a DECL that is now a
 +constant and can tag the early DIE with an appropriate
 +attribute.
 +
 +Otherwise, this is the last chance the debug_hooks have
 +at looking at optimized away DECLs, since
 +late_global_decl will subsequently be called from the
 +contents of the now pruned symbol table.  */
 + if (!decl_function_context (node->decl))
 +   (*debug_hooks->late_global_decl) (node->decl);
 +
   node->remove ();

 so this applies to VAR_DECLs only - shouldn't this be in the
 varpool_node::remove function then?  You can even register/deregister
 a hook for this in finalize_compilation_unit.  That would IMHO be better.


 The problem is that varpool_node::remove() may be called before we
 have finished parsing the DECL, thus before we call
 early_global_decl() on it.  So we would essentially be calling
 late_global_decl() on a DECL for which we haven't called
 early_global_decl().

 To complicate matters, we may call ::remove() before we finish parsing
 a decl.  In the C front-end, for instance, we call ::remove() from
 duplicate_decls(), before we even call rest_of_decl_compilation (where
 we call early_global_decl).

 Calling late_global_decl so early, before we have even finished
 parsing, seems wrong and obviously causes problems.  One example is
 dwarf2out can put the DECL into the deferred_asm_names list, only to
 have duplicate_decls() gcc_free it from under us.


 All debug stuff apart from dwarf2out.c changes (I assume Jason reviews
 them) are ok.

 diff --git a/gcc/gengtype.c b/gcc/gengtype.c
 index 02012d5..15b6dd2 100644
 --- a/gcc/gengtype.c
 +++ b/gcc/gengtype.c
 @@ -4718,7 +4718,8 @@ write_roots (pair_p variables, bool emit_pch)
  this funcion will have to be adjusted to be more like
  output_mangled_typename.  */

 -static void
 +/* ?? Why are we keeping this?  Is this actually used anywhere?  */
 +static void ATTRIBUTE_UNUSED
   output_typename (outf_p of, const_type_p t)
   {
 switch (t->kind)

 Just remove the function.


 Done.


 The langhooks changes are ok.

 diff --git a/gcc/passes.c b/gcc/passes.c
 index 04ff042..4dee8ad 100644
 --- a/gcc/passes.c
 +++ b/gcc/passes.c
 @@ -293,6 +293,28 @@ rest_of_decl_compilation (tree decl,
 else if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl)
  && TREE_STATIC (decl))
   varpool_node::get_create (decl);
 +
 +  /* Generate early debug for global variables.  Any local variables will
 + be handled by either handling reachable functions from
 + finalize_compilation_unit (and by consequence, locally scoped
 + symbols), or by rest_of_type_compilation below.
 +
 + Also, pick up function prototypes, which will be mostly ignored
 + by the different early_global_decl() hooks, but will at least be
 + used by Go's hijack of the debug_hooks to implement
 + -fdump-go-spec.  */
 +  if (!flag_wpa
 +   && !in_lto_p

 Just check !in_lto_p, !flag_wpa is redundant.


 Done.


 +   && !decl_function_context (decl)
 +   && !current_function_decl

 Why that?  !decl_function_context should catch relevant cases?


 You'd think, huh?  The issue here is extern declarations appearing
 inside of a function.  For this case, decl_function_context is NULL,
 because the actual context is toplevel, but current_function_decl is
 set to the function where the extern declaration appears.

 For example:

 namespace S
 {
   int
   f()
   {
 {
   int i = 42;
   {
 extern int i; // Local extern declaration.
 return i;
   }
 }
   }
 }

 I have added a big fat comment in the code now, since it's clearly not
 obvious why we need current_function_decl.


 +   && !decl_type_context (decl))
 +

Re: Calculate TYPE_CANONICAL only for types that can be accessed in memory

2015-05-22 Thread Jan Hubicka
  +  /* No need for canonical types of functions and methods; those are never
  + accessed as memory locations.  */
  +  if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE)
  +return;
 
 Just occurred to me that it might make sense to remove the
 FUNCTION/METHOD_TYPE case in useless_type_conversion_p (I wonder
 in which cases we end up in that path...).

Yes, I think that is unreachable (it was used only by that ignored code path in
ipa-symtab.c) and I also had a patch for it somewhere.  Will look it up in the
afternoon (I am teaching in the morning).
I think useless_type_conversion_p should also just work on types that do have
memory representation.

Honza


Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Ramana Radhakrishnan

Bah ! now with patch attached.

Ramana
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 1ba99d0..857c9ac 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -9987,12 +9987,6 @@ alpha_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_EXPAND_BUILTIN_VA_START
 #define TARGET_EXPAND_BUILTIN_VA_START alpha_va_start
 
-/* The Alpha architecture does not require sequential consistency.  See
-   http://www.cs.umd.edu/~pugh/java/memoryModel/AlphaReordering.html
-   for an example of how it can be violated in practice.  */
-#undef TARGET_RELAXED_ORDERING
-#define TARGET_RELAXED_ORDERING true
-
 #undef TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE alpha_option_override
 
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index c1e2ecd..45ad97a 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -630,11 +630,6 @@ static const struct attribute_spec ia64_attribute_table[] =
 #define TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P \
   ia64_libgcc_floating_mode_supported_p
 
-/* ia64 architecture manual 4.4.7: ... reads, writes, and flushes may occur
-   in an order different from the specified program order.  */
-#undef TARGET_RELAXED_ORDERING
-#define TARGET_RELAXED_ORDERING true
-
 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P ia64_legitimate_constant_p
 #undef TARGET_LEGITIMATE_ADDRESS_P
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a590ef4..ce70ca0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1620,17 +1620,6 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #define TARGET_STACK_PROTECT_FAIL rs6000_stack_protect_fail
 #endif
 
-/* MPC604EUM 3.5.2 Weak Consistency between Multiple Processors
-   The PowerPC architecture requires only weak consistency among
-   processors--that is, memory accesses between processors need not be
-   sequentially consistent and memory accesses among processors can occur
-   in any order. The ability to order memory accesses weakly provides
-   opportunities for more efficient use of the system bus. Unless a
-   dependency exists, the 604e allows read operations to precede store
-   operations.  */
-#undef TARGET_RELAXED_ORDERING
-#define TARGET_RELAXED_ORDERING true
-
 #ifdef HAVE_AS_TLS
 #undef TARGET_ASM_OUTPUT_DWARF_DTPREL
 #define TARGET_ASM_OUTPUT_DWARF_DTPREL rs6000_output_dwarf_dtprel
diff --git a/gcc/config/sparc/linux.h b/gcc/config/sparc/linux.h
index 56def4b..37f507d 100644
--- a/gcc/config/sparc/linux.h
+++ b/gcc/config/sparc/linux.h
@@ -139,12 +139,6 @@ do {   \
 /* Static stack checking is supported by means of probes.  */
 #define STACK_CHECK_STATIC_BUILTIN 1
 
-/* Linux currently uses RMO in uniprocessor mode, which is equivalent to
-   TMO, and TMO in multiprocessor mode.  But they reserve the right to
-   change their minds.  */
-#undef SPARC_RELAXED_ORDERING
-#define SPARC_RELAXED_ORDERING true
-
 #undef NEED_INDICATE_EXEC_STACK
 #define NEED_INDICATE_EXEC_STACK 1
 
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index fa805fd..8b4a435 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -253,12 +253,6 @@ do {   \
 /* Static stack checking is supported by means of probes.  */
 #define STACK_CHECK_STATIC_BUILTIN 1
 
-/* Linux currently uses RMO in uniprocessor mode, which is equivalent to
-   TMO, and TMO in multiprocessor mode.  But they reserve the right to
-   change their minds.  */
-#undef SPARC_RELAXED_ORDERING
-#define SPARC_RELAXED_ORDERING true
-
 #undef NEED_INDICATE_EXEC_STACK
 #define NEED_INDICATE_EXEC_STACK 1
 
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index a1562ad..094287f 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -808,9 +808,6 @@ char sparc_hard_reg_printed[8];
 #define TARGET_ATTRIBUTE_TABLE sparc_attribute_table
 #endif
 
-#undef TARGET_RELAXED_ORDERING
-#define TARGET_RELAXED_ORDERING SPARC_RELAXED_ORDERING
-
 #undef TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE sparc_option_override
 
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index ce1b68b..fd24431 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -106,17 +106,6 @@ extern enum cmodel sparc_cmodel;
 
 #define SPARC_DEFAULT_CMODEL CM_32
 
-/* The SPARC-V9 architecture defines a relaxed memory ordering model (RMO)
-   which requires the following macro to be true if enabled.  Prior to V9,
-   there are no instructions to even talk about memory synchronization.
-   Note that the UltraSPARC III processors don't implement RMO, unlike the
-   UltraSPARC II processors.  Niagara, Niagara-2, and Niagara-3 do not
-   implement RMO either.
-
-   Default to false; for example, Solaris never enables RMO, only ever uses
-   

[Ada] Constants and hidden state

2015-05-22 Thread Arnaud Charlet
This patch modifies the treatment of constants within the state space of a
package. Constants that appear in the hidden state space may or may not act
as constituents or possess indicator Part_Of. This is because the compiler
cannot accurately determine whether a constant has variable input, which in
turn classifies it as a hidden state.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Hristian Kirtchev  kirtc...@adacore.com

* sem_prag.adb (Analyze_Pragma): Remove the detection
of a useless Part_Of indicator when the related item is a constant.
(Check_Matching_Constituent): Do not emit an error on a constant.
(Check_Missing_Part_Of): Do not check for a missing Part_Of indicator
when the related item is a constant.
(Collect_Body_States): Code cleanup.
(Collect_Visible_States): Code cleanup.
(Report_Unused_States): Do not emit an error on a constant.
* sem_util.ads, sem_util.adb (Has_Variable_Input): Removed.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 223534)
+++ sem_prag.adb(working copy)
@@ -17555,20 +17555,6 @@
Legal   => Legal);
 
 if Legal then
-
-   --  Constants without variable input are not considered part
-   --  of the hidden state of a package (SPARK RM 7.1.1(2)). As a
-   --  result such constants do not require a Part_Of indicator.
-
-   if Ekind (Item_Id) = E_Constant
- and then not Has_Variable_Input (Item_Id)
-   then
-  SPARK_Msg_NE
-("useless Part_Of indicator, constant & does not have "
-  & "variable input", N, Item_Id);
-  return;
-   end if;
-
State_Id := Entity (State);
 
--  The Part_Of indicator turns an object into a constituent of
@@ -23983,14 +23969,25 @@
  end loop;
   end if;
 
+  --  Constants are part of the hidden state of a package, but
+  --  the compiler cannot determine whether they have variable
+  --  input (SPARK RM 7.1.1(2)) and cannot classify them as a
+  --  hidden state. Accept the constant quietly even if it is
+  --  a visible state or lacks a Part_Of indicator.
+
+  if Ekind (Constit_Id) = E_Constant then
+ null;
+
   --  If we get here, then the constituent is not a hidden
   --  state of the related package and may not be used in a
   --  refinement (SPARK RM 7.2.2(9)).
 
-  Error_Msg_Name_1 := Chars (Spec_Id);
-  SPARK_Msg_NE
-("cannot use & in refinement, constituent is not a hidden "
-  & "state of package %", Constit, Constit_Id);
+  else
+ Error_Msg_Name_1 := Chars (Spec_Id);
+ SPARK_Msg_NE
+   ("cannot use & in refinement, constituent is not a "
+ & "hidden state of package %", Constit, Constit_Id);
+  end if;
end if;
 end Check_Matching_Constituent;
 
@@ -24434,7 +24431,6 @@
  
 
  procedure Collect_Visible_States (Pack_Id : Entity_Id) is
-Decl: Node_Id;
 Item_Id : Entity_Id;
 
  begin
@@ -24453,28 +24449,16 @@
elsif Ekind (Item_Id) = E_Abstract_State then
   Add_Item (Item_Id, Result);
 
-   elsif Ekind_In (Item_Id, E_Constant, E_Variable) then
-  Decl := Declaration_Node (Item_Id);
+   --  Do not consider constants or variables that map generic
+   --  formals to their actuals, as the formals cannot be named
+   --  from the outside and participate in refinement.
 
-  --  Do not consider constants or variables that map generic
-  --  formals to their actuals as the formals cannot be named
-  --  from the outside and participate in refinement.
+   elsif Ekind_In (Item_Id, E_Constant, E_Variable)
+ and then No (Corresponding_Generic_Association
+(Declaration_Node (Item_Id)))
+   then
+  Add_Item (Item_Id, Result);
 
-  if Present (Corresponding_Generic_Association (Decl)) then
- null;
-
-  --  Constants without variable input are not considered a
-  --  hidden state of a package (SPARK RM 7.1.1(2)).
-
-  elsif Ekind (Item_Id) = E_Constant
-and then not Has_Variable_Input (Item_Id)
-  then
- null;
-
-  else
-   

[Ada] Cannot rename component of Volatile_Full_Access object

2015-05-22 Thread Arnaud Charlet
It is not allowed to rename a component of a composite object to which
pragma Volatile_Full_Access has been applied. The following is compiled
with -gnatj55

 1. package RenamVFA is
 2.type Int8_t is mod 2**8;
 3.type Rec is record
 4.  A,B,C,D : Int8_t;
 5.end record;
 6.for Rec'Size use 32;
 7.for Rec'Alignment use 4;
 8.pragma Volatile_Full_Access (Rec);
 9.R : Rec;
10.I1 : Int8_t renames R.A; -- illegal for now
|
 >>> cannot rename component of
Volatile_Full_Access object

11.type Arr is array (1 .. 4) of Int8_t;
12.pragma Volatile_Full_Access (Arr);
13.A : Arr;
14.I2 : Int8_t renames A (1); -- illegal for now
   |
 >>> cannot rename component of
Volatile_Full_Access object

15. end RenamVFA;

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Robert Dewar  de...@adacore.com

* sem_ch8.adb (Analyze_Object_Renaming): Check for renaming
component of an object to which Volatile_Full_Access applies.

Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 223541)
+++ sem_ch8.adb (working copy)
@@ -912,6 +912,25 @@
   ("renaming of conversion only allowed for tagged types", Nam);
  end if;
 
+ --  Reject renaming of component of Volatile_Full_Access object
+
+ if Nkind_In (Nam, N_Selected_Component, N_Indexed_Component) then
+declare
+   P : constant Node_Id := Prefix (Nam);
+begin
+   if Is_Entity_Name (P) then
+  if Has_Volatile_Full_Access (Entity (P))
+   or else
+ Has_Volatile_Full_Access (Etype (P))
+  then
+ Error_Msg_N
+   ("cannot rename component of Volatile_Full_Access "
+ & "object", Nam);
+  end if;
+   end if;
+end;
+ end if;
+
  Resolve (Nam, T);
 
  --  If the renamed object is a function call of a limited type,


[Ada] Raise Program_Error on default initialization of references

2015-05-22 Thread Arnaud Charlet
This patch causes default initialization of objects of types
Constant_Reference_Type and Reference_Type in the containers packages to raise
Program_Error as required by the RM.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Bob Duff  d...@adacore.com

* a-cborma.ads, a-cidlli.ads, a-cimutr.ads, a-ciormu.ads,
* a-cihase.ads, a-cohama.ads, a-coorse.ads, a-cbhama.ads,
* a-cborse.ads, a-comutr.ads, a-ciorma.ads, a-cobove.ads,
* a-coormu.ads, a-convec.ads, a-cohase.ads, a-coinho.ads,
* a-cbdlli.ads, a-cbmutr.ads, a-cbhase.ads, a-cdlili.ads,
* a-cihama.ads, a-coinve.ads, a-ciorse.ads, a-coorma.ads,
* a-coinho-shared.ads (Constant_Reference_Type, Reference_Type):
Add an initialization expression raise Program_Error. See,
for example, RM-A.18.2(148.4).

Index: a-cdlili.ads
===
--- a-cdlili.ads(revision 223476)
+++ a-cdlili.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2004-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2004-2015, Free Software Foundation, Inc. --
 --  --
 -- This specification is derived from the Ada Reference Manual for use with --
 -- GNAT. The copyright notice above, and the license provisions that follow --
@@ -319,9 +319,13 @@
pragma Inline (Finalize);
 
type Constant_Reference_Type
-  (Element : not null access constant Element_Type) is
+ (Element : not null access constant Element_Type) is
   record
- Control : Reference_Control_Type;
+ Control : Reference_Control_Type :=
+   raise Program_Error with "uninitialized reference";
+ --  The RM says, "The default initialization of an object of
+ --  type Constant_Reference_Type or Reference_Type propagates
+ --  Program_Error."
   end record;
 
procedure Write
@@ -337,9 +341,13 @@
for Constant_Reference_Type'Read use Read;
 
type Reference_Type
-  (Element : not null access Element_Type) is
+ (Element : not null access Element_Type) is
   record
- Control : Reference_Control_Type;
+ Control : Reference_Control_Type :=
+   raise Program_Error with "uninitialized reference";
+ --  The RM says, "The default initialization of an object of
+ --  type Constant_Reference_Type or Reference_Type propagates
+ --  Program_Error."
   end record;
 
procedure Write
Index: a-cihama.ads
===
--- a-cihama.ads(revision 223476)
+++ a-cihama.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2004-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2004-2015, Free Software Foundation, Inc. --
 --  --
 -- This specification is derived from the Ada Reference Manual for use with --
 -- GNAT. The copyright notice above, and the license provisions that follow --
@@ -379,9 +379,13 @@
pragma Inline (Finalize);
 
type Constant_Reference_Type
-  (Element : not null access constant Element_Type) is
+ (Element : not null access constant Element_Type) is
   record
- Control : Reference_Control_Type;
+ Control : Reference_Control_Type :=
+   raise Program_Error with "uninitialized reference";
+ --  The RM says, "The default initialization of an object of
+ --  type Constant_Reference_Type or Reference_Type propagates
+ --  Program_Error."
   end record;
 
procedure Write
@@ -397,9 +401,13 @@
for Constant_Reference_Type'Read use Read;
 
type Reference_Type
-  (Element : not null access Element_Type) is
+ (Element : not null access Element_Type) is
   record
- Control : Reference_Control_Type;
+ Control : Reference_Control_Type :=
+   raise Program_Error with "uninitialized reference";
+ --  The RM says, "The default initialization of an object of
+ --  type Constant_Reference_Type or Reference_Type propagates
+ --  Program_Error."
   end record;
 
procedure Write
Index: a-coinve.ads
===
--- a-coinve.ads(revision 223476)
+++ a-coinve.ads(working copy)
@@ -6,7 +6,7 @@
 --

Re: [Ada] Correct some anmolies in the handling of Atomic

2015-05-22 Thread Duncan Sands

Hi Arnaud,


Index: exp_util.adb
===
--- exp_util.adb(revision 223476)
+++ exp_util.adb(working copy)
@@ -204,6 +204,13 @@
 when others => null;
   end case;

+  --  Nothing to do for the identifier in an object renaming declaration,
+  --  the renaming itself does not need atomic syncrhonization.


syncrhonization - synchronization

Ciao, Duncan.


Re: [patch 10/10] debug-early merge: compiler proper

2015-05-22 Thread Richard Biener
On Wed, May 20, 2015 at 11:45 PM, Aldy Hernandez al...@redhat.com wrote:
 On 05/20/2015 05:01 PM, Jan Hubicka wrote:


 commit 8824b5ecba26cef065e47b34609c72677c3c36fc
 Author: Aldy Hernandez al...@redhat.com
 Date:   Wed May 20 16:31:14 2015 -0400

  Set DECL_IGNORED_P on temporary arrays created in the switch
  conversion pass.

 diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
 index 6b68a16..a4bcdba 100644
 --- a/gcc/tree-switch-conversion.c
 +++ b/gcc/tree-switch-conversion.c
 @@ -1097,6 +1097,7 @@ build_one_array (gswitch *swtch, int num, tree
 arr_index_type,
 DECL_ARTIFICIAL (decl) = 1;
 TREE_CONSTANT (decl) = 1;
 TREE_READONLY (decl) = 1;
 +  DECL_IGNORED_P (decl) = 1;
 varpool_node::finalize_decl (decl);


 This looks obvious enough to me.  Technically speaking the array type
 constructed
 probably should be TREE_ARTIFICIAL, but probably it does not matter.

Fine to commit to trunk btw.


 Yeah, that's what I thought.  I ignored the type because it won't make it to
 the debugging back end if we stop things at the DECL itself.

 FWIW, Ada is filled with these temporaries and/or types that should really
 be ignored, and are currently causing grief.

 If you grep for finalize_decl, there are several other calls:
 asan.c:  varpool_node::finalize_decl (var);
 asan.c:  varpool_node::finalize_decl (var);
 cgraphbuild.c:  varpool_node::finalize_decl (decl);
 cgraphunit.c:- varpool_finalize_decl
 cgraphunit.c:   varpool_node::finalize_decl (decl);
 cgraphunit.c:varpool_node::finalize_decl (tree decl)
 coverage.c:   varpool_node::finalize_decl (var);
 coverage.c:  varpool_node::finalize_decl (var);


 Etc etc.

 Hmmm, I bet mainline is generating dwarf for all this.  I don't feel
 comfortable touching all this (ok, I'm lazy), but it would seem like almost
 all of these calls would benefit from DECL_IGNORED_P.  Perhaps we could add
 an argument to finalize_decl() and do it in there.

The only issue is in passes using build_decl directly.  I guess we'd want
a middle-end-ish "create new global static" similar to what we have for
locals (create_tmp_var).  Some of the callers above already set DECL_IGNORED_P
properly.

Richard.

 Aldy


 coverage.c:  varpool_node::finalize_decl (fn_info_ary);
 coverage.c:  varpool_node::finalize_decl (gcov_info_var);
 omp-low.c:varpool_node::finalize_decl (t);
 omp-low.c:varpool_node::finalize_decl (t);
 omp-low.c:varpool_node::finalize_decl (decl);
 omp-low.c:  varpool_node::finalize_decl (vars_decl);
 omp-low.c:  varpool_node::finalize_decl (funcs_decl);
 passes.c:   varpool_node::finalize_decl (decl);
 tree-chkp.c:  varpool_node::finalize_decl (var);
 tree-chkp.c:  varpool_node::finalize_decl (bnd_var);
 tree-profile.c:  varpool_node::finalize_decl (ic_void_ptr_var);
 tree-profile.c:  varpool_node::finalize_decl (ic_gcov_type_ptr_var);
 tree-switch-conversion.c:  varpool_node::finalize_decl (decl);
 ubsan.c:  varpool_node::finalize_decl (decl);
 ubsan.c:  varpool_node::finalize_decl (var);
 ubsan.c:  varpool_node::finalize_decl (array);
 varasm.c:  varpool_node::finalize_decl (decl);
 varpool.c:   Unlike finalize_decl function is intended to be used
 varpool.c:  varpool_node::finalize_decl (decl);

 I would say most of them need similar treatment (I am not 100% sure about
 the OMP
 ones that may be user-visible)

 Honza


 fetch = build4 (ARRAY_REF, value_type, decl, tidx, NULL_TREE,




Re: [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.

2015-05-22 Thread Ramana Radhakrishnan

 Ok for trunk?

I can't approve, but do you mind taking care of -march=armv8-a in the
arm backend too, as that would have the same issues?

Ramana

 Matthew

 gcc/
 2015-05-21  Matthew Wahab  matthew.wa...@arm.com

 * config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
 (aarch64_split_atomic_op): Check for __sync memory models, emit
 appropriate initial and final barriers.




[Patch libstdc++] Rewrite cpu/generic/atomic_word.h

2015-05-22 Thread Ramana Radhakrishnan

Hi,

While writing atomic_word.h for the ARM backend to fix PR target/66200 I 
thought it would make more sense to write it all up with atomic 
primitives instead of providing various fragile bits of inline 
assembler. Thus this patch came about. I intend to ask for a 
specialized version of this to be applied for the ARM backend in any 
case after testing completes.


If this goes in as a cleanup, I expect all the ports to delete their 
atomic_word.h implementations in the various ports directories; I'll 
follow up with patches asking port maintainers to test and apply for 
each of the individual cases if this is deemed good enough.


Currently tested on AArch64 (without the other patch for PR 
target/66200) and tested in my tree as part of testing for the ARM 
backend on arm-linux-gnueabihf with no regressions.


Ok to apply ?


regards
Ramana



* config/cpu/generic/atomic_word.h: Rewrite using atomics.
commit a360fdf84683777db764ba313354da92691aeb17
Author: Ramana Radhakrishnan ramana.radhakrish...@arm.com
Date:   Fri May 22 08:00:10 2015 +

rewrite as atomics.

diff --git a/libstdc++-v3/config/cpu/generic/atomic_word.h 
b/libstdc++-v3/config/cpu/generic/atomic_word.h
index 19038bb..bedce31 100644
--- a/libstdc++-v3/config/cpu/generic/atomic_word.h
+++ b/libstdc++-v3/config/cpu/generic/atomic_word.h
@@ -29,19 +29,46 @@
 #ifndef _GLIBCXX_ATOMIC_WORD_H
 #define _GLIBCXX_ATOMIC_WORD_H 1
 
+#include <bits/cxxabi_tweaks.h>
+
 typedef int _Atomic_word;
 
-// Define these two macros using the appropriate memory barrier for the target.
-// The commented out versions below are the defaults.
-// See ia64/atomic_word.h for an alternative approach.
+
+namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
+{
+  // Test the first byte of __g and ensure that no loads are hoisted across
+  // the test.
+  inline bool
+  __test_and_acquire (__cxxabiv1::__guard *__g)
+  {
+unsigned char __c;
+unsigned char *__p = reinterpret_cast<unsigned char *>(__g);
+__atomic_load (__p, &__c, __ATOMIC_ACQUIRE);
+return __c != 0;
+  }
+
+  // Set the first byte of __g to 1 and ensure that no stores are sunk
+  // across the store.
+  inline void
+  __set_and_release (__cxxabiv1::__guard *__g)
+  {
+unsigned char *__p = reinterpret_cast<unsigned char *>(__g);
+unsigned char val = 1;
+__atomic_store (__p, &val, __ATOMIC_RELEASE);
+  }
+
+#define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __gnu_cxx::__test_and_acquire (G)
+#define _GLIBCXX_GUARD_SET_AND_RELEASE(G) __gnu_cxx::__set_and_release (G)
 
 // This one prevents loads from being hoisted across the barrier;
 // in other words, this is a Load-Load acquire barrier.
-// This is necessary iff TARGET_RELAXED_ORDERING is defined in tm.h.  
-// #define _GLIBCXX_READ_MEM_BARRIER __asm __volatile ("":::"memory")
+// This is necessary iff TARGET_RELAXED_ORDERING is defined in tm.h.
+#define _GLIBCXX_READ_MEM_BARRIER __atomic_thread_fence (__ATOMIC_ACQUIRE)
 
 // This one prevents stores from being sunk across the barrier; in other
 // words, a Store-Store release barrier.
-// #define _GLIBCXX_WRITE_MEM_BARRIER __asm __volatile ("":::"memory")
+#define _GLIBCXX_WRITE_MEM_BARRIER __atomic_thread_fence (__ATOMIC_RELEASE)
+
+}
 
 #endif 


Re: [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.

2015-05-22 Thread Matthew Wahab

On 22/05/15 12:26, Ramana Radhakrishnan wrote:


Ok for trunk?


I can't approve but do you mind taking care of -march=armv8-a in the
arm backend too as that would have the same issues.



Will do,
Matthew



Re: [patch, testsuite, ARM] don't try to execute advsimd-intrinsics tests on hardware without NEON

2015-05-22 Thread Richard Biener
On Thu, May 21, 2015 at 6:06 PM, Sandra Loosemore
san...@codesourcery.com wrote:
 On 05/21/2015 03:48 AM, Christophe Lyon wrote:

 On 21 May 2015 at 07:33, Sandra Loosemore san...@codesourcery.com wrote:

 ARM testing shares the AArch64 advsimd-intrinsics execution tests.  On
 ARM,
 though, the NEON support being tested is optional -- some arches are
 compatible with the NEON compilation options but hardware available for
 testing might or might not be able to execute those instructions. In
 arm-none-eabi testing of a long list of multilibs, I found that this
 problem
 caused some of the multilibs to get stuck for days because every one of
 these execution tests was wandering off into the weeds and timing out.

 The vect.exp tests already handle this by setting dg-do-what-default to
 either run or compile, depending on whether we have target hardware
 execution support (arm_neon_hw) for NEON, or only compilation support
 (arm_neon_ok).  So, I've adapted that logic for advsimd-intrinsics.exp
 too.


 Indeed it makes sense.


 It also appeared that the main loop over the test cases was running them
 all
 twice with the torture options -- once using c-torture-execute and once
 using gcc-dg-runtest.  I deleted the former since it appears to ignore
 dg-do-what-default and always try to execute no matter what.  My
 dejagnu-fu
 isn't the strongest and this is pretty confusing to me; am I missing
 something here?  Otherwise, OK to commit?


 As noted by Alan in
 https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01348.html
 the sets of options covered by gcc-dg-runtest and c-torture-execute
 are slightly different.

 That was the reason I kept both.
 We can probably live with no longer testing -Og -g as Alan says.
 OTOH, are the 2 option sets supposed to be the same, or are there any
 plans to make them differ substantially  in the future?


 Richard, adding -Og -g was your change:

 https://gcc.gnu.org/ml/gcc-patches/2012-09/msg01367.html

 Is it an oversight that the torture option lists in c-torture.exp,
 objc-torture.exp, and gcc-dg.exp are not consistent?  Maybe we should have a
 separate patch to unify them?

I think the various torture flags were never consistent.  I simply
avoided putting
even more load on the various combinations tested (we should remove some
of them, like the -finline-functions and -funroll-loops cases which add little
or nothing today).

But yes, that's separate things.  I'm fine with a patch to unify the various
torture flag list into one common one and a later patch trimming it down
somewhat.

Thanks,
Richard.

 -Sandra





Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test suite failures

2015-05-22 Thread Bin.Cheng
On Fri, May 22, 2015 at 4:58 PM, Kyrill Tkachov
kyrylo.tkac...@foss.arm.com wrote:
 Hi Venkat,


 On 22/05/15 09:50, Kumar, Venkataramanan wrote:

 Hi Kyrill,

 Sorry for little delay in responding.

 -Original Message-
 From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
 Sent: Tuesday, May 19, 2015 9:13 PM
 To: Kumar, Venkataramanan; James Greenhalgh; gcc-patches@gcc.gnu.org
 Cc: Ramana Radhakrishnan; seg...@kernel.crashing.org; Marcus Shawcroft
 Subject: Re: [Patch] [AArch64] PR target 66049: fix add/extend gcc test
 suite
 failures

 Hi Venkat,

 On 19/05/15 16:37, Kumar, Venkataramanan wrote:

 Hi Maintainers,

 Please find the attached patch, that fixes add/extend gcc test suite
 failures

 in Aarch64 target.

 Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66049

 These tests started to fail after we prevented combiner from converting

 shift RTX to mult RTX, when the RTX is not inside a memory operation
 (r222874) .

 Now I have added new add/extend patterns which are based on shift

 operations,  to fix these cases.

 Testing status with the patch.

 (1) GCC bootstrap on AArch64 successful.
 (2)  SPEC2006 INT runs did not show any degradation.

 Does that mean there was no performance regression? Or no codegen
 difference?

 Yes there was no performance regression.


 What I'd expect from this patch is that the codegen would be the same as
 before the combine patch
 (r222874). A performance difference can sometimes be hard to measure
 even at worse code quality.
 Can you please confirm that on SPEC2006 INT the adds and shifts are now
 back to being combined
 into a single instruction?

 I  used -dp --save-temps to dump pattern names in assembly files.
 I used revision before combiner patch (r222873) and the patch on top of
 trunk (little older r223194) for comparison.

 Quickly compared the counts generated for the below patterns from the SPEC
 INT binaries.

 *adds_<optab><mode>_multp2 vs *adds_<optab><ALLX:mode>_shft_<GPI:mode>
 *subs_<optab><mode>_multp2 vs *subs_<optab><ALLX:mode>_shft_<GPI:mode>
 *add_uxt<mode>_multp2  vs  *add_uxt<mode>_shift2
 *add_uxtsi_multp2_uxtw vs  *add_uxtsi_shift2_uxtw
 *sub_uxt<mode>_multp2  vs  *sub_uxt<mode>_shift2
 *sub_uxtsi_multp2_uxtw vs  *sub_uxtsi_shift2_uxtw
 *adds_mul_imm_<mode> vs *adds_shift_imm_<mode>
 *subs_mul_imm_<mode> vs *subs_shift_imm_<mode>

 Patterns are found in few benchmarks. The  generated counts are matching
 in the binaries compiled with the two revisions.

   Also looked at the generated assembly, and the adds and shifts are combined
 properly using the new patterns. Please let me know if this is OK.


 Thanks for investigating this. I've run your patch on aarch64.exp and I see
 the add+shift/extend failures
 that we were seeing go away (I'm sure you saw that as well ;).
Exactly what's expected.  This patch and the previous combine one are
designed to fix those failures.

Thanks,
bin
 Up to the maintainers to review the patch.

 Thanks,
 Kyrill



 Thanks,
 Kyrill

 (3) gcc regression testing passed.

 (-Snip-)
 # Comparing 3 common sum files
 ## /bin/sh ./gcc-fsf-trunk/contrib/compare_tests  /tmp/gxx-sum1.24998

 /tmp/gxx-sum2.24998

 Tests that now work, but didn't before:

 gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-

 9]+, lsl 3

 gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+,
 x[0-9]+,

 lsl 3

 gcc.target/aarch64/adds3.c scan-assembler-times adds\tx[0-9]+, x[0-9]+,

 x[0-9]+, sxtw 2

 gcc.target/aarch64/extend.c scan-assembler add\tw[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler add\tx[0-9]+,.*uxtw #?3
 gcc.target/aarch64/extend.c scan-assembler sub\tw[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxth #?1
 gcc.target/aarch64/extend.c scan-assembler sub\tx[0-9]+,.*uxtw #?3
 gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-

 9]+, lsl 3

 gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
 x[0-9]+,

 lsl 3

 gcc.target/aarch64/subs3.c scan-assembler-times subs\tx[0-9]+, x[0-9]+,

 x[0-9]+, sxtw 2

 # No differences found in 3 common sum files
 (-Snip-)

 The patterns are fixing the regressing tests, so I have not added any
 new

 tests.

 Regarding  removal of the old patterns based on mults,  I am planning
 to

 do it as a separate work.

 Is this OK for trunk ?

 gcc/ChangeLog

 2015-05-19  Venkataramanan Kumar  venkataramanan.ku...@amd.com

  * config/aarch64/aarch64.md
  (*adds_shift_imm_<mode>): New pattern.
  (*subs_shift_imm_<mode>): Likewise.
  (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
  (*add_uxt<mode>_shift2): Likewise.
  (*add_uxtsi_shift2_uxtw): Likewise.
  (*sub_uxt<mode>_shift2): Likewise.
  (*sub_uxtsi_shift2_uxtw): Likewise.


 Regards,
 Venkat.




 Regards,
 Venkat,




Add few cases to operand_equal_p

2015-05-22 Thread Jan Hubicka
Hi,
I am working on a patch that makes operand_equal_p replace the logic from
ipa-icf-gimple's compare_op via a valueizer hook.  Currently the patch however
cuts the number of merges on firefox in half (apparently because it gives up on
some tree codes too early).
The patch below merges code from ipa-icf-gimple.c that is missing in
fold-const and is needed. 

Bootstrapped/regtested x86_64-linux, OK?

Honza

* fold-const.c (operand_equal_p): Handle OBJ_TYPE_REF, CONSTRUCTOR
and be more tolerant about FIELD_DECL.
* tree.c (add_expr): Handle FIELD_DECL.
(prototype_p, virtual_method_call_p, obj_type_ref_class): Constify.
* tree.h (prototype_p, virtual_method_call_p, obj_type_ref_class):
Constify.
Index: fold-const.c
===
--- fold-const.c(revision 223500)
@@ -3037,6 +3058,40 @@ operand_equal_p (const_tree arg0, const_
case DOT_PROD_EXPR:
   return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
 
+   /* OBJ_TYPE_REF really is just a transparent wrapper around expression,
+  but it holds additional type information for devirtualization that
+  needs to be matched.  We may want to introduce an OEP flag if we want
+  to compare the actual value only, or if we also care about effects
+  of potentially merging the code.  This flag can bypass this check
+  as well as the alias set matching in MEM_REF.  */
+   case OBJ_TYPE_REF:
+ {
+   if (!operand_equal_p (OBJ_TYPE_REF_EXPR (arg0),
+ OBJ_TYPE_REF_EXPR (arg1), flags))
+ return false;
+   if (flag_devirtualize && virtual_method_call_p (arg0))
+ {
+   if (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
+   != tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1)))
+ return false;
+   if (!types_same_for_odr (obj_type_ref_class (arg0),
+obj_type_ref_class (arg1)))
+ return false;
+   /* OBJ_TYPE_REF_OBJECT is used to track the instance of
+  object THIS pointer points to.  Checking that both
+  addresses equal is safe approximation of the fact
+  that dynamic types are equal.
+  Do not care about the other flags, because this expression
+  does not contribute to the actual value of OBJ_TYPE_REF.  */
+   if (!operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
+ OBJ_TYPE_REF_OBJECT (arg1),
+ OEP_ADDRESS_OF))
+ return false;
+ }
+
+   return true;
+ }
+
default:
  return 0;
}
@@ -3097,6 +3152,21 @@ operand_equal_p (const_tree arg0, const_
}
 
 case tcc_declaration:
+  /* At LTO time the FIELD_DECL may exist in multiple copies.
+We only care about offsets and bit offsets for operands.  */
+  if (TREE_CODE (arg0) == FIELD_DECL)
+   {
+ tree offset1 = DECL_FIELD_OFFSET (arg0);
+ tree offset2 = DECL_FIELD_OFFSET (arg1);
+
+ tree bit_offset1 = DECL_FIELD_BIT_OFFSET (arg0);
+ tree bit_offset2 = DECL_FIELD_BIT_OFFSET (arg1);
+
+ flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
+
+ return operand_equal_p (offset1, offset2, flags)
+&& operand_equal_p (bit_offset1, bit_offset2, flags);
+   }
   /* Consider __builtin_sqrt equal to sqrt.  */
   return (TREE_CODE (arg0) == FUNCTION_DECL
   DECL_BUILT_IN (arg0)  DECL_BUILT_IN (arg1)
@@ -3104,12 +3174,50 @@ operand_equal_p (const_tree arg0, const_
   DECL_FUNCTION_CODE (arg0) == DECL_FUNCTION_CODE (arg1));
 
 default:
+  /* In GIMPLE empty constructors are allowed in initializers of
+vector types.  */
+  if (TREE_CODE (arg0) == CONSTRUCTOR)
+   {
+ unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (arg0));
+ unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (arg1));
+
+ if (length1 != length2)
+   return false;
+
+ flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
+
+ for (unsigned i = 0; i < length1; i++)
+   if (!operand_equal_p (CONSTRUCTOR_ELT (arg0, i)->value,
+ CONSTRUCTOR_ELT (arg1, i)->value, flags))
+ return false;
+
+ return true;
+   }
   return 0;
 }
 
 #undef OP_SAME
 #undef OP_SAME_WITH_NULL
 }
Index: tree.c
===
--- tree.c  (revision 223500)
+++ tree.c  (working copy)
@@ -7647,6 +7647,12 @@ add_expr (const_tree t, inchash::hash h
  }
return;
   }
+case FIELD_DECL:
+  /* At LTO time the FIELD_DECL may exist in multiple copies.
+We only care about offsets and bit offsets for operands.  */
+  

[PATCH][3/n] Reduction vectorization improvements

2015-05-22 Thread Richard Biener

This does some more cleanup and refactoring with two fixes: the
pure SLP compute in vect_analyze_loop_operations was failing
to look at pattern stmts, and the vect_is_slp_reduction hunk addresses
reduction detection failing because the pattern state changes in
between reduction detection and vectorization (which re-calls
the early code).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2015-05-22  Richard Biener  rguent...@suse.de

* tree-vectorizer.h (struct _slp_oprnd_info): Add second_pattern
member.
* tree-vect-loop.c (vect_analyze_loop_operations): Look at
patterns when determining whether SLP is pure.
(vect_is_slp_reduction): Remove check for pattern stmts.
(vect_is_simple_reduction_1): Remove dead code.
* tree-vect-slp.c (vect_create_oprnd_info): Initialize second_pattern.
(vect_get_and_check_slp_defs): Pass in the stmt number.
Allow the first def in a reduction to be not a pattern stmt when
the rest of the stmts def are patterns.
(vect_build_slp_tree_1): Allow tcc_expression codes like
SAD_EXPR and DOT_PROD_EXPR.
(vect_build_slp_tree): Adjust.
(vect_analyze_slp): Refactor and move BB vect error message ...
(vect_slp_analyze_bb_1): ... here.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 223529)
+++ gcc/tree-vect-loop.c(working copy)
@@ -1399,7 +1399,12 @@ vect_analyze_loop_operations (loop_vec_i
{
  gimple stmt = gsi_stmt (si);
  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
- gcc_assert (stmt_info);
+ if (STMT_VINFO_IN_PATTERN_P (stmt_info)
+ && STMT_VINFO_RELATED_STMT (stmt_info))
+   {
+ stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+ stmt_info = vinfo_for_stmt (stmt);
+   }
  if ((STMT_VINFO_RELEVANT_P (stmt_info)
   || VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info)))
    && !PURE_SLP_STMT (stmt_info))
@@ -2031,12 +2036,8 @@ vect_is_slp_reduction (loop_vec_info loo
 
   if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
 {
-  if (vinfo_for_stmt (use_stmt)
-   && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (use_stmt)))
-{
-  loop_use_stmt = use_stmt;
-  nloop_uses++;
-}
+ loop_use_stmt = use_stmt;
+ nloop_uses++;
 }
else
  n_out_of_loop_uses++;
@@ -2265,9 +2266,7 @@ vect_is_simple_reduction_1 (loop_vec_inf
   return NULL;
 }
 
-  if (vinfo_for_stmt (use_stmt)
-  && !is_pattern_stmt_p (vinfo_for_stmt (use_stmt)))
-nloop_uses++;
+  nloop_uses++;
   if (nloop_uses > 1)
 {
   if (dump_enabled_p ())
@@ -2325,9 +2324,7 @@ vect_is_simple_reduction_1 (loop_vec_inf
   gimple use_stmt = USE_STMT (use_p);
   if (is_gimple_debug (use_stmt))
continue;
-  if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt))
-  && vinfo_for_stmt (use_stmt)
-  && !is_pattern_stmt_p (vinfo_for_stmt (use_stmt)))
+  if (flow_bb_inside_loop_p (loop, gimple_bb (use_stmt)))
nloop_uses++;
   if (nloop_uses > 1)
{
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 223529)
+++ gcc/tree-vect-slp.c (working copy)
@@ -183,6 +183,7 @@ vect_create_oprnd_info (int nops, int gr
   oprnd_info->first_dt = vect_uninitialized_def;
   oprnd_info->first_op_type = NULL_TREE;
   oprnd_info->first_pattern = false;
+  oprnd_info->second_pattern = false;
   oprnds_info.quick_push (oprnd_info);
 }
 
@@ -242,7 +243,7 @@ vect_get_place_in_interleaving_chain (gi
 
 static int 
 vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
- gimple stmt, bool first,
+ gimple stmt, unsigned stmt_num,
  vec<slp_oprnd_info> *oprnds_info)
 {
   tree oprnd;
@@ -256,6 +257,8 @@ vect_get_and_check_slp_defs (loop_vec_in
   int first_op_idx = 1;
   bool commutative = false;
   bool first_op_cond = false;
+  bool first = stmt_num == 0;
+  bool second = stmt_num == 1;
 
   if (loop_vinfo)
 loop = LOOP_VINFO_LOOP (loop_vinfo);
@@ -326,7 +329,11 @@ again:
    && !STMT_VINFO_LIVE_P (vinfo_for_stmt (def_stmt)))
 {
   pattern = true;
-  if (!first && !oprnd_info->first_pattern)
+  if (!first && !oprnd_info->first_pattern
+ /* Allow different pattern state for the defs of the
+first stmt in reduction chains.  */
+  && (oprnd_info->first_dt != vect_reduction_def
+ || (!second && !oprnd_info->second_pattern)))
   

Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Ramana Radhakrishnan



On 22/05/15 14:40, Jason Merrill wrote:

On 05/22/2015 07:23 AM, Ramana Radhakrishnan wrote:

+  /* Load the guard value only through an atomic acquire load.  */
+  guard = build_atomic_load (guard, MEMMODEL_ACQUIRE);
+
/* Check to see if the GUARD is zero.  */
guard = get_guard_bits (guard);


I wonder if these calls should be reversed, to express that we're only
trying to atomically load a byte (on non-ARM targets)?



I'm not sure about the impact on other non-ARM targets without some more 
investigation.



+  tree orig_src = src;
+  tree t, addr, val;
+  unsigned int size;
+  int fncode;
+
+  size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (src)));
+
+  fncode = BUILT_IN_ATOMIC_LOAD_N + exact_log2 (size) + 1;
+  t = builtin_decl_implicit ((enum built_in_function) fncode);
+
+  addr = build1 (ADDR_EXPR, ptr_type, src);
+  val = build_call_expr (t, 2, addr, mem_model);
+
+  /* First reinterpret the loaded bits in the original type of the load,
+ then convert to the expected result type.  */
+  t = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (src), val);
+  return convert (TREE_TYPE (orig_src), t);


I don't see anything that changes src here.


Sorry 'bout that. Will fix in the next spin depending on comments from 
others.


I take it that you are happy with the patch otherwise ...

regards
Ramana



Jason



[RFC] operand_equal_p with valueization

2015-05-22 Thread Jan Hubicka
Hi,
with aliasing sanity checks I got burnt again with ipa-icf-gimple's
compare_operand doing alias set checks on all types it ever trips across.

I always thought that we do not need two equality testers - operand_equal_p and
compare_operand, and given that it turns out to be non-trivial to fix issues in
compare_operand I decided to go ahead with the plan to unify them.
I think operand_equal_p is the better code base to start from since it is more
tested and knows more special cases.

The basic idea is to add a valueize hook, in a similar way as other folders do,
that allows doing magic inside the recursive comparison, i.e. declaring two
trees equal even if they are not identical (e.g. SSA_NAMEs from different
functions).  I think it should be useful for GVN/VRP/CCP too to improve
folding of conditionals.

After trying out the hook and figuring out that ipa-icf does not have a
global context to hold its tables, I decided that maybe the more C++ way
is to have a comparison class that can be derived and adjusted for other
needs.

The following patch is a first step.  If it is considered sane, I will
continue by moving the code to one place - ipa-icf-gimple or special
ipa-icf-op.c. I will also recover Martin's diagnostics that is useful
to debug why things are considered different.

Also the code should be C++ized, for example it should use true/false
instead 0/1.

I think also the iterative hashing should live together with the operand_equal_p
implementation because these two must match, as I found when merging the two
cases that ipa-icf-gimple gets right and fold-const wrong - OBJ_TYPE_REF,
CONSTRUCTOR and PARM_DECL.

Finally I think we want compare_gimple class that does the same for
gimple and is independent of rest of the ipa-icf that may be better
suitable for stuff like tail merging.

The patch bootstraps/regtests on ppc64-linux, but I think it is not quite
mainline ready as it noticeably reduces the number of equivalences found.
I will need to debug why that happens, but I am sending it as an RFC for
the basic concept and interfaces.

Honza

* fold-const.c (operand_equal_p): Reorg to wrapper for
(operand_compare::operand_equal_p): This function; add
support for valueization, add missing type matching for
OBJ_TYPE_REF, CONSTRUCTOR; relax matching of PARM_DECL.
(operand_compare::operand_equal_valueize): New.
* fold-const.h (operand_equal_p): Update prototype.
(class operand_compare): New class.
* ipa-icf-gimple.c (func_checker::operand_equal_valueize): Break
ipa-icf specific bits out from ...
(func_checker::compare_operand): ... here; remove most of generic
handling and use operand_compare class.
* ipa-icf-gimple.h (operand_compare): New.
* ipa-icf.c (sem_function::equals_private): Arrange CFUN to be up to
date so we operand_equal_p works well for flag_devirtualize.
Index: fold-const.c
===
--- fold-const.c(revision 223500)
+++ fold-const.c(working copy)
@@ -2716,8 +2730,9 @@ combine_comparisons (location_t loc,
are considered the same.  It is used when the caller has other ways
to ensure that global memory is unchanged in between.  */
 
-int
-operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
+bool
+operand_compare::operand_equal_p (const_tree arg0, const_tree arg1,
+ unsigned int flags)
 {
   /* If either is ERROR_MARK, they aren't equal.  */
   if (TREE_CODE (arg0) == ERROR_MARK || TREE_CODE (arg1) == ERROR_MARK
@@ -2868,6 +2883,12 @@ operand_equal_p (const_tree arg0, const_
   if (flags & OEP_ONLY_CONST)
 return 0;
 
+  int val = operand_equal_valueize (arg0, arg1, flags);
+  if (val == 1)
+return 1;
+  if (val == 0)
+return 0;
+
 /* Define macros to test an operand from arg0 and arg1 for equality and a
variant that allows null and views null as being different from any
non-null value.  In the latter case, if either is null, the both
@@ -3104,12 +3174,50 @@ operand_equal_p (const_tree arg0, const_
   DECL_FUNCTION_CODE (arg0) == DECL_FUNCTION_CODE (arg1));
 
 default:
   return 0;
 }
 
 #undef OP_SAME
 #undef OP_SAME_WITH_NULL
 }
+
+/* Valueizer is a virtual method that allows to introduce extra equalities
+   that are not directly visible from the operand.
+   1 means values are known to be equal, 0 means values are known to be
+   different, -1 means that operand_equal_p should continue processing.  */
+int
+operand_compare::operand_equal_valueize (const_tree, const_tree, unsigned int)
+{
+  return -1;
+}
+
+/* Convenience wrapper around operand_compare class because usually we do
+   not need to play with the valueizer.  */
+bool
+operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
+{
+  static operand_compare default_compare_instance;
+  return default_compare_instance.operand_equal_p (arg0, arg1, flags);
+}
 
 /* Similar to operand_equal_p, 

Re: [RFC] operand_equal_p with valueization

2015-05-22 Thread Richard Biener
On Fri, 22 May 2015, Jan Hubicka wrote:

 Hi,
 with aliasing sanity checks I got burnt again with ipa-icf-gimple's
 compare_operand doing alias set checks on all types it ever trips across.
 
 I always tought that we do not need two equality testers - operand_equal_p and
 compare_operand and given that it turns out to be non-trivial to fix issues in
 compare_operand I decided to go ahead with plan to unify them.
 I think operand_equal_p is better code base to start from since it is more
 tested and knows more special cases.
 
 The basic idea is to add valeize hook similar way as other folders do that
 allows to do magic inside the recursive comparsion. I.e. declare two trees
 equal if they are not (i.e. SSA_NAMES from differen functions).  I think it
 should be useful for GVN/VRP/CCP too to improve folding of conditionals.
 
 After trying out the hook and figuring out that ipa-icf does not have
 global context to hold its tables, I dedcided that maybe more C++ way
 is to have comparsion class that can be derived an adjusted for other
 needs.
 
 The following patch is a first step.  If it is considered sane, I will
 continue by moving the code to one place - ipa-icf-gimple or special
 ipa-icf-op.c. I will also recover Martin's diagnostics that is useful
 to debug why things are considered different.
 
 Also the code should be C++ized, for example it should use true/false
 instead 0/1.
 
 I think also the iterative hashing should be together with operand_equal_p
 implementation because these two must match as I found with merging the two
 cases that ipa-icf-gimple gets right and fold-const wrong - OBJ_TYPE_REF,
 CONSTRUCTOR and PARM_DECL.
 
 Finally I think we want compare_gimple class that does the same for
 gimple and is independent of rest of the ipa-icf that may be better
 suitable for stuff like tail merging.
 
 The patch bootstraps/regtests ppc64-linux, but I think it is not quite
 mainline ready as it noticeably reduces number of equivalences found. 
 I will need to debug why that happens, but I am sending it as an RFC for
 the basic concept and interfaces.

I think for the case of IPA ICF it would be better to address the
issue that it cannot do merging of functions with TBAA conflicts.
That is, drop that TBAA code from IPA ICF and arrange for the
IPA inliner to inline original unmerged copies.  We were talking
about making the original nodes inline clones of the node that
eventually prevails, much like the speculative inlining ones
(if I remember that suggestion from Martin correctly).

And no, I'm hesitant to change operand_equal_p too much.  It's
very much deep-rooted into GENERIC.

Richard.


 Honza
 
   * fold-const.c (operand_equal_p): Reorg to wrapper for
   (operand_compare::operand_equal_p): This function; add
   support for valueization, add missing type matching for
   OBJ_TYPE_REF, CONSTRUCTOR; relax matching of PARM_DECL.
   (operand_compare::operand_equal_valueize): New.
   * fold-const.h (operand_equal_p): Update prototype.
   (class operand_compare): New class.
   * ipa-icf-gimple.c (func_checker::operand_equal_valueize): Break
   ipa-icf specific bits out from ...
   (func_checker::compare_operand): ... here; remove most of generic
   handling and use operand_compare class.
   * ipa-icf-gimple.h (operand_compare): New.
   * ipa-icf.c (sem_function::equals_private): Arrange CFUN to be up to
   date so we operand_equal_p works well for flag_devirtualize.
 Index: fold-const.c
 ===
 --- fold-const.c  (revision 223500)
 +++ fold-const.c  (working copy)
 @@ -2716,8 +2730,9 @@ combine_comparisons (location_t loc,
 are considered the same.  It is used when the caller has other ways
 to ensure that global memory is unchanged in between.  */
  
 -int
 -operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
 +bool
 +operand_compare::operand_equal_p (const_tree arg0, const_tree arg1,
 +   unsigned int flags)
  {
/* If either is ERROR_MARK, they aren't equal.  */
if (TREE_CODE (arg0) == ERROR_MARK || TREE_CODE (arg1) == ERROR_MARK
 @@ -2868,6 +2883,12 @@ operand_equal_p (const_tree arg0, const_
 if (flags & OEP_ONLY_CONST)
  return 0;
  
 +  int val = operand_equal_valueize (arg0, arg1, flags);
 +  if (val == 1)
 +return 1;
 +  if (val == 0)
 +return 0;
 +
  /* Define macros to test an operand from arg0 and arg1 for equality and a
 variant that allows null and views null as being different from any
 non-null value.  In the latter case, if either is null, the both
 @@ -3104,12 +3174,50 @@ operand_equal_p (const_tree arg0, const_
  DECL_FUNCTION_CODE (arg0) == DECL_FUNCTION_CODE (arg1));
  
  default:
return 0;
  }
  
  #undef OP_SAME
  #undef OP_SAME_WITH_NULL
  }
 +
 +/* Valueizer is a virtual method that allows to introduce extra equalities
 +   that are not 

Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Jason Merrill

On 05/22/2015 07:23 AM, Ramana Radhakrishnan wrote:

+  /* Load the guard value only through an atomic acquire load.  */
+  guard = build_atomic_load (guard, MEMMODEL_ACQUIRE);
+
/* Check to see if the GUARD is zero.  */
guard = get_guard_bits (guard);


I wonder if these calls should be reversed, to express that we're only 
trying to atomically load a byte (on non-ARM targets)?
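For context, the guard protocol under discussion — an acquire load of the initialization flag on the fast path, so that a thread which sees the flag set also sees the initialized object — can be sketched in portable C++. This is an illustration of the idea, not the code g++ actually emits:

```cpp
#include <atomic>
#include <mutex>

std::atomic<char> guard{0};   // nonzero once the object is initialized
std::mutex guard_mutex;
int value;                    // the "static local" being lazily built

int &get() {
    // Fast path: acquire load, so on weakly ordered targets the
    // initialized object is visible whenever the flag is seen set.
    if (guard.load(std::memory_order_acquire) == 0) {
        std::lock_guard<std::mutex> lk(guard_mutex);
        if (guard.load(std::memory_order_relaxed) == 0) {
            value = 42;                                 // run the initializer
            guard.store(1, std::memory_order_release);  // publish it
        }
    }
    return value;
}
```

The second, relaxed load is safe because it happens under the mutex; only the unlocked fast-path load needs acquire semantics.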



+  tree orig_src = src;
+  tree t, addr, val;
+  unsigned int size;
+  int fncode;
+
+  size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (src)));
+
+  fncode = BUILT_IN_ATOMIC_LOAD_N + exact_log2 (size) + 1;
+  t = builtin_decl_implicit ((enum built_in_function) fncode);
+
+  addr = build1 (ADDR_EXPR, ptr_type, src);
+  val = build_call_expr (t, 2, addr, mem_model);
+
+  /* First reinterpret the loaded bits in the original type of the load,
+ then convert to the expected result type.  */
+  t = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (src), val);
+  return convert (TREE_TYPE (orig_src), t);


I don't see anything that changes src here.

Jason
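The reinterpret-then-convert sequence in the quoted hunk (a VIEW_CONVERT_EXPR followed by an ordinary conversion) has a direct C++ analogue; a sketch assuming 32-bit IEEE-754 float:

```cpp
#include <cstdint>
#include <cstring>

// View-convert: reinterpret the loaded bits in the original type of
// the load (same bits, new type), mirroring VIEW_CONVERT_EXPR.
float bits_as_float(uint32_t raw) {
    float f;
    std::memcpy(&f, &raw, sizeof f);
    return f;
}

// Then an ordinary value conversion to the expected result type,
// mirroring the final convert ().
double load_as_double(uint32_t raw) {
    return static_cast<double>(bits_as_float(raw));
}
```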



Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread David Edelsohn
On Fri, May 22, 2015 at 9:40 AM, Jason Merrill ja...@redhat.com wrote:
 On 05/22/2015 07:23 AM, Ramana Radhakrishnan wrote:

 +  /* Load the guard value only through an atomic acquire load.  */
 +  guard = build_atomic_load (guard, MEMMODEL_ACQUIRE);
 +
 /* Check to see if the GUARD is zero.  */
 guard = get_guard_bits (guard);


 I wonder if these calls should be reversed, to express that we're only
 trying to atomically load a byte (on non-ARM targets)?

That expresses the semantics more directly, but will that lead to less
efficient code on some RISC architectures?

- David


[PATCH] Fix ICE in PR66251

2015-05-22 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-05-22  Richard Biener  rguent...@suse.de

PR tree-optimization/66251
* tree-vect-stmts.c (vectorizable_conversion): Properly
set STMT_VINFO_VEC_STMT even for the SLP case.

* gfortran.fortran-torture/compile/pr66251.f90: New testcase.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 223529)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -3964,14 +3964,12 @@ vectorizable_conversion (gimple stmt, gi
 
  if (slp_node)
SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
+
+ if (!prev_stmt_info)
+   STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
  else
-   {
- if (!prev_stmt_info)
-   STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
- else
-   STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
- prev_stmt_info = vinfo_for_stmt (new_stmt);
-   }
+   STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+ prev_stmt_info = vinfo_for_stmt (new_stmt);
}
}
 
Index: gcc/testsuite/gfortran.fortran-torture/compile/pr66251.f90
===
--- gcc/testsuite/gfortran.fortran-torture/compile/pr66251.f90  (revision 0)
+++ gcc/testsuite/gfortran.fortran-torture/compile/pr66251.f90  (working copy)
@@ -0,0 +1,7 @@
+SUBROUTINE dbcsr_data_convert (n)
+  COMPLEX(KIND=4), DIMENSION(:), POINTER :: s_data_c
+  COMPLEX(KIND=8), DIMENSION(:), POINTER :: t_data_z
+  t_data_z(1:n) = CMPLX(s_data_c(1:n), KIND=8)
+  CALL foo()
+END SUBROUTINE dbcsr_data_convert
+
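The restructured control flow above makes the usual "record the first vector stmt as the head, chain later ones to their predecessor" bookkeeping run in the SLP case too. Reduced to a standalone sketch with hypothetical types, not the vectorizer's:

```cpp
// Head-plus-chain bookkeeping: the first generated stmt is recorded as
// the head (like STMT_VINFO_VEC_STMT), each later one is linked to its
// predecessor (like STMT_VINFO_RELATED_STMT).
struct Stmt {
    Stmt *related = nullptr;
};

struct Info {
    Stmt *vec_stmt = nullptr;   // head of the chain
};

void record(Info &info, Stmt *&prev, Stmt *new_stmt) {
    if (!prev)
        info.vec_stmt = new_stmt;    // first copy: record the head
    else
        prev->related = new_stmt;    // later copies: chain them
    prev = new_stmt;                 // remember predecessor for next call
}
```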


Re: [PATCH][ARM] Handle UNSPEC_VOLATILE in rtx costs and don't recurse inside the unspec

2015-05-22 Thread Ramana Radhakrishnan
On Mon, Apr 20, 2015 at 5:28 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 A pet project of mine is to get to the point where backend rtx costs
 functions won't have
 to handle rtxes that don't match down to any patterns/expanders we have. Or
 at least limit such cases.
 A case dealt with in this patch is QImode PLUS. We don't actually generate
 or handle these anywhere in
 the arm backend *except* in sync.md where, for example,
 atomic_<sync_optab><mode> matches:
 (set (match_operand:QHSD 0 mem_noofs_operand +Ua)
 (unspec_volatile:QHSD
   [(syncop:QHSD (match_dup 0)
  (match_operand:QHSD 1 atomic_op_operand atomic_op_str))
(match_operand:SI 2 const_int_operand)];; model
   VUNSPEC_ATOMIC_OP))

 Here QHSD can contain QImode and HImode while syncop can be PLUS.
 Now immediately during splitting in arm_split_atomic_op we convert that
 QImode PLUS into an SImode one, so we never actually generate any kind of
 QImode add operations
 (how would we? we don't have define_insns for such things) but the RTL
 optimisers will get a hold
 of the UNSPEC_VOLATILE in the meantime and ask for its cost (for example,
 cse when building libatomic).
 Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx
 costs function just recurses
 into the QImode PLUS that I'd like to avoid.
 This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost
 and handling it there
 (very straightforwardly just returning COSTS_N_INSNS (2); there's no
 indication that we want to do anything
 smarter here) and stopping the recursion.
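The shape of the fix — intercept the opaque wrapper in a recursive cost walk and return a flat cost instead of recursing into operands that match no real pattern — reduced to a generic sketch (a toy expression type, not the arm backend API):

```cpp
#include <vector>

// Tiny expression tree: the cost walker refuses to recurse into an
// "opaque" node (analogous to an UNSPEC_VOLATILE) and charges a flat
// two-instruction cost instead, the way the patch returns
// COSTS_N_INSNS (2) without visiting the QImode PLUS inside.
struct Node {
    enum Kind { Plus, Reg, Opaque } kind;
    std::vector<Node> ops;
};

int cost(const Node &n) {
    switch (n.kind) {
    case Node::Opaque:
        return 2;                 // flat cost, no recursion into ops
    case Node::Reg:
        return 0;                 // registers are free
    case Node::Plus: {
        int c = 1;                // one insn for the add itself
        for (const Node &op : n.ops)
            c += cost(op);        // normal recursive walk
        return c;
    }
    }
    return 0;
}
```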

 This is a small step in the direction of not having to care about obviously
 useless rtxes in the backend.
 The astute reader might notice that in sync.md we also have the pattern
 atomic_fetch_<sync_optab><mode>
 which expands to/matches this:
 (set (match_operand:QHSD 0 s_register_operand =r)
 (match_operand:QHSD 1 mem_noofs_operand +Ua))
(set (match_dup 1)
 (unspec_volatile:QHSD
   [(syncop:QHSD (match_dup 1)
  (match_operand:QHSD 2 atomic_op_operand atomic_op_str))
(match_operand:SI 3 const_int_operand)];; model
   VUNSPEC_ATOMIC_OP))


 Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might
 have rtx costs called on it
 as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any
 other normal rtx we generate
 in the arm backend. I'll try to get a patch to handle that case, but I'm
 still thinking on how to best
 do that.

 Tested arm-none-eabi, I didn't see any codegen differences in some compiled
 codebases.

 Ok for trunk?

OK

Ramana



 P.S. I know that expmed creates all kinds of irregular rtxes and asks for
 their costs. I'm hoping to clean that
 up at some point...

 2015-04-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE.
 (arm_unspec_cost): Allow UNSPEC_VOLATILE.  Do not recurse inside
 unknown unspecs.


[Ada] Small enhancement to unchecked conversion warning in -gnatf mode

2015-05-22 Thread Arnaud Charlet
In full warning mode, when an unchecked conversion is applied to types of
different sizes, the compiler issues a warning describing the effects of
the conversion on the additional or missing bits.

When the source is smaller than the target, it was issuing a specific warning
if the target is of a discrete type but, when the source is larger than the
target, it wasn't doing the same if the source is of a discrete type.

The compiler must issue the following warnings with -gnatf:

ucw.ads:12:03: warning: types for unchecked conversion have different sizes
ucw.ads:12:03: warning: size of REC is 16, size of INTEGER is 32
ucw.ads:12:03: warning: target value will include 16 undefined high order bits
ucw.ads:13:03: warning: types for unchecked conversion have different sizes
ucw.ads:13:03: warning: size of INTEGER is 32, size of REC is 16
ucw.ads:13:03: warning: 16 high order bits of source will be ignored

on

with Unchecked_Conversion;

package UCW is

  type Int8_t is mod 2**8;

  type Rec is record
I1 : Int8_t;
I2 : Int8_t;
  end record;

  function C1 is new Unchecked_Conversion (Source => Rec, Target => Integer);
  function C2 is new Unchecked_Conversion (Source => Integer, Target => Rec);

end UCW;
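The bit-level effect the warnings describe can be illustrated outside Ada. A C++ sketch, assuming a little-endian target (on big-endian the dropped half would be the low-order bits, exactly the distinction the patch adds to the message):

```cpp
#include <cstdint>
#include <cstring>

struct Rec {        // 16 bits total, like the Ada Rec of two Int8_t
    uint8_t i1;
    uint8_t i2;
};

// Integer -> Rec: 16 high order bits of the source are ignored
// (on little-endian; the copy keeps only the low-order bytes).
Rec to_rec(int32_t v) {
    Rec r;
    std::memcpy(&r, &v, sizeof r);
    return r;
}

// Rec -> Integer: the 16 extra target bits are undefined unless we
// clear them ourselves first, as the warning about "undefined high
// order bits" points out.
int32_t to_int(Rec r) {
    int32_t v = 0;                 // zero the otherwise-undefined bits
    std::memcpy(&v, &r, sizeof r);
    return v;
}
```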

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Eric Botcazou  ebotca...@adacore.com

* sem_ch13.adb (Validate_Unchecked_Conversions): Also issue
specific warning for discrete types when the source is larger
than the target.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 223534)
+++ sem_ch13.adb(working copy)
@@ -13483,9 +13483,22 @@
 end if;
 
   else pragma Assert (Source_Siz > Target_Siz);
-Error_Msg
-  (\?z?^ trailing bits of source will be ignored!,
-   Eloc);
+if Is_Discrete_Type (Source) then
+   if Bytes_Big_Endian then
+  Error_Msg
+(\?z?^ low order bits of source will be 
+  ignored!, Eloc);
+   else
+  Error_Msg
+(\?z?^ high order bits of source will be 
+  ignored!, Eloc);
+   end if;
+
+else
+   Error_Msg
+ (\?z?^ trailing bits of source will be 
+   ignored!, Eloc);
+end if;
  end if;
   end if;
end if;


[Ada] Make sure Volatile_Full_Access is treated like Atomic

2015-05-22 Thread Arnaud Charlet
This update makes sure that Volatile_Full_Access is treated like
Atomic in all cases except checking specific RM legality rules,
and controlling atomic synchronization.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Robert Dewar  de...@adacore.com

* exp_ch5.adb, layout.adb, einfo.adb, einfo.ads, sem_prag.adb,
freeze.adb, freeze.ads, sem_util.adb, sem_util.ads, exp_ch2.adb,
exp_ch4.adb, errout.adb, exp_aggr.adb, sem_ch13.adb: This is a general
change that deals with the fact that most of the special coding for
Atomic should also apply to the case of Volatile_Full_Access.
A new attribute Is_Atomic_Or_VFA is introduced, and many of the
references to Is_Atomic now use this new attribute.

Index: exp_ch5.adb
===
--- exp_ch5.adb (revision 223476)
+++ exp_ch5.adb (working copy)
@@ -429,11 +429,11 @@
   elsif Has_Controlled_Component (L_Type) then
  Loop_Required := True;
 
-  --  If object is atomic, we cannot tolerate a loop
+  --  If object is atomic/VFA, we cannot tolerate a loop
 
-  elsif Is_Atomic_Object (Act_Lhs)
+  elsif Is_Atomic_Or_VFA_Object (Act_Lhs)
   or else
-Is_Atomic_Object (Act_Rhs)
+Is_Atomic_Or_VFA_Object (Act_Rhs)
   then
  return;
 
@@ -442,8 +442,8 @@
 
   elsif Has_Atomic_Components (L_Type)
 or else Has_Atomic_Components (R_Type)
-or else Is_Atomic (Component_Type (L_Type))
-or else Is_Atomic (Component_Type (R_Type))
+or else Is_Atomic_Or_VFA (Component_Type (L_Type))
+or else Is_Atomic_Or_VFA (Component_Type (R_Type))
   then
  Loop_Required := True;
 
@@ -3395,7 +3395,7 @@
 Next_Elmt (Prim);
  end loop;
 
- --  default iterator must exist.
+ --  Default iterator must exist
 
  pragma Assert (False);
 
Index: layout.adb
===
--- layout.adb  (revision 223476)
+++ layout.adb  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2001-2014, Free Software Foundation, Inc. --
+--  Copyright (C) 2001-2015, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -2684,11 +2684,11 @@
 
  elsif Is_Array_Type (E) then
 
---  For arrays that are required to be atomic, we do the same
+--  For arrays that are required to be atomic/VFA, we do the same
 --  processing as described above for short records, since we
 --  really need to have the alignment set for the whole array.
 
-if Is_Atomic (E) and then not Debug_Flag_Q then
+if Is_Atomic_Or_VFA (E) and then not Debug_Flag_Q then
Set_Composite_Alignment (E);
 end if;
 
@@ -2903,11 +2903,19 @@
 and then Is_Record_Type (E)
 and then Is_Packed (E)
   then
- --  No effect for record with atomic components
+ --  No effect for record with atomic/VFA components
 
- if Is_Atomic (E) then
+ if Is_Atomic_Or_VFA (E) then
 Error_Msg_N (Optimize_Alignment has no effect for ??, E);
-Error_Msg_N (\pragma ignored for atomic record??, E);
+
+if Is_Atomic (E) then
+   Error_Msg_N
+ (\pragma ignored for atomic record??, E);
+else
+   Error_Msg_N
+ (\pragma ignored for volatile full access record??, E);
+end if;
+
 return;
  end if;
 
@@ -2920,20 +2928,30 @@
 return;
  end if;
 
- --  No effect if any component is atomic or is a by reference type
+ --  No effect if any component is atomic/VFA or is a by reference type
 
  declare
 Ent : Entity_Id;
+
  begin
 Ent := First_Component_Or_Discriminant (E);
 while Present (Ent) loop
if Is_By_Reference_Type (Etype (Ent))
- or else Is_Atomic (Etype (Ent))
- or else Is_Atomic (Ent)
+ or else Is_Atomic_Or_VFA (Etype (Ent))
+ or else Is_Atomic_Or_VFA (Ent)
then
   Error_Msg_N (Optimize_Alignment has no effect for ??, E);
-  Error_Msg_N
-(\pragma is ignored if atomic components present??, E);
+
+ 

Re: Add few cases to operand_equal_p

2015-05-22 Thread Jan Hubicka
  +   case OBJ_TYPE_REF:
  + {
  +   if (!operand_equal_p (OBJ_TYPE_REF_EXPR (arg0),
  + OBJ_TYPE_REF_EXPR (arg1), flags))
  + return false;
  +   if (flag_devirtualize && virtual_method_call_p (arg0))
  + {
  +   if (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
  +   != tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1)))
  + return false;
  +   if (!types_same_for_odr (obj_type_ref_class (arg0),
  +obj_type_ref_class (arg1)))
  + return false;
 
 devirt machinery in operand_equal_p?  please not.  not more places!

OBJ_TYPE_REF explicitly says what is the type of class whose virtual method is
called. It is GIMPLE operand expression like other, so I think we need to handle
it.

Actually I think I can just drop the flag_devirtualize check because with
!flag_devirtualize we should simply avoid having OBJ_TYPE_REF around.  That
would make it looking less devirt specific.  We can also test types for
equivalence of main variant, but that would just introduce false positives when
LTO did not merged two identical ODR types.  It would be correct though.
 
  +   /* OBJ_TYPE_REF_OBJECT is used to track the instance of
  +  object THIS pointer points to.  Checking that both
  +  addresses are equal is a safe approximation of the fact
  +  that dynamic types are equal.
  +  Do not care about the other flags, because this expression
  +  does not contribute to the actual value of OBJ_TYPE_REF.  */
  +   if (!operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
  + OBJ_TYPE_REF_OBJECT (arg1),
  + OEP_ADDRESS_OF))
  + return false;
  + }
  +
  +   return true;
  + }
  +
  default:
return 0;
  }
  @@ -3097,6 +3152,21 @@ operand_equal_p (const_tree arg0, const_
  }
   
   case tcc_declaration:
  +  /* At LTO time the FIELD_DECL may exist in multiple copies.
  +We only care about offsets and bit offsets for operands.  */
 
 Err?  Even at LTO time FIELD_DECLs should only appear once.
 So - testcase?

FIELD_DECL has TREE_TYPE and TREE_TYPE may not get merged by LTO.  So if the
two expressions originate from two different units, we may have two
semantically equivalent FIELD_DECLs (of compatible types and same offsets) that
occupy different memory locations because their merging was prevented by
something upstream (like complete wrt incomplete pointer in the type)
  +  /* In GIMPLE empty constructors are allowed in initializers of
  +vector types.  */
 
 Why this comment about GIMPLE?  This is about comparing GENERIC
 trees which of course can have CONSTRUCTORs of various sorts.
 
  +  if (TREE_CODE (arg0) == CONSTRUCTOR)
  +   {
  + unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (arg0));
  + unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (arg1));
  +
  + if (length1 != length2)
  +   return false;
  +
  + flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
  +
  + for (unsigned i = 0; i < length1; i++)
  +   if (!operand_equal_p (CONSTRUCTOR_ELT (arg0, i)->value,
  + CONSTRUCTOR_ELT (arg1, i)->value, flags))
 
 You need to compare ->index as well here.  I'm not sure constructor
 values are sorted always so you might get very many false negatives.

many false negatives is better than all false negatives :)

You are right, either I should punt on non-empty constructors or compare
the indexes, will do the second.

  +case FIELD_DECL:
  +  /* At LTO time the FIELD_DECL may exist in multiple copies.
  +We only care about offsets and bit offsets for operands.  */
 
 So explain that this is the reason we don't want to hash by
 DECL_UID.  I still think this is bogus.

Will do if we agree on having this.

I know you would like ipa-icf to keep original bodies and use them for inlining
declaring alias sets to be function local.  This is the wrong plan.  Consider:

void t(int *ptr)
{
  *ptr=1;
}

int a(int *ptr1, int *ptr2)
{
  int a = *ptr1;
  t(ptr2);
  return a+*ptr1;
}

long b(long *ptr1, int *ptr2)
{
  int a = *ptr1;
  t(ptr2);
  return a+*ptr1;
}

here aliasing leads to the two options to be optimizer differently:
a:
.LFB1:  
.cfi_startproc
movl4(%esp), %edx
movl8(%esp), %ecx
movl(%edx), %eax
movl$1, (%ecx)
addl(%edx), %eax
ret
.cfi_endproc
b:
.LFB2:  
.cfi_startproc
movl4(%esp), %eax
movl8(%esp), %edx
movl(%eax), %eax
movl$1, (%edx)
addl%eax, %eax
ret
.cfi_endproc

however with -fno-early-inlining the functions look identical (modulo alias
sets) at ipa-icf time.  If we merged a/b, we could get wrong code for a
even though no inlining of a or b happens.
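The asymmetry above comes from type-based alias analysis: in `b`, the store through an `int *` cannot legally modify the `long` object, so the first load may be reused. The same pair, tidied into compilable C++ (assumes strict aliasing semantics, i.e. no -fno-strict-aliasing):

```cpp
// Same shape as the mailing-list example: t() stores 1 through an int*.
static void t(int *ptr) { *ptr = 1; }

// Both pointers are int*: the store may alias *ptr1, so *ptr1 must be
// re-read after the call (the reload visible in the asm for "a").
int a(int *ptr1, int *ptr2) {
    int v = *ptr1;
    t(ptr2);
    return v + *ptr1;
}

// ptr1 is long*: under strict aliasing the int store cannot touch the
// long object, so the compiler may fold the result to v + v (the
// "addl %eax, %eax" in the asm for "b").
long b(long *ptr1, int *ptr2) {
    long v = *ptr1;
    t(ptr2);
    return v + *ptr1;
}
```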

So either we match the alias sets 

Re: [RFC] operand_equal_p with valueization

2015-05-22 Thread Jan Hubicka
 
 And no, I'm hesitant to change operand_equal_p too much.  It's
 very much deep-rooted into GENERIC.

OK, as another option, I can bring relevant logic from operand_equal_p
to ipa-icf and separate it into the compare_operand class like I did.
Use it in ipa-icf-gimple now and we can slowly turn other uses of
operand_equal into the compare_operand users in middle end.

I agree that operand_equal is a bit crazy code and it does not handle quite a
few things we could do at gimple.  I have nothing against going this direction.
(after all I do not like touching fold-const much because it works on generic,
gimple and FE non-generic and it is not well specified what it should do)

Honza


Re: [patch 1/10] debug-early merge: Ada front-end

2015-05-22 Thread Aldy Hernandez

On 05/22/2015 04:31 AM, Eric Botcazou wrote:

My apologies for the delay on Ada.  I have reworked the patch to leave
the first pass on the TYPE_DECLs which are definitely needed.  I also
optimized things a bit, since we don't need to save all the globals any
more.


Thanks, this looks fine modulo a couple of nits, see below.


There is one regression which I'd like you and Jason's input before
proceeding:

+FAIL: gnat.dg/specs/debug1.ads scan-assembler-times DW_AT_artificial 17

The problem is that the Ada front-end twiddles the DECL_ARTIFICIAL flag
*after* it has called debug_hooks->early_global_decl().  The function
gnat_to_gnu_entity() calls create_var_decl_1->rest_of_decl_compilation,
but then much later twiddles DECL_ARTIFICIAL:

if (!Comes_From_Source (gnat_entity))
DECL_ARTIFICIAL (gnu_decl) = 1;

Twiddling DECL_ARTIFICIAL after we create early dwarf, means the DIE
does not get the DW_AT_artificial.  Would it be possible for you guys
(ahem, Ada folk) to set DECL_ARTIFICIAL before calling
rest_of_decl_compilation?


Sure, just add a ??? comment before the above lines for now.


Done.




@@ -535,8 +535,7 @@ extern tree gnat_type_for_size (unsigned precision, int
unsignedp);
 an unsigned type; otherwise a signed type is returned.  */
  extern tree gnat_type_for_mode (machine_mode mode, int unsignedp);

-/* Emit debug info for all global variable declarations.  */
-extern void gnat_write_global_declarations (void);
+extern void note_types_used_by_globals (void);

The comment needs to be adjusted instead of removed and preferably be the same
as the one (adjusted too) above the function in utils.c.


Done.

I have queued the attached patch for merge into mainline once the rest 
of the patches are approved.


Thank you.

gcc/ada/

	* gcc-interface/gigi.h (note_types_used_by_globals): Rename from
	gnat_write_global_declarations.
	* gcc-interface/misc.c (gnat_parse_file): Call
	note_types_used_by_globals.
	Remove LANG_HOOKS_WRITE_GLOBALS.
	* gcc-interface/utils.c: Rename global_decls to type_decls.
	(gnat_write_global_declarations): Rename
	to note_types_used_by_globals.
	Remove call to finalize_compilation_unit.
	Remove debug_hooks->global_decl() call for globals.
	(gnat_pushdecls): Only insert into type_decls if TYPE_DECL.

diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c
index d908a1b..e778616 100644
--- a/gcc/ada/gcc-interface/decl.c
+++ b/gcc/ada/gcc-interface/decl.c
@@ -5241,6 +5241,13 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, int definition)
!Present (Alias (gnat_entity))
!(Present (Renamed_Object (gnat_entity))  saved))
 {
+  /* ??? DECL_ARTIFICIAL, and possibly DECL_IGNORED_P below, should
+	 be set before calling rest_of_decl_compilation above (through
+	 create_var_decl_1).  This is because rest_of_decl_compilation
+	 calls the debugging backend and will create a DIE without
+	 DW_AT_artificial.
+
+	 This is currently causing gnat.dg/specs/debug1.ads to FAIL.  */
   if (!Comes_From_Source (gnat_entity))
 	DECL_ARTIFICIAL (gnu_decl) = 1;
 
diff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
index 6d65fc5..ace4ac8 100644
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -535,8 +535,9 @@ extern tree gnat_type_for_size (unsigned precision, int unsignedp);
an unsigned type; otherwise a signed type is returned.  */
 extern tree gnat_type_for_mode (machine_mode mode, int unsignedp);
 
-/* Emit debug info for all global variable declarations.  */
-extern void gnat_write_global_declarations (void);
+/* Keep track of types used at the global level and emit debug info
+   for all global types.  */
+extern void note_types_used_by_globals (void);
 
 /* Return the unsigned version of a TYPE_NODE, a scalar type.  */
 extern tree gnat_unsigned_type (tree type_node);
diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index edaab97..2056a61 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -122,6 +122,8 @@ gnat_parse_file (void)
 
   /* Call the front end.  */
   _ada_gnat1drv ();
+
+  note_types_used_by_globals ();
 }
 
 /* Return language mask for option processing.  */
@@ -972,8 +974,6 @@ gnat_init_ts (void)
 #define LANG_HOOKS_GETDECLS		lhd_return_null_tree_v
 #undef  LANG_HOOKS_PUSHDECL
 #define LANG_HOOKS_PUSHDECL		gnat_return_tree
-#undef  LANG_HOOKS_WRITE_GLOBALS
-#define LANG_HOOKS_WRITE_GLOBALS	gnat_write_global_declarations
 #undef  LANG_HOOKS_GET_ALIAS_SET
 #define LANG_HOOKS_GET_ALIAS_SET	gnat_get_alias_set
 #undef  LANG_HOOKS_PRINT_DECL
diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c
index 5968857..915cbe1 100644
--- a/gcc/ada/gcc-interface/utils.c
+++ b/gcc/ada/gcc-interface/utils.c
@@ -227,8 +227,8 @@ static GTY((deletable)) struct gnat_binding_level *free_binding_level;
 /* The context to be used for global declarations.  */
 static GTY(()) tree global_context;
 
-/* An array of global 

Re: Do not compute alias sets for types that don't need them

2015-05-22 Thread Richard Biener
On Fri, 22 May 2015, Jan Hubicka wrote:

 Hi,
 this patch fixes a few cases where we compute an alias set we don't need;
 these are found by adding a type_with_alias_set_p check to alias.c (I will send
 that patch separately as there is still one ICE caught by it, I believe
 originating from ipa-icf-gimple; I have a more involved fix for that)
 
 The point of all this is to check that we don't make bogus queries to the
 alias oracle and don't try to use them somehow - we did in ipa-icf-gimple.
 This is a bug, as the alias oracle gives stupid answers to stupid questions
 and sometimes thinks the types conflict and sometimes thinks they don't.
 The other three cases are sort of just pointless computations.
 
 The next part I would like to break out is the path computing equivalences
 considering incomplete types (i.e. where all structures are equivalent, and
 so are all unions, or all functions of a given return value)
 
 Bootstrapped/regtested x86_64-linux, OK?
 
 Honza
 
   * tree-streamer-out.c (pack_ts_type_common_value_fields): Only consider
   alias set for types that have them.
   * lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise.
   * emit-rtl.c (set_mem_attributes_minus_bitpos): Likewise.
   * ipa-icf-gimple.c (func_checker::compatible_types_p): Likewise.
 Index: tree-streamer-out.c
 ===
 --- tree-streamer-out.c   (revision 223508)
 +++ tree-streamer-out.c   (working copy)
 @@ -346,6 +346,7 @@ pack_ts_type_common_value_fields (struct
   alias-set zero to this type.  */
bp_pack_var_len_int (bp, (TYPE_ALIAS_SET (expr) == 0
   || (!in_lto_p
 +  && type_with_alias_set_p (expr)
    && get_alias_set (expr) == 0)) ? 0 : -1);
  }
  
 Index: lto-streamer-out.c
 ===
 --- lto-streamer-out.c(revision 223508)
 +++ lto-streamer-out.c(working copy)
 @@ -1146,6 +1146,7 @@ hash_tree (struct streamer_tree_cache_d
hstate.add_int (TYPE_ALIGN (t));
hstate.add_int ((TYPE_ALIAS_SET (t) == 0
|| (!in_lto_p
 +   && type_with_alias_set_p (t)
  && get_alias_set (t) == 0))
   ? 0 : -1);
  }
 Index: emit-rtl.c
 ===
 --- emit-rtl.c(revision 223508)
 +++ emit-rtl.c(working copy)
 @@ -1787,8 +1787,15 @@ set_mem_attributes_minus_bitpos (rtx ref
memset (attrs, 0, sizeof (attrs));
  
/* Get the alias set from the expression or type (perhaps using a
 - front-end routine) and use it.  */
 -  attrs.alias = get_alias_set (t);
 + front-end routine) and use it.
 +
 + We may be called to produce MEM RTX for variable of incomplete type.
 + This MEM RTX will only be used to produce the address of a variable, so
 + we do not need to compute alias set.  */

But why do we set mem_attributes for it then?

 +  if (!DECL_P (t) || type_with_alias_set_p (TYPE_MAIN_VARIANT (TREE_TYPE 
 (t))))
 +attrs.alias = get_alias_set (t);
 +  else
 +gcc_assert (DECL_P (t) && TREE_ADDRESSABLE (t));

This assert also looks fishy to me...  (also just use 'type' instead
of TREE_TYPE (t), not sure why you need to pun to the main variant
here either - get_alias_set will do that for you and so should
type_with_alias_set_p if that is necessary).

The rest of the patch looks ok.

Thanks,
Richard.

MEM_VOLATILE_P (ref) |= TYPE_VOLATILE (type);
MEM_POINTER (ref) = POINTER_TYPE_P (type);
 Index: ipa-icf-gimple.c
 ===
 --- ipa-icf-gimple.c  (revision 223508)
 +++ ipa-icf-gimple.c  (working copy)
 @@ -274,6 +274,12 @@ func_checker::compatible_types_p (tree t
if (!types_compatible_p (t1, t2))
  return return_false_with_msg (types are not compatible);
  
 +  /* FIXME: type compatibility checks for types that are never stored
 + to memory is not very useful.  We should update the code to avoid
 + calling compatible_types_p on types that will never be used.  */
 +  if (!type_with_alias_set_p (t1) || !type_with_alias_set_p (t2))
 +return true;
 +
if (get_alias_set (t1) != get_alias_set (t2))
  return return_false_with_msg (alias sets are different);
  
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)


Do not compute alias sets for types that don't need them

2015-05-22 Thread Jan Hubicka
Hi,
this patch fixes a few cases where we compute an alias set we don't need;
these are found by adding a type_with_alias_set_p check to alias.c (I will send
that patch separately as there is still one ICE caught by it, I believe
originating from ipa-icf-gimple; I have a more involved fix for that)

The point of all this is to check that we don't make bogus queries to the
alias oracle and don't try to use them somehow - we did in ipa-icf-gimple.
This is a bug, as the alias oracle gives stupid answers to stupid questions
and sometimes thinks the types conflict and sometimes thinks they don't.
The other three cases are sort of just pointless computations.

The next part I would like to break out is the path computing equivalences
considering incomplete types (i.e. where all structures are equivalent, and
so are all unions, or all functions of a given return value)

Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree-streamer-out.c (pack_ts_type_common_value_fields): Only consider
alias set for types that have them.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise.
* emit-rtl.c (set_mem_attributes_minus_bitpos): Likewise.
* ipa-icf-gimple.c (func_checker::compatible_types_p): Likewise.
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 223508)
+++ tree-streamer-out.c (working copy)
@@ -346,6 +346,7 @@ pack_ts_type_common_value_fields (struct
  alias-set zero to this type.  */
   bp_pack_var_len_int (bp, (TYPE_ALIAS_SET (expr) == 0
|| (!in_lto_p
+ && type_with_alias_set_p (expr)
 && get_alias_set (expr) == 0)) ? 0 : -1);
 }
 
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 223508)
+++ lto-streamer-out.c  (working copy)
@@ -1146,6 +1146,7 @@ hash_tree (struct streamer_tree_cache_d
   hstate.add_int (TYPE_ALIGN (t));
   hstate.add_int ((TYPE_ALIAS_SET (t) == 0
 || (!in_lto_p
+ && type_with_alias_set_p (t)
  && get_alias_set (t) == 0))
? 0 : -1);
 }
Index: emit-rtl.c
===
--- emit-rtl.c  (revision 223508)
+++ emit-rtl.c  (working copy)
@@ -1787,8 +1787,15 @@ set_mem_attributes_minus_bitpos (rtx ref
   memset (attrs, 0, sizeof (attrs));
 
   /* Get the alias set from the expression or type (perhaps using a
- front-end routine) and use it.  */
-  attrs.alias = get_alias_set (t);
+ front-end routine) and use it.
+
+ We may be called to produce MEM RTX for variable of incomplete type.
+ This MEM RTX will only be used to produce the address of a variable, so
+ we do not need to compute alias set.  */
+  if (!DECL_P (t) || type_with_alias_set_p (TYPE_MAIN_VARIANT (TREE_TYPE (t))))
+attrs.alias = get_alias_set (t);
+  else
+gcc_assert (DECL_P (t) && TREE_ADDRESSABLE (t));
 
   MEM_VOLATILE_P (ref) |= TYPE_VOLATILE (type);
   MEM_POINTER (ref) = POINTER_TYPE_P (type);
Index: ipa-icf-gimple.c
===
--- ipa-icf-gimple.c(revision 223508)
+++ ipa-icf-gimple.c(working copy)
@@ -274,6 +274,12 @@ func_checker::compatible_types_p (tree t
   if (!types_compatible_p (t1, t2))
 return return_false_with_msg (types are not compatible);
 
+  /* FIXME: type compatibility checks for types that are never stored
+ to memory is not very useful.  We should update the code to avoid
+ calling compatible_types_p on types that will never be used.  */
+  if (!type_with_alias_set_p (t1) || !type_with_alias_set_p (t2))
+return true;
+
   if (get_alias_set (t1) != get_alias_set (t2))
 return return_false_with_msg (alias sets are different);
 


[Ada] Duplicate symbol xxxAM due to anonymous access allocation

2015-05-22 Thread Arnaud Charlet
This patch reimplements the generation of anonymous finalization masters used
in servicing anonymous access-to-controlled type allocations. The modification
prevents the generation of a duplicate anonymous master in certain cases.


-- Source --


--  gen_pack.ads

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

generic
package Gen_Pack is
   type Rec is tagged record
  Comp : Unbounded_String;
   end record;

   procedure Force_Body;
end Gen_Pack;

--  gen_pack.adb

package body Gen_Pack is
   Ptr : access Rec := new Rec;

   procedure Force_Body is begin null; end Force_Body;
end Gen_Pack;

--  pack.ads

with Gen_Pack;

package Pack is
   procedure Force_Body;

   package Inst_1 is new Gen_Pack;
end Pack;

--  pack.adb

package body Pack is
   procedure Force_Body is begin null; end Force_Body;

   package Inst_2 is new Gen_Pack;
end Pack;

-
-- Compilation --
-

$ gcc -c pack.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Hristian Kirtchev  kirtc...@adacore.com

* einfo.adb Node36 is now used as Anonymous_Master. Flag253
is now unused.
(Anonymous_Master): New routine.
(Has_Anonymous_Master): Removed.
(Set_Anonymous_Master): New routine.
(Set_Has_Anonymous_Master): Removed.
(Write_Entity_Flags): Remove the output for Has_Anonymous_Master.
(Write_Field36_Name): Add output for Anonymous_Master.
* einfo.ads Add new attribute Anonymous_Master along with
occurrences in nodes. Remove attribute Has_Anonymous_Master along
with occurrences in nodes.
(Anonymous_Master): New routine along with pragma Inline.
(Has_Anonymous_Master): Removed along with pragma Inline.
(Set_Anonymous_Master): New routine along with pragma Inline.
(Set_Has_Anonymous_Master): Removed along with pragma Inline.
* exp_ch4.adb (Create_Anonymous_Master): New routine.
(Current_Anonymous_Master): Reimplemented.

Index: einfo.adb
===
--- einfo.adb   (revision 223546)
+++ einfo.adb   (working copy)
@@ -264,7 +264,8 @@
 
    --    Import_Pragma                   Node35
 
-   --    (unused)                        Node36
+   --    Anonymous_Master                Node36
+
    --    (unused)                        Node38
    --    (unused)                        Node39
    --    (unused)                        Node40
@@ -556,7 +557,6 @@
 
--Has_Implicit_DereferenceFlag251
--Is_Processed_Transient  Flag252
-   --Has_Anonymous_MasterFlag253
--Is_Implementation_Defined   Flag254
--Is_Predicate_Function   Flag255
--Is_Predicate_Function_M Flag256
@@ -594,6 +594,7 @@
--Has_Volatile_Full_AccessFlag285
--Needs_Typedef   Flag286
 
+   --(unused)Flag253
--(unused)Flag287
--(unused)Flag288
--(unused)Flag289
@@ -753,6 +754,12 @@
   return Uint14 (Id);
end Alignment;
 
+   function Anonymous_Master (Id : E) return E is
+   begin
+  pragma Assert (Ekind_In (Id, E_Function, E_Package, E_Procedure));
+  return Node36 (Id);
+   end Anonymous_Master;
+
function Associated_Entity (Id : E) return E is
begin
   return Node37 (Id);
@@ -1375,13 +1382,6 @@
   return Flag79 (Id);
end Has_All_Calls_Remote;
 
-   function Has_Anonymous_Master (Id : E) return B is
-   begin
-  pragma Assert
-(Ekind_In (Id, E_Function, E_Package, E_Package_Body, E_Procedure));
-  return Flag253 (Id);
-   end Has_Anonymous_Master;
-
function Has_Atomic_Components (Id : E) return B is
begin
   return Flag86 (Implementation_Base_Type (Id));
@@ -3576,6 +3576,12 @@
   Set_Elist16 (Id, V);
end Set_Access_Disp_Table;
 
+   procedure Set_Anonymous_Master (Id : E; V : E) is
+   begin
+  pragma Assert (Ekind_In (Id, E_Function, E_Package, E_Procedure));
+  Set_Node36 (Id, V);
+   end Set_Anonymous_Master;
+
procedure Set_Associated_Entity (Id : E; V : E) is
begin
   Set_Node37 (Id, V);
@@ -4246,13 +4252,6 @@
   Set_Flag79 (Id, V);
end Set_Has_All_Calls_Remote;
 
-   procedure Set_Has_Anonymous_Master (Id : E; V : B := True) is
-   begin
-  pragma Assert
-(Ekind_In (Id, E_Function, E_Package, E_Package_Body, E_Procedure));
-  Set_Flag253 (Id, V);
-   end Set_Has_Anonymous_Master;
-
procedure Set_Has_Atomic_Components (Id : E; V : B := True) is
begin
   pragma Assert (not Is_Type (Id) or else Is_Base_Type (Id));
@@ -8634,7 +8633,6 @@
   W (Has_Aliased_Components,  Flag135 (Id));
   W (Has_Alignment_Clause,Flag46  (Id));
   W (Has_All_Calls_Remote,Flag79  (Id));
-  W (Has_Anonymous_Master,

[Ada] Size should be zero for null range discrete subtype

2015-05-22 Thread Arnaud Charlet
The 'Size of a discrete subtype with a null range should be zero. The
following test:

 1. with Ada.Text_IO; use Ada.Text_IO;
 2. procedure Static_Null_Range_Size is
 3.subtype Static_Null_Range is
 4.  Integer range 5 .. 0;
 5.Dummy : Static_Null_Range;
 6. begin
 7.Put_Line ("Static_Null_Range:");
 8.Put_Line ("Type'Size is " &
 9.Natural'Image
10.(Static_Null_Range'Size));
11.Put_Line ("  Object'Size is " &
12.Natural'Image (Dummy'Size));
13.if Static_Null_Range'Size /= 0 then
14.   raise Program_Error;
15.end if;
16. end Static_Null_Range_Size;

should generate the output:

Static_Null_Range:
Type'Size is  0
  Object'Size is  32

and not raise an exception.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Robert Dewar  de...@adacore.com

* sem_ch13.adb (Minimum_Size): Size is zero for null range
discrete subtype.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 223555)
+++ sem_ch13.adb(working copy)
@@ -11718,11 +11718,20 @@
  Lo := Uint_0;
   end if;
 
+  --  Null range case, size is always zero. We only do this in the discrete
+  --  type case, since that's the odd case that came up. Probably we should
+  --  also do this in the fixed-point case, but doing so causes peculiar
+  --  gigi failures, and it is not worth worrying about this incredibly
+  --  marginal case (explicit null-range fixed-point type declarations)???
+
+  if Lo > Hi and then Is_Discrete_Type (T) then
+ S := 0;
+
   --  Signed case. Note that we consider types like range 1 .. -1 to be
   --  signed for the purpose of computing the size, since the bounds have
   --  to be accommodated in the base type.
 
-  if Lo < 0 or else Hi < 0 then
+  elsif Lo < 0 or else Hi < 0 then
  S := 1;
  B := Uint_1;
 


[Ada] Internal crash on package instantiation compilation unit

2015-05-22 Thread Arnaud Charlet
This patch updates the implementation of anonymous masters that support
finalization actions of anonymous access-to-controlled type allocations
to handle package instantiations that act as a compilation unit.


-- Source --


--  q.ads

package Q is
  type Obj_T is tagged null record;

  type T is access all Obj_T'Class;
end Q;

--  r.ads

with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Q;

package R is
   type Obj_T is new Q.Obj_T with record
  S : Unbounded_String;
   end record;

   function Create return Obj_T;
end R;

--  h.ads

with Q;
with R;

generic
  type T is private;

package H is
  type Obj_T is new R.Obj_T with null record;

  function Func return Q.T;
end H;

--  h.adb

package body H is
   Temp : constant Obj_T := (R.Create with null record);
   Temp_Ptr : constant access Obj_T := new Obj_T'(Temp);
   Data : constant Q.T := Q.T (Temp_Ptr);

   function Func return Q.T is begin return Data; end;
end H;

--  g.ads

with H;
with Q;

generic
  type T is private;

package G is
  package My_H is new H (Boolean);

  type Obj_T is new My_H.Obj_T with null record;

  function Func return Q.T;
end G;

--  g.adb

package body G is
  Data : constant Q.T :=
   new Obj_T'(My_H.Obj_T (My_H.Func.all) with null record);

  function Func return Q.T is begin return Data; end;
end G;

--  p.ads

with G;

package P is new G (Boolean);

-
-- Compilation --
-

$ gcc -c p.ads

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Hristian Kirtchev  kirtc...@adacore.com

* einfo.adb (Anonymous_Master): This attribute now applies
to package and subprogram bodies.
(Set_Anonymous_Master): This attribute now applies to package and
subprogram bodies.
(Write_Field36_Name): Add output for package and subprogram bodies.
* einfo.ads Update the documentation on attribute Anonymous_Master
along with occurrences in entities.
* exp_ch4.adb (Create_Anonymous_Master): Reimplemented to
handle spec and body anonymous masters of the same unit.
(Current_Anonymous_Master): Reimplemented. Handle a
package instantiation that acts as a compilation unit.
(Insert_And_Analyze): Reimplemented.

Index: einfo.adb
===
--- einfo.adb   (revision 223553)
+++ einfo.adb   (working copy)
@@ -757,7 +757,11 @@
 
function Anonymous_Master (Id : E) return E is
begin
-  pragma Assert (Ekind_In (Id, E_Function, E_Package, E_Procedure));
+  pragma Assert (Ekind_In (Id, E_Function,
+   E_Package,
+   E_Package_Body,
+   E_Procedure,
+   E_Subprogram_Body));
   return Node36 (Id);
end Anonymous_Master;
 
@@ -3586,7 +3590,11 @@
 
procedure Set_Anonymous_Master (Id : E; V : E) is
begin
-  pragma Assert (Ekind_In (Id, E_Function, E_Package, E_Procedure));
+  pragma Assert (Ekind_In (Id, E_Function,
+   E_Package,
+   E_Package_Body,
+   E_Procedure,
+   E_Subprogram_Body));
   Set_Node36 (Id, V);
end Set_Anonymous_Master;
 
@@ -10141,7 +10149,9 @@
  when E_Function   |
   E_Operator   |
   E_Package|
-  E_Procedure  =>
+  E_Package_Body   |
+  E_Procedure  |
+  E_Subprogram_Body=>
 Write_Str ("Anonymous_Master");
 
  when others   =>
Index: einfo.ads
===
--- einfo.ads   (revision 223553)
+++ einfo.ads   (working copy)
@@ -437,10 +437,10 @@
 --   into an attribute definition clause for this purpose.
 
 --Anonymous_Master (Node36)
---   Defined in the entities of non-generic subprogram and package units.
---   Contains the entity of a special heterogeneous finalization master
---   that services most anonymous access-to-controlled allocations that
---   occur within the unit.
+--   Defined in the entities of non-generic packages, subprograms and their
+--   corresponding bodies. Contains the entity of a special heterogeneous
+--   finalization master that services most anonymous access-to-controlled
+--   allocations that occur within the unit.
 
 --Associated_Entity (Node37)
 --   Defined in all entities. This field is similar to Associated_Node, but
@@ -6096,6 +6096,7 @@
--SPARK_Pragma(Node32)
--SPARK_Aux_Pragma   

Re: Add few cases to operand_equal_p

2015-05-22 Thread Richard Biener
On Fri, 22 May 2015, Jan Hubicka wrote:

 Hi,
 I am working on a patch that makes operand_equal_p replace the logic from
 ipa-icf-gimple's compare_op via a valueizer hook.  Currently the patch however
 cuts the number of merges on firefox in half (apparently because it gives up
 on some tree codes too early).
 The patch below merges code from ipa-icf-gimple.c that is missing in
 fold-const and is needed. 
 
 Bootstrapped/regtested x86_64-linux, OK?

No, I don't like this.

 Honza
 
   * fold-const.c (operand_equal_p): Handle OBJ_TYPE_REF, CONSTRUCTOR
   and be more tolerant about FIELD_DECL.
   * tree.c (add_expr): Handle FIELD_DECL.
   (prototype_p, virtual_method_call_p, obj_type_ref_class): Constify.
   * tree.h (prototype_p, virtual_method_call_p, obj_type_ref_class):
   Constify.
 Index: fold-const.c
 ===
 --- fold-const.c  (revision 223500)
 @@ -3037,6 +3058,40 @@ operand_equal_p (const_tree arg0, const_
   case DOT_PROD_EXPR:
 return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
  
 + /* OBJ_TYPE_REF really is just a transparent wrapper around expression,
 +but it holds additional type information for devirtualization that
  +needs to be matched.  We may want to introduce an OEP flag if we want
 +to compare the actual value only, or if we also care about effects
 +of potentially merging the code.  This flag can bypass this check
 +as well as the alias set matching in MEM_REF.  */
 + case OBJ_TYPE_REF:
 +   {
 + if (!operand_equal_p (OBJ_TYPE_REF_EXPR (arg0),
 +   OBJ_TYPE_REF_EXPR (arg1), flags))
 +   return false;
  + if (flag_devirtualize && virtual_method_call_p (arg0))
 +   {
 + if (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
 + != tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1)))
 +   return false;
 + if (!types_same_for_odr (obj_type_ref_class (arg0),
 +  obj_type_ref_class (arg1)))
 +   return false;

devirt machinery in operand_equal_p?  please not.  not more places!

  + /* OBJ_TYPE_REF_OBJECT is used to track the instance the
  +object's THIS pointer points to.  Checking that both
  +addresses are equal is a safe approximation of the fact
  +that the dynamic types are equal.
  +Do not care about the other flags, because this expression
  +does not contribute to the actual value of OBJ_TYPE_REF.  */
 + if (!operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
 +   OBJ_TYPE_REF_OBJECT (arg1),
 +   OEP_ADDRESS_OF))
 +   return false;
 +   }
 +
 + return true;
 +   }
 +
   default:
 return 0;
   }
 @@ -3097,6 +3152,21 @@ operand_equal_p (const_tree arg0, const_
   }
  
  case tcc_declaration:
 +  /* At LTO time the FIELD_DECL may exist in multiple copies.
 +  We only care about offsets and bit offsets for operands.  */

Err?  Even at LTO time FIELD_DECLs should only appear once.
So - testcase?

IMHO most of this boils down to ICF being too strict about aliasing
because it inlines merged bodies instead of original ones.

 +  if (TREE_CODE (arg0) == FIELD_DECL)
 + {
 +   tree offset1 = DECL_FIELD_OFFSET (arg0);
 +   tree offset2 = DECL_FIELD_OFFSET (arg1);
 +
 +   tree bit_offset1 = DECL_FIELD_BIT_OFFSET (arg0);
 +   tree bit_offset2 = DECL_FIELD_BIT_OFFSET (arg1);
 +
  +   flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
 +
  +   return operand_equal_p (offset1, offset2, flags)
  +  && operand_equal_p (bit_offset1, bit_offset2, flags);
 + }
/* Consider __builtin_sqrt equal to sqrt.  */
return (TREE_CODE (arg0) == FUNCTION_DECL
   && DECL_BUILT_IN (arg0) && DECL_BUILT_IN (arg1)
 @@ -3104,12 +3174,50 @@ operand_equal_p (const_tree arg0, const_
   && DECL_FUNCTION_CODE (arg0) == DECL_FUNCTION_CODE (arg1));
  
  default:

case tcc_exceptional:

 +  /* In GIMPLE empty constructors are allowed in initializers of
 +  vector types.  */

Why this comment about GIMPLE?  This is about comparing GENERIC
trees which of course can have CONSTRUCTORs of various sorts.

 +  if (TREE_CODE (arg0) == CONSTRUCTOR)
 + {
 +   unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (arg0));
 +   unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (arg1));
 +
 +   if (length1 != length2)
 + return false;
 +
  +   flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
 +
  +   for (unsigned i = 0; i < length1; i++)
  + if (!operand_equal_p (CONSTRUCTOR_ELT (arg0, i)->value,
  +   CONSTRUCTOR_ELT (arg1, i)->value, flags))

You need to compare ->index as well here.  I'm not sure constructor
values are sorted 

Re: Do not compute alias sets for types that don't need them

2015-05-22 Thread Jan Hubicka
  Index: emit-rtl.c
  ===
  --- emit-rtl.c  (revision 223508)
  +++ emit-rtl.c  (working copy)
  @@ -1787,8 +1787,15 @@ set_mem_attributes_minus_bitpos (rtx ref
 memset (attrs, 0, sizeof (attrs));
   
 /* Get the alias set from the expression or type (perhaps using a
  - front-end routine) and use it.  */
  -  attrs.alias = get_alias_set (t);
  + front-end routine) and use it.
  +
  + We may be called to produce MEM RTX for variable of incomplete type.
   + This MEM RTX will only be used to produce the address of a variable, so
  + we do not need to compute alias set.  */
 
 But why do we set mem_attributes for it then?

Because we are stupid.  We want to produce &var, where var is of incomplete
type, and to do it we need to compute DECL_RTL of var, which is a MEM RTX, and
we assign attributes to every DECL_RTL MEM we produce.  I do not see how to cut
this down except for having something like DECL_RTL_ADDR and a way to build it
without actually making the MEM expr... (which would save some memory for the
MEM RTX and for the attributes, but it would make it difficult to hook that
value somewhere)
 
   +  if (!DECL_P (t) || type_with_alias_set_p (TYPE_MAIN_VARIANT (TREE_TYPE (t))))
  +attrs.alias = get_alias_set (t);
  +  else
   +gcc_assert (DECL_P (t) && TREE_ADDRESSABLE (t));
 
 This assert also looks fishy to me...  (also just use 'type' instead
 of TREE_TYPE (t), not sure why you need to pun to the main variant
 here either - get_alias_set will do that for you and so should
 type_with_alias_set_p if that is necessary).

I am not sure if TYPE_MAIN_VARIANT is really needed here.  What I know is that
complete types may have incomplete variants.  I was a bit worried that I could
disable the calculation of the alias set of something that does matter by not
checking the main variant, in case we somehow manage to have a variable of an
incomplete variant type but still access it because we know its complete main
variant (that would be somewhat broken).

That is also why I added the check - the idea is that at least I can verify
that this particular var indeed has its address taken.  The other option would
be to define an invalid alias set, assign the MEM RTX this invalid alias set,
and make alias.c bomb when it trips across it.

I actually noticed it triggers while producing some libjava glue where we
forgot to set TREE_ADDRESSABLE for an artificial variable whose address is
taken.  I will look into a fix later today or tomorrow.

Honza
 
 The rest of the patch looks ok.
 
 Thanks,
 Richard.
 
 MEM_VOLATILE_P (ref) |= TYPE_VOLATILE (type);
 MEM_POINTER (ref) = POINTER_TYPE_P (type);
  Index: ipa-icf-gimple.c
  ===
  --- ipa-icf-gimple.c(revision 223508)
  +++ ipa-icf-gimple.c(working copy)
  @@ -274,6 +274,12 @@ func_checker::compatible_types_p (tree t
 if (!types_compatible_p (t1, t2))
   return return_false_with_msg (types are not compatible);
   
   + /* FIXME: type compatibility checks for types that are never stored
  + to memory is not very useful.  We should update the code to avoid
  + calling compatible_types_p on types that will never be used.  */
  +  if (!type_with_alias_set_p (t1) || !type_with_alias_set_p (t2))
  +return true;
  +
 if (get_alias_set (t1) != get_alias_set (t2))
   return return_false_with_msg (alias sets are different);
   
  
  
 
 -- 
 Richard Biener rguent...@suse.de
 SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
 Norton, HRB 21284 (AG Nuernberg)


[C++ Patch] PR 65598

2015-05-22 Thread Paolo Carlini

Hi,

this is also by and large obvious, I think: in order to use the right 
location for error messages about 'explicit', just use 
declspecs->locations[ds_explicit]. Tested x86_64-linux.


Thanks,
Paolo.

PS: I'm pretty sure we do have similar issues for other decl-specifiers, 
which can be likewise fixed very easily. I'll go through them over the 
next weeks..



/cp
2015-05-22  Paolo Carlini  paolo.carl...@oracle.com

PR c++/65598
* decl.c (grokdeclarator): Use the correct location in error
messages about 'explicit'.

/testsuite
2015-05-22  Paolo Carlini  paolo.carl...@oracle.com

PR c++/65598
* g++.dg/cpp0x/explicit9.C: New.
* g++.dg/cpp0x/explicit8.C: Check the locations too.
Index: cp/decl.c
===
--- cp/decl.c   (revision 223520)
+++ cp/decl.c   (working copy)
@@ -10266,12 +10266,15 @@ grokdeclarator (const cp_declarator *declarator,
 in the declaration of a constructor or conversion function within
 a class definition.  */
   if (!current_class_type)
-   error ("%<explicit%> outside class declaration");
+   error_at (declspecs->locations[ds_explicit],
+ "%<explicit%> outside class declaration");
   else if (friendp)
-   error ("%<explicit%> in friend declaration");
+   error_at (declspecs->locations[ds_explicit],
+ "%<explicit%> in friend declaration");
   else
-   error ("only declarations of constructors and conversion operators "
-  "can be %<explicit%>");
+   error_at (declspecs->locations[ds_explicit],
+ "only declarations of constructors and conversion operators "
+ "can be %<explicit%>");
   explicitp = 0;
 }
 
Index: testsuite/g++.dg/cpp0x/explicit8.C
===
--- testsuite/g++.dg/cpp0x/explicit8.C  (revision 223520)
+++ testsuite/g++.dg/cpp0x/explicit8.C  (working copy)
@@ -5,13 +5,13 @@ struct A {
   explicit operator int() const;
 };
 
-explicit inline A::operator int() const { return 1; }  // { dg-error "'explicit' outside class declaration" }
+explicit inline A::operator int() const { return 1; }  // { dg-error "1:'explicit' outside class declaration" }
 
 struct B {
-  explicit void f();  // { dg-error "only declarations of constructors and conversion operators can be 'explicit'" }
+  explicit void f();  // { dg-error "3:only declarations of constructors and conversion operators can be 'explicit'" }
 };
 
-explicit void B::f() { }  // { dg-error "'explicit' outside class declaration" }
+explicit void B::f() { }  // { dg-error "1:'explicit' outside class declaration" }
 
 struct C {
   explicit C(int);
@@ -18,5 +18,5 @@ struct C {
 };
 
 struct D {
-  explicit friend C::C(int);  // { dg-error "'explicit' in friend declaration" }
+  explicit friend C::C(int);  // { dg-error "3:'explicit' in friend declaration" }
 };
Index: testsuite/g++.dg/cpp0x/explicit9.C
===
--- testsuite/g++.dg/cpp0x/explicit9.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/explicit9.C  (working copy)
@@ -0,0 +1,12 @@
+// PR c++/65598
+// { dg-do compile { target c++11 } }
+
+struct ExplicitTest
+{
+  explicit operator bool() const;
+};
+
+explicit ExplicitTest::operator bool() const  // { dg-error "1:'explicit' outside class declaration" }
+{
+  return true;
+}


Re: [patch 10/10] debug-early merge: compiler proper

2015-05-22 Thread Aldy Hernandez

On 05/22/2015 07:23 AM, Richard Biener wrote:

On Wed, May 20, 2015 at 5:50 PM, Aldy Hernandez al...@redhat.com wrote:

On 05/18/2015 06:56 AM, Richard Biener wrote:



diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index ad1bb23..2a9f417 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1334,6 +1334,9 @@ struct GTY(()) tree_block {
 tree abstract_origin;
 tree fragment_origin;
 tree fragment_chain;
+
+  /* Pointer to the DWARF lexical block.  */
+  struct die_struct *die;
   };

   struct GTY(()) tree_type_common {

Ick - do we need this?  dwarf2out.c has a hashtable to map blocks to
DIEs (which you don't remove in turn).



We need a way to reference the early created DIE from late debugging, and we
can't use block_map because it gets clobbered across functions. It's
currently being released in late debug (dwarf2out_function_decl),
that's why you see it not set to NULL in dwarf2out_c_finalize.

Also, it uses BLOCK_NUMBERs, which according to the documentation in
tree.h, are not guaranteed to be unique across functions.

As Honza mentioned, we're already using a DIE map in types through
TYPE_SYMTAB_DIE.  See lookup_type_die() in dwarf2out.c.

Could we leave this as is?


But why then not eliminate block_map in favor of using the new ->die member?
Having both looks very odd to me.


Oh, I would love to.  I just didn't want to rip things apart elsewhere 
until I was sure you guys were on board with the approach.



Can you cook up a patch for trunk adding that field to tree_block and removing
the block_map map in favor of sth like what we do for
lookup_type_die/equate_type_number_to_die
and TYPE_SYMTAB_DIE?


Absolutely!  The attached patch removes block_map in favor of BLOCK_DIE. 
 I did not add lookup_block_die/equate_block_number_to_die abstractions 
because I think BLOCK_DIE is pretty straightforward.


The attached patch is against mainline.  I also ported it to the branch 
for testing, and neither the branch nor mainline exhibit any regressions.


Tested on x86-64 Linux with --enable-languages=all,go,ada.

OK for trunk?

Aldy

commit 9a82ff7b044a8d17eaaaec5eaec3e73f836224df
Author: Aldy Hernandez al...@redhat.com
Date:   Fri May 22 10:07:17 2015 -0400

* dwarf2out.c: Remove block_map.
(gen_call_site_die): Replace block_map use with BLOCK_DIE.
(gen_lexical_block_die): Same.
(dwarf2out_function_decl): Remove block_map use.
(dwarf2out_c_finalize): Same.
* tree-core.h (struct tree_block): Add die field.
* tree.h (BLOCK_DIE): New.

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index cc7ac84..15c545e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -2908,10 +2908,6 @@ static int call_site_count = -1;
 /* Number of tail call sites in the current function.  */
 static int tail_call_site_count = -1;
 
-/* Vector mapping block numbers to DW_TAG_{lexical_block,inlined_subroutine}
-   DIEs.  */
-static vec<dw_die_ref> block_map;
-
 /* A cached location list.  */
 struct GTY ((for_user)) cached_dw_loc_list_def {
   /* The DECL_UID of the decl that this entry describes.  */
@@ -18368,8 +18364,7 @@ gen_call_site_die (tree decl, dw_die_ref subr_die,
  && block != DECL_INITIAL (decl)
  && TREE_CODE (block) == BLOCK)
 {
-  if (block_map.length () > BLOCK_NUMBER (block))
-   stmt_die = block_map[BLOCK_NUMBER (block)];
+  stmt_die = BLOCK_DIE (block);
   if (stmt_die)
break;
   block = BLOCK_SUPERCONTEXT (block);
@@ -19469,11 +19464,7 @@ gen_lexical_block_die (tree stmt, dw_die_ref 
context_die)
   dw_die_ref stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt);
 
   if (call_arg_locations)
-{
-  if (block_map.length () <= BLOCK_NUMBER (stmt))
-   block_map.safe_grow_cleared (BLOCK_NUMBER (stmt) + 1);
-  block_map[BLOCK_NUMBER (stmt)] = stmt_die;
-}
+BLOCK_DIE (stmt) = stmt_die;
 
   if (! BLOCK_ABSTRACT (stmt)  TREE_ASM_WRITTEN (stmt))
 add_high_low_attributes (stmt, stmt_die);
@@ -19506,11 +19497,7 @@ gen_inlined_subroutine_die (tree stmt, dw_die_ref 
context_die)
= new_die (DW_TAG_inlined_subroutine, context_die, stmt);
 
   if (call_arg_locations)
-   {
 - if (block_map.length () <= BLOCK_NUMBER (stmt))
-   block_map.safe_grow_cleared (BLOCK_NUMBER (stmt) + 1);
- block_map[BLOCK_NUMBER (stmt)] = subr_die;
-   }
+   BLOCK_DIE (stmt) = subr_die;
   add_abstract_origin_attribute (subr_die, decl);
   if (TREE_ASM_WRITTEN (stmt))
 add_high_low_attributes (stmt, subr_die);
@@ -21407,7 +21394,6 @@ dwarf2out_function_decl (tree decl)
   call_arg_loc_last = NULL;
   call_site_count = -1;
   tail_call_site_count = -1;
-  block_map.release ();
   decl_loc_table-empty ();
   cached_dw_loc_list_table-empty ();
 }
@@ -25008,7 +24994,6 @@ dwarf2out_c_finalize (void)
   call_arg_loc_last = NULL;
   call_site_count = -1;
   tail_call_site_count = -1;
-  //block_map = NULL;
   cached_dw_loc_list_table = NULL;
   

[Ada] Rename Has_Volatile_Full_Access into Is_Volatile_Full_Access

2015-05-22 Thread Arnaud Charlet
This is an internal change that renames the Has_Volatile_Full_Access flag into
Is_Volatile_Full_Access for the sake of consistency with similar flags.

No user-visible changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2015-05-22  Eric Botcazou  ebotca...@adacore.com

* einfo.ads (Has_Volatile_Full_Access): Rename into...
(Is_Volatile_Full_Access): ...this.
(Set_Has_Volatile_Full_Access): Rename into...
(Set_Is_Volatile_Full_Access): ...this.
* einfo.adb (Has_Volatile_Full_Access): Rename into...
(Is_Volatile_Full_Access): ...this.
(Set_Has_Volatile_Full_Access): Rename into...
(Set_Is_Volatile_Full_Access): ...this.
(Is_Atomic_Or_VFA): Adjust to above renaming.
* errout.adb (Special_Msg_Delete): Likewise.
* exp_pakd.adb (Install_PAT): Likewise.
* freeze.adb (Freeze_Array_Type): Likewise.
* sem_ch8.adb (Analyze_Object_Renaming): Likewise.
* sem_ch13.adb (Inherit_Delayed_Rep_Aspects): Likewise.
(Inherit_Aspects_At_Freeze_Point): Likewise.
* sem_prag.adb (Set_Atomic_VFA): Likewise.
(Process_Atomic_Independent_Shared_Volatile): Likewise.
* sem_util.adb (Is_Atomic_Or_VFA_Object): Likewise.

Index: einfo.adb
===
--- einfo.adb   (revision 223560)
+++ einfo.adb   (working copy)
@@ -592,7 +592,7 @@
--Has_Nested_Subprogram   Flag282
--Is_Uplevel_Referenced_EntityFlag283
--Is_UnimplementedFlag284
-   --Has_Volatile_Full_AccessFlag285
+   --Is_Volatile_Full_Access Flag285
--Needs_Typedef   Flag286
 
--(unused)Flag253
@@ -1856,11 +1856,6 @@
   return Flag87 (Implementation_Base_Type (Id));
end Has_Volatile_Components;
 
-   function Has_Volatile_Full_Access (Id : E) return B is
-   begin
-  return Flag285 (Id);
-   end Has_Volatile_Full_Access;
-
function Has_Xref_Entry (Id : E) return B is
begin
   return Flag182 (Id);
@@ -2528,6 +2523,11 @@
   end if;
end Is_Volatile;
 
+   function Is_Volatile_Full_Access (Id : E) return B is
+   begin
+  return Flag285 (Id);
+   end Is_Volatile_Full_Access;
+
function Itype_Printed (Id : E) return B is
begin
   pragma Assert (Is_Itype (Id));
@@ -4758,11 +4758,6 @@
   Set_Flag87 (Id, V);
end Set_Has_Volatile_Components;
 
-   procedure Set_Has_Volatile_Full_Access (Id : E; V : B := True) is
-   begin
-  Set_Flag285 (Id, V);
-   end Set_Has_Volatile_Full_Access;
-
procedure Set_Has_Xref_Entry (Id : E; V : B := True) is
begin
   Set_Flag182 (Id, V);
@@ -5498,6 +5493,11 @@
   Set_Flag16 (Id, V);
end Set_Is_Volatile;
 
+   procedure Set_Is_Volatile_Full_Access (Id : E; V : B := True) is
+   begin
+  Set_Flag285 (Id, V);
+   end Set_Is_Volatile_Full_Access;
+
procedure Set_Itype_Printed (Id : E; V : B := True) is
begin
   pragma Assert (Is_Itype (Id));
@@ -7335,7 +7335,7 @@
 
function Is_Atomic_Or_VFA (Id : E) return B is
begin
-  return Is_Atomic (Id) or else Has_Volatile_Full_Access (Id);
+  return Is_Atomic (Id) or else Is_Volatile_Full_Access (Id);
end Is_Atomic_Or_VFA;
 
--
@@ -8750,7 +8750,6 @@
   W (Has_Uplevel_Reference,   Flag215 (Id));
   W (Has_Visible_Refinement,  Flag263 (Id));
   W (Has_Volatile_Components, Flag87  (Id));
-  W (Has_Volatile_Full_Access,Flag285 (Id));
   W (Has_Xref_Entry,  Flag182 (Id));
   W (In_Package_Body, Flag48  (Id));
   W (In_Private_Part, Flag45  (Id));
@@ -8865,6 +8864,7 @@
   W (Is_Visible_Formal,   Flag206 (Id));
   W (Is_Visible_Lib_Unit, Flag116 (Id));
   W (Is_Volatile, Flag16  (Id));
+  W (Is_Volatile_Full_Access, Flag285 (Id));
   W (Itype_Printed,   Flag202 (Id));
   W (Kill_Elaboration_Checks, Flag32  (Id));
   W (Kill_Range_Checks,   Flag33  (Id));
Index: einfo.ads
===
--- einfo.ads   (revision 223560)
+++ einfo.ads   (working copy)
@@ -2047,12 +2047,6 @@
 --   type the pragma will be chained to the rep item chain of the first
 --   subtype in the usual manner.
 
---Has_Volatile_Full_Access (Flag285)
---   Defined in all type entities, and also in constants, components and
---   variables. Set if a pragma Volatile_Full_Access applies to the entity.
---   In the case of private and incomplete types, this flag is set in
---   both the partial view and the full view.
-
 --Has_Xref_Entry (Flag182)
 --   Defined in all entities. Set if an entity has an entry in the Xref
 --   information generated in ali files. This is true for all source
@@ 

Re: [patch 10/10] debug-early merge: compiler proper

2015-05-22 Thread Aldy Hernandez

On 05/22/2015 07:26 AM, Richard Biener wrote:

On Wed, May 20, 2015 at 11:45 PM, Aldy Hernandez al...@redhat.com wrote:

On 05/20/2015 05:01 PM, Jan Hubicka wrote:



commit 8824b5ecba26cef065e47b34609c72677c3c36fc
Author: Aldy Hernandez al...@redhat.com
Date:   Wed May 20 16:31:14 2015 -0400

  Set DECL_IGNORED_P on temporary arrays created in the switch
  conversion pass.

diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 6b68a16..a4bcdba 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1097,6 +1097,7 @@ build_one_array (gswitch *swtch, int num, tree
arr_index_type,
 DECL_ARTIFICIAL (decl) = 1;
 TREE_CONSTANT (decl) = 1;
 TREE_READONLY (decl) = 1;
+  DECL_IGNORED_P (decl) = 1;
 varpool_node::finalize_decl (decl);



This looks obvious enough to me.  Technically speaking the array type
constructed probably should be TREE_ARTIFICIAL, but it probably does not
matter.


Fine to commit to trunk btw.


Tested independently on trunk, and committed there.

Thanks.



Re: [C++ Patch] PR 65598

2015-05-22 Thread Jason Merrill

OK.

Jason


[PATCH, i386, libgcc]: Split SSE specific part from set_fast_math

2015-05-22 Thread Uros Bizjak
Hello!

This patch splits the SSE-specific part of set_fast_math into its own
function, decorated with the fxsr,sse target attribute. This way, we can
avoid compiling the whole file with -msse, which implies generation of
possibly unsupported CMOVE insns.

Additionally, we can now use generic t-crtfm makefile fragment.

2015-05-22  Uros Bizjak  ubiz...@gmail.com

* config.host (i[34567]-*-*, x86_64-*-*): Add t-crtfm instead of
i386/t-crtfm to tmake_file.
* config/i386/crtfastmath.c (set_fast_math_sse): New function.
(set_fast_math): Use set_fast_math_sse for SSE targets.
* config/i386/t-crtfm: Remove.

Bootstrapped, regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline SVN.

Uros.
Index: config.host
===================================================================
--- config.host	(revision 223519)
+++ config.host	(working copy)
@@ -553,12 +553,12 @@ hppa*-*-openbsd*)
	tmake_file="$tmake_file pa/t-openbsd"
	;;
 i[34567]86-*-darwin*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
	tm_file="$tm_file i386/darwin-lib.h"
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
	;;
 x86_64-*-darwin*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
	tm_file="$tm_file i386/darwin-lib.h"
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
	;;
@@ -595,24 +595,24 @@ x86_64-*-openbsd*)
	;;
 i[34567]86-*-linux*)
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
	tm_file="${tm_file} i386/elf-lib.h"
	md_unwind_header=i386/linux-unwind.h
	;;
 i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i[34567]86-*-gnu* | i[34567]86-*-kopensolaris*-gnu)
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
	tm_file="${tm_file} i386/elf-lib.h"
	;;
 x86_64-*-linux*)
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
	tm_file="${tm_file} i386/elf-lib.h"
	md_unwind_header=i386/linux-unwind.h
	;;
 x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu)
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
-	tmake_file="${tmake_file} i386/t-crtpc i386/t-crtfm i386/t-crtstuff t-dfprules"
+	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
	tm_file="${tm_file} i386/elf-lib.h"
	;;
 i[34567]86-pc-msdosdjgpp*)
@@ -628,7 +628,7 @@ i[34567]86-*-rtems*)
	extra_parts="$extra_parts crti.o crtn.o"
	;;
 i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
-	tmake_file="$tmake_file i386/t-crtpc i386/t-crtfm"
+	tmake_file="$tmake_file i386/t-crtpc t-crtfm"
	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
	tm_file="${tm_file} i386/elf-lib.h"
	md_unwind_header=i386/sol2-unwind.h
@@ -652,7 +652,7 @@ i[34567]86-*-cygwin*)
	else
		tmake_dlldir_file=i386/t-dlldir-x
	fi
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin i386/t-crtfm i386/t-chkstk t-dfprules"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin t-crtfm i386/t-chkstk t-dfprules"
	;;
 x86_64-*-cygwin*)
	extra_parts="crtbegin.o crtbeginS.o crtend.o crtfastmath.o"
@@ -672,7 +672,7 @@ x86_64-*-cygwin*)
		tmake_dlldir_file=i386/t-dlldir-x
	fi
	# FIXME - dj - t-chkstk used to be in here, need a 64-bit version of that
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin i386/t-crtfm t-dfprules i386/t-chkstk"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-cygwin t-crtfm t-dfprules i386/t-chkstk"
	;;
 i[34567]86-*-mingw*)
	extra_parts="crtbegin.o crtend.o crtfastmath.o"
@@ -700,7 +700,7 @@ i[34567]86-*-mingw*)
	else
		tmake_dlldir_file=i386/t-dlldir-x
	fi
-	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} i386/t-slibgcc-cygming i386/t-cygming i386/t-mingw32 i386/t-crtfm i386/t-chkstk t-dfprules"
+	tmake_file="${tmake_file} ${tmake_eh_file} ${tmake_dlldir_file} 

Copy TYPE_NO_FORCE_BLK in finalize_type_size

2015-05-22 Thread Jan Hubicka
Hi,
PR 66181 is about an ICE in verify_type, which complains that a type and its
variant differ in TYPE_NO_FORCE_BLK.  This flag is kind-of internal to
stor-layout.c, so the divergence may not matter (I am not sure about it, as the
C++ FE finalizes type variants separately and thus may compute different
values), but I think it is cleaner to copy it to all variants like the other
fields set by stor-layout.

Bootstrapped/regtested x86_64-linux, and the patch reportedly fixes the ARM ICE.
OK?
Honza
PR middle-end/66181
* stor-layout.c (finalize_type_size): Also copy TYPE_NO_FORCE_BLK.
Index: stor-layout.c
===
--- stor-layout.c   (revision 222869)
+++ stor-layout.c   (working copy)
@@ -1834,6 +1834,7 @@
  TYPE_ALIGN (variant) = valign;
  TYPE_PRECISION (variant) = precision;
  SET_TYPE_MODE (variant, mode);
+ TYPE_NO_FORCE_BLK (variant) = TYPE_NO_FORCE_BLK (type);
}
 }
 }



[patch] libstdc++/66017 Avoid bad casts and fix alignment of _Rb_tree_node<long long>::_M_storage

2015-05-22 Thread Jonathan Wakely

There are two problems involved in this PR.

First, as Clang's ubsan detects, we are using static_cast to convert
from _Rb_tree_node_base* to _Rb_tree_node<_Val>* in cases where there
is no _Rb_tree_node<_Val> at that address (_M_impl._M_header is just
an _Rb_tree_node_base). That's undefined behaviour, and shows up here
when alignof(_Rb_tree_node_base) != alignof(_Rb_tree_node<_Val>)
because we end up with a misaligned _Rb_tree_node<_Val>* (which we
never dereference, because it's the past-the-end iterator, but it's
still UB to do the cast).

The second problem is that alignof(_Rb_tree_node<long long>) changes
depending on whether it's compiled as c++98 or c++11. This is because
in C++11 mode I replaced the member _M_value_field with uninitialized
storage aligned as alignof(_Val), using __aligned_buffer, but for
targets that define the ADJUST_FIELD_ALIGN target macro it's wrong to
assume that alignof(_M_value_field) == alignof(_Val), because the type
might have weaker alignment when it's a member of a structure. This
means the __aligned_buffer in C++11 mode is not aligned the same as
the _M_value_field member it was meant to replace. Ouch.

The first problem can be fixed by simply removing the casts, I don't
see why we need them. They are there because the _Rb_tree_iterator and
_Rb_tree_const_iterator constructors take a pointer-to-derived, but
then they upcast to pointer-to-base and store that type instead. So if
we just make the constructors take pointer-to-base instead then we
don't need the possibly-invalid casts. This way we only do the
downcast when dereferencing an iterator, which only happens for real
nodes where the downcast is valid, not for _M_impl._M_header.

The second problem can be fixed by making __aligned_buffer use the
alignment of struct _Tp2 { _Tp t; } instead of alignof(_Tp).

patch.txt fixes those two problems. That is needed on trunk and the
gcc-5 branch.

patch2.txt then takes the idea further and gets rid of some more casts
that aren't actually necessary. The pointer returned by _M_end() is
never dereferenced, so there's no reason it can't be a _Base_ptr
rather than a _Link_type (i.e. _Rb_tree_node_base* rather than
_Rb_tree_node<_Val>*). While making that change I realised that the
_M_xxx_tr() functions for heterogeneous lookup can be simplified, so
the non-const versions use const ones and then call _M_const_cast() on
the results. Any objections to this additional change going in to
trunk?  We could take it even further and have _M_begin() return a
_Base_ptr and then only downcast it as needed, but I'm not proposing
that for now.

commit d57e7058e821664735e0118f8e37b8cbc37e49fb
Author: Jonathan Wakely jwak...@redhat.com
Date:   Thu May 21 14:41:16 2015 +0100

	PR libstdc++/66017
	* include/bits/stl_tree.h (_Rb_tree_iterator, _Rb_tree_const_iterator):
	Support construction from _Base_ptr.
	(_Rb_tree_const_iterator::_M_const_cast): Remove static_cast.
	(_Rb_tree::begin, _Rb_tree::end): Remove static_cast.
	* include/ext/aligned_buffer.h (__aligned_buffer): Use alignment of
	_Tp as a member subobject, not as a complete object.

diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index 5ca8e28..2dc95df 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -188,7 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_node() { }
 
   explicit
-  _Rb_tree_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT
+  _Rb_tree_iterator(_Base_ptr __x) _GLIBCXX_NOEXCEPT
   : _M_node(__x) { }
 
   reference
@@ -260,7 +260,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_node() { }
 
   explicit
-  _Rb_tree_const_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT
+  _Rb_tree_const_iterator(_Base_ptr __x) _GLIBCXX_NOEXCEPT
   : _M_node(__x) { }
 
   _Rb_tree_const_iterator(const iterator& __it) _GLIBCXX_NOEXCEPT
@@ -268,8 +268,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   iterator
   _M_const_cast() const _GLIBCXX_NOEXCEPT
-  { return iterator(static_cast<typename iterator::_Link_type>
-			(const_cast<typename iterator::_Base_ptr>(_M_node))); }
+  { return iterator(const_cast<typename iterator::_Base_ptr>(_M_node)); }
 
   reference
   operator*() const _GLIBCXX_NOEXCEPT
@@ -868,28 +867,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   iterator
   begin() _GLIBCXX_NOEXCEPT
-  { 
-	return iterator(static_cast<_Link_type>
-			(this->_M_impl._M_header._M_left));
-  }
+  { return iterator(this->_M_impl._M_header._M_left); }
 
   const_iterator
   begin() const _GLIBCXX_NOEXCEPT
-  { 
-	return const_iterator(static_cast<_Const_Link_type>
-			  (this->_M_impl._M_header._M_left));
-  }
+  { return const_iterator(this->_M_impl._M_header._M_left); }
 
   iterator
   end() _GLIBCXX_NOEXCEPT
-  { return iterator(static_cast<_Link_type>(&this->_M_impl._M_header)); }
+  { return iterator(&this->_M_impl._M_header); }
 
   const_iterator
  

Reuse predicate code analysis for constraints

2015-05-22 Thread Richard Sandiford
This patch adjusts the fix for PR target/65689 along the lines suggested
in https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01559.html.  The idea
is to reuse the existing gensupport.c routine to work out the codes
accepted by constraints.

I'd originally done this with an eye to using compute_test_codes for
the problem that Andreas found on s390.  I don't think it's going to
be useful for that after all, but it seems worth having for its own sake.

Bootstrapped & regression-tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* gensupport.h (compute_test_codes): Declare.
* gensupport.c (compute_predicate_codes): Rename to...
(compute_test_codes): ...this.  Generalize error message.
(process_define_predicate): Update accordingly.
* genpreds.c (compute_maybe_allows): Delete.
(add_constraint): Use compute_test_codes to determine whether
something can accept a SUBREG, REG or MEM.

Index: gcc/gensupport.h
===
--- gcc/gensupport.h2015-05-21 08:45:32.663464769 +0100
+++ gcc/gensupport.h2015-05-21 08:45:33.015460604 +0100
@@ -109,5 +109,6 @@ struct pattern_stats
 };
 
 extern void get_pattern_stats (struct pattern_stats *ranges, rtvec vec);
+extern void compute_test_codes (rtx, int, char *);
 
 #endif /* GCC_GENSUPPORT_H */
Index: gcc/gensupport.c
===
--- gcc/gensupport.c2015-05-21 08:45:32.663464769 +0100
+++ gcc/gensupport.c2015-05-21 08:51:12.667438995 +0100
@@ -204,8 +204,8 @@ #define TRISTATE_NOT(a) \
predicate expression EXP, writing the result to CODES.  LINENO is
the line number on which the directive containing EXP appeared.  */
 
-static void
-compute_predicate_codes (rtx exp, int lineno, char codes[NUM_RTX_CODE])
+void
+compute_test_codes (rtx exp, int lineno, char *codes)
 {
   char op0_codes[NUM_RTX_CODE];
   char op1_codes[NUM_RTX_CODE];
@@ -215,29 +215,29 @@ compute_predicate_codes (rtx exp, int li
   switch (GET_CODE (exp))
 {
 case AND:
-  compute_predicate_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_predicate_codes (XEXP (exp, 1), lineno, op1_codes);
+  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
+  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_AND (op0_codes[i], op1_codes[i]);
   break;
 
 case IOR:
-  compute_predicate_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_predicate_codes (XEXP (exp, 1), lineno, op1_codes);
+  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
+  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_OR (op0_codes[i], op1_codes[i]);
   break;
 case NOT:
-  compute_predicate_codes (XEXP (exp, 0), lineno, op0_codes);
+  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_NOT (op0_codes[i]);
   break;
 
 case IF_THEN_ELSE:
   /* a ? b : c  accepts the same codes as (a && b) | (!a && c).  */
-  compute_predicate_codes (XEXP (exp, 0), lineno, op0_codes);
-  compute_predicate_codes (XEXP (exp, 1), lineno, op1_codes);
-  compute_predicate_codes (XEXP (exp, 2), lineno, op2_codes);
+  compute_test_codes (XEXP (exp, 0), lineno, op0_codes);
+  compute_test_codes (XEXP (exp, 1), lineno, op1_codes);
+  compute_test_codes (XEXP (exp, 2), lineno, op2_codes);
   for (i = 0; i < NUM_RTX_CODE; i++)
codes[i] = TRISTATE_OR (TRISTATE_AND (op0_codes[i], op1_codes[i]),
TRISTATE_AND (TRISTATE_NOT (op0_codes[i]),
@@ -321,7 +321,7 @@ compute_predicate_codes (rtx exp, int li
 
 default:
   error_with_line (lineno,
-  '%s' cannot be used in a define_predicate expression,
+  '%s' cannot be used in predicates or constraints,
   GET_RTX_NAME (GET_CODE (exp)));
   memset (codes, I, NUM_RTX_CODE);
   break;
@@ -373,7 +373,7 @@ process_define_predicate (rtx desc, int
   if (GET_CODE (desc) == DEFINE_SPECIAL_PREDICATE)
 pred->special = true;
 
-  compute_predicate_codes (XEXP (desc, 1), lineno, codes);
+  compute_test_codes (XEXP (desc, 1), lineno, codes);
 
   for (i = 0; i < NUM_RTX_CODE; i++)
 if (codes[i] != N)
Index: gcc/genpreds.c
===
--- gcc/genpreds.c  2015-05-21 08:45:32.663464769 +0100
+++ gcc/genpreds.c  2015-05-21 08:45:33.015460604 +0100
@@ -716,34 +716,6 @@ mangle (const char *name)
   return XOBFINISH (rtl_obstack, const char *);
 }
 
-/* Return a bitmask, bit 1 if EXP maybe allows a REG/SUBREG, 2 if EXP
-   maybe allows a MEM.  Bits should be clear only when we are sure it
-   will not allow a REG/SUBREG or a 

[PATCH/RFC] Make loop-header-copying more aggressive, rerun before tree-if-conversion

2015-05-22 Thread Alan Lawrence
This example, which I wrote to test if-conversion, currently fails to if-convert 
or vectorize:


int a[32], b[32];  /* array declarations assumed by the example */

void foo ()
{
  for (int i = 0; i < 32; i++)
    {
      int m = (a[i] > i) ? 5 : 4;
      b[i] = a[i] * m;
    }
}

...because jump-threading in dom1 rearranged the loop into a form that neither 
if-conversion nor vectorization would attempt. Discussion at 
https://gcc.gnu.org/ml/gcc/2015-04/msg00343.html led to the suggestion that I 
should rerun loop-header copying (an earlier attempt to fix if-conversion, 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01743.html, still did not enable 
vectorization).


This patch does so (and makes it slightly less conservative, to tackle the example 
above). I found I had to make this a separate pass, so that the phi nodes were 
cleaned up at the end of the pass before running tree_if_conversion. Also at 
this stage in the compiler (inside loop opts) it was not possible to run 
loop_optimizer_init+finalize, or other loop_optimizer data structures needed 
later would be deleted; hence, I have two nearly-but-not-quite-identical passes, 
the new ch_vect avoiding the init/finalize. I tried to tackle this with some 
C++ subclassing, which removes the duplication, but the result feels a little 
ugly; suggestions for any neater approach welcome.


This patch causes failure of the scan-tree-dump of dom2 in gcc.dg/ssa/pr21417.c. 
This looks for jump-threading to perform an optimization, but no longer finds 
the expected line in the log - as the loop-header-copying phase has already done 
an equivalent transformation *before* dom2. The final CFG is thus in the desired 
form, but I'm not sure how to determine this (scanning the CFG itself is very 
difficult, well beyond what we can do with regex, requiring looking at multiple 
lines and basic blocks). Can anyone advise? [The test issue can be worked around 
by preserving the old do_while_p logic for the first header-copying pass, and 
using the new logic only for the second, but this is more awkward inside the 
compiler, which feels wrong.]


Besides the new vect-ifcvt-11.c, the testsuite actually has a couple of other 
examples where this patch enables (undesired!) vectorization. I've dealt with 
these, but for the record:
	* gcc.dg/vect/slp-perm-7.c: the initialization loop in main contained a check 
that input[i] < 200; this was already optimized out (because input[i] was set to 
i%256, where i < N with N #defined to 16), but that loop was not vectorized because:
/work/alalaw01/oban/srcfsf/gcc/gcc/testsuite/gcc.dg/vect/slp-perm-7.c:54:3: 
note: not vectorized: latch block not empty.
/work/alalaw01/oban/srcfsf/gcc/gcc/testsuite/gcc.dg/vect/slp-perm-7.c:54:3: 
note: bad loop form.


	* gcc.dg/vect/vect-strided-a-u16-i4.c: the main1() function has three loops; 
the first (initialization) has an 'if (y) abort() /* Avoid vectorization.  */'. 
However, the 'volatile int y = 0' this was meant to reference, is actually 
shadowed by a local non-volatile; the test is thus peeled off and absent from 
the body of the loop. The loop only avoided vectorization because of non-empty 
latch and bad loop form, as previous.


With this patch, both those loops now have good form, hence I have fixed both to 
check a global volatile to prevent these extraneous parts from being vectorized.


Tested with bootstrap + check-gcc on x86_64 and AArch64 (linux). As noted above, 
this causes a spurious PASS-FAIL of a scan-tree-dump test, which I'm unsure how 
to fix, but no other regressions.


gcc/ChangeLog:

* tree-pass.h (make_pass_ch_vect): New.
* passes.def: Add pass_ch_vect just before pass_if_conversion.

* tree-ssa-loop-ch.c (do_while_loop_p): For single-exit loops,
look for blocks with exit edge and code after them.
(pass_data_ch_vect, class pass_ch_vect, make_pass_ch_vect): New.
(class pass_ch): Extend pass_ch_vect.
(pass_ch::execute): Move all but loop_optimizer_init/finalize to...
(pass_ch_vect::execute): ...here.

* tree-ssa-loop.c (pass_tree_loop_init::execute): Add flags
LOOPS_HAVE_PREHEADERS and LOOPS_HAVE_SIMPLE_LATCHES.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-perm-7.c (zero): New.
(main): Test zero rather than input[i], to avoid vectorization.
	* gcc.dg/vect/vect-strided-a-u16-i4.c (main1): Narrow scope of unsigned
	x,y,z,w.
* gcc.dg/vect/vect-ifcvt-11.c: New.

diff --git a/gcc/passes.def b/gcc/passes.def
index 1d598b2..87cfe2a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -247,6 +247,7 @@ along with GCC; see the file COPYING3.  If not see
 	  PUSH_INSERT_PASSES_WITHIN (pass_parallelize_loops)
 	  NEXT_PASS (pass_expand_omp_ssa);
 	  POP_INSERT_PASSES ()
+	  NEXT_PASS (pass_ch_vect);
 	  NEXT_PASS (pass_if_conversion);
 	  /* pass_vectorize must immediately follow pass_if_conversion.
 	 Please do not add any other passes in between.  */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-7.c 

Re: [RFA] Fix combine to canonicalize (mult X pow2)) more often

2015-05-22 Thread Segher Boessenkool
On Thu, May 21, 2015 at 09:24:37AM -0600, Jeff Law wrote:
 When combine needs to split a complex insn, it will canonicalize a 
 simple (mult X (const_int Y)) where Y is a power of 2 into the expected 
 (ashift X (const_int Y')) if the (mult ...) is selected as a split point.
 
 However if the split point is (plus (mult X (const_int Y)) Z), combine 
 fails to canonicalize.

 OK for the trunk?

Okay.  Something went wrong with your changelog though.  Rebasing
changelogs is a recipe for disaster.


The following is just copy-paste, but anyway...

 +	   /* Similarly for (plus (mult FOO (const_int pow2))).  */
 +	   if (split_code == PLUS
 +	       && GET_CODE (XEXP (*split, 0)) == MULT
 +	       && CONST_INT_P (XEXP (XEXP (*split, 0), 1))
 +	       && INTVAL (XEXP (XEXP (*split, 0), 1)) > 0
 +	       && (i = exact_log2 (UINTVAL (XEXP (XEXP (*split, 0), 1)))) >= 0)

INTVAL (..) > 0 here disallows 1<<63 while exact_log2 allows it
(with a 64-bit HWI).  Most places don't check the > 0 and it doesn't
seem necessary in the places that do.

This won't ever trigger for a SImode 1<<31 either.

None of this matters when we're just looking at scaled indexing, of
course.  But exact_log2 is much harder to use correctly than to use
it incorrectly :-(


Segher


Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Ramana Radhakrishnan



On 22/05/15 15:28, Jason Merrill wrote:

On 05/22/2015 09:55 AM, David Edelsohn wrote:

On Fri, May 22, 2015 at 9:40 AM, Jason Merrill ja...@redhat.com wrote:

On 05/22/2015 07:23 AM, Ramana Radhakrishnan wrote:


+  /* Load the guard value only through an atomic acquire load.  */
+  guard = build_atomic_load (guard, MEMMODEL_ACQUIRE);
+
 /* Check to see if the GUARD is zero.  */
 guard = get_guard_bits (guard);



I wonder if these calls should be reversed, to express that we're only
trying to atomically load a byte (on non-ARM targets)?


That expresses the semantics more directly, but will that lead to less
efficient code on some RISC architectures?


I'm not sure.  I would expect that the target would use a larger load
and mask it if that is better for performance, but I don't know.

I do notice that get_guard_bits after build_atomic_load just won't work
on non-ARM targets, as it ends up trying to take the address of a value.

Jason





So on powerpc where targetm.guard_mask_bit is false - this is what I see.


{
  static int * p;

static int * p;
  if (cleanup_point (unsigned char) *(char *) (long long int) 
__atomic_load_8 (&_ZGVZ1fvE1p, 2) == 0)

{
  if (cleanup_point __cxa_guard_acquire (&_ZGVZ1fvE1p) != 0)
{
  cleanup_point  Unknown tree: expr_stmt
  TARGET_EXPR D.2846, 0;, p = (int *) operator new (4);;, D.2846 = 
1;;, __cxa_guard_release (&_ZGVZ1fvE1p); ;

}
}
  return retval = p;
}

with the following output - David - is this what you would expect ?


0:  addis 2,12,.TOC.-0b@ha
addi 2,2,.TOC.-0b@l
.localentry _Z1fv,.-_Z1fv
mflr %r0
std %r30,-16(%r1)
std %r31,-8(%r1)
.cfi_register 65, 0
.cfi_offset 30, -16
.cfi_offset 31, -8
addis %r30,%r2,_ZGVZ1fvE1p@toc@ha
addi %r30,%r30,_ZGVZ1fvE1p@toc@l
std %r0,16(%r1)
stdu %r1,-64(%r1)
.cfi_def_cfa_offset 64
.cfi_offset 65, 16
addis %r10,%r2,_ZGVZ1fvE1p@toc@ha   # gpr load 
fusion, type long

ld %r10,_ZGVZ1fvE1p@toc@l(%r10)
cmpw %cr7,%r10,%r10
bne- %cr7,$+4
isync
rlwinm %r9,%r10,0,0xff
std %r10,32(%r1)
cmpwi %cr7,%r9,0
beq %cr7,.L11
.L9:
addi %r1,%r1,64
.cfi_remember_state
.cfi_def_cfa_offset 0
addis %r3,%r2,_ZZ1fvE1p@toc@ha  # gpr load fusion, type 
long

ld %r3,_ZZ1fvE1p@toc@l(%r3)
ld %r0,16(%r1)
ld %r30,-16(%r1)
ld %r31,-8(%r1)
mtlr %r0
.cfi_restore 65
.cfi_restore 31
.cfi_restore 30
blr
.p2align 4,,15
.L11:
.cfi_restore_state
mr %r3,%r30




And on AArch64 which is where guard_mask_bit is true.


 static int * p;

static int * p;
  if (cleanup_point ((long long int) __atomic_load_8 (&_ZGVZ1fvE1p, 
2) & 1) == 0)

{
  if (cleanup_point __cxa_guard_acquire (&_ZGVZ1fvE1p) != 0)
{
  cleanup_point  Unknown tree: expr_stmt
  TARGET_EXPR D.3168, 0;, p = (int *) operator new (4);;, D.3168 = 
1;;, __cxa_guard_release (&_ZGVZ1fvE1p); ;

}
}
  return retval = p;
}

Alternatively I can change this to an atomic_load of just the byte 
required but I'm not sure if that's supported on all targets.


I'm going to be running away shortly for the long weekend here, so will 
resume discussions/experiments on Tuesday.



regards
Ramana


Re: [RFC / CFT] PR c++/66192 - Remove TARGET_RELAXED_ORDERING and use load acquires.

2015-05-22 Thread Jason Merrill

On 05/22/2015 09:55 AM, David Edelsohn wrote:

On Fri, May 22, 2015 at 9:40 AM, Jason Merrill ja...@redhat.com wrote:

On 05/22/2015 07:23 AM, Ramana Radhakrishnan wrote:


+  /* Load the guard value only through an atomic acquire load.  */
+  guard = build_atomic_load (guard, MEMMODEL_ACQUIRE);
+
 /* Check to see if the GUARD is zero.  */
 guard = get_guard_bits (guard);



I wonder if these calls should be reversed, to express that we're only
trying to atomically load a byte (on non-ARM targets)?


That expresses the semantics more directly, but will that lead to less
efficient code on some RISC architectures?


I'm not sure.  I would expect that the target would use a larger load 
and mask it if that is better for performance, but I don't know.


I do notice that get_guard_bits after build_atomic_load just won't work 
on non-ARM targets, as it ends up trying to take the address of a value.


Jason



Re: [PATCH][AArch64] PR target/65491: Classify V1TF vectors as AAPCS64 short vectors rather than composite types

2015-05-22 Thread Kyrill Tkachov

Hi James,

On 19/05/15 12:18, James Greenhalgh wrote:

On Mon, Apr 20, 2015 at 11:16:02AM +0100, Kyrylo Tkachov wrote:

Hi all,

The ICE in the PR happens when we pass a 1x(128-bit float) vector as an
argument.
The aarch64 backend erroneously classifies it as a composite type when in
fact it
is a short vector according to AAPCS64
(section 4.1.2 from
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf).

Agreed.


The solution in this patch is to check aarch64_composite_type_p for a short
vector with
aarch64_short_vector_p rather than the other way around (check for
aarch64_short_vector_p
in aarch64_composite_type_p).

I think I understand what you are saying, but your patch does the
opposite (ADDS a check for aarch64_short_vector_p in
aarch64_composite_type_p, REMOVES a check for aarch64_composite_type_p,
in aarch64_short_vector_p)...


Yeah, I just worded it wrong in the cover letter, sorry about that.
As you say, the logic is pretty hairy.



This logic is pretty hairy, and I'm struggling to convince myself that
your change only hits the bug you described above. I think I've worked
it through and it does, but if you can find any additional ABI tests
which stress the Vector/Floating-Point passing rules that would help
settle my nerves.


The aapcs64.exp stuff seems to test the existing rules
quite well...



The patch is OK. I wouldn't think we would want to backport it to
release branches as there is no regression to fix.


Ok, I've committed it with r223577.
I agree that it's not a regression fix, so messing with ABI code
in the release branches is not desirable.

Thanks for the review.

Kyrill



Thanks,
James


2015-04-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 PR target/65491
 * config/aarch64/aarch64.c (aarch64_short_vector_p): Move above
 aarch64_composite_type_p.  Remove check for aarch64_composite_type_p.
 (aarch64_composite_type_p): Return false if given type and mode are
 for a short vector.

2015-04-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 PR target/65491
 * gcc.target/aarch64/pr65491_1.c: New test.
 * gcc.target/aarch64/aapcs64/type-def.h (vlf1_t): New typedef.
 * gcc.target/aarch64/aapcs64/func-ret-1.c: Add test for vlf1_t.




Re: Add few cases to operand_equal_p

2015-05-22 Thread Jan Hubicka
   + case OBJ_TYPE_REF:
   +   {
   + if (!operand_equal_p (OBJ_TYPE_REF_EXPR (arg0),
   +   OBJ_TYPE_REF_EXPR (arg1), flags))
   +   return false;
   +   if (flag_devirtualize && virtual_method_call_p (arg0))
   +   {
   + if (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
   + != tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1)))
   +   return false;
   + if (!types_same_for_odr (obj_type_ref_class (arg0),
   +  obj_type_ref_class (arg1)))
   +   return false;
  
  devirt machinery in operand_equal_p?  please not.  not more places!
 
 OBJ_TYPE_REF explicitly says what is the type of class whose virtual method is
 called. It is GIMPLE operand expression like other, so I think we need to 
 handle
 it.
 
 Actually I think I can just drop the flag_devirtualize check because with
 !flag_devirtualize we should simply avoid having OBJ_TYPE_REF around.  That
 would make it looking less devirt specific.  We can also test types for
 equivalence of main variant, but that would just introduce false positives 
 when
 LTO did not merged two identical ODR types.  It would be correct though.

After more tought, I indeed like the idea of having gimple specific matching
code better (I managed to talk myself into the idea of unifying the code rather
than duplicating everything, but handling generic is a pain)

If we go this path, I think I can just withdraw the OBJ_TYPE_REF change here.
operand_equal_p conservatively considers every pair of OBJ_TYPE_REFs different,
which is safe.

Concerning the rest of the patch, I leave it up to your decision if we want
to handle these in operand_equal_p or only at gimple level.

thanks,
Honza
  
   + /* OBJ_TYPE_REF_OBJECT is used to track the instance of
   +object THIS pointer points to.  Checking that both
   +addresses equal is safe approximation of the fact
   +that dynamic types are equal.
   +Do not care about the other flags, because this expression
   +does not attribute to actual value of OBJ_TYPE_REF  */
   + if (!operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
   +   OBJ_TYPE_REF_OBJECT (arg1),
   +   OEP_ADDRESS_OF))
   +   return false;
   +   }
   +
   + return true;
   +   }
   +
 default:
   return 0;
 }
   @@ -3097,6 +3152,21 @@ operand_equal_p (const_tree arg0, const_
 }

case tcc_declaration:
   +  /* At LTO time the FIELD_DECL may exist in multiple copies.
   +  We only care about offsets and bit offsets for operands.  */
  
  Err?  Even at LTO time FIELD_DECLs should only appear once.
  So - testcase?
 
 FIELD_DECL has TREE_TYPE and TREE_TYPE may not get merged by LTO.  So if the
 two expressions originate from two different units, we may have two
 semantically equivalent FIELD_DECLs (of compatible types and same offsets) 
 that
 occupy different memory locations because their merging was prevented by
 something upstream (like complete wrt incomplete pointer in the type)
   +  /* In GIMPLE empty constructors are allowed in initializers of
   +  vector types.  */
  
  Why this comment about GIMPLE?  This is about comparing GENERIC
  trees which of course can have CONSTRUCTORs of various sorts.
  
   +  if (TREE_CODE (arg0) == CONSTRUCTOR)
   + {
   +   unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (arg0));
   +   unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (arg1));
   +
   +   if (length1 != length2)
   + return false;
   +
   +   flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
   +
   +   for (unsigned i = 0; i < length1; i++)
   + if (!operand_equal_p (CONSTRUCTOR_ELT (arg0, i)->value,
   +   CONSTRUCTOR_ELT (arg1, i)->value, flags))
  
  You need to compare ->index as well here.  I'm not sure constructor
  values are sorted always so you might get very many false negatives.
 
 many false negatives is better than all false negatives :)
 
 You are right, either I should punt on non-empty constructors or compare
 the indexes, will do the second.
 
   +case FIELD_DECL:
   +  /* At LTO time the FIELD_DECL may exist in multiple copies.
   +  We only care about offsets and bit offsets for operands.  */
  
  So explain that this is the reason we don't want to hash by
  DECL_UID.  I still think this is bogus.
 
 Will do if we agree on having this.
 
 I know you would like ipa-icf to keep original bodies and use them for 
 inlining
 declaring alias sets to be function local.  This is wrong plan.  Consder:
 
 void t(int *ptr)
 {
   *ptr=1;
 }
 
 int a(int *ptr1, int *ptr2)
 {
   int a = *ptr1;
   t(ptr2)
   return a+*ptr1;
 }
 
 long b(long *ptr1, int *ptr2)
 {
   int a = *ptr1;
   t(ptr2)
   return a+*ptr1;
 }
 
 here aliasing leads to the two functions being optimized differently:
 a:
 .LFB1:  
 .cfi_startproc
  

Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

2015-05-22 Thread Sriraman Tallam
On Fri, May 22, 2015 at 2:00 AM, Pedro Alves pal...@redhat.com wrote:
 On 05/21/2015 11:02 PM, Sriraman Tallam wrote:
 On Thu, May 21, 2015 at 2:51 PM, Pedro Alves pal...@redhat.com wrote:
 On 05/21/2015 10:12 PM, Sriraman Tallam wrote:

 My original proposal, for x86_64 only, was to add
 -fno-plt=function-name. This lets the user decide for which
 functions PLT must be avoided.  Let the compiler always generate an
 indirect call using call *func@GOTPCREL(%rip).  We could do this for
 non-PIC code too.  No need for linker fixups since this relies on the
 user to know that func is from a shared object.

 Having to pass function names on the command line seems like an odd
 interface.  E.g, you'll need to pass the mangled name for
 C++ functions.  Any reason this isn't a function attribute?

 It is not clear to me where I would stick the attribute.  Example
 usage in foo.cc:

 #include <string.h>

 int main() {
   int n = memcmp();
 }

 I want memcmp to not go through PLT, do you propose explicitly
 re-declaring it in foo.cc with the attribute?

 I guess you'd do:

 #include <string.h>

 __attribute__((no_plt)) typeof (memcmp) memcmp;

 int main() {
   int n = memcmp();
 }

 or even:

 #include <string.h>

 int main() {
   if (hotpath) {
  __attribute__((no_plt)) typeof (memcmp) memcmp;
 for (..) {
   int n = memcmp();
 }
   } else {
   int n = memcmp();
   }
 }

 or globally:

 $ cat no-plt/string.h:
 #include_next <string.h>
 __attribute__((no_plt)) typeof (memcpy) memcpy;

 $ gcc -I no-plt/ ...

That looks good, thanks.

Sri


 Thanks,
 Pedro Alves


