date:20140212

[PATCH] S390: Add test for hotpatching of nested functions

2014-02-12 Thread Dominik Vogt

This patch adds a s390 specific test that verifies that
hotpatching of nested functions is not supported and generates a
warning.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From ae383db14eb5a9035603e7c80126470389ea8029 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 13 Feb 2014 06:57:14 +
Subject: [PATCH 1/2] S390: Add test for hotpatching of nested functions.

(Must generate a warning that it's not supported.)
---
 gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c | 23 ++
 1 file changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c

diff --git a/gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c b/gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c
new file mode 100644
index 000..489fc5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c
@@ -0,0 +1,23 @@
+/* Functional tests for the function hotpatching feature.  */
+
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -mhotpatch" } */
+
+#include 
+
+int hp1(void)
+{
+  int nested1(void) /* { dg-warning "hotpatching is not compatible with nested functions" } */
+  { return 1; }
+
+  __attribute__ ((hotpatch))
+  int nested2(void) /* { dg-warning "hotpatching is not compatible with nested functions" } */
+  { return 1; }
+
+  return nested1() - nested2();
+}
+
+int main (void)
+{
+  return hp1();
+}
-- 
1.8.3.1

2014-02-13  Dominik Vogt  

* gcc.target/s390/hotpatch-compile-8.c: New test

Re: [PATCH] (gcc-4.8) S390: Fix crash with -mhotpatch and gfortran

2014-02-12 Thread Dominik Vogt

On Wed, Feb 12, 2014 at 11:30:30AM +0100, Dominik Vogt wrote:
> On Wed, Feb 12, 2014 at 11:28:38AM +0100, Dominik Vogt wrote:
> > The attached patch fixes a crash if gfortran encounters a nested
> > function when -mhotpatch is enabled.  (It slightly improves the
> > warning message too.)
> > 
> > This patch affects s390 only.  Andreas Krebbel will commit the
> > patch soon, if there are no objections.
> 
> Same patch for gcc-4.8.

Patch updated according to Richard's suggestion in the 4.9 thread.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From 81264c60640ae4b61cc7fa0129857ca3b3d253e8 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 12 Feb 2014 05:53:34 +
Subject: [PATCH 1/2] S390: Fix crash when -mhotpatch encounters nested
 functions

(e.g. with gfortran).
---
 gcc/config/s390/s390.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2ba1d8a..6612bf9 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5475,9 +5475,8 @@ s390_asm_output_function_label (FILE *asm_out_file, const char *fname,
   if (hotpatch_trampoline_halfwords >= 0
 	  && decl_function_context (decl) != NULL_TREE)
 	{
-	  warning_at (0, DECL_SOURCE_LOCATION (decl),
-		  "hotpatch_prologue is not compatible with nested"
-		  " function");
+	  warning_at (DECL_SOURCE_LOCATION (decl), OPT_mhotpatch,
+		  "hotpatching is not compatible with nested functions");
 	  hotpatch_trampoline_halfwords = -1;
 	}
 }
-- 
1.8.3.1

2014-02-12  Dominik Vogt  

* config/s390/s390.c (s390_asm_output_function_label):
fix crash caused by bad second argument to warning_at() with -mhotpatch
and nested functions (e.g. with gfortran)

Re: [PATCH] S390: Fix crash with -mhotpatch and gfortran

2014-02-12 Thread Dominik Vogt

On Wed, Feb 12, 2014 at 11:34:16AM +, Richard Sandiford wrote:
> Dominik Vogt  writes:
> > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> > index 32a25a4..9ae8ffd 100644
> > --- a/gcc/config/s390/s390.c
> > +++ b/gcc/config/s390/s390.c
> > @@ -5510,9 +5510,8 @@ s390_asm_output_function_label (FILE *asm_out_file, 
> > const char *fname,
> >if (hotpatch_trampoline_halfwords >= 0
> >   && decl_function_context (decl) != NULL_TREE)
> > {
> > - warning_at (0, DECL_SOURCE_LOCATION (decl),
> > - "hotpatch_prologue is not compatible with nested"
> > - " function");
> > + warning_at (0, OPT_mhotpatch,
> > + "hotpatching is not compatible with nested functions");
> 
> Looks like this should be:
> 
> warning_at (DECL_SOURCE_LOCATION (decl), OPT_mhotpatch,
> "hotpatching is not compatible with nested functions");

Yep, see updated patch.  Thanks.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From b5c24ad50180da2ac31be5c6c3d5fb827c8ab8ed Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 12 Feb 2014 05:53:34 +
Subject: [PATCH 1/2] S390: Fix crash when -mhotpatch encounters nested
 functions

(e.g. with gfortran).
---
 gcc/config/s390/s390.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 32a25a4..ec88bf1 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5510,9 +5510,8 @@ s390_asm_output_function_label (FILE *asm_out_file, const char *fname,
   if (hotpatch_trampoline_halfwords >= 0
 	  && decl_function_context (decl) != NULL_TREE)
 	{
-	  warning_at (0, DECL_SOURCE_LOCATION (decl),
-		  "hotpatch_prologue is not compatible with nested"
-		  " function");
+	  warning_at (DECL_SOURCE_LOCATION (decl), OPT_mhotpatch,
+		  "hotpatching is not compatible with nested functions");
 	  hotpatch_trampoline_halfwords = -1;
 	}
 }
-- 
1.8.3.1

2014-02-12  Dominik Vogt  

* config/s390/s390.c (s390_asm_output_function_label):
fix crash caused by bad second argument to warning_at() with -mhotpatch
and nested functions (e.g. with gfortran)

[committed] Remove auto increment FIXME from pa.c

2014-02-12 Thread John David Anglin

This hunk of code disabling auto increment instructions can be removed  
on

the trunk now that PR middle-end/56791 is fixed there.

Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.

Committed to trunk.

Dave
--
John David Anglin   dave.ang...@bell.net


2014-02-12  John David Anglin  

* config/pa/pa.c (pa_option_override): Remove auto increment FIXME.

Index: config/pa/pa.c
===
--- config/pa/pa.c  (revision 207710)
+++ config/pa/pa.c  (working copy)
@@ -517,12 +517,6 @@
   write_symbols = NO_DEBUG;
 }
 
-#ifdef AUTO_INC_DEC
-  /* FIXME: Disable auto increment and decrement processing until reload
- is completed.  See PR middle-end 56791.  */
-  flag_auto_inc_dec = reload_completed;
-#endif
-
   /* We only support the "big PIC" model now.  And we always generate PIC
  code when in 64bit mode.  */
   if (flag_pic == 1 || TARGET_64BIT)

Re: [PATCH] Don't add DW_AT_{main_subprogram,calling_convention} attributes more than once (PR debug/60152)

2014-02-12 Thread Cary Coutant

> gen_subprogram_die is often called more than once for the same decl
> (e.g. the first time through force_decl_die etc.), but it always
> unconditionally calls add_calling_convention_attribute which thus
> may add the attributes several times.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Looks good to me.

> Alternatively, we could avoid the add_calling_convention_attribute
> call in gen_subprogram_die if subr_die == orig_die, i.e. if we have
> already processed the decl once, hopefully the calling conventions would be
> the same in all cases.

s/orig_die/old_die/

This sounds a little more efficient -- no calls to get_AT just to see
if the attribute has already been added.

I've got a slight preference for your alternate proposal (assuming it works).

-cary

[PATCH] [libgomp] make it possible to use OMP on both sides of a fork

2014-02-12 Thread Nathaniel Smith

Problem: A common use care for OMP is to accelerate the internal
workings of an otherwise serial interface. For example, OpenBLAS in
some settings will internally use OMP to accelerate the implementation
of matrix-matrix multiply (DGEMM). When DGEMM is called, then an OMP
section is started, the work is done, then the OMP section exits, the
program returns to serial mode, and DGEMM returns. All this is
entirely transparent to the user -- in fact, it's common for users to
switch between different linear algebra cores (BLAS libraries) without
recompiling, so it's impossible for code that uses linear algebra to
know which underlying library is in use, or how it has been compiled.

However, in order to support some corners of the OMP spec, it is
important that the threads that were started to implement an OMP
parallel section be kept around, in case another OMP section has
started. (AFAICT this is only true when "threadprivate" variables are
in use. Unfortunately AFAICT there is currently no way to determine
whether this is the case -- such variables are handled directly by GCC
without calling into libgomp, so we can't tell at runtime whether they
exist.)

And, this causes a big problem and abstraction leak: it means that if
you use OMP (e.g., by multiplying two matrices), and then fork, and
then the child also uses OMP (e.g., by also multiplying two matrices),
then the child immediately deadlocks (as OMP waits for threads that it
thinks still exist, but that disappeared during the fork). The result
is that it simply *is not possible to know* whether fork() will
actually work as advertised, even when writing purely serial code, if
that code happens to do seemingly innocent things like linear algebra.
And this then ends up causing surprising wreakage in far-flung parts
of the numerical ecosystem (e.g., here's someone trying to figure
figure out why their web site's task manager crashes whenever they try
to plot a graph: https://github.com/celery/celery/issues/1842).

(Somewhat more impassioned rant and references to previous discussions
here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035)

In practice, GOMP seems to be the only OMP implementation that suffers
from this problem; people who encounter this problem are often advised
to switch to icc.

There does not appear to be any fully POSIX-compliant way to solve
this problem (not least because in a strict reading of the POSIX spec,
you aren't guaranteed to be able to do practically *anything* after a
fork() in any program which has ever called a pthreads_* function). In
a less strict reading, we might expect to be okay if no threads are
actually running at the time that fork() is called -- but, we can't
shut down OMP threads before forking, because of the issue with
threadprivate variables -- it might change the behaviour of compliant
programs.

But in practice, if the fork() occurs at a time when every thread is
just sitting waiting on a barrier, then we can be pretty sure that
libc etc. will be in a generally thread-consistent state. And in
practice, the few truly dangerous operations we need to clean up after
the fact -- e.g., destroying that barrier -- do seem to work, at least
on Linux. The attached patch, therefore, takes this strategy.
Crucially, it should have no impact on compliant programs, because it
doesn't actually do anything except set/check a single global variable
until the user actually enters an OMP section in the child, at which
case they have already violated POSIX, so we might as well cross our
fingers and hope for the best. (At the very least, the included test
does fail on Linux x86-64 without the patch, and passes with the
patch.)

Other options that might be worth considering:
-- Adding some way for libgomp to determine whether threadprivate
variables are in use, and then using this information to shut down
threads in a pre-fork handler iff doing so is safe.
-- Instead of trying to clean up the various mutex/barrier/semaphore
detritus left in the child by the evaporating threads, we could simply
leak them. I don't know which is worse in practice: a small leak (once
per child process), or the risk that the various *_destroy functions
will blow up (as POSIX allows them to do).

ChangeLog:

2014-02-12  Nathaniel J. Smith  

* team.c (gomp_free_pool_helper): Move per-thread cleanup to main
thread.
(gomp_free_thread): Delegate implementation to...
(gomp_free_thread_pool): ...this new function. Like old
gomp_free_thread, but does per-thread cleanup, and has option to
skip everything that involves interacting with actual threads,
which is useful when called after fork.
(gomp_after_fork_callback): New function.
(gomp_team_start): Register atfork handler, and check for fork on
entry.

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Index: testsuite/libgomp.c/fork-1.c
===
--- testsu

[PATCH] Don't add DW_AT_{main_subprogram,calling_convention} attributes more than once (PR debug/60152)

2014-02-12 Thread Jakub Jelinek

Hi!

gen_subprogram_die is often called more than once for the same decl
(e.g. the first time through force_decl_die etc.), but it always
unconditionally calls add_calling_convention_attribute which thus
may add the attributes several times.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

Alternatively, we could avoid the add_calling_convention_attribute
call in gen_subprogram_die if subr_die == orig_die, i.e. if we have
already processed the decl once, hopefully the calling conventions would be
the same in all cases.

2014-02-12  Jakub Jelinek  

PR debug/60152
* dwarf2out.c (add_calling_convention_attribute): Don't add
DW_AT_main_subprogram or DW_AT_calling_convention attributes
if they are already present.

--- gcc/dwarf2out.c.jj  2014-01-30 20:31:19.0 +0100
+++ gcc/dwarf2out.c 2014-02-12 18:26:41.098486306 +0100
@@ -16873,13 +16873,15 @@ add_calling_convention_attribute (dw_die
rely on the old way, which we thus keep.  */
   value = DW_CC_program;
 
-  if (dwarf_version >= 4 || !dwarf_strict)
+  if ((dwarf_version >= 4 || !dwarf_strict)
+ && !get_AT (subr_die, DW_AT_main_subprogram))
add_AT_flag (subr_die, DW_AT_main_subprogram, 1);
 }
 
   /* Only add the attribute if the backend requests it, and
  is not DW_CC_normal.  */
-  if (value && (value != DW_CC_normal))
+  if (value && value != DW_CC_normal
+  && !get_AT (subr_die, DW_AT_calling_convention))
 add_AT_unsigned (subr_die, DW_AT_calling_convention, value);
 }
 

Jakub

[PATCH] Fix compress_float_constants related ICE (PR target/43546)

2014-02-12 Thread Jakub Jelinek

Hi!

With -O1 -m32 -mpreferred-stack-boundary=2 -msseregparm -msse
the following testcase ICEs, because the stack realignment code isn't aware
of the DFmode value that possibly needs spilling (on i?86/x86_64 the cost
code is saying that extending load from SFmode memory into DFmode constant
is for free, which is the case for i387 stack, but not when the target is
a SSE register).

Fixed by forcing the value first into a pseudo, where gen_reg_rtx takes
care of telling the stack realignment code about it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 

2014-02-12  Jakub Jelinek  

PR target/43546
* expr.c (compress_float_constant): If x is a hard register,
extend into a pseudo and then move to x.

* gcc.target/i386/pr43546.c: New test.

--- gcc/expr.c.jj   2014-02-11 18:45:28.0 +0100
+++ gcc/expr.c  2014-02-12 12:43:48.829403526 +0100
@@ -3726,12 +3726,21 @@ compress_float_constant (rtx x, rtx y)
 into a new pseudo.  This constant may be used in different modes,
 and if not, combine will put things back together for us.  */
   trunc_y = force_reg (srcmode, trunc_y);
-  emit_unop_insn (ic, x, trunc_y, UNKNOWN);
+
+  /* If x is a hard register, perform the extension into a pseudo,
+so that e.g. stack realignment code is aware of it.  */
+  rtx target = x;
+  if (REG_P (x) && HARD_REGISTER_P (x))
+   target = gen_reg_rtx (dstmode);
+
+  emit_unop_insn (ic, target, trunc_y, UNKNOWN);
   last_insn = get_last_insn ();
 
-  if (REG_P (x))
+  if (REG_P (target))
set_unique_reg_note (last_insn, REG_EQUAL, y);
 
+  if (target != x)
+   return emit_move_insn (x, target);
   return last_insn;
 }
 
--- gcc/testsuite/gcc.target/i386/pr43546.c.jj  2014-02-12 12:51:08.029131223 
+0100
+++ gcc/testsuite/gcc.target/i386/pr43546.c 2014-02-12 12:50:49.0 
+0100
@@ -0,0 +1,12 @@
+/* PR target/43546 */
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+/* { dg-additional-options "-mpreferred-stack-boundary=2 -msseregparm -msse" { 
target ia32 } } */
+
+extern void bar (double);
+
+void
+foo (void)
+{
+  bar (1.0);
+}

Jakub

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-12 Thread Xinliang David Li

On Wed, Feb 12, 2014 at 8:02 AM, Teresa Johnson  wrote:
> On Wed, Feb 12, 2014 at 6:45 AM, Teresa Johnson  wrote:
>> On Tue, Feb 11, 2014 at 6:13 PM, Xinliang David Li  
>> wrote:
>>> On Tue, Feb 11, 2014 at 5:36 PM, Teresa Johnson  
>>> wrote:
 On Tue, Feb 11, 2014 at 5:16 PM, Xinliang David Li  
 wrote:
> Why is call graph needed to determine whether to drop the profile?

 Because we detect this situation by looking for cases where the call
 edge counts greatly exceed the callee node count.

>
> If that is needed, it might be possible to leverage the ipa_profile
> pass as it will walk through all function nodes to do profile
> annotation. With this you can make decision to drop callee profile in
> caller's context.

 There are 2 ipa profiling passes, which are somewhat confusingly named
 (to me at least. =). This is being done during the first.

 The first is pass_ipa_tree_profile in tree-profile.c, but is a
 SIMPLE_IPA_PASS and has the name "profile" in the dump. The second is
 pass_ipa_profile in ipa-profile.c, which is an IPA_PASS and has the
 name "profile_estimate" in the dump. I assume you are suggesting to
 move this into the latter? But I'm not clear on what benefit that
 gives - the functions are not being traversed in order, so there is
 still the issue of needing to rebuild the cgraph after dropping
 profiles, which might be best done earlier as I have in the patch.
>>>
>>>
>>> I meant the tree-profile one.  I think this might work: after all the
>>> function's profile counts are annotated, add another walk of the the
>>> call graph nodes to drop bad profiles before the the call graph is
>>> rebuilt (Call graph does exist at that point).
>>
>> Ok, so it is already done in tree-profile. But it sounds like you are
>> suggesting reordering it to just above where we update the calls and
>> rebuild the cgraph the first time? As you noted in a follow-on email
>> to me, the cgraph edges don't have the profile counts at that point
>> (and neither do the nodes), so I would need to compare the count on
>> the call's bb to the entry bb count of the callee. That should be
>> doable, let me take a stab at it.
>
> This works well. Tested on omnetpp which has some dropped profiles and
> ensured that the behavior and output of the ipa tree profile phase is
> the same. Re-running bootstrap and regression tests.
>
> Here's the new patch. The only changes from the earlier patch are in
> handle_missing_profiles, where we now get the counts off of the entry
> and call stmt bbs, and in tree_profiling, where we call
> handle_missing_profiles earlier and I have removed the outlined cgraph
> rebuilding code since it doesn't need to be reinvoked.
>
> Honza, does this look ok for trunk when stage 1 reopens? David, I can
> send a similar patch for review to google-4_8 if it looks good.
>
> Thanks,
> Teresa
>
> 2014-02-12  Teresa Johnson  
>
> * graphite.c (graphite_finalize): Pass new parameter.
> * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
> * predict.c (tree_estimate_probability): New parameter.
> (tree_estimate_probability_worker): Renamed from
> tree_estimate_probability_driver.
> (tree_reestimate_probability): New function.
> (tree_estimate_probability_driver): Invoke
> tree_estimate_probability_worker.
> (freqs_to_counts): Move here from tree-inline.c.
> (drop_profile): Re-estimated profiles when dropping counts.
> (handle_missing_profiles): Drop for some non-zero functions as well,
> get counts from bbs to support invocation before cgraph rebuild.
> (counts_to_freqs): Remove code obviated by reestimation.
> * predict.h (tree_estimate_probability): Update declaration.
> * tree-inline.c (freqs_to_counts): Move to predict.c.
> (copy_cfg_body): Remove code obviated by reestimation.
> * tree-profile.c (tree_profiling): Invoke handle_missing_profiles
> before cgraph rebuild.
>
> Index: graphite.c
> ===
> --- graphite.c  (revision 207436)
> +++ graphite.c  (working copy)
> @@ -247,7 +247,7 @@ graphite_finalize (bool need_cfg_cleanup_p)
>cleanup_tree_cfg ();
>profile_status_for_fn (cfun) = PROFILE_ABSENT;
>release_recorded_exits ();
> -  tree_estimate_probability ();
> +  tree_estimate_probability (false);
>  }
>
>cloog_state_free (cloog_state);
> Index: params.def
> ===
> --- params.def  (revision 207436)
> +++ params.def  (working copy)
> @@ -44,6 +44,12 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME,
>   "Maximal estimated outcome of branch considered predictable",
>   2, 0, 50)
>
> +DEFPARAM (PARAM_MIN_CALLER_REESTIMATE_RATIO,
> + "min-caller-reestimate-ratio",
> + "Minimum ca

wide-int, tree

2014-02-12 Thread Mike Stump

In resolving a recent x86_64 libitm build problem found while trunk merging, I 
discovered that fold_builtin_memory_op  grabs an int whose size is based upon 
the size of a complex long double, which is 256 bits, and then during type 
hashing (where it calls type_hash_canon), it wants to check to see if this 
value fits into a uhwi.  to_widest is aborting due to 256 (complex long double) 
>= 128 (max wide int from i386-modes.def:

> /* Keep the OI and XI modes from confusing the compiler into thinking
>that these modes could actually be used for computation.  They are
>only holders for vectors during data movement.  */
> #define MAX_BITSIZE_MODE_ANY_INT (128)

)

to_widest does this to prevent large values from being truncated into smaller 
values.

The patch below fixes it, however, the question is, is this the right place to 
fix it.  Another fix would be to admit that the compiler does use 256 bit 
integers on x86 (for the maximum value of the OImode type) and instead change 
MAX_BITSIZE_MODE_ANY_INT to be 256.  Other possibilities exist.

This problem is limited to x86, since it is the only port that uses 
MAX_BITSIZE_MODE_ANY_INT.

Thoughts?

diff --git a/gcc/tree.c b/gcc/tree.c
index 2a68b03..8f2891a 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6970,9 +6970,14 @@ tree_fits_shwi_p (const_tree t)
 bool
 tree_fits_uhwi_p (const_tree t)
 {
-  return (t != NULL_TREE
- && TREE_CODE (t) == INTEGER_CST
- && wi::fits_uhwi_p (wi::to_widest (t)));
+  if (t == NULL_TREE)
+return false;
+  if (TREE_CODE (t) != INTEGER_CST)
+return false;
+  if (TREE_INT_CST_EXT_NUNITS (t) == 1)
+return TREE_INT_CST_ELT (t, 0) >= 0;
+  return (TREE_INT_CST_EXT_NUNITS (t) == 2
+ && TREE_INT_CST_ELT (t, 1) == 0);
 }
 
 /* T is an INTEGER_CST whose numerical value (extended according to

The chain of how we got here:

fold_builtin_3:
  case BUILT_IN_MEMMOVE:
return fold_builtin_memory_op (loc, arg0, arg1, arg2,
  =>   type, ignore, /*endp=*/3);



fold_builtin_memory_op:
if (FLOAT_MODE_P (TYPE_MODE (srctype))
|| TREE_CODE (srctype) == BOOLEAN_TYPE
|| TREE_CODE (srctype) == ENUMERAL_TYPE)
  {
B   enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
if (mode == BLKmode)
  srctype = NULL_TREE;
else
  srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode),
  =>1);
  }

build_nonstandard_integer_type:
if (unsignedp)
  fixup_unsigned_type (itype);
else
B =>  fixup_signed_type (itype);

ret = itype;
if (tree_fits_uhwi_p (TYPE_MAX_VALUE (itype)))
  ret = type_hash_canon (tree_to_uhwi (TYPE_MAX_VALUE (itype)), itype);
if (precision <= MAX_INT_CACHED_PREC)
  nonstandard_integer_type_cache[precision + unsignedp] = ret;

return ret;

Re: Warn about virtual table mismatches

2014-02-12 Thread Bernhard Reutner-Fischer


On 12 February 2014 07:27:59 Jan Hubicka  wrote:


> On 02/11/2014 07:54 PM, Jan Hubicka wrote:
> >+/* Allow combining RTTI and non-RTTI is OK.  */
> You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine,
> though you need to prefer the -frtti version in case code from that
> translation unit uses the RTTI info.

Is there some mechanism that linker will do so? At the moment we just chose 
variant
that would be selected by linker.  I can make the choice, but what happens 
with non-LTO?
> >/aux/hubicka/firefox/accessible/src/generic/DocAccessible.cpp:1232:0: 
note: the first different method is �HandleAccEvent�

> I don't see where this note would come from in the patch.
> Sorry, diffed old tree

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 207702)
+++ ipa-devirt.c(working copy)
@@ -1675,6 +1675,132 @@
 }


+/* Compare two virtual tables, PREVAILING and VTABLE and output ODR
+   violation warings.  */
+
+void
+compare_virtual_tables (tree prevailing, tree vtable)
+{
+  tree init1 = DECL_INITIAL (prevailing), init2 = DECL_INITIAL (vtable);
+  tree val1 = NULL, val2 = NULL;
+  if (!DECL_VIRTUAL_P (prevailing))
+{
+  odr_violation_reported = true;
+  if (warning_at (DECL_SOURCE_LOCATION (prevailing), 0,
+"declaration %D conflict with a virtual table",
+prevailing))
+   inform (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT (vtable))),
+   "a type defining the virtual table in another translation 
unit");
+  return;
+}
+  if (init1 == init2 || init2 == error_mark_node)
+return;
+  /* Be sure to keep virtual table contents even for external
+ vtables when they are available.  */
+  gcc_assert (init1 && init1 != error_mark_node);
+  if (!init2 && DECL_EXTERNAL (vtable))
+return;
+  if (init1 && init2
+  && CONSTRUCTOR_NELTS (init1) == CONSTRUCTOR_NELTS (init2))
+{
+  unsigned i;
+  tree field1, field2;
+  bool matched = true;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init1),
+i, field1, val1)
+   {
+ gcc_assert (!field1);
+ field2 = CONSTRUCTOR_ELT (init2, i)->index;
+ val2 = CONSTRUCTOR_ELT (init2, i)->value;
+ gcc_assert (!field2);
+ if (val2 == val1)
+   continue;
+ if (TREE_CODE (val1) == NOP_EXPR)
+   val1 = TREE_OPERAND (val1, 0);
+ if (TREE_CODE (val2) == NOP_EXPR)
+   val2 = TREE_OPERAND (val2, 0);
+ /* Unwind
+ val 
+  readonly constant
+  arg 0 
+  readonly
+  arg 0 
+  arg 0 > arg 1 >> */
+
+ while (TREE_CODE (val1) == TREE_CODE (val2)
+&& (((TREE_CODE (val1) == MEM_REF
+  || TREE_CODE (val1) == POINTER_PLUS_EXPR)
+ && (TREE_OPERAND (val1, 1))
+   == TREE_OPERAND (val2, 1))
+|| TREE_CODE (val1) == ADDR_EXPR))
+   {
+ val1 = TREE_OPERAND (val1, 0);
+ val2 = TREE_OPERAND (val2, 0);
+ if (TREE_CODE (val1) == NOP_EXPR)
+   val1 = TREE_OPERAND (val1, 0);
+ if (TREE_CODE (val2) == NOP_EXPR)
+   val2 = TREE_OPERAND (val2, 0);
+   }
+ if (val1 == val2)
+   continue;
+ if (TREE_CODE (val2) == ADDR_EXPR)
+   {
+ tree tmp = val1;
+ val1 = val2;
+ val2 = tmp;
+}
+ /* Combining RTTI and non-RTTI vtables is OK.
+??? Perhaps we can alsa (optionally) warn here?  */
+ if (TREE_CODE (val1) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (val1, 0)) == VAR_DECL
+ && !DECL_VIRTUAL_P (TREE_OPERAND (val1, 0))
+ && integer_zerop (val2))
+   continue;
+ /* For some reason zeros gets NOP_EXPR wrappers.  */
+ if (integer_zerop (val1)
+ && integer_zerop (val2))
+   continue;
+ /* Compare assembler names; this function is run during
+declaration merging.  */
+ if (DECL_P (val1) && DECL_P (val2)
+ && DECL_ASSEMBLER_NAME (val1) == DECL_ASSEMBLER_NAME (val2))
+   continue;
+ matched = false;
+ break;
+   }
+  if (!matched)
+   {
+ odr_violation_reported = true;
+	  if (warning_at (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT 
(vtable))), 0,

+"type %qD violates one definition rule",
+DECL_CONTEXT (vtable)))
+   {
+ inform (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT 
(prevailing))),
+ "a type with the same name but different virtual table is 
"
+ "defined in another translation unit");
+ /* See if we can be more informa

Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-12 Thread Eric Botcazou

> I am not sure how to deal with this, given that we have mismatched
> V_C_Es anyway, I'm inclined not to care and let the expander deal with
> it.  But at the same I understand that it is ugly and will certainly
> cause somebody more headache in the future.  I suppose that not
> scalarizing here might hurt performance and would be frowned upon at
> the very least.  If the fields bigger than the record approach is the
> standard way of doing this, perhaps SRA can detect such cases and
> produce these strange COMPONENT_REFs instead, but is it so?

You may remember that we went that way before (building a COMPONENT_REF for 
bit-fields instead of fully lowering the access) so doing it again would be a 
step backwards.  Likewise if we refuses to scalarize.  So IMO it's either low-
level fiddling in SRA or in the expander (my preference too).

-- 
Eric Botcazou

Re: [RS6000] power8 internal compiler errors

2014-02-12 Thread Ulrich Weigand

David Edelsohn wrote:
> On Mon, Feb 10, 2014 at 8:33 PM, Alan Modra  wrote:
> > On Mon, Feb 10, 2014 at 07:01:03PM -0500, David Edelsohn wrote:
> >> On Mon, Feb 10, 2014 at 5:18 PM, Alan Modra  wrote:
> >>
> >> Shouldn't addr_op2 also be set from find_replacement?
> >
> > Sorry, I thought after I sent the email that I should have added some
> > explanation of why certain parts need find_replacement and others
> > don't.  We want just those parts of addresses that might have been
> > reloaded.
> >
> > There's the case of the entire address being reloaded (actually, I'm
> > not sure this one is needed) and then all the ones we do in the rs6000
> > backend in legitimize_reload_address.  I think I found all the
> > required parts but it certainly won't hurt if you check too.  Calling
> > find_replacement when not strictly necessary will slow down gcc a
> > little..
> 
> I cannot tell if either branch of the PLUS could contain an address
> that previously was replaced. I want to avoid slow downs, but I also
> want to avoid leaving a latent bug, either now or a future change to
> reload or IRA.

In general, any part of an address may have been replaced.  In the
most generic case, if we have an address of the form
  (mem (plus (reg1) (reg2)))
for two pseudos, each of the pseudos *could* have been allocated
e.g. to an FPR under unusual circumstances, and then it would
require a replacement (without the whole address itself being
fully replaced).
 
> Is there an ENABLE_CHECKING assert of some form that could test
> addr_op2 to ensure that it does contain part of the address that was
> reloaded when not configured as a Release?

Note that find_replacement itself already recurses into both sides
of a PLUS.  So given the code flow:

-  addr = XEXP (mem, 0);
+  addr = find_replacement (&XEXP (mem, 0));
[...]
   if (GET_CODE (addr) == AND)
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (&XEXP (addr, 0));
}

  if (GET_CODE (addr) == PRE_MODIFY)
{
  scratch_or_premodify = XEXP (addr, 0);
  if (!REG_P (scratch_or_premodify))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);

  if (GET_CODE (XEXP (addr, 1)) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);

  addr = XEXP (addr, 1);
}

  if (GET_CODE (addr) == PLUS
  && (and_op2 != NULL_RTX
  || !rs6000_legitimate_offset_address_p (PTImode, addr,
  false, true)))
{
  addr_op1 = XEXP (addr, 0);
  addr_op2 = XEXP (addr, 1);

the only time addr_op1 *or* addr_op2 might need another replacement
is if addr was reset to the inner of a PRE_MODIFY.  So it might be
easier and cheaper overall to just do a find_replacement within
the PRE_MODIFY clause ...


If you really prefer a check, I guess you can always do something like:

#ifdef ENABLE_CHECKING
  gcc_assert (find_replacement (&XEXP (...)) == XEXP (...));
#endif

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

RE: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Iyer, Balaji V



> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, February 12, 2014 12:10 PM
> To: Iyer, Balaji V
> Cc: 'Jason Merrill'; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'r...@redhat.com'
> Subject: Re: [PING] [PATCH] _Cilk_for for C and C++
> 
> On Wed, Feb 12, 2014 at 05:04:38PM +, Iyer, Balaji V wrote:
> > I looked at the test code you send me (cf3.cc) at -O1 and it is
> > removing all the lines you have shown above.  Yes, I would imagine -O0
> > to have code that can be redundant or unnecessary.  Some of it could
> > be the artifact of internal code insertion.  But isn't the main job of
> > the instruction scheduler to remove all these redundant work?
> > Besides, it is just a function call.  The compiler at -O2, -O and -O3
> > removes the chunk of code that you mentioned.
> 
> As I said, just change the testcase so that the operator isn't inline, and
> suddenly even -O3 will not be able to remove the call.

I am sorry, I do not see any operators being asked to inline explicitly..

class I
{
public:
  typedef ptrdiff_t difference_type;
  I ();
  ~I ();
  I (T *);
  I (const I &);
  T &operator * ();
  T *operator -> ();
  T &operator [] (const difference_type &) const;
  I &operator = (const I &);
  I &operator ++ ();
  I operator ++ (int);
  I &operator -- ();
  I operator -- (int);
  I &operator += (const difference_type &);
  I &operator -= (const difference_type &);
  I operator + (const difference_type &) const;
  I operator - (const difference_type &) const;
  template  friend bool operator == (I &, I &);
  template  friend bool operator == (const I &, const I &);
  template  friend bool operator < (I &, I &);
  template  friend bool operator < (const I &, const I &);
  template  friend bool operator <= (I &, I &);
  template  friend bool operator <= (const I &, const I &);
  template  friend bool operator > (I &, I &);
  template  friend bool operator > (const I &, const I &);
  template  friend bool operator >= (I &, I &);
  template  friend bool operator >= (const I &, const I &);
  template  friend typename I::difference_type operator - (I 
&, I &);
  template  friend typename I::difference_type operator - (const 
I &, const I &);
  template  friend I operator + (typename I::difference_type 
, const I &);
private:
  T *p;
};


> 
>   Jakub

Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-12 Thread Martin Jambor

Hi,

On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
> Hmm.  The intent was of course to only allow truly no-op converts via
> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
> result type should be the same.  So, isn't SRA doing it wrong when
> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
> 

Even though I cannot recall the details, I remember I had to make SRA
able to accept such size mismatches in between V_C_E operands and
results in its input, to be able to work on Ada FE generated IL.  I
believe that is still the case and so we would have these mismatches
even if we changed SRA not to create them.

The reason why SRA creates such a thing is that
get_ref_base_and_extent returns size 24 when run on the following, an
expression of type with type size 32:

unit size 
align 32 symtab 0 alias set -1 canonical type 0x7ffc451dc5e8 
precision 32 min  max  context  RM size 
chain >
sizes-gimplified public unsigned SI
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 precision 
24 min  max  
RM size  RM min  
RM max >

arg 0 
unit size 
align 8 symtab 0 alias set -1 canonical type 0x7ffc451fe888 fields 
 context 
Ada size 
chain >

arg 0 
BLK file opt31.adb line 31 col 5
size 
unit size 
align 32 context >
arg 1 
BLK file opt31.adb line 25 col 5
size 
unit size 
align 8 offset_align 128
offset 
bit offset  context 
>>
arg 1 
sizes-gimplified public unsigned SI
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 
precision 24 min  max  RM size  RM min  RM max >
unsigned external packed bit-field nonaddressable SI file opt31.adb 
line 16 col 5
size 
unit size 
align 1 offset_align 128
offset 
bit offset 
bit_field_type 
sizes-gimplified public unsigned SI
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7ffc451fe930 
precision 24 min  max  RM size  RM min  RM max > context 
>>

This type wins over an access to the same memory with only 24-bit
large type from another statement and when that statement is modified,
ensuing type mismatch is "fixed" by creating a VIEW_CONVERT_EXPR with
the mismatch.

I am not sure how to deal with this, given that we have mismatched
V_C_Es anyway, I'm inclined not to care and let the expander deal with
it.  But at the same I understand that it is ugly and will certainly
cause somebody more headache in the future.  I suppose that not
scalarizing here might hurt performance and would be frowned upon at
the very least.  If the fields bigger than the record approach is the
standard way of doing this, perhaps SRA can detect such cases and
produce these strange COMPONENT_REFs instead, but is it so?

Martin

Re: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 05:04:38PM +, Iyer, Balaji V wrote:
> I looked at the test code you send me (cf3.cc) at -O1 and it is removing
> all the lines you have shown above.  Yes, I would imagine -O0 to have code
> that can be redundant or unnecessary.  Some of it could be the artifact of
> internal code insertion.  But isn't the main job of the instruction
> scheduler to remove all these redundant work?  Besides, it is just a
> function call.  The compiler at -O2, -O and -O3 removes the chunk of code
> that you mentioned.

As I said, just change the testcase so that the operator isn't inline, and
suddenly even -O3 will not be able to remove the call.

Jakub

RE: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Iyer, Balaji V

> > > More importantly, what is retval.1?  I'd expect you should be using
> > > retval.0 there and have it also as firstprivate(retval.0) on the parallel.
> > > In *.omplower dump I actually see:
> > > retval.0 = operator- (D.2885, &i); ...
> > > retval.1 = operator- (D.2888, &i);
> > > i.e. operator- is called twice.
> > >
> >
> > Yes, one is for the if-clause and one is for the condition. It really 
> > doesn't
> matter because we get of the stuff in the condition and replace with our own
> for loop with something like the for-loop shown  below. So retval1 doesn't
> come into picture. It is only alive from parser to the expand_cilk_for 
> function.
> >
> > For (i = low; i < high; i++)
> > {
> > <_Cilk_for_body>
> > }
> 
> No, it really does matter.  Just look at the *.optimized dump with -O0 -
> fcilkplus:
> 
>   retval.0_4 = operator- (_3, &i);
> in _Z3foo1JIiE function and
>   _4 = .omp_data_i_2(D)->j;
>   _5 = J::end (_4);
>   retval.1_6 = operator- (_5, &i);
>   retval.3_7 = retval.1_6;
> in _Z3foo1JIiE._cilk_for_fn.0.  All the 4 statements are dead, you really
> shouldn't emit them, even when optimizing, if e.g. the operator- isn't inline,
> g++ won't be able to optimize it away.
> 

I looked at the test code you send me (cf3.cc) at -O1 and it is removing all 
the lines you have shown above. Yes, I would imagine -O0 to have code that can 
be redundant or unnecessary. Some of it could be the artifact of internal code 
insertion. But isn't the main job of the instruction scheduler to remove all 
these redundant work? Besides, it is just a function call. The compiler at -O2, 
-O and -O3 removes the chunk of code that you mentioned.

-Balaji V. Iyer.


>   Jakub

Re: Harden ipa-devirt for invalid programs, part 1 (and also fix a wrong code issue)

2014-02-12 Thread Jakub Jelinek

Hi!

On Tue, Feb 11, 2014 at 09:43:52PM +0100, Jan Hubicka wrote:
> *** record_targets_from_bases (tree otr_type
> *** 1286,1292 
>  HOST_WIDE_INT otr_token,
>  tree outer_type,
>  HOST_WIDE_INT offset,
> !vec  nodes,
>  pointer_set_t *inserted,
>  pointer_set_t *matched_vtables,
>  bool *completep)
> --- 1318,1324 
>  HOST_WIDE_INT otr_token,
>  tree outer_type,
>  HOST_WIDE_INT offset,
> !vec  &nodes,
>  pointer_set_t *inserted,
>  pointer_set_t *matched_vtables,
>  bool *completep)

I think this change (which has not been mentioned in the ChangeLog BTW)
has fixed PR59737.  I've committed the following testcase as obvious
in order to close the PR.

2014-02-12  Jakub Jelinek  

PR middle-end/59737
* g++.dg/ipa/pr59737.C: New test.

--- gcc/testsuite/g++.dg/ipa/pr59737.C.jj   2014-02-12 17:47:28.123297799 
+0100
+++ gcc/testsuite/g++.dg/ipa/pr59737.C  2014-02-12 17:44:43.0 +0100
@@ -0,0 +1,48 @@
+// PR middle-end/59737
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A
+{
+  virtual void foo (int &x);
+  friend void
+  operator>> (int &x, A &y)
+  {
+y.foo (x);
+  }
+};
+
+struct B : public A
+{
+  void foo (int &x);
+};
+
+struct F : public B
+{
+  void foo (int &x);
+};
+
+struct G : public F
+{
+  void foo (int &);
+};
+
+struct C : A
+{
+  void foo (int &);
+  struct H : public G
+  {
+void foo (int &);
+  };
+  struct D : A
+  {
+H d;
+  };
+};
+
+void
+C::foo (int &x)
+{
+  D a;
+  x >> a.d;
+}


Jakub

Re: [RS6000] power8 internal compiler errors

2014-02-12 Thread David Edelsohn

On Mon, Feb 10, 2014 at 8:33 PM, Alan Modra  wrote:
> On Mon, Feb 10, 2014 at 07:01:03PM -0500, David Edelsohn wrote:
>> On Mon, Feb 10, 2014 at 5:18 PM, Alan Modra  wrote:
>>
>> Shouldn't addr_op2 also be set from find_replacement?
>
> Sorry, I thought after I sent the email that I should have added some
> explanation of why certain parts need find_replacement and others
> don't.  We want just those parts of addresses that might have been
> reloaded.
>
> There's the case of the entire address being reloaded (actually, I'm
> not sure this one is needed) and then all the ones we do in the rs6000
> backend in legitimize_reload_address.  I think I found all the
> required parts but it certainly won't hurt if you check too.  Calling
> find_replacement when not strictly necessary will slow down gcc a
> little..

I cannot tell if either branch of the PLUS could contain an address
that previously was replaced. I want to avoid slow downs, but I also
want to avoid leaving a latent bug, either now or a future change to
reload or IRA.

Is there an ENABLE_CHECKING assert of some form that could test
addr_op2 to ensure that it does contain part of the address that was
reloaded when not configured as a Release?

Thanks, David

Re: [PATCH][AARCH64]Resolves testsuite/gcc.target/aarch64/aapcs64/ret-func-1.c regression

2014-02-12 Thread Richard Earnshaw


On 12/02/14 15:18, Marcus Shawcroft wrote:

On 3 February 2014 10:02, Renlin Li  wrote:


2014-02-03  Renlin Li  

 * gcc.target/aarch64/aapcs64/validate_memory.h: move f32in64 and
i32in128 cases
 outside special big-endian processing block.


This is a test case fix.  This is ok with me but needs a release manager ack.

/Marcus



s/move/Move/

R.

Re: Clean up pretty printers [10/n]

2014-02-12 Thread Jakub Jelinek

On Thu, Aug 22, 2013 at 05:27:54AM -0500, Gabriel Dos Reis wrote:
>   * error.c (init_error): Remove calls to pp_construct and
>   pp_cxx_pretty_printer_init.  Initialize cxx_pp with placement-new.
>   * cxx-pretty-print.h (cxx_pretty_printer::cxx_pretty_printer): Declare.
>   (cxx_pretty_printer_init): Remove.
>   * cxx-pretty-print.c (cxx_pretty_printer::cxx_pretty_printer):
>   Rename from cxx_pretty_printer_init.  Adjust.
>   * cp-objcp-common.c (cxx_initialize_diagnostics): Simplify
>   initialization of C++ diagnostics pretty printer.

> --- cp/error.c(revision 201904)
> +++ cp/error.c(working copy)
> @@ -33,6 +33,8 @@
>  #include "pointer-set.h"
>  #include "c-family/c-objc.h"
>  
> +#include // For placement-new.
> +
>  #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
>  #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
>  
> @@ -109,8 +111,7 @@
>diagnostic_finalizer (global_dc) = cp_diagnostic_finalizer;
>diagnostic_format_decoder (global_dc) = cp_printer;
>  
> -  pp_construct (cxx_pp, NULL, 0);
> -  pp_cxx_pretty_printer_init (cxx_pp);
> +  new (cxx_pp) cxx_pretty_printer ();

Unfortunately this always leaks memory,
static cxx_pretty_printer scratch_pretty_printer;
is constructed first during static initialization, where it e.g. allocates
through XNEW the buffer, and then when init_error is called, it ignores
the old content of scratch_pretty_printer and constructs the same objects
again.  Why is the new (cxx_pp) cxx_pretty_printer (); line needed at all?
Isn't it already constructed with the right ctor?
Or, shouldn't it at least first try to call destructor?

I think there are more spots with placement new, perhaps all have the same
issue?

Jakub

Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread H.J. Lu

On Wed, Feb 12, 2014 at 12:20 AM, Uros Bizjak  wrote:
> On Tue, Feb 11, 2014 at 9:41 PM, H.J. Lu  wrote:
>
>> HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
>> pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
>> We already pass --32 to assembler on x86.  It should be OK to do it
>> in configure.  OK for trunk?
>
> This would break Solaris/x86 with as configurations, where this test
> currently passes, but would fail since as doesn't understand --32.
>

 How about passing --32 to as only for Linux?  OK to install?
>>>
>>> I'd rather do it for gas instead, which can be used on non-Linux
>>> systems, too.
>>>
>>
>> Sure.  Here is the new patch.  OK to install?
>
> Attached is slightly changed patch that follows established
> configure.ac code formatting. Please check if this version works for
> you.
>
> The patch is OK for mainline and release branches.
>

I checked it into trunk and will backport it later.

Thanks.

-- 
H.J.

[PATCH][AARCH64]Adjust address with offset assembler format

2014-02-12 Thread Renlin Li


Hi all,

This is a simple patch which adds a space between base register and 
offset during the address asm translation, making the output assembler 
code format consistent for aarch64 target.


Is it Okay for stage-1?

Kind regards,
Renlin Li


gcc/ChangeLog:

2014-02-12  Renlin Li  

* config/aarch64/aarch64.c (aarch64_print_operand_address): Adjust 
the output asm format

by adding a space  between base register and offset.diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d3c5cbc..50ecdd8 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3830,34 +3830,34 @@ aarch64_print_operand_address (FILE *f, rtx x)
 	if (addr.offset == const0_rtx)
 	  asm_fprintf (f, "[%s]", reg_names [REGNO (addr.base)]);
 	else
-	  asm_fprintf (f, "[%s,%wd]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, %wd]", reg_names [REGNO (addr.base)],
 		   INTVAL (addr.offset));
 	return;
 
   case ADDRESS_REG_REG:
 	if (addr.shift == 0)
-	  asm_fprintf (f, "[%s,%s]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, %s]", reg_names [REGNO (addr.base)],
 		   reg_names [REGNO (addr.offset)]);
 	else
-	  asm_fprintf (f, "[%s,%s,lsl %u]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, %s, lsl %u]", reg_names [REGNO (addr.base)],
 		   reg_names [REGNO (addr.offset)], addr.shift);
 	return;
 
   case ADDRESS_REG_UXTW:
 	if (addr.shift == 0)
-	  asm_fprintf (f, "[%s,w%d,uxtw]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, w%d, uxtw]", reg_names [REGNO (addr.base)],
 		   REGNO (addr.offset) - R0_REGNUM);
 	else
-	  asm_fprintf (f, "[%s,w%d,uxtw %u]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, w%d, uxtw %u]", reg_names [REGNO (addr.base)],
 		   REGNO (addr.offset) - R0_REGNUM, addr.shift);
 	return;
 
   case ADDRESS_REG_SXTW:
 	if (addr.shift == 0)
-	  asm_fprintf (f, "[%s,w%d,sxtw]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, w%d, sxtw]", reg_names [REGNO (addr.base)],
 		   REGNO (addr.offset) - R0_REGNUM);
 	else
-	  asm_fprintf (f, "[%s,w%d,sxtw %u]", reg_names [REGNO (addr.base)],
+	  asm_fprintf (f, "[%s, w%d, sxtw %u]", reg_names [REGNO (addr.base)],
 		   REGNO (addr.offset) - R0_REGNUM, addr.shift);
 	return;
 
@@ -3865,27 +3865,27 @@ aarch64_print_operand_address (FILE *f, rtx x)
 	switch (GET_CODE (x))
 	  {
 	  case PRE_INC:
-	asm_fprintf (f, "[%s,%d]!", reg_names [REGNO (addr.base)], 
+	asm_fprintf (f, "[%s, %d]!", reg_names [REGNO (addr.base)], 
 			 GET_MODE_SIZE (aarch64_memory_reference_mode));
 	return;
 	  case POST_INC:
-	asm_fprintf (f, "[%s],%d", reg_names [REGNO (addr.base)],
+	asm_fprintf (f, "[%s], %d", reg_names [REGNO (addr.base)],
 			 GET_MODE_SIZE (aarch64_memory_reference_mode));
 	return;
 	  case PRE_DEC:
-	asm_fprintf (f, "[%s,-%d]!", reg_names [REGNO (addr.base)],
+	asm_fprintf (f, "[%s, -%d]!", reg_names [REGNO (addr.base)],
 			 GET_MODE_SIZE (aarch64_memory_reference_mode));
 	return;
 	  case POST_DEC:
-	asm_fprintf (f, "[%s],-%d", reg_names [REGNO (addr.base)],
+	asm_fprintf (f, "[%s], -%d", reg_names [REGNO (addr.base)],
 			 GET_MODE_SIZE (aarch64_memory_reference_mode));
 	return;
 	  case PRE_MODIFY:
-	asm_fprintf (f, "[%s,%wd]!", reg_names [REGNO (addr.base)],
+	asm_fprintf (f, "[%s, %wd]!", reg_names [REGNO (addr.base)],
 			 INTVAL (addr.offset));
 	return;
 	  case POST_MODIFY:
-	asm_fprintf (f, "[%s],%wd", reg_names [REGNO (addr.base)],
+	asm_fprintf (f, "[%s], %wd", reg_names [REGNO (addr.base)],
 			 INTVAL (addr.offset));
 	return;
 	  default:
@@ -3894,7 +3894,7 @@ aarch64_print_operand_address (FILE *f, rtx x)
 	break;
 
   case ADDRESS_LO_SUM:
-	asm_fprintf (f, "[%s,#:lo12:", reg_names [REGNO (addr.base)]);
+	asm_fprintf (f, "[%s, #:lo12:", reg_names [REGNO (addr.base)]);
 	output_addr_const (f, addr.offset);
 	asm_fprintf (f, "]");
 	return;

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-12 Thread Teresa Johnson

On Wed, Feb 12, 2014 at 6:45 AM, Teresa Johnson  wrote:
> On Tue, Feb 11, 2014 at 6:13 PM, Xinliang David Li  wrote:
>> On Tue, Feb 11, 2014 at 5:36 PM, Teresa Johnson  wrote:
>>> On Tue, Feb 11, 2014 at 5:16 PM, Xinliang David Li  
>>> wrote:
 Why is call graph needed to determine whether to drop the profile?
>>>
>>> Because we detect this situation by looking for cases where the call
>>> edge counts greatly exceed the callee node count.
>>>

 If that is needed, it might be possible to leverage the ipa_profile
 pass as it will walk through all function nodes to do profile
 annotation. With this you can make decision to drop callee profile in
 caller's context.
>>>
>>> There are 2 ipa profiling passes, which are somewhat confusingly named
>>> (to me at least. =). This is being done during the first.
>>>
>>> The first is pass_ipa_tree_profile in tree-profile.c, but is a
>>> SIMPLE_IPA_PASS and has the name "profile" in the dump. The second is
>>> pass_ipa_profile in ipa-profile.c, which is an IPA_PASS and has the
>>> name "profile_estimate" in the dump. I assume you are suggesting to
>>> move this into the latter? But I'm not clear on what benefit that
>>> gives - the functions are not being traversed in order, so there is
>>> still the issue of needing to rebuild the cgraph after dropping
>>> profiles, which might be best done earlier as I have in the patch.
>>
>>
>> I meant the tree-profile one.  I think this might work: after all the
>> function's profile counts are annotated, add another walk of the the
>> call graph nodes to drop bad profiles before the the call graph is
>> rebuilt (Call graph does exist at that point).
>
> Ok, so it is already done in tree-profile. But it sounds like you are
> suggesting reordering it to just above where we update the calls and
> rebuild the cgraph the first time? As you noted in a follow-on email
> to me, the cgraph edges don't have the profile counts at that point
> (and neither do the nodes), so I would need to compare the count on
> the call's bb to the entry bb count of the callee. That should be
> doable, let me take a stab at it.

This works well. Tested on omnetpp which has some dropped profiles and
ensured that the behavior and output of the ipa tree profile phase is
the same. Re-running bootstrap and regression tests.

Here's the new patch. The only changes from the earlier patch are in
handle_missing_profiles, where we now get the counts off of the entry
and call stmt bbs, and in tree_profiling, where we call
handle_missing_profiles earlier and I have removed the outlined cgraph
rebuilding code since it doesn't need to be reinvoked.

Honza, does this look ok for trunk when stage 1 reopens? David, I can
send a similar patch for review to google-4_8 if it looks good.

Thanks,
Teresa

2014-02-12  Teresa Johnson  

* graphite.c (graphite_finalize): Pass new parameter.
* params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
* predict.c (tree_estimate_probability): New parameter.
(tree_estimate_probability_worker): Renamed from
tree_estimate_probability_driver.
(tree_reestimate_probability): New function.
(tree_estimate_probability_driver): Invoke
tree_estimate_probability_worker.
(freqs_to_counts): Move here from tree-inline.c.
(drop_profile): Re-estimated profiles when dropping counts.
(handle_missing_profiles): Drop for some non-zero functions as well,
get counts from bbs to support invocation before cgraph rebuild.
(counts_to_freqs): Remove code obviated by reestimation.
* predict.h (tree_estimate_probability): Update declaration.
* tree-inline.c (freqs_to_counts): Move to predict.c.
(copy_cfg_body): Remove code obviated by reestimation.
* tree-profile.c (tree_profiling): Invoke handle_missing_profiles
before cgraph rebuild.

Index: graphite.c
===
--- graphite.c  (revision 207436)
+++ graphite.c  (working copy)
@@ -247,7 +247,7 @@ graphite_finalize (bool need_cfg_cleanup_p)
   cleanup_tree_cfg ();
   profile_status_for_fn (cfun) = PROFILE_ABSENT;
   release_recorded_exits ();
-  tree_estimate_probability ();
+  tree_estimate_probability (false);
 }

   cloog_state_free (cloog_state);
Index: params.def
===
--- params.def  (revision 207436)
+++ params.def  (working copy)
@@ -44,6 +44,12 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME,
  "Maximal estimated outcome of branch considered predictable",
  2, 0, 50)

+DEFPARAM (PARAM_MIN_CALLER_REESTIMATE_RATIO,
+ "min-caller-reestimate-ratio",
+ "Minimum caller-to-callee node count ratio to force
reestimated branch "
+  "probabilities in callee (where 0 means only when callee
count is 0)",
+ 10, 0, 0)
+
 DEFPARAM (PARAM_INLINE_MIN_SPEEDUP,
  "inline-m

Re: PATCH: Compile g++.dg/opt/pr52727.C with -march=i686 for ia32

2014-02-12 Thread H.J. Lu

On Tue, Feb 4, 2014 at 8:44 AM, H.J. Lu  wrote:
> Hi,
>
> Revision 206943 caused a regression on g++.dg/opt/pr52727.C, which wasn't
> caught by my automated tester since SSE2 and SSE math are enabled by
> default.  This patch adds -march=i686 for ia32 when compiling
> g++.dg/opt/pr52727.C, which is required to reproduce the bug. OK to
> install?
>
> Thanks.
>
>
> H.J.
> ---
> 2014-02-04  H.J. Lu  
>
> * g++.dg/opt/pr52727.C: Compile with -march=i686 for ia32.
>
> diff --git a/gcc/testsuite/g++.dg/opt/pr52727.C 
> b/gcc/testsuite/g++.dg/opt/pr52727.C
> index 4dd3853..ed8b973 100644
> --- a/gcc/testsuite/g++.dg/opt/pr52727.C
> +++ b/gcc/testsuite/g++.dg/opt/pr52727.C
> @@ -1,5 +1,6 @@
>  // { dg-do compile }
>  // { dg-options "-g -Os" }
> +// { dg-additional-options "-march=i686" { target ia32 } }
>
>  int grow (int);
>  void fn (int);

Since -march=i686 is required for this testcase on x86, I
am checking in this as obvious fix.

-- 
H.J.

Re: [PATCH] Improve vec code generation

2014-02-12 Thread Trevor Saunders

On Wed, Feb 12, 2014 at 04:28:14PM +0100, Richard Biener wrote:
> On Wed, Feb 12, 2014 at 4:21 PM, Trevor Saunders  
> wrote:
> > On Wed, Feb 12, 2014 at 03:09:17PM +0100, Richard Biener wrote:
> >>
> >> The following aims to improve code generation for loops like
> >> those in tree-ssa-alias.c:
> >>
> >>   auto_vec component_refs1;
> >>   auto_vec component_refs2;
> >>
> >>   /* Create the stack of handled components for REF1.  */
> >>   while (handled_component_p (ref1))
> >> {
> >>   component_refs1.safe_push (ref1);
> >>   ref1 = TREE_OPERAND (ref1, 0);
> >> }
> >>
> >> It does so by making sure the vector itself doesn't necessarily
> >> escape (to vec.c:calculate_allocation) and by simplifying the
> >> way we identify a vector using auto storage.  In the end this
> >> results in us doing better CSE.  But not SRA unfortunately,
> >> as we have
> >>
> >>   component_refs1.m_header.m_alloc = 16;
> >>   component_refs1.m_header.m_using_auto_storage = 1;
> >>   component_refs1.m_header.m_num = 0;
> >>   component_refs1.D.56915.m_vec = &component_refs1.m_header;
> >>
> >> which is what keeps component_refs1 address-taken (no easy
> >> way out of that I fear).
> >>
> >> It also gets rid of the aliasing-suspicious
> >>
> >> this->m_vec = reinterpret_cast *>
> >> (&m_header);
> >>
> >> and the friend declaration in class auto_vec.
> >
> > over all I like this, only two small comments
> >
> >>   (vec_prefix::m_has_auto_buf): Rename to ...
> >>   (vec_prefix::m_using_auto_storage): ... this.
> >
> > One optimization that might be worth doing is if a vector is cleared
> > reseting m_vec to point at the internal storage in which case the old
> > name would be more accurate, on the other hand its probably rare that's
> > useful, and doing it this way is simpler.
> 
> Yeah, I thought about this but disregarded it as not-the-common case.

makes sense.

> >> ~auto_vec ()
> >> --- 1250,1257 
> >>   public:
> >> auto_vec ()
> >> {
> >> ! m_auto.embedded_init (MAX (N, 2), 0, 1);
> >> ! this->m_vec = &m_auto;
> >> }
> >>
> >> ~auto_vec ()
> >> *** public:
> >> *** 1246,1255 
> >> }
> >>
> >>   private:
> >> !   friend class vec;
> >> !
> >> !   vec_prefix m_header;
> >> !   T m_data[N];
> >>   };
> >>
> >>   /* auto_vec is a sub class of vec whose storage is released when it is
> >> --- 1260,1267 
> >> }
> >>
> >>   private:
> >> !   vec m_auto;
> >> !   T m_data[MAX (N - 1, 1)];
> >
> > why is the MAX() needed? auto_vec is specialized below so someone
> > would need to write auto_vec for N - 1 to be less than 1 which is
> > crazy, and seems like shouldn't be allowed to compile.
> 
> It's for the quite common auto_vec case where T m_data[N-1]
> doesn't work (zero-sized array is a GNU extension).  I've had the
> variant of simply allocating N + 1 elements (and thus using
> m_data[N]) but then the above is closer to what people expect (but
> sets the minimum auto size to two elements).

 I forgot about and then missed embedded vectors having the one internal
 element, so yeah this makes sense.

> Alternatively one could have templated vec

yeah, lets not do that its already way too template happy.

> itself with a size for its m_data member.  Or specialize auto_vec
> for N == 1 and leave out the m_data member for that.
> 
> I considered both not worth the trouble ;)

sgtm

Thanks!

Trev

> 
> If you prefer using m_data[N] and thus allocating one more element
> than requested in auto storage I can do that as well.
> 
> Thanks,
> Richard.
> 
> > thanks!
> >
> > Trev
> >
> >>   };
> >>
> >>   /* auto_vec is a sub class of vec whose storage is released when it is
> >> *** template
> >> *** 1396,1402 
> >>   inline bool
> >>   vec::reserve (unsigned nelems, bool exact 
> >> MEM_STAT_DECL)
> >>   {
> >> !   if (!nelems || space (nelems))
> >>   return false;
> >>
> >> /* For now play a game with va_heap::reserve to hide our auto storage 
> >> if any,
> >> --- 1408,1414 
> >>   inline bool
> >>   vec::reserve (unsigned nelems, bool exact 
> >> MEM_STAT_DECL)
> >>   {
> >> !   if (space (nelems))
> >>   return false;
> >>
> >> /* For now play a game with va_heap::reserve to hide our auto storage 
> >> if any,
> >> *** vec::release (void)
> >> *** 1462,1468 
> >>
> >> if (using_auto_storage ())
> >>   {
> >> !   static_cast *> (this)->m_header.m_num = 0;
> >> return;
> >>   }
> >>
> >> --- 1474,1480 
> >>
> >> if (using_auto_storage ())
> >>   {
> >> !   m_vec->m_vecpfx.m_num = 0;
> >> return;
> >>   }
> >>
> >> *** template
> >> *** 1705,1716 
> >>   inline bool
> >>   vec::using_auto_storage () const
> >>   {
> >> !   if (!m_vec->m_vecpfx.m_has_auto_buf)
> >> ! return false;
> >> !
> >> !   const vec_prefix *auto_header
> >> ! = &static_cast *> (this)->m_header;
> >> !   return reinterpret_cast (m_vec) == auto_header;

Re: [PATCH] Improve vec code generation

2014-02-12 Thread Trevor Saunders

On Wed, Feb 12, 2014 at 03:09:17PM +0100, Richard Biener wrote:
> 
> The following aims to improve code generation for loops like
> those in tree-ssa-alias.c:
> 
>   auto_vec component_refs1;
>   auto_vec component_refs2;
> 
>   /* Create the stack of handled components for REF1.  */
>   while (handled_component_p (ref1))
> {
>   component_refs1.safe_push (ref1);
>   ref1 = TREE_OPERAND (ref1, 0);
> }
> 
> It does so by making sure the vector itself doesn't necessarily
> escape (to vec.c:calculate_allocation) and by simplifying the
> way we identify a vector using auto storage.  In the end this
> results in us doing better CSE.  But not SRA unfortunately,
> as we have
> 
>   component_refs1.m_header.m_alloc = 16;
>   component_refs1.m_header.m_using_auto_storage = 1;
>   component_refs1.m_header.m_num = 0;
>   component_refs1.D.56915.m_vec = &component_refs1.m_header;
> 
> which is what keeps component_refs1 address-taken (no easy
> way out of that I fear).
> 
> It also gets rid of the aliasing-suspicious
> 
> this->m_vec = reinterpret_cast *> 
> (&m_header);
> 
> and the friend declaration in class auto_vec.

over all I like this, only two small comments

>   (vec_prefix::m_has_auto_buf): Rename to ...
>   (vec_prefix::m_using_auto_storage): ... this.

One optimization that might be worth doing is if a vector is cleared
reseting m_vec to point at the internal storage in which case the old
name would be more accurate, on the other hand its probably rare that's
useful, and doing it this way is simpler.

> ~auto_vec ()
> --- 1250,1257 
>   public:
> auto_vec ()
> {
> ! m_auto.embedded_init (MAX (N, 2), 0, 1);
> ! this->m_vec = &m_auto;
> }
>   
> ~auto_vec ()
> *** public:
> *** 1246,1255 
> }
>   
>   private:
> !   friend class vec;
> ! 
> !   vec_prefix m_header;
> !   T m_data[N];
>   };
>   
>   /* auto_vec is a sub class of vec whose storage is released when it is
> --- 1260,1267 
> }
>   
>   private:
> !   vec m_auto;
> !   T m_data[MAX (N - 1, 1)];

why is the MAX() needed? auto_vec is specialized below so someone
would need to write auto_vec for N - 1 to be less than 1 which is
crazy, and seems like shouldn't be allowed to compile.

thanks!

Trev

>   };
>   
>   /* auto_vec is a sub class of vec whose storage is released when it is
> *** template
> *** 1396,1402 
>   inline bool
>   vec::reserve (unsigned nelems, bool exact MEM_STAT_DECL)
>   {
> !   if (!nelems || space (nelems))
>   return false;
>   
> /* For now play a game with va_heap::reserve to hide our auto storage if 
> any,
> --- 1408,1414 
>   inline bool
>   vec::reserve (unsigned nelems, bool exact MEM_STAT_DECL)
>   {
> !   if (space (nelems))
>   return false;
>   
> /* For now play a game with va_heap::reserve to hide our auto storage if 
> any,
> *** vec::release (void)
> *** 1462,1468 
>   
> if (using_auto_storage ())
>   {
> !   static_cast *> (this)->m_header.m_num = 0;
> return;
>   }
>   
> --- 1474,1480 
>   
> if (using_auto_storage ())
>   {
> !   m_vec->m_vecpfx.m_num = 0;
> return;
>   }
>   
> *** template
> *** 1705,1716 
>   inline bool
>   vec::using_auto_storage () const
>   {
> !   if (!m_vec->m_vecpfx.m_has_auto_buf)
> ! return false;
> ! 
> !   const vec_prefix *auto_header
> ! = &static_cast *> (this)->m_header;
> !   return reinterpret_cast (m_vec) == auto_header;
>   }
>   
>   #if (GCC_VERSION >= 3000)
> --- 1717,1723 
>   inline bool
>   vec::using_auto_storage () const
>   {
> !   return m_vec->m_vecpfx.m_using_auto_storage;
>   }
>   
>   #if (GCC_VERSION >= 3000)

Re: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 03:14:23PM +, Iyer, Balaji V wrote:
> > The testcase is GPL as the original libgomp.c++/for-1.C testcase, so sure.
> > Perhaps it would be much better though if instead of having a compile time
> > testcase you'd just do what libgomp.c++/for-1.C does, just replace all the
> > #pragma omp parallel for in there with _Cilk_for and turn it into a runtime
> > testcase.
> > 

> I really don't want to do that because I don't think there is a 1:1
> match-up between the rules of #pragma omp for and _Cilk_for.

But there is nothing OpenMP specific on any of the tests, all could as well
be tested by removing all the #pragma omp ... lines and just be tested as
normal C+++ loops.  Is there anything that _Cilk_for wouldn't handle?

IMNSHO if you remove all the #pragma omp parallel for lines (even with any
clauses it sometimes has) and replace for with _Cilk_for on the following
line, you should have a valid Cilk+ program.

Only f11 would need to be changed from:
#pragma omp parallel
  {
#pragma omp for nowait
for (T i = x; i <= y; i += 3)
  baz (i);
#pragma omp single
{
  T j = y + 3;
  baz (j);
}
  }
to say:
  _Cilk_for (T i = x; i <= y; i += 3)
baz (i);
  {
T j = y + 3;
baz (j);
  }
so that it performs the same thing.

> > First a minor nit, there is extra newline before _Cilk_for:
> >   newline_and_indent (buffer, spc);
> >   if (flag_cilkplus
> >   && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
> > pp_string (buffer, "_Cilk_for (");
> >   else
> > pp_string (buffer, "for ("); I guess for _Cilk_for collapse is 
> > never > 1,
> > right?  If that is the case, then perhaps you should move the
> > newline_and_indent (buffer, spc); call into the else block.
> > 
> 
> OK. I will fix this and send you a patch? 

Sure.

> > More importantly, what is retval.1?  I'd expect you should be using retval.0
> > there and have it also as firstprivate(retval.0) on the parallel.
> > In *.omplower dump I actually see:
> > retval.0 = operator- (D.2885, &i); ...
> > retval.1 = operator- (D.2888, &i); i.e. 
> > operator- is
> > called twice.
> > 
> 
> Yes, one is for the if-clause and one is for the condition. It really doesn't 
> matter because we get of the stuff in the condition and replace with our own 
> for loop with something like the for-loop shown  below. So retval1 doesn't 
> come into picture. It is only alive from parser to the expand_cilk_for 
> function.
> 
> For (i = low; i < high; i++)
> {
>   <_Cilk_for_body>
> }

No, it really does matter.  Just look at the *.optimized dump with -O0 
-fcilkplus:

  retval.0_4 = operator- (_3, &i);
in _Z3foo1JIiE function and
  _4 = .omp_data_i_2(D)->j;
  _5 = J::end (_4);
  retval.1_6 = operator- (_5, &i);
  retval.3_7 = retval.1_6;
in _Z3foo1JIiE._cilk_for_fn.0.  All the 4 statements are dead, you really
shouldn't emit them, even when optimizing, if e.g. the operator- isn't
inline, g++ won't be able to optimize it away.

Jakub

Re: [PATCH] Improve vec code generation

2014-02-12 Thread Richard Biener

On Wed, Feb 12, 2014 at 4:21 PM, Trevor Saunders  wrote:
> On Wed, Feb 12, 2014 at 03:09:17PM +0100, Richard Biener wrote:
>>
>> The following aims to improve code generation for loops like
>> those in tree-ssa-alias.c:
>>
>>   auto_vec component_refs1;
>>   auto_vec component_refs2;
>>
>>   /* Create the stack of handled components for REF1.  */
>>   while (handled_component_p (ref1))
>> {
>>   component_refs1.safe_push (ref1);
>>   ref1 = TREE_OPERAND (ref1, 0);
>> }
>>
>> It does so by making sure the vector itself doesn't necessarily
>> escape (to vec.c:calculate_allocation) and by simplifying the
>> way we identify a vector using auto storage.  In the end this
>> results in us doing better CSE.  But not SRA unfortunately,
>> as we have
>>
>>   component_refs1.m_header.m_alloc = 16;
>>   component_refs1.m_header.m_using_auto_storage = 1;
>>   component_refs1.m_header.m_num = 0;
>>   component_refs1.D.56915.m_vec = &component_refs1.m_header;
>>
>> which is what keeps component_refs1 address-taken (no easy
>> way out of that I fear).
>>
>> It also gets rid of the aliasing-suspicious
>>
>> this->m_vec = reinterpret_cast *>
>> (&m_header);
>>
>> and the friend declaration in class auto_vec.
>
> over all I like this, only two small comments
>
>>   (vec_prefix::m_has_auto_buf): Rename to ...
>>   (vec_prefix::m_using_auto_storage): ... this.
>
> One optimization that might be worth doing is if a vector is cleared
> reseting m_vec to point at the internal storage in which case the old
> name would be more accurate, on the other hand its probably rare that's
> useful, and doing it this way is simpler.

Yeah, I thought about this but disregarded it as not-the-common case.

>> ~auto_vec ()
>> --- 1250,1257 
>>   public:
>> auto_vec ()
>> {
>> ! m_auto.embedded_init (MAX (N, 2), 0, 1);
>> ! this->m_vec = &m_auto;
>> }
>>
>> ~auto_vec ()
>> *** public:
>> *** 1246,1255 
>> }
>>
>>   private:
>> !   friend class vec;
>> !
>> !   vec_prefix m_header;
>> !   T m_data[N];
>>   };
>>
>>   /* auto_vec is a sub class of vec whose storage is released when it is
>> --- 1260,1267 
>> }
>>
>>   private:
>> !   vec m_auto;
>> !   T m_data[MAX (N - 1, 1)];
>
> why is the MAX() needed? auto_vec is specialized below so someone
> would need to write auto_vec for N - 1 to be less than 1 which is
> crazy, and seems like shouldn't be allowed to compile.

It's for the quite common auto_vec case where T m_data[N-1]
doesn't work (zero-sized array is a GNU extension).  I've had the
variant of simply allocating N + 1 elements (and thus using
m_data[N]) but then the above is closer to what people expect (but
sets the minimum auto size to two elements).

Alternatively one could have templated vec
itself with a size for its m_data member.  Or specialize auto_vec
for N == 1 and leave out the m_data member for that.

I considered both not worth the trouble ;)

If you prefer using m_data[N] and thus allocating one more element
than requested in auto storage I can do that as well.

Thanks,
Richard.

> thanks!
>
> Trev
>
>>   };
>>
>>   /* auto_vec is a sub class of vec whose storage is released when it is
>> *** template
>> *** 1396,1402 
>>   inline bool
>>   vec::reserve (unsigned nelems, bool exact 
>> MEM_STAT_DECL)
>>   {
>> !   if (!nelems || space (nelems))
>>   return false;
>>
>> /* For now play a game with va_heap::reserve to hide our auto storage if 
>> any,
>> --- 1408,1414 
>>   inline bool
>>   vec::reserve (unsigned nelems, bool exact 
>> MEM_STAT_DECL)
>>   {
>> !   if (space (nelems))
>>   return false;
>>
>> /* For now play a game with va_heap::reserve to hide our auto storage if 
>> any,
>> *** vec::release (void)
>> *** 1462,1468 
>>
>> if (using_auto_storage ())
>>   {
>> !   static_cast *> (this)->m_header.m_num = 0;
>> return;
>>   }
>>
>> --- 1474,1480 
>>
>> if (using_auto_storage ())
>>   {
>> !   m_vec->m_vecpfx.m_num = 0;
>> return;
>>   }
>>
>> *** template
>> *** 1705,1716 
>>   inline bool
>>   vec::using_auto_storage () const
>>   {
>> !   if (!m_vec->m_vecpfx.m_has_auto_buf)
>> ! return false;
>> !
>> !   const vec_prefix *auto_header
>> ! = &static_cast *> (this)->m_header;
>> !   return reinterpret_cast (m_vec) == auto_header;
>>   }
>>
>>   #if (GCC_VERSION >= 3000)
>> --- 1717,1723 
>>   inline bool
>>   vec::using_auto_storage () const
>>   {
>> !   return m_vec->m_vecpfx.m_using_auto_storage;
>>   }
>>
>>   #if (GCC_VERSION >= 3000)

Re: [PATCH][AARCH64]Resolves testsuite/gcc.target/aarch64/aapcs64/ret-func-1.c regression

2014-02-12 Thread Marcus Shawcroft

On 3 February 2014 10:02, Renlin Li  wrote:

> 2014-02-03  Renlin Li  
>
> * gcc.target/aarch64/aapcs64/validate_memory.h: move f32in64 and
> i32in128 cases
> outside special big-endian processing block.

This is a test case fix.  This is ok with me but needs a release manager ack.

/Marcus

RE: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Iyer, Balaji V



> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, February 12, 2014 9:59 AM
> To: Iyer, Balaji V
> Cc: 'Jason Merrill'; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'r...@redhat.com'
> Subject: Re: [PING] [PATCH] _Cilk_for for C and C++
> 
> On Mon, Feb 10, 2014 at 10:07:18PM +, Iyer, Balaji V wrote:
> > I looked at both but forgot to test them with my implementation. Sorry
> > about this.  I have fixed the ICE issue.  To make sure this does not
> > happen further, I have added your test cf3.C into test suite (renamed
> > to cf3.cc).  I hope that is OK with you.
> 
> The testcase is GPL as the original libgomp.c++/for-1.C testcase, so sure.
> Perhaps it would be much better though if instead of having a compile time
> testcase you'd just do what libgomp.c++/for-1.C does, just replace all the
> #pragma omp parallel for in there with _Cilk_for and turn it into a runtime
> testcase.
> 
I really don't want to do that because I don't think there is a 1:1 match-up 
between the rules of #pragma omp for and _Cilk_for. 

> > I have attached a fixed patch and Changelogs. Is this OK?
> 
> Looks better (note, still looking just at the dumps), but not completely ok
> yet.  On cf3.cc, I see in *.gimple:
> 
> D.2883 = J::begin (j);
> I::I (&i, D.2883);
> D.2885 = J::end (j);
> retval.0 = operator- (D.2885, &i);
> D.2886 = retval.0 / 2;
> #pragma omp parallel firstprivate(i) if(D.2886) shared(D.2865) 
> shared(j)
>   {
> difference_type retval.1;
> const struct I & D.2888;
> const difference_type D.2866;
> long int D.2889;
> struct I & D.2890;
> 
> try
>   {
> 
> _Cilk_for (D.2864 = 0; D.2864 < retval.1; D.2864 = D.2864 + 2)
> private(D.2864)
>   {
> D.2889 = D.2864 - D.2865;
> D.2866 = D.2889;
> try
>   {
> D.2890 = I::operator+= (&i, &D.2866);
> 
> First a minor nit, there is extra newline before _Cilk_for:
>   newline_and_indent (buffer, spc);
>   if (flag_cilkplus
>   && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
> pp_string (buffer, "_Cilk_for (");
>   else
> pp_string (buffer, "for ("); I guess for _Cilk_for collapse is 
> never > 1,
> right?  If that is the case, then perhaps you should move the
> newline_and_indent (buffer, spc); call into the else block.
> 

OK. I will fix this and send you a patch? 

> More importantly, what is retval.1?  I'd expect you should be using retval.0
> there and have it also as firstprivate(retval.0) on the parallel.
> In *.omplower dump I actually see:
> retval.0 = operator- (D.2885, &i); ...
> retval.1 = operator- (D.2888, &i); i.e. 
> operator- is
> called twice.
> 

Yes, one is for the if-clause and one is for the condition. It really doesn't 
matter because we get of the stuff in the condition and replace with our own 
for loop with something like the for-loop shown  below. So retval1 doesn't come 
into picture. It is only alive from parser to the expand_cilk_for function.

For (i = low; i < high; i++)
{
<_Cilk_for_body>
}

So, is there any other changes that you need me to make?

Thanks,

Balaji V. Iyer.


>   Jakub

[PATCH] Swap ealias and esra pass ordering

2014-02-12 Thread Richard Biener


As said in the other mail esra corrupts ealias info by eventually
adding new PHI nodes but at the same time doesn't benefit from
points-to info.  Also esra may improve PTA precision by exposing
more SSA names.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progres.

Queued for stage1.

Richard.

2014-02-12  Richard Biener  

* passes.def: Move early points-to after early SRA.

Index: gcc/passes.def
===
--- gcc/passes.def  (revision 207725)
+++ gcc/passes.def  (working copy)
@@ -68,10 +68,10 @@ along with GCC; see the file COPYING3.
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);
+ NEXT_PASS (pass_sra_early);
  /* pass_build_ealias is a dummy pass that ensures that we
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
- NEXT_PASS (pass_sra_early);
  NEXT_PASS (pass_fre);
  NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_merge_phi);

Re: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Jakub Jelinek

On Mon, Feb 10, 2014 at 10:07:18PM +, Iyer, Balaji V wrote:
> I looked at both but forgot to test them with my implementation. Sorry
> about this.  I have fixed the ICE issue.  To make sure this does not
> happen further, I have added your test cf3.C into test suite (renamed to
> cf3.cc).  I hope that is OK with you.

The testcase is GPL as the original libgomp.c++/for-1.C testcase, so sure.
Perhaps it would be much better though if instead of having a compile time
testcase you'd just do what libgomp.c++/for-1.C does, just replace all the
#pragma omp parallel for in there with _Cilk_for and turn it into a runtime
testcase.

> I have attached a fixed patch and Changelogs. Is this OK?

Looks better (note, still looking just at the dumps), but not completely ok
yet.  On cf3.cc, I see in *.gimple:

D.2883 = J::begin (j);
I::I (&i, D.2883);
D.2885 = J::end (j);
retval.0 = operator- (D.2885, &i);
D.2886 = retval.0 / 2;
#pragma omp parallel firstprivate(i) if(D.2886) shared(D.2865) shared(j)
  {
difference_type retval.1;
const struct I & D.2888;
const difference_type D.2866;
long int D.2889;
struct I & D.2890;

try
  {

_Cilk_for (D.2864 = 0; D.2864 < retval.1; D.2864 = D.2864 + 2) 
private(D.2864)
  {
D.2889 = D.2864 - D.2865;
D.2866 = D.2889;
try
  {
D.2890 = I::operator+= (&i, &D.2866);

First a minor nit, there is extra newline before _Cilk_for:
  newline_and_indent (buffer, spc);
  if (flag_cilkplus
  && gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR)
pp_string (buffer, "_Cilk_for (");
  else
pp_string (buffer, "for (");
I guess for _Cilk_for collapse is never > 1, right?  If that is the case,
then perhaps you should move the newline_and_indent (buffer, spc); call
into the else block.

More importantly, what is retval.1?  I'd expect you should be using retval.0
there and have it also as firstprivate(retval.0) on the parallel.
In *.omplower dump I actually see:
retval.0 = operator- (D.2885, &i);
...
retval.1 = operator- (D.2888, &i);
i.e. operator- is called twice.

Jakub

[COMMITTED] Fix typo in dg-error invocation.

2014-02-12 Thread Thomas Schwinge

From: tschwinge 

gcc/testsuite/
* c-c++-common/raw-string-3.c: Fix typo in dg-error invocation.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@207725 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog   | 2 ++
 gcc/testsuite/c-c++-common/raw-string-3.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 46d5ff8..d07a7a2 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2014-02-12  Thomas Schwinge  
 
+   * c-c++-common/raw-string-3.c: Fix typo in dg-error invocation.
+
* gcc.dg/cilk-plus/jump-openmp.c: New file.
 
 2014-02-12  Richard Biener  
diff --git gcc/testsuite/c-c++-common/raw-string-3.c 
gcc/testsuite/c-c++-common/raw-string-3.c
index f20337f..1dde113 100644
--- gcc/testsuite/c-c++-common/raw-string-3.c
+++ gcc/testsuite/c-c++-common/raw-string-3.c
@@ -7,7 +7,7 @@
 const void *s0 = R"(a)";   // { dg-error "was not 
declared|undeclared" "undeclared" }
 // { dg-error "expected ',' or ';'" "expected" { target c } 7 }
 const void *s1 = uR"(a)";  // { dg-error "was not 
declared|undeclared" "undeclared" }
-// { dg-error "expected ',' or ';'" expected" { target c } 9 }
+// { dg-error "expected ',' or ';'" "expected" { target c } 9 }
 const void *s2 = UR"(a)";  // { dg-error "was not 
declared|undeclared" "undeclared" }
 // { dg-error "expected ',' or ';'" "expected" { target c } 11 
}
 const void *s3 = u8R"(a)"; // { dg-error "was not 
declared|undeclared" "undeclared" }
-- 
1.8.1.1

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-12 Thread Teresa Johnson

On Tue, Feb 11, 2014 at 6:13 PM, Xinliang David Li  wrote:
> On Tue, Feb 11, 2014 at 5:36 PM, Teresa Johnson  wrote:
>> On Tue, Feb 11, 2014 at 5:16 PM, Xinliang David Li  
>> wrote:
>>> Why is call graph needed to determine whether to drop the profile?
>>
>> Because we detect this situation by looking for cases where the call
>> edge counts greatly exceed the callee node count.
>>
>>>
>>> If that is needed, it might be possible to leverage the ipa_profile
>>> pass as it will walk through all function nodes to do profile
>>> annotation. With this you can make decision to drop callee profile in
>>> caller's context.
>>
>> There are 2 ipa profiling passes, which are somewhat confusingly named
>> (to me at least. =). This is being done during the first.
>>
>> The first is pass_ipa_tree_profile in tree-profile.c, but is a
>> SIMPLE_IPA_PASS and has the name "profile" in the dump. The second is
>> pass_ipa_profile in ipa-profile.c, which is an IPA_PASS and has the
>> name "profile_estimate" in the dump. I assume you are suggesting to
>> move this into the latter? But I'm not clear on what benefit that
>> gives - the functions are not being traversed in order, so there is
>> still the issue of needing to rebuild the cgraph after dropping
>> profiles, which might be best done earlier as I have in the patch.
>
>
> I meant the tree-profile one.  I think this might work: after all the
> function's profile counts are annotated, add another walk of the the
> call graph nodes to drop bad profiles before the the call graph is
> rebuilt (Call graph does exist at that point).

Ok, so it is already done in tree-profile. But it sounds like you are
suggesting reordering it to just above where we update the calls and
rebuild the cgraph the first time? As you noted in a follow-on email
to me, the cgraph edges don't have the profile counts at that point
(and neither do the nodes), so I would need to compare the count on
the call's bb to the entry bb count of the callee. That should be
doable, let me take a stab at it.

Thanks,
Teresa

>
> David
>>
>> Teresa
>>
>>>
>>> David
>>>
>>> On Tue, Feb 11, 2014 at 5:04 PM, Teresa Johnson  
>>> wrote:
 On Tue, Feb 11, 2014 at 2:56 PM, Xinliang David Li  
 wrote:
> Is it better to add some logic in counts_to_freq to determine if the
> profile count needs to be dropped completely to force profile
> estimation?

 This is the problem I was mentioning below where we call
 counts_to_freqs before we have the cgraph and can tell that we will
 need to drop the profile. When we were only dropping the profile for
 functions with all 0 counts, we simply avoided doing the
 counts_to_freqs when the counts were all 0, which works since the 0
 counts don't leave things in an inconsistent state (counts vs
 estimated frequencies).

 Teresa

>
> David
>
> On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson  
> wrote:
>> This patch attempts to address the lost profile issue for COMDATs in
>> more circumstances, exposed by function splitting.
>>
>> My earlier patch handled the case where the comdat had 0 counts since
>> the linker kept the copy in a different module. In that case we
>> prevent the guessed frequencies on 0-count functions from being
>> dropped by counts_to_freq, and then later mark any reached via
>> non-zero callgraph edges as guessed. Finally, when one such 0-count
>> comdat is inlined the call count is propagated to the callee blocks
>> using the guessed probabilities.
>>
>> However, in this case, there was a comdat that had a very small
>> non-zero count, that was being inlined to a much hotter callsite. This
>> could happen when there was a copy that was ipa-inlined
>> in the profile gen compile, so the copy in that module gets some
>> non-zero counts from the ipa inlined instance, but when the out of
>> line copy was eliminated by the linker (selected from a different
>> module). In this case the inliner was scaling the bb counts up quite a
>> lot when inlining. The problem is that you most likely can't trust
>> that the 0 count bbs in such a case are really not executed by the
>> callsite it is being into, since the counts are very small and
>> correspond to a different callsite. In some internal C++ applications
>> I am seeing that we execute out of the split cold portion of COMDATs
>> for this reason.
>>
>> This problem is more complicated to address than the 0-count instance,
>> because we need the cgraph to determine which functions to drop the
>> profile on, and at that point the estimated frequencies have already
>> been overwritten by counts_to_freqs. To handle this broader case, I
>> have changed the drop_profile routine to simply re-estimate the
>> probabilities/frequencies (and translate these into counts scaled from
>> the incoming call edge counts). This unfortunately n

[PATCH] RTL PRE speedup

2014-02-12 Thread Richard Biener


No magic bullet for the insn-recog.c testcase I'm looking at but
quite obvious and should save half of the canon_true_dependence calls
for blocks that end up non-transparent.

Now one of the main weakness of compute_transp is that we call it
for all expressions but for the MEM case it's certainly not
interesting to know whether it is transparent in _all_ blocks
but just a subset (dependent on the actual transform we do - either
where it's available or where its anticipated).

Committed as obvious.

Richard.

2014-02-12  Richard Biener  

* gcse.c (compute_transp): break from loop over canon_modify_mem_list
when we found a dependence.

Index: gcc/gcse.c
===
*** gcc/gcse.c  (revision 207718)
--- gcc/gcse.c  (working copy)
*** compute_transp (const_rtx x, int indx, s
*** 1735,1741 
  
  if (canon_true_dependence (dest, GET_MODE (dest),
 dest_addr, x, x_addr))
!   bitmap_clear_bit (bmap[bb_index], indx);
}
}
}
--- 1735,1744 
  
  if (canon_true_dependence (dest, GET_MODE (dest),
 dest_addr, x, x_addr))
!   {
! bitmap_clear_bit (bmap[bb_index], indx);
! break;
!   }
}
}
}

[PING] RE: [PING] [PATCH] _Cilk_for for C and C++

2014-02-12 Thread Iyer, Balaji V

Hi Jakub,
Did you get a chance to look at this patch?

Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Iyer, Balaji V
> Sent: Monday, February 10, 2014 5:07 PM
> To: Jakub Jelinek
> Cc: 'Jason Merrill'; 'Jeff Law'; 'Aldy Hernandez'; 'gcc-patches@gcc.gnu.org';
> 'r...@redhat.com'
> Subject: RE: [PING] [PATCH] _Cilk_for for C and C++
> 
> Hi Jakub,
> 
> > -Original Message-
> > From: Jakub Jelinek [mailto:ja...@redhat.com]
> > Sent: Monday, February 10, 2014 12:58 PM
> > To: Iyer, Balaji V
> > Cc: 'Jason Merrill'; 'Jeff Law'; 'Aldy Hernandez';
> > 'gcc-patches@gcc.gnu.org'; 'r...@redhat.com'
> > Subject: Re: [PING] [PATCH] _Cilk_for for C and C++
> >
> > On Fri, Feb 07, 2014 at 10:14:21PM +, Iyer, Balaji V wrote:
> > >   Attached, please find a fixed patch. Along with it, I have also
> > > added
> > > 2 changelog files for C and C++ respectively.
> >
> > Have you even looked at the second testcase I've posted?
> > gimplification ICEs on it with your latest patch, because firstprivate
> > clause is added for the same variable multiple times, and it seems
> > parallel still isn't around _Cilk_for.
> 
> I looked at both but forgot to test them with my implementation. Sorry
> about this. I have fixed the ICE issue. To make sure this does not happen
> further, I have added your test cf3.C into test suite (renamed to cf3.cc). I
> hope that is OK with you.
> 
> I have attached a fixed patch and Changelogs. Is this OK?
> 
> Thanks,
> 
> Balaji V. Iyer.
> 
> >
> > Jakub

Re: [PATCH] Improve vec code generation

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 03:09:17PM +0100, Richard Biener wrote:
> 2014-02-12  Richard Biener  
> 
>   * vec.c (vec_prefix::calculate_allocation): Move as
>   inline variant to vec.h.
>   (vec_prefix::calculate_allocation_1): New out-of-line version.
>   * vec.h (vec_prefix::calculate_allocation_1): Declare.
>   (vec_prefix::m_has_auto_buf): Rename to ...
>   (vec_prefix::m_using_auto_storage): ... this.
>   (vec_prefix::calculate_allocation): Inline the easy cases
>   and dispatch to calculate_allocation_1 which doesn't need the
>   prefix address.
>   (va_heap::reserve): Use gcc_checking_assert.
>   (vec::embedded_init): Add argument to initialize
>   m_using_auto_storage.
>   (auto_vec): Change m_vecpfx member to a vec
>   member and adjust.
>   (vec::reserve): Remove redundant check.
>   (vec::release): Avoid casting.
>   (vec::using_auto_storage): Simplify.

Ok.

Jakub

[PATCH] Improve vec code generation

2014-02-12 Thread Richard Biener


The following aims to improve code generation for loops like
those in tree-ssa-alias.c:

  auto_vec component_refs1;
  auto_vec component_refs2;

  /* Create the stack of handled components for REF1.  */
  while (handled_component_p (ref1))
{
  component_refs1.safe_push (ref1);
  ref1 = TREE_OPERAND (ref1, 0);
}

It does so by making sure the vector itself doesn't necessarily
escape (to vec.c:calculate_allocation) and by simplifying the
way we identify a vector using auto storage.  In the end this
results in us doing better CSE.  But not SRA unfortunately,
as we have

  component_refs1.m_header.m_alloc = 16;
  component_refs1.m_header.m_using_auto_storage = 1;
  component_refs1.m_header.m_num = 0;
  component_refs1.D.56915.m_vec = &component_refs1.m_header;

which is what keeps component_refs1 address-taken (no easy
way out of that I fear).

It also gets rid of the aliasing-suspicious

this->m_vec = reinterpret_cast *> 
(&m_header);

and the friend declaration in class auto_vec.

Bootstrap / regtest ongoing on x86_64-unknown-linux-gnu.  Ok at this
stage if it works ok?

Thanks,
Richard.

2014-02-12  Richard Biener  

* vec.c (vec_prefix::calculate_allocation): Move as
inline variant to vec.h.
(vec_prefix::calculate_allocation_1): New out-of-line version.
* vec.h (vec_prefix::calculate_allocation_1): Declare.
(vec_prefix::m_has_auto_buf): Rename to ...
(vec_prefix::m_using_auto_storage): ... this.
(vec_prefix::calculate_allocation): Inline the easy cases
and dispatch to calculate_allocation_1 which doesn't need the
prefix address.
(va_heap::reserve): Use gcc_checking_assert.
(vec::embedded_init): Add argument to initialize
m_using_auto_storage.
(auto_vec): Change m_vecpfx member to a vec
member and adjust.
(vec::reserve): Remove redundant check.
(vec::release): Avoid casting.
(vec::using_auto_storage): Simplify.

Index: gcc/vec.c
===
*** gcc/vec.c   (revision 207718)
--- gcc/vec.c   (working copy)
*** vec_prefix::release_overhead (void)
*** 171,216 
  
  
  /* Calculate the number of slots to reserve a vector, making sure that
!RESERVE slots are free.  If EXACT grow exactly, otherwise grow
!exponentially.  PFX is the control data for the vector.  */
  
  unsigned
! vec_prefix::calculate_allocation (vec_prefix *pfx, unsigned reserve,
! bool exact)
  {
-   unsigned alloc = 0;
-   unsigned num = 0;
- 
-   if (pfx)
- {
-   alloc = pfx->m_alloc;
-   num = pfx->m_num;
- }
-   else if (!reserve)
- gcc_unreachable ();
- 
/* We must have run out of room.  */
!   gcc_assert (alloc - num < reserve);
  
!   if (exact)
! /* Exact size.  */
! alloc = num + reserve;
else
! {
!   /* Exponential growth. */
!   if (!alloc)
!   alloc = 4;
!   else if (alloc < 16)
!   /* Double when small.  */
!   alloc = alloc * 2;
!   else
!   /* Grow slower when large.  */
!   alloc = (alloc * 3 / 2);
! 
!   /* If this is still too small, set it to the right size. */
!   if (alloc < num + reserve)
!   alloc = num + reserve;
! }
return alloc;
  }
  
--- 171,197 
  
  
  /* Calculate the number of slots to reserve a vector, making sure that
!it is of at least DESIRED size by growing ALLOC exponentially.  */
  
  unsigned
! vec_prefix::calculate_allocation_1 (unsigned alloc, unsigned desired)
  {
/* We must have run out of room.  */
!   gcc_assert (alloc < desired);
  
!   /* Exponential growth. */
!   if (!alloc)
! alloc = 4;
!   else if (alloc < 16)
! /* Double when small.  */
! alloc = alloc * 2;
else
! /* Grow slower when large.  */
! alloc = (alloc * 3 / 2);
! 
!   /* If this is still too small, set it to the right size. */
!   if (alloc < desired)
! alloc = desired;
return alloc;
  }
  
Index: gcc/vec.h
===
*** gcc/vec.h   (revision 207718)
--- gcc/vec.h   (working copy)
*** struct vec_prefix
*** 218,223 
--- 218,224 
void register_overhead (size_t, const char *, int, const char *);
void release_overhead (void);
static unsigned calculate_allocation (vec_prefix *, unsigned, bool);
+   static unsigned calculate_allocation_1 (unsigned, unsigned);
  
/* Note that vec_prefix should be a base class for vec, but we use
   offsetof() on vector fields of tree structures (e.g.,
*** struct vec_prefix
*** 233,242 
friend struct va_heap;
  
unsigned m_alloc : 31;
!   unsigned m_has_auto_buf : 1;
unsigned m_num;
  };
  
  template struct vec;
  
  /* Valid vector layouts
--- 234,258 
friend struct va_heap;
  
unsigned m_alloc : 31;
!   unsigned m_using_auto_storage : 1;
unsigned m_num;
  };
  
+ /

Re: [ARM] [Churn] Comments on cost tables should have lower-case identifiers.

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 12:46:04PM +, Ramana Radhakrishnan wrote:
> Ok by me but I'd like the RM ack.

Ok.

> >2014-02-12  James Greenhalgh  
> >
> > * config/arm/aarch-cost-tables

The filename is config/arm/aarch-cost-tables.h, and
the (...) if it fits should just be on the same line as the filename.

> > (generic_extra_costs): Fix identifiers in comments.
> > (cortexa53_extra_costs): Likewise.
> > * config/arm/arm.c
> > (cortexa9_extra_costs): Fix identifiers in comments.
> > (cortexa7_extra_costs): Likewise.
> > (cortexa12_extra_costs): Likewise.
> > (cortexa15_extra_costs): Likewise.
> > (v7m_extra_costs): Likewise.

Jakub

Re: [ARM] [Churn] Comments on cost tables should have lower-case identifiers.

2014-02-12 Thread Ramana Radhakrishnan


On 02/12/14 12:34, James Greenhalgh wrote:


Hi,

This patch fixes a nuisance with the comments in the cost tables.
Presently they are in sentence case, which to any casual observer
(i.e. me) would suggest that the field name should begin with
an upper-case character, .Fma for example. Also irritating is trying
to find the cost of the real .fma, which a case-sensitive grep won't
spot. GNU style explicitly calls this sort of mangling of identifiers
out as bad practice.

I've checked that both AArch64 and ARM toolchains still build after
this minor change.

OK for trunk/stage-1 (I don't mind which)?


I'd prefer to take this now as it is effectively only a fix for comments 
and the patch only affects comments. Let's finish the churn for 4.9.


Ok by me but I'd like the RM ack.

Ramana



Thanks,
James

---
gcc/

2014-02-12  James Greenhalgh  

* config/arm/aarch-cost-tables
(generic_extra_costs): Fix identifiers in comments.
(cortexa53_extra_costs): Likewise.
* config/arm/arm.c
(cortexa9_extra_costs): Fix identifiers in comments.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

Re: [ARM] [Trivial] Fix shortening of field name extend.

2014-02-12 Thread Ramana Radhakrishnan


On 02/12/14 12:19, James Greenhalgh wrote:


Hi,

In aarch-common-protos.h we define a field in alu_cost_table:

   "extnd"

On its own this is an upsetting optimization of the
English language, but this trouble is compounded by the
comment attached to this field throughout the cost tables
themselves:

   /* Extend.  */

This patch fixes the spelling of extend to match that in the
commemnts.

I've checked that AArch64 and AArch32 build with this patch
applied.

OK for trunk/stage-1 (I don't mind which)?


I am happy for this to go in now -

Jakub ?


regards
Ramana



Thanks,
James

---
2014-02-12  James Greenhalgh  

* config/arm/aarch-common-protos.h
(alu_cost_table): Fix spelling of "extend".
* config/arm/arm.c (arm_new_rtx_costs): Fix spelling of "extend".




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

[ARM] [Churn] Comments on cost tables should have lower-case identifiers.

2014-02-12 Thread James Greenhalgh


Hi,

This patch fixes a nuisance with the comments in the cost tables.
Presently they are in sentence case, which to any casual observer
(i.e. me) would suggest that the field name should begin with
an upper-case character, .Fma for example. Also irritating is trying
to find the cost of the real .fma, which a case-sensitive grep won't
spot. GNU style explicitly calls this sort of mangling of identifiers
out as bad practice.

I've checked that both AArch64 and ARM toolchains still build after
this minor change.

OK for trunk/stage-1 (I don't mind which)?

Thanks,
James

---
gcc/

2014-02-12  James Greenhalgh  

* config/arm/aarch-cost-tables
(generic_extra_costs): Fix identifiers in comments.
(cortexa53_extra_costs): Likewise.
* config/arm/arm.c
(cortexa9_extra_costs): Fix identifiers in comments.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 690ef9b..c30ea2f 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -26,99 +26,99 @@ const struct cpu_cost_table generic_extra_costs =
 {
   /* ALU */
   {
-0,			/* Arith.  */
-0,			/* Logical.  */
-0,			/* Shift.  */
-COSTS_N_INSNS (1),	/* Shift_reg.  */
-0,			/* Arith_shift.  */
-COSTS_N_INSNS (1),	/* Arith_shift_reg.  */
-0,			/* Log_shift.  */
-COSTS_N_INSNS (1),	/* Log_shift_reg.  */
-0,			/* Extend.  */
-COSTS_N_INSNS (1),	/* Extend_arith.  */
-0,			/* Bfi.  */
-0,			/* Bfx.  */
-0,			/* Clz.  */
+0,			/* arith.  */
+0,			/* logical.  */
+0,			/* shift.  */
+COSTS_N_INSNS (1),	/* shift_reg.  */
+0,			/* arith_shift.  */
+COSTS_N_INSNS (1),	/* arith_shift_reg.  */
+0,			/* log_shift.  */
+COSTS_N_INSNS (1),	/* log_shift_reg.  */
+0,			/* extend.  */
+COSTS_N_INSNS (1),	/* extend_arith.  */
+0,			/* bfi.  */
+0,			/* bfx.  */
+0,			/* clz.  */
 COSTS_N_INSNS (1),	/* non_exec.  */
 false		/* non_exec_costs_exec.  */
   },
   {
 /* MULT SImode */
 {
-  COSTS_N_INSNS (2),	/* Simple.  */
-  COSTS_N_INSNS (1),	/* Flag_setting.  */
-  COSTS_N_INSNS (2),	/* Extend.  */
-  COSTS_N_INSNS (3),	/* Add.  */
-  COSTS_N_INSNS (3),	/* Extend_add.  */
-  COSTS_N_INSNS (8)		/* Idiv.  */
+  COSTS_N_INSNS (2),	/* simple.  */
+  COSTS_N_INSNS (1),	/* flag_setting.  */
+  COSTS_N_INSNS (2),	/* extend.  */
+  COSTS_N_INSNS (3),	/* add.  */
+  COSTS_N_INSNS (3),	/* extend_add.  */
+  COSTS_N_INSNS (8)		/* idiv.  */
 },
 /* MULT DImode */
 {
-  0,			/* Simple (N/A).  */
-  0,			/* Flag_setting (N/A).  */
-  COSTS_N_INSNS (2),	/* Extend.  */
-  0,			/* Add (N/A).  */
-  COSTS_N_INSNS (3),	/* Extend_add.  */
-  0/* Idiv (N/A).  */
+  0,			/* simple (N/A).  */
+  0,			/* flag_setting (N/A).  */
+  COSTS_N_INSNS (2),	/* extend.  */
+  0,			/* add (N/A).  */
+  COSTS_N_INSNS (3),	/* extend_add.  */
+  0/* idiv (N/A).  */
 }
   },
   /* LD/ST */
   {
-COSTS_N_INSNS (2),	/* Load.  */
-COSTS_N_INSNS (2),	/* Load_sign_extend.  */
-COSTS_N_INSNS (3),	/* Ldrd.  */
-COSTS_N_INSNS (2),	/* Ldm_1st.  */
-1,			/* Ldm_regs_per_insn_1st.  */
-1,			/* Ldm_regs_per_insn_subsequent.  */
-COSTS_N_INSNS (2),	/* Loadf.  */
-COSTS_N_INSNS (3),	/* Loadd.  */
-COSTS_N_INSNS (1),  /* Load_unaligned.  */
-COSTS_N_INSNS (2),	/* Store.  */
-COSTS_N_INSNS (3),	/* Strd.  */
-COSTS_N_INSNS (2),	/* Stm_1st.  */
-1,			/* Stm_regs_per_insn_1st.  */
-1,			/* Stm_regs_per_insn_subsequent.  */
-COSTS_N_INSNS (2),	/* Storef.  */
-COSTS_N_INSNS (3),	/* Stored.  */
-COSTS_N_INSNS (1)  /* Store_unaligned.  */
+COSTS_N_INSNS (2),	/* load.  */
+COSTS_N_INSNS (2),	/* load_sign_extend.  */
+COSTS_N_INSNS (3),	/* ldrd.  */
+COSTS_N_INSNS (2),	/* ldm_1st.  */
+1,			/* ldm_regs_per_insn_1st.  */
+1,			/* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (2),	/* loadf.  */
+COSTS_N_INSNS (3),	/* loadd.  */
+COSTS_N_INSNS (1),  /* load_unaligned.  */
+COSTS_N_INSNS (2),	/* store.  */
+COSTS_N_INSNS (3),	/* strd.  */
+COSTS_N_INSNS (2),	/* stm_1st.  */
+1,			/* stm_regs_per_insn_1st.  */
+1,			/* stm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (2),	/* storef.  */
+COSTS_N_INSNS (3),	/* stored.  */
+COSTS_N_INSNS (1)  /* store_unaligned.  */
   },
   {
 /* FP SFmode */
 {
-  COSTS_N_INSNS (7),	/* Div.  */
-  COSTS_N_INSNS (2),	/* Mult.  */
-  COSTS_N_INSNS (3),	/* Mult_addsub.  */
-  COSTS_N_INSNS (3),	/* Fma.  */
-  COSTS_N_INSNS (1),	/* Addsub.  */
-  0,			/* Fpconst.  */
-  0,			/* Neg.  */
-  0,			/* Compare.  */
-  0,			/* Widen.  */
-

[ARM] [Trivial] Fix shortening of field name extend.

2014-02-12 Thread James Greenhalgh


Hi,

In aarch-common-protos.h we define a field in alu_cost_table:

  "extnd"

On its own this is an upsetting optimization of the
English language, but this trouble is compounded by the
comment attached to this field throughout the cost tables
themselves:

  /* Extend.  */

This patch fixes the spelling of extend to match that in the
commemnts.

I've checked that AArch64 and AArch32 build with this patch
applied.

OK for trunk/stage-1 (I don't mind which)?

Thanks,
James

---
2014-02-12  James Greenhalgh  

* config/arm/aarch-common-protos.h
(alu_cost_table): Fix spelling of "extend".
* config/arm/arm.c (arm_new_rtx_costs): Fix spelling of "extend".
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 056fe56..a5ff6b4 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -48,8 +48,8 @@ struct alu_cost_table
   const int arith_shift_reg;	/* ... and when the shift is by a reg.  */
   const int log_shift;		/* Additional when logic also shifts...  */
   const int log_shift_reg;	/* ... and when the shift is by a reg.  */
-  const int extnd;		/* Zero/sign extension.  */
-  const int extnd_arith;	/* Extend and arith.  */
+  const int extend;		/* Zero/sign extension.  */
+  const int extend_arith;	/* Extend and arith.  */
   const int bfi;		/* Bit-field insert.  */
   const int bfx;		/* Bit-field extraction.  */
   const int clz;		/* Count Leading Zeros.  */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b49f43e..eae8b0e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9589,7 +9589,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	{
 	  /* UXTA[BH] or SXTA[BH].  */
 	  if (speed_p)
-		*cost += extra_cost->alu.extnd_arith;
+		*cost += extra_cost->alu.extend_arith;
 	  *cost += (rtx_cost (XEXP (XEXP (x, 0), 0), ZERO_EXTEND, 0,
   speed_p)
 			+ rtx_cost (XEXP (x, 1), PLUS, 0, speed_p));
@@ -10306,7 +10306,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  *cost = COSTS_N_INSNS (1);
 	  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 	  if (speed_p)
-	*cost += extra_cost->alu.extnd;
+	*cost += extra_cost->alu.extend;
 	}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
 	{
@@ -10359,7 +10359,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	  *cost = COSTS_N_INSNS (1);
 	  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
 	  if (speed_p)
-	*cost += extra_cost->alu.extnd;
+	*cost += extra_cost->alu.extend;
 	}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
 	{

Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-12 Thread Eric Botcazou

> That's true.  What I wonder is why the stmt checker doesn't trip.  Probably
> because while SRA scalarizes the thing the result isn't rewritten into
> SSA form?

We have a slice of an array on the RHS so there is no SSA form, it's the VCE 
of an ARRAY_RANGE_REF to a 24-bit precision integer type.

> So there may be a testcase where that happens and we'd even ICE?

Probably not in Ada, we only generate a VCE when there is an aggregate type.

-- 
Eric Botcazou

Re: [PATCH] S390: Fix crash with -mhotpatch and gfortran

2014-02-12 Thread Richard Sandiford

Dominik Vogt  writes:
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 32a25a4..9ae8ffd 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -5510,9 +5510,8 @@ s390_asm_output_function_label (FILE *asm_out_file, 
> const char *fname,
>if (hotpatch_trampoline_halfwords >= 0
> && decl_function_context (decl) != NULL_TREE)
>   {
> -   warning_at (0, DECL_SOURCE_LOCATION (decl),
> -   "hotpatch_prologue is not compatible with nested"
> -   " function");
> +   warning_at (0, OPT_mhotpatch,
> +   "hotpatching is not compatible with nested functions");

Looks like this should be:

  warning_at (DECL_SOURCE_LOCATION (decl), OPT_mhotpatch,
  "hotpatching is not compatible with nested functions");

Thanks,
Richard

Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2014-02-12 Thread Richard Biener

On Wed, Feb 12, 2014 at 11:57 AM, Jakub Jelinek  wrote:
> On Wed, Feb 12, 2014 at 11:52:47AM +0100, Richard Biener wrote:
>> On Tue, Feb 11, 2014 at 5:17 PM, Jakub Jelinek  wrote:
>> > On Fri, Jan 31, 2014 at 03:49:40PM +0100, Richard Biener wrote:
>> >> On Fri, Jan 31, 2014 at 3:45 PM, FX  wrote:
>> >> >> I've just seen that an explicit --enable-multilib is a way to do that.
>> >> >
>> >> > Yes, I was writing that as a reply when I received your email. (Also, 
>> >> > it's written in the configure error message.)
>> >>
>> >> Yeah - you know, that message is quite long and somehow I didn't read it
>> >> carefully.  I suspect that will happen to others, too, so we'll get some
>> >> extra complaints from that ;)
>> >>
>> >> >> Btw, doing the configure check exactly after all-stage1-gcc should be
>> >> >> an early enough and a serialization point, no?  There you can do the
>> >> >> check even for when cross-compiling.
>> >> >
>> >> > Well, you've already built a whole stage, so it's not so early, is it?
>> >>
>> >> Well, building the stage1 compiler is probably the fastest thing nowadays
>> >> (it didn't yet build the target libraries for stage1 with the stage1,
>> >> unoptimized
>> >> and checking-enabled compiler - which is where it would fail in the odd
>> >> way which is what you want to improve).
>> >>
>> >> As I said, you can't "properly" check it at the point you are checking.
>> >> Which is why I complain - you're not checking this properly!
>> >>
>> >> Anyway, I've fixed the "bug" on our side with --enable-multilib.
>> >
>> > Just hit the same thing, while I have (in mock) 32-bit devel libc 
>> > installed,
>> > I don't have 32-bit libgcc_s installed (what for, it will be built by gcc).
>> >
>> > Please revert it, or at least improve it (e.g. by trying to build
>> > with -static-libgcc at least).
>>
>> I wouldn't have static 32bit libgcc installed either.
>
> So perhaps turn that into a check if preprocessing of #include 
> works with -m32 (and never complain when building --with-sysroot*)?
> I mean, if features.h doesn't preprocess successfully (because of missing
> /usr/include/gnu/stubs-32.h), then as long as the compiler is going to
> include the same headers and not some sysroot, multilib building would fail
> in that case too.

That sounds good to me.  Though it still wouldn't work with a
non-multilib built host compiler (maybe detect that and just warn
but not abort for that case?).

Richard.

> Jakub

Re: Warn about virtual table mismatches

2014-02-12 Thread Richard Biener

On Wed, Feb 12, 2014 at 7:27 AM, Jan Hubicka  wrote:
>> On 02/11/2014 07:54 PM, Jan Hubicka wrote:
>> >+  /* Allow combining RTTI and non-RTTI is OK.  */
>>
>> You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine,
>> though you need to prefer the -frtti version in case code from that
>> translation unit uses the RTTI info.
>
> Is there some mechanism that linker will do so? At the moment we just chose 
> variant
> that would be selected by linker.  I can make the choice, but what happens 
> with non-LTO?
>>
>> >/aux/hubicka/firefox/accessible/src/generic/DocAccessible.cpp:1232:0: note: 
>> >the first different method is �HandleAccEvent�
>>
>> I don't see where this note would come from in the patch.
>>
> Sorry, diffed old tree
>
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c(revision 207702)
> +++ ipa-devirt.c(working copy)
> @@ -1675,6 +1675,132 @@
>  }
>
>
> +/* Compare two virtual tables, PREVAILING and VTABLE and output ODR
> +   violation warings.  */
> +
> +void
> +compare_virtual_tables (tree prevailing, tree vtable)
> +{
> +  tree init1 = DECL_INITIAL (prevailing), init2 = DECL_INITIAL (vtable);
> +  tree val1 = NULL, val2 = NULL;
> +  if (!DECL_VIRTUAL_P (prevailing))
> +{
> +  odr_violation_reported = true;
> +  if (warning_at (DECL_SOURCE_LOCATION (prevailing), 0,
> +"declaration %D conflict with a virtual table",
> +prevailing))
> +   inform (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT (vtable))),
> +   "a type defining the virtual table in another translation 
> unit");
> +  return;
> +}
> +  if (init1 == init2 || init2 == error_mark_node)
> +return;
> +  /* Be sure to keep virtual table contents even for external
> + vtables when they are available.  */
> +  gcc_assert (init1 && init1 != error_mark_node);
> +  if (!init2 && DECL_EXTERNAL (vtable))
> +return;
> +  if (init1 && init2
> +  && CONSTRUCTOR_NELTS (init1) == CONSTRUCTOR_NELTS (init2))
> +{
> +  unsigned i;
> +  tree field1, field2;
> +  bool matched = true;
> +
> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init1),
> +i, field1, val1)
> +   {
> + gcc_assert (!field1);
> + field2 = CONSTRUCTOR_ELT (init2, i)->index;
> + val2 = CONSTRUCTOR_ELT (init2, i)->value;
> + gcc_assert (!field2);
> + if (val2 == val1)
> +   continue;
> + if (TREE_CODE (val1) == NOP_EXPR)
> +   val1 = TREE_OPERAND (val1, 0);
> + if (TREE_CODE (val2) == NOP_EXPR)
> +   val2 = TREE_OPERAND (val2, 0);
> + /* Unwind
> + val 
> +  readonly constant
> +  arg 0 
> +  readonly
> +  arg 0 
> +  arg 0 > arg 1 >> 
> */
> +
> + while (TREE_CODE (val1) == TREE_CODE (val2)
> +&& (((TREE_CODE (val1) == MEM_REF
> +  || TREE_CODE (val1) == POINTER_PLUS_EXPR)
> + && (TREE_OPERAND (val1, 1))
> +   == TREE_OPERAND (val2, 1))
> +|| TREE_CODE (val1) == ADDR_EXPR))
> +   {
> + val1 = TREE_OPERAND (val1, 0);
> + val2 = TREE_OPERAND (val2, 0);
> + if (TREE_CODE (val1) == NOP_EXPR)
> +   val1 = TREE_OPERAND (val1, 0);
> + if (TREE_CODE (val2) == NOP_EXPR)
> +   val2 = TREE_OPERAND (val2, 0);
> +   }
> + if (val1 == val2)
> +   continue;
> + if (TREE_CODE (val2) == ADDR_EXPR)
> +   {
> + tree tmp = val1;
> + val1 = val2;
> + val2 = tmp;
> +}
> + /* Combining RTTI and non-RTTI vtables is OK.
> +??? Perhaps we can alsa (optionally) warn here?  */
> + if (TREE_CODE (val1) == ADDR_EXPR
> + && TREE_CODE (TREE_OPERAND (val1, 0)) == VAR_DECL
> + && !DECL_VIRTUAL_P (TREE_OPERAND (val1, 0))
> + && integer_zerop (val2))
> +   continue;
> + /* For some reason zeros gets NOP_EXPR wrappers.  */
> + if (integer_zerop (val1)
> + && integer_zerop (val2))
> +   continue;
> + /* Compare assembler names; this function is run during
> +declaration merging.  */
> + if (DECL_P (val1) && DECL_P (val2)
> + && DECL_ASSEMBLER_NAME (val1) == DECL_ASSEMBLER_NAME (val2))
> +   continue;
> + matched = false;
> + break;
> +   }
> +  if (!matched)
> +   {
> + odr_violation_reported = true;
> + if (warning_at (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT 
> (vtable))), 0,

Please don't add new warnings that cannot be turned off.  Maybe
add -Wodr?  (also use that on the two warnings emitted in lto-symtab.c).

Thanks,
Richard.

> +

Re: [gomp4 5/6] Initial support in the C front end for OpenACC data clauses.

2014-02-12 Thread Thomas Schwinge

Hi!

On Tue, 14 Jan 2014 16:10:07 +0100, I wrote:
>   gcc/c-family/
>   * c-pragma.h (pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_COPY,
>   PRAGMA_OMP_CLAUSE_COPYOUT, PRAGMA_OMP_CLAUSE_CREATE,
>   PRAGMA_OMP_CLAUSE_DELETE, PRAGMA_OMP_CLAUSE_DEVICEPTR,
>   PRAGMA_OMP_CLAUSE_PRESENT, PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY,
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN,
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYOUT, and
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE.
>   gcc/c/
>   * c-parser.c (c_parser_omp_clause_name): Handle these.
>   (c_parser_oacc_data_clause, c_parser_oacc_data_clause_deviceptr):
>   New functions.
>   (c_parser_oacc_all_clauses): Handle PRAGMA_OMP_CLAUSE_COPY,
>   PRAGMA_OMP_CLAUSE_COPYIN, PRAGMA_OMP_CLAUSE_COPYOUT,
>   PRAGMA_OMP_CLAUSE_CREATE, PRAGMA_OMP_CLAUSE_DELETE,
>   PRAGMA_OMP_CLAUSE_DEVICEPTR, PRAGMA_OMP_CLAUSE_PRESENT,
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY,
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN,
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYOUT, and
>   PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE.
>   gcc/
>   * tree-core.h (omp_clause_code): Update description for
>   OMP_CLAUSE_MAP.

This I committed to gomp-4_0-branch as r207177.

In
,
Ilmir mentioned that I'm missing to handle the »short names: pcopy,
pcopyin, pcopyout and pcreate (see 2.6.5.9 - 2.6.5.12 of OpenACC 2.0«.
Unless there are any comments, I'll soon commit the following to
gomp-4_0-branch:

commit 9a1f6c075f6198c9ae3281387b875e6012e4387e
Author: Thomas Schwinge 
Date:   Wed Feb 12 11:59:51 2014 +0100

OpenACC: pcopy, pcopyin, pcopyout, pcreate clauses.

gcc/c/
* c-parser.c (c_parser_omp_clause_name): Accept pcopy, pcopyin,
pcopyout, pcreate clauses.
(c_parser_oacc_data_clause): Update comment.
gcc/
* tree-core.h (omp_clause_code) : Mention pcopy, pcopyin,
pcopyout, pcreate OpenACC clauses.
gcc/testsuite/
* c-c++-common/goacc/pcopy.c: New file.
* c-c++-common/goacc/pcopyin.c: Likewise.
* c-c++-common/goacc/pcopyout.c: Likewise.
* c-c++-common/goacc/pcreate.c: Likewise.

diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 6e89471..f401cef 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -9671,13 +9671,17 @@ c_parser_omp_clause_name (c_parser *parser)
result = PRAGMA_OMP_CLAUSE_PARALLEL;
  else if (!strcmp ("present", p))
result = PRAGMA_OMP_CLAUSE_PRESENT;
- else if (!strcmp ("present_or_copy", p))
+ else if (!strcmp ("present_or_copy", p)
+  || !strcmp ("pcopy", p))
result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY;
- else if (!strcmp ("present_or_copyin", p))
+ else if (!strcmp ("present_or_copyin", p)
+  || !strcmp ("pcopyin", p))
result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN;
- else if (!strcmp ("present_or_copyout", p))
+ else if (!strcmp ("present_or_copyout", p)
+  || !strcmp ("pcopyout", p))
result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYOUT;
- else if (!strcmp ("present_or_create", p))
+ else if (!strcmp ("present_or_create", p)
+  || !strcmp ("pcreate", p))
result = PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE;
  else if (!strcmp ("private", p))
result = PRAGMA_OMP_CLAUSE_PRIVATE;
@@ -9870,9 +9874,13 @@ c_parser_omp_var_list_parens (c_parser *parser, enum 
omp_clause_code kind,
delete ( variable-list )
present ( variable-list )
present_or_copy ( variable-list )
+ pcopy ( variable-list )
present_or_copyin ( variable-list )
+ pcopyin ( variable-list )
present_or_copyout ( variable-list )
-   present_or_create ( variable-list ) */
+ pcopyout ( variable-list )
+   present_or_create ( variable-list )
+ pcreate ( variable-list ) */
 
 static tree
 c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
diff --git gcc/testsuite/c-c++-common/goacc/pcopy.c 
gcc/testsuite/c-c++-common/goacc/pcopy.c
new file mode 100644
index 000..fd16525
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/pcopy.c
@@ -0,0 +1,11 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+void
+f (char *cp)
+{
+#pragma acc parallel pcopy(cp[3:5])
+  ;
+}
+
+/* { dg-final { scan-tree-dump-times "#pragma acc parallel 
map\\(tofrom:\\*\\(cp \\+ 3\\) \\\[len: 5]\\) map\\(alloc:cp \\\[pointer 
assign, bias: 3]\\)" 1 "original" } } */
+/* { dg-final { cleanup-tree-dump "original" } } */
diff --git gcc/testsuite/c-c++-common/goacc/pcopyin.c 
gcc/testsuite/c-c++-common/goacc/pcopyin.c
new file mode 100644
index 000..c009d24
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/pcopyin.c
@@ -0,0 +1,11 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+
+void
+f (char *cp)
+{
+#pragma acc parallel pcopyin(cp[4:6])
+  ;
+}
+
+/* { dg-

Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 11:52:47AM +0100, Richard Biener wrote:
> On Tue, Feb 11, 2014 at 5:17 PM, Jakub Jelinek  wrote:
> > On Fri, Jan 31, 2014 at 03:49:40PM +0100, Richard Biener wrote:
> >> On Fri, Jan 31, 2014 at 3:45 PM, FX  wrote:
> >> >> I've just seen that an explicit --enable-multilib is a way to do that.
> >> >
> >> > Yes, I was writing that as a reply when I received your email. (Also, 
> >> > it's written in the configure error message.)
> >>
> >> Yeah - you know, that message is quite long and somehow I didn't read it
> >> carefully.  I suspect that will happen to others, too, so we'll get some
> >> extra complaints from that ;)
> >>
> >> >> Btw, doing the configure check exactly after all-stage1-gcc should be
> >> >> an early enough and a serialization point, no?  There you can do the
> >> >> check even for when cross-compiling.
> >> >
> >> > Well, you've already built a whole stage, so it's not so early, is it?
> >>
> >> Well, building the stage1 compiler is probably the fastest thing nowadays
> >> (it didn't yet build the target libraries for stage1 with the stage1,
> >> unoptimized
> >> and checking-enabled compiler - which is where it would fail in the odd
> >> way which is what you want to improve).
> >>
> >> As I said, you can't "properly" check it at the point you are checking.
> >> Which is why I complain - you're not checking this properly!
> >>
> >> Anyway, I've fixed the "bug" on our side with --enable-multilib.
> >
> > Just hit the same thing, while I have (in mock) 32-bit devel libc installed,
> > I don't have 32-bit libgcc_s installed (what for, it will be built by gcc).
> >
> > Please revert it, or at least improve it (e.g. by trying to build
> > with -static-libgcc at least).
> 
> I wouldn't have static 32bit libgcc installed either.

So perhaps turn that into a check if preprocessing of #include 
works with -m32 (and never complain when building --with-sysroot*)?
I mean, if features.h doesn't preprocess successfully (because of missing
/usr/include/gnu/stubs-32.h), then as long as the compiler is going to
include the same headers and not some sysroot, multilib building would fail
in that case too.

Jakub

Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-12 Thread Richard Biener

On Tue, Feb 11, 2014 at 6:15 PM, Eric Botcazou  wrote:
>> Hmm.  The intent was of course to only allow truly no-op converts via
>> VIEW_CONVERT_EXPR - that is, the size of the operand type and the
>> result type should be the same.  So, isn't SRA doing it wrong when
>> creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
>
> That's debatable IMO if the 4-byte type has 3-byte precision, but I don't have
> a strong opinion so I can try to fix it in SRA, although it will be weird to
> do low-level fiddling because of precision and size at the Tree level.

That's true.  What I wonder is why the stmt checker doesn't trip.  Probably
because while SRA scalarizes the thing the result isn't rewritten into
SSA form?  So there may be a testcase where that happens and we'd even
ICE?

Richard.

> --
> Eric Botcazou

[PATCH][AArch64] vqneg and vqabs intrinsics implementation

2014-02-12 Thread Alex Velenko


Hi,

This patch implements vqneg_s64, vqnegd_s64, vqabs_s64 and
vqabsd_s64 AArch64 intrinsics. Regression tests added.
Run full regression with no regressions.

Is patch OK?

Thanks,
Alex

gcc/

2014-02-12  Alex Velenko  

* gcc/config/aarch64/aarch64-simd.md (aarch64_s):
Pattern extended.
* config/aarch64/aarch64-simd-builtins.def (sqneg): Iterator
extended.
(sqabs): Likewise.
* config/aarch64/arm_neon.h (vqneg_s64): New intrinsic.
(vqnegd_s64): Likewise.
(vqabs_s64): Likewise.
(vqabsd_s64): Likewise.

gcc/testsuite/

2014-02-12  Alex Velenko  

*gcc.target/aarch64/vqneg_s64_1.c: New testcase.
*gcc.target/aarch64/vqabs_s64_1.c: New testcase.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e5f71b479ccfd1a9cbf84aed0f96b49762053f59..b3d0989f1b3bce1cab301f5fdb522324ed758c87 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -142,8 +142,8 @@
   BUILTIN_VSQN_HSDI (UNOP, sqmovn, 0)
   BUILTIN_VSQN_HSDI (UNOP, uqmovn, 0)
   /* Implemented by aarch64_s.  */
-  BUILTIN_VSDQ_I_BHSI (UNOP, sqabs, 0)
-  BUILTIN_VSDQ_I_BHSI (UNOP, sqneg, 0)
+  BUILTIN_VSDQ_I (UNOP, sqabs, 0)
+  BUILTIN_VSDQ_I (UNOP, sqneg, 0)
 
   BUILTIN_VSD_HSI (QUADOP, sqdmlal_lane, 0)
   BUILTIN_VSD_HSI (QUADOP, sqdmlsl_lane, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7378da9122d550f869c3e830e3e5a7681e7581f6..8a63dcdae8376b935c004fc84081e222d0a9a720 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2585,9 +2585,9 @@
 ;; q
 
 (define_insn "aarch64_s"
-  [(set (match_operand:VSDQ_I_BHSI 0 "register_operand" "=w")
-	(UNQOPS:VSDQ_I_BHSI
-	  (match_operand:VSDQ_I_BHSI 1 "register_operand" "w")))]
+  [(set (match_operand:VSDQ_I 0 "register_operand" "=w")
+	(UNQOPS:VSDQ_I
+	  (match_operand:VSDQ_I 1 "register_operand" "w")))]
   "TARGET_SIMD"
   "s\\t%0, %1"
   [(set_attr "type" "neon_")]
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6af99361b8e265f66026dc506cfc23f044d153b4..7347bc0b18968d69b1c66ec75d30facb59450936 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -2318,6 +2318,12 @@ vqneg_s32 (int32x2_t __a)
   return (int32x2_t) __builtin_aarch64_sqnegv2si (__a);
 }
 
+__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
+vqneg_s64 (int64x1_t __a)
+{
+  return __builtin_aarch64_sqnegdi (__a);
+}
+
 __extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
 vqnegq_s8 (int8x16_t __a)
 {
@@ -2354,6 +2360,12 @@ vqabs_s32 (int32x2_t __a)
   return (int32x2_t) __builtin_aarch64_sqabsv2si (__a);
 }
 
+__extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
+vqabs_s64 (int64x1_t __a)
+{
+  return __builtin_aarch64_sqabsdi (__a);
+}
+
 __extension__ static __inline int8x16_t __attribute__ ((__always_inline__))
 vqabsq_s8 (int8x16_t __a)
 {
@@ -20943,6 +20955,12 @@ vqabss_s32 (int32x1_t __a)
   return (int32x1_t) __builtin_aarch64_sqabssi (__a);
 }
 
+__extension__ static __inline int64_t __attribute__ ((__always_inline__))
+vqabsd_s64 (int64_t __a)
+{
+  return __builtin_aarch64_sqabsdi (__a);
+}
+
 /* vqadd */
 
 __extension__ static __inline int8x1_t __attribute__ ((__always_inline__))
@@ -21561,6 +21579,12 @@ vqnegs_s32 (int32x1_t __a)
   return (int32x1_t) __builtin_aarch64_sqnegsi (__a);
 }
 
+__extension__ static __inline int64_t __attribute__ ((__always_inline__))
+vqnegd_s64 (int64_t __a)
+{
+  return __builtin_aarch64_sqnegdi (__a);
+}
+
 /* vqrdmulh */
 
 __extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
diff --git a/gcc/testsuite/gcc.target/aarch64/vqabs_s64_1.c b/gcc/testsuite/gcc.target/aarch64/vqabs_s64_1.c
new file mode 100644
index ..3ea532278d6db7aedc0b6cc6c2498658ad80a72b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vqabs_s64_1.c
@@ -0,0 +1,54 @@
+/* Test vqabs_s64 intrinsics work correctly.  */
+/* { dg-do run } */
+/* { dg-options "--save-temps" } */
+
+#include 
+
+extern void abort (void);
+
+int __attribute__ ((noinline))
+test_vqabs_s64 (int64x1_t passed, int64_t expected)
+{
+  return vget_lane_s64 (vqabs_s64 (passed), 0) != expected;
+}
+
+int __attribute__ ((noinline))
+test_vqabsd_s64 (int64_t passed, int64_t expected)
+{
+  return vqabsd_s64 (passed) != expected;
+}
+
+/* { dg-final { scan-assembler-times "sqabs\\td\[0-9\]+, d\[0-9\]+" 2 } } */
+
+int
+main (int argc, char **argv)
+{
+  /* Basic test.  */
+  if (test_vqabs_s64 (vcreate_s64 (-1), 1))
+abort ();
+  if (test_vqabsd_s64 (-1, 1))
+abort ();
+
+  /* Getting absolute value of min int64_t.
+ Note, exact result cannot be represented in int64_t,
+ so max int64_t is expected.  */
+  if (test_vqabs_s64 (vcreate_s64 (0x8000), 0x7fff))
+a

Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2014-02-12 Thread Richard Biener

On Tue, Feb 11, 2014 at 5:17 PM, Jakub Jelinek  wrote:
> On Fri, Jan 31, 2014 at 03:49:40PM +0100, Richard Biener wrote:
>> On Fri, Jan 31, 2014 at 3:45 PM, FX  wrote:
>> >> I've just seen that an explicit --enable-multilib is a way to do that.
>> >
>> > Yes, I was writing that as a reply when I received your email. (Also, it's 
>> > written in the configure error message.)
>>
>> Yeah - you know, that message is quite long and somehow I didn't read it
>> carefully.  I suspect that will happen to others, too, so we'll get some
>> extra complaints from that ;)
>>
>> >> Btw, doing the configure check exactly after all-stage1-gcc should be
>> >> an early enough and a serialization point, no?  There you can do the
>> >> check even for when cross-compiling.
>> >
>> > Well, you've already built a whole stage, so it's not so early, is it?
>>
>> Well, building the stage1 compiler is probably the fastest thing nowadays
>> (it didn't yet build the target libraries for stage1 with the stage1,
>> unoptimized
>> and checking-enabled compiler - which is where it would fail in the odd
>> way which is what you want to improve).
>>
>> As I said, you can't "properly" check it at the point you are checking.
>> Which is why I complain - you're not checking this properly!
>>
>> Anyway, I've fixed the "bug" on our side with --enable-multilib.
>
> Just hit the same thing, while I have (in mock) 32-bit devel libc installed,
> I don't have 32-bit libgcc_s installed (what for, it will be built by gcc).
>
> Please revert it, or at least improve it (e.g. by trying to build
> with -static-libgcc at least).

I wouldn't have static 32bit libgcc installed either.

Richard.

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 11:42:09AM +0100, Richard Biener wrote:
> But as the testcase was supposed to test field-sensitive points-to
> I chose to disable SRA for the testcase (and queue pass interchange
> for 4.10 - unless you think it's ok now - I think it's harmless
> and should only improve early FRE results when SRA happens).

Yeah, I'd prefer to queue pass reorderings to 5.0 at this point.
> 
> Patch below in testing now.

LGTM.

> 2014-02-12  Richard Biener  
> 
>   PR middle-end/60092
>   * gimple-low.c (lower_builtin_posix_memalign): Lower conditional
>   of posix_memalign being successful.
>   (lower_stmt): Restrict lowering of posix_memalign to when
>   -ftree-bit-ccp is enabled.
> 
>   * gcc.dg/torture/pr60092.c: New testcase.
>   * gcc.dg/tree-ssa/alias-31.c: Disable SRA.

Jakub

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Richard Biener

On Wed, 12 Feb 2014, Richard Biener wrote:

> On Wed, 12 Feb 2014, Jakub Jelinek wrote:
> 
> > On Wed, Feb 12, 2014 at 10:30:01AM +0100, Richard Biener wrote:
> > > Bah.  I am testing the following.
> > 
> > But then there is no guarantee that ptr is aligned after the call.
> > char buf[32] __attribute__((aligned (32)));
> > int
> > foo (void)
> > {
> >   void *ptr = buf + 1;
> >   posix_memalign (&ptr, 32, -1);
> >   /* Assume posix_memalign has failed.  */
> >   return ((__UINTPTR_TYPE__)ptr) & 31;
> > }
> > 
> > This should return 1, but supposedly doesn't with the optimization.
> > So, either we need to revert the lowering, or perhaps do it only if
> > the original posix_memalign has a lhs and do it like:
> >   void *tmp;
> >   int res = posix_memalign (&tmp, align, size);
> >   if (!res)
> > ptr = __builtin_assume_aligned (tmp, align);
> > or so (no need to initialize tmp and copy it back for the failure case,
> > but perhaps it would result in better code).
> 
> Yeah.  It seems to work modulo alias-31.c (checking what happens here,
> alias info looks good).  Prelimiary patch below.

Ok, I've analyzed what happens here and it's a pass ordering issue

  NEXT_PASS (pass_build_ealias);
  NEXT_PASS (pass_sra_early);
  NEXT_PASS (pass_fre);

PTA can figure out all points-to sets properly but when SRA
scalarizes the struct with the two posix_memaligned pointers
the replacements get written into SSA form and PHI nodes are
inserted for them (as assignment is now conditional).  That
process cannot reliably (and does never) set points-to info
for those vars.  FRE then fails to disambiguate.

Now, there is no reason why pass_build_ealias should be before
pass_sra_early - in fact that's a useless pessimization.

But as the testcase was supposed to test field-sensitive points-to
I chose to disable SRA for the testcase (and queue pass interchange
for 4.10 - unless you think it's ok now - I think it's harmless
and should only improve early FRE results when SRA happens).

Patch below in testing now.

Thanks,
Richard.

2014-02-12  Richard Biener  

PR middle-end/60092
* gimple-low.c (lower_builtin_posix_memalign): Lower conditional
of posix_memalign being successful.
(lower_stmt): Restrict lowering of posix_memalign to when
-ftree-bit-ccp is enabled.

* gcc.dg/torture/pr60092.c: New testcase.
* gcc.dg/tree-ssa/alias-31.c: Disable SRA.

Index: gcc/gimple-low.c
===
*** gcc/gimple-low.c(revision 207714)
--- gcc/gimple-low.c(working copy)
*** lower_stmt (gimple_stmt_iterator *gsi, s
*** 336,342 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN)
  {
lower_builtin_posix_memalign (gsi);
return;
--- 336,343 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN
!&& flag_tree_bit_ccp)
  {
lower_builtin_posix_memalign (gsi);
return;
*** lower_builtin_setjmp (gimple_stmt_iterat
*** 781,817 
  }
  
  /* Lower calls to posix_memalign to
!  posix_memalign (ptr, align, size);
!  tem = *ptr;
!  tem = __builtin_assume_aligned (tem, align);
!  *ptr = tem;
 or to
   void *tem;
!  posix_memalign (&tem, align, size);
!  ttem = tem;
!  ttem = __builtin_assume_aligned (ttem, align);
!  ptr = tem;
 in case the first argument was &ptr.  That way we can get at the
 alignment of the heap pointer in CCP.  */
  
  static void
  lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
  {
!   gimple stmt = gsi_stmt (*gsi);
!   tree pptr = gimple_call_arg (stmt, 0);
!   tree align = gimple_call_arg (stmt, 1);
tree ptr = create_tmp_reg (ptr_type_node, NULL);
if (TREE_CODE (pptr) == ADDR_EXPR)
  {
tree tem = create_tmp_var (ptr_type_node, NULL);
TREE_ADDRESSABLE (tem) = 1;
!   gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
stmt = gimple_build_assign (ptr, tem);
  }
else
  stmt = gimple_build_assign (ptr,
fold_build2 (MEM_REF, ptr_type_node, pptr,
 build_int_cst (ptr_type_node, 0)));
gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_ASSUME_ALIGNED),
2, ptr, align);
--- 782,828 
  }
  
  /* Lower calls to posix_memalign to
!  res = posix_memalign (ptr, align, size);
!  if (res == 0)
!*ptr = __builtin_assume_aligned (*ptr, align);
 or to
   void *tem;
!  res = posix_memalign (&tem, align, size);
!  if (res == 0)
!ptr = __builtin_a

[PATCH] (gcc-4.8) S390: Fix crash with -mhotpatch and gfortran

2014-02-12 Thread Dominik Vogt

On Wed, Feb 12, 2014 at 11:28:38AM +0100, Dominik Vogt wrote:
> The attached patch fixes a crash if gfortran encounters a nested
> function when -mhotpatch is enabled.  (It slightly improves the
> warning message too.)
> 
> This patch affects s390 only.  Andreas Krebbel will commit the
> patch soon, if there are no objections.

Same patch for gcc-4.8.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From 0d9acbc61fa99ace25bc26d8af4f252d1c8c792d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 12 Feb 2014 05:53:34 +
Subject: [PATCH 1/2] S390: Fix crash when -mhotpatch encounters nested
 functions

(e.g. with gfortran).
---
 gcc/config/s390/s390.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2ba1d8a..88c97c6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5475,9 +5475,8 @@ s390_asm_output_function_label (FILE *asm_out_file, const char *fname,
   if (hotpatch_trampoline_halfwords >= 0
 	  && decl_function_context (decl) != NULL_TREE)
 	{
-	  warning_at (0, DECL_SOURCE_LOCATION (decl),
-		  "hotpatch_prologue is not compatible with nested"
-		  " function");
+	  warning_at (0, OPT_mhotpatch,
+		  "hotpatching is not compatible with nested functions");
 	  hotpatch_trampoline_halfwords = -1;
 	}
 }
-- 
1.8.3.1

2014-02-12  Dominik Vogt  

* config/s390/s390.c (s390_asm_output_function_label):
fix crash caused by bad second argument to warning_at() with -mhotpatch
and nested functions (e.g. with gfortran)

[PATCH] S390: Fix crash with -mhotpatch and gfortran

2014-02-12 Thread Dominik Vogt

The attached patch fixes a crash if gfortran encounters a nested
function when -mhotpatch is enabled.  (It slightly improves the
warning message too.)

This patch affects s390 only.  Andreas Krebbel will commit the
patch soon, if there are no objections.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From 352cf98ffb32c1f0fa8b961314faf6e6afffb661 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 12 Feb 2014 05:53:34 +
Subject: [PATCH 1/2] S390: Fix crash when -mhotpatch encounters nested
 functions

(e.g. with gfortran).
---
 gcc/config/s390/s390.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 32a25a4..9ae8ffd 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5510,9 +5510,8 @@ s390_asm_output_function_label (FILE *asm_out_file, const char *fname,
   if (hotpatch_trampoline_halfwords >= 0
 	  && decl_function_context (decl) != NULL_TREE)
 	{
-	  warning_at (0, DECL_SOURCE_LOCATION (decl),
-		  "hotpatch_prologue is not compatible with nested"
-		  " function");
+	  warning_at (0, OPT_mhotpatch,
+		  "hotpatching is not compatible with nested functions");
 	  hotpatch_trampoline_halfwords = -1;
 	}
 }
-- 
1.8.3.1

2014-02-12  Dominik Vogt  

* config/s390/s390.c (s390_asm_output_function_label):
fix crash caused by bad second argument to warning_at() with -mhotpatch
and nested functions (e.g. with gfortran)

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Richard Biener

On Wed, 12 Feb 2014, Jakub Jelinek wrote:

> On Wed, Feb 12, 2014 at 10:30:01AM +0100, Richard Biener wrote:
> > Bah.  I am testing the following.
> 
> But then there is no guarantee that ptr is aligned after the call.
> char buf[32] __attribute__((aligned (32)));
> int
> foo (void)
> {
>   void *ptr = buf + 1;
>   posix_memalign (&ptr, 32, -1);
>   /* Assume posix_memalign has failed.  */
>   return ((__UINTPTR_TYPE__)ptr) & 31;
> }
> 
> This should return 1, but supposedly doesn't with the optimization.
> So, either we need to revert the lowering, or perhaps do it only if
> the original posix_memalign has a lhs and do it like:
>   void *tmp;
>   int res = posix_memalign (&tmp, align, size);
>   if (!res)
> ptr = __builtin_assume_aligned (tmp, align);
> or so (no need to initialize tmp and copy it back for the failure case,
> but perhaps it would result in better code).

Yeah.  It seems to work modulo alias-31.c (checking what happens here,
alias info looks good).  Prelimiary patch below.

2014-02-12  Richard Biener  

PR middle-end/60092
* gimple-low.c (lower_builtin_posix_memalign): Lower conditional
of posix_memalign being successful.
(lower_stmt): Restrict lowering of posix_memalign to when
-ftree-bit-ccp is enabled.

* gcc.dg/pr60092.c: New testcase.

Index: gcc/gimple-low.c
===
*** gcc/gimple-low.c(revision 207714)
--- gcc/gimple-low.c(working copy)
*** lower_stmt (gimple_stmt_iterator *gsi, s
*** 336,342 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN)
  {
lower_builtin_posix_memalign (gsi);
return;
--- 336,343 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN
!&& flag_tree_bit_ccp)
  {
lower_builtin_posix_memalign (gsi);
return;
*** lower_builtin_setjmp (gimple_stmt_iterat
*** 781,817 
  }
  
  /* Lower calls to posix_memalign to
!  posix_memalign (ptr, align, size);
!  tem = *ptr;
!  tem = __builtin_assume_aligned (tem, align);
!  *ptr = tem;
 or to
   void *tem;
!  posix_memalign (&tem, align, size);
!  ttem = tem;
!  ttem = __builtin_assume_aligned (ttem, align);
!  ptr = tem;
 in case the first argument was &ptr.  That way we can get at the
 alignment of the heap pointer in CCP.  */
  
  static void
  lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
  {
!   gimple stmt = gsi_stmt (*gsi);
!   tree pptr = gimple_call_arg (stmt, 0);
!   tree align = gimple_call_arg (stmt, 1);
tree ptr = create_tmp_reg (ptr_type_node, NULL);
if (TREE_CODE (pptr) == ADDR_EXPR)
  {
tree tem = create_tmp_var (ptr_type_node, NULL);
TREE_ADDRESSABLE (tem) = 1;
!   gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
stmt = gimple_build_assign (ptr, tem);
  }
else
  stmt = gimple_build_assign (ptr,
fold_build2 (MEM_REF, ptr_type_node, pptr,
 build_int_cst (ptr_type_node, 0)));
gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_ASSUME_ALIGNED),
2, ptr, align);
--- 782,828 
  }
  
  /* Lower calls to posix_memalign to
!  res = posix_memalign (ptr, align, size);
!  if (res == 0)
!*ptr = __builtin_assume_aligned (*ptr, align);
 or to
   void *tem;
!  res = posix_memalign (&tem, align, size);
!  if (res == 0)
!ptr = __builtin_assume_aligned (tem, align);
 in case the first argument was &ptr.  That way we can get at the
 alignment of the heap pointer in CCP.  */
  
  static void
  lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
  {
!   gimple stmt, call = gsi_stmt (*gsi);
!   tree pptr = gimple_call_arg (call, 0);
!   tree align = gimple_call_arg (call, 1);
!   tree res = gimple_call_lhs (call);
tree ptr = create_tmp_reg (ptr_type_node, NULL);
if (TREE_CODE (pptr) == ADDR_EXPR)
  {
tree tem = create_tmp_var (ptr_type_node, NULL);
TREE_ADDRESSABLE (tem) = 1;
!   gimple_call_set_arg (call, 0, build_fold_addr_expr (tem));
stmt = gimple_build_assign (ptr, tem);
  }
else
  stmt = gimple_build_assign (ptr,
fold_build2 (MEM_REF, ptr_type_node, pptr,
 build_int_cst (ptr_type_node, 0)));
+   if (res == NULL_TREE)
+ {
+   res = create_tmp_reg (integer_type_node, NULL);
+   gimple_call_set_lhs (call, res);
+ }
+   tree align_label = create_artificial_label (UNKNOWN_LOCATION);
+

Re: Simple install hook

2014-02-12 Thread Philip Herron

On 12 February 2014 00:52, David Malcolm  wrote:
> On Tue, 2014-02-11 at 16:51 +, Philip Herron wrote:
> [adding the j...@gcc.gnu.org ML to the CC]
>
>> Added install hook:
>
> Thanks!
>
> I don't know that this is needed for a 3-line patch, but have you done
> the copyright assignment paperwork for GCC contribution?  (I hope to
> merge my branch into gcc trunk at some point).  [Also, I'd love to have
> more, larger, patches from you for the jit branch!]

Yep i still have GCC copyright assignment.

> Excellent!  Looks promising - though it looks like the backend is all
> stubbed out at the moment.

Yeah only got started on it yesterday. Just need something to work
with the jit api to test and look to add features for.

>
> Note that the JIT API isn't frozen yet.  I try to remember to add "API
> change" to the subject line when posting my commits, but I don't always
> remember.
>

Yeah good point.

> Let me know if you have any questions on how the JIT API works - or
> input on how it *should* work.
>
> FWIW I've been experimentally porting GNU Octave's LLVM-based JIT to
> using libgccjit, and finding and fixing various issues in the latter on
> the way - that's been driving a lot of the patches to the jit branch
> lately.
>

Thats a good idea.

>> Was also considering some kind of libopcodes work to assemble the code
>> in memory instead of creating a .so in /tmp. Not sure if i know what i
>> am doing enough there but it might be more elegant for stuff using
>> this front-end.
>
> My thinking here was that the core code of the GNU assembler could gain
> the option of being built as a shared library, and having to isolate
> state in a "context" object, and we could try to hack the two projects
> into meeting in the middle.  Large amount of work though (and a
> different mailing list), hence the crude .so hack for now.
>

Yeah i was looking into it at the time and found, that trying to make
gas compile to a .so was a bit of a nightmare i couldn't get the
Makefile.am to regenerate its a funny autotools setup. But found
aparently libopcodes:

"The opcodes library contains functionality to assemble and
disassemble human readable assembly language to and from raw machine
code. "

http://www.toothycat.net/wiki/wiki.pl?Binutils

Might be something to look into.

--Phil

[C PATCH] Warn about variadic main (PR c/60156)

2014-02-12 Thread Marek Polacek

I figured it might be a good idea to warn about variadic main decl
(well, not in freestanding environment where it's
implementation-defined).

Regtested/bootstrapped on x86_64-linux, ok for 5.0?

2014-02-12  Marek Polacek  

PR c/60156
c-family/
* c-common.c (check_main_parameter_types): Warn about variadic main.
testsuite/
* c-c++-common/pr60156.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 5cf285b..bdf2852 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -2228,6 +2228,10 @@ check_main_parameter_types (tree decl)
   if (argct > 0 && (argct < 2 || argct > 3))
 pedwarn (input_location, OPT_Wmain,
 "%q+D takes only zero or two arguments", decl);
+
+  if (stdarg_p (TREE_TYPE (decl)))
+pedwarn (input_location, OPT_Wmain,
+"%q+D declared as variadic function", decl);
 }
 
 /* vector_targets_convertible_p is used for vector pointer types.  The
diff --git gcc/testsuite/c-c++-common/pr60156.c 
gcc/testsuite/c-c++-common/pr60156.c
index e69de29..1e8204c 100644
--- gcc/testsuite/c-c++-common/pr60156.c
+++ gcc/testsuite/c-c++-common/pr60156.c
@@ -0,0 +1,9 @@
+/* PR c/60156 */
+/* { dg-do compile } */
+/* { dg-options "-Wpedantic" } */
+
+int
+main (int argc, char *argv[], ...) /* { dg-warning "declared as variadic 
function" } */
+{
+  return 0;
+}

Marek

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 10:30:01AM +0100, Richard Biener wrote:
> Bah.  I am testing the following.

But then there is no guarantee that ptr is aligned after the call.
char buf[32] __attribute__((aligned (32)));
int
foo (void)
{
  void *ptr = buf + 1;
  posix_memalign (&ptr, 32, -1);
  /* Assume posix_memalign has failed.  */
  return ((__UINTPTR_TYPE__)ptr) & 31;
}

This should return 1, but supposedly doesn't with the optimization.
So, either we need to revert the lowering, or perhaps do it only if
the original posix_memalign has a lhs and do it like:
  void *tmp;
  int res = posix_memalign (&tmp, align, size);
  if (!res)
ptr = __builtin_assume_aligned (tmp, align);
or so (no need to initialize tmp and copy it back for the failure case,
but perhaps it would result in better code).

Jakub

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Richard Biener

On Wed, 12 Feb 2014, Andreas Krebbel wrote:

> On 07/02/14 10:33, Richard Biener wrote:
> > + static void
> > + lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
> > + {
> > +   gimple stmt = gsi_stmt (*gsi);
> > +   tree pptr = gimple_call_arg (stmt, 0);
> > +   tree align = gimple_call_arg (stmt, 1);
> > +   tree ptr = create_tmp_reg (ptr_type_node, NULL);
> > +   if (TREE_CODE (pptr) == ADDR_EXPR)
> > + {
> > +   tree tem = create_tmp_var (ptr_type_node, NULL);
> > +   TREE_ADDRESSABLE (tem) = 1;
> > +   gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
> > +   stmt = gimple_build_assign (ptr, tem);
> > + }
> > +   else
> > + stmt = gimple_build_assign (ptr,
> > +   fold_build2 (MEM_REF, ptr_type_node, pptr,
> > +build_int_cst (ptr_type_node, 0)));
> > +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > +   stmt = gimple_build_call (builtin_decl_implicit 
> > (BUILT_IN_ASSUME_ALIGNED),
> > +   2, ptr, align);
> > +   gimple_call_set_lhs (stmt, ptr);
> > +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > +   stmt = gimple_build_assign (fold_build2 (MEM_REF, ptr_type_node, pptr,
> > +  build_int_cst (ptr_type_node, 0)),
> > + ptr);
> > +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > + }
> 
> Hi,
> 
> creating a new var for the output parameter throws away the value already in 
> there.  But this value
> must not change when posix_memalign e.g. returns with ENOMEM.  It breaks the 
> glibc posix_memalign
> testcase on s390 (somewhat reduced here):

Bah.  I am testing the following.

Richard.

2014-02-12  Richard Biener  

PR middle-end/60092
* gimple-low.c (lower_builtin_posix_memalign): Initialize the new
temporary properly.
(lower_stmt): Restrict lowering of posix_memalign to when
-ftree-bit-ccp is enabled.

* gcc.dg/pr60092.c: New testcase.

Index: gcc/gimple-low.c
===
*** gcc/gimple-low.c(revision 207714)
--- gcc/gimple-low.c(working copy)
*** lower_stmt (gimple_stmt_iterator *gsi, s
*** 336,342 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN)
  {
lower_builtin_posix_memalign (gsi);
return;
--- 336,343 
data->cannot_fallthru = false;
return;
  }
!   else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN
!&& flag_tree_bit_ccp)
  {
lower_builtin_posix_memalign (gsi);
return;
*** lower_builtin_setjmp (gimple_stmt_iterat
*** 786,792 
   tem = __builtin_assume_aligned (tem, align);
   *ptr = tem;
 or to
!  void *tem;
   posix_memalign (&tem, align, size);
   ttem = tem;
   ttem = __builtin_assume_aligned (ttem, align);
--- 787,793 
   tem = __builtin_assume_aligned (tem, align);
   *ptr = tem;
 or to
!  void *tem = ptr;
   posix_memalign (&tem, align, size);
   ttem = tem;
   ttem = __builtin_assume_aligned (ttem, align);
*** lower_builtin_posix_memalign (gimple_stm
*** 806,811 
--- 807,819 
tree tem = create_tmp_var (ptr_type_node, NULL);
TREE_ADDRESSABLE (tem) = 1;
gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
+   /* Initialize tem, ptr has to be unchanged if posix_memalloc fails.  */
+   stmt = gimple_build_assign (ptr,
+ fold_build2 (MEM_REF, ptr_type_node, pptr,
+  build_int_cst (ptr_type_node, 
0)));
+   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+   stmt = gimple_build_assign (tem, ptr);
+   gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
stmt = gimple_build_assign (ptr, tem);
  }
else
Index: gcc/testsuite/gcc.dg/torture/pr60092.c
===
*** gcc/testsuite/gcc.dg/torture/pr60092.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr60092.c  (working copy)
***
*** 0 
--- 1,21 
+ /* { dg-do run } */
+ /* { dg-require-weak "" } */
+ 
+ typedef __SIZE_TYPE__ size_t;
+ extern int posix_memalign(void **memptr, size_t alignment, size_t size) 
__attribute__((weak));
+ extern void abort(void);
+ int
+ main (void)
+ {
+   void *p;
+   int ret;
+ 
+   if (!posix_memalign)
+ return 0;
+ 
+   p = (void *)&ret;
+   ret = posix_memalign (&p, sizeof (void *), -1);
+   if (p != (void *)&ret)
+ abort ();
+   return 0;
+ }

Re: Fix PR rtl-optimization/60116

2014-02-12 Thread Eric Botcazou

> Thanks, just wonder if reg_set_p is the right predicate for the REG_DEAD
> notes, don't we want to check if the register in REG_DEAD note isn't used
> in undobuf.other_insn rather than set?

You're right, the correct predicate is reg_referenced_p, I'll adjust.

-- 
Eric Botcazou

Re: Fix broken build for AVR and SPU targets

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 10:16:37AM +0100, Marek Polacek wrote:
> On Wed, Feb 12, 2014 at 01:43:39PM +0530, Senthil Kumar Selvaraj wrote:
> > The below patch fixes the build for AVR and SPU targets, which got broken
> > when the signature of build_function_call_vec changed. The patch passes
> > vNULL for the extra parameter added (arg_loc), which I hope is ok for
> > builtins?
> 
> Sorry.
>  
> > If ok, could someone commit please? I don't have commit access.
> 
> I can commit it for you, if someone acks it (looks obvious enough to
> me...).

Ok for trunk.

> > 2014-02-12  Senthil Kumar Selvaraj  
> > 
> > * config/avr/avr-c.c (avr_resolve_overloaded_builtin): Pass vNULL for 
> > arg_loc.
> > * config/spu/spu-c.c (spu_resolve_overloaded_builtin): Likewise.
> > 
> > 
> > diff --git gcc/config/avr/avr-c.c gcc/config/avr/avr-c.c
> > index 98650e0..101d280 100644
> > --- gcc/config/avr/avr-c.c
> > +++ gcc/config/avr/avr-c.c
> > @@ -115,7 +115,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> > fndecl, void *vargs)
> >fold = targetm.builtin_decl (id, true);
> >  
> >if (fold != error_mark_node)
> > -fold = build_function_call_vec (loc, fold, &args, NULL);
> > +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
> >  
> >break; // absfx
> >  
> > @@ -181,7 +181,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> > fndecl, void *vargs)
> >fold = targetm.builtin_decl (id, true);
> >  
> >if (fold != error_mark_node)
> > -fold = build_function_call_vec (loc, fold, &args, NULL);
> > +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
> >  
> >break; // roundfx
> >  
> > @@ -238,7 +238,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> > fndecl, void *vargs)
> >fold = targetm.builtin_decl (id, true);
> >  
> >if (fold != error_mark_node)
> > -fold = build_function_call_vec (loc, fold, &args, NULL);
> > +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
> >  
> >break; // countlsfx
> >  }
> > diff --git gcc/config/spu/spu-c.c gcc/config/spu/spu-c.c
> > index 411496d..9d7aa5a 100644
> > --- gcc/config/spu/spu-c.c
> > +++ gcc/config/spu/spu-c.c
> > @@ -181,7 +181,7 @@ spu_resolve_overloaded_builtin (location_t loc, tree 
> > fndecl, void *passed_args)
> >return error_mark_node;
> >  }
> >  
> > -  return build_function_call_vec (loc, match, fnargs, NULL);
> > +  return build_function_call_vec (loc, vNULL, match, fnargs, NULL);
> >  #undef SCALAR_TYPE_P
> >  }
> >  
> 
>   Marek

Jakub

Re: Fix broken build for AVR and SPU targets

2014-02-12 Thread Marek Polacek

On Wed, Feb 12, 2014 at 01:43:39PM +0530, Senthil Kumar Selvaraj wrote:
> The below patch fixes the build for AVR and SPU targets, which got broken
> when the signature of build_function_call_vec changed. The patch passes
> vNULL for the extra parameter added (arg_loc), which I hope is ok for
> builtins?

Sorry.
 
> If ok, could someone commit please? I don't have commit access.

I can commit it for you, if someone acks it (looks obvious enough to
me...).
 
> Regards
> Senthil
> 
> gcc/ChangeLog
> 
> 2014-02-12  Senthil Kumar Selvaraj  
> 
>   * config/avr/avr-c.c (avr_resolve_overloaded_builtin): Pass vNULL for 
> arg_loc.
>   * config/spu/spu-c.c (spu_resolve_overloaded_builtin): Likewise.
> 
> 
> diff --git gcc/config/avr/avr-c.c gcc/config/avr/avr-c.c
> index 98650e0..101d280 100644
> --- gcc/config/avr/avr-c.c
> +++ gcc/config/avr/avr-c.c
> @@ -115,7 +115,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> fndecl, void *vargs)
>fold = targetm.builtin_decl (id, true);
>  
>if (fold != error_mark_node)
> -fold = build_function_call_vec (loc, fold, &args, NULL);
> +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
>  
>break; // absfx
>  
> @@ -181,7 +181,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> fndecl, void *vargs)
>fold = targetm.builtin_decl (id, true);
>  
>if (fold != error_mark_node)
> -fold = build_function_call_vec (loc, fold, &args, NULL);
> +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
>  
>break; // roundfx
>  
> @@ -238,7 +238,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
> fndecl, void *vargs)
>fold = targetm.builtin_decl (id, true);
>  
>if (fold != error_mark_node)
> -fold = build_function_call_vec (loc, fold, &args, NULL);
> +fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
>  
>break; // countlsfx
>  }
> diff --git gcc/config/spu/spu-c.c gcc/config/spu/spu-c.c
> index 411496d..9d7aa5a 100644
> --- gcc/config/spu/spu-c.c
> +++ gcc/config/spu/spu-c.c
> @@ -181,7 +181,7 @@ spu_resolve_overloaded_builtin (location_t loc, tree 
> fndecl, void *passed_args)
>return error_mark_node;
>  }
>  
> -  return build_function_call_vec (loc, match, fnargs, NULL);
> +  return build_function_call_vec (loc, vNULL, match, fnargs, NULL);
>  #undef SCALAR_TYPE_P
>  }
>  

Marek

Re: Fix PR rtl-optimization/60116

2014-02-12 Thread Jakub Jelinek

On Wed, Feb 12, 2014 at 09:51:44AM +0100, Eric Botcazou wrote:
> --- combine.c (revision 207685)
> +++ combine.c (working copy)
> @@ -3894,14 +3894,15 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
>  
>PATTERN (undobuf.other_insn) = other_pat;
>  
> -  /* If any of the notes in OTHER_INSN were REG_UNUSED, ensure that they
> -  are still valid.  Then add any non-duplicate notes added by
> -  recog_for_combine.  */
> +  /* If any of the notes in OTHER_INSN were REG_DEAD or REG_UNUSED,
> +  ensure that they are still valid.  Then add any non-duplicate
> +  notes added by recog_for_combine.  */
>for (note = REG_NOTES (undobuf.other_insn); note; note = next)
>   {
> next = XEXP (note, 1);
>  
> -   if (REG_NOTE_KIND (note) == REG_UNUSED
> +   if ((REG_NOTE_KIND (note) == REG_DEAD
> +|| REG_NOTE_KIND (note) == REG_UNUSED)
> && ! reg_set_p (XEXP (note, 0), PATTERN (undobuf.other_insn)))
>   remove_note (undobuf.other_insn, note);

Thanks, just wonder if reg_set_p is the right predicate for the REG_DEAD
notes, don't we want to check if the register in REG_DEAD note isn't used
in undobuf.other_insn rather than set?

Jakub

Re: [PATCH] PR60092 - lower posix_memalign to make align-info accessible

2014-02-12 Thread Andreas Krebbel

On 07/02/14 10:33, Richard Biener wrote:
> + static void
> + lower_builtin_posix_memalign (gimple_stmt_iterator *gsi)
> + {
> +   gimple stmt = gsi_stmt (*gsi);
> +   tree pptr = gimple_call_arg (stmt, 0);
> +   tree align = gimple_call_arg (stmt, 1);
> +   tree ptr = create_tmp_reg (ptr_type_node, NULL);
> +   if (TREE_CODE (pptr) == ADDR_EXPR)
> + {
> +   tree tem = create_tmp_var (ptr_type_node, NULL);
> +   TREE_ADDRESSABLE (tem) = 1;
> +   gimple_call_set_arg (stmt, 0, build_fold_addr_expr (tem));
> +   stmt = gimple_build_assign (ptr, tem);
> + }
> +   else
> + stmt = gimple_build_assign (ptr,
> + fold_build2 (MEM_REF, ptr_type_node, pptr,
> +  build_int_cst (ptr_type_node, 0)));
> +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> +   stmt = gimple_build_call (builtin_decl_implicit (BUILT_IN_ASSUME_ALIGNED),
> + 2, ptr, align);
> +   gimple_call_set_lhs (stmt, ptr);
> +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> +   stmt = gimple_build_assign (fold_build2 (MEM_REF, ptr_type_node, pptr,
> +build_int_cst (ptr_type_node, 0)),
> +   ptr);
> +   gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> + }

Hi,

creating a new var for the output parameter throws away the value already in 
there.  But this value
must not change when posix_memalign e.g. returns with ENOMEM.  It breaks the 
glibc posix_memalign
testcase on s390 (somewhat reduced here):

typedef unsigned long size_t;
extern int posix_memalign(void **memptr, size_t alignment, size_t size);
extern void abort(void);
int
main (void)
{
  void *p;
  int ret;

  p = 0;
  ret = posix_memalign (&p, sizeof (void *), -1);
  if (p != 0)
abort ();
  return 0;
}

.c.170r.expand

main ()
{
  void * D.1395;
  void * D.1394;
  int ret;
  void * p;
  int D.1393;
  void * p.0;
  void * _2;
  void * _3;
  void * p.0_4;
  int _5;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  p = 0B;
  ret_1 = posix_memalign (&D.1395, 4, 4294967295);
  _2 = D.1395;
  _3 = __builtin_assume_aligned (_2, 4);
  MEM[(void *)&p] = _3;
  p.0_4 = p;
  if (p.0_4 != 0B)
goto ;
  else
goto ;
;;succ:   3
;;4

;;   basic block 3, loop depth 0
;;pred:   2
  abort ();
;;succ:

;;   basic block 4, loop depth 0
;;pred:   2
  _5 = 0;
  p ={v} {CLOBBER};
;;succ:   5

;;   basic block 5, loop depth 0
;;pred:   4
:
  return _5;
;;succ:   EXIT

}

[PATCH] Make nonoverlapping_component_refs_p O(n log n) instead of O(n^3)

2014-02-12 Thread Richard Biener


The following patch avoids the cubic search for equal types by
collecting candidates and sorting them.  The length() == 2
qsort optimization probably belongs to vec.h and additionally
we can avoid sorting the other vector if one has length() == 1.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued
for stage1.

Quick statistics show that for insn-recog we always have 1 - 1
compares (nonoverlapping_component_refs_p was crippled quite a bit
when we stopped stripping outermost non-COMPONENT_REF trees from
MEM_EXPRs).  On this testcase the patch slows down the compiler
consistently by about 1% (but it's called 46 million times as part of
RTL PRE) - it seems we don't inline the fast path of
vec::safe_push(tree_node const* 
const&) on auto_vec<>s with a constant initial size.

Eventually this function should move to the tree oracle and should
be made less restrictive as to what input it accepts.  The following
is a patch that shouldn't change behavior in any ways.

Richard.

2014-02-12  Richard Biener  

* alias.c (ncr_compar): New function.
(nonoverlapping_component_refs_p): Re-implement in O (n log n).

Index: gcc/alias.c
===
*** gcc/alias.c (revision 207655)
--- gcc/alias.c (working copy)
*** read_dependence (const_rtx mem, const_rt
*** 2259,2264 
--- 2261,2285 
return false;
  }
  
+ /* qsort compare function to sort FIELD_DECLs after their
+DECL_FIELD_CONTEXT TYPE_UID.  */
+ 
+ static inline int
+ ncr_compar (const void *field1_, const void *field2_)
+ {
+   const_tree field1 = *(const_tree *) const_cast (field1_);
+   const_tree field2 = *(const_tree *) const_cast (field2_);
+   unsigned int uid1
+ = TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field1)));
+   unsigned int uid2
+ = TYPE_UID (TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field2)));
+   if (uid1 < uid2)
+ return -1;
+   else if (uid1 > uid2)
+ return 1;
+   return 0;
+ }
+ 
  /* Return true if we can determine that the fields referenced cannot
 overlap for any pair of objects.  */
  
*** static bool
*** 2266,2272 
  nonoverlapping_component_refs_p (const_rtx rtlx, const_rtx rtly)
  {
const_tree x = MEM_EXPR (rtlx), y = MEM_EXPR (rtly);
-   const_tree fieldx, fieldy, typex, typey, orig_y;
  
if (!flag_strict_aliasing
|| !x || !y
--- 2287,2292 
*** nonoverlapping_component_refs_p (const_r
*** 2274,2322 
|| TREE_CODE (y) != COMPONENT_REF)
  return false;
  
!   do
  {
!   /* The comparison has to be done at a common type, since we don't
!know how the inheritance hierarchy works.  */
!   orig_y = y;
!   do
!   {
! fieldx = TREE_OPERAND (x, 1);
! typex = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldx));
! 
! y = orig_y;
! do
!   {
! fieldy = TREE_OPERAND (y, 1);
! typey = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldy));
! 
! if (typex == typey)
!   goto found;
! 
! y = TREE_OPERAND (y, 0);
!   }
! while (y && TREE_CODE (y) == COMPONENT_REF);
  
! x = TREE_OPERAND (x, 0);
}
!   while (x && TREE_CODE (x) == COMPONENT_REF);
!   /* Never found a common type.  */
!   return false;
  
! found:
!   /* If we're left with accessing different fields of a structure, then no
!possible overlap, unless they are both bitfields.  */
!   if (TREE_CODE (typex) == RECORD_TYPE && fieldx != fieldy)
!   return !(DECL_BIT_FIELD (fieldx) && DECL_BIT_FIELD (fieldy));
  
!   /* The comparison on the current field failed.  If we're accessing
!a very nested structure, look at the next outer level.  */
!   x = TREE_OPERAND (x, 0);
!   y = TREE_OPERAND (y, 0);
  }
!   while (x && y
!&& TREE_CODE (x) == COMPONENT_REF
!&& TREE_CODE (y) == COMPONENT_REF);
  
return false;
  }
--- 2294,2382 
|| TREE_CODE (y) != COMPONENT_REF)
  return false;
  
!   auto_vec fieldsx;
!   while (TREE_CODE (x) == COMPONENT_REF)
  {
!   tree field = TREE_OPERAND (x, 1);
!   tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field));
!   if (TREE_CODE (type) == RECORD_TYPE)
!   fieldsx.safe_push (field);
!   x = TREE_OPERAND (x, 0);
! }
!   if (fieldsx.length () == 0)
! return false;
!   auto_vec fieldsy;
!   while (TREE_CODE (y) == COMPONENT_REF)
! {
!   tree field = TREE_OPERAND (y, 1);
!   tree type = TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (field));
!   if (TREE_CODE (type) == RECORD_TYPE)
!   fieldsy.safe_push (TREE_OPERAND (y, 1));
!   y = TREE_OPERAND (y, 0);
! }
!   if (fieldsy.length () == 0)
! return false;
  
!   /* Most common case first.  */
!   if (fieldsx.length () == 1
!   && fieldsy.length () == 1)
! return ((TYPE_MAIN_VARIANT (DECL_FIELD_CONTEXT (fieldsx[0]))
!

Fix PR rtl-optimization/60116

2014-02-12 Thread Eric Botcazou

This is a regression present on the mainline and 4.8 branch.  As diagnosed by 
Jakub, there is a danling REG_DEAD note on an insn after a first combination 
which confuses the distribution of notes during a second combination.  In the 
first combination, the insn is the other_insn and this insn is handled in a 
special way: it doesn't undergo the usual distribution of notes but instead 
is patched up separately.  The code checks that the REG_UNUSED notes are still 
valid for it, it ought to do the same for the REG_DEAD notes.

Tested on x86_64-suse-linux, applied on the mainline and 4.8 branch.


2014-02-12  Eric Botcazou  

PR rtl-optimization/60116
* combine.c (try_combine): Also remove dangling REG_DEAD notes on the
other_insn once the combination has been validated.


2014-02-12  Eric Botcazou  

* gcc.c-torture/execute/20140212-1.c: New test.


-- 
Eric BotcazouIndex: combine.c
===
--- combine.c	(revision 207685)
+++ combine.c	(working copy)
@@ -3894,14 +3894,15 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx
 
   PATTERN (undobuf.other_insn) = other_pat;
 
-  /* If any of the notes in OTHER_INSN were REG_UNUSED, ensure that they
-	 are still valid.  Then add any non-duplicate notes added by
-	 recog_for_combine.  */
+  /* If any of the notes in OTHER_INSN were REG_DEAD or REG_UNUSED,
+	 ensure that they are still valid.  Then add any non-duplicate
+	 notes added by recog_for_combine.  */
   for (note = REG_NOTES (undobuf.other_insn); note; note = next)
 	{
 	  next = XEXP (note, 1);
 
-	  if (REG_NOTE_KIND (note) == REG_UNUSED
+	  if ((REG_NOTE_KIND (note) == REG_DEAD
+	   || REG_NOTE_KIND (note) == REG_UNUSED)
 	  && ! reg_set_p (XEXP (note, 0), PATTERN (undobuf.other_insn)))
 	remove_note (undobuf.other_insn, note);
 	}/* PR rtl-optimization/60116 */
/* Reported by Zhendong Su  */

extern void abort (void);

int a, b, c, d = 1, e, f = 1, h, i, k;
char g, j;

void
fn1 (void)
{
  int l;
  e = 0;
  c = 0;
  for (;;)
{
  k = a && b;
  j = k * 54;
  g = j * 147;
  l = ~g + (long long) e && 1;
  if (d)
	c = l;
  else
	h = i = l * 9UL;
  if (f)
	return;
}
}

int
main (void)
{
  fn1 ();
  if (c != 1)
abort ();
  return 0;
}

Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-12 Thread Uros Bizjak

On Tue, Feb 11, 2014 at 9:41 PM, H.J. Lu  wrote:

> HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
> pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
> We already pass --32 to assembler on x86.  It should be OK to do it
> in configure.  OK for trunk?

 This would break Solaris/x86 with as configurations, where this test
 currently passes, but would fail since as doesn't understand --32.

>>>
>>> How about passing --32 to as only for Linux?  OK to install?
>>
>> I'd rather do it for gas instead, which can be used on non-Linux
>> systems, too.
>>
>
> Sure.  Here is the new patch.  OK to install?

Attached is slightly changed patch that follows established
configure.ac code formatting. Please check if this version works for
you.

The patch is OK for mainline and release branches.

Thanks,
Uros.
Index: configure
===
--- configure   (revision 207710)
+++ configure   (working copy)
@@ -25028,6 +25028,10 @@
 
 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+as_ix86_gotoff_in_data_opt=
+if test x$gas = xyes; then
+  as_ix86_gotoff_in_data_opt="--32"
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for GOTOFF in 
data" >&5
 $as_echo_n "checking assembler for GOTOFF in data... " >&6; }
 if test "${gcc_cv_as_ix86_gotoff_in_data+set}" = set; then :
@@ -25044,7 +25048,7 @@
nop
.data
.long .L0@GOTOFF' > conftest.s
-if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags $as_ix86_gotoff_in_data_opt -o 
conftest.o conftest.s >&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
Index: configure.ac
===
--- configure.ac(revision 207710)
+++ configure.ac(working copy)
@@ -3867,8 +3867,13 @@
 
 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+as_ix86_gotoff_in_data_opt=
+if test x$gas = xyes; then
+  as_ix86_gotoff_in_data_opt="--32"
+fi
 gcc_GAS_CHECK_FEATURE([GOTOFF in data],
-gcc_cv_as_ix86_gotoff_in_data, [2,11,0],,
+  gcc_cv_as_ix86_gotoff_in_data, [2,11,0],
+  [$as_ix86_gotoff_in_data_opt],
 [  .text
 .L0:
nop

Fix broken build for AVR and SPU targets

2014-02-12 Thread Senthil Kumar Selvaraj

The below patch fixes the build for AVR and SPU targets, which got broken
when the signature of build_function_call_vec changed. The patch passes
vNULL for the extra parameter added (arg_loc), which I hope is ok for
builtins?

If ok, could someone commit please? I don't have commit access.

Regards
Senthil

gcc/ChangeLog

2014-02-12  Senthil Kumar Selvaraj  

* config/avr/avr-c.c (avr_resolve_overloaded_builtin): Pass vNULL for 
arg_loc.
* config/spu/spu-c.c (spu_resolve_overloaded_builtin): Likewise.


diff --git gcc/config/avr/avr-c.c gcc/config/avr/avr-c.c
index 98650e0..101d280 100644
--- gcc/config/avr/avr-c.c
+++ gcc/config/avr/avr-c.c
@@ -115,7 +115,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, &args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
 
   break; // absfx
 
@@ -181,7 +181,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, &args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
 
   break; // roundfx
 
@@ -238,7 +238,7 @@ avr_resolve_overloaded_builtin (unsigned int iloc, tree 
fndecl, void *vargs)
   fold = targetm.builtin_decl (id, true);
 
   if (fold != error_mark_node)
-fold = build_function_call_vec (loc, fold, &args, NULL);
+fold = build_function_call_vec (loc, vNULL, fold, &args, NULL);
 
   break; // countlsfx
 }
diff --git gcc/config/spu/spu-c.c gcc/config/spu/spu-c.c
index 411496d..9d7aa5a 100644
--- gcc/config/spu/spu-c.c
+++ gcc/config/spu/spu-c.c
@@ -181,7 +181,7 @@ spu_resolve_overloaded_builtin (location_t loc, tree 
fndecl, void *passed_args)
   return error_mark_node;
 }
 
-  return build_function_call_vec (loc, match, fnargs, NULL);
+  return build_function_call_vec (loc, vNULL, match, fnargs, NULL);
 #undef SCALAR_TYPE_P
 }

Re: [C++ Patch/RFC] PR 60047

2014-02-12 Thread Jason Merrill


On 02/06/2014 02:59 AM, Paolo Carlini wrote:

-  if (vec_safe_is_empty (vbases))
+  if (vbases == NULL)


vec_safe_is_empty is still more correct here.

The rest of the patch is OK.

Jason

Re: Warn about virtual table mismatches

2014-02-12 Thread Jason Merrill


On 02/11/2014 10:27 PM, Jan Hubicka wrote:

On 02/11/2014 07:54 PM, Jan Hubicka wrote:

+ /* Allow combining RTTI and non-RTTI is OK.  */


You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine,
though you need to prefer the -frtti version in case code from that
translation unit uses the RTTI info.


Is there some mechanism that linker will do so? At the moment we just chose 
variant
that would be selected by linker.  I can make the choice, but what happens with 
non-LTO?


Hmm, the linker might well make the wrong choice.  Might be worth 
warning about this as well.



+   "a type with the same name but number of virtual methods is "


"but different number"

Jason

73 matches

Mail list logo